[RLlib] Headnode without GPU triggers torch/CUDA de-serialization error #53467

@ArturNiederfahrenhorst

What happened + What you expected to happen

Running the reproduction script below results in an error when the cluster's head node is a CPU-only node.
I didn't test other algorithms or versions, but I assume the issue exists there, too.
The root cause is that the MetricsLogger sometimes returns torch tensors that live on the GPU, and torch deserializes a tensor onto the same device type it was serialized from, which fails on a node without CUDA.

Error:

File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/rllib/algorithms/algorithm.py", line 2865, in get_state
state[COMPONENT_LEARNER_GROUP] = self.learner_group.get_state(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/rllib/core/learner/learner_group.py", line 481, in get_state
state[COMPONENT_LEARNER] = self._get_results(results)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/rllib/core/learner/learner_group.py", line 632, in _get_results
raise result_or_error
File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/rllib/utils/actor_manager.py", line 861, in _fetch_result
result = ray.get(ready)
^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/worker.py", line 2822, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/_private/worker.py", line 932, in get_objects
raise value
ray.exceptions.RaySystemError: System error: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
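
The underlying torch behavior can be reproduced in isolation: serializing a CUDA tensor records its device, and restoring it on a machine without CUDA raises exactly this error. A minimal sketch of that behavior (the file path is hypothetical):

```python
import torch

# On a node with a GPU: serialize a tensor that lives on the GPU.
t = torch.ones(3, device="cuda")
torch.save(t, "/tmp/tensor.pt")  # hypothetical path

# On a CPU-only node (e.g. the head node): the default load fails, because
# torch tries to restore the storage onto the CUDA device it was saved from.
try:
    torch.load("/tmp/tensor.pt")
except RuntimeError as e:
    print(e)  # "Attempting to deserialize object on a CUDA device ..."

# map_location remaps the storage onto the CPU instead.
t = torch.load("/tmp/tensor.pt", map_location=torch.device("cpu"))
```

Ray serializes actor results via pickle, and unpickling a torch tensor goes through the same storage-restore path, so the `ray.get()` in the traceback hits the same failure on the GPU-less head node.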

Versions / Dependencies

Ray 2.46

Reproduction script

```python
from ray.rllib.algorithms import appo

alg_config = (
    appo.APPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    .environment(
        env="CartPole-v1",
        disable_env_checking=True,
    )
    .learners(
        num_learners=1,
        num_gpus_per_learner=1,
    )
    .reporting(min_time_s_per_iteration=1)
)

algo = alg_config.build()

algo.train()

state = algo.get_state()
```
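
A possible fix direction is to move any torch tensors to CPU before they leave the Learner actor. A minimal sketch, assuming the metrics arrive as a nested structure; the helper name is hypothetical and not part of the RLlib API:

```python
import torch
import tree  # dm-tree, already an RLlib dependency

def tensors_to_cpu(struct):
    """Hypothetical helper: recursively detach any torch tensors in a nested
    structure and move them to CPU, so the result deserializes on nodes
    without a GPU."""
    return tree.map_structure(
        lambda v: v.detach().cpu() if isinstance(v, torch.Tensor) else v,
        struct,
    )
```

Applied to the MetricsLogger results before they are returned across the Ray object store (e.g. inside the Learner's `get_state()`), this would make the state device-agnostic.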

Issue Severity

None


Labels

P1: Issue that should be fixed within a few weeks
bug: Something that is supposed to be working, but isn't
rllib: RLlib related issues
