
[RLLib] Document how to change Algorithm configuration when restoring a checkpoint #40777

Open
kronion opened this issue Oct 30, 2023 · 2 comments
Labels: docs (documentation), P2 (important, not time-critical), rllib

kronion commented Oct 30, 2023

Description

I'm trying to restore an RLlib algorithm from a checkpoint and change its configuration before resuming training. My main objective is to change the number of rollout workers between runs, but I may also need to adjust other configuration details, e.g. the env config. I assume this is possible, but I can't find any specific documentation, and the obvious approaches don't seem to work.

For example, this doesn't work:

# If I don't enable eager execution manually, restoring the checkpoint fails
import tensorflow as tf
tf.compat.v1.enable_eager_execution()


from ray import tune
from ray.rllib.algorithms import ppo

...

    ppo_config = (
        ppo.PPOConfig()
            .rl_module(_enable_rl_module_api=False)
            .environment(env=Env, env_config=env_config)
            .framework(framework="tf2", eager_tracing=True)
            .rollouts(**rollout_config)
            .training(**training_config, _enable_learner_api=False)
            .resources(**resources_config)
    )
    tuner = tune.Tuner(
        ppo.PPO,
        param_space=ppo_config,
    )

    restore_path = input()
    if restore_path:
        # Restore the algorithm from the checkpoint and hand it to a new Tuner,
        # hoping that param_space overrides the checkpointed config.
        algo = ppo.PPO.from_checkpoint(restore_path)
        tuner = tune.Tuner(
            algo,
            param_space=ppo_config,
        )

    tuner.fit()

If I restore a checkpoint from a training session with 5 rollout workers, the new session will also have 5 rollout workers, regardless of what I pass in as param_space.
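For illustration, this is the kind of quick check that shows the checkpointed value winning out (untested sketch; assumes the same restore_path as above):

from ray.rllib.algorithms import ppo

# Load the algorithm back from the checkpoint and inspect its effective config.
algo = ppo.PPO.from_checkpoint(restore_path)

# This still reports the worker count baked into the checkpoint (e.g. 5),
# not the value passed via param_space.
print(algo.config.num_rollout_workers)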

I also considered the Tuner.restore() API, like this:

tuner = tune.Tuner.restore(restore_path, ppo.PPO, resume_errored=True, param_space=ppo_config)

But the docs specifically say that changing the param_space is unsupported: https://docs.ray.io/en/master/tune/api/doc/ray.tune.Tuner.restore.html#ray-tune-tuner-restore

The closest thing I could find was here in the Tune FAQ: https://docs.ray.io/en/latest/tune/faq.html#how-can-i-continue-training-a-completed-tune-experiment-for-longer-and-with-new-configurations-iterative-experimentation

But it's not clear how to apply this to an RLlib Algorithm. It isn't obvious how to extract an AlgorithmConfig from a checkpoint, modify it, and then build a new Algorithm instance.
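For reference, this is roughly the pattern I would expect, pieced together from the public Algorithm/AlgorithmConfig APIs (untested sketch; the worker count of 10 and restore_path are just placeholders):

from ray.rllib.algorithms import ppo

# Load the old algorithm from the checkpoint.
old_algo = ppo.PPO.from_checkpoint(restore_path)

# Take a mutable copy of its config and change what needs changing.
new_config = old_algo.config.copy(copy_frozen=False)
new_config.rollouts(num_rollout_workers=10)

# Build a fresh algorithm with the updated config and carry the weights over.
new_algo = new_config.build()
new_algo.set_weights(old_algo.get_weights())
old_algo.stop()

Even if something like this works, it only copies policy weights; optimizer state and training-iteration counters presumably wouldn't carry over, which is exactly the kind of detail I'd hope the docs would spell out.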

Assuming there's a supported pattern for modifying the config, it would be great to add it to the documentation. If this isn't actually possible, I think it would be an important feature to add.

Link

No response

kronion added the docs and triage labels on Oct 30, 2023
anyscalesam added the rllib label on Oct 30, 2023
sven1977 self-assigned this on Nov 1, 2023
angelinalg (Contributor) commented

Deferring to eng to determine final priority. It seems like a P1, to me.

sven1977 added the P2 label and removed the triage label on Nov 15, 2023
Finebouche commented

This would be really useful.
