
[RLLib] Document how to change Algorithm configuration when restoring a checkpoint #40777

Open
kronion opened this issue Oct 30, 2023 · 2 comments
Labels: docs (documentation), P2 (important, not time-critical), rllib

kronion commented Oct 30, 2023

Description

I'm trying to restore an RLlib algorithm from a checkpoint and change its configuration before resuming training. My main objective is to change the number of rollout workers between runs, but I may also need to adjust other configuration details, e.g. the env config. I assume this is possible, but I can't find any specific documentation, and the obvious approaches don't seem to work.

For example, this doesn't work:

# If I don't enable eager execution manually, restoring the checkpoint fails
import tensorflow as tf
tf.compat.v1.enable_eager_execution()


from ray import tune
from ray.rllib.algorithms import ppo

...

    ppo_config = (
        ppo.PPOConfig()
            .rl_module(_enable_rl_module_api=False)
            .environment(env=Env, env_config=env_config)
            .framework(framework="tf2", eager_tracing=True)
            .rollouts(**rollout_config)
            .training(**training_config, _enable_learner_api=False)
            .resources(**resources_config)
    )
    tuner = tune.Tuner(
        ppo.PPO,
        param_space=ppo_config,
    )

    restore_path = input()
    if restore_path:
        # Restore the algorithm from the checkpoint and hand it to a new Tuner,
        # hoping that param_space overrides the checkpointed config.
        algo = ppo.PPO.from_checkpoint(restore_path)
        tuner = tune.Tuner(
            algo,
            param_space=ppo_config,
        )

    tuner.fit()

If I restore a checkpoint from a training session with 5 rollout workers, the new session will also have 5 rollout workers, regardless of what I pass in as param_space.
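For illustration, this is the kind of quick check that shows the checkpointed value winning out (untested sketch; assumes the same restore_path as above):

from ray.rllib.algorithms import ppo

# Load the algorithm back from the checkpoint and inspect its effective config.
algo = ppo.PPO.from_checkpoint(restore_path)

# This still reports the worker count baked into the checkpoint (e.g. 5),
# not the value passed via param_space.
print(algo.config.num_rollout_workers)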

I also considered the Tuner.restore() API, like this:

tuner = tune.Tuner.restore(restore_path, ppo.PPO, resume_errored=True, param_space=ppo_config)

But the docs specifically say that changing the param_space is unsupported: https://docs.ray.io/en/master/tune/api/doc/ray.tune.Tuner.restore.html#ray-tune-tuner-restore

The closest thing I could find was here in the Tune FAQ: https://docs.ray.io/en/latest/tune/faq.html#how-can-i-continue-training-a-completed-tune-experiment-for-longer-and-with-new-configurations-iterative-experimentation

But it's not clear how to apply this to an RLlib Algorithm. It isn't obvious how to extract an AlgorithmConfig from a checkpoint, modify it, and then build a new Algorithm instance.
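For reference, this is roughly the pattern I would expect, pieced together from the public Algorithm/AlgorithmConfig APIs (untested sketch; the worker count of 10 and restore_path are just placeholders):

from ray.rllib.algorithms import ppo

# Load the old algorithm from the checkpoint.
old_algo = ppo.PPO.from_checkpoint(restore_path)

# Take a mutable copy of its config and change what needs changing.
new_config = old_algo.config.copy(copy_frozen=False)
new_config.rollouts(num_rollout_workers=10)

# Build a fresh algorithm with the updated config and carry the weights over.
new_algo = new_config.build()
new_algo.set_weights(old_algo.get_weights())
old_algo.stop()

Even if something like this works, it only copies policy weights; optimizer state and training-iteration counters presumably wouldn't carry over, which is exactly the kind of detail I'd hope the docs would spell out.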

Assuming there's a supported pattern for modifying the config, it would be great to add it to the documentation. If this isn't actually possible, I think it would be an important feature to add.

Link

No response

kronion added the docs and triage labels on Oct 30, 2023
anyscalesam added the rllib label on Oct 30, 2023
sven1977 self-assigned this on Nov 1, 2023
angelinalg (Contributor) commented

Deferring to eng to determine final priority. It seems like a P1, to me.

sven1977 added the P2 label and removed the triage label on Nov 15, 2023
Finebouche commented

This would be really useful.
