[AIR] Add RLTrainer interface, implementation, and examples #23465

Merged

32 commits merged into ray-project:master on Apr 9, 2022

Conversation

@krfricke (Contributor) commented on Mar 24, 2022

Why are these changes needed?

This PR adds an RLTrainer to Ray AIR. It works for both offline and online use cases. For offline training, it leverages the "datasets" key of the Trainer API to specify a dataset reader input, used e.g. in Behavioral Cloning (BC). For online training, it is a wrapper around the RLlib trainables, making use of the parameter layering enabled by the Trainer API.
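A rough sketch of the intended usage for both cases (the import paths and argument names below, such as algorithm, config, datasets, and scaling_config, are assumptions inferred from this description, not necessarily the exact merged API):

import ray
from ray.ml.train.integrations.rl import RLTrainer  # assumed import path
from ray.ml.config import RunConfig                  # assumed import path

# Online training: wrap an RLlib trainable (e.g. PPO) behind the Trainer API.
online_trainer = RLTrainer(
    algorithm="PPO",
    config={"env": "CartPole-v0", "framework": "torch"},
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={"num_workers": 2, "use_gpu": False},
)
online_result = online_trainer.fit()

# Offline training: the "datasets" key supplies a dataset reader input,
# e.g. for Behavioral Cloning (BC) on previously collected episodes.
offline_trainer = RLTrainer(
    algorithm="BC",
    config={"env": "CartPole-v0", "framework": "torch"},
    datasets={"train": ray.data.read_json("/tmp/cartpole-out")},
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={"num_workers": 2, "use_gpu": False},
)
offline_result = offline_trainer.fit()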

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Resolved review thread (outdated): python/ray/ml/config.py
@@ -4,7 +4,7 @@
 from ray import tune
 from ray.ml.train.integrations.tensorflow import TensorflowTrainer

-from ray.ml.examples.tensorflow.tensorflow_mnist_example import train_func
+from ray.ml.examples.tf.tensorflow_mnist_example import train_func
@krfricke (Contributor, Author):
The reason for this rename is that otherwise the examples won't work when executed from the examples working directory: since the scripts do "import tensorflow as tf", Python would pick up the local "tensorflow" directory instead of the installed library.
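A minimal illustration of the shadowing problem (a standalone sketch, not code from this PR), assuming a script runs with a sibling directory named "tensorflow":

import sys

# The script's directory is prepended to sys.path, so a local folder named
# "tensorflow" satisfies the import before the installed package does.
print(sys.path[0])

import tensorflow as tf  # would resolve to the local ./tensorflow directory
print(tf.__file__)       # points into the examples tree, not site-packages

# Renaming the examples directory from "tensorflow" to "tf" removes the
# collision: "import tensorflow" can no longer be satisfied by the local folder.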

@gjoliver (Member) left a comment:
Looks pretty good for the RLlib parts. Two minor questions.


trainer = RLTrainer(
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={
@gjoliver (Member):
Sorry for being out of touch with the latest AIR API: why do we have RunConfig as a config object, while scaling_config is a plain dict?

@krfricke (Contributor, Author):
Yeah, that's currently the case; agreed it's a bit inconsistent here. We should probably at least accept dicts for RunConfig, too.

(Contributor):
FYI, ScalingConfig will soon become a dataclass as well, after I make the change that lets Tune construct a search space from it.
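To make the inconsistency concrete, a sketch of the two styles side by side (values are placeholders; ScalingConfig as a dataclass is the future change mentioned above, not part of this PR):

from ray.ml.config import RunConfig  # assumed import path; RLTrainer imported as in the example above

# Today: run_config is a structured object, scaling_config is a plain dict.
trainer = RLTrainer(
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={"num_workers": 2, "use_gpu": False},
)

# After the planned change, both arguments could be dataclasses (hypothetical):
# trainer = RLTrainer(
#     run_config=RunConfig(stop={"training_iteration": 5}),
#     scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
# )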

Resolved review thread (outdated): python/ray/ml/examples/rllib_example.py
@richardliaw (Contributor) commented:
@krfricke Can you add a PR description?

Resolved review thread (outdated): python/ray/ml/config.py

Resolved review thread (outdated): python/ray/ml/config.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rllib/rl_trainer.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rllib/rl_trainer.py
Resolved review thread (outdated): python/ray/tune/trial_runner.py
Merge conflicts resolved in: python/ray/tune/trial_runner.py
@krfricke changed the title from "[AIR/wip] Rllib offline trainer" to "[AIR] Rllib offline trainer" on Apr 4, 2022
Resolved review thread: python/ray/ml/BUILD
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py
Resolved review thread: python/ray/ml/examples/rl_example.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py

        return config

    def training_loop(self) -> None:
(Contributor):
This makes me think that we should reconsider our interface for Trainer. Currently we assume that training_loop will be overridden by all subclasses (hence why it is an abstract method), but that is not the case here.

(Contributor):
(This is just a thought; it should not block this PR!)
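A standalone sketch of the interface concern (not Ray code): the base class forces every subclass to implement training_loop, but a wrapper-style trainer that delegates to an existing trainable has nothing meaningful to put there.

import abc


class Trainer(abc.ABC):
    @abc.abstractmethod
    def training_loop(self) -> None:
        """Subclasses are expected to put their training logic here."""


class WrapperTrainer(Trainer):
    """Delegates training to an existing trainable, so no custom loop exists."""

    def training_loop(self) -> None:
        # Implemented only to satisfy the abstract method; the actual work
        # happens in the wrapped trainable, which is the mismatch noted above.
        raise RuntimeError("Not used; this trainer wraps an existing trainable.")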

@amogkam self-assigned this on Apr 7, 2022
@richardliaw added this to the Ray AIR milestone on Apr 8, 2022
@krfricke changed the title from "[AIR] Rllib offline trainer" to "[AIR] Add RLTrainer interface, implementation, and examples" on Apr 9, 2022
@krfricke merged commit 8c2e471 into ray-project:master on Apr 9, 2022
@krfricke deleted the ml/rllib-trainer branch on April 9, 2022 at 00:16