[AIR] Add RLTrainer interface, implementation, and examples #23465

Merged

32 commits merged into ray-project:master on Apr 9, 2022

Conversation

@krfricke (Contributor) commented on Mar 24, 2022

Why are these changes needed?

This PR adds an RLTrainer to Ray AIR. It works for both offline and online use cases. For offline training, it leverages the "datasets" key of the Trainer API to specify a dataset reader input, used e.g. in Behavioral Cloning (BC). For online training, it is a wrapper around the RLlib trainables, making use of the parameter layering enabled by the Trainer API.
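A rough sketch of the intended usage for both cases (the import paths and argument names below, such as algorithm, config, datasets, and scaling_config, are assumptions inferred from this description, not necessarily the exact merged API):

import ray
from ray.ml.train.integrations.rl import RLTrainer  # assumed import path
from ray.ml.config import RunConfig                  # assumed import path

# Online training: wrap an RLlib trainable (e.g. PPO) behind the Trainer API.
online_trainer = RLTrainer(
    algorithm="PPO",
    config={"env": "CartPole-v0", "framework": "torch"},
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={"num_workers": 2, "use_gpu": False},
)
online_result = online_trainer.fit()

# Offline training: the "datasets" key supplies a dataset reader input,
# e.g. for Behavioral Cloning (BC) on previously collected episodes.
offline_trainer = RLTrainer(
    algorithm="BC",
    config={"env": "CartPole-v0", "framework": "torch"},
    datasets={"train": ray.data.read_json("/tmp/cartpole-out")},
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={"num_workers": 2, "use_gpu": False},
)
offline_result = offline_trainer.fit()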

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Resolved review thread (outdated): python/ray/ml/config.py
@@ -4,7 +4,7 @@
 from ray import tune
 from ray.ml.train.integrations.tensorflow import TensorflowTrainer

-from ray.ml.examples.tensorflow.tensorflow_mnist_example import train_func
+from ray.ml.examples.tf.tensorflow_mnist_example import train_func
@krfricke (Contributor, Author):
The reason for this rename is that otherwise the examples won't work when executed from the examples working directory: since the scripts do "import tensorflow as tf", Python would pick up the local "tensorflow" directory instead of the installed library.
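A minimal illustration of the shadowing problem (a standalone sketch, not code from this PR), assuming a script runs with a sibling directory named "tensorflow":

import sys

# The script's directory is prepended to sys.path, so a local folder named
# "tensorflow" satisfies the import before the installed package does.
print(sys.path[0])

import tensorflow as tf  # would resolve to the local ./tensorflow directory
print(tf.__file__)       # points into the examples tree, not site-packages

# Renaming the examples directory from "tensorflow" to "tf" removes the
# collision: "import tensorflow" can no longer be satisfied by the local folder.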

@gjoliver (Member) left a comment:
Looks pretty good for the RLlib parts. Two minor questions.


trainer = RLTrainer(
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={
@gjoliver (Member):
Sorry for being out of touch with the latest AIR API: why do we have RunConfig as a config object, while scaling_config is a plain dict?

@krfricke (Contributor, Author):
Yeah, that's currently the case; agreed it's a bit inconsistent here. We should probably at least accept dicts for RunConfig, too.

(Contributor):
FYI, ScalingConfig will soon become a dataclass as well, after I make the change that lets Tune construct a search space from it.
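To make the inconsistency concrete, a sketch of the two styles side by side (values are placeholders; ScalingConfig as a dataclass is the future change mentioned above, not part of this PR):

from ray.ml.config import RunConfig  # assumed import path; RLTrainer imported as in the example above

# Today: run_config is a structured object, scaling_config is a plain dict.
trainer = RLTrainer(
    run_config=RunConfig(stop={"training_iteration": 5}),
    scaling_config={"num_workers": 2, "use_gpu": False},
)

# After the planned change, both arguments could be dataclasses (hypothetical):
# trainer = RLTrainer(
#     run_config=RunConfig(stop={"training_iteration": 5}),
#     scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
# )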

Resolved review thread (outdated): python/ray/ml/examples/rllib_example.py
@richardliaw (Contributor) commented:
@krfricke Can you add a PR description?

Resolved review thread (outdated): python/ray/ml/config.py

Resolved review thread (outdated): python/ray/ml/config.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rllib/rl_trainer.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rllib/rl_trainer.py
Resolved review thread (outdated): python/ray/tune/trial_runner.py
Merge conflicts resolved in: python/ray/tune/trial_runner.py
@krfricke changed the title from "[AIR/wip] Rllib offline trainer" to "[AIR] Rllib offline trainer" on Apr 4, 2022
Resolved review thread: python/ray/ml/BUILD
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py
Resolved review thread: python/ray/ml/examples/rl_example.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py
Resolved review thread (outdated): python/ray/ml/train/integrations/rl/rl_trainer.py

        return config

    def training_loop(self) -> None:
(Contributor):
This makes me think that we should reconsider our interface for Trainer. Currently we assume that training_loop will be overridden by all subclasses (hence why it is an abstract method), but that is not the case here.

(Contributor):
(This is just a thought; it should not block this PR!)
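A standalone sketch of the interface concern (not Ray code): the base class forces every subclass to implement training_loop, but a wrapper-style trainer that delegates to an existing trainable has nothing meaningful to put there.

import abc


class Trainer(abc.ABC):
    @abc.abstractmethod
    def training_loop(self) -> None:
        """Subclasses are expected to put their training logic here."""


class WrapperTrainer(Trainer):
    """Delegates training to an existing trainable, so no custom loop exists."""

    def training_loop(self) -> None:
        # Implemented only to satisfy the abstract method; the actual work
        # happens in the wrapped trainable, which is the mismatch noted above.
        raise RuntimeError("Not used; this trainer wraps an existing trainable.")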

@amogkam self-assigned this on Apr 7, 2022
@richardliaw added this to the Ray AIR milestone on Apr 8, 2022
@krfricke changed the title from "[AIR] Rllib offline trainer" to "[AIR] Add RLTrainer interface, implementation, and examples" on Apr 9, 2022
@krfricke merged commit 8c2e471 into ray-project:master on Apr 9, 2022
@krfricke deleted the ml/rllib-trainer branch on April 9, 2022 at 00:16