
[RLlib] New ConnectorV2 API #06: Changes in SingleAgentEpisode & SingleAgentEnvRunner. #42296

Conversation


@sven1977 sven1977 commented Jan 10, 2024

This PR adds some changes to SingleAgentEpisode & SingleAgentEnvRunner:

  • SingleAgentEnvRunner now utilizes the user-configured EnvToModule and ModuleToEnv connector pipelines.

  • Hence, SingleAgentEnvRunner no longer has to:

    • Handle internal (RNN) states.
    • Perform action sampling or logp/probs computations, as these are all done automatically in the default ModuleToEnv pipeline.
  • Add set APIs to SingleAgentEpisode so that custom connectors can manipulate an episode's data, e.g. for observation frame-stacking, reward clipping, etc.

  • The new set APIs also had to be supported by InfiniteLookbackBuffer, which sits at the core of all episode classes.

  • Updated test cases and added new ones for set APIs.
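To make the new set APIs concrete, here is a minimal, self-contained sketch of a lookback-style buffer with a setter that a custom connector could use for in-place manipulation such as reward clipping. The class name, method names, and indexing scheme below are hypothetical stand-ins for illustration; they are not RLlib's actual InfiniteLookbackBuffer or SingleAgentEpisode API.

```python
# Hypothetical sketch of a "set" API on an episode's lookback buffer.

class LookbackBuffer:
    """Flat list of values preceded by a `lookback` section before index 0."""

    def __init__(self, data, lookback=0):
        self.data = list(data)
        self.lookback = lookback  # number of items stored before "global index 0"

    def get(self, idx):
        return self.data[idx + self.lookback]

    def set(self, idx, value):
        # Setter counterpart to `get`, so connectors can overwrite stored values.
        self.data[idx + self.lookback] = value

# A custom connector could clip an episode's rewards in place via `set`:
rewards = LookbackBuffer([5.0, -3.0, 0.5], lookback=1)  # 5.0 is lookback data
for i in range(2):  # indices 0 and 1 are the current episode's rewards
    rewards.set(i, max(-1.0, min(1.0, rewards.get(i))))
```

Note that the lookback portion stays untouched; only indices at or after global index 0 are rewritten, which mirrors why the buffer class itself had to grow a set API.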

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>
@@ -623,40 +559,6 @@ def stop(self):
# Close our env object via gymnasium's API.
self.env.close()

# TODO (sven): Replace by default "to-env" connector.
Contributor Author
Not necessary anymore here in EnvRunner.

This is default ModuleToEnv connector behavior now.
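As a rough illustration of what a default module-to-env sampling step does (draw an action per row of model logits and record its log-probability), here is a self-contained sketch in plain Python. The function name and the "action_logits"/"action_logp" keys are hypothetical, not RLlib's actual connector API.

```python
# Illustrative sketch of a default "module-to-env" sampling step.
import math
import random

def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_actions(data, rng=random.Random(0)):
    """Sample one action per row of logits; store actions and their logp."""
    actions, logps = [], []
    for logits in data["action_logits"]:
        probs = softmax(logits)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        actions.append(action)
        logps.append(math.log(probs[action]))
    data["actions"] = actions
    data["action_logp"] = logps
    return data

out = sample_actions({"action_logits": [[0.0, 0.0], [10.0, 0.0]]})
```

Moving this step out of the EnvRunner and into a connector is what lets the runner shed the sampling/logp logic mentioned in the PR description.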

@sven1977 sven1977 changed the title [RLlib] EnvRunners support ConnectorV2 API: Changes in SingleAgentEpisode & SingleAgentEnvRunner. [RLlib] New ConnectorV2 API #06: Changes in SingleAgentEpisode & SingleAgentEnvRunner. Jan 10, 2024
…runner_support_connectors_06_small_changes_on_env_runner_and_episode
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Comment on lines +164 to +172
# TODO (sven): Convert data to proper tensor formats, depending on framework
# used by the RLModule. We cannot do this right now as the RLModule does NOT
# know its own device. Only the Learner knows the device. Also, on the
# EnvRunner side, we assume that it's always the CPU (even though one could
# imagine a GPU-based EnvRunner + RLModule for sampling).
# if rl_module.framework == "torch":
# data = convert_to_torch_tensor(data, device=??)
# elif rl_module.framework == "tf2":
# data =
Contributor
uncomment or remove?

Contributor Author
It's a TODO on an open question, with the possible code solution commented out, so I'll leave this in. We need to unify this behavior (numpy to tensor) across all connector types in the near future to avoid user confusion.
The blocker right now is that an RLModule does not know its own device today (only Learners do (GPU or CPU), and EnvRunners assume they are always on the CPU). Thus, connectors have no means to perform this conversion step properly.
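The framework dispatch this TODO describes could look roughly like the sketch below. This is an assumption-laden illustration: the function name, the dict-based batch format, and the device parameter are hypothetical, and the open device question from the comment above is simply left as a plain argument.

```python
# Hypothetical sketch of the numpy-to-tensor conversion step discussed above.
# `convert_for_module`, the batch format, and the device handling are all
# illustrative assumptions, not merged RLlib code.

def convert_for_module(data, framework, device="cpu"):
    """Convert a dict of array-like values to the module's tensor format."""
    if framework == "torch":
        import torch  # imported lazily so other paths need no torch install
        return {k: torch.as_tensor(v, device=device) for k, v in data.items()}
    if framework == "tf2":
        import tensorflow as tf
        return {k: tf.convert_to_tensor(v) for k, v in data.items()}
    # No (or unknown) framework: hand the data through unchanged.
    return data

batch = convert_for_module({"obs": [1.0, 2.0]}, framework="none")
```

Once an RLModule can report its own device, a connector could call a helper like this at the end of the env-to-module pipeline with the correct device filled in.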

rl_module=self.module,
episodes=self._episodes,
explore=explore,
# persistent_data=None, #TODO
Contributor
Is there a TODO here? What is the todo exactly?

data=to_env,
episodes=self._episodes,
explore=explore,
# persistent_data=None, #TODO
Contributor
same thing

Signed-off-by: sven1977 <svenmika1977@gmail.com>
…runner_support_connectors_06_small_changes_on_env_runner_and_episode
Signed-off-by: sven1977 <svenmika1977@gmail.com>
Signed-off-by: sven1977 <svenmika1977@gmail.com>
…runner_support_connectors_06_small_changes_on_env_runner_and_episode
Signed-off-by: sven1977 <svenmika1977@gmail.com>
@sven1977 sven1977 merged commit 6da2636 into ray-project:master Jan 12, 2024
9 checks passed
vickytsang pushed a commit to ROCm/ray that referenced this pull request Jan 12, 2024
Labels
rllib (RLlib related issues), rllib-newstack
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants