[RLlib] Make actions sent by RLlib to the env immutable. #24262

Merged
merged 4 commits into from
Apr 29, 2022

simonsays1980
Collaborator

@simonsays1980 simonsays1980 commented Apr 27, 2022

Why are these changes needed?

Users can mutate actions inside the environment's step() function, which produces errors that are hard to trace. A better solution is to warn users not to mutate the actions directly. This PR does so by setting the numpy flag WRITEABLE to False in the _env_runner() function when calling the sampler.
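
To illustrate the effect of the flag, here is a minimal sketch outside RLlib. The `step` function and the sample action are hypothetical stand-ins; the only part taken from this PR is the use of `ndarray.setflags(write=False)`.

```python
import numpy as np

# Hypothetical action, standing in for what the sampler sends to the env.
action = np.array([0.1, -0.3, 0.7], dtype=np.float32)

# Clear the WRITEABLE flag so that in-place changes raise an error.
action.setflags(write=False)


def step(action):
    """Hypothetical env step that (wrongly) tries to edit the action in place."""
    try:
        action[0] = 0.0
    except ValueError as err:
        # numpy refuses the write and reports that the array is read-only.
        return str(err)
    return None


message = step(action)
print(message)
```

A user who needs a modified action can still work on a copy, for example `action.copy()`, which is writeable again.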

The previous PR I sent failed many tests because actions can be of different types depending on the action space. This PR now makes any type of action immutable, using tree.map_structure() and Python's MappingProxyType to make dicts immutable.
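
As a small illustration of the dict part, the following sketch shows how `MappingProxyType` behaves on its own. The action keys are made up for the example and nothing here is RLlib code.

```python
from types import MappingProxyType

# Hypothetical dict action, as a Dict action space might produce.
action = {"move": 1, "turn": 0}

# MappingProxyType wraps the dict in a read-only view.
frozen = MappingProxyType(action)

try:
    frozen["move"] = 2
    error = None
except TypeError as err:
    # Item assignment is not supported on the proxy.
    error = err

# The proxy is a view, not a copy: changes made through the original
# dict remain visible, so the original reference should not be handed out.
action["turn"] = 5
print(type(error).__name__, frozen["move"], frozen["turn"])
```

Because the proxy is only a view, immutability holds as long as the env receives the proxy and not the underlying dict.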

Related issue number

#23890

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@simonsays1980
Collaborator Author

@gjoliver

As said, the earlier errors occurred because actions from different action spaces can be of non-numpy type. In that case the structure has to be accounted for. This PR should now work for any action structure: MappingProxyType makes dicts immutable, setflags(write=False) deals with numpy.ndarrays, and tuples and other basic types such as int and float are immutable by nature.
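
The per-type handling described above could be sketched roughly as follows. This is a plain recursive helper written for illustration, not the code merged in this PR (which relies on the tree package); the name `freeze_action` and the sample action are assumptions.

```python
from types import MappingProxyType

import numpy as np


def freeze_action(obj):
    """Hypothetical helper: return a read-only version of a nested action."""
    if isinstance(obj, np.ndarray):
        # Arrays are flagged in place as non-writeable.
        obj.setflags(write=False)
        return obj
    if isinstance(obj, dict):
        # Dicts are wrapped in a read-only proxy, children first.
        return MappingProxyType({k: freeze_action(v) for k, v in obj.items()})
    if isinstance(obj, tuple):
        # Tuples are immutable already, but their items may not be.
        return tuple(freeze_action(item) for item in obj)
    # int, float and similar scalars are immutable by nature.
    return obj


action = {"discrete": 2, "continuous": (np.zeros(2), np.ones(3))}
frozen = freeze_action(action)

try:
    frozen["continuous"][0][0] = 1.0
    array_error = None
except ValueError as err:
    array_error = err
```

Writing to the nested array raises a `ValueError`, and assigning a key on `frozen` raises a `TypeError`, so both container and leaf mutations are reported to the user.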

@@ -1250,6 +1250,20 @@ def _process_policy_eval_results(
episode._set_last_action(agent_id, action)

assert agent_id not in actions_to_send[env_id]
# Flag actions as immutable to notify the user when trying to change it
# and to avoid hardly traceable errors.
def make_action_immutable(obj):
Member

this looks awesome. what do you think we move this into rllib/utils/numpy.py?
I will try to reuse it for some connectors.

Collaborator Author

@gjoliver Thanks! Yes, I can do this. Regarding the connectors, the dataclass decorator could also become interesting, as with frozen=True it makes instances immutable and generates a matching __hash__() function. I was thinking about using it here, but it would have been overkill.
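
For reference, a minimal sketch of what an immutable dataclass looks like in plain Python. The `ActionRecord` class and its fields are hypothetical and only illustrate the decorator; they are not part of RLlib or of the connectors.

```python
from dataclasses import FrozenInstanceError, dataclass


# frozen=True blocks attribute assignment after construction; together
# with the default eq=True it also generates a field-based __hash__().
@dataclass(frozen=True)
class ActionRecord:
    agent_id: str
    value: int


record = ActionRecord("agent_0", 3)

try:
    record.value = 4
    error = None
except FrozenInstanceError as err:
    error = err

print(type(error).__name__, record.value)
```

Note that freezing is shallow: a mutable field such as a numpy array would still need its own write protection.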

… to tree.traverse as the former function does not include the containing object, but only the contained ones.
Member

@gjoliver gjoliver left a comment

thanks, awesome tech :)

@sven1977
Contributor

Awesome PR @simonsays1980 ! Thanks for going the extra mile here after our concerns about the first "deepcopy" solution.
Really appreciate this. Looks super neat with the tests and everything.

@sven1977 sven1977 changed the title from "Make action immutable" to "[RLlib] Make actions sent by RLlib to the env immutable." Apr 29, 2022
@sven1977 sven1977 merged commit ff575ee into ray-project:master Apr 29, 2022
@krfricke krfricke mentioned this pull request Apr 29, 2022
6 tasks
krfricke added a commit that referenced this pull request Apr 29, 2022
#24262 broke linting. This fixes this.