[BugFix] ActionMask is compatible with composite action specs #3022
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3022
Note: Links to docs will display an error until the docs builds have been completed.
❌ 5 New Failures, 21 Pending, 22 Unrelated Failures as of commit 32c62e4 with merge base 5a13341.
NEW FAILURES: the following jobs have failed.
FLAKY: the following jobs failed but were likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
>>> env = TransformedEnv(base_env, ActionMask())
>>> r = env.rollout(10)
Already in the lines above.
This transform is useful to ensure that randomly generated actions
respect legal actions, by masking the action specs.
It reads the mask from the input tensordict after the step is executed,
and adapts the mask of the finite action spec.
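The behavior the docstring describes can be sketched with a toy finite spec whose random sampling honors a boolean mask. This is a hypothetical, self-contained illustration (the names `ToyCategorical`, `update_mask`, and `rand` here mimic the idea only, not the actual torchrl classes):

```python
# Illustrative sketch only, not the torchrl implementation: a toy finite
# spec whose random sampling honors a boolean mask over the action set.
import random

class ToyCategorical:
    def __init__(self, n):
        self.n = n
        self.mask = [True] * n  # all actions legal by default

    def update_mask(self, mask):
        # The mask must cover every action and allow at least one of them.
        assert len(mask) == self.n and any(mask)
        self.mask = list(mask)

    def rand(self):
        # Sample uniformly among the currently legal actions only.
        legal = [i for i, ok in enumerate(self.mask) if ok]
        return random.choice(legal)

spec = ToyCategorical(4)
spec.update_mask([True, False, False, True])
samples = {spec.rand() for _ in range(100)}
print(sorted(samples))  # only the unmasked actions 0 and 3 are drawn
```

After each step, the real transform re-reads the mask and calls the spec's mask update, so `env.rand_action()`-style sampling keeps respecting the current legal set.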
Took the occasion to refine the docstring a bit.
action_spec = self.container.full_action_spec
keys = self.container.action_keys
if len(keys) == 1:
    action_spec = action_spec[keys[0]]
Tbh I could not really see why this check was in place. I hope I did not miss any Chesterton's fence!
It looks good. There is some confusion in the existing code about in_keys and action_keys, but this seems to be aligned with what we want.
We should maybe have a validation step when we register a transform in an env. For example here, when we set the container on the transform, we should check that the action_key arg is part of the action_keys of the parent env.
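The suggested validation step could look roughly like this. The helper name and call site are hypothetical, not part of the torchrl API; this only illustrates the check being proposed:

```python
# Hypothetical sketch of the validation suggested above: when a transform
# is registered, check that its action_key is known to the parent env.
def validate_action_key(transform_action_key, parent_action_keys):
    """Raise if the transform's action_key is not a parent action key."""
    if transform_action_key not in parent_action_keys:
        raise ValueError(
            f"action_key {transform_action_key!r} is not among the parent "
            f"env's action_keys {parent_action_keys}"
        )

# Parent envs may expose plain and nested (tuple) action keys alike.
parent_keys = {"action", ("agents", "action")}
validate_action_key("action", parent_keys)  # passes silently
```

Running this check once, when the container is set on the transform, would surface a misconfigured `action_key` immediately instead of at rollout time.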
action_spec = self.action_spec
mask = tensordict.get(self.in_keys[1], None)
if mask is not None:
    mask = mask.to(action_spec.device)
action_spec.update_mask(mask)
# TODO: Check that this makes sense
with _set_missing_tolerance(self, True):
    tensordict_reset = self._call(tensordict_reset)
return tensordict_reset
I think this entire logic is already handled by _call
Force-pushed from 630b39d to 32c62e4.
LGTM thanks!
Description
ActionMask was not compatible with composite action specs because of the lines quoted in the diff above, which as far as I understand do not serve a real purpose. This PR removes this check, and allows using ActionMask to mask one factor of a composite action spec.
Types of changes
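What "masking one factor of a composite action spec" means can be shown with a toy, pure-Python sketch. Everything here (`rand_action`, the dict-shaped spec) is hypothetical and only illustrates the intent of the fix, not the torchrl implementation:

```python
# Illustrative toy: with a composite (dict) action spec, only the masked
# factor is constrained; any factor without a mask keeps its full range.
import random

def rand_action(composite_spec, masks):
    """Sample one value per action key, honoring a mask where given."""
    action = {}
    for key, n in composite_spec.items():
        mask = masks.get(key, [True] * n)  # unmasked factors stay fully legal
        legal = [i for i in range(n) if mask[i]]
        action[key] = random.choice(legal)
    return action

spec = {"move": 4, "jump": 2}                 # two discrete action factors
masks = {"move": [True, False, True, False]}  # mask only the "move" factor

moves = {rand_action(spec, masks)["move"] for _ in range(200)}
print(sorted(moves))  # only the unmasked "move" indices are ever drawn
```

Before this PR, the single-key check in the transform meant a composite spec like the one above could not be partially masked; after it, the transform can target just the masked key.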
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!