Bypass out-of-sync Gym registry in SubprocVecEnv by resolving EnvSpec #160
Confirmed that the new test fails before this PR.
7431fb4 to f527f0f
Just to flag I'm still not sure about this particular explanation, although I agree with the overall conclusion that it's a bad interaction between Admittedly this doesn't explain why it works in Ray local mode. My guess is that the key difference is that when using Ray the worker is executing a pickled function rather than native code. This plausibly would cause a difference in how the helper method is pickled and sent to the (We probably don't need to get to the bottom of this if the Also if you haven't seen it https://codewithoutrules.com/2018/09/04/python-multiprocessing/ is a fun discussion of issues with
Not sure about the explanation in the comment. Otherwise LGTM.
src/imitation/util/util.py
Outdated
# Previously, we directly called `gym.make(env_name)`.
#
# That direct approach was problematic (especially in combination with Ray)
# because the forkserver from which subprocesses are forked might have been
Are we sure this is what's going on? It's fine as a hypothesis but I don't want to immortalize in a comment something we're not sure about.
I'm 80% confident the `forkserver` has a very minimal state, since:
- The Python `multiprocessing` docs say "No unnecessary resources are inherited".
- The code seems to use `spawn` to start the forkserver, which starts a new Python process.
- This blog states "Note that children retain a copy of the forkserver state. This state is intended to be relatively simple, but it is possible to adjust this through the multiprocess API through the set_forkserver_preload() method."

With that said, there does seem to be some logic to preload modules (by default, I think, the `__main__` module, i.e. the entrypoint to the script). So IIUC, `forkserver` is intended to execute the code of the parent process (i.e. all imports), but should not execute an `if __name__ == '__main__'`-guarded block (Python docs explicitly state need to have this guard).

Seems plausible that starting `forkserver` in a Ray worker messes with the autodetection of what to import.
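
For reference, the preload mechanism mentioned in the quote can be sketched as follows. This is a minimal illustration and not code from this PR: `forkserver_context` and `extra_preload` are hypothetical names, and preloading `json` is an arbitrary stand-in for whatever module would register the custom environments. It assumes a Unix-like platform, since the `forkserver` start method is not available on Windows.

```python
import multiprocessing as mp


def forkserver_context(extra_preload=()):
    """Return a forkserver context that preloads `__main__` plus extra modules.

    Hypothetical helper for illustration. It has to be called before the
    forkserver process is launched, otherwise the preload list has no effect.
    """
    ctx = mp.get_context("forkserver")
    # `__main__` is listed explicitly because this call sets the whole
    # preload list rather than appending to an existing one.
    ctx.set_forkserver_preload(["__main__", *extra_preload])
    return ctx


# `json` is an arbitrary example module, not something this PR preloads.
ctx = forkserver_context(extra_preload=["json"])
start_method = ctx.get_start_method()
```

Processes created from the returned context (e.g. via `ctx.Process` or `ctx.Pool`) are forked from the forkserver, so they should see whatever the preloaded modules set up at import time. Whether this would have avoided the Ray interaction discussed here is untested.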
If we can't get to the bottom of this, may be better to leave the comment vague and cite this PR.
Agree that we should be vague about the cause in the comments and note the PR, I've made these changes.
Codecov Report
@@ Coverage Diff @@
## master #160 +/- ##
=========================================
Coverage ? 87.64%
=========================================
Files ? 64
Lines ? 4542
Branches ? 0
=========================================
Hits ? 3981
Misses ? 561
Partials ? 0
Two minor suggestions: expanding documentation and removing unused argument
src/imitation/util/util.py
Outdated
@@ -49,15 +49,30 @@ def make_vec_env(env_name: str,
max_episode_steps: If specified, wraps VecEnv in TimeLimit wrapper with
this episode length before returning.
this episode length before returning.
this episode length before returning. Otherwise, defaults to `max_episode_steps` for `env_name` in the Gym registry.
I expanded this comment a bit more in 902fe96. Wanted to note that the gym registry total timesteps thing is default behavior for `gym.make`.
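
To make the fallback concrete, here is a minimal sketch of the behavior the docstring describes. `TOY_REGISTRY`, the environment names, and `effective_max_episode_steps` are all hypothetical and only for illustration; the real lookup goes through the Gym registry, not a dict.

```python
# Hypothetical stand-in for the Gym registry: env name -> registered
# max_episode_steps (None means no registered limit).
TOY_REGISTRY = {"ToyAnt-v0": 1000, "ToyUnbounded-v0": None}


def effective_max_episode_steps(env_name, max_episode_steps=None):
    """Return the episode length limit that would apply to `env_name`.

    An explicit `max_episode_steps` wins; otherwise fall back to the value
    registered for `env_name` (which may itself be None).
    """
    if max_episode_steps is not None:
        return max_episode_steps
    return TOY_REGISTRY[env_name]


default_limit = effective_max_episode_steps("ToyAnt-v0")
override_limit = effective_max_episode_steps("ToyAnt-v0", max_episode_steps=50)
no_limit = effective_max_episode_steps("ToyUnbounded-v0")
```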
Co-Authored-By: Adam Gleave <adam@gleave.me>
LGTM, can merge once CI passes.
We ran into an unusual problem when using `imitation.scripts.parallel` to parallelize training of our custom ant environment. The subprocesses created by `imitation.util.make_vec_env` reported that all our custom environments weren't registered, but when we ran `imitation.scripts.parallel`. During debugging, we found that it didn't help to forcibly reregister the environments in scope of the stack trace before the inner worker function to `SubprocVecEnv`. (Even forcibly reregistering environments inside `make_vec_env` itself did not suffice.)

The most likely explanation is that the Python forkserver used by `SubprocVecEnv` was somehow prematurely initialized by `ray` so that new processes forked from the forkserver didn't include our custom environments inside the Gym registry. (edit: We are actually quite unsure if this is the cause of the bug. This is just a hypothesis.)

h/t: @AdamGleave for the idea to resolve the custom environment's `EnvSpec` before initializing the `SubprocVecEnv`.
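
The idea can be illustrated with a toy, stdlib-only sketch. Nothing here is the Gym or `imitation` API: `REGISTRY`, `ToySpec`, `ToyEnv`, `register`, and the two `make_env_fn_*` helpers are hypothetical stand-ins, and clearing the dict only simulates a worker whose registry is out of sync.

```python
from dataclasses import dataclass, field

# Toy stand-in for the Gym registry (hypothetical, not the gym API).
REGISTRY = {}


class ToyEnv:
    """Placeholder environment used only for this illustration."""

    def __init__(self, size=1):
        self.size = size


@dataclass
class ToySpec:
    """Everything needed to build an environment, independent of REGISTRY."""

    env_id: str
    entry_point: type
    kwargs: dict = field(default_factory=dict)

    def make(self):
        return self.entry_point(**self.kwargs)


def register(env_id, entry_point, **kwargs):
    REGISTRY[env_id] = ToySpec(env_id, entry_point, kwargs)


def make_env_fn_by_name(env_id):
    """Name-based approach: the worker looks the name up in its own registry."""
    return lambda: REGISTRY[env_id].make()


def make_env_fn_by_spec(env_id):
    """Spec-based approach: resolve the spec up front, then close over it."""
    spec = REGISTRY[env_id]  # resolved while the registry is still populated
    return lambda: spec.make()


register("ToyAnt-v0", ToyEnv, size=3)
by_name = make_env_fn_by_name("ToyAnt-v0")
by_spec = make_env_fn_by_spec("ToyAnt-v0")

# Simulate a worker whose registry lacks the custom environment.
REGISTRY.clear()

env = by_spec()  # works: the spec travelled with the function
try:
    by_name()
    name_lookup_failed = False
except KeyError:
    name_lookup_failed = True
```

In a real worker the function and the captured spec would also have to survive being sent to the subprocess, which this sketch does not exercise.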