Bugfix: Allow Nesting of Sync/Async VectorEnvs #2104

Closed
wants to merge 19 commits

Conversation


@lebrice lebrice commented Nov 20, 2020

Allows for the nesting of Async/Sync VectorEnvs

  • 2-level nesting of any combination of SyncVectorEnv/AsyncVectorEnv (i.e. Sync/Sync, Sync/Async, Async/Sync, Async/Async*)
  • Tested with the "CubeCrash-v0" and "CartPole-v0" environments

*: When nesting Async/Async environments, only the innermost env can have daemon=True, since daemonic processes cannot have children.

Somewhat related to #2072 (based on a suggestion from @tristandeleu). A new Wrapper could also eventually be introduced in another PR to "unchunk" the observations/actions/rewards/dones (i.e. flatten the first two batch dimensions).

There currently aren't any tests specifically for the handling of Tuple/Dict/non-standard observation/action spaces. I could add some if those cases aren't already covered by test_[sync/async]_vector_env.py.
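
For illustration, here is a minimal usage sketch of the nesting this PR enables (a hypothetical example, not code from the PR's tests; the env id and the 2x3 layout are arbitrary):

```python
import gym
from gym.vector import AsyncVectorEnv, SyncVectorEnv

def make_inner():
    # Inner level: a SyncVectorEnv over 3 copies of the base env.
    return SyncVectorEnv([lambda: gym.make("CartPole-v0") for _ in range(3)])

# Outer level: an AsyncVectorEnv over 2 of those SyncVectorEnvs (Async/Sync nesting).
env = AsyncVectorEnv([make_inner for _ in range(2)])

obs = env.reset()
# Observations/rewards/dones are batched along two leading dimensions: (2, 3, ...).
obs, rewards, dones, infos = env.step(env.action_space.sample())
env.close()
```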

Signed-off-by: Fabrice Normandin fabrice.normandin@gmail.com

@jkterry1
Collaborator

Reviewer: @vwxyzjn

@vwxyzjn
Contributor

vwxyzjn commented Aug 5, 2021

@lebrice thanks for contributing the PR. Would you mind fixing the linting error? Also, would you mind giving a short code sample to test it out? Maybe it's just me, but I had a hard time understanding the partial(...)-related code in test_vector_env.py.

@jkterry1
Collaborator

@lebrice could you please fix the tests?

@jkterry1
Collaborator

@lebrice and fix the merge conflicts?

@lebrice
Author

lebrice commented Sep 30, 2021

Yes, will do! I'm currently in the rush for ICLR.

@pseudo-rnd-thoughts
Contributor

@lebrice Do you have any plans for this PR?

@lebrice
Author

lebrice commented Apr 17, 2022

Hey @pseudo-rnd-thoughts!
Yeah, I just rebased and pushed, so I consider this ready for review again.
If you want to take your revenge and roast my PR, be my guest! :P

@pseudo-rnd-thoughts
Contributor

Thanks @lebrice, could you fix the lint issue?
The PR looks fine and I can't see any issues; however, could you explain an application and use case of this PR?
Plus why should this be in gym and not in a separate project that provides this extension?

@lebrice
Author

lebrice commented Apr 21, 2022

Could you explain an application and use case of this PR?

This can be used to do chunking.

Chunking means having multiple sequential operations within each asynchronous process. This is useful for reducing the overhead of multiprocessing and allows users to use large batch sizes even with a limited number of cores. In this case, it means being able to use a very large number of vectorized environments.

For example, in my experience, with only a small number of CPUs (e.g. 4), laptops start struggling really hard when using pure AsyncVectorEnvs with batch sizes of ~32 or more, even for simple envs like CartPole, since that involves creating 32 Python processes.
With this fix, you can instead create just one worker process per CPU, i.e. an AsyncVectorEnv with num_envs = 4, where each worker steps through a SyncVectorEnv with num_envs = 8.
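
For concreteness, a sketch of that configuration (hypothetical code assuming this PR's changes; the names n_workers and envs_per_worker are purely illustrative):

```python
import gym
from gym.vector import AsyncVectorEnv, SyncVectorEnv

n_workers = 4        # one worker process per CPU
envs_per_worker = 8  # stepped sequentially inside each worker process

def make_chunk():
    return SyncVectorEnv([lambda: gym.make("CartPole-v0") for _ in range(envs_per_worker)])

# 4 processes x 8 sequential envs = an effective batch of 32 environments,
# using only 4 Python processes instead of 32.
env = AsyncVectorEnv([make_chunk for _ in range(n_workers)])
obs = env.reset()
obs, rewards, dones, infos = env.step(env.action_space.sample())
env.close()
```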

Plus why should this be in gym and not in a separate project that provides this extension?

Because this is more of a bugfix than an extension, and it's an improvement to the existing components in gym, not a new component.

Hope that clears it up, thanks for taking a look!

@lebrice
Author

lebrice commented Apr 25, 2022

Hey @pseudo-rnd-thoughts @vwxyzjn @jkterry1 there seems to be a bug in the test setup here: https://github.com/openai/gym/runs/6114261693?check_suite_focus=true#step:4:96

Looks like a pygame error in an unrelated test, due to there not being a display? Not sure why it would have passed for Python 3.8 and not Python 3.9, though.

Otherwise I think this is good to go on my end.


assert batch_size(env.action_space) == n_outer_envs

with env:
Contributor

Why are we using the environment as a context manager here?

Author
@lebrice lebrice Apr 26, 2022

So it gets closed and the resources are freed.
This isn't absolutely necessary, since the env going out of scope should have the same effect.
I like it since it's more explicit, and it also made it easier for me to spot errors when closing the nested AsyncVectorEnvs.

Contributor

Ok, that makes sense. My issue is that this is a code style we don't follow anywhere else in the tests, so could you just have an env.close() at the end of the test to make it simpler for anyone looking at the tests?

Author
@lebrice lebrice Apr 26, 2022

Sure, no problem.
Fixed in f47596f .

However, just for the record, I disagree:
I think having to remember to close the env at the end of the test is ugly, error-prone, and also unnecessary, since in all cases the env is closed when it goes out of scope. In my opinion, the only tests where an env should be closed explicitly are 1) the tests related to closing envs, and 2) tests that require creating a temporary environment just to check spaces (as is the case here).

For example, consider this:
https://github.com/lebrice/gym/blob/49ee20904ac3a4a1dba3020d1ebd11076848f376/tests/vector/test_vector_env.py#L52-L54
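
As a side illustration of the two styles being discussed, a small hedged sketch (a hypothetical test body, not code from this PR):

```python
import gym
from gym.vector import SyncVectorEnv

def make_env():
    return SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])

# Style 1: context manager; the env is closed (and its resources freed) on exit.
with make_env() as env:
    assert env.num_envs == 2

# Style 2: explicit close() at the end, the style preferred in the gym test suite.
env = make_env()
assert env.num_envs == 2
env.close()
```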

@pseudo-rnd-thoughts
Contributor

@lebrice that is a strange bug, as I can't replicate it locally, either from the terminal or from a docker environment.
Have you merged with master? This bug has not occurred for anyone else; could you make a minor change to rerun the CI and see if that affects it?

@lebrice
Author

lebrice commented Apr 26, 2022

I believe you can re-run the workflows that failed from the Actions view, @pseudo-rnd-thoughts. This would be better IMO than me making an empty commit.
If you can't see a "re-run failed workflows" button on the top right, then let me know and I'll push a little commit.

import numpy as np
import pytest

from gym.spaces import Tuple
from gym import Space, spaces
Contributor
@pseudo-rnd-thoughts pseudo-rnd-thoughts Apr 26, 2022

Could you remove spaces, as I don't think you should need to import the module, and add the following line instead

Author
@lebrice lebrice Apr 26, 2022

@pseudo-rnd-thoughts
Contributor

@lebrice Additionally, could you update the gym documentation to explain this change?
https://github.com/Farama-Foundation/gym-docs

@lebrice
Author

lebrice commented Apr 26, 2022

Hey @pseudo-rnd-thoughts. Sure, I guess we could mention that this is now possible (or that the bug has been fixed, depending on your perspective) in the "advanced" portion of the VectorEnv API documentation.

However, I think it's probably best that we hold off until #2072, so that we don't encourage users to use chunking manually, and instead properly document the new VectorEnv subclass in the docs.

Does that sound reasonable?

@pseudo-rnd-thoughts
Contributor

Hi @lebrice, that sounds reasonable to me. It's just that, at some point, we will need to update the vector API for these changes.

@lebrice
Author

lebrice commented Apr 27, 2022

Hey @pseudo-rnd-thoughts, can we run the workflows again?

@pseudo-rnd-thoughts
Contributor

Hey @lebrice, will do. FYI, it will probably be the weekend before this gets fully merged, I hope, as @RedTachyon is the other reviewer for this PR and is at a conference all week.

@lebrice
Author

lebrice commented Apr 28, 2022

No rush, thanks @pseudo-rnd-thoughts !

@RedTachyon
Contributor

Can you explain exactly the interaction between this and #2072? As I understand it, this PR would enable an underlying mechanism for env batching, and then we would want users to directly use the solution implemented in #2072?

@@ -635,6 +635,16 @@ def _worker(index, env_fn, pipe, parent_pipe, shared_memory, error_queue):
assert shared_memory is None
env = env_fn()
parent_pipe.close()

def step_fn(actions):
Contributor

Why are we defining this function? It doesn't seem to be called anywhere. If it's actually used, it'd need a type hint and some comments/docstring

Author

Fixed in fcebd76

Author

Oops. Actually fixed in 15208e3

if isinstance(env, VectorEnv):
# VectorEnvs take care of resetting the envs that are done.
pass
elif done:
Contributor

Would it maybe be better to just do a single condition if done and isinstance(env, Env) or if done and not isinstance(env, VectorEnv)? (not sure which would be clearer, but the pass gave me a bit of a pause since it's rarely present in released code, and it felt more like something unfinished)
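
A minimal sketch of the suggested single-condition form (the helper name needs_episode_end_handling is hypothetical, introduced only for illustration):

```python
from gym import Env
from gym.vector import VectorEnv

def needs_episode_end_handling(env: Env, done: bool) -> bool:
    # Sketch of the reviewer's suggested single condition: only plain (non-vector)
    # envs need the end-of-episode handling here, since VectorEnvs take care of
    # resetting their own sub-envs when they are done.
    return done and not isinstance(env, VectorEnv)
```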

Author

Fixed in 055087b

self._rewards = np.zeros((self.num_envs,), dtype=np.float64)
self._dones = np.zeros((self.num_envs,), dtype=np.bool_)
shape = (self.num_envs,)
if isinstance(self.envs[0].unwrapped, VectorEnv):
Contributor

Is it ever possible that this condition would be different for self.envs[0] and self.envs[1]? Also, do we actually need to unwrap it? (I'm not sure off the top of my head how wrappers interact with VectorEnvs)

Author
@lebrice lebrice May 2, 2022

I don't see how this would only be true for some envs. It would mean that someone is passing a mix of Envs and VectorEnvs to a VectorEnv, which I can't imagine being useful.

Yeah, unwrapping it is necessary, since most wrappers can work the same way for both Envs and VectorEnvs.

if isinstance(env, VectorEnv):
# VectorEnvs take care of resetting the envs that are done.
pass
elif done:
info["terminal_observation"] = observation
Contributor

Do we actually need this with the AutoReset wrapper introduced recently? And wouldn't it cause a redundant double reset in some cases?

Author

Yeah, this is still necessary, since the if done check doesn't work with VectorEnvs. This check is there so that we don't reset the env when the episode is done (as is currently still done in the case of a single env here).
Not sure if/how the AutoReset wrapper relates to this.
Let me know if that wasn't clear.

Author

Fixed in 055087b

@@ -58,3 +62,113 @@ def test_custom_space_vector_env():

assert isinstance(env.single_action_space, CustomSpace)
assert isinstance(env.action_space, Tuple)


@pytest.mark.parametrize("base_env", ["Pendulum-v1", "CartPole-v1"])
Contributor

Why parametrize only over the two envs instead of all of them?

Author

You mean over all gym envs?
Sure, I'm all for it, but the current VectorEnv tests (i.e. the only test above) only use CartPole-v1.

Author
@lebrice lebrice May 2, 2022

Parametrized the test with all classic_control + toy_text envs in 9c0e308

Author

Hey @RedTachyon, is this good (testing with all envs where should_skipp_env_spec_for_test = False)?
As-is, the parametrization of this test generates 579 tests, which take about 1:20 to run on my end (with pytest-xdist and 4 parallel workers). I think this will probably take something like 5 minutes to run on GitHub, depending on the machine's hardware.

(here's a link for that function btw: https://github.com/lebrice/gym/blob/e913bc81b83b43ae8ca9b3a02c981b74d31017ea/tests/envs/spec_list.py#L20)
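
For reference, a hedged sketch of what such a parametrized nesting test could look like (hypothetical; the env ids and the tiny 2x2 layout are illustrative, not the actual test from the PR, which reportedly derives its env list from the registry):

```python
import gym
import pytest
from gym.vector import AsyncVectorEnv, SyncVectorEnv

# Illustrative subset of classic_control / toy_text env ids.
ENV_IDS = ["CartPole-v1", "Pendulum-v1", "MountainCar-v0", "Acrobot-v1", "FrozenLake-v1", "Taxi-v3"]

@pytest.mark.parametrize("env_id", ENV_IDS)
def test_nested_vector_env(env_id):
    # 2 worker processes, each stepping 2 sequential copies of the env.
    env = AsyncVectorEnv(
        [lambda: SyncVectorEnv([lambda: gym.make(env_id) for _ in range(2)]) for _ in range(2)]
    )
    env.reset()
    env.step(env.action_space.sample())
    env.close()
```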

@pseudo-rnd-thoughts
Contributor

@lebrice Could you fix the CI issues?

@lebrice
Author

lebrice commented May 18, 2022

Hey there @pseudo-rnd-thoughts ! Yes, will do!

@lebrice
Author

lebrice commented May 18, 2022

Hey @pseudo-rnd-thoughts I think I fixed everything on my end, but I'd need your approval so the CI hooks can run.

@pseudo-rnd-thoughts
Contributor

@lebrice There still seems to be an issue with the PR. Also, looking at the runtime, the CI has jumped from around 1-2 minutes to 10+ minutes. Could you investigate why your PR is causing the time to increase?

@lebrice
Author

lebrice commented May 19, 2022

@lebrice There still seems to be an issue with the PR. Also, looking at the runtime, the CI has jumped from around 1-2 minutes to 10+ minutes. Could you investigate why your PR is causing the time to increase?

Yes, that's what I was mentioning here:

#2104 (comment)

I must have misinterpreted what he meant by "using all of them" (gym envs). In any case, I'd be happy to tone it down if that's not what he meant.

@pseudo-rnd-thoughts
Contributor

Sorry, I hadn't seen that comment. Personally, I think that is overkill and we could test with a single env type, just to see if it works. I will message @RedTachyon about it

@pseudo-rnd-thoughts
Contributor

@lebrice In the meantime, could you fix the CI so it works with the current tests, as the number of environments can easily be modified?

@jkterry1 jkterry1 closed this May 23, 2022