Bugfix: Allow Nesting of Sync/Async VectorEnvs #2104

Closed
wants to merge 19 commits

Conversation


@lebrice lebrice commented Nov 20, 2020

Allows for the nesting of Async/Sync VectorEnvs

  • 2-level nesting of any combination of SyncVectorEnv/AsyncVectorEnv (i.e. Sync/Sync, Sync/Async, Async/Sync, Async/Async*)
  • Tested with the "CubeCrash-v0" and "CartPole-v0" environments

*: When nesting Async/Async environments, only the innermost env can have daemon=True, since daemonic processes cannot have children.

Somewhat related to #2072 (based on a suggestion from @tristandeleu). A new Wrapper could also eventually be introduced in another PR to "unchunk" the observations/actions/rewards/dones (i.e. flatten the first two batch dimensions).

There currently aren't any tests specifically for the handling of Tuple/Dict/non-standard observation/action spaces. I could add some if those cases aren't already covered by test_[sync/async]_vector_env.py.
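
For illustration, here is a minimal usage sketch of the nesting this PR enables (a hypothetical example, not code from the PR's tests; the env id and the 2x3 layout are arbitrary):

```python
import gym
from gym.vector import AsyncVectorEnv, SyncVectorEnv

def make_inner():
    # Inner level: a SyncVectorEnv over 3 copies of the base env.
    return SyncVectorEnv([lambda: gym.make("CartPole-v0") for _ in range(3)])

# Outer level: an AsyncVectorEnv over 2 of those SyncVectorEnvs (Async/Sync nesting).
env = AsyncVectorEnv([make_inner for _ in range(2)])

obs = env.reset()
# Observations/rewards/dones are batched along two leading dimensions: (2, 3, ...).
obs, rewards, dones, infos = env.step(env.action_space.sample())
env.close()
```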

Signed-off-by: Fabrice Normandin fabrice.normandin@gmail.com

@jkterry1
Collaborator

Reviewer: @vwxyzjn

@vwxyzjn
Contributor

vwxyzjn commented Aug 5, 2021

@lebrice thanks for contributing the PR. Would you mind fixing the linting error? Also, would you mind giving a short code sample to test it out? Maybe it's just me, but I had a hard time understanding the partial(...)-related code in test_vector_env.py.

@jkterry1
Collaborator

@lebrice could you please fix the tests?

@jkterry1
Collaborator

@lebrice and fix the merge conflicts?

@lebrice
Author

lebrice commented Sep 30, 2021

Yes, will do! I'm currently in the rush for ICLR.

@pseudo-rnd-thoughts
Contributor

@lebrice Do you have any plans for this PR?

@lebrice
Author

lebrice commented Apr 17, 2022

Hey @pseudo-rnd-thoughts!
Yeah, I just rebased and pushed, so I consider this ready for review again.
If you want to take your revenge and roast my PR, be my guest! :P

@pseudo-rnd-thoughts
Contributor

Thanks @lebrice, could you fix the lint issue?
The PR looks fine and I can't see any issues; however, could you explain an application and use case of this PR?
Plus why should this be in gym and not in a separate project that provides this extension?

@lebrice
Author

lebrice commented Apr 21, 2022

Could you explain an application and use case of this PR?

This can be used to do chunking.

Chunking means having multiple sequential operations within each asynchronous process. This is useful for reducing the overhead of multiprocessing and allows users to use large batch sizes even with a limited number of cores. In this case, it means being able to use a very large number of vectorized environments.

For example, in my experience, with only a small number of CPUs (e.g. 4), laptops start struggling really hard when using pure AsyncVectorEnvs with batch sizes of ~32 or more, even for simple envs like CartPole, since that involves creating 32 Python processes.
With this fix, you can instead create just one worker process per CPU, i.e. an AsyncVectorEnv with num_envs = 4, where each worker steps through a SyncVectorEnv with num_envs = 8.
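
For concreteness, a sketch of that configuration (hypothetical code assuming this PR's changes; the names n_workers and envs_per_worker are purely illustrative):

```python
import gym
from gym.vector import AsyncVectorEnv, SyncVectorEnv

n_workers = 4        # one worker process per CPU
envs_per_worker = 8  # stepped sequentially inside each worker process

def make_chunk():
    return SyncVectorEnv([lambda: gym.make("CartPole-v0") for _ in range(envs_per_worker)])

# 4 processes x 8 sequential envs = an effective batch of 32 environments,
# using only 4 Python processes instead of 32.
env = AsyncVectorEnv([make_chunk for _ in range(n_workers)])
obs = env.reset()
obs, rewards, dones, infos = env.step(env.action_space.sample())
env.close()
```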

Plus why should this be in gym and not in a separate project that provides this extension?

Because this is more of a bugfix than an extension, and it's an improvement to the existing components in gym, not a new component.

Hope that clears it up, thanks for taking a look!

@lebrice
Author

lebrice commented Apr 25, 2022

Hey @pseudo-rnd-thoughts @vwxyzjn @jkterry1 there seems to be a bug in the test setup here: https://github.com/openai/gym/runs/6114261693?check_suite_focus=true#step:4:96

Looks like a pygame error in an unrelated test, due to there not being a display? Not sure why it would have passed for Python 3.8 and not Python 3.9, though.

Otherwise I think this is good to go on my end.


assert batch_size(env.action_space) == n_outer_envs

with env:
Contributor

Why are we using the environment as a context manager here?

Author
@lebrice lebrice Apr 26, 2022

So it gets closed and the resources are freed.
This isn't absolutely necessary, since the env going out of scope should have the same effect.
I like it since it's more explicit, and it also made it easier for me to spot errors when closing the nested AsyncVectorEnvs.

Contributor

Ok, that makes sense. My issue is that this is a code style we don't follow anywhere else in the tests, so could you just have an env.close() at the end of the test to make it simpler for anyone looking at the tests?

Author
@lebrice lebrice Apr 26, 2022

Sure, no problem.
Fixed in f47596f .

However, just for the record, I disagree:
I think having to remember to close the env at the end of the test is ugly, error-prone, and also unnecessary, since in all cases the env is closed when it goes out of scope. In my opinion, the only tests where an env should be closed explicitly are 1) the tests related to closing envs, and 2) tests that require creating a temporary environment just to check spaces (as is the case here).

For example, consider this:
https://github.com/lebrice/gym/blob/49ee20904ac3a4a1dba3020d1ebd11076848f376/tests/vector/test_vector_env.py#L52-L54
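
As a side illustration of the two styles being discussed, a small hedged sketch (a hypothetical test body, not code from this PR):

```python
import gym
from gym.vector import SyncVectorEnv

def make_env():
    return SyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(2)])

# Style 1: context manager; the env is closed (and its resources freed) on exit.
with make_env() as env:
    assert env.num_envs == 2

# Style 2: explicit close() at the end, the style preferred in the gym test suite.
env = make_env()
assert env.num_envs == 2
env.close()
```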

@pseudo-rnd-thoughts
Contributor

@lebrice that is a strange bug, as I can't replicate it locally, either from the terminal or from a docker environment.
Have you merged with master? This bug has not occurred for anyone else; could you make a minor change to rerun the CI and see if that affects it?

@lebrice
Author

lebrice commented Apr 26, 2022

I believe you can re-run the workflows that failed from the Actions view, @pseudo-rnd-thoughts. This would be better IMO than me making an empty commit.
If you can't see a "re-run failed workflows" button on the top right, then let me know and I'll push a little commit.

import numpy as np
import pytest

from gym.spaces import Tuple
from gym import Space, spaces
Contributor
@pseudo-rnd-thoughts pseudo-rnd-thoughts Apr 26, 2022

Could you remove spaces, as I don't think you should need to import the module, and add the following line instead

Author
@lebrice lebrice Apr 26, 2022

@pseudo-rnd-thoughts
Contributor

@lebrice Additionally, could you update the gym documentation to explain this change?
https://github.com/Farama-Foundation/gym-docs

@lebrice
Author

lebrice commented Apr 26, 2022

Hey @pseudo-rnd-thoughts. Sure, I guess we could mention that this is now possible (or that the bug has been fixed, depending on your perspective) in the "advanced" portion of the VectorEnv API documentation.

However, I think it's probably best that we hold off until #2072, so that we don't encourage users to use chunking manually, and instead properly document the new VectorEnv subclass in the docs.

Does that sound reasonable?

@pseudo-rnd-thoughts
Contributor

Hi @lebrice, that sounds reasonable to me. It's just that, at some point, we will need to update the vector API for these changes.

@lebrice
Author

lebrice commented Apr 27, 2022

Hey @pseudo-rnd-thoughts, can we run the workflows again?

@pseudo-rnd-thoughts
Contributor

Hey @lebrice, will do. FYI, it will probably be the weekend before this gets fully merged, I hope, as @RedTachyon is the other reviewer for this PR and is at a conference all week.

@lebrice
Author

lebrice commented Apr 28, 2022

No rush, thanks @pseudo-rnd-thoughts !

@RedTachyon
Contributor

Can you explain exactly the interaction between this and #2072? As I understand it, this PR would enable an underlying mechanism for env batching, and then we would want users to directly use the solution implemented in #2072?

@@ -635,6 +635,16 @@ def _worker(index, env_fn, pipe, parent_pipe, shared_memory, error_queue):
assert shared_memory is None
env = env_fn()
parent_pipe.close()

def step_fn(actions):
Contributor

Why are we defining this function? It doesn't seem to be called anywhere. If it's actually used, it'd need a type hint and some comments/docstring

Author

Fixed in fcebd76

Author

Oops. Actually fixed in 15208e3

if isinstance(env, VectorEnv):
# VectorEnvs take care of resetting the envs that are done.
pass
elif done:
Contributor

Would it maybe be better to just do a single condition if done and isinstance(env, Env) or if done and not isinstance(env, VectorEnv)? (not sure which would be clearer, but the pass gave me a bit of a pause since it's rarely present in released code, and it felt more like something unfinished)
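
A minimal sketch of the suggested single-condition form (the helper name needs_episode_end_handling is hypothetical, introduced only for illustration):

```python
from gym import Env
from gym.vector import VectorEnv

def needs_episode_end_handling(env: Env, done: bool) -> bool:
    # Sketch of the reviewer's suggested single condition: only plain (non-vector)
    # envs need the end-of-episode handling here, since VectorEnvs take care of
    # resetting their own sub-envs when they are done.
    return done and not isinstance(env, VectorEnv)
```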

Author

Fixed in 055087b

self._rewards = np.zeros((self.num_envs,), dtype=np.float64)
self._dones = np.zeros((self.num_envs,), dtype=np.bool_)
shape = (self.num_envs,)
if isinstance(self.envs[0].unwrapped, VectorEnv):
Contributor

Is it ever possible that this condition would be different for self.envs[0] and self.envs[1]? Also, do we actually need to unwrap it? (I'm not sure off the top of my head how wrappers interact with VectorEnvs)

Author
@lebrice lebrice May 2, 2022

I don't see how this would only be true for some envs. It would mean that someone is passing a mix of Envs and VectorEnvs to a VectorEnv, which I can't imagine being useful.

Yeah, unwrapping it is necessary, since most wrappers can work the same way for both Envs and VectorEnvs.

if isinstance(env, VectorEnv):
# VectorEnvs take care of resetting the envs that are done.
pass
elif done:
info["terminal_observation"] = observation
Contributor

Do we actually need this with the AutoReset wrapper introduced recently? And wouldn't it cause a redundant double reset in some cases?

Author

Yeah, this is still necessary, since the if done check doesn't work with VectorEnvs. This check is there so that we don't reset the env when the episode is done (as is currently still done in the case of a single env here).
Not sure if/how the AutoReset wrapper relates to this.
Let me know if that wasn't clear.

Author

Fixed in 055087b

@@ -58,3 +62,113 @@ def test_custom_space_vector_env():

assert isinstance(env.single_action_space, CustomSpace)
assert isinstance(env.action_space, Tuple)


@pytest.mark.parametrize("base_env", ["Pendulum-v1", "CartPole-v1"])
Contributor

Why parametrize only over the two envs instead of all of them?

Author

You mean over all gym envs?
Sure, I'm all for it, but the current VectorEnv tests (i.e. the only test above) only use CartPole-v1.

Author
@lebrice lebrice May 2, 2022

Parametrized the test with all classic_control + toy_text envs in 9c0e308

Author

Hey @RedTachyon, is this good (testing with all envs where should_skipp_env_spec_for_test = False)?
As-is, the parametrization of this test generates 579 tests, which take about 1:20 to run on my end (with pytest-xdist and 4 parallel workers). I think this will probably take something like 5 minutes to run on GitHub, depending on the machine's hardware.

(here's a link for that function btw: https://github.com/lebrice/gym/blob/e913bc81b83b43ae8ca9b3a02c981b74d31017ea/tests/envs/spec_list.py#L20)
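
For reference, a hedged sketch of what such a parametrized nesting test could look like (hypothetical; the env ids and the tiny 2x2 layout are illustrative, not the actual test from the PR, which reportedly derives its env list from the registry):

```python
import gym
import pytest
from gym.vector import AsyncVectorEnv, SyncVectorEnv

# Illustrative subset of classic_control / toy_text env ids.
ENV_IDS = ["CartPole-v1", "Pendulum-v1", "MountainCar-v0", "Acrobot-v1", "FrozenLake-v1", "Taxi-v3"]

@pytest.mark.parametrize("env_id", ENV_IDS)
def test_nested_vector_env(env_id):
    # 2 worker processes, each stepping 2 sequential copies of the env.
    env = AsyncVectorEnv(
        [lambda: SyncVectorEnv([lambda: gym.make(env_id) for _ in range(2)]) for _ in range(2)]
    )
    env.reset()
    env.step(env.action_space.sample())
    env.close()
```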

@pseudo-rnd-thoughts
Contributor

@lebrice Could you fix the CI issues?

@lebrice
Author

lebrice commented May 18, 2022

Hey there @pseudo-rnd-thoughts ! Yes, will do!

@lebrice
Author

lebrice commented May 18, 2022

Hey @pseudo-rnd-thoughts I think I fixed everything on my end, but I'd need your approval so the CI hooks can run.

@pseudo-rnd-thoughts
Contributor

@lebrice There still seems to be an issue with the PR. Also, looking at the runtime, the CI has jumped from around 1-2 minutes to 10+ minutes. Could you investigate why your PR is causing the time to increase?

@lebrice
Author

lebrice commented May 19, 2022

@lebrice There still seems to be an issue with the PR. Also, looking at the runtime, the CI has jumped from around 1-2 minutes to 10+ minutes. Could you investigate why your PR is causing the time to increase?

Yes, that's what I was mentioning here:

#2104 (comment)

I must have misinterpreted what he meant by "using all of them" (gym envs). In any case, I'd be happy to tone it down if that's not what he meant.

@pseudo-rnd-thoughts
Contributor

Sorry, I hadn't seen that comment. Personally, I think that is overkill and we could test with a single env type, just to see if it works. I will message @RedTachyon about it

@pseudo-rnd-thoughts
Contributor

@lebrice In the meantime, could you fix the CI so it works with the current tests, as the number of environments can easily be modified?

@jkterry1 jkterry1 closed this May 23, 2022