Upgrade gym #613

ycheng517 · 2022-04-27T04:08:05Z

I have marked all applicable categories:
- exception-raising fix
- algorithm implementation fix
- documentation modification
- new feature
I have reformatted the code using make format (required)
I have checked the code using make commit-checks (required)
If applicable, I have mentioned the relevant/related issue(s)
If applicable, I have listed every item in this Pull Request below

Fixes some deprecation warnings caused by changes in gym version 0.23:

- use env.np_random.integers instead of env.np_random.randint
- support seed and return_info arguments for reset (addresses issue #605, "Setting seed, return_info, options for reset")
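
A minimal sketch of the reset signature this change targets, assuming gym 0.23 semantics where reset accepts seed and return_info. ToyEnv is a hypothetical stand-in written with the standard library only; it is not gym or Tianshou code.

```python
import random
from typing import Any, Dict, Optional, Tuple, Union


class ToyEnv:
    """Hypothetical env that mimics the gym 0.23 style reset signature."""

    def __init__(self) -> None:
        self.rng = random.Random()

    def reset(
        self,
        seed: Optional[int] = None,
        return_info: bool = False,
    ) -> Union[int, Tuple[int, Dict[str, Any]]]:
        # Seeding moves from a separate env.seed(seed) call into reset itself.
        if seed is not None:
            self.rng = random.Random(seed)
        obs = self.rng.randrange(10)
        if return_info:
            # With return_info=True the caller gets an (obs, info) pair.
            return obs, {"seed": seed}
        return obs


env = ToyEnv()
obs_only = env.reset(seed=0)
obs, info = env.reset(seed=0, return_info=True)
```

Resetting twice with the same seed gives the same observation in this sketch, which is the property the seed argument is meant to provide. The randint to integers rename follows NumPy's Generator API, where integers is the replacement for the legacy randint.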

codecov-commenter · 2022-04-27T04:24:44Z

Codecov Report

Merging #613 (edfcbcf) into master (aba2d01) will decrease coverage by 0.44%.
The diff coverage is 76.76%.

@@            Coverage Diff             @@
##           master     #613      +/-   ##
==========================================
- Coverage   93.69%   93.25%   -0.45%     
==========================================
  Files          72       72              
  Lines        4805     4890      +85     
==========================================
+ Hits         4502     4560      +58     
- Misses        303      330      +27

Flag        Coverage Δ
unittests   93.25% <76.76%> (-0.45%) ⬇️

Impacted Files                    Coverage Δ
tianshou/env/worker/ray.py        76.59% <50.00%> (-8.78%) ⬇️
tianshou/env/worker/dummy.py      76.47% <53.84%> (-12.82%) ⬇️
tianshou/env/worker/subproc.py    87.02% <65.11%> (-7.06%) ⬇️
tianshou/env/worker/base.py       64.81% <66.66%> (+1.85%) ⬆️
tianshou/env/pettingzoo_env.py    90.00% <77.77%> (-1.49%) ⬇️
tianshou/env/venv_wrappers.py     85.71% <83.33%> (+0.52%) ⬆️
tianshou/env/venvs.py             94.07% <92.85%> (-0.33%) ⬇️
tianshou/data/collector.py        94.24% <100.00%> (+0.42%) ⬆️

Trinkle23897 · 2022-04-27T13:49:26Z

I'll take a look later today. Thanks for the contribution anyway!

cc @ultmaster

ycheng517 · 2022-05-13T05:25:24Z

@Trinkle23897 can you please take another look at this PR? I've implemented your suggested changes and also made reset support return_info and seed in the venvs.

Trinkle23897 · 2022-05-14T13:42:00Z

tianshou/env/venvs.py

+            self.workers[i].send(None, **kwargs)
+        ret_list = [self.workers[i].recv() for i in id]
+
+        if "return_info" in kwargs and kwargs["return_info"]:
+            obs_list = [r[0] for r in ret_list]
+        else:
+            obs_list = ret_list


Is it possible to check it by type? Or do you have any other better approach?

if isinstance(ret_list[0], (tuple, list)) and len(ret_list[0]) == 2 and isinstance(ret_list[0][1], dict):
    # return obs, info
else:
    # return obs

I know it's a little confusing, since the observation may also be a tuple. However, I personally don't like this part of gym's API. Users usually don't change the environment's return data type during the whole process, so the return_info argument belongs in __init__ rather than in the reset function.

For this reason, envpool takes a gym_reset_return_info option at initialization. However, the current method cannot support envpool. If possible, would you please add a test case for envpool? It only affects the venv wrapper part, though.

P.S. gym will eventually remove this argument from the reset function.

Also, we don't support tuple observation spaces and recommend using dict spaces instead. That is quite a strong assumption you can rely on here. If anyone uses a tuple observation and hits an exception, we can raise a corresponding hint, something like "please change your observation space from tuple to array or dict space".

I think checking by type is a workable approach; I adopted it in 364b46e.
I also added a test for envpool and an exception for tuple observations in the same commit.
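
The type-based check discussed in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from 364b46e: split_reset_returns is a hypothetical helper name, and the error message only paraphrases the hint suggested above.

```python
from typing import Any, Dict, List, Tuple


def split_reset_returns(
    ret_list: List[Any],
) -> Tuple[List[Any], List[Dict[str, Any]], bool]:
    """Split per-worker reset results into observations and infos."""
    first = ret_list[0]
    # (obs, info) is recognised as a 2-element sequence whose second item is a dict.
    has_info = (
        isinstance(first, (tuple, list))
        and len(first) == 2
        and isinstance(first[1], dict)
    )
    if has_info:
        obs_list = [r[0] for r in ret_list]
        info_list = [r[1] for r in ret_list]
    else:
        obs_list = list(ret_list)
        info_list = []
    # Tuple observations are ambiguous with (obs, info), so reject them early.
    if isinstance(obs_list[0], tuple):
        raise TypeError(
            "Tuple observation space is not supported; "
            "please change it to array or dict space."
        )
    return obs_list, info_list, has_info
```

The check relies on the assumption stated in the review: observations are arrays or dicts, never tuples, so a 2-element sequence ending in a dict can only be an (obs, info) pair.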

Trinkle23897 · 2022-05-21T14:02:45Z

tianshou/env/venvs.py

+        has_infos = isinstance(ret_list[0], (tuple, list)) and len(
+            ret_list[0]
+        ) == 2 and isinstance(ret_list[0][1], dict)
+        if has_infos:


Can we use self.has_infos to reduce checking overhead for all classes?

if self.has_info is None:
    self.has_info = isinstance(ret_list[0], (tuple, list)) and len(
        ret_list[0]
    ) == 2 and isinstance(ret_list[0][1], dict)
if self.has_info:
    ...
else:
    ...

Implemented in 662a68a (although I used if hasattr(self, "reset_returns_info"), because mypy complains about typing in a bunch of places if I use self.has_info is None). One caveat: with this change, users may be surprised if they call reset with return_info set to True and later to False, though that is probably unlikely.
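
To make the caveat concrete, here is a small sketch of the check-once idea. ResetTypeCache is a hypothetical class for illustration only; it is not the code in 662a68a.

```python
from typing import Any, List, Optional


class ResetTypeCache:
    """Hypothetical helper that inspects the reset return type only once."""

    def __init__(self) -> None:
        self.has_info: Optional[bool] = None

    def check(self, ret_list: List[Any]) -> bool:
        # Only the first call looks at the data; later calls reuse the answer.
        if self.has_info is None:
            first = ret_list[0]
            self.has_info = (
                isinstance(first, (tuple, list))
                and len(first) == 2
                and isinstance(first[1], dict)
            )
        return self.has_info


cache = ResetTypeCache()
first_call = cache.check([("obs", {"k": 1})])  # inspected on the first call
second_call = cache.check(["obs"])  # not inspected again, cached answer reused
```

In this sketch second_call is still True even though the second reset returned bare observations, which is the surprise described above when return_info changes between calls.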

Have you ever tested with a full training pipeline?

There are some options:

1. venv.reset only returns obs; no change to the collector

2. venv.reset only returns (obs, info); change the collector to adapt

3. venv.reset can return either obs or (obs, info); the collector needs to handle both cases

I personally favor the second approach.

Have you ever tested with a full training pipeline?

You mean run it with one of the scripts in the examples folder? Haven't done that yet but can do.

I'm not sure the second approach would work in all cases, since some environments currently don't return info along with obs. It would also jump ahead of the planned Gym API changes. If the 3rd option also sounds good to you, I can implement that.

I mean if there's no info, we attach an empty dict. This can save a lot of code compared with 3.
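
A sketch of what attaching an empty dict could look like, assuming the same type check used earlier in this thread. normalize_reset is a hypothetical name, not a Tianshou function.

```python
from typing import Any, Dict, Tuple


def normalize_reset(ret: Any) -> Tuple[Any, Dict[str, Any]]:
    """Always return (obs, info); attach an empty dict when info is absent."""
    if isinstance(ret, (tuple, list)) and len(ret) == 2 and isinstance(ret[1], dict):
        return ret[0], ret[1]
    return ret, {}
```

With a helper like this, downstream code only has to handle the (obs, info) shape, which is why the second option needs less code than supporting both return types everywhere.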

If the 3rd option also sounds good to you, I can implement that.

I'm ok with it.

Added support to the collector for option 3 in 75ecd18.

I also ran python examples/box2d/lunarlander_dqn.py and it works fine.

Trinkle23897 · 2022-05-26T11:56:43Z

tianshou/data/collector.py

-            obs = self.preprocess_fn(obs=obs,
-                                     env_id=np.arange(self.env_num)).get("obs", obs)
+        if self.gym_reset_return_info:
+            obs, info = self.env.reset(return_info=True)


I don't think you can do this; at least envpool will fail.

See my other comment just now (#613 (comment)). With my proposal we shouldn't run into this issue. I'm open to other ideas as well.

Should be fixed now.

Trinkle23897 · 2022-05-26T11:58:38Z

tianshou/data/collector.py

+    def _reset_env_with_ids(
+        self, local_ids: Union[List[int], np.ndarray], global_ids: Union[List[int],


Could you please add a test to check the correctness of info storage order?

Added some checks on info storage to test_collector in c2ff71f; hopefully that's what you were looking for.
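
For readers unfamiliar with the concern, the storage-order property can be illustrated with a toy example. This is not the test added in c2ff71f; store_reset_infos and the info keys are hypothetical, and plain lists stand in for the collector's data structures.

```python
from typing import Any, Dict, List


def store_reset_infos(
    stored: List[Dict[str, Any]],
    global_ids: List[int],
    new_infos: List[Dict[str, Any]],
) -> None:
    """Write each reset info into the slot of the env it came from."""
    for gid, info in zip(global_ids, new_infos):
        stored[gid] = info


# Four envs; only envs 3 and 1 are reset, deliberately out of order.
stored = [{"env": i, "resets": 0} for i in range(4)]
store_reset_infos(
    stored,
    [3, 1],
    [{"env": 3, "resets": 1}, {"env": 1, "resets": 1}],
)
```

A correctness check then asserts that every slot still holds the info of its own env, and that only the reset envs changed.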

ultmaster · 2022-06-11T23:40:28Z

tianshou/data/collector.py

    def collect(
        self,
        n_step: Optional[int] = None,
        n_episode: Optional[int] = None,
        random: bool = False,
        render: Optional[float] = None,
        no_grad: bool = True,
+        gym_reset_kwargs: Optional[Dict[str, Any]] = None,


Any chance this can be a callable? Sometimes I want different reset kwargs for different environments, e.g., different seeds.

I'd like to just get this PR wrapped up, and introduce additional functionality in future PRs
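
The callable variant was deferred and is not part of this PR. Purely as an illustration of the request, a future version could accept either a dict or a per-environment callable along these lines; resolve_reset_kwargs and the ResetKwargs alias are hypothetical names.

```python
from typing import Any, Callable, Dict, Union

# Hypothetical type: a fixed dict, a function of the env index, or None.
ResetKwargs = Union[Dict[str, Any], Callable[[int], Dict[str, Any]], None]


def resolve_reset_kwargs(spec: ResetKwargs, env_id: int) -> Dict[str, Any]:
    """Return the reset kwargs to use for one environment."""
    if spec is None:
        return {}
    if callable(spec):
        # A callable can vary the kwargs per env, e.g. a different seed each.
        return dict(spec(env_id))
    return dict(spec)


same_for_all = resolve_reset_kwargs({"seed": 1}, env_id=3)
per_env = resolve_reset_kwargs(lambda i: {"seed": 100 + i}, env_id=2)
```

Copying into a new dict keeps callers from mutating a shared kwargs object across environments.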

Trinkle23897

LGTM

Trinkle23897 · 2022-06-27T21:46:33Z

BTW, it would be better to update the docs to show how to use this return_info feature, but we can do that in a follow-up PR.

Yifei Cheng added 3 commits April 25, 2022 23:25

upgrade version of cartpole

558a5ea

np_random.randint --> np_random.integers

bbbcd81

try to use env.reset(seed=seed) instead of env.seed(seed)

c296427

Yifei Cheng and others added 2 commits April 27, 2022 09:23

commit checks

95d6d87

Merge branch 'master' into upgrade-gym

f0c5945

Trinkle23897 linked an issue Apr 27, 2022 that may be closed by this pull request

Setting seed, return_info, options for reset #605

Closed

8 tasks

jkterry1 mentioned this pull request Apr 27, 2022

Setting seed, return_info, options for reset #605

Closed

8 tasks

Trinkle23897 reviewed Apr 28, 2022

test/modelbased/test_ppo_icm.py (outdated, resolved)

setup.py (resolved)

tianshou/env/pettingzoo_env.py (outdated, resolved)

ultmaster reviewed Apr 29, 2022

tianshou/env/worker/subproc.py (outdated, resolved)

ycheng517 added 8 commits May 10, 2022 22:34

Merge branch 'master' into upgrade-gym

5812300

Revert "upgrade version of cartpole"

b1ceefd

This reverts commit 558a5ea.

fix merge error

4eae57c

make venvs and env workers to support reset()->[obs, info]

61358d5

clean up

fcc2cfc

add test case for reset with optional kwargs

2563439

satisfy checks

b82eb0e

pettingzoo reset supports return_info

d644280

ycheng517 requested a review from Trinkle23897 May 13, 2022 05:24

ycheng517 added 2 commits May 13, 2022 01:39

small addition

8ffd633

fix mypy

861a1ba

Trinkle23897 reviewed May 14, 2022

Trinkle23897 and others added 5 commits May 14, 2022 09:47

Merge branch 'master' into upgrade-gym

2e7485e

return info based on the return type of env.reset

364b46e

switch tuple observation Exception to TypeError

d99b5ae

remove debug prints

88d865a

Merge branch 'master' into upgrade-gym

c5c7246

Trinkle23897 reviewed May 21, 2022

ycheng517 added 2 commits May 21, 2022 23:11

check reset_returns_info once

662a68a

support reset returns info in collector

75ecd18

Trinkle23897 reviewed May 26, 2022

tianshou/data/collector.py (outdated, resolved)

ycheng517 and others added 9 commits June 6, 2022 01:56

dynamically check reset retval in collector

c2ff71f

bump gym version to 0.23.1 and fix mypy

12cf50f

fix lint check

6c60d53

undo changes to test_sac_with_il

fe06182

doc formatting

144e88a

Merge branch 'master' into upgrade-gym

f34cfec

undo changes to test_sac_with_il.py

5d4b9a0

undo changes to test/continuous/test_sac_with_il.py

f5eef9c

test/continuous/test_sac_with_il.py

be7148a

ultmaster reviewed Jun 11, 2022

ycheng517 added 2 commits June 21, 2022 22:02

Merge remote-tracking branch 'origin/master' into upgrade-gym

784f749

undo caching reset_return_info in subproc

edfcbcf

Trinkle23897 approved these changes Jun 27, 2022

Trinkle23897 merged commit 43792bf into thu-ml:master Jun 27, 2022

Trinkle23897 May 14, 2022

ycheng517 May 21, 2022

Trinkle23897 May 21, 2022

ycheng517 May 22, 2022

Trinkle23897 May 22, 2022

ycheng517 May 24, 2022

Trinkle23897 May 24, 2022

Trinkle23897 May 24, 2022

ycheng517 May 25, 2022

Trinkle23897 May 26, 2022

ycheng517 Jun 1, 2022

ycheng517 Jun 6, 2022

Trinkle23897 May 26, 2022

ycheng517 Jun 1, 2022

ycheng517 Jun 6, 2022

ultmaster Jun 11, 2022

ycheng517 Jun 22, 2022
