Fix breaking changes in OpenAI gym regarding step and reset return signatures #334

MatteoH2O1999 · 2022-09-08T22:57:46Z

Should fix #333

MatteoH2O1999 · 2022-11-05T21:56:35Z

This should be compatible with the new gym changes from patch 0.26.0

MatteoH2O1999 · 2022-12-06T17:16:26Z

Hey @hsahovic,
This should now also fix #342

victor-cristino · 2022-12-25T20:39:44Z

Hi @MatteoH2O1999,
I'm having errors using your branch FixSeed. I changed those four committed files and I get this error when trying to run python rl_with_new_open_ai_gym_wrapper.py

(.venv) victor@MacBook-Pro-de-Victor examples % python rl_with_new_open_ai_gym_wrapper.py
2022-12-25 21:23:53.669695: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py:186: UserWarning: WARN: Official support for the seed function is dropped. Standard practice is to reset gym environments using env.reset(seed=<desired seed>)
logger.warn(
/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py:169: UserWarning: WARN: return_info is deprecated as an optional argument to reset. resetshould now always return obs, info where obs is an observation, and info is a dictionarycontaining additional information.
logger.warn(
Traceback (most recent call last):
File "/Users/victor/Desktop/poke-env-victor/examples/rl_with_new_open_ai_gym_wrapper.py", line 182, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "/Users/victor/opt/anaconda3/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
return future.result()
File "/Users/victor/Desktop/poke-env-victor/examples/rl_with_new_open_ai_gym_wrapper.py", line 79, in main
check_env(test_env)
File "/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py", line 297, in check_env
check_reset_return_type(env)
File "/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py", line 201, in check_reset_return_type
assert isinstance(
AssertionError: The result returned by env.reset() was not a tuple of the form (obs, info), where obs is a observation and info is a dictionary containing additional information. Actual type: <class 'numpy.ndarray'>

Any idea? Thank you.

Víctor

MatteoH2O1999 · 2022-12-25T23:49:58Z

@victor-cristino, the problem is an incompatibility between the library used in the example and the new Gym API. Try using the new wrapper when not testing the env and check whether the behavior changes.
To ensure we are both in the same situation, could you also please create a clean venv and run

(venv) pip install -r requirements.txt -r requirements-dev.txt -r examples/requirements.txt
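
For context, here is a minimal sketch of the signature change behind the assertion error above. Since gym 0.26, reset returns (obs, info) and step returns (obs, reward, terminated, truncated, info), while older libraries expect reset to return obs and step to return (obs, reward, done, info). NewStyleEnv and OldStyleAdapter are hypothetical names used only for illustration; this is not the implementation of wrap_for_old_gym_api in this branch.

```python
from typing import Any, Dict, Optional, Tuple


class NewStyleEnv:
    """Toy environment (hypothetical) following the gym >= 0.26 signatures."""

    def __init__(self) -> None:
        self._steps = 0

    def reset(self, seed: Optional[int] = None) -> Tuple[int, Dict[str, Any]]:
        # New API: seeding happens through reset, which returns (obs, info).
        self._steps = 0
        return 0, {"seed": seed}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self._steps += 1
        terminated = action == 1      # the episode ended on its own
        truncated = self._steps >= 3  # the episode was cut off by a step limit
        return self._steps, 1.0, terminated, truncated, {}


class OldStyleAdapter:
    """Expose the pre-0.26 signatures on top of a new-style environment."""

    def __init__(self, env: NewStyleEnv) -> None:
        self._env = env

    def reset(self) -> int:
        # Old API: reset returns only the observation.
        obs, _info = self._env.reset()
        return obs

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        # Old API: a single "done" flag replaces terminated/truncated.
        obs, reward, terminated, truncated, info = self._env.step(action)
        return obs, reward, terminated or truncated, info
```

A library written against the old API would then call OldStyleAdapter(env).reset() and get a bare observation back instead of a tuple.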

victor-cristino · 2022-12-26T21:32:15Z

I git cloned your branch FixSeed, installed the requirements and ran

(pyenv) victor@MacBook-Pro-de-Victor poke-env-matteo % python examples/rl_with_new_open_ai_gym_wrapper.py
2022-12-26 22:22:33.640091: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/Users/victor/Desktop/poke-env-matteo/examples/rl_with_new_open_ai_gym_wrapper.py", line 15, in <module>
from poke_env.player import (
ImportError: cannot import name 'wrap_for_old_gym_api' from 'poke_env.player' (/Users/victor/opt/anaconda3/envs/pyenv/lib/python3.9/site-packages/poke_env/player/__init__.py)

Seems to be a small problem about imports.

Víctor

MatteoH2O1999 · 2022-12-27T02:16:03Z

@victor-cristino you are not actually using the new version. With anaconda you have to update what is installed in your virtual env. Right now you have installed the base repo.
If you wish to use development versions of libraries, the preferred method is using virtualenv. Anaconda is more for Python users than it is for Python devs, as its environments have a larger scope, while venvs are on a per-repo basis (and have fewer "moving parts").
In general you should do something like

git clone repo/to/use
cd path/to/repo
git checkout branch/to/use
python -m venv venv
source venv/bin/activate     # on mac
./venv/Scripts/activate      # on windows
python -m pip install --upgrade pip
pip install wheel
pip install -r requirements.txt
pip install -e .

From here, the dev version of the repo is installed in the virtual environment and you can use it:

python examples/example.py

The -e means "editable", so if you update the repo, the installed package is updated as well.
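
To check which copy of a package the interpreter would actually import, something like the following sketch can help. package_location is a hypothetical helper name; it relies only on importlib from the standard library. With an editable install, the printed path should point inside the cloned repo rather than into site-packages.

```python
import importlib.util
from typing import Optional


def package_location(name: str) -> Optional[str]:
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Prints None if poke_env is not installed in the active environment.
    print(package_location("poke_env"))
```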

victor-cristino · 2022-12-27T20:49:18Z

Hi @MatteoH2O1999, first of all, thank you for your quick, clear, and instructive replies. I followed your instructions and was able to run my first training:

(venv) victor@MacBook-Pro-de-Victor poke-env-matteo % python3.9 examples/rl_with_new_open_ai_gym_wrapper.py
2022-12-27 21:33:29.008726: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/gym/utils/env_checker.py:186: UserWarning: WARN: Official support for the seed function is dropped. Standard practice is to reset gym environments using env.reset(seed=<desired seed>)
logger.warn(
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/gym/utils/env_checker.py:169: UserWarning: WARN: return_info is deprecated as an optional argument to reset. resetshould now always return obs, info where obs is an observation, and info is a dictionarycontaining additional information.
logger.warn(
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:233: DeprecationWarning: np.bool8 is a deprecated alias for np.bool_. (Deprecated NumPy 1.24)
if not isinstance(terminated, (bool, np.bool8)):
2022-12-27 21:33:59.845171: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-27 21:33:59.854589: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
Training for 10000 steps ...
Interval 1 (0 steps performed)
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/keras/engine/training_v1.py:2356: UserWarning: Model.state_updates will be removed in a future version. This property should not be used in TensorFlow 2.0, as updates are applied automatically.
updates=self.state_updates,
1/10000 [..............................] - ETA: 12:59 - reward: 0.0000
7/10000 [..............................] - ETA: 1:36 - reward: -0.0436
10000/10000 [==============================] - 145s 14ms/step - reward: 0.7293
done, took 144.719 seconds
Results against random player:
DQN Evaluation: 96 victories out of 100 episodes
Results against max base power player:
DQN Evaluation: 68 victories out of 100 episodes
Evaluation with included method: (24.119834780487803, (17.428497305421068, 35.677077047523596))
Cross evaluation of DQN with baselines:
                    SimpleRLPlayer 3  RandomPlayer 5  MaxBasePowerPlay 3  SimpleHeuristics 2
SimpleRLPlayer 3                      0.98            0.82                0.06
RandomPlayer 5      0.02                              0.12                0.04
MaxBasePowerPlay 3  0.18              0.88                                0.06
SimpleHeuristics 2  0.94              0.96            0.94

(venv) victor@MacBook-Pro-de-Victor poke-env-matteo %

Eventually I would like to develop a self-play project for VGC (doubles). I think examples/experimental-self-play is not working at the moment, but that's another issue :) Thanks a lot.

Víctor

codecov · 2022-12-31T14:40:09Z

Codecov Report

Merging #334 (8f41500) into master (e367b20) will increase coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #334      +/-   ##
==========================================
+ Coverage   89.10%   89.20%   +0.10%     
==========================================
  Files          35       38       +3     
  Lines        3616     3475     -141     
==========================================
- Hits         3222     3100     -122     
+ Misses        394      375      -19

KenanRustamov · 2023-02-20T19:20:21Z

This fixes the issues with the example for me. The only problem I still get is that the async call to reset the environment never returns and the program hangs forever. I fixed this by removing the line "env = copy.copy(env)" (line 756) in wrap_for_old_gym_api. Let me know if you get this issue too.
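
The thread does not confirm why the copy causes the hang. As a general illustration of one way copy.copy can lead to this kind of behavior, the sketch below shows that a shallow copy shares its attributes with the original. FakeEnv and its observations queue are hypothetical and are not taken from poke-env.

```python
import copy
import queue


class FakeEnv:
    """Hypothetical stand-in for an environment that owns a queue."""

    def __init__(self) -> None:
        self.observations = queue.Queue()


original = FakeEnv()
shallow = copy.copy(original)

# The copy is a new object, but its attributes are the very same objects.
assert shallow is not original
assert shallow.observations is original.observations

# An item consumed through the copy is gone for the original too, so a
# later blocking get() on the original would wait forever.
original.observations.put("obs")
shallow.observations.get()
assert original.observations.empty()
```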

MatteoH2O1999 · 2023-02-20T20:20:36Z

Could you be a little more specific: which method of which class are you calling? With what parameters? What OS and Python version are you using?

MatteoH2O1999 · 2023-02-20T20:52:16Z

Also, could we please move this to an issue?

MatteoH2O1999 · 2023-02-20T20:53:06Z

@hsahovic I'm going to close this PR and open a new one in order to keep this branch working and allow the new rebased one to be merged. Check out #359

KenanRustamov · 2023-02-20T21:24:14Z

@MatteoH2O1999 Sure thing. In rl_with_new_open_ai_gym_wrapper.py, the .close() and .reset_env() functions called on an environment such as train_env or eval_env cause the program to hang no matter which inputs I give them. This only happens with environments that have been wrapped by the wrap_for_old_gym_api() function; it does not happen with the test_env in the example. Of course, the entire program does not work if this wrap is not applied, which I assume is the point of this PR.

For me, removing the line "env = copy.copy(env)" in the function wrap_for_old_gym_api() fixed the issue. I made no other changes. I am on Windows 11 with Python 3.10.10, and here is my pip list:
Package Version

absl-py 1.4.0
alabaster 0.7.13
astunparse 1.6.3
async-generator 1.10
asynctest 0.13.0
attrs 22.2.0
Babel 2.11.0
black 22.6.0
bleach 6.0.0
cachetools 5.3.0
certifi 2022.12.7
cfgv 3.3.1
charset-normalizer 3.0.1
click 8.1.3
cloudpickle 2.2.1
colorama 0.4.6
coverage 7.1.0
dataclasses-json 0.5.7
distlib 0.3.6
docutils 0.18.1
exceptiongroup 1.1.0
filelock 3.9.0
flake8 6.0.0
flatbuffers 23.1.21
gast 0.4.0
google-auth 2.16.1
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.51.1
gym 0.26.2
gym-notices 0.0.8
h5py 3.8.0
identify 2.5.18
idna 3.4
imagesize 1.4.1
importlib-metadata 6.0.0
iniconfig 2.0.0
intervaltree 3.1.0
jaraco.classes 3.2.3
Jinja2 3.1.2
keras 2.10.0
Keras-Preprocessing 1.1.2
keras-rl2 1.0.5
keyring 23.13.1
libclang 15.0.6.1
libcst 0.4.9
Markdown 3.4.1
markdown-it-py 2.1.0
MarkupSafe 2.1.2
marshmallow 3.19.0
marshmallow-enum 1.5.1
mccabe 0.7.0
mdurl 0.1.2
more-itertools 9.0.0
mypy-extensions 1.0.0
nodeenv 1.7.0
numpy 1.24.2
oauthlib 3.2.2
opt-einsum 3.3.0
orjson 3.8.6
packaging 23.0
pathspec 0.11.0
pip 22.3.1
pkginfo 1.9.6
platformdirs 3.0.0
pluggy 1.0.0
pre-commit 3.0.4
protobuf 3.19.6
psutil 5.9.4
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycodestyle 2.10.0
pyflakes 3.0.1
Pygments 2.14.0
pyre-check 0.9.15
pyre-extensions 0.0.30
pytest 7.2.1
pytest-asyncio 0.20.3
pytest-cov 4.0.0
pytest-timeout 2.1.0
pytz 2022.7.1
pywin32-ctypes 0.2.0
PyYAML 6.0
readme-renderer 37.3
requests 2.28.2
requests-oauthlib 1.3.1
requests-toolbelt 0.10.1
rfc3986 2.0.0
rich 13.3.1
rsa 4.9
setuptools 65.5.0
six 1.16.0
snowballstemmer 2.2.0
sortedcontainers 2.4.0
Sphinx 6.1.3
sphinx-rtd-theme 1.2.0
sphinxcontrib-applehelp 1.0.4
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.1
sphinxcontrib-jquery 2.0.0
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
tabulate 0.9.0
tensorboard 2.10.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.10.1
tensorflow-estimator 2.10.0
tensorflow-io-gcs-filesystem 0.30.0
termcolor 2.2.0
TestSlide 2.7.0
tomli 2.0.1
twine 4.0.2
typeguard 2.13.3
typing_extensions 4.5.0
typing-inspect 0.8.0
urllib3 1.26.14
virtualenv 20.19.0
webencodings 0.5.1
websockets 10.4
Werkzeug 2.2.3
wheel 0.38.4
wrapt 1.14.1
zipp 3.14.0

I am running it in a virtual environment created with the exact specifications you had mentioned previously in VS Code. I am also running a local pokemon-showdown in the way it is described in the README. Let me know if I missed something, I am quite new to poke-env.

MatteoH2O1999 · 2023-02-21T04:08:27Z

@KenanRustamov, try now with the updated non-rebased FixSeed branch.

KenanRustamov · 2023-02-26T18:17:35Z

@MatteoH2O1999 The new branch works. It seems the changes you made to replace the .copy method fixed it. Thanks!

MatteoH2O1999 force-pushed the FixSeed branch 4 times, most recently from 18144c8 to 8c5c0dc Compare September 10, 2022 01:08

MatteoH2O1999 force-pushed the FixSeed branch 2 times, most recently from 1a7f522 to 6516b1e Compare November 5, 2022 21:43

MatteoH2O1999 changed the title ~~Fix seed in OpenAI API~~ Fix breaking changes in OpenAI gym regarding step and reset return signatures Nov 5, 2022

MatteoH2O1999 force-pushed the FixSeed branch from 6516b1e to dadcfe1 Compare November 5, 2022 21:55

MatteoH2O1999 force-pushed the FixSeed branch 2 times, most recently from c91d78e to ef22784 Compare November 5, 2022 23:30

MatteoH2O1999 added 4 commits December 6, 2022 18:15

Fix new gym breaking change of reset return signature

15a9126

Fix integration test for check_env

a5de92e

Fix done method if _challenge_task is None

86d3437

Fix hsahovic#342

87acc12

MatteoH2O1999 force-pushed the FixSeed branch from 708c4d6 to 87acc12 Compare December 6, 2022 17:15

Add wrap function for legacy API

26f0141

MatteoH2O1999 force-pushed the FixSeed branch from 12e92b3 to e715236 Compare December 31, 2022 14:38

MatteoH2O1999 force-pushed the FixSeed branch 4 times, most recently from 5027778 to 51a3d41 Compare December 31, 2022 15:17

Update doc and tests

af5d782

Fix code lint

f9b6b0e

MatteoH2O1999 force-pushed the FixSeed branch from 51a3d41 to f9b6b0e Compare December 31, 2022 15:48

MatteoH2O1999 force-pushed the FixSeed branch 3 times, most recently from 3de63e4 to 13e8c0a Compare February 20, 2023 20:39

Fix merge conflicts

8f41500

MatteoH2O1999 force-pushed the FixSeed branch from 13e8c0a to 8f41500 Compare February 20, 2023 20:39

MatteoH2O1999 mentioned this pull request Feb 20, 2023

Fix breaking changes in OpenAI gym regarding step and reset return signatures #359

Merged

MatteoH2O1999 closed this Feb 20, 2023
