Fix breaking changes in OpenAI gym regarding step and reset return signatures #334

Closed
wants to merge 8 commits into from

MatteoH2O1999
Contributor

Should fix #333

@MatteoH2O1999 MatteoH2O1999 changed the title from "Fix seed in OpenAI API" to "Fix breaking changes in OpenAI gym regarding step and reset return signatures" on Nov 5, 2022
@MatteoH2O1999
Contributor Author

MatteoH2O1999 commented Nov 5, 2022

This should be compatible with the new gym changes from patch 0.26.0
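
For readers unfamiliar with the change: since gym 0.26, `reset` accepts a `seed` keyword and returns an `(obs, info)` pair, and `step` returns five values, with `terminated` and `truncated` replacing the single `done` flag. A minimal sketch of those signatures, using a toy environment that depends on neither gym nor poke-env (the class `CoinFlipEnv` is hypothetical and only for illustration):

```python
import random
from typing import Any, Dict, Optional, Tuple


class CoinFlipEnv:
    """Toy environment following the gym >= 0.26 reset/step signatures.

    Hypothetical example class; it is not part of gym or poke-env.
    """

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0

    def reset(
        self, *, seed: Optional[int] = None, options: Optional[dict] = None
    ) -> Tuple[int, Dict[str, Any]]:
        # Seeding now happens through reset(seed=...) rather than env.seed().
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        obs = self._rng.randint(0, 1)
        return obs, {}  # new API: always (obs, info)

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self._steps += 1
        coin = self._rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        terminated = False  # this toy task has no terminal state of its own
        truncated = self._steps >= self.max_steps  # step limit reached
        return coin, reward, terminated, truncated, {}


env = CoinFlipEnv()
obs, info = env.reset(seed=42)
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(0)
    done = terminated or truncated
```

Code written against the old API typically breaks in two places: unpacking the result of `reset()` as a bare observation, and unpacking `step()` into four values.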

@MatteoH2O1999
Contributor Author

Hey @hsahovic,
This should now also fix #342

@victor-cristino

victor-cristino commented Dec 25, 2022

Hi @MatteoH2O1999,
I'm getting errors using your FixSeed branch. I changed the four committed files and I get this error when running python rl_with_new_open_ai_gym_wrapper.py:


(.venv) victor@MacBook-Pro-de-Victor examples % python rl_with_new_open_ai_gym_wrapper.py
2022-12-25 21:23:53.669695: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py:186: UserWarning: WARN: Official support for the seed function is dropped. Standard practice is to reset gym environments using env.reset(seed=<desired seed>)
logger.warn(
/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py:169: UserWarning: WARN: return_info is deprecated as an optional argument to reset. resetshould now always return obs, info where obs is an observation, and info is a dictionarycontaining additional information.
logger.warn(
Traceback (most recent call last):
File "/Users/victor/Desktop/poke-env-victor/examples/rl_with_new_open_ai_gym_wrapper.py", line 182, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "/Users/victor/opt/anaconda3/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
return future.result()
File "/Users/victor/Desktop/poke-env-victor/examples/rl_with_new_open_ai_gym_wrapper.py", line 79, in main
check_env(test_env)
File "/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py", line 297, in check_env
check_reset_return_type(env)
File "/Users/victor/Desktop/poke-env-victor/.venv/lib/python3.9/site-packages/gym/utils/env_checker.py", line 201, in check_reset_return_type
assert isinstance(
AssertionError: The result returned by env.reset() was not a tuple of the form (obs, info), where obs is a observation and info is a dictionary containing additional information. Actual type: <class 'numpy.ndarray'>


Any idea? Thank you.

Víctor

@MatteoH2O1999
Contributor Author

@victor-cristino, the problem is an incompatibility between the library used in the example and the new Gym API. Try using the new wrapper when not testing the env and check whether that changes anything.
To make sure we are both in the same situation, could you also please create a clean venv and run:

(venv) pip install -r requirements.txt -r requirements-dev.txt -r examples/requirements.txt
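
As a rough illustration of the incompatibility: libraries written against the old API expect `reset()` to return only the observation and `step()` to return four values. A minimal sketch of an adapter between the two conventions follows. `OldApiAdapter` and `TinyNewApiEnv` are hypothetical names, and this is not the implementation of `wrap_for_old_gym_api` in the branch, only the general idea such a wrapper could follow.

```python
from typing import Any, Dict, Tuple


class TinyNewApiEnv:
    """Hypothetical stand-in for an environment using the gym >= 0.26 API."""

    def __init__(self) -> None:
        self._t = 0

    def reset(self, *, seed=None, options=None) -> Tuple[int, Dict[str, Any]]:
        self._t = 0
        return self._t, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self._t += 1
        terminated = self._t >= 2  # episode ends after two steps
        return self._t, 1.0, terminated, False, {}


class OldApiAdapter:
    """Expose a new-API environment through the pre-0.26 signatures."""

    def __init__(self, env: Any) -> None:
        self.env = env

    def reset(self) -> Any:
        obs, _info = self.env.reset()
        return obs  # old API: observation only

    def step(self, action: Any) -> Tuple[Any, float, bool, Dict[str, Any]]:
        obs, reward, terminated, truncated, info = self.env.step(action)
        # The old API had a single "done" flag covering both cases.
        return obs, reward, terminated or truncated, info


old_env = OldApiAdapter(TinyNewApiEnv())
first_obs = old_env.reset()
obs, reward, done, info = old_env.step(0)
```

Note that an adapter like this would not pass `check_env` from gym 0.26, since that checker expects the new signatures; it is only meant for libraries that still call the old ones.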

@victor-cristino

I cloned your FixSeed branch, installed the requirements, and ran:

(pyenv) victor@MacBook-Pro-de-Victor poke-env-matteo % python examples/rl_with_new_open_ai_gym_wrapper.py
2022-12-26 22:22:33.640091: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
File "/Users/victor/Desktop/poke-env-matteo/examples/rl_with_new_open_ai_gym_wrapper.py", line 15, in <module>
from poke_env.player import (
ImportError: cannot import name 'wrap_for_old_gym_api' from 'poke_env.player' (/Users/victor/opt/anaconda3/envs/pyenv/lib/python3.9/site-packages/poke_env/player/__init__.py)

It seems to be a small problem with imports.

Víctor

@MatteoH2O1999
Contributor Author

@victor-cristino you are not actually using the new version. With Anaconda you have to update what is installed in your virtual env. Right now you have the base repo installed.
If you wish to use development versions of libraries, the preferred method is virtualenv. Anaconda is more for Python users than it is for Python devs, as its environments have a larger scope, while venvs are on a per-repo basis (and have fewer "moving parts").
In general you should do something like:

git clone repo/to/use
cd path/to/repo
git checkout branch/to/use
python -m venv venv
source venv/bin/activate      # on macOS
./venv/Scripts/activate       # on Windows
python -m pip install --upgrade pip
pip install wheel
pip install -r requirements.txt
pip install -e .

From here, the dev version of the repo is installed in the virtual environment and you can use it:

python examples/example.py

The -e means "editable", so if you update the repo, the installed package gets updated as well.
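
To confirm which copy of a package the interpreter actually imports (the traceback above pointed at a `site-packages` copy rather than the cloned repo), a small check like the following may help. This is a sketch using only the standard library; `package_location` is a hypothetical helper name.

```python
import importlib.util
from typing import Optional


def package_location(name: str) -> Optional[str]:
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not installed in the active environment
    return spec.origin


if __name__ == "__main__":
    # With an editable install, this path should point inside the cloned
    # repository rather than inside site-packages.
    print(package_location("poke_env"))
```

If the printed path still points into `site-packages`, the active environment is most likely using a regular install rather than the editable one.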

@victor-cristino

Hi @MatteoH2O1999, first of all, thank you for your quick, clean, and instructive replies. I followed your instructions and was able to carry out my first training run:


(venv) victor@MacBook-Pro-de-Victor poke-env-matteo % python3.9 examples/rl_with_new_open_ai_gym_wrapper.py
2022-12-27 21:33:29.008726: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/gym/utils/env_checker.py:186: UserWarning: WARN: Official support for the seed function is dropped. Standard practice is to reset gym environments using env.reset(seed=<desired seed>)
logger.warn(
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/gym/utils/env_checker.py:169: UserWarning: WARN: return_info is deprecated as an optional argument to reset. resetshould now always return obs, info where obs is an observation, and info is a dictionarycontaining additional information.
logger.warn(
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/gym/utils/passive_env_checker.py:233: DeprecationWarning: np.bool8 is a deprecated alias for np.bool_. (Deprecated NumPy 1.24)
if not isinstance(terminated, (bool, np.bool8)):
2022-12-27 21:33:59.845171: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-27 21:33:59.854589: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
Training for 10000 steps ...
Interval 1 (0 steps performed)
/Users/victor/Desktop/poke-env-matteo/venv/lib/python3.9/site-packages/keras/engine/training_v1.py:2356: UserWarning: Model.state_updates will be removed in a future version. This property should not be used in TensorFlow 2.0, as updates are applied automatically.
updates=self.state_updates,
    1/10000 [..............................] - ETA: 12:59 - reward: 0.0000
    7/10000 [..............................] - ETA: 1:36 - reward: -0.0436
10000/10000 [==============================] - 145s 14ms/step - reward: 0.7293
done, took 144.719 seconds
Results against random player:
DQN Evaluation: 96 victories out of 100 episodes
Results against max base power player:
DQN Evaluation: 68 victories out of 100 episodes
Evaluation with included method: (24.119834780487803, (17.428497305421068, 35.677077047523596))
Cross evaluation of DQN with baselines:
                    SimpleRLPlayer 3  RandomPlayer 5  MaxBasePowerPlay 3  SimpleHeuristics 2
SimpleRLPlayer 3                      0.98            0.82                0.06
RandomPlayer 5      0.02                              0.12                0.04
MaxBasePowerPlay 3  0.18              0.88                                0.06
SimpleHeuristics 2  0.94              0.96            0.94

(venv) victor@MacBook-Pro-de-Victor poke-env-matteo %


Ultimately, I am trying to develop a self-play project for VGC (doubles). I think examples/experimental-self-play is not working at the moment, but that's another issue :) Thanks a lot.

Víctor

@codecov

codecov bot commented Dec 31, 2022

Codecov Report

Merging #334 (8f41500) into master (e367b20) will increase coverage by 0.10%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #334      +/-   ##
==========================================
+ Coverage   89.10%   89.20%   +0.10%     
==========================================
  Files          35       38       +3     
  Lines        3616     3475     -141     
==========================================
- Hits         3222     3100     -122     
+ Misses        394      375      -19     

@KenanRustamov

This fixes the issues with the example for me. The only problem I still get is that the async call to reset the environment never returns, so the program hangs forever. I fixed this by removing the line "env = copy.copy(env)" (line 756) in wrap_for_old_gym_api. Let me know if you get this issue too.
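
One possible explanation, not confirmed in this thread, is that `copy.copy` makes a shallow copy: the copy is a new object, but its attributes still reference the same underlying objects (queues, locks, event loops) as the original. A generic sketch of that behaviour, with a hypothetical `Env` class unrelated to poke-env's actual classes:

```python
import copy
import queue


class Env:
    """Hypothetical environment holding a queue of pending observations."""

    def __init__(self) -> None:
        self.observations: "queue.Queue[int]" = queue.Queue()


original = Env()
shallow = copy.copy(original)

# The two wrappers are distinct objects...
assert shallow is not original
# ...but they share the very same queue instance.
assert shallow.observations is original.observations

original.observations.put(1)
# An item consumed through the copy is gone for the original as well, so a
# later blocking get() on the original would wait forever.
shallow.observations.get_nowait()
assert original.observations.empty()
```

Whether this is what happens inside `wrap_for_old_gym_api` would need to be checked against the branch itself.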

@MatteoH2O1999
Contributor Author

Could you be a little more specific: which method of which class are you calling? With what parameters? What OS and Python version are you using?

@MatteoH2O1999
Contributor Author

Also, could we please move this to an issue?

@MatteoH2O1999
Contributor Author

@hsahovic I'm going to close this PR and open a new one in order to keep this branch working and allow the new rebased one to be merged. Check out #359

@KenanRustamov

KenanRustamov commented Feb 20, 2023

@MatteoH2O1999 Sure thing. In rl_with_new_open_ai_gym_wrapper.py, the .close() and .reset_env() functions called on an environment such as train_env or eval_env cause the program to hang no matter which inputs I give them. This only happens with environments that have been wrapped by wrap_for_old_gym_api(); it does not happen with the test_env in the example. Of course, the entire program does not work if this wrapper is not applied, which I assume is the point of this PR.

For me, removing the line "env = copy.copy(env)" in wrap_for_old_gym_api() fixed the issue. I made no other changes. I am on Windows 11 with Python 3.10.10, and here is my pip list:
Package Version


absl-py 1.4.0
alabaster 0.7.13
astunparse 1.6.3
async-generator 1.10
asynctest 0.13.0
attrs 22.2.0
Babel 2.11.0
black 22.6.0
bleach 6.0.0
cachetools 5.3.0
certifi 2022.12.7
cfgv 3.3.1
charset-normalizer 3.0.1
click 8.1.3
cloudpickle 2.2.1
colorama 0.4.6
coverage 7.1.0
dataclasses-json 0.5.7
distlib 0.3.6
docutils 0.18.1
exceptiongroup 1.1.0
filelock 3.9.0
flake8 6.0.0
flatbuffers 23.1.21
gast 0.4.0
google-auth 2.16.1
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.51.1
gym 0.26.2
gym-notices 0.0.8
h5py 3.8.0
identify 2.5.18
idna 3.4
imagesize 1.4.1
importlib-metadata 6.0.0
iniconfig 2.0.0
intervaltree 3.1.0
jaraco.classes 3.2.3
Jinja2 3.1.2
keras 2.10.0
Keras-Preprocessing 1.1.2
keras-rl2 1.0.5
keyring 23.13.1
libclang 15.0.6.1
libcst 0.4.9
Markdown 3.4.1
markdown-it-py 2.1.0
MarkupSafe 2.1.2
marshmallow 3.19.0
marshmallow-enum 1.5.1
mccabe 0.7.0
mdurl 0.1.2
more-itertools 9.0.0
mypy-extensions 1.0.0
nodeenv 1.7.0
numpy 1.24.2
oauthlib 3.2.2
opt-einsum 3.3.0
orjson 3.8.6
packaging 23.0
pathspec 0.11.0
pip 22.3.1
pkginfo 1.9.6
platformdirs 3.0.0
pluggy 1.0.0
pre-commit 3.0.4
protobuf 3.19.6
psutil 5.9.4
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycodestyle 2.10.0
pyflakes 3.0.1
Pygments 2.14.0
pyre-check 0.9.15
pyre-extensions 0.0.30
pytest 7.2.1
pytest-asyncio 0.20.3
pytest-cov 4.0.0
pytest-timeout 2.1.0
pytz 2022.7.1
pywin32-ctypes 0.2.0
PyYAML 6.0
readme-renderer 37.3
requests 2.28.2
requests-oauthlib 1.3.1
requests-toolbelt 0.10.1
rfc3986 2.0.0
rich 13.3.1
rsa 4.9
setuptools 65.5.0
six 1.16.0
snowballstemmer 2.2.0
sortedcontainers 2.4.0
Sphinx 6.1.3
sphinx-rtd-theme 1.2.0
sphinxcontrib-applehelp 1.0.4
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.1
sphinxcontrib-jquery 2.0.0
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
tabulate 0.9.0
tensorboard 2.10.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.10.1
tensorflow-estimator 2.10.0
tensorflow-io-gcs-filesystem 0.30.0
termcolor 2.2.0
TestSlide 2.7.0
tomli 2.0.1
twine 4.0.2
typeguard 2.13.3
typing_extensions 4.5.0
typing-inspect 0.8.0
urllib3 1.26.14
virtualenv 20.19.0
webencodings 0.5.1
websockets 10.4
Werkzeug 2.2.3
wheel 0.38.4
wrapt 1.14.1
zipp 3.14.0

I am running it in a virtual environment created with the exact specifications you had mentioned previously in VS Code. I am also running a local pokemon-showdown in the way it is described in the README. Let me know if I missed something, I am quite new to poke-env.

@MatteoH2O1999
Contributor Author

MatteoH2O1999 commented Feb 21, 2023

@KenanRustamov, try now with the updated non-rebased FixSeed branch.

@KenanRustamov

@MatteoH2O1999 New branch works, seems like the changes you made to replace the .copy method fixed it. Thanks!

Development

Successfully merging this pull request may close these issues.

check_env fails due to missing random number generator
3 participants