[Solved][Bug]executing load_and_render_model.py #124

Closed
binakn opened this issue Jun 13, 2023 · 5 comments
Labels: bug (Something isn't working), question (Further information is requested)

Comments

binakn commented Jun 13, 2023

I'm new to MARLlib and am currently in the process of understanding all the great things it can do :)
Unfortunately, when executing python load_and_render_model.py from the examples directory, I get the following error:

2023-06-13 16:57:15,259 ERROR trial_runner.py:1124 -- Trial MAPPOTrainer_mpe_simple_spread_95240_00000: Error processing restore.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/ray/tune/trial_runner.py", line 1117, in _process_trial_restore
    self.trial_executor.fetch_result(trial)
  File "/opt/conda/lib/python3.9/site-packages/ray/tune/ray_trial_executor.py", line 788, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/opt/conda/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/ray/worker.py", line 1625, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(TypeError): ray::MAPPOTrainer.restore_from_object() (pid=144888, repr=MAPPOTrainer)
  File "/opt/conda/lib/python3.9/site-packages/ray/tune/trainable.py", line 433, in restore_from_object
    self.restore(checkpoint_path)
  File "/opt/conda/lib/python3.9/site-packages/ray/tune/trainable.py", line 411, in restore
    self.load_checkpoint(checkpoint_path)
  File "/opt/conda/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 830, in load_checkpoint
    self.__setstate__(extra_data)
  File "/opt/conda/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 289, in __setstate__
    Trainer.__setstate__(self, state)
  File "/opt/conda/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 1813, in __setstate__
    self.workers.local_worker().restore(state["worker"])
  File "/opt/conda/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1274, in restore
    objs = pickle.loads(objs)
TypeError: an integer is required (got type bytes)

I'd appreciate any pointers to what might be going wrong. Thank you!

@Theohhhu Theohhhu self-assigned this Jun 14, 2023
@Theohhhu Theohhhu added question Further information is requested bug Something isn't working labels Jun 14, 2023
@Theohhhu (Collaborator)

Hi,

It appears to be a bug with cloudpickle caused by incompatible Python and cloudpickle versions. Try installing cloudpickle==2.0.0 using pip and see if it resolves the issue. Let me know if you need further assistance.
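
Before and after pinning a package, it can help to confirm which versions the running interpreter actually sees. The following is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_versions is made up for illustration and is not part of MARLlib or Ray.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print the interpreter version first, since the pickle format of code
    # objects depends on it as much as on the cloudpickle release.
    print("python", sys.version.split()[0])
    for name, version in installed_versions(["cloudpickle", "ray"]).items():
        print(name, version or "not installed")
```

Running this inside the same environment that launches load_and_render_model.py avoids checking the wrong interpreter by accident.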

binakn commented Jun 14, 2023

Hi,

thank you so much for your fast reply and the suggestion.

I tried again with cloudpickle==2.0.0 but still get the same error. Maybe there is some other issue? My pip list is below; my Python version is 3.9.7. My gym version is 0.21.1 because of the following issue: openai/gym#3200

Package Version
------- -------
anyio 3.7.0
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
asttokens 2.2.1
async-lru 2.0.2
async-timeout 4.0.2
attrs 23.1.0
Babel 2.12.1
backcall 0.2.0
beautifulsoup4 4.12.2
bleach 6.0.0
certifi 2023.5.7
cffi 1.14.5
charset-normalizer 3.1.0
click 8.1.3
cloudpickle 2.0.0
colorama 0.4.6
comm 0.1.3
conda 4.10.1
conda-package-handling 2.0.2
conda_package_streaming 0.8.0
contourpy 1.1.0
cryptography 38.0.4
cycler 0.11.0
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
dm-tree 0.1.8
exceptiongroup 1.1.1
executing 1.2.0
fastjsonschema 2.17.1
filelock 3.12.2
fonttools 4.40.0
fqdn 1.5.1
grpcio 1.54.2
gym 0.21.1
gym-notices 0.0.8
icecream 2.1.3
idna 3.4
imageio 2.31.1
importlib-metadata 4.13.0
importlib-resources 5.12.0
ipykernel 6.23.2
ipython 8.14.0
ipython-genutils 0.2.0
isoduration 20.11.0
jedi 0.18.2
Jinja2 3.1.2
json5 0.9.14
jsonpointer 2.3
jsonschema 4.17.3
jupyter_client 8.2.0
jupyter_core 5.3.0
jupyter-events 0.6.3
jupyter-lsp 2.2.0
jupyter_server 2.6.0
jupyter_server_terminals 0.4.4
jupyterlab 4.0.2
jupyterlab-pygments 0.2.2
jupyterlab_server 2.23.0
kiwisolver 1.4.4
lz4 4.3.2
MarkupSafe 2.1.3
marllib 1.0.3
matplotlib 3.7.1
matplotlib-inline 0.1.6
mistune 2.0.5
msgpack 1.0.5
nbclassic 1.0.0
nbclient 0.8.0
nbconvert 7.5.0
nbformat 5.9.0
nest-asyncio 1.5.6
networkx 3.1
notebook 6.5.4
notebook_shim 0.2.3
numpy 1.20.3
opencv-python 3.4.18.65
overrides 7.3.1
packaging 23.1
pandas 2.0.2
pandocfilters 1.5.0
parso 0.8.3
PettingZoo 1.12.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.5.0
pip 23.1.2
platformdirs 3.5.3
prometheus-client 0.17.0
prompt-toolkit 3.0.38
protobuf 3.20.3
psutil 5.9.5
ptyprocess 0.7.0
pure-eval 0.2.2
pycosat 0.6.4
pycparser 2.21
pyglet 2.0.7
Pygments 2.15.1
pyOpenSSL 23.2.0
pyparsing 3.0.9
pyrsistent 0.19.3
PySocks 1.7.1
python-dateutil 2.8.2
python-json-logger 2.0.7
pytz 2023.3
PyWavelets 1.4.1
PyYAML 6.0
pyzmq 25.1.0
ray 1.8.0
redis 4.5.5
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
ruamel-yaml-conda 0.15.80
scikit-image 0.19.3
scipy 1.10.1
Send2Trash 1.8.2
setuptools 67.7.2
six 1.16.0
sniffio 1.3.0
soupsieve 2.4.1
stack-data 0.6.2
SuperSuit 3.2.0
tabulate 0.9.0
tensorboardX 2.6
terminado 0.17.1
tifffile 2023.4.12
tinycss2 1.2.1
tomli 2.0.1
torch 1.9.0
tornado 6.3.2
traitlets 5.9.0
typing_extensions 4.6.3
tzdata 2023.3
uri-template 1.2.0
urllib3 2.0.3
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.5.3
wheel 0.40.0
zipp 3.15.0
zstandard 0.19.0

@Theohhhu (Collaborator)

It seems that the issue may lie in the model saving and loading process. The pretrained model used in load_and_render_model.py was not trained and saved under Python 3.9, but rather under Python 3.6.13 which is an older version. I have two proposed solutions for you:

  1. Downgrade your Python version to approximately 3.6.13 and reinstall all the required packages.
  2. Train a new model on your local machine using Python 3.9 or any desired version. After running this script for several iterations, modify the model and parameter paths here in the load_and_render_model.py file to load the newly saved model.

I highly recommend the second option :)
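
A likely explanation for the TypeError: cloudpickle serializes some functions by value, including their code objects, and the code object constructor gained a new argument (posonlyargcount) in Python 3.8. A checkpoint pickled under 3.6 can therefore fail to unpickle under 3.9 with "an integer is required (got type bytes)". One way to catch this earlier is to record the interpreter version next to each checkpoint and compare it at load time. The sketch below is an assumption-laden illustration, not something MARLlib or Ray does: the file name env_stamp.json and both helper functions are hypothetical.

```python
import json
import sys
import tempfile
from pathlib import Path

STAMP_NAME = "env_stamp.json"  # hypothetical file name, not a Ray/MARLlib convention


def write_env_stamp(checkpoint_dir):
    """Store the running Python version alongside a checkpoint directory."""
    path = Path(checkpoint_dir) / STAMP_NAME
    path.write_text(json.dumps({"python": list(sys.version_info[:3])}))
    return path


def check_env_stamp(checkpoint_dir):
    """Return (ok, message); ok is False when major.minor differs or no stamp exists."""
    path = Path(checkpoint_dir) / STAMP_NAME
    if not path.exists():
        return False, "no version stamp found; checkpoint origin unknown"
    saved = json.loads(path.read_text())["python"]
    current = list(sys.version_info[:3])
    saved_str = ".".join(map(str, saved))
    current_str = ".".join(map(str, current))
    if saved[:2] != current[:2]:
        return False, f"checkpoint saved under Python {saved_str}, running {current_str}"
    return True, f"Python {current_str} matches the checkpoint"


if __name__ == "__main__":
    # Demo on a throwaway directory standing in for a real checkpoint folder.
    with tempfile.TemporaryDirectory() as demo_dir:
        write_env_stamp(demo_dir)
        print(check_env_stamp(demo_dir))
```

Comparing only major.minor is a deliberate simplification: patch releases normally keep the code object layout stable, while minor releases may not.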

binakn commented Jun 15, 2023

As recommended, I tried option 2 and got it working. The only additional change was downgrading to pyglet==1.5.11 because of another error.

Thank you for your support! :)

@binakn binakn closed this as completed Jun 15, 2023
@Theohhhu (Collaborator)

cheers.

@Theohhhu Theohhhu pinned this issue Jun 15, 2023
@Theohhhu Theohhhu changed the title from "Error when executing load_and_render_model.py" to "[Solved][Bug]executing load_and_render_model.py" Jun 15, 2023