Docker repos missing files #7

Closed
patterntrade opened this issue Dec 21, 2018 · 13 comments
Labels
bug Something isn't working

Comments

@patterntrade

I have pulled (not built) the GPU edition of the rl-baselines-zoo image.

Trying to run:

docker run -it --runtime=nvidia --rm --network host --ipc=host --name test --mount src="$(pwd)",target=/root/code/stable-baselines,type=bind araffin/stable-baselines bash -c 'cd /root/code/stable-baselines/ && pytest tests/'

Am running:

sudo docker run --runtime=nvidia -it araffin/stable-baselines bash

Traversing into /root/code/, the directory is empty. It seems there is something wrong with the repository. I see similar issues with the rl-zoo image.

I have little experience with docker, so I might well have missed something.

Kind regards

@araffin
Owner

araffin commented Dec 22, 2018

Hello,
Did you try running experiments with the shell script?

./run_docker_gpu.sh python train.py --algo ppo2 --env CartPole-v1

I am not 100% sure that the GPU image works (I have to fix a bug where TensorFlow is installed without GPU support). However, the CPU image works, since it is used for continuous integration.

Edit: for the files, that is normal (cf stable baselines doc where the command is explained)

@patterntrade
Author

The GPU image doesn't work. I get an error message like:

...
line 35, in <module>
from tensorflow.python.keras import backend
File "/root/venv/lib/python3.5/site-packages/tensorflow/python/keras/backend/__init__.py", line 22, in <module>
from tensorflow.python.keras._impl.keras.backend import abs
ImportError: cannot import name 'abs'

Resolved by running the following in the container:
source venv/bin/activate
pip install keras
pip install --upgrade tensorflow-gpu

Now it works!

Thanks for setting up this repository and the docker images, very helpful.

Merry Christmas!

:-)

@araffin
Owner

araffin commented Dec 23, 2018

Ok, I'll try to update the image then.

araffin added the bug label Dec 23, 2018
@araffin
Owner

araffin commented Jan 17, 2019

Hello again,
I updated the Docker image, so it should be fixed now. Can you confirm this?

@patterntrade
Author

patterntrade commented Jan 23, 2019 via email

@araffin
Owner

araffin commented Jan 23, 2019

Are you using this dockerfile: https://github.com/araffin/rl-baselines-zoo/blob/master/docker/Dockerfile.gpu ?

Stable-Baselines is installed here

The built image: https://hub.docker.com/r/araffin/rl-baselines-zoo

EDIT: Oh, I see, since the beginning you seem to have been using the stable-baselines Docker image instead of the RL Zoo Docker image.

@patterntrade
Author

patterntrade commented Jan 23, 2019 via email

@araffin
Owner

araffin commented Jan 23, 2019

The doc is already updated ... cf https://stable-baselines.readthedocs.io/en/master/guide/install.html#using-docker-images
"
If you are looking for docker images with stable-baselines already installed in it, we recommend using images from RL Baselines Zoo.

Otherwise, the following images contained all the dependencies for stable-baselines but not the stable-baselines package itself. They are made for development.
"

@patterntrade
Author

patterntrade commented Jan 27, 2019 via email

@araffin
Owner

araffin commented Jan 27, 2019

Ok, did you try the cpu image?
If it does not work with the CPU image, I'm afraid the problem may come from your machine, because the CPU image is tested at each push on Travis CI.
What you are seeing is the entrypoint.sh trying to create a fake X server in order to be able to launch any env that requires one.
Btw, why do you have to use sudo? Did you follow the post-installation?
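
For reference, Docker's Linux post-installation steps boil down to adding your user to the `docker` group, after which `docker` runs without `sudo`. A minimal sketch for checking whether that has been done; the helper name `has_docker_group` is made up for illustration, and the hint it prints assumes a standard Linux install:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the space-separated group list in $1
# contains a group named exactly "docker".
has_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# `id -nG` prints the names of all groups the current user belongs to.
if has_docker_group "$(id -nG)"; then
    echo "In the docker group: sudo should not be needed."
else
    echo "Not in the docker group. The post-installation steps are roughly:"
    echo "  sudo groupadd docker"
    echo "  sudo usermod -aG docker \$USER"
    echo "then log out and back in so the new group membership applies."
fi
```

The script only reports the current state and prints a hint; it does not change any group membership itself.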

@patterntrade
Author

patterntrade commented Jan 27, 2019 via email

@patterntrade
Author

patterntrade commented Jan 28, 2019 via email

@patterntrade
Author

patterntrade commented Jan 29, 2019

ADDITIONAL INFO> EXTRACTS FROM BUILD LOG GPU IMAGE

I edited the entrypoint.sh so that it does not try to create a fake X server. Then I can build and run. I don't think the errors in the previous post are due to CartPole trying to display something; they come from the code. Might it be an issue with the version of TensorFlow used?

Get:332 http://archive.ubuntu.com/ubuntu xenial/universe amd64 libopenmpi-dev amd64 1.10.2-8ubuntu1 [537 kB]
**debconf: delaying package configuration, since apt-utils is not installed**
Fetched 225 MB in 7min 7s (527 kB/s)

Successfully installed virtualenv-16.3.0
**You are using pip version 8.1.1, however version 19.0.1 is available.**
You should consider upgrading via the 'pip install --upgrade pip' command.
Using base prefix '/usr'
New python executable in /root

**Collecting joblib (from stable-baselines==2.4.0)
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/joblib/**
  Downloading https://files.pythonhosted.org/packages/49/d9/4ea194a4c1d0148f9446054b9135f47218c23ccc6f649aeb09fab4c0925c/joblib-0.13.1-py2.py3-none-any.whl (278kB)

Successfully built html5lib
**tensorflow 1.12.0 has requirement tensorboard<1.13.0,>=1.12.0, but you'll have tensorboard 1.8.0 which is incompatible.**
Installing collected packages: html5lib, bleach, tensorboard, tensorflow-gpu

So docker build gave some warnings, but for some reason built the image anyway. I'm not sure whether that explains the issues in the previous entry.

Now, every time I try to build a new Docker image, it just uses the locally cached files. I am not sure how to force it to download everything again, or whether that would help at all.
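
For reference, `docker build` reuses cached layers by default. The `--no-cache` flag makes it rebuild every layer, and `--pull` re-fetches the base image. A minimal sketch, assuming the build is run from the root of the rl-baselines-zoo checkout; the tag `rl-baselines-zoo-gpu-local` is a made-up name for illustration, and the Dockerfile path is the one linked earlier in this thread:

```shell
#!/bin/sh
# Hypothetical local tag; any name works.
IMAGE_TAG="rl-baselines-zoo-gpu-local"
# Dockerfile path as linked earlier in this thread (relative to the repo root).
DOCKERFILE="docker/Dockerfile.gpu"

# Print the rebuild command so it can be inspected before running it.
# --no-cache: ignore cached layers; --pull: re-download the base image.
rebuild_cmd() {
    printf 'docker build --no-cache --pull -t %s -f %s .\n' "$IMAGE_TAG" "$DOCKERFILE"
}

rebuild_cmd
# To actually rebuild, run: eval "$(rebuild_cmd)"
```

A full rebuild repeats all the apt and pip downloads, so it will take as long as the first build did.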

yycho0108 pushed a commit to yycho0108/rl-baselines-zoo that referenced this issue Feb 2, 2021
* Refactored benchmark.py
now using f-string everywhere

* More Cleaner parser.add_arguments

* Added TODO.

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>