running R2D2 without Docker #11

turmeric-blend · 2020-03-28T07:54:04Z

I'm trying to run SEED RL (R2D2) without Docker on Ubuntu 18.04. I've tried to decouple the files from Docker as much as I can. When I try to run r2d2_main.py in learner mode from the terminal,
python atari/r2d2_main.py --run_mode=learner --logtostderr --pdb_post_mortem --num_actors=2

I get this error:

Traceback (most recent call last):
  File "atari/r2d2_main.py", line 27, in <module>
    from seed_rl.agents.r2d2 import learner
  File "/home/dave/Documents/AI/2020_seed_rl/seed_rl/agents/r2d2/learner.py", line 38, in <module>
    from seed_rl import grpc
  File "/home/dave/Documents/AI/2020_seed_rl/seed_rl/grpc/__init__.py", line 21, in <module>
    from seed_rl.grpc.python.ops import *  
  File "/home/dave/Documents/AI/2020_seed_rl/seed_rl/grpc/python/ops.py", line 25, in <module>
    from seed_rl.grpc.python.ops_wrapper import gen_grpc_ops
  File "/home/dave/Documents/AI/2020_seed_rl/seed_rl/grpc/python/ops_wrapper.py", line 25, in <module>
    gen_grpc_ops = tf.load_op_library(os.path.join(tf.compat.v1.resource_loader.get_data_files_path(), '../grpc_cc.so'))
  File "/home/dave/anaconda3/envs/dave/lib/python3.7/site-packages/tensorflow_core/python/framework/load_library.py", line 57, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /home/dave/Documents/AI/2020_seed_rl/seed_rl/grpc/python/../grpc_cc.so: undefined symbol: _ZN10tensorflow14DataTypeStringENS_8DataTypeE

The only change I made to r2d2_main.py was to add

import sys
sys.path.insert(1, '/home/dave/Documents/AI/2020_seed_rl/')

for path purposes.
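
A note on reading the last line of the traceback above: the loader did find grpc_cc.so, but the library references a C++ symbol that the installed TensorFlow does not export. The sketch below pulls the symbol out of the message and decodes only its leading namespace-qualified name. The helper names `missing_symbol` and `qualified_name` are made up for illustration, and the decoder handles only the simple `_ZN<len><name>...E` form; `c++filt` from binutils does full demangling.

```python
import re


def missing_symbol(error_message):
    """Return the symbol named after 'undefined symbol:', or None."""
    match = re.search(r"undefined symbol:\s*(\S+)", error_message)
    return match.group(1) if match else None


def qualified_name(mangled):
    """Decode the leading nested name of an Itanium-mangled symbol.

    Only the length-prefixed name components after '_ZN' are read;
    argument types and anything else after them are ignored.
    """
    if not mangled.startswith("_ZN"):
        return None
    pos, parts = 3, []
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    return "::".join(parts)


message = (
    "grpc_cc.so: undefined symbol: "
    "_ZN10tensorflow14DataTypeStringENS_8DataTypeE"
)
print(qualified_name(missing_symbol(message)))
```

Knowing which TensorFlow function is missing makes it easier to tell whether the .so file and the installed TensorFlow come from different builds.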

lespeholt · 2020-03-29T20:22:40Z

Do you build the grpc library in Docker? If not, try doing that.
If you already do: I have seen this error before with specific versions of TF, so make sure you use exactly the same version.
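
One way to compare the two environments is to print the same facts in both, once inside the Docker image and once on the host. This is a minimal sketch, not something from the repository: `describe_tensorflow` and `cxx11_abi_setting` are hypothetical helper names, and treating the C++ ABI flag as relevant is an assumption (it is one build setting that can differ between TensorFlow builds of the same version). `tf.sysconfig.get_compile_flags()` and `tf.sysconfig.get_lib()` are existing TensorFlow functions.

```python
import re


def cxx11_abi_setting(compile_flags):
    """Return the value of -D_GLIBCXX_USE_CXX11_ABI=<n> from a flag list, or None."""
    for flag in compile_flags:
        match = re.fullmatch(r"-D_GLIBCXX_USE_CXX11_ABI=(\d)", flag)
        if match:
            return int(match.group(1))
    return None


def describe_tensorflow():
    """Collect version and build facts about the installed TensorFlow.

    Returns None when TensorFlow cannot be imported.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return {
        "version": tf.__version__,
        "cxx11_abi": cxx11_abi_setting(tf.sysconfig.get_compile_flags()),
        "lib_dir": tf.sysconfig.get_lib(),
    }
```

Running `print(describe_tensorflow())` in both places and comparing the output shows whether the environment that built grpc_cc.so matches the one that loads it.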

turmeric-blend · 2020-03-30T10:07:17Z

@lespeholt

My TF version is 2.1.0, which is the same.

I think my issue is with the grpc library, since I got the same error when I ran this simple example.

However, as I am quite new to Docker and grpc, I am not sure how to 'build' grpc from Docker (even after reading up on grpc and Docker).

The file structure of grpc in seed_rl seems quite different from the examples in the tutorial and the example repository. The following points confused me:

There is no _pb2_grpc.py generated file as specified in the tutorials.
There is a grpc.cc file and a grpc_cc.so file, which is not common when using Python grpc.
I am looking for a way to run R2D2 without Docker, so building grpc from Docker does not seem ideal.

Above all, do we really need the C++ and .so files to use grpc? Is there a way to do it like the examples in the tutorial (using grpc with Python) without those files and without Docker?

Thanks for your patience. I really appreciate all the help I can get! (:

lespeholt · 2020-03-30T10:48:50Z

We are using C++ grpc, not Python grpc.

Are you currently using the prebuilt .so file (i.e. you are not building the grpc library yourself)?

turmeric-blend · 2020-03-30T11:56:54Z

@lespeholt

> Are you currently using the prebuilt .so file (i.e. you are not building the grpc library yourself)?

Yes. Both seed_rl for R2D2 (without Docker), run with
python atari/r2d2_main.py --run_mode=learner --logtostderr --pdb_post_mortem --num_actors=2
and this simple example use the existing .so file from the repository.

> We are using C++ grpc, not Python grpc.

Are there any advantages to using C++ grpc instead of Python grpc?

lespeholt · 2020-04-06T08:32:48Z

I'm not sure what is going wrong for you. The following works in Docker:

FROM ubuntu:18.04

RUN apt-get update && apt-get install -y tmux libsm6 libxext6 libxrender-dev python3-pip
RUN pip3 install --upgrade pip
RUN pip3 install tensorflow==2.1.0

This should be fairly close to what you're doing.

turmeric-blend · 2020-04-06T08:44:03Z

OK @lespeholt, I will look into running without Docker again.

Also, is it possible to use Python grpc instead of C++ grpc? Do you know whether there would be any slowdown in speed/bandwidth, or whether any features can't be implemented in Python grpc?

lespeholt · 2020-04-06T09:08:05Z

Using Python grpc would be significantly slower than C++ and the custom batching.

galdl · 2020-07-30T18:40:47Z

Reviving this. I'm trying to do the same, since profiling with nvprof is apparently problematic inside Docker; I'm getting segmentation faults.

I'm getting a very similar error: tensorflow.python.framework.errors_impl.NotFoundError: /home/nvidia/PycharmProjects/seed_rl/grpc/python/../grpc_cc.so: undefined symbol: _ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb

Is there a solution proposed here? I'm not sure I understood. The suggestion to compile grpc within Docker is not relevant, right? I'm not using Docker...

lespeholt · 2020-08-05T20:33:39Z

You can still compile grpc inside Docker, copy the file out, and then not use Docker at all when you run the training.
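
A rough sketch of that workflow as a list of Docker CLI invocations, for readers unsure what "compile inside, copy out" looks like. `docker/Dockerfile.grpc` is the path linked later in this thread. The image tag, the container name and above all the location of grpc_cc.so inside the image are assumptions made for illustration and need checking against the actual Dockerfile.

```python
def grpc_build_commands(
    image="seed_rl:grpc",                     # hypothetical image tag
    dockerfile="docker/Dockerfile.grpc",      # path as linked in this thread
    so_in_image="/seed_rl/grpc/grpc_cc.so",   # ASSUMED location inside the image
    dest="grpc/grpc_cc.so",                   # where the learner expects the file
    container="seed_rl_grpc_tmp",             # hypothetical container name
):
    """Return the Docker commands, as argument lists, in the order to run them."""
    return [
        # Build the image that compiles the grpc op library.
        ["docker", "build", "-f", dockerfile, "-t", image, "."],
        # Create (but do not start) a container so its filesystem can be read.
        ["docker", "create", "--name", container, image],
        # Copy the compiled library from the container to the host checkout.
        ["docker", "cp", f"{container}:{so_in_image}", dest],
        # Remove the temporary container.
        ["docker", "rm", container],
    ]


for command in grpc_build_commands():
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` from the repository root. The TensorFlow version installed on the host still has to match the one used in the image.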

zhuliwen · 2020-08-10T00:28:24Z

How can I run this code without docker?

I did it successfully, and I want to share my experience here.

1. First, create a virtual environment using conda or virtualenv (my Python version is 3.6.7) and install the following packages:

absl-py==0.9.0
appdirs==1.4.4
asn1crypto==0.24.0
astunparse==1.6.3
atari-py==0.2.6
cachetools==4.1.1
certifi==2020.6.20
cffi==1.14.0
chardet==3.0.4
cloudpickle==1.3.0
cryptography==2.1.4
decorator==4.4.2
distlib==0.3.1
filelock==3.0.12
future==0.18.2
gast==0.3.3
google-auth==1.18.0
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.30.0
gym==0.17.2
h5py==2.10.0
idna==2.10
importlib-metadata==1.7.0
importlib-resources==3.0.0
Keras-Preprocessing==1.1.2
keyring==10.6.0
keyrings.alt==3.0
Markdown==3.2.2
numpy==1.19.0
oauthlib==3.1.0
opencv-python==4.3.0.36
opt-einsum==3.2.1
Pillow==7.2.0
protobuf==3.12.2
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycairo==1.19.1
pycparser==2.20
pyglet==1.5.0
PyGObject==3.36.1
requests==2.24.0
requests-oauthlib==1.3.0
rsa==4.6
scipy==1.4.1
SecretStorage==2.3.1
six==1.15.0
tensorboard==2.2.2
tensorboard-plugin-wit==1.7.0
tensorflow-estimator==2.2.0
tensorflow-gpu==2.2.0
tensorflow-probability==0.9.0
termcolor==1.1.0
urllib3==1.25.9
Werkzeug==1.0.1
wrapt==1.12.1
zipp==3.1.0

2. Second, configure the CUDA 10 environment:

conda install cudatoolkit==10.1
conda install cudnn -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/

3. Third, add the parent directory of seed_rl to the Python path. Run this from inside the seed_rl folder:

export PYTHONPATH=$(dirname "$PWD"):$PYTHONPATH
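
The same directory can be computed from inside a script, which avoids hard-coding an absolute path in r2d2_main.py. This is only a sketch: `package_parent` is a hypothetical helper, and it assumes the checkout directory is literally named `seed_rl`.

```python
import os
from pathlib import Path


def package_parent(script_path, package="seed_rl"):
    """Return the directory that contains the `package` folder, or None.

    Walks upward from script_path until a directory named `package` is found.
    """
    path = Path(os.path.abspath(script_path))
    for parent in path.parents:
        if parent.name == package:
            return parent.parent
    return None


print(package_parent("/home/dave/Documents/AI/2020_seed_rl/seed_rl/atari/r2d2_main.py"))
```

At the top of a script, `sys.path.insert(1, str(package_parent(__file__)))` would then have the same effect as the export above for that process only.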

4. Fourth, create 5 tmux windows named learner, actor0, actor1, actor2, actor3.

Run the following commands in these 5 windows, respectively:

python3 atari/r2d2_main.py --run_mode=learner --logtostderr --pdb_post_mortem  --num_actors=4
CUDA_VISIBLE_DEVICES='' python3 atari/r2d2_main.py --run_mode=actor --logtostderr --pdb_post_mortem  --num_actors=4 --task=0
CUDA_VISIBLE_DEVICES='' python3 atari/r2d2_main.py --run_mode=actor --logtostderr --pdb_post_mortem  --num_actors=4 --task=1
CUDA_VISIBLE_DEVICES='' python3 atari/r2d2_main.py --run_mode=actor --logtostderr --pdb_post_mortem  --num_actors=4 --task=2
CUDA_VISIBLE_DEVICES='' python3 atari/r2d2_main.py --run_mode=actor --logtostderr --pdb_post_mortem  --num_actors=4 --task=3
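
For a different number of actors, the command lines follow a simple pattern: one learner, then one actor per task index, with the GPU hidden from the actors. A small sketch that generates them; `launch_commands` is a hypothetical helper and only reproduces the flags shown above.

```python
def launch_commands(num_actors, script="atari/r2d2_main.py"):
    """Return {window_name: command} for one learner and num_actors actors."""
    flags = f"--logtostderr --pdb_post_mortem --num_actors={num_actors}"
    commands = {"learner": f"python3 {script} --run_mode=learner {flags}"}
    for task in range(num_actors):
        # Actors run on CPU only, so the GPU is hidden from them.
        commands[f"actor{task}"] = (
            f"CUDA_VISIBLE_DEVICES='' python3 {script} "
            f"--run_mode=actor {flags} --task={task}"
        )
    return commands


for name, command in launch_commands(4).items():
    print(f"{name}: {command}")
```

The value passed to `--num_actors` has to be the same for the learner and every actor, which is why it is taken from a single argument here.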

Note: make sure that the grpc_cc.so file (9.4M) is present under the grpc folder.

That's all. I hope it works for you!
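
The pinned list from step 1 can also be checked against what is actually installed in the environment. A minimal sketch, assuming the list is available as text in `name==version` form; `parse_pins` and `mismatches` are hypothetical helpers, and `importlib.metadata` requires Python 3.8 or newer (on older interpreters, `pip freeze` gives the same information).

```python
from importlib import metadata


def parse_pins(text):
    """Parse lines of the form 'name==version' into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    problems = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


pins = parse_pins("tensorflow-gpu==2.2.0\ntensorflow-probability==0.9.0\ngym==0.17.2\n")
for name, (wanted, found) in mismatches(pins).items():
    print(f"{name}: wanted {wanted}, found {found}")
```

An empty result means every pinned package is installed at exactly the listed version.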

lespeholt · 2020-08-10T05:56:41Z

@zhuliwen thanks!

galdl · 2020-08-12T12:31:24Z

Excellent, I'll give it a try. Thanks a lot!

omurammm · 2022-04-16T20:21:12Z

FYI, the package versions have to track the repository as it is updated. The required versions are listed in the Dockerfiles:
https://github.com/google-research/seed_rl/blob/master/docker/Dockerfile.grpc
https://github.com/google-research/seed_rl/blob/master/docker/Dockerfile.atari

Currently, you need to use

tensorflow-gpu==2.4.1
tensorflow-probability==0.11.0
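
Since the pins live in the Dockerfiles, they can be read out mechanically instead of being copied by hand. The sketch below scans Dockerfile text for `pip install` lines and collects `name==version` pairs. `pinned_packages` is a hypothetical helper, and the sample text reuses lines quoted earlier in this thread rather than the real Dockerfile contents.

```python
import re

# A distribution name, '==', then a version string.
PIN = re.compile(r"([A-Za-z0-9_.\-]+)==([A-Za-z0-9_.\-+!]+)")


def pinned_packages(dockerfile_text):
    """Return {package: version} for pins found on pip install lines."""
    # Join backslash-continued lines so multi-line RUN commands are one line.
    joined = dockerfile_text.replace("\\\n", " ")
    pins = {}
    for line in joined.splitlines():
        if "pip" in line and "install" in line:
            pins.update(PIN.findall(line))
    return pins


sample = """FROM ubuntu:18.04
RUN pip3 install --upgrade pip
RUN pip3 install tensorflow==2.1.0
"""
print(pinned_packages(sample))
```

Pointing this at the current Dockerfile.grpc and Dockerfile.atari gives the versions to install outside Docker.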

lespeholt mentioned this issue Apr 23, 2020

Installation #18

Closed

lespeholt mentioned this issue Aug 20, 2020

Unable to Instantiate gRPC Server #42

Closed

lespeholt closed this as completed Jan 16, 2023
