Is distributed training supported? #7
Can you explain this a bit? Even when I run with

    Mon Mar 23 14:55:03 2020
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 108...  Off  | 00000000:06:00.0 Off |                  N/A |
    | 28%   50C    P8    17W / 250W |    304MiB / 11178MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0     24508      C   python3                                      147MiB |
    |    0     25486      C   python3                                      147MiB |
    +-----------------------------------------------------------------------------+

The results are also the same when using 16 envs.
I think this means the whole thing is running on the CPU and not the GPU.
Changing

    Traceback (most recent call last):
      File "dreamer.py", line 463, in <module>
        main(parser.parse_args())
      File "dreamer.py", line 422, in main
        actspace = train_envs[0].action_space
      File "/home/arunavo/Pairs-Trading/dreamer/wrappers.py", line 395, in action_space
        self._action_space = self.__getattr__('action_space')
      File "/home/arunavo/Pairs-Trading/dreamer/wrappers.py", line 402, in __getattr__
        return self._receive()
      File "/home/arunavo/Pairs-Trading/dreamer/wrappers.py", line 436, in _receive
        message, payload = self._conn.recv()
      File "/usr/lib/python3.6/multiprocessing/connection.py", line 251, in recv
        return _ForkingPickler.loads(buf.getbuffer())
      File "/home/arunavo/Pairs-Trading/dreamer/wrappers.py", line 306, in __getattr__
        return getattr(self._env, name)
      File "/home/arunavo/Pairs-Trading/dreamer/wrappers.py", line 306, in __getattr__
        return getattr(self._env, name)
      File "/home/arunavo/Pairs-Trading/dreamer/wrappers.py", line 306, in __getattr__
        return getattr(self._env, name)
      [Previous line repeated 328 more times]
    RecursionError: maximum recursion depth exceeded while calling a Python object
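The repeated `__getattr__` frames directly under `_ForkingPickler.loads` resemble a general Python pitfall: a wrapper that forwards every unknown attribute to `self._env` can recurse forever if an attribute is requested before `_env` exists, which is the state of an object midway through unpickling. Whether that is what happened in this run is not established in this thread. The sketch below only illustrates the mechanism; `ToyEnv`, `NaiveWrapper`, `GuardedWrapper` and `probe_like_unpickler` are hypothetical names and are not taken from `wrappers.py`.

```python
class ToyEnv:
    """Hypothetical stand-in for a real environment."""
    action_space = "discrete(4)"


class NaiveWrapper:
    """Forwards every unknown attribute to the wrapped environment."""

    def __init__(self, env):
        self._env = env

    def __getattr__(self, name):
        # Only called when normal lookup fails. If `_env` itself is missing,
        # `self._env` re-enters __getattr__('_env') without end.
        return getattr(self._env, name)


class GuardedWrapper:
    """Same idea, but refuses to forward private or dunder names."""

    def __init__(self, env):
        self._env = env

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._env, name)


def probe_like_unpickler(wrapper_cls):
    """Imitate unpickling: build without __init__, then probe __setstate__."""
    obj = wrapper_cls.__new__(wrapper_cls)  # `_env` is not set yet
    try:
        getattr(obj, "__setstate__", None)
    except RecursionError:
        return "RecursionError"
    obj.__dict__.update({"_env": ToyEnv()})  # state is restored afterwards
    return obj.action_space


if __name__ == "__main__":
    print(probe_like_unpickler(NaiveWrapper))
    print(probe_like_unpickler(GuardedWrapper))
```

The guard makes the early `__setstate__` probe fail with an ordinary `AttributeError`, which the unpickling machinery treats as "not defined" and moves on.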
@arunavo4, maybe it is caused by the CUDA version. TensorFlow 2.1.0 only supports CUDA 10.1. After changing the CUDA version, the code runs smoothly on my machine.
There are some features that should allow interacting with a vectorized environment. In this case, the agent receives a batch of inputs and produces a batch of actions. The environments are stepped in sync but in parallel, either each using a

In practice, I've found the computational bottleneck to be training the world model rather than environment interaction, so I haven't tested vectorized acting much.
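As a rough illustration of the batched acting described above, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: `CounterEnv`, `batch_policy`, `step_in_sync` and `rollout` are hypothetical names, and the thread pool is this sketch's own choice of parallelism, not a statement about what the repository uses.

```python
from concurrent.futures import ThreadPoolExecutor


class CounterEnv:
    """Hypothetical toy environment: the observation is a running total."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.reset()

    def reset(self):
        self.total = 0
        self.steps = 0
        return self.total

    def step(self, action):
        self.total += action
        self.steps += 1
        done = self.steps >= self.horizon
        return self.total, float(action), done


def batch_policy(observations):
    """Map a batch of observations to a batch of actions, one per env."""
    return [1 if obs % 2 == 0 else 2 for obs in observations]


def step_in_sync(envs, actions, pool):
    """Step every env once; map() keeps order and waits for all of them."""
    return list(pool.map(lambda pair: pair[0].step(pair[1]), zip(envs, actions)))


def rollout(num_envs=4, horizon=3):
    envs = [CounterEnv(horizon) for _ in range(num_envs)]
    observations = [env.reset() for env in envs]
    with ThreadPoolExecutor(max_workers=num_envs) as pool:
        for _ in range(horizon):
            actions = batch_policy(observations)
            results = step_in_sync(envs, actions, pool)
            observations = [obs for obs, _, _ in results]
    return observations


if __name__ == "__main__":
    print(rollout())
```

Because the policy is called once per step on the whole batch, a slow environment delays all the others, which is the trade-off of stepping in sync.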
@danijar So you are saying that leaving it at the default is the best way to train it?
@IcarusWizard Did you try with Atari? And was your GPU being utilized? I am in the process of upgrading to the new CUDA version. I will let you know if I make progress.
@arunavo4 It works on Atari too, and GPUs are utilized. You just need to pass additional arguments like
@IcarusWizard Thanks a lot, now it finally works! Now it uses the GPU very well.
Exactly, those are the necessary flags for discrete actions. You may want to tune some of the other hyperparameters for Atari as well (e.g.
Hi! I cannot run Dreamer on the GPU either. Can you share some tips?
@CR-Gjx Just make sure you use exactly TensorFlow 2.1.0 and CUDA 10.1.
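A quick way to check the setup before training is to ask TensorFlow what it can see. The snippet below is a hedged sketch rather than part of the repository: `gpu_report` is a hypothetical helper, and it relies only on `tf.__version__`, `tf.test.is_built_with_cuda()` and `tf.config.list_physical_devices`. If TensorFlow cannot be imported at all, it returns an empty report instead of failing.

```python
def gpu_report():
    """Summarize the TensorFlow version and the GPUs it can see."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter.
        return {"tensorflow": None, "built_with_cuda": None, "gpus": []}
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [gpu.name for gpu in gpus],
    }


if __name__ == "__main__":
    print(gpu_report())
```

An empty `gpus` list alongside a CUDA-enabled build would be consistent with the version mismatch discussed above, although other causes are possible.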
@danijar Thank you for this work. I had this question.