Fail to run tutorial_Isaac_Gym.py #169

planetbalileua · 2022-06-17T14:40:51Z

Hello! Thank you for creating this brilliant library! It has been very helpful for a personal project I am working on.
I hit an error when trying to run tutorial_Isaac_Gym.py in the examples folder:

Traceback (most recent call last):
  File "/home/meow/anaconda3/envs/igym/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/meow/anaconda3/envs/igym/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/meow/ElegantRL/elegantrl/train/run.py", line 162, in run
    env = build_env(args.env, args.env_func, args.env_args)
  File "/home/meow/ElegantRL/elegantrl/train/config.py", line 249, in build_env
    env = env_func(**kwargs_filter(env_func.__init__, env_args.copy()))
  File "/home/meow/ElegantRL/elegantrl/envs/IsaacGym.py", line 45, in __init__
    env: VecTask = isaac_task(
  File "/home/meow/ElegantRL/elegantrl/envs/isaac_tasks/ant.py", line 69, in __init__
    super().__init__(
  File "/home/meow/ElegantRL/elegantrl/envs/isaac_tasks/base/vec_task.py", line 213, in __init__
    self.create_sim()
  File "/home/meow/ElegantRL/elegantrl/envs/isaac_tasks/ant.py", line 156, in create_sim
    self._create_envs(
  File "/home/meow/ElegantRL/elegantrl/envs/isaac_tasks/ant.py", line 199, in _create_envs
    self.joint_gears = to_torch(motor_efforts, device=self.device)
  File "/home/meow/Downloads/IsaacGym_Preview_3_Package/isaacgym/python/isaacgym/torch_utils.py", line 16, in to_torch
    return torch.tensor(x, dtype=dtype, device=device, requires_grad=requires_grad)
  File "/home/meow/anaconda3/envs/igym/lib/python3.8/site-packages/torch/cuda/__init__.py", line 216, in _lazy_init
    torch._C._cuda_init()
RuntimeError: CUDA error: out of memory

I'm running this on an NVIDIA RTX 3070 Ti with 8 GB of VRAM, and my CUDA version is:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0

The same Ant example (with 2048 envs) worked when I tested it with the original Isaac Gym train.py. I'm fairly sure I had about 7.2 GB of free VRAM when running this, but the CUDA out-of-memory error still appears. My torch version is 1.11.0.

I have also tried reducing the number of envs, the batch size, the network size and other parameters, but the error remains.
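One way to double-check the free-VRAM figure right before launching is to query nvidia-smi from Python. The sketch below is only an illustration and not part of ElegantRL or Isaac Gym: the helper names parse_gpu_memory, free_mib and query_gpu_memory are hypothetical, and it assumes nvidia-smi is on the PATH. It reports memory as seen by the driver, which may differ from what a freshly started child process can actually allocate.

```python
import subprocess
from typing import List, Tuple


def parse_gpu_memory(csv_text: str) -> List[Tuple[int, int]]:
    """Parse 'used, total' pairs in MiB, one GPU per line."""
    pairs = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def free_mib(pairs: List[Tuple[int, int]]) -> List[int]:
    """Free memory per GPU in MiB, computed as total minus used."""
    return [total - used for used, total in pairs]


def query_gpu_memory() -> List[Tuple[int, int]]:
    """Ask nvidia-smi for per-GPU memory usage (needs an NVIDIA driver)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)
```

On a machine with a GPU, free_mib(query_gpu_memory()) gives one number per device; comparing it just before and just after the failing launch can show whether another process is holding the memory.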

Once again, thank you so much for any help with this issue.

supersglzc · 2022-06-20T01:59:28Z

Hi planetbalileua and thanks for reaching out!

We realize that some of the code is inconsistent because of fast iteration, and we are refactoring it.
For Isaac Gym users, I have published a single-process version with a demo on Ant and Humanoid. Could you please try that and see if the error remains?

planetbalileua · 2022-06-21T03:37:48Z

Hi supersglzc!
The single-process version works fine after some small modifications!
The changes I made:
Add

    args.if_use_per

and comment out line 60 in elegantrl/train/evaluator.py (which uses wandb).
Thank you so much for your help!
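The first change above, adding a missing attribute to the argument object, can be sketched as follows. This is a minimal illustration under stated assumptions: SimpleNamespace stands in for the library's argument object, ensure_attr is a hypothetical helper, the env_name and gpu_id fields are made up, and the value False is an assumption because the thread does not say which value was used.

```python
from types import SimpleNamespace


def ensure_attr(args, name, default):
    """Set args.<name> to default only if it is missing; return the final value."""
    if not hasattr(args, name):
        setattr(args, name, default)
    return getattr(args, name)


# Stand-in for the real argument object (hypothetical fields).
args = SimpleNamespace(env_name="Ant", gpu_id=0)

# False is an assumed default, not a value confirmed in the thread.
ensure_attr(args, "if_use_per", False)
```

Because the helper only fills in a missing attribute, it leaves the value alone on versions of the code that already define it.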

YangletLiu · 2022-06-29T19:51:47Z

Would you like to test the updated file at: https://github.com/AI4Finance-Foundation/ElegantRL/blob/master/examples/tutorial_Isaac_Gym.py

planetbalileua · 2022-06-30T14:18:46Z

Hi!
I have tested the updated file, and with the latest release it fails to find train_and_evaluate_mp in run.py.
Some other errors I ran into:

ImportError: cannot import name 'ReplayBufferList' from 'elegantrl.train.replay_buffer' (/home/meow/ElegantRL/elegantrl/train/replay_buffer.py)

So I added ReplayBufferList to replay_buffer.py.

  File "/home/meow/ElegantRL/elegantrl/agents/AgentPPO.py", line 657, in AgentPPOHterm
    def __init__(self, net_dim: int, state_dim: int, action_dim: int, gpu_id: int = 0, args: Arguments = None):
NameError: name 'Arguments' is not defined

Fixed by adding: from elegantrl.train.config import Arguments

Thank you again for updating!
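Errors like the ones in this comment usually mean the example script and the installed checkout are out of sync. A quick way to see which expected names are actually available is sketched below. The helper check_names is hypothetical and not part of ElegantRL; the demo uses standard-library modules so that it runs anywhere.

```python
import importlib
from typing import Dict, Iterable, Tuple


def check_names(requirements: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Report, for each (module, attribute) pair, whether the name is available."""
    report = {}
    for module_name, attr in requirements:
        key = f"{module_name}.{attr}"
        try:
            module = importlib.import_module(module_name)
        except Exception as exc:  # ImportError, or e.g. NameError raised during import
            report[key] = f"import failed: {type(exc).__name__}"
            continue
        report[key] = "ok" if hasattr(module, attr) else "missing"
    return report


# Demo with standard-library modules only.
demo = check_names([
    ("collections", "OrderedDict"),
    ("collections", "ReplayBufferList"),
    ("no_such_module_xyz", "anything"),
])
```

With ElegantRL installed, the same helper could be pointed at the names mentioned in this thread, for example ("elegantrl.train.run", "train_and_evaluate_mp") and ("elegantrl.train.replay_buffer", "ReplayBufferList").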

supersglzc · 2022-07-21T03:19:18Z

Fixed the errors. The issue is closed.

YangletLiu assigned supersglzc Jun 19, 2022

YangletLiu added the bug label Jun 19, 2022

supersglzc closed this as completed Jul 21, 2022

supersglzc added a commit that referenced this issue Jul 21, 2022

Fix issues #169, #184, #186, #188

925aa61

supersglzc added a commit that referenced this issue Jul 21, 2022

Merge pull request #189 from AI4Finance-Foundation/Fix-issues-#169,-#184

ebb9a59

Skylark0924 mentioned this issue Nov 3, 2022

The assets data need to be packaged #231

Open
