Errors on new install #67

jarlva · 2023-08-12T08:35:00Z

After following the steps here (installing the DI dependency), I ran `python3 -u zoo/classic_control/cartpole/config/cartpole_muzero_config.py` on a new install and now get:

/home/user/miniconda3/envs/light/lib/python3.9/site-packages/gym/wrappers/step_api_compatibility.py:39: DeprecationWarning: WARN: Initializing environment in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
  deprecation(
/home/user/miniconda3/envs/light/lib/python3.9/site-packages/gym/wrappers/step_api_compatibility.py:39: DeprecationWarning: WARN: Initializing environment in old step API which returns one bool instead of two. It is recommended to set `new_step_api=True` to use new step API. This will be the default behaviour in future.
  deprecation(
/home/user/miniconda3/envs/light/lib/python3.9/site-packages/gym/core.py:268: DeprecationWarning: WARN: Function `env.seed(seed)` is marked as deprecated and will be removed in the future. Please use `env.reset(seed=seed)` instead.
  deprecation(
/home/user/miniconda3/envs/light/lib/python3.9/site-packages/gym/core.py:268: DeprecationWarning: WARN: Function `env.seed(seed)` is marked as deprecated and will be removed in the future. Please use `env.reset(seed=seed)` instead.
  deprecation(
Traceback (most recent call last):
  File "/home/user/py/LightZero/zoo/classic_control/cartpole/config/cartpole_muzero_config.py", line 93, in <module>
    train_muzero([main_config, create_config], seed=0, max_env_step=max_env_step)
  File "/home/user/py/LightZero/lzero/entry/train_muzero.py", line 158, in train_muzero
    new_data = collector.collect(train_iter=learner.train_iter, policy_kwargs=collect_kwargs)
  File "/home/user/py/LightZero/lzero/worker/muzero_collector.py", line 383, in collect
    policy_output = self._policy.forward(stack_obs, action_mask, temperature, to_play, epsilon)
  File "/home/user/py/LightZero/lzero/policy/muzero.py", line 520, in _forward_collect
    network_output = self._collect_model.initial_inference(data)
  File "/home/user/py/LightZero/lzero/model/muzero_model_mlp.py", line 170, in initial_inference
    latent_state = self._representation(obs)
  File "/home/user/py/LightZero/lzero/model/muzero_model_mlp.py", line 218, in _representation
    latent_state = self.representation_network(observation)
  File "/home/user/miniconda3/envs/light/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/py/LightZero/lzero/model/common.py", line 280, in forward
    return self.fc_representation(x)
  File "/home/user/miniconda3/envs/light/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda3/envs/light/lib/python3.9/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/user/miniconda3/envs/light/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/miniconda3/envs/light/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception ignored in: <function MuZeroCollector.__del__ at 0x7f59b7722310>
Traceback (most recent call last):
  File "/home/user/py/LightZero/lzero/worker/muzero_collector.py", line 181, in __del__
    self.close()
  File "/home/user/py/LightZero/lzero/worker/muzero_collector.py", line 171, in close
    self._env.close()
  File "/home/user/py/DI-engine/ding/envs/env_manager/subprocess_env_manager.py", line 635, in close
    p.send(['close', None, None])
  File "/home/user/miniconda3/envs/light/lib/python3.9/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/user/miniconda3/envs/light/lib/python3.9/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/user/miniconda3/envs/light/lib/python3.9/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <function MuZeroEvaluator.__del__ at 0x7f59b7722af0>
Traceback (most recent call last):
  File "/home/user/py/LightZero/lzero/worker/muzero_evaluator.py", line 170, in __del__
    self.close()
  File "/home/user/py/LightZero/lzero/worker/muzero_evaluator.py", line 160, in close
    self._env.close()
  File "/home/user/py/DI-engine/ding/envs/env_manager/subprocess_env_manager.py", line 635, in close
    p.send(['close', None, None])
  File "/home/user/miniconda3/envs/light/lib/python3.9/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/user/miniconda3/envs/light/lib/python3.9/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/user/miniconda3/envs/light/lib/python3.9/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe


puyuan1996 · 2023-08-12T08:45:16Z

Hello,

This error is likely caused by a mismatch between your installed torch build (and the CUDA version it was compiled against) and your GPU hardware. You can try the following command to install torch together with its matching CUDA build:

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

If similar errors persist, please provide details on your hardware device and torch version so we can better pinpoint the issue.
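To collect those details, a short script along these lines may help. It is only a sketch, not part of LightZero: the helper names `build_supports_device` and `report` are made up for illustration, and the compatibility rule it applies (an `sm_XY` binary runs on a device of the same major version with an equal or higher minor version, while `compute_XY` PTX can be JIT-compiled on an equal or newer device) is a simplified reading of CUDA's compatibility rules. The torch calls it relies on are `torch.version.cuda`, `torch.cuda.get_device_name`, `torch.cuda.get_device_capability`, and `torch.cuda.get_arch_list`.

```python
def build_supports_device(capability, arch_list):
    """Rough check: can a torch build with `arch_list` run on `capability`?

    capability: (major, minor) tuple, e.g. (8, 6).
    arch_list:  strings such as "sm_70" or "compute_80".
    Simplified assumption, not an official torch API.
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():
            continue  # skip entries this sketch does not understand
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        # Precompiled binary: same major version, minor not newer than the device.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX: can be JIT-compiled for the same or a newer device.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report():
    """Print the torch/CUDA/GPU details requested above."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print("torch version:", torch.__version__)
    print("torch built with CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("CUDA is not available to torch")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("GPU:", torch.cuda.get_device_name(0))
    print("compute capability:", capability)
    print("architectures in this torch build:", arch_list)
    print("build appears to support this GPU:",
          build_supports_device(capability, arch_list))


if __name__ == "__main__":
    report()
```

If the last line prints `False`, that would be consistent with the "no kernel image is available" error, and pasting the full output into the issue should make the mismatch easier to pinpoint.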

Best wishes.

PaParaZz1 closed this as completed Aug 15, 2023
