Has anyone else encountered this error after running the training command? #141

Open
Star-down opened this issue Apr 23, 2024 · 4 comments

Comments

@Star-down

2024-04-23 10:00:57,809 agent number of parameters: 4346693
Traceback (most recent call last):
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/run.py", line 101, in <module>
    main()
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/run.py", line 95, in main
    trainer.train()
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/ppo/ppo_trainer.py", line 267, in train
    rollouts.observations[sensor][0].copy_(batch[sensor])
RuntimeError: The size of tensor a (5) must match the size of tensor b (128) at non-singleton dimension 1
Exception ignored in: <function VectorEnv.__del__ at 0x7a436e41e670>
Traceback (most recent call last):
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 592, in __del__
    self.close()
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 463, in close
    write_fn((CLOSE_COMMAND, None))
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 118, in __call__
    self.write_fn(data)
  File "/home/dwl/ss/habitat-lab/habitat/utils/pickle5_multiprocessing.py", line 62, in send
    self.send_bytes(buf.getvalue())
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
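
The failing call is `rollouts.observations[sensor][0].copy_(batch[sensor])`, so a useful first check is which sensor's storage shape disagrees with the shape of the incoming batch. Below is a minimal sketch of such a check. It works on plain shape tuples so that it runs without PyTorch; `find_shape_mismatches` and the example shapes are hypothetical and are not taken from the SoundSpaces code.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    storage_shapes: Dict[str, Shape], batch_shapes: Dict[str, Shape]
) -> List[str]:
    """Return one message per sensor whose batch shape differs from the storage shape."""
    problems = []
    for sensor, expected in storage_shapes.items():
        actual = batch_shapes.get(sensor)
        if actual is None:
            problems.append(f"{sensor}: missing from batch")
        elif tuple(actual) != tuple(expected):
            problems.append(
                f"{sensor}: storage {tuple(expected)} vs batch {tuple(actual)}"
            )
    return problems


# Hypothetical shapes, for illustration only (not read from the failing run).
storage = {"depth": (5, 128, 128, 1)}
batch = {"depth": (5, 1, 128, 128, 1)}
report = find_shape_mismatches(storage, batch)
print("\n".join(report))
```

In the trainer, the two dictionaries could be built just before the failing line with something like `{k: tuple(v[0].shape) for k, v in rollouts.observations.items()}` and `{k: tuple(v.shape) for k, v in batch.items()}`, assuming both objects behave like dictionaries of tensors, as the indexing in the traceback suggests.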

@swapb94

swapb94 commented Jun 11, 2024

Were you able to resolve this, @Star-down?

@Star-down
Author

@swapb94 No. I tried removing the redundant dimensions, but the program stopped after running for about 20 minutes because of a memory leak.

@swapb94

swapb94 commented Jun 11, 2024

@ChanganVR, any suggestions?

@swapb94

swapb94 commented Jun 11, 2024

@ChanganVR, any suggestions?

I followed the step-by-step installation guide and checked out both habitat-lab and habitat-sim at v0.1.7.
However, running cache_observations gives
ImportError: cannot import name 'HabitatSimSensor' from 'habitat.sims.habitat_simulator.habitat_simulator'
A similar issue is already open (#134).
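
When an import fails like this after checking out a specific tag, it can help to confirm which package versions the active environment actually resolves. Here is a small sketch using only the standard library. The distribution names in the loop are guesses and may need adjusting to match `pip list`; `installed_version` is a hypothetical helper.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names; they are not confirmed by this thread.
for name in ("habitat", "habitat-sim", "habitat_sim"):
    print(name, "->", installed_version(name))
```

A `None` for a package that imports fine usually means it is installed under a different distribution name, so the list above would need to be changed rather than the environment.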

To work around this, I copied habitat-lab/habitat/sims/habitat_simulator/habitat_simulator.py from habitat-lab v0.2.2 into habitat-lab v0.1.7.
After doing this, cache_observations.py runs without any errors. However, running
python ss_baselines/av_nav/run.py --run-type eval --exp-config ss_baselines/av_nav/config/audionav/replica/test_telephone/audiogoal_depth.yaml EVAL_CKPT_PATH_DIR data/pretrained_weights/audionav/av_nav/replica/heard.pth
gives

2024-06-11 10:43:26,164 Initializing dataset AudioNav
2024-06-11 10:43:26,182 initializing sim SoundSpacesSim
2024-06-11 10:43:26,532 Initializing task AudioNav
Sequential(
  (0): Conv2d(1, 32, kernel_size=(8, 8), stride=(4, 4))
  (1): ReLU(inplace=True)
  (2): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2))
  (3): ReLU(inplace=True)
  (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2))
  (5): Flatten()
  (6): Linear(in_features=2304, out_features=512, bias=True)
  (7): ReLU(inplace=True)
)

        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 15, 16]           4,128
              ReLU-2           [-1, 32, 15, 16]               0
            Conv2d-3             [-1, 64, 6, 7]          32,832
              ReLU-4             [-1, 64, 6, 7]               0
            Conv2d-5             [-1, 64, 4, 5]          36,928
           Flatten-6                 [-1, 1280]               0
            Linear-7                  [-1, 512]         655,872
              ReLU-8                  [-1, 512]               0

Total params: 729,760
Trainable params: 729,760
Non-trainable params: 0

Input size (MB): 0.03
Forward/backward pass size (MB): 0.19
Params size (MB): 2.78
Estimated Total Size (MB): 3.00
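
The two printouts above can be sanity-checked with ordinary convolution arithmetic. The sketch below assumes a 128 x 128 input, which the log does not state, and unpadded convolutions, which is what the printed `Conv2d` entries imply. Under those assumptions the `Sequential` block flattens to 2304 features, matching its `Linear` layer. The summary table instead flattens to 64 x 4 x 5 = 1280, so the two printouts appear to describe different networks.

```python
def conv_out(size: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded, undilated convolution along one axis."""
    return (size - kernel) // stride + 1


# (kernel, stride) pairs as printed in the Sequential(...) block.
layers = [(8, 4), (4, 2), (3, 2)]

side = 128  # assumed square input side; not stated in the log
for kernel, stride in layers:
    side = conv_out(side, kernel, stride)

sequential_flat = 64 * side * side  # 64 output channels in the last conv
summary_flat = 64 * 4 * 5  # Conv2d-5 output shape [-1, 64, 4, 5] in the table
summary_linear_params = summary_flat * 512 + 512  # weights plus biases

print(side, sequential_flat, summary_flat, summary_linear_params)
# prints: 6 2304 1280 655872
```

The last number agrees with the 655,872 parameters listed for `Linear-7`, so the summary table is at least internally consistent.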

0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/run.py", line 101, in <module>
    main()
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/run.py", line 97, in main
    trainer.eval(args.eval_interval, args.prev_ckpt_ind, config.USE_LAST_CKPT)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/common/base_trainer.py", line 105, in eval
    result = self._eval_checkpoint(self.config.EVAL_CKPT_PATH_DIR, writer)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/ppo_trainer.py", line 522, in _eval_checkpoint
    _, actions, _, test_recurrent_hidden_states = self.actor_critic.act(
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/policy.py", line 44, in act
    features, rnn_hidden_states = self.net(
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/policy.py", line 206, in forward
    x.append(self.visual_encoder(observations))
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/models/visual_cnn.py", line 156, in forward
    depth_observations = depth_observations.permute(0, 3, 1, 2)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 5 is not equal to len(dims) = 4
0%| | 0/1000 [00:00<?, ?it/s]
Exception ignored in: <function VectorEnv.__del__ at 0x7f57e21de790>
Traceback (most recent call last):
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 588, in __del__
    self.close()
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 459, in close
    write_fn((CLOSE_COMMAND, None))
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 118, in __call__
    self.write_fn(data)
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/utils/pickle5_multiprocessing.py", line 63, in send
    self.send_bytes(buf.getvalue())
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
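
The `BrokenPipeError` at the end of both tracebacks is raised while `VectorEnv` is being cleaned up, so it is most likely a consequence of the earlier `RuntimeError` rather than a separate problem: by the time `close()` writes to the worker pipes, the other end is probably already gone. Below is a minimal standard-library sketch of that mechanism. It does not use habitat, and `send_after_peer_closed` is a hypothetical name.

```python
from multiprocessing import Pipe


def send_after_peer_closed() -> str:
    """Close the receiving end of a pipe, then try to send on the other end."""
    reader, writer = Pipe(duplex=False)
    reader.close()  # stands in for a worker process that has already exited
    try:
        writer.send(("close", None))
        return "sent"
    except BrokenPipeError as exc:
        return f"BrokenPipeError: {exc}"
    finally:
        writer.close()


print(send_after_peer_closed())
```

On Linux this is expected to report a `BrokenPipeError`, which suggests that debugging effort is better spent on the first exception in each traceback.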
