Has anyone else encountered this error after running the training command? #141

Star-down · 2024-04-23T02:02:41Z

2024-04-23 10:00:57,809 agent number of parameters: 4346693
Traceback (most recent call last):
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/run.py", line 101, in <module>
    main()
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/run.py", line 95, in main
    trainer.train()
  File "/home/dwl/ss/sound-spaces/ss_baselines/av_nav/ppo/ppo_trainer.py", line 267, in train
    rollouts.observations[sensor][0].copy_(batch[sensor])
RuntimeError: The size of tensor a (5) must match the size of tensor b (128) at non-singleton dimension 1
Exception ignored in: <function VectorEnv.__del__ at 0x7a436e41e670>
Traceback (most recent call last):
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 592, in __del__
    self.close()
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 463, in close
    write_fn((CLOSE_COMMAND, None))
  File "/home/dwl/ss/habitat-lab/habitat/core/vector_env.py", line 118, in __call__
    self.write_fn(data)
  File "/home/dwl/ss/habitat-lab/habitat/utils/pickle5_multiprocessing.py", line 62, in send
    self.send_bytes(buf.getvalue())
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/dwl/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

swapb94 · 2024-06-11T08:07:52Z

Were you able to resolve this, @Star-down ?

Star-down · 2024-06-11T08:12:43Z

@swapb94 No. I tried removing the redundant dimensions, but the program then stopped after running for about 20 minutes because of a memory leak.
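
For anyone debugging the first traceback: the `copy_` call fails because the two shapes cannot be broadcast against each other. The sketch below is plain Python, not PyTorch code, and only mirrors the usual right-aligned broadcasting rule to show which dimension such a message points at. The helper name `first_broadcast_mismatch` and both shapes are hypothetical. The thread does not state the real shapes; these were picked only because they reproduce the numbers in the message (5 vs 128 at dimension 1).

```python
from typing import Optional, Sequence, Tuple


def first_broadcast_mismatch(
    shape_a: Sequence[int], shape_b: Sequence[int]
) -> Optional[Tuple[int, int, int]]:
    """Return (dim, size_a, size_b) for the first incompatible dimension,
    scanning from the trailing dimension, or None if the shapes broadcast."""
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both are compared right-aligned.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim in range(ndim - 1, -1, -1):
        # Sizes are compatible if they are equal or either one is 1.
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim, a[dim], b[dim]
    return None


# Hypothetical shapes (assumption): a 4-D rollout buffer slot and a batch
# tensor that carries one extra trailing dimension.
buffer_shape = (5, 128, 128, 1)
batch_shape = (5, 128, 128, 1, 1)

print(first_broadcast_mismatch(buffer_shape, batch_shape))  # → (1, 5, 128)
```

Printing `batch[sensor].shape` and `rollouts.observations[sensor][0].shape` just before the failing `copy_` call would show whether the real tensors differ in this way.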

swapb94 · 2024-06-11T09:44:16Z

@ChanganVR, any suggestions?

swapb94 · 2024-06-11T10:01:46Z

@ChanganVR, any suggestions?

I followed the step-by-step installation guide and checked out both habitat-lab and habitat-sim at v0.1.7.
However, running cache_observations.py fails with
ImportError: cannot import name 'HabitatSimSensor' from 'habitat.sims.habitat_simulator.habitat_simulator'
A similar issue is already open (#134).

To work around this, I copied habitat-lab/habitat/sims/habitat_simulator/habitat_simulator.py from habitat-lab v0.2.2 into habitat-lab v0.1.7.
After that, cache_observations.py runs without any errors. However, running
python ss_baselines/av_nav/run.py --run-type eval --exp-config ss_baselines/av_nav/config/audionav/replica/test_telephone/audiogoal_depth.yaml EVAL_CKPT_PATH_DIR data/pretrained_weights/audionav/av_nav/replica/heard.pth
gives

2024-06-11 10:43:26,164 Initializing dataset AudioNav
2024-06-11 10:43:26,182 initializing sim SoundSpacesSim
2024-06-11 10:43:26,532 Initializing task AudioNav
Sequential(
  (0): Conv2d(1, 32, kernel_size=(8, 8), stride=(4, 4))
  (1): ReLU(inplace=True)
  (2): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2))
  (3): ReLU(inplace=True)
  (4): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2))
  (5): Flatten()
  (6): Linear(in_features=2304, out_features=512, bias=True)
  (7): ReLU(inplace=True)
)
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 15, 16]           4,128
              ReLU-2           [-1, 32, 15, 16]               0
            Conv2d-3             [-1, 64, 6, 7]          32,832
              ReLU-4             [-1, 64, 6, 7]               0
            Conv2d-5             [-1, 64, 4, 5]          36,928
           Flatten-6                 [-1, 1280]               0
            Linear-7                  [-1, 512]         655,872
              ReLU-8                  [-1, 512]               0

Total params: 729,760
Trainable params: 729,760
Non-trainable params: 0

Input size (MB): 0.03
Forward/backward pass size (MB): 0.19
Params size (MB): 2.78
Estimated Total Size (MB): 3.00

0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/run.py", line 101, in <module>
    main()
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/run.py", line 97, in main
    trainer.eval(args.eval_interval, args.prev_ckpt_ind, config.USE_LAST_CKPT)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/common/base_trainer.py", line 105, in eval
    result = self._eval_checkpoint(self.config.EVAL_CKPT_PATH_DIR, writer)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/ppo_trainer.py", line 522, in _eval_checkpoint
    _, actions, _, test_recurrent_hidden_states = self.actor_critic.act(
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/policy.py", line 44, in act
    features, rnn_hidden_states = self.net(
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/ppo/policy.py", line 206, in forward
    x.append(self.visual_encoder(observations))
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/media/scratch2/projects2/soundspaces/sound-spaces/ss_baselines/av_nav/models/visual_cnn.py", line 156, in forward
    depth_observations = depth_observations.permute(0, 3, 1, 2)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 5 is not equal to len(dims) = 4
0%| | 0/1000 [00:00<?, ?it/s]
Exception ignored in: <function VectorEnv.__del__ at 0x7f57e21de790>
Traceback (most recent call last):
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 588, in __del__
    self.close()
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 459, in close
    write_fn((CLOSE_COMMAND, None))
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/core/vector_env.py", line 118, in __call__
    self.write_fn(data)
  File "/media/scratch2/projects2/soundspaces/habitat-lab/habitat/utils/pickle5_multiprocessing.py", line 63, in send
    self.send_bytes(buf.getvalue())
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/swapnil/anaconda3/envs/ss/lib/python3.9/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
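
The `permute` failure above says the depth observation reaching `visual_cnn.py` has 5 dimensions, while `permute(0, 3, 1, 2)` lists only 4. The sketch below is plain Python on shape tuples, not PyTorch code, and shows the bookkeeping involved: the ordering must name every dimension, so an extra singleton dimension has to be dropped first. The helpers `squeeze_dim` and `permute_shape` and the 5-D shape are hypothetical. The thread does not say which of the five dimensions is the extra one, so the position used here is an assumption.

```python
from typing import Sequence, Tuple


def squeeze_dim(shape: Sequence[int], dim: int) -> Tuple[int, ...]:
    """Drop one dimension from a shape, but only if its size is 1."""
    if shape[dim] != 1:
        raise ValueError(f"dimension {dim} has size {shape[dim]}, not 1")
    return tuple(shape[:dim]) + tuple(shape[dim + 1:])


def permute_shape(shape: Sequence[int], order: Sequence[int]) -> Tuple[int, ...]:
    """Reorder a shape; the ordering must list every dimension exactly once."""
    if len(order) != len(shape):
        raise ValueError(
            f"input has {len(shape)} dimensions but the ordering lists {len(order)}"
        )
    if sorted(order) != list(range(len(shape))):
        raise ValueError("ordering must be a permutation of the dimension indices")
    return tuple(shape[i] for i in order)


# Hypothetical 5-D depth observation (assumption): batch, extra singleton
# dimension, height, width, channel.
depth_shape = (1, 1, 128, 128, 1)

try:
    permute_shape(depth_shape, (0, 3, 1, 2))
except ValueError as err:
    print(err)  # same kind of mismatch as in the traceback: 5 dims vs 4

# Dropping the assumed extra dimension gives a 4-D, channels-last shape,
# which the ordering (0, 3, 1, 2) turns into channels-first.
print(permute_shape(squeeze_dim(depth_shape, 1), (0, 3, 1, 2)))  # → (1, 1, 128, 128)
```

Printing `depth_observations.shape` just before the failing `permute` call would show which dimension, if any, is the singleton one in the real run.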
