0%| | 0/480000 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 212, in <module>
train(training_dbs, validation_db, args.start_iter)
File "train.py", line 141, in train
training_loss, focal_loss, pull_loss, push_loss, regr_loss = nnet.train(**training)
File "/home/wlk/Center/CenterNet/nnet/py_factory.py", line 82, in train
loss_kp = self.network(xs, ys)
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/wlk/Center/CenterNet/models/py_utils/data_parallel.py", line 66, in forward
inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids, self.chunk_sizes)
File "/home/wlk/Center/CenterNet/models/py_utils/data_parallel.py", line 77, in scatter
return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim, chunk_sizes=self.chunk_sizes)
File "/home/wlk/Center/CenterNet/models/py_utils/scatter_gather.py", line 30, in scatter_kwargs
inputs = scatter(inputs, target_gpus, dim, chunk_sizes) if inputs else []
File "/home/wlk/Center/CenterNet/models/py_utils/scatter_gather.py", line 25, in scatter
return scatter_map(inputs)
File "/home/wlk/Center/CenterNet/models/py_utils/scatter_gather.py", line 18, in scatter_map
return list(zip(*map(scatter_map, obj)))
File "/home/wlk/Center/CenterNet/models/py_utils/scatter_gather.py", line 20, in scatter_map
return list(map(list, zip(*map(scatter_map, obj))))
File "/home/wlk/Center/CenterNet/models/py_utils/scatter_gather.py", line 15, in scatter_map
return Scatter.apply(target_gpus, chunk_sizes, dim, obj)
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/nn/parallel/_functions.py", line 89, in forward
outputs = comm.scatter(input, target_gpus, chunk_sizes, ctx.dim, streams)
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/cuda/comm.py", line 148, in scatter
return tuple(torch._C._scatter(tensor, devices, chunk_sizes, dim, streams))
RuntimeError: Expected the device associated with the stream at index 4 (was 24371) to match the device supplied at that index (expected 48) (scatter at /opt/conda/conda-bld/pytorch_1544202130060/work/torch/csrc/cuda/comm.cpp:199)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f922e9ebcc5 in /home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: torch::cuda::scatter(at::Tensor const&, c10::ArrayRef<long>, c10::optional<std::vector<long, std::allocator<long> > > const&, long, c10::optional<std::vector<c10::optional<at::cuda::CUDAStream>, std::allocator<c10::optional<at::cuda::CUDAStream> > > > const&) + 0x85d (0x7f926f55552d in /home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x4fae71 (0x7f926f55ae71 in /home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x112176 (0x7f926f172176 in /home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #11: THPFunction_apply(_object*, _object*) + 0x5a1 (0x7f926f36dbf1 in /home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
Exception ignored in: <function tqdm.__del__ at 0x7f922ad38a60>
Traceback (most recent call last):
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/tqdm/_tqdm.py", line 885, in __del__
self.close()
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/tqdm/_tqdm.py", line 1090, in close
self._decr_instances(self)
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/tqdm/_tqdm.py", line 454, in _decr_instances
cls.monitor.exit()
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/site-packages/tqdm/_monitor.py", line 52, in exit
self.join()
File "/home/wlk/anaconda3/envs/CornerNet_Lite/lib/python3.7/threading.py", line 1029, in join
raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread
terminate called without an active exception```
I use PyTorch 1.0.0 and modified the code in 'nnet/py_factory.py'
so that it imports 'from torch.nn import DataParallel' instead of 'from models.py_utils.data_parallel import DataParallel'.
Does anybody know the reason?
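For anyone trying the same import swap, here is a minimal sketch of how the choice between the two `DataParallel` classes could be made explicit. The helper `resolve_class` and the list `DATA_PARALLEL_CANDIDATES` are hypothetical names, not part of the repository. Note that the stock `torch.nn.DataParallel` does not take a `chunk_sizes` argument, so if the call site passes one (as the custom `scatter` in the traceback suggests it might), that call would presumably need adjusting as well.

```python
import importlib


def resolve_class(candidates):
    """Return the first importable class from (module, attribute) pairs, or None.

    Hypothetical helper for illustration; not part of the repository.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module is not installed or not on sys.path
        cls = getattr(module, attr_name, None)
        if cls is not None:
            return cls
    return None


# Assumption: prefer the stock PyTorch wrapper, then fall back to the
# repository's custom wrapper named in the traceback.
DATA_PARALLEL_CANDIDATES = [
    ("torch.nn", "DataParallel"),
    ("models.py_utils.data_parallel", "DataParallel"),
]

# None if neither module can be imported in the current environment.
DataParallel = resolve_class(DATA_PARALLEL_CANDIDATES)
```

Reordering the candidate list switches back to the custom wrapper without touching the rest of 'nnet/py_factory.py'.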
There is a blog post about this.