Topaz training error #172
Replies: 2 comments 12 replies
-
We'll look into this. It looks like this may actually be a multiprocessing error in our data loader, but could you also provide the command you used to run
-
This error comes after we create the dataloader, as we get items from it, so I don't think it should involve reading any files. If anything, I think there would be an error in writing the output or saving the model. I just ran the tutorial code in a notebook and it worked fine. Could you perhaps create a new conda environment, install Topaz there, and try again?
-
When I run Topaz training, it reports that a file or directory does not exist, but it does not say which one. Like this:
Traceback (most recent call last):
  File "/home/amax/miniconda3/envs/topaz/bin/topaz", line 33, in <module>
    sys.exit(load_entry_point('topaz-em==0.2.5', 'console_scripts', 'topaz')())
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/commands/train.py", line 695, in main
    , save_prefix=save_prefix, use_cuda=use_cuda, output=output)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/commands/train.py", line 577, in fit_epochs
    , use_cuda=use_cuda, output=output)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/commands/train.py", line 552, in fit_epoch
    for X,Y in data_iterator:
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
    idx, data = self._get_data()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1152, in _get_data
    success, data = self._try_get_data()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 990, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 289, in rebuild_storage_fd
    fd = df.detach()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
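For anyone reading the traceback: the last frames are in Python's `multiprocessing.connection` module, where `Client` tries to connect to a socket address, so the "file" in the message is probably a socket path and not a micrograph or particle file. The sketch below is only an illustration, not Topaz code. It assumes a Linux or macOS system with Unix domain sockets, and the socket name `listener-gone` is made up. It shows that connecting to a socket path that does not exist raises the same `FileNotFoundError` with no file name attached, which would explain why the message does not say which file is missing. Whether this is what happened in the run above is an assumption; it would be consistent with the process that owned the socket having already exited.

```python
import os
import tempfile
from multiprocessing.connection import Client


def connect_to_missing_socket():
    """Try to connect to a Unix socket path where nothing is listening.

    Returns the exception that was raised, or None if no error occurred.
    """
    # A fresh temporary directory guarantees the (hypothetical) socket
    # path below does not exist.
    with tempfile.TemporaryDirectory() as tmp:
        address = os.path.join(tmp, "listener-gone")
        try:
            # Same call that appears near the bottom of the traceback.
            Client(address, family="AF_UNIX")
        except FileNotFoundError as exc:
            return exc
    return None


if __name__ == "__main__":
    err = connect_to_missing_socket()
    # The exception carries an errno but no file name.
    print(type(err).__name__, err.errno, err.filename)
```

If this reading is right, checking data paths will not help; the thing to look at is how the data loader's worker processes are started and whether any of them died early.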