Training on my own dataset gets interrupted #54
Comments
Solved!
Hello, I'm hitting the same problem. Can you tell me how you solved it? Thanks!
I have solved it, thanks!
@yzhu20
@zhuyu-cs Hello, could you please share your solution? Thanks a lot.
Training stops after printing the output below. The bug shows as follows:
File "train.py", line 43, in prefetch_data
data, ind = sample_data(db, ind, data_aug=data_aug)
File "/home/zhuyu/CenterNet/sample/coco.py", line 199, in sample_data
return globals()[system_configs.sampling_function](db, k_ind, data_aug, debug)
File "/home/zhuyu/CenterNet/sample/coco.py", line 169, in kp_detection
tl_regrs[b_ind, tag_ind, :] = [fxtl - xtl, fytl - ytl]
IndexError: index 128 is out of bounds for axis 1 with size 128
Process Process-1:
Traceback (most recent call last):
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "train.py", line 47, in prefetch_data
raise e
File "train.py", line 43, in prefetch_data
data, ind = sample_data(db, ind, data_aug=data_aug)
File "/home/zhuyu/CenterNet/sample/coco.py", line 199, in sample_data
return globals()[system_configs.sampling_function](db, k_ind, data_aug, debug)
File "/home/zhuyu/CenterNet/sample/coco.py", line 169, in kp_detection
tl_regrs[b_ind, tag_ind, :] = [fxtl - xtl, fytl - ytl]
IndexError: index 128 is out of bounds for axis 1 with size 128
training loss at iteration 100: 10.80356502532959
focal loss at iteration 100: 10.398816108703613
pull loss at iteration 100: 0.026798348873853683
push loss at iteration 100: 0.11498038470745087
regr loss at iteration 100: 0.26297056674957275
0%| | 101/480000 [02:48<222:26:08, 1.67s/it]
Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "train.py", line 51, in pin_memory
data = data_queue.get()
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/multiprocessing/queues.py", line 113, in get
return _ForkingPickler.loads(res)
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
fd = df.detach()
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/multiprocessing/connection.py", line 487, in Client
c = SocketClient(address)
File "/home/zhuyu/anaconda3/envs/CenterNet/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
Can anyone help me? Thank you very much!
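A likely reading of the traceback, stated as an assumption rather than a confirmed diagnosis: the sampling code preallocates per-image target arrays of a fixed size (the error says axis 1 has size 128, which matches CenterNet's default max_tag_len of 128), and an image in the custom dataset contains more than 128 annotated boxes, so tag_ind reaches 128 and overflows. The worker process then dies, and the pin_memory thread's later FileNotFoundError is just the secondary symptom of reading from a queue whose producer is gone. A minimal sketch of the overflow and a guard against it, with hypothetical names (fill_regrs, MAX_TAG_LEN, detections) that only mirror the traceback, not the repo's actual code:

```python
import numpy as np

MAX_TAG_LEN = 128  # assumed per-image detection budget, matching "size 128" in the error


def fill_regrs(detections):
    """Fill top-left corner offset regression targets, skipping overflow.

    `detections` is a list of (fxtl, fytl, xtl, ytl) tuples: the fractional
    and integer top-left corner coordinates, as in the traceback's
    `tl_regrs[b_ind, tag_ind, :] = [fxtl - xtl, fytl - ytl]` line.
    """
    tl_regrs = np.zeros((1, MAX_TAG_LEN, 2), dtype=np.float32)
    for tag_ind, (fxtl, fytl, xtl, ytl) in enumerate(detections):
        # Without this check, detection 129 raises:
        # IndexError: index 128 is out of bounds for axis 1 with size 128
        if tag_ind >= MAX_TAG_LEN:
            break
        tl_regrs[0, tag_ind, :] = [fxtl - xtl, fytl - ytl]
    return tl_regrs
```

If this is indeed the cause, either raising max_tag_len in the training config or capping the number of sampled detections per image should stop the crash; which fix is appropriate depends on how many objects your images actually contain.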