[dataloader] SIGCHLD handler should poll the queue for exception first #22136

Open
ssnl opened this issue Jun 24, 2019 · 1 comment
Labels: module: dataloader, triaged

Comments

@ssnl
Collaborator

ssnl commented Jun 24, 2019

Otherwise, it may print confusing exception output like the following:

#Traceback (most recent call last):
#  File "<string>", line 1, in <module>
#  File "/miniconda/lib/python3.7/multiprocessing/spawn.py", line 105, in spawn_main
#    exitcode = _main(fd)
#  File "/miniconda/lib/python3.7/multiprocessing/spawn.py", line 115, in _main
#    self = reduction.pickle.load(from_parent)
#AttributeError: Can't get attribute 'Dataset' on <module '__mp_main__' from '/deepspeech.pytorch/convasr/bug_.py'>
#Traceback (most recent call last):
#  File "/miniconda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 512, in _try_get_batch
#    data = self.data_queue.get(timeout=timeout)
#  File "/miniconda/lib/python3.7/multiprocessing/queues.py", line 104, in get
#    if not self._poll(timeout):
#  File "/miniconda/lib/python3.7/multiprocessing/connection.py", line 257, in poll
#    return self._poll(timeout)
#  File "/miniconda/lib/python3.7/multiprocessing/connection.py", line 414, in _poll
#    r = wait([self], timeout)
#  File "/miniconda/lib/python3.7/multiprocessing/connection.py", line 920, in wait
#    ready = selector.select(timeout)
#  File "/miniconda/lib/python3.7/selectors.py", line 415, in select
#    fd_event_list = self._selector.poll(timeout)
#  File "/miniconda/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 63, in handler
#    _error_if_any_worker_fails()
#RuntimeError: DataLoader worker (pid 6082) exited unexpectedly with exit code 1. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.

#During handling of the above exception, another exception occurred:

#Traceback (most recent call last):
#  File "bug_.py", line 15, in <module>
#    next(iter(loader))
#  File "/miniconda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 577, in __next__
#    idx, batch = self._get_batch()
#  File "/miniconda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 554, in _get_batch
#    success, data = self._try_get_batch()
#  File "/miniconda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 520, in _try_get_batch
#    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
#RuntimeError: DataLoader worker (pid(s) 6082) exited unexpectedly
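
To make the proposal in the title concrete, here is a minimal sketch of the "poll the queue first" idea. It is not the actual DataLoader code: `describe_worker_failure` is a hypothetical helper, and a plain `queue.Queue` stands in for the worker result queue. It assumes a worker that fails puts its exception object on the queue before exiting. A worker that dies before it can do so (as in the trace above, where unpickling fails inside `spawn_main`) would still hit the generic message.

```python
import queue


def describe_worker_failure(data_queue, dead_pids):
    """Return the most informative error for a set of dead workers.

    Hypothetical helper: drains `data_queue` and, if a worker managed to
    send an exception before exiting, returns that exception instead of
    the generic "exited unexpectedly" message.
    """
    while True:
        try:
            item = data_queue.get_nowait()
        except queue.Empty:
            break
        if isinstance(item, BaseException):
            # A worker reported its own error; prefer it.
            return item
        # Non-exception items are dropped here, on the assumption that
        # iteration is about to abort anyway.
    pids = ", ".join(str(pid) for pid in dead_pids)
    return RuntimeError(
        "DataLoader worker (pid(s) {}) exited unexpectedly".format(pids)
    )


if __name__ == "__main__":
    q = queue.Queue()
    q.put("batch 0")
    q.put(ValueError("bad sample"))
    print(repr(describe_worker_failure(q, [6082])))
```

A handler built this way would raise whatever the helper returns, so the generic message appears only when the queue holds no worker-reported error.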

cc @vadimkantorov

@vadimkantorov
Contributor

damned complicated data loading :(

@VitalyFedyunin added the module: dataloader and triaged labels Jun 24, 2019