Closed
Labels
module: bottleneck - Related to torch.utils.bottleneck
module: dataloader - Related to torch.utils.data.DataLoader and Sampler
quansight-nack - High-prio issues that have been reviewed by Quansight and are judged to be not actionable.
triaged - This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Description
torch.utils.bottleneck doesn't work properly when the code contains a data loader that uses more than 0 worker processes (num_workers > 0).
Minimal reproducible example (mwe2.py):
import argparse
import torch
import torch.utils.data

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='mwe')
    parser.add_argument('--num-workers', default=0, type=int)
    args = parser.parse_args()

    data = torch.rand(10, 1000)
    target = torch.rand(10)
    dataset = torch.utils.data.TensorDataset(data, target)
    data_loader = torch.utils.data.DataLoader(dataset,
        batch_size=2, num_workers=args.num_workers)
    for i, batch in enumerate(data_loader):
        pass
Running the script via:
python -m torch.utils.bottleneck -- mwe2.py --num-workers 0
works fine, while
python -m torch.utils.bottleneck -- mwe2.py --num-workers 1
crashes with the following stack trace:
Traceback (most recent call last):
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/bottleneck/__main__.py", line 280, in <module>
    main()
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/bottleneck/__main__.py", line 261, in main
    autograd_prof_cpu, autograd_prof_cuda = run_autograd_prof(code, globs)
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/bottleneck/__main__.py", line 155, in run_autograd_prof
    result.append(run_prof(use_cuda=True))
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/bottleneck/__main__.py", line 149, in run_prof
    exec(code, globs, None)
  File "mwe2.py", line 15, in <module>
    for i, batch in enumerate(data_loader):
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 285, in __next__
    return self._process_next_batch(batch)
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 306, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 57, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/data/dataset.py", line 40, in __getitem__
    return tuple(tensor[index] for tensor in self.tensors)
  File "/private/home/fmassa/.conda/envs/detectron_v2/lib/python3.6/site-packages/torch/utils/data/dataset.py", line 40, in <genexpr>
    return tuple(tensor[index] for tensor in self.tensors)
RuntimeError: /private/home/fmassa/github/pytorch/torch/csrc/autograd/profiler.h:52: initialization error
Assigning this to @zou3519, even though I'm not sure whether the problem is in the profiler or in the bottleneck tool.
PyTorch version: '0.4.0a0+b21e135'
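Since the run with --num-workers 0 is reported above to work, one possible stopgap is to force the worker count to 0 only while profiling. The sketch below is not a fix for the underlying error and has not been confirmed against this PyTorch build. The environment variable MWE_PROFILING and the helpers effective_num_workers and make_loader are hypothetical names introduced here for illustration.

```python
import os


def effective_num_workers(requested, env=os.environ):
    """Return 0 when MWE_PROFILING=1 is set, otherwise the requested count.

    MWE_PROFILING is a hypothetical variable, not something that
    torch.utils.bottleneck sets or reads.
    """
    if env.get('MWE_PROFILING') == '1':
        # Assumption: loading data in the main process sidesteps the
        # worker-side failure, as the --num-workers 0 run above suggests.
        return 0
    return requested


def make_loader(dataset, batch_size, requested_workers):
    """Build a DataLoader whose worker count respects MWE_PROFILING."""
    # torch is imported lazily so effective_num_workers has no torch dependency.
    import torch.utils.data
    return torch.utils.data.DataLoader(
        dataset,
        batch_size=batch_size,
        num_workers=effective_num_workers(requested_workers))
```

In the example script, the DataLoader construction would become make_loader(dataset, 2, args.num_workers), and the profiling run would be started with MWE_PROFILING=1 set in the environment. Timings collected this way describe single-process data loading, so they may not reflect the behaviour of a multi-worker run.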