RuntimeError: Pin memory thread exited unexpectedly #392

Closed
ioir123ju opened this issue Aug 5, 2021 · 5 comments
Labels
help wanted Extra attention is needed

Comments


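We recommend using the English template "General question" so that your question can help more people.

First, confirm the following

Describe the problem you encountered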

I am training on my own dataset in VOC format.
When pin_memory is set to False there is no error; otherwise the error below is reported.
The training command was:
python tools/train.py configs/vgg/vgg16_b16x8_voc.py

Related information

  1. Output of the command pip list | grep "mmcv\|mmcls\|^torch"
    mmcls 0.13.0 /home/juzheng/code/mmclassification
    mmcv-full 1.3.9
    torch 1.7.1
    torchaudio 0.7.2
    torchvision 0.8.2

  2. If you modified the config file or used a new one, describe it here

model = dict(
    type='ImageClassifier',
    backbone=dict(type='VGG', depth=16, num_classes=8),
    neck=None,
    head=dict(
        type='MultiLabelClsHead',
        loss=dict(type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
  3. If the problem occurred during training, provide the complete training log and error message
    [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 855/855, 229.8 task/s, elapsed: 4s, ETA: 0s
    2021-08-05 15:29:05,892 - mmcls - INFO - Epoch(val) [1][54] mAP: 89.2137, CP: 74.1056, CR: 71.8438, CF1: 72.9571, OP: 85.5930, OR: 85.4678, OF1: 85.5304
    Exception in thread Thread-1:
    Traceback (most recent call last):
    File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
    File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
    File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 28, in _pin_memory_loop
    idx, data = r
    ValueError: not enough values to unpack (expected 2, got 0)

    Traceback (most recent call last):
      File "tools/train.py", line 156, in <module>
        main()
      File "tools/train.py", line 152, in main
        meta=meta)
      File "/home/juzheng/code/mmclassification/mmcls/apis/train.py", line 159, in train_model
        runner.run(data_loaders, cfg.workflow)
      File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
        for i, data_batch in enumerate(self.data_loader):
      File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 349, in __iter__
        self._iterator._reset(self)
      File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 852, in _reset
        data = self._get_data()
      File "/home/juzheng/anaconda3/envs/python3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1029, in _get_data
        raise RuntimeError('Pin memory thread exited unexpectedly')
    RuntimeError: Pin memory thread exited unexpectedly

  4. If you made other relevant changes to the code under the mmcls folder, describe them here
    CLASSES = ('fall', 'standing', 'smoking', 'no_smoking', 'helmet', 'no_helmet', 'mask', 'no_mask')

ioir123ju added the "help wanted" (Extra attention is needed) label Aug 5, 2021
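
The two tracebacks in the log fit a common pattern: a background thread dies on an unpacking error, and the main thread later notices that the thread is gone. The sketch below is a simplified stand-in for that pattern, not PyTorch's code. The names `pin_loop` and `run`, the `None` shutdown sentinel, and the empty tuple used as the bad item are all assumptions made for illustration; the log does not show which object the real loop received.

```python
import queue
import threading


def pin_loop(in_q, errors):
    """Stand-in consumer that expects (idx, data) pairs from a queue."""
    try:
        while True:
            r = in_q.get()
            if r is None:  # hypothetical shutdown sentinel
                return
            idx, data = r  # raises ValueError when r has no elements
    except ValueError as exc:
        # Record the failure so the main thread can report it.
        errors.append(exc)


def run(items):
    """Feed items to the consumer and fail loudly if it died early."""
    in_q = queue.Queue()
    errors = []
    worker = threading.Thread(target=pin_loop, args=(in_q, errors), daemon=True)
    worker.start()
    for item in items:
        in_q.put(item)
    in_q.put(None)
    worker.join(timeout=5)
    if errors:
        raise RuntimeError("consumer thread exited unexpectedly") from errors[0]
    return "ok"
```

Calling `run([(0, "a"), (1, "b")])` completes normally, while `run([(0, "a"), ()])` raises a `RuntimeError` whose cause is the same `ValueError` message seen in the first traceback.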
Collaborator

Ezra-Yu commented Aug 5, 2021

Have you tried another version of PyTorch, such as 1.6 or 1.8?

Member

mzr1996 commented Aug 5, 2021

Hello, I think this is an upstream issue. It was fixed in PyTorch 1.8.0 (see pytorch/pytorch@54ce171).
Please upgrade to PyTorch >= 1.8.0, or set pin_memory=False, or set persistent_workers=False in https://github.com/open-mmlab/mmclassification/blob/64bbed41f40ce41928a66e7a7b9817284c8a677f/mmcls/datasets/builder.py#L52-L53
We will guard against this situation in the future. Thanks!
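
The advice above can be sketched as a small helper that picks `DataLoader` keyword arguments from the installed PyTorch version. This is only an illustration under stated assumptions: `parse_major_minor`, `loader_kwargs`, and the `conservative` flag are hypothetical names, not part of mmcls or PyTorch, and the 1.8 threshold is taken from the comment above. Later comments in this thread report the error on newer releases too, which is why the flag exists to turn both options off regardless of version.

```python
import re


def parse_major_minor(version):
    """Return (major, minor) from a string such as '1.7.1' or '1.8.0+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def loader_kwargs(torch_version, num_workers, conservative=False):
    """Choose DataLoader keyword arguments following the advice in this thread."""
    kwargs = {"num_workers": num_workers, "pin_memory": True}
    if conservative:
        # Workaround reported in this thread: disable pinned memory and leave
        # persistent_workers at its default (False) by not passing it at all.
        kwargs["pin_memory"] = False
        return kwargs
    # Assumption from the comment above: the bug is fixed from 1.8.0 onwards.
    # persistent_workers=True also needs at least one worker process.
    if parse_major_minor(torch_version) >= (1, 8) and num_workers > 0:
        kwargs["persistent_workers"] = True
    return kwargs
```

With PyTorch installed, the result could be passed along as `DataLoader(dataset, batch_size=16, **loader_kwargs(torch.__version__, num_workers=2))`; the batch size here is an arbitrary placeholder.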

@domattioli

As of 16 Aug 2022, this error happens with PyTorch 1.12.0 and 1.12.1. Downgrade to 1.11.

@Mohit-robo

> As of 16 Aug 2022, this error happens with PyTorch 1.12.0 and 1.12.1. Downgrade to 1.11.

For me, setting persistent_workers=False and pin_memory=False worked with PyTorch 1.12.0.


trinhvg commented Jan 4, 2023

As of 03 Jan 2023, this error happens with PyTorch 1.11.0 (Only 1/3 of my custom dataset has this problem). Downgrade to 1.10.
