
memory leak when using pytorch dataloader #746

Closed
techkang opened this issue May 20, 2019 · 4 comments
Labels
need-feedback 📢 We need your response (question) submodule ⊂ Periphery/subclasses synchronisation ⇶ Multi-thread/processing

Comments

@techkang commented May 20, 2019

When I use tqdm with PyTorch, I get a memory leak.

The code below uses more and more memory until it eventually crashes.

Commenting out dummy = tqdm(total=100) in the main block, or setting num_workers=0, avoids the leak.

from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm


class MinDataset(Dataset):

    def __init__(self):
        super().__init__()

    def __getitem__(self, item):
        return '{:.4f}'.format(item) * 1000000

    def __len__(self):
        return 1000


if __name__ == '__main__':

    min_dataloader = DataLoader(MinDataset(), batch_size=32, num_workers=4)
    dummy = tqdm(total=100)  # commenting this out (or using num_workers=0) avoids the leak
    while True:
        for i, batch in tqdm(enumerate(min_dataloader), total=1000):
            if i == 2:
                break
            pass

I used PyTorch 1.0.1 and tqdm 4.31.1.

@casperdcl commented

as mentioned in the documentation, use enumerate(tqdm(x)) instead of tqdm(enumerate(x))
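A minimal sketch of that fix applied to the reproduction script above (it assumes the MinDataset class defined earlier; this is not an exact snippet from the thread):

from torch.utils.data import DataLoader
from tqdm import tqdm

if __name__ == '__main__':
    min_dataloader = DataLoader(MinDataset(), batch_size=32, num_workers=4)
    while True:
        # wrap the dataloader itself, then enumerate the resulting tqdm object
        for i, batch in enumerate(tqdm(min_dataloader)):
            if i == 2:
                break

Since the DataLoader defines __len__, tqdm can pick up the total on its own here and still render a full progress bar.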

casperdcl self-assigned this May 20, 2019
casperdcl added need-feedback 📢 We need your response (question) submodule ⊂ Periphery/subclasses synchronisation ⇶ Multi-thread/processing labels May 20, 2019
@techkang (Author) commented

It works well when I use tqdm(enumerate(x)). Thank you.

@guaguablue commented

I am puzzled by the last reply, "It works well when I use tqdm(enumerate(x))", given the earlier advice to "use enumerate(tqdm(x)) instead of tqdm(enumerate(x))". I wonder whether @techkang meant to say "It works well when I use enumerate(tqdm(x))".
I also read the documentation, which says: "Replace tqdm(enumerate(...)) with enumerate(tqdm(...)) or tqdm(enumerate(x), total=len(x), ...)". However, I ran into the opposite situation. When I use tqdm in my code like below:
for batch_i, (imgs, targets, paths, shapes) in enumerate(tqdm(dataloader, desc='Computing mAP')):
the program sometimes gets stuck at this step, and if I use tqdm(enumerate(dataloader), total=len(dataloader), ...) instead, the program does not work at all. I also tried tqdm(enumerate(dataloader)); the computation is fine, but tqdm does not work well: I cannot see the progress bar, only output like 'Computing mAP: 5it [00:05, 1.13s/it]'.
PyTorch version: 1.1.0, tqdm: 4.43.0.
Could someone help me with this? Thanks a lot!
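A hedged sketch of the second pattern from the documentation quote above, with total and desc passed explicitly so a labelled full bar can be drawn even though the enumerate object has no length (dataloader and the unpacked names are reused from the comment above; this is not code from the thread):

from tqdm import tqdm

# total= lets tqdm draw a full progress bar even though enumerate() has no __len__;
# desc= restores the 'Computing mAP' label
for batch_i, (imgs, targets, paths, shapes) in tqdm(
        enumerate(dataloader), total=len(dataloader), desc='Computing mAP'):
    pass  # evaluation step goes here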

@casperdcl commented

Also I should mention tqdm.contrib.tenumerate
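A minimal sketch of that helper applied to the original reproduction (assuming a tqdm version recent enough to include tqdm.contrib; min_dataloader is the DataLoader from the first snippet):

from tqdm.contrib import tenumerate

# tenumerate(iterable) is roughly equivalent to enumerate(tqdm(iterable)),
# so the progress bar wraps the dataloader rather than the enumerate object
for i, batch in tenumerate(min_dataloader):
    if i == 2:
        break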

markurtz pushed a commit to neuralmagic/sparseml that referenced this issue Apr 6, 2021
Memory leak seems to be related to the way tqdm wraps data_loader.
More info at: tqdm/tqdm#746