
OOM in train.py #71

Closed
jiangzhiwei2018 opened this issue Apr 2, 2023 · 0 comments
jiangzhiwei2018 commented Apr 2, 2023

My env:
OS: Windows 10
python version: 3.8
pytorch version: 1.13.1
numpy version: 1.23.5
GPU: RTX3090TI
RAM: 32GB

I suspect a memory leak when running train.py for training on a single GPU.
System memory usage reaches about 95% after a period of time and keeps rising.
(screenshot of system memory usage omitted)

Afterwards, I ran a memory analysis with memory_profiler and found that memory appears to be over-allocated during the data-loading phase.
(screenshot of memory_profiler output omitted)
Maybe this can provide some hints toward a solution.
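
For reference, here is a minimal sketch of the kind of profiling script I used to isolate the data-loading step. TensorDataset is only a stand-in; in my actual run I iterated over the training DataLoader built from Vimeo7Dataset, and the function name and batch count below are my own choices, not from the repo.

```python
# Minimal sketch of how I isolated the data-loading step with memory_profiler.
# TensorDataset is only a stand-in here; the real run iterated over the
# training DataLoader built from Vimeo7Dataset.
import torch
from torch.utils.data import DataLoader, TensorDataset
from memory_profiler import profile


@profile
def iterate_loader(loader, n_batches=100):
    # memory_profiler prints a line-by-line report; steady growth in reported
    # memory across this loop points at the data-loading path.
    for i, batch in enumerate(loader):
        if i >= n_batches:
            break


if __name__ == "__main__":
    dataset = TensorDataset(torch.randn(1000, 3, 64, 64))  # stand-in data
    loader = DataLoader(dataset, batch_size=16, num_workers=0)
    iterate_loader(loader)
```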

I used the same training data (Vimeo90K) and didn't make any major changes to the Vimeo7Dataset class, apart from a small edit to make it fit my own cache_keys.pkl (modified code snippet omitted; an illustrative sketch is below).
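
Purely to illustrate what I mean (this is not the actual change, and the pickle structure here is an assumption), the edit amounts to reading the clip keys from my own cache_keys.pkl, roughly like this:

```python
# Illustrative sketch only -- not the actual lines I changed, and the pickle
# structure is an assumption. It shows the kind of edit: making the dataset
# read clip keys from my own cache_keys.pkl.
import pickle

cache_keys_path = "cache_keys.pkl"  # path to my own cache file
with open(cache_keys_path, "rb") as f:
    cache_keys = pickle.load(f)     # assumed: a list of Vimeo90K clip keys
print(f"loaded {len(cache_keys)} keys")
```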
