Memory usage grows over time and ends with memory error #116
Comments
Same for me on Ubuntu w/ 16GB and a 1070. It will last ~700 iterations or
8571MB is total for all processes.
How many src and dst images are used?
Between 1000 and 1500 for both. The python subprocesses are what grow for me, not the main python thread.
not you
TF session dies - another issue.
src: 319 files, 30MB. With CPU I think it starts at about ~7GB, and it works, though it takes so long that I cannot test whether there is a memory leak. I am talking about RAM usage, as the GPU usage seems to be stable. It starts at ~1960MB (preview window) + 4 x 80MB (these 4 grow to 1.7 GB each).
And why are you not using the prebuilt Windows binary?
Sometimes I need to customize existing code, like the saving interval, so it is nice to use the repository instead, but actually the need was caused by the out-of-memory problem. I have no memory leak using the prebuilt binary. Also, when I just copied the code from the prebuilt internal to deepFaceLab, the memory leak was still visible, so it seems to be related to some external module.
Hi, I have used yolk to compare installed site packages in the prebuilt package and the downloaded project.
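For anyone who wants to repeat that comparison without yolk, here is a minimal sketch that diffs two `pip freeze`-style listings. The function names and the sample package lists are hypothetical and only illustrate the idea; they are not the actual package sets from either environment.

```python
def parse_freeze(text):
    """Parse `pip freeze`-style lines ("name==version") into a dict."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries without a pinned version.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def diff_environments(a, b):
    """Return {name: (version_in_a, version_in_b)} for packages that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Hypothetical inputs for illustration only, not the real package lists.
prebuilt = parse_freeze("numpy==1.15.4\nexample-pkg==1.0\n")
local = parse_freeze("numpy==1.16.0\nexample-pkg==1.0\n")
print(diff_environments(prebuilt, local))
```

In practice the two input strings would come from running `pip freeze` with each environment's Python interpreter.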
@berniejerom wow interesting.
I have upgraded numpy to 1.16 and got a memory leak too.
Looks like the memory leak is caused by np array interprocess pickling.
They will fix it in 1.16.1.
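Until a fixed release is installed, a quick check of the installed numpy version can tell whether an environment is likely affected. This is a minimal sketch that assumes, based only on this thread, that 1.16.0 is the leaking release and 1.16.1 fixes it; the function names are hypothetical.

```python
def parse_version(version):
    """Turn "1.16.0" into (1, 16, 0); non-numeric suffixes are dropped."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as "1.16" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def numpy_version_is_suspect(version):
    # Assumption from this thread: 1.16.0 leaks, 1.16.1 is expected to fix it.
    return parse_version(version) == (1, 16, 0)


print(numpy_version_is_suspect("1.16.0"))
```

With numpy installed, the check would be called as `numpy_version_is_suspect(numpy.__version__)`.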
Expected behavior
Memory usage probably shouldn't rise by a few GB over time.
Actual behavior
I am able to run a few hundred epochs, but then I get the memory error.
The training window process memory usage is stable, but the other python processes can rise up to 2GB each.
I get the memory error when the app uses ~8.5GB, even though some RAM is still available.
GPU mem usage looks to be stable - about 10GB
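To confirm which processes are growing, per-process memory can be sampled over time and the first and last readings compared. This is a minimal sketch: collecting the samples is left out (they could be read from Task Manager or a tool such as psutil), and the function name, the threshold, and the sample values are hypothetical.

```python
def growing_processes(samples, min_growth_mb=500):
    """Return pids whose memory grew by at least min_growth_mb.

    samples maps a pid to a list of memory readings in MB, oldest first.
    """
    return sorted(
        pid
        for pid, readings in samples.items()
        if len(readings) >= 2 and readings[-1] - readings[0] >= min_growth_mb
    )


# Hypothetical readings: one stable process and two growing workers.
samples = {
    100: [1960, 1965, 1962],
    101: [80, 600, 1700],
    102: [80, 550, 1650],
}
print(growing_processes(samples))
```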
Steps to reproduce
Train any model. I think it also occurs for liaef and DF; sometimes it just takes more time.
Other relevant information
Python 3.6.5 64bit
Windows 10
1080TI
16 GB + Swap 10GB
CUDA 9.0
cuDNN v7.4.1.5 (also tried the newest one)
Note: I've changed save period to 2min.