
memory leak #57

Closed
marisancans opened this issue Jan 27, 2020 · 18 comments

@marisancans

Hello, I'm facing a memory leak and I can't find out why. I am simply looping through a lot of images, and it gradually fills all my memory. This is my setup:
Version facenet-pytorch==2.0.1

from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

mtcnn = MTCNN(image_size=64, keep_all=True)
resnet = InceptionResnetV1(pretrained='vggface2').eval()

for nth, img_path in enumerate(img_paths):  # img_paths: a list of pathlib.Path image paths
    img = Image.open(img_path.resolve())
    boxes, probs = mtcnn.detect(img)
@marisancans
Author

Some additional info: I ran the code on another machine (a server) and it looks like there is no leak there. I'm using conda.

Here are my local PC's dependencies:
Python 3.6.10

astroid==2.3.3
certifi==2019.11.28
cffi==1.13.2
chardet==3.0.4
cloudpickle==1.2.2
cycler==0.10.0
cytoolz==0.10.1
dask==2.9.1
decorator==4.4.1
face-alignment==1.0.1
facenet-pytorch==2.0.1
idna==2.8
imageio==2.6.1
isort==4.3.21
kiwisolver==1.1.0
lazy-object-proxy==1.4.3
matplotlib==3.1.2
mccabe==0.6.1
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
networkx==2.4
numpy==1.17.4
olefile==0.46
opencv-python==4.1.2.30
pandas==0.25.3
Pillow==7.0.0
pycparser==2.19
pylint==2.4.4
pyparsing==2.4.6
python-dateutil==2.8.1
pytz==2019.3
PyWavelets==1.1.1
requests==2.22.0
scikit-image==0.16.2
scipy==1.4.1
six==1.13.0
toolz==0.10.0
torch==1.3.1
torchvision==0.4.2
tornado==6.0.3
tqdm==4.41.1
typed-ast==1.4.1
urllib3==1.25.7
wrapt==1.11.2

@timesler
Owner

Hi @marisancans, I think it is very unlikely that the memory leak is caused by facenet-pytorch since it did not occur on a different system with the same version.

Can you provide a complete working example of code that caused the leak? I would suggest checking if it is related to the version of PIL or torch that you have installed. To check if it is related to PIL, you could add a del img inside your loop.
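
For reference, a minimal sketch of that check, reusing the loop from the original report with the del added:

from PIL import Image
from facenet_pytorch import MTCNN

mtcnn = MTCNN(image_size=64, keep_all=True)

for nth, img_path in enumerate(img_paths):  # img_paths as in the original report
    img = Image.open(img_path.resolve())
    boxes, probs = mtcnn.detect(img)
    del img  # explicitly drop the PIL image to rule out PIL holding on to memory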

@ShadowElement

Same error with facenet-pytorch 2.2.1.

@timesler
Owner

@marisancans @ShadowElement I've been able to reproduce this issue now - it doesn't seem to happen on every system and I am not 100% sure what is happening. My guess is that it has to do with slicing a numpy array without creating a copy.

I'm in the process of tracking down the issue - if I find it and can fix it, I'll let you know.
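
To illustrate the guess (a generic numpy sketch, not the facenet-pytorch code itself): a basic slice is a view whose .base attribute keeps the entire parent array alive until the view itself is dropped or copied.

import numpy as np

big = np.zeros((10_000, 10_000), dtype=np.float32)  # roughly 400 MB

view = big[:10, :10]           # a view: `big` stays alive as long as `view` does
print(view.base is big)        # True

small = big[:10, :10].copy()   # an independent copy: no reference back to `big`
print(small.base is None)      # True

del big  # if only `view` were kept around, the full 400 MB buffer would still be retained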

@ShadowElement

@timesler Thanks

@haydenroche5

haydenroche5 commented Feb 25, 2020

Same issue for me using version 2.2.7. I tried to track down the leak myself with pympler, but didn't see anything leaking. Weird. For what it's worth, the leak happens for me when I'm using a CPU. I haven't tried it out with a GPU.
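
A minimal sketch of that kind of pympler check (pympler only sees Python-level objects, so a leak held in native buffers would not show up here, which may be why nothing appeared):

from PIL import Image
from facenet_pytorch import MTCNN
from pympler import tracker

mtcnn = MTCNN(image_size=64, keep_all=True)
tr = tracker.SummaryTracker()

for nth, img_path in enumerate(img_paths):  # img_paths: a list of image paths, as in the original report
    img = Image.open(img_path)
    boxes, probs = mtcnn.detect(img)
    if nth % 100 == 0:
        tr.print_diff()  # prints a summary of Python objects allocated since the last call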

@timesler
Owner

@haydenroche5 in what environment did you see the leak happening?

@haydenroche5

@timesler I'm using a Google Cloud instance.

HW environment:
8 x Intel(R) Xeon(R) CPU @ 2.30GHz
30 GB RAM

SW environment (non-exhaustive):
Ubuntu 19.10
pillow 6.2.1
python 3.7.5
numpy 1.17.3
facenet-pytorch 2.2.7
torch 1.4.0
torchvision 0.5.0

And to be clear, I've got the same kind of loop as @marisancans in my code. Thanks for looking into this. Happy to provide any other info that might be helpful.

@marisancans
Author

If this helps, this is what I'm using:
Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz

Ubuntu 19.04

envs:
facenet-pytorch==2.0.1
torchvision==0.3.0
torch==1.4.0
Pillow==7.0.0
opencv-python==4.1.2.30
numpy==1.18.1

I'm also running all tests on CPU only, because I'm using a laptop. The leak didn't happen on the server with CUDA. Thanks.

@mhamedLmarbouh

Hi, did anyone manage to find the source of this problem? I am currently suffering from it too. I am running on CPU, and the weird thing is that the memory leak doesn't occur all the time.

@armanhak

armanhak commented May 5, 2020

I have the same problem. I run it in Colab and want to get embeddings of the LFW photos on the GPU, but I get an error even when I take only part of the data (4,000 images): RuntimeError: CUDA out of memory. Tried to allocate 1.30 GiB.

@TerryTran

TerryTran commented May 12, 2020

@timesler: I have the same issue as above. After digging into the code, it seems the memory leak comes from PNet: https://github.com/timesler/facenet-pytorch/blob/master/models/utils/detect_face.py#L50. When I disabled it, the memory leak disappeared. It would be good if others could take a look at this.

@armanhak

armanhak commented May 12, 2020

@TerryTran: There seems to be a problem in this module: https://github.com/timesler/facenet-pytorch/blob/master/models/inception_resnet_v1.py. As I understand it, detect_face.py is only needed to find the face in the photo. I do this: model = InceptionResnetV1(pretrained='vggface2').eval().to(device). Then, when I pass the data through to get embeddings (model(data)), I get an out-of-memory error. I checked GPU consumption with GPUtil.showUtilization(); there is a sharp increase in GPU usage.

@TerryTran

(quoting @armanhak's comment above about the out-of-memory error from InceptionResnetV1 and model(data))

I didn't use InceptionResnetV1 for my testing, only the detect function of MTCNN. Here is my code:

from facenet_pytorch.models.mtcnn import MTCNN
from PIL import Image

mtcnn = MTCNN()  # detector instance (instantiation was omitted from the original snippet)
img = Image.open(img_path)  # img_path: path to a single test image
for i in range(1000):
    boxes, probs = mtcnn.detect(img)

@armanhak

armanhak commented May 29, 2020

My memory problem was fixed when I passed the data in batches in a loop and called the detach() method on each result. It looks something like this:

import torch
from facenet_pytorch import InceptionResnetV1

model = InceptionResnetV1(pretrained='vggface2').eval()
embeddings = []
for batch in data:  # data: an iterable of image-tensor batches
    embedding = model(batch).detach()  # detach so the autograd graph is not retained
    embeddings.append(embedding)
embeddings = torch.stack(embeddings)  # assumes equal-sized batches; use torch.cat to merge along the batch dimension
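
An equivalent option (a suggestion on my part, not something verified against this exact setup) is to run inference under torch.no_grad(), which avoids building the autograd graph in the first place, so there is nothing to detach:

import torch
from facenet_pytorch import InceptionResnetV1

model = InceptionResnetV1(pretrained='vggface2').eval()
embeddings = []
with torch.no_grad():  # no graph is recorded, so outputs hold no references to intermediate activations
    for batch in data:  # data: an iterable of image-tensor batches, as above
        embeddings.append(model(batch))
embeddings = torch.cat(embeddings)  # concatenate along the batch dimension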

@pcshih

pcshih commented Jul 24, 2020

Add torch.cuda.empty_cache() after line 351 in mtcnn.py.
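
If you would rather not edit the installed mtcnn.py, here is a sketch of a similar workaround from the calling side (note that empty_cache() only returns PyTorch's unused cached CUDA blocks to the driver; it does not remove live references):

import torch
from PIL import Image

for nth, img_path in enumerate(img_paths):  # same detection loop as above
    img = Image.open(img_path)
    boxes, probs = mtcnn.detect(img)
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release unused cached GPU memory after each detection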

@jdongca2003
Contributor

jdongca2003 commented Aug 2, 2020

I created a pull request to fix the GPU out-of-memory issue:

#105

The root cause is that the batch size for rnet and onet is dynamic. Sometimes the batch size for the rnet input data is very large (> 20,000). The solution is to use a fixed, bounded batch size.
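
Roughly, the idea looks like this (an illustrative sketch only, not the actual patch in #105, and it assumes a network that returns a single tensor; the real rnet/onet outputs may be tuples, so the patch itself is a bit more involved):

import torch

def run_in_fixed_batches(net, inp, batch_size=512):
    # Run `net` over `inp` in fixed-size chunks so peak memory stays bounded,
    # instead of letting the effective batch size grow with the number of candidate boxes.
    outs = []
    with torch.no_grad():
        for start in range(0, inp.shape[0], batch_size):
            outs.append(net(inp[start:start + batch_size]))
    return torch.cat(outs)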

@adityapatadia

Did @marisancans's original issue when running on CPU get resolved? We are facing this too.
