GPU memory is not released after prediction #6977
Replies: 3 comments
Hi, there are several reasons that could cause this problem. You can check them with the following steps:
@Evezerest Thanks so much for your reply and suggestions. To provide more detail: we are using an NVIDIA T4 Tensor Core GPU with 16 GB of memory, and we allow a maximum of 8 PaddleOCR workers during peak time.
Based on our observations, we suspect there is a memory leak inside Paddle when using ResNet18. Circling back to your suggestions:
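To check whether this really is a leak rather than normal allocator caching, a simple approach is to sample GPU memory after each prediction and look for monotonic growth. Below is a minimal sketch: the `gpu_memory_used_mb` helper (a name I made up, not part of PaddleOCR) shells out to `nvidia-smi`, and `looks_like_leak` is a rough heuristic, not an official diagnostic.

```python
import subprocess

def gpu_memory_used_mb(gpu_index=0):
    """Query current GPU memory usage (MiB) via nvidia-smi.

    Assumes nvidia-smi is on PATH; adjust gpu_index for your GPU.
    """
    out = subprocess.check_output([
        "nvidia-smi",
        f"--id={gpu_index}",
        "--query-gpu=memory.used",
        "--format=csv,noheader,nounits",
    ])
    return int(out.decode().strip())

def looks_like_leak(samples_mb, min_growth_mb=50):
    """Heuristic leak check on per-prediction memory samples.

    Flags a leak only when memory never drops between samples AND the
    total growth from first to last sample exceeds a threshold (this
    filters out one-time warm-up allocations).
    """
    if len(samples_mb) < 2:
        return False
    never_released = all(b >= a for a, b in zip(samples_mb, samples_mb[1:]))
    total_growth = samples_mb[-1] - samples_mb[0]
    return never_released and total_growth >= min_growth_mb

# Typical use (sketch): call gpu_memory_used_mb() after each
# ocr.ocr(img) call, collect the values, then run looks_like_leak().
```

Note the threshold matters: the first few predictions legitimately grow memory (cuDNN workspaces, cached blocks), so only sustained growth across many predictions points to a leak.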
Hello, I just realized that GPU memory does not seem to be released after each prediction. I tried `paddle.device.cuda.empty_cache()`, but it does not work: GPU memory keeps increasing as more predictions happen until it reaches 100%. Are there any suggestions? Thanks in advance!
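One thing worth knowing here: `empty_cache()` can only return cached blocks that are no longer referenced, so any tensor still held by a Python variable (e.g. prediction results kept in a list) stays on the GPU no matter how often you call it. Below is a hedged sketch of the usual pattern: drop references first, run garbage collection, then empty the cache. `release_gpu_cache` is a name I introduced for illustration; it assumes a paddle 2.x CUDA build and quietly no-ops otherwise.

```python
import gc

def release_gpu_cache():
    """Try to return cached GPU blocks to the driver (PaddlePaddle).

    empty_cache() only frees blocks with no live references, so we
    collect garbage first to break reference cycles holding tensors.
    Returns True if the cache was emptied, False if paddle/CUDA is
    unavailable (assumption: paddle 2.x API).
    """
    gc.collect()  # break cycles that keep tensors alive
    try:
        import paddle
        if paddle.is_compiled_with_cuda():
            paddle.device.cuda.empty_cache()
            return True
    except Exception:
        pass
    return False

# Typical use after a prediction (sketch): keep numpy copies, not
# the GPU tensors themselves, then release the cache:
#   preds = model(inputs)
#   preds_np = preds.numpy()  # copy result to host memory
#   del preds                 # drop the GPU tensor reference
#   release_gpu_cache()
```

If memory still grows after this, the references are likely being held inside the predictor or worker process itself, which would support the leak suspicion above.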