Memory leak in PaddleOCR. #11639

ShubhamZoop · 2024-02-28T09:25:24Z

I have noticed some weird memory usage when evaluating the performance of PaddleOCR:

When PaddleOCR processes new images of a sequence, there is a constant increase in memory usage of the process.
The time profile of the memory usage has sudden steps to higher memory levels.
If the system sets an upper bound on memory consumption, the Paddle process is eventually killed. It never gives allocated memory back.
This behaviour can be observed both with the C++ and the Python interface.
Eventually it is impossible to keep running a PaddleOCR process as a service because the system runs out of memory and the process is killed.

Can you provide insights on this memory usage pattern? Do you have any remedy?

The following sections describe the tests in detail.

C++ Tests
Setup for the C++ tests
Test hardware:

OS: Ubuntu 20.04
CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, 16 threads, AVX2
RAM: 128GB

PaddleOCR: Source code from v2.7.0.3

Model files:

Detector: en_PP-OCRv3_det_infer.tar
Classifier: ch_ppocr_mobile_v2.0_cls_infer.tar
Recognizer: en_PP-OCRv3_rec_infer.tar
OpenCV has been compiled from the source code tagged at v4.6.0 with the parameters:

Methodology for the C++ tests
PaddleOCR repo at v2.7.0.3

Slight modifications to report the memory usage (Resident Set Size), the number of bboxes/image and the total time spent after the OCR of each image.

Run on CPU with or without MKL enabled.
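
The per-image instrumentation described above can be approximated in Python roughly as follows. This is a minimal sketch, not the modification actually used in these tests: `run_with_memory_log` and the `ocr_fn` callable are hypothetical names, and the RSS reading assumes Linux's /proc/self/statm (with a fallback to the peak RSS from the `resource` module elsewhere).

```python
import os
import time


def current_rss_mib():
    """Return the resident set size of this process in MiB."""
    try:
        # Linux: the second field of /proc/self/statm is RSS in pages.
        with open("/proc/self/statm") as f:
            rss_pages = int(f.read().split()[1])
        return rss_pages * os.sysconf("SC_PAGE_SIZE") / 2**20
    except (OSError, ValueError, IndexError):
        # Fallback: peak (not current) RSS; KiB on Linux, bytes on macOS.
        import resource
        import sys
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        return peak / 2**20 if sys.platform == "darwin" else peak / 2**10


def run_with_memory_log(items, ocr_fn):
    """Call ocr_fn on each item; record (index, boxes, seconds, rss_mib)."""
    rows = []
    start = time.perf_counter()
    for i, item in enumerate(items):
        boxes = ocr_fn(item)  # hypothetical: returns a list of detected boxes
        elapsed = time.perf_counter() - start  # total time since the first image
        rows.append((i, len(boxes), elapsed, current_rss_mib()))
    return rows


if __name__ == "__main__":
    # Stand-in for a real OCR call, so the sketch runs without PaddleOCR.
    fake_ocr = lambda image: [None] * 3
    for row in run_with_memory_log(range(5), fake_ocr):
        print("image=%d boxes=%d t=%.3fs rss=%.1f MiB" % row)
```

Plotting the last column against the image index gives the kind of memory time profile discussed in this issue.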

cmd arguments:

./${PADDLE_OCR_BUILD_DIR}/ppocr \
    --rec_char_dict_path="./dicts/en_dict.txt" \
    --det_model_dir=${MODELS_INFERENCE_DIR}/det_db \
    --rec_model_dir=${MODELS_INFERENCE_DIR}/rec_rcnn \
    --cls_model_dir=${MODELS_INFERENCE_DIR}/cls \
    --visualize=true \
    --output=${OUT_DIR} \
    --image_dir=${IMAGES_DIR} \
    --use_angle_cls=true \
    --det=true \
    --rec=true \
    --cls=true \
    --use_gpu=false \
    --enable_mkldnn=true \
    --precision=fp32 \
    --cpu-threads=4

Results of the C++ tests

Test 1: Base

CPU with MKL enabled
1320 images
OCR pipeline = Detector + Classifier + Recognizer
Num of threads: 4

Test 2: Long run

CPU with MKL enabled
10560 images (1320 images repeated 8 times)
OCR pipeline = Detector + Classifier + Recognizer
Num of threads: 4

Test 3: No MKL

CPU with MKL disabled
1320 images
OCR pipeline = Detector + Classifier + Recognizer
Num of threads: 4

Test 4: Det only

CPU with MKL enabled
1320 images
OCR pipeline = Detector
Num of threads: 4

Test 5: Det + Rec

CPU with MKL enabled
1320 images
OCR pipeline = Detector + Recognizer
Num of threads: 4

Test 6: Det + Cls

CPU with MKL enabled
1320 images
OCR pipeline = Detector + Classifier
Num of threads: 4

Test 7: Loop same image

CPU with MKL enabled
OCR the same image 1320 times
OCR pipeline = Detector + Classifier + Recognizer
Num of threads: 4

Python tests
Setup for the Python tests
Test hardware:

OS: Ubuntu 20.04
CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, 16 threads, AVX2
RAM: 128GB
paddlepaddle: v2.6.0

paddleocr: v2.7.0.3

Results of the Python tests
CPU with MKL enabled
1320 images
OCR pipeline = Detector + Classifier + Recognizer
Num of threads: 4

slelekospd · 2024-02-29T12:11:59Z

Same problem...

ShubhamZoop · 2024-02-29T12:41:31Z

@tink2123 Can you look into this issue?

vivienfanghuagood · 2024-03-19T13:35:33Z

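Hello. After analysis, this is essentially not a memory leak. The Paddle framework caches Tensors for reuse: this memory is reused the next time a tensor of the same shape is encountered, which avoids calling the system allocator. If you are sensitive to memory usage, you can export FLAGS_allocator_strategy=naive_best_fit, which will relieve CPU memory usage to some extent. We will further optimize the Allocator logic to provide a more reasonable memory reuse strategy.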
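
To illustrate why caching buffers by tensor shape can look like a leak when every image has a different size, here is a toy model. It is only an illustration under that assumption, not Paddle's allocator: `ShapeKeyedCache` is a hypothetical class that keeps one buffer per distinct shape and never frees it.

```python
class ShapeKeyedCache:
    """Toy buffer cache: one buffer per distinct shape, never released."""

    def __init__(self):
        self._buffers = {}

    def get(self, shape, itemsize=4):
        # Reuse the cached buffer for this shape, or allocate a new one.
        if shape not in self._buffers:
            n = itemsize
            for dim in shape:
                n *= dim
            self._buffers[shape] = bytearray(n)
        return self._buffers[shape]

    def held_bytes(self):
        # Total size of all buffers the cache is still holding.
        return sum(len(b) for b in self._buffers.values())


if __name__ == "__main__":
    same = ShapeKeyedCache()
    for _ in range(100):
        same.get((3, 48, 320))          # same shape on every request
    print("same shape:   ", same.held_bytes(), "bytes held")

    varied = ShapeKeyedCache()
    for width in range(320, 420):
        varied.get((3, 48, width))      # a new width on every request
    print("varying shape:", varied.held_bytes(), "bytes held")
```

With a single repeated shape the held memory stays flat, while a stream of distinct shapes makes it grow with every new shape.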

ShubhamZoop · 2024-03-19T15:48:46Z

@vivienfanghuagood Thank you for your reply, but as I am using PaddleOCR, I can't find any parameter such as FLAGS_allocator_strategy=naive_best_fit. As per my research, it's in the PaddlePaddle repo here.

The following are the parameters in PaddleOCR:
Namespace(alpha=1.0, alphacolor=(255, 255, 255), benchmark=False, beta=1.0, binarize=False, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/shubham/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_thresh=0.9, cpu_threads=1, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_box_type='quad', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=960, det_limit_type='max', det_model_dir='/home/shubham/.paddleocr/whl/det/en/en_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=True, fourier_degree=5, gpu_id=0, gpu_mem=500, help='==SUPPRESS==', image_dir=None, image_orientation=False, invert=False, ir_optim=True, kie_algorithm='LayoutXLM', label_list=['0', '180'], lang='en', layout=True, layout_dict_path=None, layout_model_dir=None, layout_nms_threshold=0.5, layout_score_threshold=0.5, max_batch_size=10, max_text_length=25, merge_no_span_structure=True, min_subgraph_size=15, mode='structure', ocr=True, ocr_order_method=None, ocr_version='PP-OCRv4', output='./output', page_num=0, precision='fp32', process_id=0, re_model_dir=None, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/home/shubham/anaconda3/envs/ocr-service-gpu/lib/python3.8/site-packages/paddleocr/ppocr/utils/en_dict.txt', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_model_dir='/home/shubham/.paddleocr/whl/rec/en/en_PP-OCRv4_rec_infer', recovery=False, save_crop_res=False, save_log_path='./log_output/', 
scales=[8, 16, 32], ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ser_model_dir=None, show_log=True, sr_batch_num=1, sr_image_shape='3, 32, 128', sr_model_dir=None, structure_version='PP-StructureV2', table=True, table_algorithm='TableAttn', table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=False, use_dilation=True, use_gpu=1, use_mp=False, use_npu=False, use_onnx=False, use_pdf2docx_api=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, use_visual_backbone=True, use_xpu=False, vis_font_path='./doc/fonts/simfang.ttf', warmup=False)

vivienfanghuagood · 2024-03-20T01:59:44Z

You are right. In fact, this is an environment variable in Paddle. When you use PaddleOCR for inference, the backend may be Paddle (if you choose it), so this environment variable will take effect.

ShubhamZoop · 2024-03-20T10:07:09Z

@vivienfanghuagood Yes, you are correct, but I can't manually change that env variable. Can you suggest any solution? I really want to fix this automatic memory growth problem. Thanks.

GreatV · 2024-03-21T07:36:30Z

Just set it in your shell:

export FLAGS_allocator_strategy=naive_best_fit

ShubhamZoop · 2024-03-26T11:17:26Z

> just set in you shell
> export FLAGS_allocator_strategy=naive_best_fit

@GreatV This is not working.

GreatV · 2024-03-26T11:52:22Z

It is supposed to work according to the python code.

PaddleOCR/tools/infer_kie.py

Line 29 in 89e0a15

os.environ["FLAGS_allocator_strategy"] = 'auto_growth'
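
Following the same pattern as the line above, the strategy suggested earlier in this thread can also be set from Python when exporting it in the shell is not an option. A minimal sketch, assuming the variable has to be in the environment before Paddle is first imported; `set_allocator_strategy` is a hypothetical helper, and only the two values that appear in this thread are accepted.

```python
import os
import sys

# Only the values mentioned in this thread; other values may exist.
KNOWN_STRATEGIES = ("naive_best_fit", "auto_growth")


def set_allocator_strategy(strategy="naive_best_fit"):
    """Set FLAGS_allocator_strategy for this process and its children."""
    if strategy not in KNOWN_STRATEGIES:
        raise ValueError("unexpected strategy: %r" % (strategy,))
    if "paddle" in sys.modules:
        # Assumption: Paddle reads the flag when it is first imported,
        # so setting it afterwards may have no effect.
        raise RuntimeError("set the strategy before importing paddle")
    os.environ["FLAGS_allocator_strategy"] = strategy
    return strategy


if __name__ == "__main__":
    set_allocator_strategy("naive_best_fit")
    # Import PaddleOCR only after the variable is set, e.g.:
    # from paddleocr import PaddleOCR
    print(os.environ["FLAGS_allocator_strategy"])
```

Calling the helper at the very top of the service entry point keeps the setting independent of how the process is launched.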

ShubhamZoop · 2024-03-27T12:27:01Z

@GreatV @vivienfanghuagood Still the same. Here I am using the Python memory profiler. On every request there is a memory increment on the model predict line. This is not the case if we do the same for the same image. I also tried preprocessing and resizing the images so that the tensors have the same shape, but the problem remains.

GreatV · 2024-03-27T12:46:36Z

This is a long-standing issue in the community. I generally use ONNX as the inference backend in my deployments.

ShubhamZoop · 2024-03-27T12:49:25Z

@GreatV Can you show your inference script that uses ONNX for OCR? I will check the memory issue with ONNX too.

GreatV · 2024-03-27T12:55:36Z

Hi @ShubhamZoop, you may refer to https://github.com/GreatV/ppocr_trt_infer

ShubhamZoop · 2024-03-27T12:59:50Z

> Hi @ShubhamZoop, you may refer https://github.com/GreatV/ppocr_trt_infer

Does this solve the memory growth problem? And can you show me how to get started with the above infer files? Thank you for your help, I appreciate it.

GreatV · 2024-03-27T15:23:46Z

Hi @ShubhamZoop, you may try other implementations that don't use the Paddle inference backend, such as https://github.com/PaddlePaddle/FastDeploy. The code at the link above is just a demo that removes the Paddle inference dependency from the official deploy code and uses TensorRT as the backend.

ShubhamZoop · 2024-03-27T18:16:01Z

> Hi @ShubhamZoop, You may try other implementations that don't use the paddle inference backend, such as https://github.com/PaddlePaddle/FastDeploy. The code above link is just a demo that removes the paddle inference dependency from the official deploy code and uses tensorrt as the backend.

I am already using OpenVINO as a backend via FastDeploy, and it seems to behave the same. #memory_leak :p
I will try ORT (ONNX Runtime) and let you know. Thanks.

ShubhamZoop · 2024-04-08T10:18:25Z

@GreatV
ORT kind of works fine most of the time, but yes, it's pretty slow compared to OpenVINO, and sometimes it takes very long to predict.

Will there be any update on Paddle for the memory leak? Why is this issue taking so long to solve? @TingquanGao @vivienfanghuagood

jzhang533 · 2024-04-08T11:17:47Z

> why this issue is taking so long to solve?

Most of the key developers (i.e., git shortlog -s -n | head -15) have left. The project is currently undergoing a transition from a corporate open-source project to a fully community-driven model. That means we can't force people to solve the issue. See discussion #11859.

pxike · 2024-04-16T12:22:57Z

@ShubhamZoop Hello, how did you manage to integrate ORT? I've been facing the same problem.

Shubham654 · 2024-04-23T06:45:52Z

> @ShubhamZoop Hello ,how did u manage to integrate ORT I've been facing the same problem.

@pxike
You can refer to https://github.com/PaddlePaddle/FastDeploy for using different backends. Also, if you want to use ORT in PaddleOCR itself, you need to switch to models that support ORT as a backend.

cmathx · 2024-06-11T15:30:48Z

Hi everyone, how was the memory leak problem eventually solved?

Shubham654 · 2024-06-14T10:51:24Z

@cmathx The issue is not yet fully resolved, but you can try using ONNX models, which are better than the default.

cmathx · 2024-06-15T12:32:08Z

> @cmathx The issue is not yet fully resolved. But you can try using Onnx models which are better than the default.

Using ONNX is too slow.
With normal PaddleOCR inference on about 2 threads, it costs about 400 ms per image, but 2000 ms+ with ONNX inference.
I also found that it uses all CPU threads even though OMP_NUM_THREADS=2 is set. How can ONNX multithreaded performance be sped up? I need to process one image within 1000 ms.
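
One thing that may be worth checking for the thread question above: ONNX Runtime configures its CPU thread pools through `SessionOptions` (`intra_op_num_threads`, `inter_op_num_threads`), so `OMP_NUM_THREADS` may have no effect on builds that do not use OpenMP. A minimal sketch, assuming `onnxruntime` is installed; `make_session_options` and the model path in the usage comment are hypothetical, and whether this brings latency under 1000 ms is untested.

```python
def make_session_options(num_threads, ort=None):
    """Build ONNX Runtime SessionOptions limited to num_threads CPU threads."""
    if ort is None:
        import onnxruntime as ort  # imported lazily; must be installed
    opts = ort.SessionOptions()
    # Threads used to parallelize the work inside a single operator.
    opts.intra_op_num_threads = num_threads
    # Threads used to run independent operators concurrently
    # (only relevant in parallel execution mode).
    opts.inter_op_num_threads = 1
    return opts


# Usage (assumes onnxruntime is installed; "rec.onnx" is a placeholder path):
#   import onnxruntime as ort
#   session = ort.InferenceSession(
#       "rec.onnx",
#       sess_options=make_session_options(2),
#       providers=["CPUExecutionProvider"],
#   )
```

The optional `ort` argument exists only so the helper can be exercised without the real package.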

LLee233 · 2024-06-19T07:15:42Z

Hi, this issue should be fixed on the develop branch of Paddle after oneDNN v3.4 is merged. The root cause is that some specific kernels (brgemm kernels) in oneDNN were not reused even when they were functionally identical, resulting in repeated creation and, in turn, a memory leak. This issue is fixed after v3.4; you may try building Paddle locally and see if it works. Thanks.

Shubham654 · 2024-06-19T07:20:39Z

@LLee233 Thanks for mentioning it. Can you point to the commit for oneDNN v3.4? I can't see any changes regarding oneDNN v3.4 on any branch.

LLee233 · 2024-06-19T07:35:38Z

> @LLee233 Thanks for mentioning it, can you mention the commit for oneDNN v3.4. I can't see any changes regarding oneDNN v3.4 on any branch.

@Shubham654 This commit is where we update oneDNN on Paddle.

cmathx · 2024-06-19T07:40:06Z

It's cool. In my several OCR tests with Paddle/OpenVINO/ONNX models, the Paddle model was the fastest. If the memory leak is really fixed (I will test the suggestion you mentioned), the Paddle model will be a strong engineering toolkit.

paddle-bot bot assigned TingquanGao Feb 28, 2024

This was referenced Mar 19, 2024

2.7.1 C++ CPU PPOCR class memory leak #11472

Closed

memory leak #11753

Closed

ShubhamZoop mentioned this issue Apr 6, 2024

ch_PP-OCRv4_rec_infer recognition speed is unstable #11860

Closed

jzhang533 added the triaged, needs investigation, and memory leak labels Apr 10, 2024

pxike mentioned this issue Apr 26, 2024

can't run paddleocr converted model on onnxruntime PaddlePaddle/Paddle2ONNX#1241

Open

jasondalycanpk mentioned this issue May 11, 2024

[Discussion] How to get PaddleOCR better maintained. #11859

Closed

Shubham654 mentioned this issue May 26, 2024

Report: memory leak encountered after 2 hours when using the PaddleOCR CPU package on a CPU machine #12150

Closed

GreatV mentioned this issue Jun 21, 2024

GPU memory keeps growing during GPU inference and is not released; once GPU memory is full, further inference returns no results #13145

Closed
