This issue was moved to a discussion.

Memory leak #187

Closed
cmathx opened this issue Jun 17, 2024 · 2 comments

Comments

cmathx commented Jun 17, 2024

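Problem Description

Running OCR inference on CPU leaks memory with every backend I tried.
After the OCR service is deployed, it receives a wide variety of images in production for recognition.

  • Native Paddle model: memory leak; memory exceeds 3.5 GB after about half an hour.
  • Converted to OpenVINO: memory leak; memory exceeds 3.5 GB after about half an hour.
  • Converted to ONNX: memory leak; memory exceeds 2.3 GB after about 6 hours.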
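
To put numbers on the growth described above, one option is to sample the process's resident set size (RSS) while replaying requests. The following is a minimal sketch and not part of the original report: `rss_mib` and `track_rss` are hypothetical helper names, and `rss_mib` assumes Linux because it reads `/proc/self/statm`, which matches the Debian host mentioned here.

```python
import gc
import os


def rss_mib():
    """Return the current resident set size in MiB (Linux only, via /proc)."""
    with open("/proc/self/statm") as f:
        # Second field of statm is the number of resident pages.
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024 * 1024)


def track_rss(fn, inputs, every=100, read_rss=rss_mib):
    """Call fn on each input and sample memory after every `every` calls.

    Returns a list of (call_count, rss) tuples. `read_rss` is injectable so
    the sampling source can be swapped out, for example on non-Linux hosts.
    """
    samples = []
    for count, item in enumerate(inputs, start=1):
        fn(item)
        if count % every == 0:
            # Collect Python garbage first so it is not mistaken for a leak.
            gc.collect()
            samples.append((count, read_rss()))
    return samples
```

With the reproduction code in this report, `fn` could wrap the OCR call (for instance `lambda data: engine(data)`, where `engine` stands for the constructed OCR object) and `inputs` could be a replayed set of production images. Samples that keep rising over thousands of calls would support a leak, while a curve that flattens out may only reflect warm-up and allocator caching.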

Runtime Environment

4-core CPU, Debian

Reproduction Code

######server.py######
from paddle_ocr_client import paddle_ocr_server
@server.register('predict_images')
def predict_images(ctx, req):
    resp = paddle_ocr_server.predict_images(req)
    return resp

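######paddle_ocr_client.py######
import time

# Assumption: the ONNX Runtime build of RapidOCR; the report does not show its imports.
from rapidocr_onnxruntime import RapidOCR

# OcrResult and ImageOcrResult are defined elsewhere in the reporter's code (not shown).

class PaddleOcrServer():
    def __init__(self) -> None: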
        #self.ocr = PaddleOCR(enable_mkldnn=True, cpu_threads=2, use_space_char=True, lang="en", warmup=True, ir_optim=True, rec_batch_num=8, det_db_thresh=0.5, det_db_score_mode='fast', det_limit_side_len=864, det_model_dir='./inference/det_onnx/model.onnx', rec_model_dir='./inference/rec_onnx/model.onnx', rec_char_dict_path='./inference/en_dict.txt', use_onnx=True)
        #self.ocr = PaddleOCR(use_space_char=True, lang="en", det_db_thresh=0.5, det_db_score_mode='fast', det_limit_side_len=864, det_model_dir='./inference/det_onnx/model.onnx', rec_model_dir='./inference/rec_onnx/model.onnx', rec_char_dict_path='./inference/en_dict.txt', use_onnx=True)
        #self.ocr = PaddleOCR(use_gpu=True, ocr_version='PP-OCRv3', use_space_char=True, lang="en", warmup=True, enable_mkldnn=True, ir_optim=True, cpu_threads=2, rec_batch_num=8, det_db_thresh=0.5, det_db_score_mode='fast', det_limit_side_len=864, det_model_dir='./inference/det_onnx/model.onnx', rec_model_dir='./inference/rec_onnx/model.onnx', use_onnx=True)
        self.ocr = RapidOCR(config_path='inference/config.yaml')
        self.ocr_result = OcrResult()
        self.image_ocr_result = ImageOcrResult()
        for i in range(20):
            self.warmup_test()

    def predict_images(self, req):
        result_ll = []
        beg_time = time.time()
        for image_info in req.images:
            result = self.ocr(image_info.data)[0]
            #print(result)
        end_time = time.time()
        print(1000.0 * (end_time - beg_time))
        #logging.info('paddle_ocr cost: %s\n' %(str(100.0*(end_time-beg_time))))
        #cost = 1000.0 * (end_time - beg_time)
        return self.ocr_result

    def warmup_test(self):
        # Body elided in the original report.
        ...

######inference/config.yaml######
Global:
    text_score: 0.5
    use_det: true
    use_cls: false
    use_rec: true
    print_verbose: false
    min_height: 30
    width_height_ratio: 8

    intra_op_num_threads: 4
    inter_op_num_threads: 4

Det:
    intra_op_num_threads: 4
    inter_op_num_threads: 4

    use_cuda: false
    use_dml: false

    model_path: inference/det_slim_onnx/model.onnx

    limit_side_len: 576
    limit_type: min

    thresh: 0.3
    box_thresh: 0.5
    max_candidates: 1000
    unclip_ratio: 1.6
    use_dilation: true
    score_mode: fast

Cls:
    intra_op_num_threads: 4
    inter_op_num_threads: 4

    use_cuda: false
    use_dml: false

    model_path: inference/cls_onnx/model.onnx

    cls_image_shape: [3, 48, 192]
    cls_batch_num: 6
    cls_thresh: 0.9
    label_list: ['0', '180']

Rec:
    intra_op_num_threads: 4
    inter_op_num_threads: 4

    use_cuda: false
    use_dml: false

    model_path: inference/rec_slim_onnx/model.onnx
    rec_keys_path: inference/en_dict.txt

    rec_img_shape: [3, 48, 320]
    rec_batch_num: 6

Possible solutions

The cause is probably inside PaddleOCR.
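
Whether the growth comes from Python objects or from native code underneath can be narrowed down with the standard library's `tracemalloc`, which only sees allocations made through Python's allocator. The sketch below uses a hypothetical helper name, `python_heap_growth`, and is not taken from the report. If process RSS keeps rising while this reports little growth, the leak is more likely in the native inference runtime than in the Python wrapper code.

```python
import tracemalloc


def python_heap_growth(fn, inputs, top=5):
    """Run fn over inputs and return the largest Python-level allocation diffs.

    Each returned item is a tracemalloc.StatisticDiff; its size_diff field is
    the change in bytes attributed to one source line.
    """
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for item in inputs:
        fn(item)
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to sorts by absolute size difference, largest first.
    return after.compare_to(before, "lineno")[:top]
```

Printing each returned entry shows the file and line responsible for the retained memory, which helps decide whether the service code (for example a result list that is never cleared) or the library is holding on to data.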

Member

SWHL commented Jun 17, 2024

Did you file this in the wrong place? It should go to the PaddleOCR project.

@SWHL closed this as not planned Jun 18, 2024
Author

cmathx commented Jun 19, 2024

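May I ask whether the memory leak in PaddleOCR's CPU inference is still unfixed? This open-source project is well optimized for speed, especially with the official models, but the memory bug seems to have gone unfixed all along?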

@RapidAI locked and limited conversation to collaborators Jun 19, 2024
@SWHL converted this issue into discussion #188 Jun 19, 2024