Recognition takes nearly 30 s on CPU even with acceleration enabled. How can I speed it up? #2950

cgq0816 · 2021-05-28T03:12:47Z

Hardware environment:
CPU: 2.3 GHz
paddle 2.02
paddleocr 2.0.1

CPU acceleration configuration:
Namespace(cls_batch_num=30, cls_image_shape='3, 48, 192', cls_model_dir='./inference/ch_ppocr_mobile_v2.0_cls_infer', cls_thresh=0.9, det=True, det_algorithm='DB', det_db_box_thresh=0.2, det_db_thresh=0.3, det_db_unclip_ratio=1.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_limit_side_len=1920, det_limit_type='max', det_model_dir='./inference/ch_ppocr_server_v2.0_det_infer', drop_score=0.5, enable_mkldnn=True, enhancer=True, gpu_mem=8000, image_dir='', ir_optim=True, label_list=['0', '180'], lang='ch', max_text_length=25, rec=True, rec_algorithm='CRNN', rec_batch_num=30, rec_char_dict_path='../ppocr/utils/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 32, 320', rec_model_dir='./inference/ch_ppocr_server_v2.0_rec_infer', remove_red=True, rotate_cut=False, use_angle_cls=False, use_dilation=True, use_gpu=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, use_zero_copy_run=False, vertical_cut=True)

det_limit_side_len=1920
enable_mkldnn=true
rec_batch_num = 6

Execution time:
[2021/05/28 11:02:57] root INFO: dt_boxes num : 322, elapse : 2.1602203845977783
[2021/05/28 11:02:57] root INFO: text_detector time : 2.317730188369751
[2021/05/28 11:03:24] root INFO: text_recognizer time : 26.86380124092102
[2021/05/28 11:03:24] root INFO: rec_res num : 324, elapse : 26.78428030014038

Most of the time is spent in the recognition model. How can I speed up recognition inference?

Image size: (2352, 1673, 3)
The image is shown below:
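
A quick back-of-the-envelope check of the log above, as a sketch. It uses only the numbers printed in the log plus the two rec_batch_num values that appear in this post (6 and 30); the variable names are illustrative.

```python
import math

# Numbers copied from the log above.
det_time_s = 2.317730188369751   # text_detector time
rec_time_s = 26.86380124092102   # text_recognizer time
rec_count = 324                  # rec_res num

# Share of the total time spent in recognition.
rec_share = rec_time_s / (det_time_s + rec_time_s)

# Average recognition cost per text crop, in milliseconds.
ms_per_crop = rec_time_s / rec_count * 1000

# Forward passes needed for the two batch sizes mentioned in the post.
batches_at_6 = math.ceil(rec_count / 6)
batches_at_30 = math.ceil(rec_count / 30)

print(f"recognition share of total: {rec_share:.1%}")
print(f"average per crop: {ms_per_crop:.1f} ms")
print(f"forward passes: {batches_at_6} (batch 6) vs {batches_at_30} (batch 30)")
```

Recognition accounts for roughly 92% of the run at about 83 ms per crop, so the number of text crops is the main driver of the total time.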

littletomatodonkey · 2021-05-28T06:12:25Z

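Try enabling MKL-DNN via the enable_mkldnn parameter; it gives some speedup.
You can also increase the recognition batch size via the rec_batch_num parameter.
If you have many images, try enabling multiprocessing via the use_mp parameter.

All of these parameters are defined in the following file: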

https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.1/tools/infer/utility.py
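
To illustrate why rec_batch_num matters, here is a minimal sketch of width-sorted batching: crops are ordered by aspect ratio so that each batch pads to a similar width, then cut into groups of `batch_size`. This is an illustration under stated assumptions and not PaddleOCR's actual code; `make_batches` and the sample crop sizes are hypothetical.

```python
from typing import List, Tuple


def make_batches(sizes: List[Tuple[int, int]], batch_size: int) -> List[List[int]]:
    """Group crop indices into batches of similar aspect ratio.

    sizes holds (width, height) per crop; the result holds lists of indices.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    # Sort indices by width/height so crops of similar shape share a batch.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][0] / sizes[i][1])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


# Hypothetical crop sizes (width, height) in pixels.
crops = [(320, 32), (64, 32), (200, 32), (90, 32), (310, 32)]
batches = make_batches(crops, batch_size=2)
print(batches)  # [[1, 3], [2, 4], [0]]
```

A larger batch size means fewer forward passes, but each batch is padded to its widest crop, so the gain depends on how varied the crop widths are.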

cgq0816 · 2021-05-29T03:38:34Z

Hello, I am using version 2.0.2 and have already set both enable_mkldnn and rec_batch_num, as described above:
det_limit_side_len=1920
enable_mkldnn=true
rec_batch_num = 6

The use_mp parameter seems to exist only in the latest PaddleOCR version; 2.0.2 does not have it.

bytes-lost · 2021-06-02T10:20:59Z

paddleocr

Have you tried running inference without MKL-DNN? In my case, enabling MKL-DNN made it much slower.

cgq0816 · 2021-06-03T07:22:07Z

Without it, recognition was very slow. Enabling it saves a few seconds, but overall recognition is still slow.

Sanster · 2021-06-05T14:01:49Z

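Recognition is slow because there are many text boxes. If latency matters, use more CPU cores and run the boxes from one image in parallel across several processes.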
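
The idea above could be sketched roughly as follows: split the crops of one image into contiguous chunks and hand each chunk to its own process. `split_evenly` and `recognize_parallel` are hypothetical helpers, and `recognize_chunk` stands in for whatever function runs the recognizer on a list of crops; none of these names come from PaddleOCR.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Callable, List, Sequence


def split_evenly(items: Sequence, parts: int) -> List[list]:
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(list(items[start:end]))
        start = end
    return chunks


def recognize_parallel(crops: Sequence, recognize_chunk: Callable[[list], list],
                       workers: int) -> list:
    """Run recognize_chunk on each chunk in its own process, keeping input order.

    recognize_chunk must be a picklable top-level function that takes a list
    of crops and returns a list of results of the same length.
    """
    chunks = [c for c in split_evenly(crops, workers) if c]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = pool.map(recognize_chunk, chunks)  # map preserves chunk order
        return [r for chunk_result in results for r in chunk_result]
```

Each worker process would need to load its own copy of the model, so this trades memory for latency. On platforms that spawn processes (Windows, macOS), `recognize_parallel` should be called under `if __name__ == "__main__":`.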

cgq0816 · 2021-06-07T00:48:36Z

That is what I am doing now, but it is still slow. The sample image and inference times are in my post above. Is there any other approach?

Sanster · 2021-06-12T06:51:13Z

You could benchmark onnxruntime; its CPU performance is quite good.
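
A minimal sketch of such a benchmark, assuming the recognition model has already been exported to ONNX. The input shape (N, 3, 32, 320) is taken from rec_image_shape in the original post; the model path, batch size, thread count, and helper names are assumptions made for illustration.

```python
import time
from statistics import median
from typing import Callable


def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):  # warm-up runs are not timed
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return median(samples)


def make_ort_runner(model_path: str, batch: int = 6, threads: int = 6):
    """Build a zero-argument callable that runs one ONNX Runtime forward pass.

    Assumes a float32 model with a single input of shape (N, 3, 32, 320).
    """
    import numpy as np
    import onnxruntime as ort  # imported lazily so benchmark() works without it

    opts = ort.SessionOptions()
    opts.intra_op_num_threads = threads
    sess = ort.InferenceSession(model_path, sess_options=opts,
                                providers=["CPUExecutionProvider"])
    name = sess.get_inputs()[0].name
    x = np.random.rand(batch, 3, 32, 320).astype(np.float32)
    return lambda: sess.run(None, {name: x})
```

Usage would look like `benchmark(make_ort_runner("rec.onnx"))`, where `rec.onnx` is a placeholder path. Timing the Paddle predictor with the same `benchmark` function and the same input shape gives a like-for-like comparison.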

cgq0816 · 2021-06-12T07:36:32Z

https://aistudio.baidu.com/aistudio/projectdetail/1967957

Hello, I tried ONNX, but the results were not very good. What inference time are you seeing? Have you tried the image I uploaded?

Sanster · 2021-06-12T08:05:42Z

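I have compared onnxruntime and PyTorch: ORT is clearly faster for CPU inference. Taking CRNN as an example, with a 32*128 input it takes 7 ms vs. 40 ms. The actual time also depends on your backbone size and CPU clock speed. You could pay for a CPU with a higher clock speed, or check whether your current CPU supports AVX512 and whether Paddle has AVX512 optimizations.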
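
One way to check for AVX512 support on Linux is to look for the `avx512f` flag in /proc/cpuinfo. The sketch below assumes an x86 Linux machine; on other systems it simply reports False. The helper names are illustrative.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Parse the first 'flags' line of Linux /proc/cpuinfo text into a set."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512(cpuinfo_text: str) -> bool:
    """True if the AVX-512 Foundation flag (avx512f) is present."""
    return "avx512f" in cpu_flags(cpuinfo_text)


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read cpuinfo; returns '' on systems without that file."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except OSError:
        return ""


print("AVX-512F supported:", has_avx512(read_cpuinfo()))
```

This only tells you what the CPU offers; whether the inference library actually uses those instructions is a separate question.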

cgq0816 · 2021-06-12T08:09:00Z

OK, thanks. Could you share some sample code? My earlier tests followed the official instructions.

SWHL · 2021-07-02T03:30:39Z

You can take a look at RapidOCR.

cgq0816 · 2021-08-03T02:12:37Z

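I changed the thread count and the recognition batch size in utility.py:
config.set_cpu_math_library_num_threads(6)
args.rec_batch_num = 30
If your server allows it, you can set even more threads and a larger batch_num.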
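
For context, a sketch of where that thread setting lives when building a Paddle Inference CPU config by hand. It assumes the model directory contains `inference.pdmodel` and `inference.pdiparams`; `pick_num_threads` and `build_cpu_config` are hypothetical helpers, not functions from utility.py.

```python
import os


def pick_num_threads(requested: int, available=None) -> int:
    """Clamp the requested thread count to the number of logical CPUs."""
    if available is None:
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


def build_cpu_config(model_dir: str, threads: int = 6, use_mkldnn: bool = True):
    """Create a Paddle Inference CPU config for one model directory.

    Assumes the directory holds inference.pdmodel and inference.pdiparams.
    """
    from paddle.inference import Config  # lazy import: needs paddlepaddle installed

    config = Config(os.path.join(model_dir, "inference.pdmodel"),
                    os.path.join(model_dir, "inference.pdiparams"))
    config.disable_gpu()
    if use_mkldnn:
        config.enable_mkldnn()
    config.set_cpu_math_library_num_threads(pick_num_threads(threads))
    return config
```

The config would then be passed to `paddle.inference.create_predictor`. Asking for more threads than there are logical cores tends not to help, which is why the sketch clamps the value.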

luodaoyi · 2022-04-15T23:18:07Z

How can I make ONNX faster on GPU?

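cgq0816 changed the title ~~Recognition takes nearly 20 s on CPU even with acceleration enabled. How can I speed it up?~~ Recognition takes nearly 30 s on CPU even with acceleration enabled. How can I speed it up? May 28, 2021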

cgq0816 closed this as completed Aug 3, 2021

This was referenced Aug 20, 2021

onnxruntime CPU acceleration #3759

Closed

CPU recognition speed #3738

Closed
