This issue was moved to a discussion.

One Chinese character is not recognized #13118

Closed
xiaoxinmiao opened this issue Jun 18, 2024 · 7 comments
xiaoxinmiao commented Jun 18, 2024

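Problem Description

The last Chinese character (see the attached image), 撮, is not recognized. It is on the Putonghua proficiency test vocabulary list, so it is definitely a common character.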

Runtime Environment

  • OS: Windows 11 Pro 64-bit (10.0, Build 22631) (22621.ni_release.220506-1250)
  • Paddle: 2.6.1
  • PaddleOCR: 2.8.0 (built manually from the main branch, because 2.7.5 has errors on Windows)

Reproduction Code

paddleocr --image_dir processed_image.jpg --use_angle_cls true

Running it directly in the web demo gives the same result:
https://aistudio.baidu.com/community/app/91660/webUI

Complete Error Message

[2024/06/18 14:56:52] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='processed_image.jpg', page_num=0, det_algorithm='DB', det_model_dir='C:\Users\Administrator/.paddleocr/whl\det\ch\ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='C:\Users\Administrator/.paddleocr/whl\rec\ch\ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\site-packages\paddleocr\ppocr\utils\ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='C:\Users\Administrator/.paddleocr/whl\cls\ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, 
save_log_path='./log_output/', show_log=True, use_onnx=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/06/18 14:56:53] ppocr INFO: processed_image.jpg
[2024/06/18 14:56:53] ppocr DEBUG: dt_boxes num : 20, elapsed : 0.5209295749664307
[2024/06/18 14:56:53] ppocr DEBUG: cls num : 20, elapsed : 0.11768102645874023
[2024/06/18 14:56:54] ppocr DEBUG: rec_res num : 20, elapsed : 0.5152599811553955
[2024/06/18 14:56:54] ppocr INFO: [[[68.0, 21.0], [152.0, 28.0], [149.0, 68.0], [64.0, 61.0]], ('bang', 0.995819628238678)]
[2024/06/18 14:56:54] ppocr INFO: [[[289.0, 26.0], [335.0, 26.0], [335.0, 62.0], [289.0, 62.0]], ('an', 0.9728361368179321)]
[2024/06/18 14:56:54] ppocr INFO: [[[477.0, 22.0], [542.0, 22.0], [542.0, 62.0], [477.0, 62.0]], ('huT', 0.8201795220375061)]
[2024/06/18 14:56:54] ppocr INFO: [[[670.0, 24.0], [754.0, 22.0], [755.0, 60.0], [671.0, 63.0]], ('xuan', 0.9222057461738586)]
[2024/06/18 14:56:54] ppocr INFO: [[[874.0, 23.0], [956.0, 23.0], [956.0, 68.0], [874.0, 68.0]], ('jing', 0.9630146026611328)]
[2024/06/18 14:56:54] ppocr INFO: [[[81.0, 119.0], [140.0, 119.0], [140.0, 177.0], [81.0, 177.0]], ('邦', 0.9998911619186401)]
[2024/06/18 14:56:54] ppocr INFO: [[[286.0, 120.0], [340.0, 120.0], [340.0, 174.0], [286.0, 174.0]], ('庵', 0.9973974227905273)]
[2024/06/18 14:56:54] ppocr INFO: [[[484.0, 120.0], [540.0, 120.0], [540.0, 174.0], [484.0, 174.0]], ('辉', 0.9999433755874634)]
[2024/06/18 14:56:54] ppocr INFO: [[[685.0, 120.0], [742.0, 120.0], [742.0, 174.0], [685.0, 174.0]], ('旋', 0.9998549222946167)]
[2024/06/18 14:56:54] ppocr INFO: [[[886.0, 119.0], [943.0, 119.0], [943.0, 177.0], [886.0, 177.0]], ('竞', 0.9520371556282043)]
[2024/06/18 14:56:54] ppocr INFO: [[[76.0, 303.0], [144.0, 303.0], [144.0, 349.0], [76.0, 349.0]], ('cha', 0.8126180171966553)]
[2024/06/18 14:56:54] ppocr INFO: [[[269.0, 302.0], [354.0, 305.0], [353.0, 346.0], [267.0, 343.0]], ('zhou', 0.9807876944541931)]
[2024/06/18 14:56:54] ppocr INFO: [[[470.0, 301.0], [556.0, 308.0], [552.0, 351.0], [466.0, 344.0]], ('tong', 0.9794102907180786)]
[2024/06/18 14:56:54] ppocr INFO: [[[687.0, 303.0], [738.0, 303.0], [738.0, 350.0], [687.0, 350.0]], ('b6', 0.9985640048980713)]
[2024/06/18 14:56:54] ppocr INFO: [[[880.0, 305.0], [947.0, 305.0], [947.0, 346.0], [880.0, 346.0]], ('zu8', 0.7663939595222473)]
[2024/06/18 14:56:54] ppocr INFO: [[[84.0, 400.0], [137.0, 400.0], [137.0, 456.0], [84.0, 456.0]], ('茬', 0.9519940614700317)]
[2024/06/18 14:56:54] ppocr INFO: [[[282.0, 400.0], [341.0, 400.0], [341.0, 457.0], [282.0, 457.0]], ('皱', 0.9996483325958252)]
[2024/06/18 14:56:54] ppocr INFO: [[[484.0, 401.0], [538.0, 401.0], [538.0, 456.0], [484.0, 456.0]], ('捅', 0.9096341729164124)]
[2024/06/18 14:56:54] ppocr INFO: [[[686.0, 400.0], [740.0, 400.0], [740.0, 456.0], [686.0, 456.0]], ('伯', 0.9994731545448303)]

Possible solutions

Appendix

0002

@UserWangZz
Collaborator

You could try the PP-OCRv4 model.
Run this command:

python tools/infer/predict_system.py --image_dir t.jpg --det_model_dir pretrain/ch_PP-OCRv4_det_server_infer --rec_model_dir pretrain/ch_PP-OCRv4_rec_infer/ --rec_char_dict_path ppocr/utils/ppocr_keys_v1.txt --use_space_char True

t

@xiaoxinmiao
Author

xiaoxinmiao commented Jun 18, 2024

I tried it locally and still did not get the last character. Where can I download the models you used? Mine come from here: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/models_list.md
image
image
Here is the output:

'''

PS D:\source\pythonpath\other\PaddleOCR-main> python tools/infer/predict_system.py --image_dir t/0002.jpg --det_model_dir pretrain/ch_PP-OCRv4_det_server_infer --rec_model_dir pretrain/ch_PP-OCRv4_rec_infer/ --rec_char_dict_path ppocr/utils/ppocr_keys_v1.txt --use_space_char True
E0618 17:51:01.776589 888 analysis_config.cc:125] Please use PaddlePaddle with GPU version.
E0618 17:51:02.816227 888 analysis_config.cc:125] Please use PaddlePaddle with GPU version.
[2024/06/18 17:51:03] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
[2024/06/18 17:51:05] ppocr DEBUG: dt_boxes num : 20, elapsed : 2.3756794929504395
[2024/06/18 17:51:06] ppocr DEBUG: rec_res num : 20, elapsed : 0.5246231555938721
[2024/06/18 17:51:06] ppocr DEBUG: 0 Predict time of t/0002.jpg: 2.916s
[2024/06/18 17:51:06] ppocr DEBUG: bang, 0.996
[2024/06/18 17:51:06] ppocr DEBUG: an, 0.994
[2024/06/18 17:51:06] ppocr DEBUG: huT, 0.829
[2024/06/18 17:51:06] ppocr DEBUG: xuan, 0.969
[2024/06/18 17:51:06] ppocr DEBUG: jing, 0.979
[2024/06/18 17:51:06] ppocr DEBUG: 邦, 1.000
[2024/06/18 17:51:06] ppocr DEBUG: 庵, 0.998
[2024/06/18 17:51:06] ppocr DEBUG: 辉, 1.000
[2024/06/18 17:51:06] ppocr DEBUG: 旋, 1.000
[2024/06/18 17:51:06] ppocr DEBUG: 竞, 0.932
[2024/06/18 17:51:06] ppocr DEBUG: cha, 0.944
[2024/06/18 17:51:06] ppocr DEBUG: zhou, 0.993
[2024/06/18 17:51:06] ppocr DEBUG: tong, 0.957
[2024/06/18 17:51:06] ppocr DEBUG: b6, 0.932
[2024/06/18 17:51:06] ppocr DEBUG: zuo, 0.824
[2024/06/18 17:51:06] ppocr DEBUG: 茬, 0.940
[2024/06/18 17:51:06] ppocr DEBUG: 皱, 1.000
[2024/06/18 17:51:06] ppocr DEBUG: 捅, 0.831
[2024/06/18 17:51:06] ppocr DEBUG: 伯, 0.999
[2024/06/18 17:51:06] ppocr DEBUG: The visualized image saved in ./inference_results\0002.jpg
[2024/06/18 17:51:06] ppocr INFO: The predict total time is 3.027186393737793
'''

image

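@xiaoxinmiao
Author

xiaoxinmiao commented Jun 18, 2024

In addition, I ran 343 images, and 59 of them were not fully recognized. I have uploaded all of them.
PS: I noticed something odd: when running with multiple threads, the character occasionally does show up.
不完全识别的.zip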

@UserWangZz
Collaborator

Strange. Try switching to the 2.7 branch.

@UserWangZz
Collaborator

I am using the same models as you.

@xiaoxinmiao
Author

With 2.7, nothing seems to have changed. The results are below:
'''
paddleocr 2.7.0.0
paddlepaddle 2.6.1
'''
image

'''
(.venv) D:\source\pythonpath\other\PaddleOCR-release-2.7>python tools/infer/predict_system.py --image_dir t/0002.jpg --det_model_dir pretrain/ch_PP-OCRv4_det_server_infer --rec_model_dir pretrain/ch_PP-OCRv4_rec_infer/ --rec_char_dict_path ppocr/utils/ppocr_keys_v1.txt --use_space_char True
E0618 20:29:00.455236 7944 analysis_config.cc:125] Please use PaddlePaddle with GPU version.
E0618 20:29:01.519253 7944 analysis_config.cc:125] Please use PaddlePaddle with GPU version.
[2024/06/18 20:29:01] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
[2024/06/18 20:29:03] ppocr DEBUG: dt_boxes num : 20, elapsed : 1.5256006717681885
[2024/06/18 20:29:04] ppocr DEBUG: rec_res num : 20, elapsed : 0.5638906955718994
[2024/06/18 20:29:04] ppocr DEBUG: 0 Predict time of t/0002.jpg: 2.090s
[2024/06/18 20:29:04] ppocr DEBUG: bang, 0.996
[2024/06/18 20:29:04] ppocr DEBUG: an, 0.994
[2024/06/18 20:29:04] ppocr DEBUG: huT, 0.829
[2024/06/18 20:29:04] ppocr DEBUG: xuan, 0.969
[2024/06/18 20:29:04] ppocr DEBUG: jing, 0.979
[2024/06/18 20:29:04] ppocr DEBUG: 邦, 1.000
[2024/06/18 20:29:04] ppocr DEBUG: 庵, 0.998
[2024/06/18 20:29:04] ppocr DEBUG: 辉, 1.000
[2024/06/18 20:29:04] ppocr DEBUG: 旋, 1.000
[2024/06/18 20:29:04] ppocr DEBUG: 竞, 0.932
[2024/06/18 20:29:04] ppocr DEBUG: cha, 0.944
[2024/06/18 20:29:04] ppocr DEBUG: zhou, 0.993
[2024/06/18 20:29:04] ppocr DEBUG: tong, 0.957
[2024/06/18 20:29:04] ppocr DEBUG: b6, 0.932
[2024/06/18 20:29:04] ppocr DEBUG: zuo, 0.824
[2024/06/18 20:29:04] ppocr DEBUG: 茬, 0.940
[2024/06/18 20:29:04] ppocr DEBUG: 皱, 1.000
[2024/06/18 20:29:04] ppocr DEBUG: 捅, 0.831
[2024/06/18 20:29:04] ppocr DEBUG: 伯, 0.999
[2024/06/18 20:29:04] ppocr DEBUG: The visualized image saved in ./inference_results\0002.jpg
[2024/06/18 20:29:04] ppocr INFO: The predict total time is 2.222066879272461
'''
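
One thing the logs above show is that `dt_boxes num : 20` is reported while only 19 recognition results are printed, so one detected box never reaches the output. A minimal Python sketch for spotting that mismatch in a saved log follows. The function name `count_detected_and_reported` and the abbreviated `sample_log` are illustrative assumptions, not part of PaddleOCR, and the regular expressions assume the `predict_system.py` log format shown in this thread.

```python
import re


def count_detected_and_reported(log_text: str) -> tuple:
    """Return (detected_boxes, reported_results) parsed from a ppocr log."""
    # The detector logs a line such as "dt_boxes num : 20, elapsed : ...".
    match = re.search(r"dt_boxes num : (\d+)", log_text)
    detected = int(match.group(1)) if match else 0
    # Result lines end with ", <score>", where the score has three decimals,
    # for example "ppocr DEBUG: zuo, 0.824".
    reported = len(
        re.findall(r"ppocr DEBUG: .+, \d\.\d{3}$", log_text, flags=re.MULTILINE)
    )
    return detected, reported


# Abbreviated, illustrative log (not the full output from this thread).
sample_log = """\
[2024/06/18 20:29:03] ppocr DEBUG: dt_boxes num : 3, elapsed : 1.52
[2024/06/18 20:29:04] ppocr DEBUG: rec_res num : 3, elapsed : 0.56
[2024/06/18 20:29:04] ppocr DEBUG: zuo, 0.824
[2024/06/18 20:29:04] ppocr DEBUG: 伯, 0.999
"""

detected, reported = count_detected_and_reported(sample_log)
if reported < detected:
    print(f"{detected - reported} detected box(es) missing from the results")
```

If the reported count is lower than the detected count, the missing text was recognized but removed afterwards, which points at a post-processing filter rather than at detection.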

@UserWangZz
Collaborator

--drop_score 0.4 (the default is 0.5)
You can set this parameter lower: the confidence for the last character, 撮, was below 0.5, so it was filtered out.
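
The filtering described above can be sketched in a few lines of Python. This is an illustration, not PaddleOCR's actual implementation: `filter_by_score` is a hypothetical helper, it assumes results at or above the threshold are kept, and the score used for 撮 is a made-up value below 0.5, since the thread does not state the real one.

```python
from typing import List, Tuple


def filter_by_score(
    results: List[Tuple[str, float]], drop_score: float = 0.5
) -> List[Tuple[str, float]]:
    """Keep only (text, score) pairs whose score is at least drop_score."""
    return [(text, score) for text, score in results if score >= drop_score]


# The first two scores come from the logs in this thread.
# The score for 撮 is hypothetical (the thread only says it is below 0.5).
results = [("捅", 0.831), ("伯", 0.999), ("撮", 0.45)]

kept_default = filter_by_score(results)        # default threshold of 0.5
kept_lowered = filter_by_score(results, 0.4)   # lowered, as with --drop_score 0.4

print([text for text, _ in kept_default])
print([text for text, _ in kept_lowered])
```

Lowering the threshold also lets other low-confidence results through, so it is worth checking the extra entries for misreads.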

@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Jun 19, 2024
@GreatV GreatV converted this issue into discussion #13124 Jun 19, 2024
