As the title says, here is a sample image of what I need to recognize:
![微信图片_20231205132817](https://private-user-images.githubusercontent.com/158746475/340362172-c3548e19-7d14-4efe-9ea1-902e73da0af3.jpg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA0MjEzNDIsIm5iZiI6MTcyMDQyMTA0MiwicGF0aCI6Ii8xNTg3NDY0NzUvMzQwMzYyMTcyLWMzNTQ4ZTE5LTdkMTQtNGVmZS05ZWExLTkwMmU3M2RhMGFmMy5qcGc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwOFQwNjQ0MDJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT05YzRiNDgyYWE5ZDE5NTlkNWI5YjNhNTEzNjFlY2Y4MGY5NDI5ZTlmMGJmMzVhN2M1N2IyNTUzNTg0ZTY0YTAyJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.qGK2D__zrtpcfTaL_JWJ7qEPH7Subtpa8Egndy4P-qo)
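------ What I have tried -------
The overall idea: take an existing model that can recognize Chinese and English, then fine-tune it on pinyin data.
I followed this reference project, 基于PaddleOCR的小学生手写汉语拼音识别 (PaddleOCR-based recognition of primary school students' handwritten Hanyu Pinyin). The pretrained model and configuration I used are shown below: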
![image](https://private-user-images.githubusercontent.com/158746475/340360706-7ecca546-eff9-4135-86b7-e7c909642677.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA0MjEzNDIsIm5iZiI6MTcyMDQyMTA0MiwicGF0aCI6Ii8xNTg3NDY0NzUvMzQwMzYwNzA2LTdlY2NhNTQ2LWVmZjktNDEzNS04NmI3LWU3YzkwOTY0MjY3Ny5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwOFQwNjQ0MDJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT02ZmFhNWE4MTIxNTE1Mzc0MzA4ODEyNjg1MTExMmVhMDZkZjNhNjY3NzcxZGE3NjM4MDJmNjMzZmM1MzBlYmE2JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.OGvIMzfDmkJ-bb-S6i4e2FxwZ-ao7O4RDu14Y4il78o)
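Apart from the required data paths, I left the config file essentially unchanged. The data comes from the reference project.
After training, I first tested on images from the training set, and they were all recognized correctly. New images, however, are not recognized correctly. For example: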
![yin - 副本](https://private-user-images.githubusercontent.com/158746475/340363972-91f2747b-c3a8-4fc5-810b-3ff6cd231cd1.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA0MjEzNDIsIm5iZiI6MTcyMDQyMTA0MiwicGF0aCI6Ii8xNTg3NDY0NzUvMzQwMzYzOTcyLTkxZjI3NDdiLWMzYTgtNGZjNS04MTBiLTNmZjZjZDIzMWNkMS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwOFQwNjQ0MDJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT00MGI2NTI2NTMxZDhmZTZjN2FmYjQ3Njg5NjU1Mzk4ZTMzOTNjMDljOGVjY2QxZjI3ZjI2N2IwNDZlMTZhOGRkJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.8KYXyVBcQflV8rGldaih9uWdigFqjD_cB_6_hnSqNaI)
![image](https://private-user-images.githubusercontent.com/158746475/340364156-7e821779-5964-4536-bb51-a5238e198d80.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA0MjEzNDIsIm5iZiI6MTcyMDQyMTA0MiwicGF0aCI6Ii8xNTg3NDY0NzUvMzQwMzY0MTU2LTdlODIxNzc5LTU5NjQtNDUzNi1iYjUxLWE1MjM4ZTE5OGQ4MC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwOFQwNjQ0MDJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1kOGFhMjQwYzY4MmNkMzdmZjU4YTY2ODkwYWNhODVhMmU3MTEyN2I1OGMzMTQ5OGI2YTBiODE4YWZlNmNiNjM2JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.nbVRPjiWccW9dPF_uVZbvxiiV3XcgGSY_uE7QXYJxvg)
The first image above is the infer_img input, and the second image is the output it produced.
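One way to tell whether this gap between training-set and new-image results is overfitting is to hold out part of the labeled data and compare accuracy on both parts. The following is a minimal sketch, assuming the label file uses PaddleOCR's usual one-sample-per-line `image_path<TAB>label` layout; `split_labels` and `exact_match_accuracy` are hypothetical helper names, not part of PaddleOCR, and the sample lines are made up for illustration.

```python
import random


def split_labels(lines, val_fraction=0.1, seed=0):
    """Shuffle label lines and split them into (train, val) lists.

    Each line is assumed to look like "image_path<TAB>label".
    """
    lines = [ln for ln in lines if ln.strip()]  # drop blank lines
    shuffled = lines[:]
    random.Random(seed).shuffle(shuffled)       # fixed seed: reproducible split
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def exact_match_accuracy(predictions, labels):
    """Fraction of samples whose predicted string equals the label exactly."""
    if not labels:
        return 0.0
    hits = sum(1 for p, t in zip(predictions, labels) if p == t)
    return hits / len(labels)


if __name__ == "__main__":
    # Made-up label lines standing in for the real label file.
    all_lines = [f"images/{i:04d}.png\tlabel{i}" for i in range(500)]
    train_lines, val_lines = split_labels(all_lines)
    print(len(train_lines), "training lines,", len(val_lines), "held-out lines")
```

If accuracy stays near 100% on the training part while the held-out part is much lower, that points toward overfitting; the two lists could then be written out as separate files for the `Train` and `Eval` dataset entries in the config.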
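------ My questions ------
1. Could overfitting be one cause of the behavior above? I used the default of 800 epochs, but the dataset has only 500 samples (so is 800 too many?).
2. Is this approach (take an existing model that recognizes Chinese and English, then fine-tune it on pinyin data) feasible? Is there a more suitable pretrained model, such as v3? Or is there a better way to do this altogether?
3. When I test a perfectly ordinary image with the official pretrained model, the result is also wrong. Why is that?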
python tools/infer_rec.py -c ch_PP-OCRv2_rec.yml -o Global.pretrained_model=pretrain_models/ch_PP-OCRv2_rec_slim_quant_train/best_accuracy Global.infer_img=./train_data/pinyin_image_data/self_data/beizhu.png
Output (the recognition result is different on every run):
![image](https://private-user-images.githubusercontent.com/158746475/340367792-8bdf68d8-b95c-49f8-88f6-fa1289f6948c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA0MjEzNDIsIm5iZiI6MTcyMDQyMTA0MiwicGF0aCI6Ii8xNTg3NDY0NzUvMzQwMzY3NzkyLThiZGY2OGQ4LWI5NWMtNDlmOC04OGY2LWZhMTI4OWY2OTQ4Yy5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA4JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwOFQwNjQ0MDJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT03NmE3ZWVjMGI1N2ZjMjViNGYxNjRjYzhhOWM2M2JlYzY2MmNlMjU2NWNjM2U1ZDczNGNhODBhMjYzYTkwZmE1JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.P3OlwgQRpDl7IbX7yVtGw2nPF2fn1U_8RRS0ajHMue0)
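4. For large blocks of text, such as the sample image at the top, the recognition result contains only a few scattered characters. Does the text detection step need some adjustment?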
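A note on question 4: `tools/infer_rec.py` runs only the recognition model, which works on a single cropped text line, so a full page would first have to be cut into lines. The sketch below illustrates that idea with a simple horizontal projection profile in plain Python. It is a toy stand-in, not PaddleOCR's learned text detector, and it assumes the page has already been binarized into rows of 0/1 values (1 = ink); `find_text_rows` and `crop_lines` are hypothetical names.

```python
def find_text_rows(binary, min_ink=1):
    """Return (top, bottom) row ranges that contain ink; bottom is exclusive.

    binary: list of rows, each a list of 0/1 values where 1 means ink.
    """
    bands = []
    start = None
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a text band begins here
        elif not has_ink and start is not None:
            bands.append((start, y))       # blank row closes the band
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(binary)))
    return bands


def crop_lines(binary):
    """Cut the page into one sub-image (list of rows) per detected text band."""
    return [binary[top:bottom] for top, bottom in find_text_rows(binary)]


if __name__ == "__main__":
    # Tiny made-up "page": two text bands separated by blank rows.
    page = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
    ]
    print(find_text_rows(page))
```

Each crop would then be passed to the recognizer separately. Handwriting with slanted or touching lines generally needs a proper detection model rather than this kind of projection.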
--------------- Any reply would be greatly appreciated! ----------------