Question about character balance in the training set #44

Open
moondaiy opened this issue Feb 13, 2018 · 1 comment

Comments

@moondaiy

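Hi OP, a quick question: have you counted how often each Chinese character appears in the training set? Is the distribution balanced?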

@moondaiy moondaiy changed the title from "Question about the training set" to "Question about character balance in the training set" on Feb 13, 2018
@bestzld

bestzld commented Jun 15, 2018

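It is probably not very balanced. I randomly extracted text lines from novels: some lines are recognized well, while others are recognized rather poorly. Use a script to count the character frequencies, then add more samples for the under-represented characters. Ideally, plan this around your scenario and its word frequencies; a corpus that targets your intended scenario is also very important.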
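For anyone who wants to try the counting step, here is a minimal sketch of such a script. It is not the repository's own tooling: the function names `char_frequencies` and `underrepresented`, the `min_count` threshold, and the sample lines are all hypothetical, and it assumes your labels are available as plain text lines.

```python
from collections import Counter


def char_frequencies(lines):
    """Count how often each non-whitespace character occurs across text lines."""
    counts = Counter()
    for line in lines:
        counts.update(ch for ch in line if not ch.isspace())
    return counts


def underrepresented(counts, alphabet, min_count):
    """Return (char, count) pairs for characters in `alphabet` seen fewer
    than `min_count` times, rarest first."""
    rare = [(ch, counts.get(ch, 0)) for ch in alphabet if counts.get(ch, 0) < min_count]
    return sorted(rare, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical label lines; replace with the text of your own training labels.
    sample = ["今天天气很好", "天气预报说明天下雨"]
    counts = char_frequencies(sample)
    print(counts.most_common(3))
    # Hypothetical target alphabet and threshold, chosen only for illustration.
    print(underrepresented(counts, "天气雨雪", min_count=2))
```

The characters it reports are the ones to supplement with extra samples, preferably drawn from a corpus that matches your target scenario.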
