Chinese font problem #8
Comments
Hi! Please include the Chinese words you were unable to print. Thanks.
Thank you for replying!
That would be because the font you used does not support all characters. I'll try to provide more fonts in the future. In the meantime, I know https://github.com/JarveeLee/SynthText_Chinese_version/tree/master/data/fonts has a lot of choices, but I cannot add them to this project because of copyright infringement concerns.
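To check in advance which characters a given font cannot render, the font's character map can be compared against the text to be generated. The following is a minimal sketch, not part of this project: it assumes the third-party `fontTools` package is installed, and the helper names `load_supported_codepoints` and `missing_characters` are made up for illustration. It also assumes a single-font `.ttf` or `.otf` file.

```python
from typing import Iterable, List, Set


def load_supported_codepoints(font_path: str) -> Set[int]:
    """Return the Unicode code points that the font file maps to a glyph."""
    # fontTools is a third-party package (pip install fonttools). It is
    # imported here so the pure helper below can be used without it.
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    try:
        # getBestCmap() returns {code point: glyph name}, or None when the
        # font has no Unicode character map.
        cmap = font.getBestCmap() or {}
        return set(cmap.keys())
    finally:
        font.close()


def missing_characters(text: str, supported: Iterable[int]) -> List[str]:
    """List the distinct non-space characters of `text` that have no glyph."""
    supported_set = set(supported)
    seen = set()
    missing = []
    for ch in text:
        if ch.isspace() or ch in seen:
            continue
        seen.add(ch)
        if ord(ch) not in supported_set:
            missing.append(ch)
    return missing
```

For example, `missing_characters(word, load_supported_codepoints("some_font.ttf"))` (the path is a placeholder) returns the characters of `word` that the font would fail to draw, so such words or fonts can be skipped before generating samples.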
Thank you very much!
@DLUTfangping, have you tried training on a dataset generated by this? I'm training a CRNN, but the results are not good.
@liangshuang1993 I can't say for Chinese, but I got decent results in English when using lowercase only (lowercase plus uppercase was a challenge). Also, while I don't know which implementation of CRNN you used, mine takes a long time to train (50+ hours on a GTX 1080 Ti), so it's very normal if the initial performance is very poor.
Hi @Belval, thanks for your answer. Maybe I generated the training data incorrectly. First I generated one dataset with a word length of 5, using Gaussian noise as the background. Performance was good on the training and validation sets but bad on some real pictures. Then I generated another dataset with a word length of 8, using the given pictures as the background, and trained CRNN on the combined dataset (dataset 1 plus dataset 2). I have trained for 7000 epochs, but the training accuracy is still 0.5590. The strange thing is that when I tested on the training data, I found it can barely recognize the words. Does this mean all of my data must have the same word length? Thanks a lot. By the way, each dataset has 500,000 pictures containing English, Chinese, and digits (they may appear in the same picture).
@liangshuang1993 Learning both English and Chinese is indeed a rather ambitious idea. Most implementations I know of only do one. But yes, the same word count is required. The original author even went as far as recognizing only single words instead of multi-word sentences.
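One way to see whether mixing the 5-character and 8-character datasets is what hurts accuracy is to break exact-match accuracy down by label length. This is only a diagnostic sketch, not something taken from any CRNN implementation: `accuracy_by_length` is a hypothetical helper, and it assumes predictions are already available as (ground truth, predicted text) string pairs.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_length(pairs: Iterable[Tuple[str, str]]) -> Dict[int, float]:
    """Exact-match accuracy grouped by the length of the ground-truth label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, predicted in pairs:
        n = len(truth)
        total[n] += 1
        if truth == predicted:
            correct[n] += 1
    # Sorted by length so the report reads from short labels to long ones.
    return {n: correct[n] / total[n] for n in sorted(total)}
```

If one length scores far lower than the other, the combined dataset is probably the issue; if both are equally poor, the cause more likely lies elsewhere, for example in the mixed English and Chinese character set.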
OK. Thanks a lot!
Hello, have you solved this problem? I ran into it too: the results on real samples are very poor.
Some Chinese fonts cannot generate good samples (for example, some characters could not be generated). Do you have any suggestions for solving the problem? Thank you in advance.