ValueError: '\xe2' is not in list error when i try to train with color parameter #110
Comments
This error means that your image labels contain a character that's not in the model charmap. Check your labels, because by default the supported charset is numbers + uppercase only (see attention-ocr/aocr/util/data_gen.py, line 23 in 7bb17af).
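One way to act on this advice is to scan the labels for characters outside the charset before building the dataset. The following is a minimal sketch, not the aocr code itself: `find_unsupported`, `DEFAULT_CHARSET`, and `PRINTABLE_ASCII` are hypothetical names, and the two charsets are assumptions based on the description above (digits plus uppercase, and the printable ASCII range). It also shows why a typographic apostrophe would surface as '\xe2': that is the first byte of its UTF-8 encoding, which lies outside ASCII.

```python
import string

# Assumption: default charset is digits + uppercase letters, as described above.
DEFAULT_CHARSET = set(string.digits + string.ascii_uppercase)
# Assumption: a "full ASCII" charset covers the printable range 32..126.
PRINTABLE_ASCII = {chr(i) for i in range(32, 127)}


def find_unsupported(labels, charset=DEFAULT_CHARSET):
    """Map each character missing from `charset` to the labels containing it."""
    offenders = {}
    for label in labels:
        for ch in label:
            if ch not in charset:
                offenders.setdefault(ch, set()).add(label)
    return offenders


# Example labels; the second one contains a typographic apostrophe (U+2019).
labels = ["HELLO", "DON\u2019T", "42"]
bad = find_unsupported(labels)
for ch, found_in in bad.items():
    # The first UTF-8 byte of U+2019 is 0xe2, matching the '\xe2' in the error.
    print(repr(ch), ch.encode("utf-8"), sorted(found_in))
```

Running the same check against `PRINTABLE_ASCII` still flags the apostrophe, since no ASCII-only charset can contain a character whose encoding starts with 0xe2.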
Thank you for the prompt response. I used the --full-ascii and --no-force-uppercase flags as well; I assumed --full-ascii covers all characters. Am I missing something?
Thank you for the reply. I am able to train the model now. However, I see a very weird issue. I added synthetic images (generated using GANs) and a few cropped COCO images to the training data, so the set is about 1M synthetic images, Synth90k, and about 60k COCO images. The training loss is improving, but when I test the model for prediction, it performs very poorly. It just prints "cccc" or "aaaaa" etc.
Everything looks correct, so I wouldn't know. It really depends on your dataset, the separation of training/testing data, and a whole bunch of other factors. This might, of course, be a fault in the code or the model itself. In that case, once you pin down the issue, please submit a detailed report or a PR. That'd be much appreciated if the aocr code is indeed the problem.
Hi @kulkarnivishal,
Not really. I continued the training for 3 more weeks, though, and the results look better, if still not great. The main issue I am facing is predicting symbols: no matter how I train, inference always seems to get them wrong.
You can try to manually change the dictionary instead of using keys. At the same time, I have no idea how you managed to add non-ASCII labeled targets to the train-set .tfrecords file. When I tried to do this, it threw an encoding error every time.
You mean manually adding symbols instead of using the --full-ascii flag?
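For illustration, a manually built dictionary could be derived from the labels themselves, so that every symbol in the training set gets an index. This is only a sketch under stated assumptions and not how aocr builds its charmap: `build_charmap`, `encode`, and the `num_reserved` offset (a guess at slots kept for special tokens) are all hypothetical.

```python
def build_charmap(labels, num_reserved=3):
    """Assign an index to every distinct character found in the labels.

    Assumption: the first `num_reserved` indices are kept free for special
    tokens (padding, start, end); adjust to match the real model.
    """
    chars = sorted({ch for label in labels for ch in label})
    return {ch: i + num_reserved for i, ch in enumerate(chars)}


def encode(label, charmap):
    """Convert a label to a list of indices; raises KeyError on unknown chars."""
    return [charmap[ch] for ch in label]


# Example: symbols such as '+' and '=' get indices alongside the letters.
charmap = build_charmap(["a+b", "b=c"])
encoded = encode("a+b", charmap)
print(charmap)
print(encoded)
```

Building the mapping from the data guarantees coverage of the training labels, at the cost of tying the model's output layer to that particular dataset.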
Hi @emedvedev,
I get this error after the first few steps when I train with the --color parameter (which sets channels to 3). Could you please help?
Please find the error log below: