Use TextRecognitionDataGenerator #4158

Open
DesBw opened this issue Nov 6, 2023 · 1 comment

Comments


DesBw commented Nov 6, 2023

Your Feature Request

The images generated by https://github.com/Belval/TextRecognitionDataGenerator look much closer to most real-world images than those generated by the text2image script.
It would be nice if Tesseract could use or support TextRecognitionDataGenerator.
Also, Belval/TextRecognitionDataGenerator#153 shows that the tool supports the Tesseract format.

Or, if someone has already found a way to use the two together, it would be nice to hear about it. TextRecognitionDataGenerator supports advanced distortion and a choice of backgrounds, so a model trained on those images could be more accurate.

@DesBw DesBw changed the title Move to TextRecognitionDataGenerator Use TextRecognitionDataGenerator Nov 6, 2023
Contributor

zdenop commented Jan 13, 2024

Not sure what exactly the problem is: AFAIK text2image is only one of the tools that can be used for Tesseract training. It is up to the trainer how the image and ground truth files are created/generated.
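
To illustrate the point that the trainer decides how images and ground truth are produced, here is a minimal sketch of pairing externally generated line images with ground truth files. It rests on two assumptions that are not stated in this thread: that the generator names each image `<text>_<index>.<ext>`, and that the training setup expects a `<name>.gt.txt` file next to each line image (the tesstrain-style layout). The helper names `label_from_filename` and `write_ground_truth` are hypothetical, and converting images to a format the training tools accept is left out.

```python
from pathlib import Path
import shutil


def label_from_filename(image_path: Path) -> str:
    """Recover the text label, ASSUMING names like '<text>_<index>.<ext>'."""
    stem = image_path.stem
    text, sep, index = stem.rpartition("_")
    # Fall back to the whole stem when there is no numeric '_<index>' suffix.
    if not sep or not index.isdigit():
        return stem
    return text


def write_ground_truth(src_dir: Path, dst_dir: Path, exts=(".png", ".tif")) -> int:
    """Copy line images into dst_dir and write '<name>.gt.txt' beside each one.

    Returns the number of image/ground-truth pairs written.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for image in sorted(src_dir.iterdir()):
        if image.suffix.lower() not in exts:
            continue  # skip anything that is not a supported line image
        # Use neutral, index-based names so label text with spaces or
        # punctuation does not end up in file names.
        name = f"line_{count:06d}"
        shutil.copy(image, dst_dir / f"{name}{image.suffix.lower()}")
        gt_path = dst_dir / f"{name}.gt.txt"
        gt_path.write_text(label_from_filename(image) + "\n", encoding="utf-8")
        count += 1
    return count
```

Calling `write_ground_truth(Path("generated"), Path("ground-truth"))` would then fill `ground-truth` with image and `.gt.txt` pairs; both directory names are placeholders.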
