OCR and Font detection, which of these three is better approach? #40
Comments
I’m still a noob when it comes to ML, but for fonts I think a whole new network would be overkill. We could use an existing tool like https://github.com/Vasile-Peste/Typefont, which uses Tesseract, and integrate it into our project somehow.
This is a great next step. I'd go for option 2. As this project scales, I like the idea of having niche models, e.g. one for layout, one for font and text, and one for animations, etc. Then have an integration pipeline that fits everything together. This makes it more modular and easier to collaborate on. The difficult aspect of text and font recognition is inserting it into the HTML. Here's what I'd start with:
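As an illustration of the modular design described in the comment above (separate niche models joined by an integration step that writes the HTML), here is a minimal Python sketch. Everything in it is hypothetical: `TextRegion`, `run_pipeline`, `to_html`, and the two stub models are made-up names standing in for trained networks, and absolutely positioned `<span>` elements are just one assumed way to insert recognized text into the page.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, List


@dataclass
class TextRegion:
    """One block of text located in a screenshot (hypothetical structure)."""
    x: int
    y: int
    text: str = ""
    font: str = "sans-serif"


# Each niche model is modelled as a plain callable so it can be swapped out.
LayoutModel = Callable[[object], List[TextRegion]]
TextFontModel = Callable[[object, TextRegion], TextRegion]


def to_html(regions: List[TextRegion]) -> str:
    """Integration step: place each recognized text block into the HTML."""
    parts = []
    for r in regions:
        style = f"position:absolute;left:{r.x}px;top:{r.y}px;font-family:{r.font}"
        parts.append(f'<span style="{escape(style)}">{escape(r.text)}</span>')
    return "<div>" + "".join(parts) + "</div>"


def run_pipeline(image, layout_model: LayoutModel,
                 text_font_model: TextFontModel) -> str:
    """Run layout detection, then text/font recognition, then HTML assembly."""
    regions = layout_model(image)
    filled = [text_font_model(image, region) for region in regions]
    return to_html(filled)


# Stub models used only to exercise the pipeline; real ones would wrap
# a layout network and an OCR/font classifier.
def fake_layout_model(image) -> List[TextRegion]:
    return [TextRegion(x=10, y=20), TextRegion(x=10, y=60)]


def fake_text_font_model(image, region: TextRegion) -> TextRegion:
    if region.y < 40:
        return TextRegion(region.x, region.y, "Title", "Georgia")
    return TextRegion(region.x, region.y, "Body & more", "Arial")


html = run_pipeline(None, fake_layout_model, fake_text_font_model)
print(html)
```

Because each stage only depends on the `TextRegion` structure, a Tesseract-based tool or a custom Keras model could be dropped in behind `text_font_model` without touching the layout or HTML code.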
Ok, thank you.
I want to train the model on font detection and OCR using the links below, but I'm not sure which of the 4 options is the best way to do it:
OCR
https://github.com/Tony607/keras-image-ocr/blob/master/image-ocr.ipynb
https://mc.ai/how-to-train-a-keras-model-to-recognize-text-with-variable-length/
Fonts:
https://tsprojectsblog.wordpress.com/2017/08/19/using-a-neuronal-network-for-font-character-detection-in-images/
https://tanmayshah2015.wordpress.com/2015/12/01/synthetic-font-dataset-generation/