OCR and Font detection, which of these three is better approach? #40

Closed
jonoNel opened this issue Sep 3, 2019 · 3 comments

Comments

@jonoNel

jonoNel commented Sep 3, 2019

I want to train the model on font detection and OCR using the links below, but I'm not sure which of these three options is the best way to do it:

  1. Train on top of the existing model, i.e. add the new data to the existing dataset?
  2. Train the networks independently but combine their outputs, like an ensemble model?
  3. Build a brand-new neural network using the logic and algorithms of the other neural networks?

OCR
https://github.com/Tony607/keras-image-ocr/blob/master/image-ocr.ipynb
https://mc.ai/how-to-train-a-keras-model-to-recognize-text-with-variable-length/

Fonts:
https://tsprojectsblog.wordpress.com/2017/08/19/using-a-neuronal-network-for-font-character-detection-in-images/
https://tanmayshah2015.wordpress.com/2015/12/01/synthetic-font-dataset-generation/

@PaulGwamanda

PaulGwamanda commented Sep 3, 2019

I’m still a noob when it comes to ML, but for fonts I think a whole new network would be overkill. We could use an existing tool like https://github.com/Vasile-Peste/Typefont, which uses Tesseract, and integrate it into our project somehow.

@emilwallner
Owner

This is a great next step. I'd go for option 2.

As this project scales, I like the idea of having niche models, e.g. one for layout, one for font and text, and one for animations, etc. Then have an integration pipeline that fits everything together. This makes it more modular and easier to collaborate on.
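To make the modular idea concrete, here is a minimal sketch of what such an integration pipeline could look like. Everything in it is an assumption for illustration: `run_pipeline`, `layout_model`, and `text_model` are hypothetical names, and the two "models" are stand-in functions rather than trained networks.

```python
from typing import Callable, Dict

# Each niche model is treated as a callable that takes a screenshot path
# and returns its own partial result as a dict (an assumed interface).
Model = Callable[[str], dict]


def run_pipeline(screenshot: str, models: Dict[str, Model]) -> Dict[str, dict]:
    """Run every independent model and collect its output under its name."""
    return {name: model(screenshot) for name, model in models.items()}


# Hypothetical stand-ins for separately trained networks.
def layout_model(path: str) -> dict:
    return {"html": '<div id="d1">[TEXT]</div>'}


def text_model(path: str) -> dict:
    return {"snippets": ["Hello"]}


result = run_pipeline("page.png", {"layout": layout_model, "text": text_model})
```

Because each model only has to honor the shared interface, one of them can be retrained or swapped out without touching the others.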

The difficult aspect of text and font recognition is inserting it into the HTML. Here's what I'd start with:

  1. Try finding an existing model that extracts the text in a page, separates it by area, and then finds the font associated with each area. (I'd probably skip the font to start with to narrow down the problem, then eventually add at most 10-20 fonts.)
  2. Input training data:
    a) The screenshot including the correct text
    b) The HTML with unique div tags and a placeholder for the text
    c) One of the text snippets and a potential font tag.
    Output: The unique div tag that corresponds to the text snippet in c.
  3. Write a script that extracts all the text/fonts using existing OCR, makes a prediction for each text snippet and inserts it into the HTML.
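Steps 2 and 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Sample` fields, the `[TEXT]` placeholder token, the `<div id="...">` layout, and the `predict_div` callable (standing in for the trained model) are all hypothetical, and the snippets are hard-coded where a real script would take them from an existing OCR tool.

```python
import re
from dataclasses import dataclass
from html import escape
from typing import Callable, List, Optional

PLACEHOLDER = "[TEXT]"  # assumed placeholder token inside each div


@dataclass
class Sample:
    """One training example, following inputs a) to c) and the output."""
    screenshot_path: str   # a) screenshot including the correct text
    html: str              # b) HTML with unique div ids and placeholders
    snippet: str           # c) one text snippet
    font: Optional[str]    # c) optional font tag
    target_div_id: str     # output: the div the snippet belongs to


def insert_text(html: str, snippets: List[str],
                predict_div: Callable[[str, str], str]) -> str:
    """Ask the model for a div id per snippet and fill that div's placeholder."""
    for snippet in snippets:
        div_id = predict_div(html, snippet)
        pattern = re.compile(
            r'(<div id="%s">)%s(</div>)'
            % (re.escape(div_id), re.escape(PLACEHOLDER))
        )
        # A function replacement avoids backslash handling in the snippet.
        html = pattern.sub(
            lambda m, s=snippet: m.group(1) + escape(s) + m.group(2),
            html,
            count=1,
        )
    return html


# Usage with a lookup table standing in for the trained model.
page = '<div id="d1">[TEXT]</div><div id="d2">[TEXT]</div>'
fake_model = {"Sign up": "d2", "Tom & Jerry": "d1"}
filled = insert_text(page, ["Sign up", "Tom & Jerry"],
                     lambda html, s: fake_model[s])
```

Passing the predictor in as a callable keeps the insertion script independent of whichever OCR tool and model end up being used.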

@jonoNel
Author

jonoNel commented Sep 6, 2019

Ok thank you
