
I got poor results on my own screenshots #12

Closed
NHZlX opened this issue Apr 26, 2018 · 3 comments


NHZlX commented Apr 26, 2018

I used screenshots taken on my computer to capture formulas from papers, but none of them could be recognized. Is there any special preprocessing required for such images?

Collaborator

da03 commented Apr 26, 2018

Are you using the provided model to translate a screenshot? If so, the result is not surprising, because neural networks are very domain-specific: at training time the model only saw images with a single font size, so at test time, if the font size or style differs, it will probably fail. For recognizing screenshots, resizing the image so the font size is approximately the same as in the training set may give better results, but I strongly recommend training a new model for your task.
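A minimal sketch of that resizing idea, in pure Python. The `scaled_size` helper and the 20 px training font height are hypothetical; the real target height should be measured from the provided dataset images:

```python
def scaled_size(size, current_font_px, target_font_px):
    """Compute a new (width, height) so that formula glyphs in the
    screenshot land near an assumed training-set font height.
    target_font_px is a guess; measure it from the dataset images."""
    scale = target_font_px / float(current_font_px)
    return (int(round(size[0] * scale)), int(round(size[1] * scale)))

# A 1200x400 screenshot whose formulas render at ~30 px, rescaled
# toward an assumed ~20 px training font height:
print(scaled_size((1200, 400), 30, 20))  # (800, 267)
```

The computed size can then be fed to an actual resampler, e.g. Pillow's `Image.resize`, before running the model.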


GohioAC commented Jul 20, 2018

I'm trying to translate equations extracted from PDF images but am facing the same problems, probably due to font size and style. Do you think that simply rendering the equations in different fonts and styles and retraining the model might help? Also, it would be a big help if you could provide the LaTeX code you used to generate the equation images with transparent backgrounds.

Collaborator

Miffyli commented Jul 20, 2018

@GohioAC

See this repo for the tools used to generate the images for these experiments: https://github.com/Miffyli/im2latex-dataset (and see this issue for an explanation of how to change the rendering setup: Miffyli/im2latex-dataset#8). For transparency you have to find a proper tool for converting PDFs into rasterized images. The original code uses ImageMagick's convert, which has a `-transparent <color>` option, but I do not know how well it works in this situation.

Generally, augmenting the dataset with different fonts, sizes, noise levels, backgrounds, perspectives, etc. should help create more robust models, so it is worth a shot!
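One such augmentation pass could be sketched as follows (NumPy only; `augment`, the noise level, and the scale range are illustrative placeholders, not the actual im2latex-dataset tooling):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, noise_std=10.0, scale_range=(0.8, 1.2)):
    """One simple augmentation pass: random rescale plus pixel noise.
    Parameters are illustrative, not tuned for any real dataset."""
    scale = rng.uniform(*scale_range)
    h, w = img.shape
    new_h, new_w = max(1, int(h * scale)), max(1, int(w * scale))
    # Nearest-neighbour resize via index sampling (keeps the sketch
    # dependency-free; a real pipeline would use a proper resampler).
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    resized = img[rows][:, cols].astype(float)
    noisy = resized + rng.normal(0.0, noise_std, resized.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

img = np.full((40, 120), 255, dtype=np.uint8)  # blank white "formula" image
aug = augment(img)
print(aug.shape, aug.dtype)
```

Stacking several such passes (font/style variation at render time, then noise, backgrounds, and perspective warps on the rasterized images) is what usually builds the robustness mentioned above.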

@NHZlX NHZlX closed this as completed Jul 20, 2018