Bounding boxes are displaced from math regions #3
Comments
Did you try other PDFs?
Yes, I downloaded these files: Other files are unavailable. The bounding boxes are placed on the math regions correctly only for Borcherds86.pdf and Cline88.pdf. For the other files, the bounding boxes are fully displaced.
Which version of pdf2image are you using? I think I used the following version - Name: pdf2image
Many PDF links are not available.
The answer to questions:
Hi @VladimirKalachikhin,
Hi @MaliParag, could you please share your image dataset with us?
I used the MARMOT dataset, see above.
Hi @VladimirKalachikhin,
I don't quite understand you. MARMOT is just another dataset. I created a simple tool to convert MARMOT to an ICDAR-compatible format so that the ICDAR tools can be used with it.
Thank you for your reply. I understand what you mean.
I get the data from this.
NOTE: If you find the bounding boxes are displaced from math regions, it is because the document image that you have rendered is of a different size than the one used while annotating. datasetV2 provides file sizes for each image. Resize the image that you have rendered to the size provided in datasetV2 and you should be able to use the annotations.
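For anyone who would rather keep their own rendering, the same correction can be applied to the annotations instead of the image: scale each box by the ratio between the rendered size and the annotated size. The snippet below is only a sketch of that arithmetic. The function names and the (x_min, y_min, x_max, y_max) box layout are assumptions for illustration, not something taken from the repository.

```python
def scale_box(box, annotated_size, rendered_size):
    """Map a box from the annotated image's pixel grid to the rendered image's.

    box            -- (x_min, y_min, x_max, y_max); this layout is an assumption
    annotated_size -- (width, height) of the image used while annotating
    rendered_size  -- (width, height) of the image you rendered yourself
    """
    ann_w, ann_h = annotated_size
    ren_w, ren_h = rendered_size
    sx = ren_w / ann_w  # horizontal scale factor
    sy = ren_h / ann_h  # vertical scale factor
    x_min, y_min, x_max, y_max = box
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


def same_aspect_ratio(annotated_size, rendered_size, tolerance=0.01):
    """Return True if both sizes have (nearly) the same width/height ratio.

    If this is False, a plain resize cannot line the boxes up, because the
    two images probably differ by cropping or margins rather than by scale.
    """
    ann_ratio = annotated_size[0] / annotated_size[1]
    ren_ratio = rendered_size[0] / rendered_size[1]
    return abs(ann_ratio - ren_ratio) <= tolerance * ann_ratio


if __name__ == "__main__":
    # Annotated at 1000x2000, rendered at half that resolution.
    print(scale_box((100, 200, 300, 400), (1000, 2000), (500, 1000)))
```

Checking the aspect ratio first is a cheap way to tell a pure scaling problem apart from a different page crop, which no amount of resizing will fix.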
I know.
Yes, I rendered the images to the sizes from the file_sizes file, but the bounding boxes are fully displaced. I see that page numbering in the math_gt .csv files starts from 0, but convert_pdf_to_image.py creates pages starting from 1. Also, convert_pdf_to_image.py creates images whose sizes differ from those in file_sizes. I made my own convert_pdf_to_image, which renders images at the correct sizes. I tried starting the numbering from 0 and from 1. Nothing changed. I tried http://aif.centre-mersenne.org/article/AIF_1970__20_1_493_0.pdf as AIF_1970_493_498.pdf
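To rule out the page-numbering mismatch on its own, the annotations can be grouped by the page number the image files use. This is a rough sketch, assuming each math_gt row is `page, x_min, y_min, x_max, y_max` with a 0-based page index (I have not confirmed the column layout), and `boxes_by_image_page` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict


def boxes_by_image_page(csv_text, page_offset=1):
    """Group annotation boxes by the page number used in the image file names.

    csv_text    -- contents of a math_gt .csv file; assumed row layout is
                   page, x_min, y_min, x_max, y_max with a 0-based page index
    page_offset -- added to the CSV page index; 1 maps 0-based CSV pages to
                   1-based image pages, 0 leaves the index unchanged
    """
    boxes = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        page = int(float(row[0]))  # tolerate "0" as well as "0.0"
        coords = tuple(float(value) for value in row[1:5])
        boxes[page + page_offset].append(coords)
    return dict(boxes)


if __name__ == "__main__":
    sample = "0,10,20,30,40\n1,5,6,7,8\n"
    for page, page_boxes in sorted(boxes_by_image_page(sample).items()):
        print(page, page_boxes)
```

Running it once with `page_offset=1` and once with `page_offset=0` and drawing the result on the same page makes it easy to see whether the displacement comes from numbering or from image size.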