How to always read left to right? #80
Comments
It's not clear what you mean by '...seems to read from top to bottom'.
@amitdo thanks for your response. The raw result:
So as you can see it seems to go from the top to the bottom for a first 'row', then again for the next part etc. Ocropy should always read from left to right so that I know in which order the text is coming from the image. The commands (to give you a raw idea):
Also, performance is quite poor: it takes 55 seconds to process one A4 image and extract all the text. Am I missing something, or can I optimize this?
Well, the layout analysis phase (ocropus-gpageseg) does not do a very good job with this document. In general, the current layout analysis in ocropy is too basic. Have you tried Tesseract?
@amitdo I did try Tesseract before, but I got very bad results. So the results seem to be good, but I have the exact same problem: the text is scrambled and I have no idea about the order (at least not programmatically). Thanks for the help!
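One way to impose an order programmatically is to sort the recognized lines by their bounding boxes after the fact. The sketch below is not part of ocropy; `reading_order` and `row_tolerance` are hypothetical names, and it assumes each line is available as an `(x0, y0, x1, y1)` box with the origin at the top-left of the page. It groups boxes whose vertical centers are close into one row, then reads each row left to right.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), origin assumed at top-left


def reading_order(boxes: List[Box], row_tolerance: float = 0.5) -> List[Box]:
    """Group boxes into rows by vertical center, then sort each row left to right."""
    rows: List[List[Box]] = []
    # Visit boxes from top to bottom by vertical center.
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        center = (box[1] + box[3]) / 2
        height = box[3] - box[1]
        if rows:
            last = rows[-1]
            row_center = sum((b[1] + b[3]) / 2 for b in last) / len(last)
            # Same row if the centers differ by less than a fraction of the height.
            if abs(center - row_center) <= row_tolerance * height:
                last.append(box)
                continue
        rows.append([box])
    # Flatten: rows top to bottom, boxes inside a row left to right.
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


boxes = [(300, 10, 500, 40), (10, 12, 200, 42), (10, 60, 200, 90), (300, 58, 500, 88)]
ordered = reading_order(boxes)
```

Note that strict row-wise ordering interleaves the columns of a genuine multi-column page, so this only helps when the page really should be read across its full width.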
About the use of a dictionary: you can search for the phrase 'ocr language model' on Google.
@Yenthe666 Yes, ocropy is written in NumPy and only uses one core, because the image processing and numerical libraries in NumPy are generally single core only and because Python itself has very limited threading. What's been happening is:
@amitdo We have had good experiences with LSTM-based language models for OCR correction; CLSTM allows you to implement those.
@zuphilip I can't seem to find anything regarding
I assume that the bbox values are the position? I saw the following line in the code:
@amitdo thanks for that link, I'll be reading that for sure!
@tmbdev I honestly think that is a major drawback of this library then. The ability to use multiple cores would speed up this library by a lot!
@tmbdev I was wondering, have you ever considered building the HTML exporter to re-create the same layout as the original PDFs? Looking at the code, the lines are matched to x0, y0, x1 and y1, but could we somehow map this into an HTML document that has approximately the same layout as the original image / PDF?
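On the bbox question: in the hOCR format, the `bbox` property in an element's `title` attribute holds four integers, `x0 y0 x1 y1`, giving the top-left and bottom-right corners in image pixels. A minimal sketch for pulling them out of an hOCR string follows; `extract_bboxes` is a hypothetical helper, not an ocropy function, and a regular expression is used here only to keep the example dependency-free.

```python
import re
from typing import List, Tuple

# Matches the hOCR "bbox x0 y0 x1 y1" property inside a title attribute.
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


def extract_bboxes(hocr: str) -> List[Tuple[int, ...]]:
    """Return every (x0, y0, x1, y1) box found in the hOCR text, in document order."""
    return [tuple(int(v) for v in m.groups()) for m in BBOX_RE.finditer(hocr)]


sample = (
    '<span class="ocr_line" title="bbox 10 20 300 45">first line</span>\n'
    '<span class="ocr_line" title="bbox 10 60 280 85">second line</span>'
)
found = extract_bboxes(sample)
```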
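A rough way to approximate the original layout is to place each recognized line in an absolutely positioned `div` at its bounding box. The sketch below is an illustration, not ocropy's exporter: `layout_html` is a hypothetical function, the page size is assumed to be known in pixels, and the font size is guessed as a fraction of the line height.

```python
from html import escape
from typing import List, Tuple

Line = Tuple[Tuple[int, int, int, int], str]  # ((x0, y0, x1, y1), text)


def layout_html(lines: List[Line], page_width: int, page_height: int) -> str:
    """Emit one absolutely positioned div per line inside a page-sized container."""
    parts = [
        f'<div style="position:relative;width:{page_width}px;height:{page_height}px">'
    ]
    for (x0, y0, x1, y1), text in lines:
        # Font size is a guess: 80% of the box height, at least 1px.
        font_px = max(1, int((y1 - y0) * 0.8))
        style = (
            f"position:absolute;left:{x0}px;top:{y0}px;"
            f"width:{x1 - x0}px;height:{y1 - y0}px;"
            f"font-size:{font_px}px;white-space:nowrap"
        )
        parts.append(f'  <div style="{style}">{escape(text)}</div>')
    parts.append("</div>")
    return "\n".join(parts)


page = layout_html([((10, 20, 110, 40), "a < b")], page_width=200, page_height=100)
```

The result only matches the source approximately, since fonts and character widths in the browser differ from those in the scanned image.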
hOCR document with two columns (https://github.com/tmbdev/ocropy/blob/master/tests/testpage.png):
@avr248 how do you transform the hOCR HTML back into the original layout?
@tmbdev I am trying to use ocropus-gpageseg to segment some images to prepare a dataset for LSTM training, but I found that the line images it outputs are not always correct. Sometimes information is lost in the output line file. For example, in the original image the line is "15 September 2010" and the text is underlined, but in the output of ocropus-gpageseg the image reads "15 Se tember 2010" and "Se tember" is not underlined. Is this problem caused by ocropus-gpageseg? Is there any solution? Thank you!
This issue is now diverging into several very different angles:
I am closing this issue here. If you want to continue any of the discussions from here, please open a new issue.
Hi guys,
I've been developing a bit with Ocropy, but it sometimes seems to read from top to bottom. I'd like it to always read from left to right, no matter what.
Does anybody have any clue how to do this?
P.S.: my apologies for creating an issue for this.