I have a set of files from which I've extracted the text in two forms, line by line and as a single coherent blob, and I'm trying to understand the best practice for feeding that text to ocropy-linegen.
One example is lines 5-8 of the document (reproduced below):
(Examples taken from this pdf - https://www.dropbox.com/s/6sy77shnro7sqdf/6.pdf?dl=0)
Here, I could feed that whole blob to ocropy-linegen or I could feed it line by line:
I get the sense that the latter is what it expects. Is that right?
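To make the line-by-line option concrete, here is a minimal sketch of turning an extracted blob into a text file with one line per row. It assumes, without confirmation from the tool's documentation, that ocropy-linegen treats each line of its input text file as one line to render. The helper names `blob_to_lines` and `write_line_file` and the sample text are hypothetical placeholders, not content from the PDF.

```python
from pathlib import Path
from typing import List


def blob_to_lines(blob: str) -> List[str]:
    """Split a text blob into stripped, non-empty lines."""
    return [line.strip() for line in blob.splitlines() if line.strip()]


def write_line_file(lines: List[str], path: str) -> None:
    """Write one text line per row (assumed input layout for line generation)."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


# Placeholder text, NOT taken from the PDF in the question.
sample_blob = (
    "first line of a paragraph\n"
    "second line of the same paragraph\n"
    "\n"
    "third line\n"
)
sample_lines = blob_to_lines(sample_blob)
```

Calling `write_line_file(sample_lines, "lines.txt")` would then produce a file that can be handed over as the text input, if the one-line-per-row assumption holds.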
For another example, see the table further down on that page. The second row is:
Does ocropy-linegen want the full row as a single line, the full row with its original spacing preserved, or each cell individually?
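For reference, the three candidate forms can be sketched as follows. This is only an illustration of what I mean by each option, not a statement of what the tool expects. The function `row_variants` is hypothetical, the sample row is a placeholder rather than the actual table row, and treating runs of two or more spaces as cell separators is an assumption about how the extracted text is laid out.

```python
import re
from typing import Dict, List, Union


def row_variants(row: str) -> Dict[str, Union[str, List[str]]]:
    """Return the three candidate inputs for one extracted table row."""
    return {
        # Full row exactly as extracted, column spacing preserved.
        "row_with_spacing": row,
        # Full row with every run of whitespace collapsed to one space.
        "row_collapsed": " ".join(row.split()),
        # Individual cells; assumes cells are separated by 2+ spaces.
        "cells": re.split(r"\s{2,}", row.strip()),
    }


# Placeholder row, NOT the row from the PDF.
sample_row = "Alpha  12   Beta Gamma"
variants = row_variants(sample_row)
```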
Thanks.