Optical character recognition (OCR) of text in freeform fields #15
Oh, I somehow expected this to come up at some point ... thanks for pointing out gamera, I did not know about it.
Some thoughts of what needs to be done for this:
Gamera seems like a good starting point, but getting this whole thing working is quite a big chunk of work. I have no idea when I can work on this myself, or even how much time I can spend on it ...
On Feb 23, 2013, at 10:09, Benjamin Berg email@example.com wrote:
I did some work on the branch to interface with gamera. I don't seem to entirely understand it right now, but there is hope :-)
It seems to me that the default grouping doesn't work, but SDAPS can do grouping by itself. Also, for a start it seems easier to do the training using the gamera_gui program instead of something custom.
Important next steps:
Fun side fact: gamera seems to store the image and its original location in the XML file for each character; I guess some munging could be necessary for privacy reasons if one wants to share the training data. Otherwise the original strings could be rebuilt from the training data.
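A rough sketch of what such munging might look like, in Python. It rests on assumptions that are not verified against gamera's actual file format: that each character is a `glyph` element and that its page position lives in `ulx`/`uly` attributes. The `anonymize` helper is hypothetical. It blanks the positions and shuffles the glyph order so that neither can be used to put the characters back in sequence.

```python
import random
import xml.etree.ElementTree as ET

# Assumption: glyphs are <glyph> elements and their page position is stored
# in these attributes. Adjust both to whatever the real training file uses.
GLYPH_TAG = "glyph"
POSITION_ATTRS = ("ulx", "uly")


def anonymize(xml_text, seed=None):
    """Return the XML with glyph positions zeroed and glyph order shuffled."""
    root = ET.fromstring(xml_text)
    rng = random.Random(seed)

    # Collect the elements first so the tree is not mutated while iterating.
    for parent in list(root.iter()):
        glyphs = [child for child in parent if child.tag == GLYPH_TAG]
        if not glyphs:
            continue

        # Drop the information about where on the page each glyph was found.
        for glyph in glyphs:
            for attr in POSITION_ATTRS:
                if attr in glyph.attrib:
                    glyph.set(attr, "0")

        # Reorder the glyphs so document order no longer spells the strings.
        others = [child for child in parent if child.tag != GLYPH_TAG]
        rng.shuffle(glyphs)
        for child in list(parent):
            parent.remove(child)
        parent.extend(others + glyphs)

    return ET.tostring(root, encoding="unicode")
```

This only removes position and ordering; the character images themselves stay in the file, so whether that is enough for a given survey is still a judgment call.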
It is hard to say how much work it is overall. I expect that there is a
You might also want to talk to Matthew Roy, see
I'll think about the matter some more the next days/week (i.e. how much
I would be very interested in your evaluation, please keep me informed.
SDAPS seems to be a very good starting point for building a complete solution to meet the needs of one of my customers, but I absolutely need OCR in addition to OMR for it to be complete.
Another option could be to take over the code in the ocr branch and try to finish it, but that is certainly not the best choice in terms of time and money in my context.