Hi there - I am looking into parsing laboratory test results (unfortunately, results are often received as PDFs), and performance seems to be great except in one very specific context: a report that I'm looking at contains a critical element with white text on a black background. In this case the text is either not detected or read incorrectly. I'm a bit limited in what I can share, so this is lacking context, but for example, failure to detect text:
Incorrect results:
Any suggestions on settings or pre-processing strategies that might help?
Thanks a lot!
This is a really interesting edge case. I think the challenge is the "mostly regular text with some inverted". Some ideas:
Fine-tune the text detection model with negative examples
Flood fill (I think it's called flood fill) from a corner with black (which will just leave the number 36 white), then invert colors and do OCR. Then OCR the normal page. Merge the two results by just blanking out any regions in the normal page where the inverted page has text.
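To make the flood-fill idea concrete, here is a minimal sketch in plain Python. It assumes the page has already been binarized into a grid of 1 (white) and 0 (black) and that the top-left corner is page background; the function names are hypothetical, and this is not anything from the OCR library itself. On a real image you would more likely use an image library's flood fill (for example OpenCV's `floodFill`) instead of this loop.

```python
from collections import deque

WHITE, BLACK = 1, 0


def flood_fill_black(grid, start=(0, 0)):
    """Return a copy of grid with the white region connected to start painted black.

    Uses 4-connectivity. If the start pixel is not white, the copy is returned unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # do not modify the caller's grid
    r0, c0 = start
    if out[r0][c0] != WHITE:
        return out
    out[r0][c0] = BLACK
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and out[nr][nc] == WHITE:
                out[nr][nc] = BLACK
                queue.append((nr, nc))
    return out


def invert(grid):
    """Swap black and white pixels."""
    return [[WHITE if v == BLACK else BLACK for v in row] for row in grid]


def isolate_inverted_text(grid):
    """Flood fill the page background with black, then invert.

    White pixels enclosed by black (such as white text inside a black box)
    come out as black pixels on a white page, ready for a second OCR pass.
    """
    return invert(flood_fill_black(grid))


# Toy page: a black box (0) on a white page (1) with three white "text" pixels inside.
page = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
inverted_only = isolate_inverted_text(page)
```

One caveat with this sketch: any white area fully enclosed by black survives the fill, so the closed loops of ordinary letters (the inside of an "o", for instance) will also show up as small specks in the second image. They may need to be filtered out, for example by size, before merging the two OCR results.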
Thanks a lot for the suggestions - I'd love to give the fine-tuning approach a shot, but I'm not sure where to start. I know it's a big topic, but can you suggest a) a general resource describing how I would go about fine-tuning the text detection model (e.g., an overview of the process, how many examples you think might be sufficient, and whether I should provide examples cropped to the white-on-black text or examples in context); b) in the context of this project, where the model is specified (I assume it downloads a model from Hugging Face, but I can't seem to find where this configuration is located), and how I would update the configuration to refer to the fine-tuned model. I'd certainly be happy to document the process for anyone else with a need for something similar.