Improving OCR recognition #8

Open
madmalkav opened this issue Oct 23, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@madmalkav

Are there any options I can tweak to try to improve recognition? For example, this text:
test

is read as:

「 お しゝ ! 邊つかったか ? .

If I remove the furigana from the selection (not very convenient for multiline text), I get:

「 おいしい ! 上幅つかっ たか ? 」

@madmalkav
Author

I have been testing alternatives to Tesseract, and Easy OCR seems to do a much better job at recognition (though it messes up the output format a little when there is furigana; see JaidedAI/EasyOCR#575).

I have barely any coding experience, but I'm looking into forking the project to try to add support for backends other than Tesseract. I will report back if I manage to do anything useful.

@kamui-fin
Owner

Improving OCR accuracy is definitely an ongoing goal of this project. Supporting alternative backends does sound interesting; however, Easy OCR appears to be Python-only. I think a better option would be to focus development on fine-tuning Tesseract to recognize text better, along with some extra text processing. One of the first steps would be a post-processing stage that replaces commonly misrecognized characters with the expected ones, similar to how Kaku does it. Another thing to look into is further training the models on fonts that are commonly misread. I'm open to any contributions or ideas, so feel free to share your findings.
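
A minimal sketch of such a post-processing stage, written in Python for brevity rather than in the project's own C++. The function name `clean_ocr_text` and the substitution table are hypothetical. The entries come from artifacts visible in the outputs quoted in this thread (stray `|`, half-width `!` and `?`, spaces between characters), not from Kaku's actual rules.

```python
import re

# Hypothetical substitution table, for illustration only.
# Keys are characters the OCR tends to emit; values are what we assume was meant.
SUBSTITUTIONS = {
    "|": "",    # stray vertical bars between characters
    "!": "!",  # half-width exclamation mark -> full-width
    "?": "?",  # half-width question mark -> full-width
}

def clean_ocr_text(text: str) -> str:
    """Apply the substitution table, then drop spaces inside each line."""
    for wrong, right in SUBSTITUTIONS.items():
        text = text.replace(wrong, right)
    # Japanese text has no inter-word spaces, so spaces and tabs are assumed
    # to be OCR noise. Newlines are kept. This would damage embedded
    # English phrases, so a real implementation would need to be smarter.
    return re.sub(r"[ \t\u3000]+", "", text)

print(clean_ocr_text("「 おいしい ! 上幅つかっ たか ? 」"))
# prints: 「おいしい!上幅つかったか?」
```

A table like this only fixes formatting noise. Wrong characters such as 上幅 above would need either context-aware rules or a better model.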

@wildwestrom
Contributor

wildwestrom commented Apr 6, 2022

So here's one problem I found while OCRing Steins;Gate.
As you can see, this is the image that comes out of processing. Kurisu's lab coat is visible in the image and, as a consequence, messes up the OCR.
gazou-ocr-processed-img
Output text: 「しかも、完全ではないけけど、タイムトラへし老殿玖さぜてるってこ と|になるわね、これ」盆 「・世 で アー

When I change the Otsu Score Fraction to anything greater than or equal to 0.1, this problem is nearly eliminated.
Here's what it looks like at 0.1.
gazou-ocr-processed-img
Output text: 「しかも、完全ではない|けど、タイムトラベルを成功させてるってこ と|こなるわね、これ」るを

Here's the relevant line of code:
https://github.com/kamui-fin/gazou/blob/master/src/ocr.cpp#L32
Raw image for reference:
raw-capture

Let me know your thoughts on this, what I should test this setting on, etc.
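
For readers unfamiliar with the setting, here is a simplified sketch of how a score fraction can steer an Otsu threshold. It assumes the setting behaves like the score-fraction parameter in Leptonica's Otsu routines: every threshold whose Otsu score is within that fraction of the best score is a candidate, and the one sitting in the emptiest histogram bin wins. The function name `otsu_threshold` is hypothetical, and this is not the code at the linked line.

```python
def otsu_threshold(hist, score_fract=0.0):
    """Pick a threshold t from a grayscale histogram (pixels <= t are dark).

    score_fract=0.0 returns the plain Otsu optimum. Larger values accept any
    threshold scoring at least (1 - score_fract) * best, and prefer the one
    in the emptiest histogram bin (a valley between the two classes).
    """
    n = len(hist)
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))

    # Between-class variance (up to a constant factor) for every split point.
    scores = [0.0] * n
    w0 = 0
    sum0 = 0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, score stays 0
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        scores[t] = w0 * w1 * (mean0 - mean1) ** 2

    best = max(range(n), key=scores.__getitem__)
    if score_fract <= 0:
        return best

    # Grow a contiguous window around the optimum while scores stay close.
    min_score = (1.0 - score_fract) * scores[best]
    lo = hi = best
    while lo > 0 and scores[lo - 1] >= min_score:
        lo -= 1
    while hi < n - 1 and scores[hi + 1] >= min_score:
        hi += 1

    # Emptiest bin in the window; ties go to the bin nearest the optimum.
    return min(range(lo, hi + 1), key=lambda t: (hist[t], abs(t - best)))
```

Under that assumption, a non-zero fraction lets the threshold slide away from a populated gray level into a nearby valley of the histogram. That might be why mid-gray clutter like the lab coat stops leaking into the binarized image, although I have not verified this against the project's actual pipeline.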

kamui-fin added the enhancement label Mar 15, 2023
@Yuri-K7

Yuri-K7 commented Jan 8, 2024

I have a similar issue with a specific background in a game, which makes the text completely unreadable:
Screenshot_20240108_053458
tempDebugOcrImage

Changing the same Otsu score fraction to 0.7 removes most of the background, but there are still a lot of errors. Then changing the USM fract to 1.5 makes it almost perfect (there's still one error):
tempDebugOcrImage0dot7and1dot5
Output : 僕は黙々とシャープペンシルの先を走らせ、 青い野
線が刻まれた真新しい大学ノートに、ひとつの円を描
きだした。 いつも描く馴染みのあるあの円だ。

そして僕ば 、ねっとりとした夢の中へ落ちていく 。

I don't know how well these values apply to other content, or whether they're even the best values for this image.
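
One way to compare candidate values across several screenshots is to score each OCR output against a hand-typed reference transcription instead of counting errors by eye. The sketch below computes a character error rate from the Levenshtein edit distance. The function names are hypothetical, and the reference string in the example is an assumption about what the correct text is.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]

def char_error_rate(ocr_text: str, reference: str) -> float:
    """Edit distance normalized by the reference length (lower is better)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(ocr_text, reference) / len(reference)

# Assumed reference: the last character of the OCR output is taken to be wrong.
print(char_error_rate("そして僕ば", "そして僕は"))
# prints: 0.2
```

Running this for each combination of the two settings over a handful of captures would show whether 0.7 and 1.5 generalize or only suit this one background.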
