It's a huge time-saver to use the batch conversion tool to OCR forced subtitles, but there is no setting for the input language, so it returns many spelling errors:
I tried selecting "tr" as the language under "Fix common errors", but it does not seem to work.
EDIT: OK, "my" fault. I had never used Turkish with the manual OCR method, so the Turkish language and dictionary were not installed. Maybe the "Fix common errors" dropdown should be displayed as follows to make this clearer for the user:
-Auto-
aa (not installed)
...
de
en
...
tr (not installed)
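The labelling proposed above could be sketched roughly as follows. This is only an illustration of the suggestion, not the tool's actual code: the function name `dropdown_labels` and both of its parameters are hypothetical, and how the tool really tracks installed languages is not known from this report.

```python
def dropdown_labels(all_languages, installed):
    """Build dropdown entries, marking languages that are not installed.

    Hypothetical helper: `all_languages` is every language code the
    dropdown offers, `installed` is the subset that is installed.
    """
    labels = ["-Auto-"]
    for code in sorted(all_languages):
        # Append the hint only when the language is missing.
        suffix = "" if code in installed else " (not installed)"
        labels.append(code + suffix)
    return labels


for label in dropdown_labels(["tr", "de", "aa", "en"], installed={"de", "en"}):
    print(label)
```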
And in the overview there should be an additional column "Language":
File name | Size | Format | Status | Language
That way I can see which language "-Auto-" has detected. (P.S. How does "-Auto-" work?)
I also think "Fix common errors" should be renamed to "Fix common OCR errors" to make it clear that this setting includes the language selection.
Maybe the engine used should be part of this setting, too, because in the end I am left wondering whether the batch tool uses Tesseract 4.1.0 or something else.
EDIT2: Hmm. It seems that the language selected under "Fix common errors" does not influence the OCR language used. I selected "de" to batch convert a German subtitle, but another language was used, as the output does not contain the German umlauts:
EDIT3: OK. I used the manual OCR tool and changed everything back to German, then successfully converted a German subtitle. Then I opened the batch tool, set the language under "Fix common errors" to "tr", and converted the German subtitle again. The result is still correct. This means the batch tool does not respect the language selected under "Fix common errors". Instead, it uses the language that was last used by the manual OCR tool.