Text Extractor is not working on Japanese even though the OCR language pack is installed #22325

heiyue-an · 2022-11-27T13:29:16Z

Provide a description of requested docs changes

Text Extractor works great in English, but I cannot use it with Japanese.
PowerToys v0.64.1 on Windows 11 build 22623.891.

inetkachev · 2023-01-08T20:14:24Z

I have the same problem:

PS C:\Users\Ivan> [Windows.Media.Ocr.OcrEngine]::AvailableRecognizerLanguages


DisplayName     : English (United States)
LanguageTag     : en-US
NativeName      : English (United States)
Script          : Latn
LayoutDirection : Ltr
AbbreviatedName : ENG

DisplayName     : Russian
LanguageTag     : ru
NativeName      : Русский
Script          : Cyrl
LayoutDirection : Ltr
AbbreviatedName : РУС

But Cyrillic text is recognized as English.

JeffJacobson · 2023-01-16T00:34:25Z

I am also having the same problem. I know this tool USED TO WORK* on this page in an earlier version of PowerToys, but it no longer works. (I made sure I had the Japanese IME active when using the tool.)

You can get what the text should be on that page by running this command in the browser's console.

[...document.body.querySelectorAll("img[src$='.svg']")].map(e => e.alt).join("\n")

However, I noticed that DeepL is also exhibiting similar behavior (see screenshot below), so it might be something that broke in an update to Windows itself rather than PowerToys. (I had never tried DeepL's OCR until today, so I don't know if it ever worked correctly.)

* When it "worked" it didn't get everything 100% correct, but now it returns complete gibberish.

System Info

Name	Value
PowerToys	v0.66.0
Display	XV273K
Scale	200%
Resolution	3840 x 2160 (Recommended)
Display 1	Connected to NVIDIA GeForce RTX 3080 Ti
Desktop mode	3840 x 2160, 119.91 Hz
Active signal mode	3840 x 2160, 119.91 Hz
Bit depth	8-bit with dithering
Color format	RGB
Color space	High dynamic range (HDR)
HDR certification	Not found
Peak brightness	409 nits
Edition	Windows 11 Home
Version	22H2
Installed on	2022-09-29
OS build	22621.1105
Experience	Windows Feature Experience Pack 1000.22638.1000.0
Processor	12th Gen Intel(R) Core(TM) i9-12900KF 3.19 GHz
Installed RAM	64.0 GB (63.9 GB usable)
System type	64-bit operating system, x64-based processor
Pen and touch	No pen or touch input is available for this display

iamenews · 2023-05-12T00:25:07Z

Still happening on 0.69.1 (May 12).
I have JA and zh-CN installed (as shown below), but OCR only works in English on the websites I tested, such as Yahoo Japan. I was following this help article, by the way.

LuisLauM · 2023-06-16T09:01:18Z

Please devs, don't forget to solve this issue. 🙏

TheJoeFin · 2023-08-10T15:36:25Z

Can you confirm this is still an issue with PowerToys v0.72?

/needinfo

hockyy · 2023-08-14T06:37:28Z

@TheJoeFin Yes, this is still an issue with PowerToys v0.72.

hockyy · 2023-08-14T06:38:46Z

Update:

This only happens if both en-US and ja-JP are installed:

PS C:\Users\hocky> $Capability = Get-WindowsCapability -Online | Where-Object { $_.Name -Like 'Language.OCR*en-US*' }
PS C:\Users\hocky> Get-WindowsCapability -Online | Where-Object { $_.Name -Like 'Language.OCR*' }
PS C:\Users\hocky> $Capability = Get-WindowsCapability -Online | Where-Object { $_.Name -Like 'Language.OCR*ja-JP*' }
PS C:\Users\hocky> $Capability | Add-WindowsCapability -Online

It runs fine if you remove the en-US OCR package.

hockyy · 2023-08-14T06:42:59Z

Example result:

よーし、さっそく町で調査を始めましよう/

TheJoeFin · 2023-08-14T13:20:31Z

You have to select the language you want to OCR from the right-click menu after you activate Text Extractor. If you don't select a different language, then the keyboard language will be used. Removing the other OCR languages could have the same effect.
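
The fallback described above can be pictured with a small sketch. This is not PowerToys code: `pickOcrLanguage`, its parameters, the exact-match comparison on language tags, and the `null` result when nothing matches are all assumptions made for illustration.

```javascript
// Hypothetical helper, not taken from PowerToys.
// selected:  language tag chosen from the right-click menu, or null if none was chosen
// keyboard:  language tag of the active keyboard layout
// installed: tags of the OCR language packs installed on the machine
function pickOcrLanguage(selected, keyboard, installed) {
  // Assumption: tags are compared case-insensitively and must match exactly.
  const find = (tag) =>
    tag
      ? installed.find((lang) => lang.toLowerCase() === tag.toLowerCase())
      : undefined;
  // An explicit menu choice wins; otherwise fall back to the keyboard language.
  // Returning null when neither is installed is an assumption of this sketch.
  return find(selected) ?? find(keyboard) ?? null;
}

const installed = ["en-US", "ja-JP"];
console.log(pickOcrLanguage(null, "en-US", installed));    // en-US
console.log(pickOcrLanguage("ja-JP", "en-US", installed)); // ja-JP
```

Under these assumptions, an English keyboard with nothing selected resolves to en-US, which would match the reports in this thread of Japanese text being read as English.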

Can you confirm some more details:

Windows language
Keyboard language
Have you tried changing language via context menu?

/needinfo

hockyy · 2023-08-14T13:41:38Z

Windows language is English.
Keyboard language is English (2 keyboards are available, but English is selected).
I haven't tried changing the language in the context menu.
@TheJoeFin

TheJoeFin · 2023-08-14T14:34:38Z

I am going to update the UI to make changing the language more obvious. Please try changing the language with the right-click menu when all languages are installed. Comment here if that fixes the issue.

/needinfo

TheJoeFin · 2023-08-15T12:52:41Z

/needinfo

microsoft-github-policy-service · 2023-08-20T19:24:45Z

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 5 days. It will be closed if no further activity occurs within 5 days of this comment.

heiyue-an added the Issue-Docs and Needs-Triage labels Nov 27, 2022

heiyue-an closed this as completed Nov 27, 2022

heiyue-an reopened this Nov 27, 2022

microsoft-github-policy-service bot added the Needs-Author-Feedback label and removed the Needs-Triage label Aug 10, 2023

microsoft-github-policy-service bot added the Status-No recent activity label Aug 20, 2023

microsoft-github-policy-service bot closed this as completed Aug 25, 2023
