Allow sending an entire file to OCR #12981
Labels
feature/ocr
Related to Optical Character Recognition feature in NVDA
OCR add-on
p5
https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority
triaged
Has been triaged, issue is waiting for implementation.
Is your feature request related to a problem? Please describe.
In my opinion, using NVDA's Windows OCR to recognize documents in images, especially very long images, or to recognize large documents in .pdf format, is very frustrating.
When I open a .jpg image containing text in the native Windows Photos app, one of two things can happen.
Sometimes NVDA reads only the file name. When this occurs, it is possible to trigger OCR by pressing NVDA+R, and NVDA reads the text in the photo.
Other times I do the same thing and open the image in the Photos app, but a button such as Full screen gains focus. In this case, pressing NVDA+R does not read the text in the image at all.
I can't explain why the Full screen button sometimes gets focus and sometimes doesn't, and I also don't know how to move focus to the image to perform the OCR.
Performing OCR on .pdf files is even worse: we have to bring each page onto the screen and then run OCR page by page. This is counterproductive at best, when it works at all.
I also tried pasting the picture into Word and performing OCR there, and to my frustration it read only one line.
Describe the solution you'd like
I would like a function in NVDA that lets us open an image or a .pdf file, submits the entire file to OCR, and then presents the recognized document in a virtual viewer.
I would like to read a recognized .pdf file without having to scan it page by page.
I would like to scan a photo without first having to move the navigator object onto the image (when that is even possible) and only then try to scan it.
I'm thinking about novice users, but not only them: I don't consider myself a novice, and I still have difficulty with these tasks.
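To make the request concrete, here is a minimal sketch of the flow I have in mind. It is not NVDA code: `recognize_document` and `recognize_page` are hypothetical names, and the OCR engine is passed in as a plain function so the example stays self-contained. In NVDA, that function would presumably be backed by Windows OCR, and the returned text would be shown in a browsable viewer instead of printed.

```python
from typing import Callable, Iterable, List


def recognize_document(
    pages: Iterable[bytes],
    recognize_page: Callable[[bytes], str],
) -> str:
    """Run OCR on every page and return one text for a single viewer.

    `pages` holds the raw image data of each page (an image file is a
    one-page document). `recognize_page` is a hypothetical stand-in for
    the real OCR engine.
    """
    parts: List[str] = []
    for number, page in enumerate(pages, start=1):
        text = recognize_page(page).strip()
        if not text:
            # Keep the page marker so the reader knows a page was skipped.
            text = "(no text recognized)"
        parts.append(f"Page {number}\n{text}")
    # Blank line between pages so they are easy to navigate by paragraph.
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Fake OCR engine used only for illustration: decodes bytes as text.
    def fake_ocr(page: bytes) -> str:
        return page.decode("utf-8")

    print(recognize_document([b"First page text", b"", b"Third page text"], fake_ocr))
```

The point of the sketch is that the user gives one command for the whole file and gets one document back, with no object navigation and no page-by-page scanning.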
Describe alternatives you've considered
Additional context
I believe it is possible for NVDA to implement this feature.
Apologies for bringing up another screen reader.
In JAWS it is possible to configure the reader to use Windows OCR; this was not possible in earlier versions.
After configuring the reader to use Windows OCR, it is possible to move to the .jpg or .pdf file with the arrow keys and press the following commands:
Insert + Space
O
R
and the entire recognized document, including formatting, is displayed in Word.
I don't expect NVDA to implement the feature in the same way, but since a paid screen reader now relies on the same Windows OCR feature that NVDA uses, it should be possible to greatly improve the NVDA user experience.
For example, the recognized document does not need to be displayed in Word; it can open in the virtual viewer.
We can provide a dialog for the user to choose and open the document; they don't need to open it directly from Explorer.
The important thing is that the entire document is recognized at once, and that the user does not need to use object navigation before they are finally able to read the document.
This experience is not limited to the paid reader. On a recent iOS device (iPhone SE 2020 and later), just placing a finger on the image is enough for the text to be recognized and read, and a gesture of four taps with three fingers copies the recognized text.