Allow sending an entire file to ocr #12981

Open
fernando-jose-silva opened this issue Oct 23, 2021 · 2 comments
Labels
feature/ocr Related to Optical Character Recognition feature in NVDA OCR add-on p5 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority triaged Has been triaged, issue is waiting for implementation.

Comments

@fernando-jose-silva

Is your feature request related to a problem? Please describe.

In my opinion, using NVDA's Windows OCR to recognize documents in images, especially very long images, or to recognize large documents in .pdf format, is very frustrating.
When I open a .jpg image containing text in the native Windows Photos app, one of two things happens.
Sometimes NVDA reads only the file name; in that case I can trigger OCR by pressing NVDA + R, and NVDA reads the text in the photo.
Other times I do exactly the same thing, but a button such as the full screen button gains focus; in that case pressing NVDA + R does not read the text in the image at all.
I can't explain why the full screen button gets focus only some of the time, and I don't know how to move focus to the image so that OCR can be performed.
OCR on .pdf files is even worse: each page has to be brought onto the screen and recognized one at a time, which is counterproductive at best, when it works at all.
I also tried pasting the picture into Word and running OCR there, and to my frustration it read only one line.

Describe the solution you'd like

I would like a function in NVDA that lets us open an image or a .pdf file, submits the entire file to OCR, and then presents the recognized document in a virtual viewer.
I would like to read a recognized .pdf file without having to scan it page by page.
I would like to recognize a photo without first having to move the navigator object onto the image (when that is even possible) and only then attempting the recognition.
I'm thinking of novice users, but not only of them: I don't consider myself a novice, and I still have difficulty with these tasks.
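
To make the request concrete, here is a minimal sketch of the flow described above: every page of a file is recognized in turn and the results are joined into one text that a viewer could display. This is not NVDA code. The names `recognize_whole_document`, `recognize_page`, and `fake_recognizer` are hypothetical and exist only for illustration; a real implementation would have to render each page to an image and pass it to an actual OCR engine.

```python
from typing import Callable, Iterable, List


def recognize_whole_document(
    pages: Iterable[bytes],
    recognize_page: Callable[[bytes], str],
) -> str:
    """Run OCR on every page and return one combined text.

    `pages` yields one image per page (hypothetical input format);
    `recognize_page` is whatever OCR function is available.
    """
    parts: List[str] = []
    for number, image in enumerate(pages, start=1):
        text = recognize_page(image).strip()
        # Keep a page marker so the reader can still navigate by page.
        parts.append(f"Page {number}\n{text}")
    return "\n\n".join(parts)


def fake_recognizer(image: bytes) -> str:
    """Stand-in for a real OCR engine: just decodes the bytes as text."""
    return image.decode("utf-8")


combined = recognize_whole_document(
    [b"first page", b"second page "],
    fake_recognizer,
)
print(combined)
```

The point of the sketch is only that the user issues one command and receives one document; the page loop happens without any object navigation.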

Describe alternatives you've considered

Additional context

I believe it is possible for NVDA to implement this feature.
Apologies for mentioning another screen reader.
In JAWS it is possible to configure the reader to use Windows OCR; this was not possible until recent versions.
After configuring the reader to use Windows OCR, it is possible to move with the arrow keys to the .jpg or .pdf file and press the command Insert + Space, O, R, and the entire document, including its formatting, is displayed as recognized text in Word.
I don't expect NVDA to implement the feature in the same way, but the progress made by a paid reader that relies on the same Windows OCR feature NVDA uses shows that the NVDA user experience can be greatly improved.
For example, the recognized document does not need to be displayed in Word; it can open in the virtual viewer.
We could provide a dialog for the user to choose and open the document, so that they don't need to open it directly from File Explorer.
The important thing is that the entire document is recognized at once, and that the user does not need to use object navigation before finally being able to read the document.
So as not to limit the comparison to the paid reader: on recent iOS devices (iPhones from 2020 and later), simply placing a finger on an image is enough for its text to be recognized and read, and a gesture of four taps with three fingers on the screen copies the recognized text.

@feerrenrut feerrenrut added the feature/ocr Related to Optical Character Recognition feature in NVDA label Nov 4, 2021
@surfer0627
Contributor

@fernando-jose-silva commented:

I would like a function in NVDA that lets us open an image or a .pdf file, submits the entire file to OCR, and then presents the recognized document in a virtual viewer.

Some NVDA add-ons have already implemented features like this:

NVDA + Shift + R: recognizes any sort of image or PDF file from the file system

Windows + Control + R: recognizes the selected document

@fernando-jose-silva
Author

Thank you very much. I am using these add-ons and they are very good.
I still suggest that these features become part of NVDA itself. As I always point out, searching for add-ons and installing them is a lot of work for novice users. I am in daily contact with novice users who have great difficulty getting access to these add-ons, so I am in favor of implementing these features, and this one in particular, directly in NVDA.

@seanbudd seanbudd added triaged Has been triaged, issue is waiting for implementation. p5 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority labels Jun 16, 2022

4 participants