Text extractor inaccurate #23842

Closed
AdhamHakki opened this issue Feb 4, 2023 · 5 comments
Labels
Issue-Bug (Something isn't working), Needs-Author-Feedback (The original author of the issue/PR needs to come back and respond to something)

Comments

@AdhamHakki

Microsoft PowerToys version

0.67.0

Installation method

GitHub

Running as admin

Yes

Area(s) with issue?

TextExtractor

Steps to reproduce

PowerToysReport_2023-02-04-11-38-28.zip
I selected an area of text and pasted it somewhere
It does not copy properly

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

Other Software

No response

@AdhamHakki added the Issue-Bug (Something isn't working) and Needs-Triage (For issues raised to be triaged and prioritized by internal Microsoft teams) labels on Feb 4, 2023
@jaimecbernardo
Collaborator

Can you please provide more information? What's the image? Can you share it? What was copied?
/needinfo

@jaimecbernardo added the Needs-Author-Feedback (The original author of the issue/PR needs to come back and respond to something) label and removed the Needs-Triage (For issues raised to be triaged and prioritized by internal Microsoft teams) label on Feb 6, 2023
@txwizard

Though I am not the OP, this looks like exactly the issue that I just observed, and I have evidence that you can evaluate.
The attached PNG is the image from which I selected the text shown in the like-named text file. As well, I have attached a ZIP archive of a recent MSINFO file.

PowerToys version: 0.67.1

OS Edition: Windows 10 Pro
Version: 22H2
Installed on: 06/16/2021
OS build: 19045.2486
Experience: Windows Feature Experience Pack 120.2212.4190.0

Processor: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz 3.60 GHz
Installed RAM: 32.0 GB (31.8 GB usable)
Device ID: B5721CAD-1DCC-46C6-BD7F-BEDE742E69BE
Product ID: 00330-50826-58067-AAOEM
System type: 64-bit operating system, x64-based processor

MSINFO32.zip

Staging_Domain
Staging_Domain.TXT

@txwizard

I have here another example, taken from "Prevention from getting error by adding script tag anywhere using DOMContentLoaded Event Listener of JavaScript" at https://www.geeksforgeeks.org/prevention-from-getting-error-by-adding-script-tag-anywhere-using-domcontentloaded-event-listener-of-javascript/?ref=rp, an article from which I was taking notes.

The issue is that the third line of the extracted text belongs at the end of the first of the two lines of text shown in the image.

This error is less serious than the one cited above, because all the text is present, just slightly out of order.

error.TXT
error
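
For anyone who wants to post-process output like this, the reordering described above can be sketched as follows. This is only an illustration, not how Text Extractor or its OCR engine orders text. It assumes the OCR step yields one bounding box per word; the `Word` type, the `words_to_lines` helper, and the `tolerance` parameter are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical OCR word: text plus the top-left corner and height of its box."""
    text: str
    x: float
    y: float
    height: float


def words_to_lines(words: List[Word], tolerance: float = 0.5) -> List[str]:
    """Group words into visual lines, then read each line left to right.

    A word joins an existing line when its vertical centre lies within
    tolerance * height of that line's average centre; otherwise it starts
    a new line.
    """
    lines: List[List[Word]] = []
    centres: List[float] = []
    for word in sorted(words, key=lambda w: w.y + w.height / 2):
        centre = word.y + word.height / 2
        for i, line_centre in enumerate(centres):
            if abs(centre - line_centre) <= tolerance * word.height:
                lines[i].append(word)
                # Keep the line centre as the mean of its members' centres.
                centres[i] = sum(w.y + w.height / 2 for w in lines[i]) / len(lines[i])
                break
        else:
            lines.append([word])
            centres.append(centre)
    # Emit lines top to bottom, and words within a line left to right.
    order = sorted(range(len(lines)), key=lambda i: centres[i])
    return [" ".join(w.text for w in sorted(lines[i], key=lambda w: w.x)) for i in order]
```

With made-up coordinates, a fragment at the far right of a row (for example a file and line reference) ends up appended to that row instead of being emitted as a separate trailing line, provided its vertical centre is close to the rest of the row.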

@txwizard

I just encountered another instance of text being parsed incorrectly from an image. The image in question is a standard JavaScript alert box. The attached image is a screen clip created by the Snipping Tool that ships with Windows 10, and the text file is the text as the PowerToy copied it onto the Windows Clipboard.

Screenshot_2023_02_17_11_14_16
Screenshot_2023_02_17_11_14_16.TXT

@crutkas
Member

crutkas commented Apr 7, 2023

The main goal of Text Extractor is to get you to a decent enough state. OCR is tricky, more so on small text. Without swapping out the engine, there is not much we can do.

While not perfect, these results are pretty decent.

image
domain: SalesTalkStaging-1002
tenant: SalesTalkStaging- 1001
user: DavidJohn.Gray@salesrelevance.com-1128
email: david.gray@salesrelevance.com
name: David Gray
time zone: Central Standard Time
login at: 02/15/2023 200145
url: https://salestalktech.com/staging
database: SalesTalkDev
02023 SalesTalk

image
purl.salestalktech.com says
An exception arose while applying input masks.
is not a function
Please contact SalesTalk customer support for assistance.

With a quick zoom, which adds resolution, the result improves:
purl.salestalktech.com says
An exception arose while applying input masks.
$(...).mask is not a function
Please contact SalesTalk customer support for assistance.
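
The comment does not say how the zoom was done; presumably it was in the application that rendered the dialog, which redraws the text with more pixels per glyph. When only a bitmap is available, a rough stand-in is to enlarge it before running the extractor. The sketch below is a minimal nearest-neighbour upscale on a plain grid of grayscale values. `upscale_nearest` is a hypothetical helper, and whether this kind of enlargement helps any particular OCR engine is untested here.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values, 0-255


def upscale_nearest(pixels: Grid, factor: int) -> Grid:
    """Enlarge an image by an integer factor by repeating each pixel.

    Every source pixel becomes a factor x factor block, so glyphs cover
    more pixels without any new detail being invented.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out: Grid = []
    for row in pixels:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out
```

In practice an image library with a smoother resampling filter would be the more usual choice; this version only uses the standard library so it stays self-contained.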

image
Uncaught TypeError: Cannot read properties of null (reading •addEventListener )
error. html : 28 : 14

As the current user didn't respond and Jaime did directly ask for that,


4 participants