Failed to detect OS #526

Closed
JobberRT opened this issue Mar 26, 2021 · 6 comments
Comments

@JobberRT

Describe the bug
A clear and concise description of what the bug is.

To Reproduce

Using plain JS + HTML like this:

// Path or URL of the image to process
const exampleImage = 'some.png';
const worker = Tesseract.createWorker({
  logger: m => console.log(m) // log progress messages
});
Tesseract.setLogging(true);
work();
async function work() {
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  // Script/orientation detection; this is the call that fails
  let result = await worker.detect(exampleImage);
  console.log(result.data);
  result = await worker.recognize(exampleImage);
  console.log(result.data);
  await worker.terminate();
}

Expected behavior
Instead of a detection result, the console logs "Failed to detect OS".

Screenshots

[Screenshot from the original issue; image not preserved]

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Chrome
  • Version: 86

Additional context
Source PNG file URL: http://cn8.frp.cool:12385/upload/1616768763_377_123.png

@XfedeX

XfedeX commented Aug 21, 2021

I am encountering this too, but with Firefox 91.

@munsterlander

munsterlander commented Jun 21, 2022

Same issue for me on Chrome and Firefox. With the current version of tesseract.js, the detect method does not work.

It also fails regardless of the source: video, canvas, image, and base64 input all fail.

@Balearica
Member

I am unable to reproduce this issue (I ran the code snippet above and it worked for me), and the link to the image above is dead. If you are experiencing this issue, please clarify whether the demo site works for you. That should help us determine whether the issue stems from the particulars of your browser/system or from how you are deploying tesseract.js.

https://tesseract.projectnaptha.com/

@Balearica
Member

@JobberRT @munsterlander I looked into this further and believe I understand the issue. In the same way that Tesseract does not always detect text, it does not always detect script/orientation. When run on an image with only a couple of words, it will not detect script/orientation, and tesseract.js throws an error.

The fact that Tesseract does not recognize script/orientation on such images is outside the scope of this project (as we do not edit the Tesseract engine). However, throwing an error when this happens does not seem like the correct behavior. Presumably tesseract.js should simply return a null value rather than throwing an exception (similar to what happens when you run recognize on a page with no text).
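
For anyone who needs this behavior on a version that still throws, a caller-side wrapper can approximate it. This is only a minimal sketch: it assumes `worker.detect(image)` returns a promise that rejects when detection fails, and `safeDetect` is a hypothetical helper name, not part of tesseract.js.

```javascript
// Hypothetical helper (not part of tesseract.js): resolve to null instead of
// rejecting when script/orientation detection fails.
// Assumption: worker.detect(image) returns a promise that rejects on failure.
async function safeDetect(worker, image) {
  try {
    const result = await worker.detect(image);
    return result.data;
  } catch (err) {
    // Treat a failed detection as "nothing detected", not as a fatal error.
    console.warn('Script/orientation detection failed:', err);
    return null;
  }
}

// Usage sketch with the worker from the original report:
// const osd = await safeDetect(worker, exampleImage);
// if (osd === null) { /* skip orientation handling, go straight to recognize */ }
```

Depending on the version, the worker may also surface the error outside the returned promise, so this wrapper may not silence every message in the console.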

@Balearica
Member

I edited detect so that it now returns null values when OS detection is not possible, rather than throwing an error and killing the API. As this is technically a breaking change, it was implemented in the dev/v4 branch and will be included in the next major release (v4). To learn more about the changes in v4, see Issue #662.
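
With null values now a possible outcome, calling code should check the result before using it. A minimal sketch of such a check follows; `hasDetection` is a hypothetical helper, and the field names `script` and `orientation_degrees` are assumptions about the shape of the detect output, so verify them against the version you are running.

```javascript
// Hypothetical helper (not part of tesseract.js): true only when detect
// produced a usable result.
// Assumption: the result data exposes `script` and `orientation_degrees`,
// both of which are null when detection was not possible.
function hasDetection(data) {
  return data != null
    && data.script != null
    && data.orientation_degrees != null;
}

// Usage sketch with a worker like the one in the original report:
// const { data } = await worker.detect(exampleImage);
// if (hasDetection(data)) {
//   console.log(data.script, data.orientation_degrees);
// } else {
//   console.log('No script/orientation detected');
// }
```

The check uses `!= null` rather than truthiness so that an orientation of 0 degrees still counts as a valid detection.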

@JobberRT
Author

@Balearica Thanks! I will look into #662 and check it!

Balearica added a commit that referenced this issue Nov 25, 2022
See #662 for an explanation of the Tesseract.js Version 4 changes. The list below is auto-generated from commits.

* Added image preprocessing functions (rotate + save images)

* Updated createWorker to be async

* Reworked createWorker to be async and throw errors per #654

* Edited detect to return null when detection fails rather than throwing error per #526

* Updated types per #606 and #580 (#663) (#664)

* Removed unused files

* Added savePDF option to recognize per #488; cleaned up code for linter

* Updated download-pdf example for node to use new savePDF option

* Added OutputFormats option/interface for setting output

* Allowed for Tesseract parameters to be set through recognition options per #665

* Updated docs

* Edited loadLanguage to no longer overwrite cache with data from cache per #666

* Added interface for setting 'init only' options per #613

* Wrapped caching in try block per #609

* Fixed unit tests

* Updated setImage to resolve memory leak per #678

* Added debug output option per #681

* Fixed bug with saving images per #588

* Updated examples

* Updated readme and Tesseract.js-core version