Warning. Invalid resolution 0 dpi. Using 70 instead. #1702

Closed
JILeXanDR opened this issue Jun 24, 2018 · 37 comments

@JILeXanDR

Command: tesseract https://image.ibb.co/eibzaT/test.png result

Current Behavior:

Warning. Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 161
Estimating resolution as 161

version

tesseract 4.0.0-beta.2-313-g29f2
 leptonica-1.76.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0
 Found AVX
 Found SSE

original image https://image.ibb.co/eibzaT/test.png

@amitdo
Collaborator

amitdo commented Jun 24, 2018

It means your image does not contain resolution info in its metadata, so Tesseract warns you about this and tries to estimate the resolution by itself.

@JILeXanDR
Author

@amitdo In which way?

@amitdo
Collaborator

amitdo commented Jun 24, 2018

int res = IntCastRounded(to_block->line_size * kResolutionEstimationFactor);

By measuring the line height.
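
As a rough illustration of the line quoted above, here is a minimal Python sketch of the same arithmetic. The factor value of 10 is an assumption about the Tesseract sources (check your version), and `estimate_resolution` is a hypothetical name used only for this example. The rounding helper mirrors round-half-away-from-zero behavior, which is what `IntCastRounded` is assumed to do.

```python
import math

# Assumption: kResolutionEstimationFactor is 10 in the Tesseract sources.
K_RESOLUTION_ESTIMATION_FACTOR = 10


def int_cast_rounded(x: float) -> int:
    """Round half away from zero (assumed behavior of IntCastRounded)."""
    if x >= 0:
        return int(math.floor(x + 0.5))
    return -int(math.floor(-x + 0.5))


def estimate_resolution(line_size: float,
                        factor: float = K_RESOLUTION_ESTIMATION_FACTOR) -> int:
    """Estimate a dpi value from the measured text line size in pixels."""
    return int_cast_rounded(line_size * factor)


# Under the assumed factor, a measured line size of about 16.1 pixels would
# correspond to the "Estimating resolution as 161" message in the report.
print(estimate_resolution(16.1))
```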

@ghost

ghost commented Jun 24, 2018

@zdenop I think you can close this topic.

@zdenop zdenop closed this as completed Jun 25, 2018
@lloiodice

This comment was marked as off-topic.

@lloiodice

Tesseract uses Leptonica, which uses libpng to read the resolution of the input image.
If the input PNG does not have the correct metadata, Tesseract generates the warning reported in
this issue. I have also seen this cause Tesseract to return slightly different text results for certain images.
The code above adds metadata to the PNG.
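
The snippet referred to here is not visible in this thread. As a hedged sketch of one possible way to add resolution metadata to a PNG, the following uses only Python's standard library to insert a pHYs chunk right after IHDR. The function names are hypothetical and this is not necessarily what the hidden comment did.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build a PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def set_png_dpi(png_bytes: bytes, dpi: float) -> bytes:
    """Return a copy of the PNG with a pHYs chunk describing `dpi`."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    # pHYs stores pixels per meter when the unit specifier is 1.
    pixels_per_meter = round(dpi / 0.0254)
    phys = make_chunk(
        b"pHYs", struct.pack(">IIB", pixels_per_meter, pixels_per_meter, 1)
    )
    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        end = pos + 12 + length  # 4 length + 4 type + data + 4 CRC
        chunk = png_bytes[pos:end]
        pos = end
        if chunk_type == b"pHYs":
            continue  # drop any existing resolution chunk
        out.append(chunk)
        if chunk_type == b"IHDR":
            out.append(phys)  # pHYs must come before the first IDAT
    return b"".join(out)
```

To use it, read the file in binary mode, pass the bytes through `set_png_dpi(data, 300)`, and write the result to a new file.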

@lloiodice

To test whether an image has the correct header, you can use
magick identify -verbose filename
or an equivalent tool, and make sure these two values are set:
Resolution: 118.11x118.11
Units: PixelsPerCentimeter
The values above are for a 300 dpi PNG.
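
The 118.11 figure is simply 300 dpi expressed in pixels per centimeter. A small sketch of the conversion, with hypothetical helper names:

```python
CM_PER_INCH = 2.54


def dpi_to_pixels_per_cm(dpi: float) -> float:
    """Convert dots per inch to pixels per centimeter."""
    return dpi / CM_PER_INCH


def pixels_per_cm_to_dpi(pixels_per_cm: float) -> float:
    """Convert pixels per centimeter to dots per inch."""
    return pixels_per_cm * CM_PER_INCH


print(round(dpi_to_pixels_per_cm(300), 2))    # 118.11
print(round(pixels_per_cm_to_dpi(118.11)))    # 300
```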

@15013605249

@lloiodice The version I am using is 3.05.02 and I can't find the function you mentioned.

@stweil
Contributor

stweil commented Aug 23, 2018

tesseract https://image.ibb.co/eibzaT/test.png result

How did you get Tesseract to read the input image from a URL?

@bhasinnaik

Hello team,

I am trying to extract text using Tesseract, but every time it returns a warning:
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 463
Empty page!!
Estimating resolution as 463
Empty page!!


@stweil
Contributor

stweil commented Mar 16, 2019

There is an undocumented command line option. Try using --dpi 300 (or the correct value for your image).
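
For anyone calling the executable from a script, here is a minimal sketch of passing that option through Python's `subprocess`. The helper names and file names are hypothetical; only the `--dpi` option itself comes from this thread, and it assumes a `tesseract` binary (version 4 or later) on the PATH.

```python
import subprocess
from typing import List, Optional, Sequence


def build_tesseract_command(image_path: str,
                            output_base: str,
                            dpi: Optional[int] = None,
                            extra_args: Optional[Sequence[str]] = None) -> List[str]:
    """Assemble the argument list, adding --dpi only when a value is given."""
    cmd = ["tesseract", image_path, output_base]
    if dpi is not None:
        cmd += ["--dpi", str(dpi)]
    if extra_args:
        cmd += list(extra_args)
    return cmd


def run_tesseract(image_path: str, output_base: str,
                  dpi: Optional[int] = None) -> subprocess.CompletedProcess:
    """Run tesseract and capture stdout/stderr (warnings go to stderr)."""
    cmd = build_tesseract_command(image_path, output_base, dpi)
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


print(build_tesseract_command("image.png", "result", dpi=300))
```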

@zdenop
Contributor

zdenop commented Mar 16, 2019

@bhasinnaik: your input image has no information about dpi. If you want to avoid the warning, you should fix it.

@maky-hnou

@zdenop , any idea how to fix the photo's DPI when it is 0?
I tried using different methods (pyexiv2, imagemagick,..) but I still get the same error when I use tesseract...

@zdenop
Contributor

zdenop commented Oct 18, 2019

What did you try with imagemagick? Show your command ;-)

@maky-hnou

mogrify -set density 300 image.jpg
I found this in a post on Stack Overflow.

@zdenop
Contributor

zdenop commented Oct 18, 2019

A quick internet search suggests this modification:
mogrify -set units PixelsPerInch -density 300 image.jpg

@maky-hnou

Thank you. I already tried that :-D , but still get the same tesseract error. Probably it is because of my image.
I'll try to find a workaround.
Thank you again :-)

@zdenop
Contributor

zdenop commented Oct 18, 2019

I tried it and it works for my image.jpg. If you are using Tesseract >= 4, you can use the --dpi option of the tesseract executable.

@jbreiden
Contributor

jbreiden commented Oct 18, 2019 via email

@lloiodice

FYI

In my test of tesseract 4.1 (trying to upgrade from 4.0) .. I get this warning often and
using 70 dpi instead of the correct dpi affects the efficacy of results especially when
running script detection.

In 4.0 setting the metadata in the PNG as described in my post above allowed the correct
resolution to be used by tesseract. That does not work anymore so something has changed
in 4.1 (maybe in the dependencies ... libpng? ... not sure).

If anybody knows what PNG metadata fields are necessary to avoid the warning please
let me know ... using the tesseract dpi setting is a complication that would be good to avoid
when processing arbitrary images.

@zdenop
Contributor

zdenop commented Oct 19, 2019

So you have a PNG that does not produce the dpi warning with Tesseract 4.0, but does produce it in 4.1?

@lloiodice

It appears to be a problem with script detection

tesseract dpi_issue.png out --psm 0
Tesseract Open Source OCR Engine v4.1.0 with Leptonica
Warning. Invalid resolution 0 dpi. Using 70 instead.

File:
https://www.dropbox.com/s/y7yb49ew04kj72q/dpi_issue.png?dl=0

When using 300 dpi or with 4.0 I get detected script Han. In 4.1 I get Latin.

@zdenop
Contributor

zdenop commented Oct 20, 2019

Interesting: I see the file has a dpi value (118x118 in IrfanView). Even when I specify the dpi for tesseract, I still see the warning message:

tesseract dpi_issue.png - --dpi 300 --psm 0
Warning. Invalid resolution 0 dpi. Using 70 instead.
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 7.89
Script: Latin
Script confidence: 0.09

BTW:

identify -format '%x,%y' dpi_issue.png
'46.460000000000001,46.460000000000001'

So it seems like a problem with how your PNG was generated(?)
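
To see what a PNG actually stores, one option is to read its pHYs chunk directly. The sketch below uses only the standard library, and `read_png_dpi` is a hypothetical helper name. If the 46.46 reported by identify is in pixels per centimeter, that works out to about 118 dpi, which would agree with the IrfanView reading rather than with 300 dpi.

```python
import struct
from typing import Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METERS_PER_INCH = 0.0254


def read_png_dpi(png_bytes: bytes) -> Optional[Tuple[float, float]]:
    """Return (x_dpi, y_dpi) from the pHYs chunk, or None if unavailable."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if chunk_type == b"pHYs" and len(data) == 9:
            x_ppu, y_ppu, unit = struct.unpack(">IIB", data)
            if unit != 1:
                return None  # unit unknown: only the aspect ratio is defined
            return (x_ppu * METERS_PER_INCH, y_ppu * METERS_PER_INCH)
        if chunk_type in (b"IDAT", b"IEND"):
            break  # pHYs, if present, must appear before the image data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return None
```

Note that this sketch does not verify chunk CRCs; it only inspects the stored values.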

stweil added a commit that referenced this issue Oct 20, 2019
Signed-off-by: Stefan Weil <sw@weilnetz.de>
@stweil
Contributor

stweil commented Oct 20, 2019

That's a bug in Tesseract. Tesseract internally creates a new image for that png file, but does not copy the resolution from the original image. Fixed now in commit a209a6b.

@zdenop
Contributor

zdenop commented Oct 20, 2019

@stweil: Thanks for looking into this. Funny that the problem was with psm 0 only. Other psm values work as expected.

zdenop pushed a commit that referenced this issue Nov 1, 2019
Signed-off-by: Stefan Weil <sw@weilnetz.de>
@OmarJabri7

I am using pytesseract and there are some images that work with the image_to_osd, but other images do not and the program gives me the following error:
"raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v4.1.0-bibtag19 with Leptonica Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.')"
I am using Python 3.7 anaconda distribution and tesseract >=4.0.0 with pytesseract as a tool.
Please help.

@stweil
Contributor

stweil commented Jan 27, 2020

Please use the user forum for questions.

@PrashantDixit0

PrashantDixit0 commented Nov 25, 2020

Even with a 300 dpi image, I still get the error shown below:

TesseractError: (3221225501, "read_params_file: Can't open txt Tesseract Open Source OCR Engine v4.1.1 with Leptonica Warning: Invalid resolution 0 dpi. Using 70 instead. Estimating resolution as 711")

@anilbb23

anilbb23 commented Jun 10, 2021

> I am using pytesseract and there are some images that work with the image_to_osd, but other images do not and the program gives me the following error:
> "raise TesseractError(proc.returncode, get_errors(error_string))
> pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v4.1.0-bibtag19 with Leptonica Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.')"
> I am using Python 3.7 anaconda distribution and tesseract >=4.0.0 with pytesseract as a tool.
> Please help.

Hi @OmarJabri7,
I am getting the same error. How did you resolve it? Please share the solution / sample code.

I am rotating the image with OpenCV and then passing it to pytesseract.image_to_osd()
TesseractError: (1, 'Tesseract Open Source OCR Engine v5.0.0-alpha.20201127 with Leptonica Warning: Invalid resolution 0 dpi. Using 70 instead. Estimating resolution as 179 Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.')

@esraa-abdelmaksoud

esraa-abdelmaksoud commented Aug 23, 2021

I've been using Tesseract for a while and got the same error. I just want to confirm that it is never about the metadata. I got the error while using image_to_osd for photos captured using the same device and this happened to only 3 of 50 images. I've checked the image details and the dpi already exists.

I've been trying to see if the error disappears when I crop the background around the objects in the image, and I saw that the error disappeared for 2 of them, not all of them. I still don't really know the reason, but if it was the metadata, the third one would have worked.

However, I believe it has something to do with the number of characters. The images whose error disappeared after cropping were somewhat rotated, with a text angle of around 30-40 degrees. For the images that worked, Tesseract gave me rotations of 90, 180, and 270 only. The image that gave the error in both cases has a low number of characters to begin with. This is why it would be interesting if more people tried this, so we can figure out whether it's really the reason.

@amitdo
Collaborator

amitdo commented Aug 23, 2021

Run this command on the original image:
identify -verbose imagename.jpg

@zdenop
Contributor

zdenop commented Aug 23, 2021

Please respect the guidelines for posting issues: we do not provide support for 3rd party projects (I assume you use pytesseract, based on image_to_osd). Also provide information about your Tesseract and Leptonica versions.
Provide a test case that reproduces the problem with a recent version of the tesseract executable. Otherwise you are alone with your thoughts and beliefs.

@dzg

dzg commented Dec 10, 2021

Is there any way to set a default resolution in the event of Invalid resolution ? I would like to force --dpi 72 but only when no DPI detected.

@stweil
Contributor

stweil commented Dec 10, 2021

That is not directly supported by Tesseract, but could be implemented by a wrapper script.

The current Tesseract release 5.0.0 tries to guess the correct resolution if there is no explicit information from the image file.
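
A wrapper along those lines could look roughly like the sketch below. It is only an illustration: the function names are hypothetical, and how `detected_dpi` is obtained (for example from the image metadata via an external tool) is left to the caller.

```python
from typing import List, Optional


def dpi_arguments(detected_dpi: Optional[float],
                  fallback_dpi: int = 72) -> List[str]:
    """Return --dpi arguments only when the image carries no usable dpi."""
    if detected_dpi is None or detected_dpi <= 0:
        return ["--dpi", str(fallback_dpi)]
    return []  # let Tesseract use the resolution stored in the image


def build_command(image_path: str,
                  output_base: str,
                  detected_dpi: Optional[float],
                  fallback_dpi: int = 72) -> List[str]:
    """Assemble a tesseract command line with a conditional --dpi fallback."""
    cmd = ["tesseract", image_path, output_base]
    return cmd + dpi_arguments(detected_dpi, fallback_dpi)


# No resolution in the metadata: force the fallback.
print(build_command("screenshot.png", "out", detected_dpi=None))
# Resolution present: pass nothing extra.
print(build_command("scan.png", "out", detected_dpi=300))
```

The resulting list can be handed to `subprocess.run` as is.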

@dzg

dzg commented Dec 10, 2021

> That is not directly supported by Tesseract, but could be implemented by a wrapper script.
>
> The current Tesseract release 5.0.0 tries to guess the correct resolution if there is no explicit information from the image file.

Most of the images I need to process are macOS screenshots, which are all PNG @ 144 DPI, and none of which Tesseract recognizes as having a DPI. Bummer. (Furthermore, to process them well I have to set --dpi 72.)

@stweil
Contributor

stweil commented Dec 11, 2021

I tested some macOS screenshots, and Tesseract 5.0.0 worked well without the need to give a resolution. But you are free to add --dpi 72 if that helps for your screenshots.

@Laxmipriya71

This comment was marked as off-topic.
