Add basic unit tests #215
…es Add tests to validate the output of the method `image_to_data`
d5d61a7 to 6cbbea3
Evidently, Tesseract v3.04.01 with Leptonica has trouble with gif images.
@johnthagen please review whenever you have some spare time and let me know if everything looks good for the tests.
…sing `isort` Make method `predict` private by adding a leading underscore `_predict`
…t__` to `pytesseract.pytesseract`
tests/test_pytesseract.py
@pytest.mark.parametrize('test_file', [
    # https://github.com/tesseract-ocr/tesseract/issues/2558
Can these tests be turned on for tesseract 4.0 / bionic?
It also depends on the behaviour and version of Leptonica. On my system I have the newest versions of Tesseract and Leptonica, and it still doesn't work. Nevertheless, I will test it with the Travis images.
JFYI, we can skip tests with this decorator:
@pytest.mark.skipif(
TESSERACT_VERSION[0] < 4,
reason='requires tesseract >= 4'
)
# ...
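For context, here is a minimal sketch of how a version tuple such as `TESSERACT_VERSION` could be derived so that the `TESSERACT_VERSION[0] < 4` condition above works. The helper name `parse_version` and the sample banner string are assumptions for illustration, not code from this PR; the banner is only assumed to resemble the first line printed by `tesseract --version`.

```python
import re


def parse_version(banner):
    """Return the first dotted version number found in `banner` as a tuple of ints.

    Hypothetical helper: it only assumes the text contains something like '3.04.01'.
    """
    match = re.search(r'\d+(?:\.\d+)+', banner)
    if match is None:
        raise ValueError('no version number found in %r' % banner)
    return tuple(int(part) for part in match.group(0).split('.'))


# Sample banner line (an assumption, not captured output from a real run).
TESSERACT_VERSION = parse_version('tesseract 3.04.01')

# This is the boolean that would be handed to pytest.mark.skipif.
skip_because_too_old = TESSERACT_VERSION[0] < 4
print(TESSERACT_VERSION, skip_because_too_old)
```

Comparing on the parsed major version rather than on the raw string avoids surprises with zero-padded components such as `04`.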
tests/test_pytesseract.py
# os.path.join(DATA_DIR, 'test.bmp'),
# os.path.join(DATA_DIR, 'test.gif'),
os.path.join(DATA_DIR, 'test.jpg'),
Image.open(os.path.join(DATA_DIR, 'test.jpg')),
Could probably simplify the logic if the Image.open() and os.path.join calls were moved into the test itself, and the only thing parameterized was the file itself?
I will split this test into two separate tests. You're right, I mixed them up.
gif images don't work with Tesseract 3, so I skip these cases. bmp images don't work with any version of Tesseract.
Hmm, interesting. According to the Leptonica docs - Image I/O, there is support for BMP and GIF. Leptonica is the underlying library that tesseract utilizes.
Also, as far as I can see from the Travis CI report, the gif test passes on bionic.
So the problem is indeed the version of Leptonica (leptonica-1.73 on xenial vs. leptonica-1.75.3 on bionic, which is also a bit old: Feb 16, 2018).
…ting
…esseract.tesseract_cmd` at the end of the test
…t and ignore bitmaps generally
Thanks for your review and feedback @int3l and @johnthagen. In general Tesseract 3/4 can't handle bitmap images. But Tesseract 4 can handle gif images. The tests handle these circumstances.
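The rule described in the comment above can be sketched as a small predicate. This is only an illustration of the behaviour reported in this thread (bmp skipped everywhere, gif only expected to work from Tesseract 4 on), not the code merged in this PR; the function name `expected_to_work` is hypothetical, and the file names are the ones that appear in the diff.

```python
def expected_to_work(file_name, tesseract_major):
    """Return True if a test for `file_name` is expected to pass.

    Encodes only what was observed in this thread; hypothetical helper.
    """
    extension = file_name.rsplit('.', 1)[-1].lower()
    if extension == 'bmp':
        # Reported as failing with every Tesseract version tried here.
        return False
    if extension == 'gif':
        # Reported as failing with Tesseract 3 and passing with Tesseract 4.
        return tesseract_major >= 4
    return True


candidates = ['test.bmp', 'test.gif', 'test.jpg']
for major in (3, 4):
    print(major, [name for name in candidates if expected_to_work(name, major)])
```

As noted earlier in the review, the gif behaviour seems to follow the installed Leptonica version rather than Tesseract itself, so keying the rule on the Tesseract major version is a simplification.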
LGTM. Thanks for this PR.
Once again, @nok and @johnthagen thank you for the contributions and the time spent on the tasks.
@int3l It was fun, and I learned something new 😄. Yes, please add me to the list of contributors, thanks!
Hello,
I added some unit tests to cover all common methods. For that I added more test data.
In the end, I wasn't able to use a bitmap file for the tests, because tesseract (leptonica) failed on it.
This bug is related to an open issue: tesseract-ocr/tesseract#2558.
Nevertheless, the other tests passed successfully (Python 2.7, Python 3.6 and Python 3.7).