Skip to content

Conversation

@charlesmindee
Copy link
Collaborator

@charlesmindee charlesmindee commented Jan 15, 2021

Module to load pdf as documents

@codecov
Copy link

codecov bot commented Jan 15, 2021

Codecov Report

Merging #18 (2c840ae) into main (9abdf11) will decrease coverage by 3.74%.
The diff coverage is 87.09%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #18      +/-   ##
==========================================
- Coverage   94.85%   91.11%   -3.75%     
==========================================
  Files           8        4       -4     
  Lines         175       90      -85     
==========================================
- Hits          166       82      -84     
+ Misses          9        8       -1     
Impacted Files Coverage Δ
doctr/documents/reader.py 86.88% <86.88%> (-0.81%) ⬇️
doctr/documents/__init__.py 100.00% <100.00%> (ø)
doctr/utils/geometry.py
doctr/utils/typing.py
doctr/utils/__init__.py
doctr/documents/elements.py

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9abdf11...9a9475c. Read the comment docs.

@fg-mindee fg-mindee added type: enhancement Improvement module: io Related to doctr.io labels Jan 15, 2021
@fg-mindee fg-mindee added this to the 0.1.0 milestone Jan 15, 2021
@fg-mindee
Copy link
Contributor

I think you need to merge main before we move on to review! Also, if you could add a PR description, that would be great 🙏

@fg-mindee
Copy link
Contributor

Could you add a PR description as well please? 🙏

Copy link
Contributor

@fg-mindee fg-mindee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Apart from a little improvement in the tests, this looks good to me!

Comment on lines 9 to 11
DEFAULT_RES_MIN = int(0.8e6)
DEFAULT_RES_MAX = int(3e6)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove this if we hardcode the expected values, which is preferrable

Comment on lines 175 to 176
assert shape[0] * shape[1] <= DEFAULT_RES_MAX
assert shape[0] * shape[1] >= DEFAULT_RES_MIN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to roll back to hardcoded int values 👌

Comment on lines 172 to 173
assert shape[0] * shape[1] <= int(3e6)
assert shape[0] * shape[1] >= int(0.8e6)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid unnecessary casting, let's write 3000000 and 800000

Copy link
Contributor

@fg-mindee fg-mindee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the edits!

@fg-mindee fg-mindee changed the title Pdf doc reader whithout bytestrings refactor: PDF doc reader without bytestrings Jan 18, 2021
@charlesmindee charlesmindee merged commit b21c2d4 into main Jan 18, 2021
@charlesmindee charlesmindee deleted the pdf_doc_reader branch January 18, 2021 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module: io Related to doctr.io type: enhancement Improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants