feat: Added line reading for source PDFs #707

Merged
merged 4 commits into from
Dec 14, 2021

Conversation

fg-mindee
Contributor

This PR introduces the following modifications:

  • adds get_lines methods on PDF objects to aggregate the words of source PDFs into lines
  • adds a unit test
  • adds an entry in the documentation
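
As a rough illustration of what aggregating words into lines can look like, here is a minimal sketch. It is not the code from this PR: the function name `group_words_into_lines` is hypothetical, and the input is assumed to follow the word tuple layout that PyMuPDF returns from `page.get_text("words")`, namely `(x0, y0, x1, y1, text, block_no, line_no, word_no)`. Words sharing the same block and line index are merged, and the line box is the union of the word boxes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed word layout (as in PyMuPDF's page.get_text("words")):
# (x0, y0, x1, y1, text, block_no, line_no, word_no)
Word = Tuple[float, float, float, float, str, int, int, int]
# A line is represented here as ((x0, y0, x1, y1), text)
Line = Tuple[Tuple[float, float, float, float], str]


def group_words_into_lines(words: List[Word]) -> List[Line]:
    """Hypothetical helper: merge words that share a (block, line) index."""
    grouped: Dict[Tuple[int, int], List[Word]] = defaultdict(list)
    for word in words:
        grouped[(word[5], word[6])].append(word)

    lines: List[Line] = []
    for key in sorted(grouped):
        # Keep the reading order inside the line
        members = sorted(grouped[key], key=lambda w: w[7])
        # The line box is the smallest box enclosing all of its words
        box = (
            min(w[0] for w in members),
            min(w[1] for w in members),
            max(w[2] for w in members),
            max(w[3] for w in members),
        )
        lines.append((box, " ".join(w[4] for w in members)))
    return lines


sample = [
    (10, 10, 40, 20, "Hello", 0, 0, 0),
    (45, 10, 80, 20, "world", 0, 0, 1),
    (10, 30, 50, 40, "Second", 0, 1, 0),
]
for box, text in group_words_into_lines(sample):
    print(box, text)
```

Grouping on the extractor's own block and line indices avoids any geometric heuristics; a sketch that had to infer lines from coordinates alone would need a vertical-overlap threshold instead.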

Any feedback is welcome!

@fg-mindee fg-mindee added topic: documentation Improvements or additions to documentation type: enhancement Improvement module: io Related to doctr.io ext: tests Related to tests folder labels Dec 13, 2021
@fg-mindee fg-mindee added this to the 0.5.0 milestone Dec 13, 2021
@fg-mindee fg-mindee self-assigned this Dec 13, 2021
charlesmindee
charlesmindee previously approved these changes Dec 14, 2021
Collaborator

@charlesmindee charlesmindee left a comment

Thanks!

@codecov
codecov bot commented Dec 14, 2021

Codecov Report

Merging #707 (a10548a) into main (cf97d9e) will increase coverage by 0.05%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #707      +/-   ##
==========================================
+ Coverage   96.12%   96.18%   +0.05%     
==========================================
  Files         124      124              
  Lines        4648     4668      +20     
==========================================
+ Hits         4468     4490      +22     
+ Misses        180      178       -2     
Flag Coverage Δ
unittests 96.18% <100.00%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
doctr/io/pdf.py 100.00% <100.00%> (ø)
doctr/models/builder.py 96.49% <0.00%> (-2.64%) ⬇️
...dels/detection/differentiable_binarization/base.py 93.33% <0.00%> (+2.77%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8af125c...a10548a.

@fg-mindee fg-mindee merged commit f119778 into main Dec 14, 2021
@fg-mindee fg-mindee deleted the fitz-lines branch December 14, 2021 10:39
@fg-mindee fg-mindee added type: new feature New feature ext: docs Related to docs folder and removed type: enhancement Improvement labels Dec 31, 2021
Labels
ext: docs Related to docs folder ext: tests Related to tests folder module: io Related to doctr.io topic: documentation Improvements or additions to documentation type: new feature New feature