Support for (not parsing) strikethrough #11

Open
lkaniak opened this issue Nov 16, 2023 · 2 comments
Comments

@lkaniak

lkaniak commented Nov 16, 2023

Hello,

Does this package support a parse option that skips text with strikethrough? Is that feasible?

@lublak
Owner

lublak commented Jan 22, 2024

@lkaniak hi :) Theoretically, it would be possible with version 4.0

#10

Unfortunately, I'm very limited in my private life, so I can hardly make any progress.

@lublak
Owner

lublak commented Mar 20, 2024

For documentation purposes only:
PDF itself has no information about strikethroughs.
Path data is used, which is then drawn over the text. To recognise whether text is crossed out, the coordinates of the text and of the paths have to be compared.
I don't think pdfdataextract will offer a function for this. But the complete data, i.e. where each piece of text is and where each path is, can be extracted with the future 4.0 version.
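
To illustrate the coordinate check described above, here is a minimal sketch. It does not use the pdfdataextract API: the `TextBox` and `Segment` shapes, the function names, and the thresholds are all hypothetical, and it assumes PDF-style coordinates (origin at the bottom left, `y` is the bottom edge of the text box). Some PDF producers may also draw a strikethrough as a thin filled rectangle rather than a line, which this sketch does not handle.

```typescript
// Hypothetical shapes for illustration only; these are NOT pdfdataextract types.
interface TextBox { text: string; x: number; y: number; width: number; height: number; }
interface Segment { x1: number; y1: number; x2: number; y2: number; }

// Returns true if the segment looks like a strikethrough of the given text box.
// minCoverage is an assumed threshold: the fraction of the box width the line must cover.
function isStruckThrough(box: TextBox, seg: Segment, minCoverage = 0.5): boolean {
  if (box.width <= 0 || box.height <= 0) return false;

  // Only consider lines that are roughly horizontal relative to the text height.
  if (Math.abs(seg.y2 - seg.y1) > box.height * 0.1) return false;

  // The line must run through the middle band of the box,
  // so underlines (near the bottom edge) are not counted.
  const lineY = (seg.y1 + seg.y2) / 2;
  const lower = box.y + box.height * 0.25;
  const upper = box.y + box.height * 0.75;
  if (lineY < lower || lineY > upper) return false;

  // Horizontal overlap between the line and the box.
  const left = Math.max(box.x, Math.min(seg.x1, seg.x2));
  const right = Math.min(box.x + box.width, Math.max(seg.x1, seg.x2));
  const overlap = Math.max(0, right - left);
  return overlap >= box.width * minCoverage;
}

// Keeps only the text boxes that no segment strikes through.
function dropStruckText(boxes: TextBox[], segments: Segment[]): TextBox[] {
  return boxes.filter((b) => !segments.some((s) => isStruckThrough(b, s)));
}
```

The middle-band and coverage thresholds are guesses and would likely need tuning per document, since fonts and PDF producers place the line at different heights.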
