Encrypted but not really _encrypted_ PDF isn't parsed and detected as "secured pdf" #320

Open
rameezrami opened this issue Jul 17, 2020 · 13 comments
Labels
needs more info, parsing fail

Comments

@rameezrami

rameezrami commented Jul 17, 2020

This is a very good package with an excellent, easy-to-use API, but it does not seem production-ready yet. I'm getting errors with several different PDFs.

Also, I'm getting an error saying the PDF is encrypted. I double-checked whether the PDF is protected, and it isn't.

Maybe it's a version issue, like in another issue posted in this repository?

I have created a screencast for better understanding. These PDFs don't get parsed on your demo page either.

Please find the files in the attachment.

pdf and screencasts.zip

k00ni added the bug and missing or incomplete functionality labels on Jul 17, 2020
@cesardmoro

Any advice on this?

@k00ni
Collaborator

k00ni commented Aug 13, 2020

> Any advice on this?

Sorry, I can't help you on this one.

@j0k3r
Collaborator

j0k3r commented Aug 21, 2020

I've tried locally, and on my Mac the PDF information says the PDF is password-encrypted.

Also tried using an online tool and it says the same: https://www.metadata2go.com/result/cffe1bdf-d6be-45d1-b810-f4db8ae846af

image

k00ni added the needs more info label and removed the bug and missing or incomplete functionality labels on Aug 25, 2020
@rameezrami
Author

@j0k3r But you were able to open the PDF without any issue, right? No password was asked for. I also have a couple of other PDFs (which I can't share here because they contain confidential information) that fail in the same way.

@j0k3r
Collaborator

j0k3r commented Aug 28, 2020

Of course, but the label says the PDF is encrypted. So we don't try to read it at all.
See https://github.com/smalot/pdfparser/blob/master/src/Smalot/PdfParser/Parser.php#L97-L99

Maybe we should improve that and try to read the PDF?
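
For readers wondering how a PDF can open without a password prompt and still be reported as encrypted: the PDF format signals encryption through an `/Encrypt` entry in the trailer, and a file encrypted with an empty user password opens silently in most viewers. The sketch below is only an illustration of that flag check, not pdfparser's actual code. It is written in Python rather than PHP, the function name `declares_encryption` and both trailer fragments are made up for this example, and it scans raw bytes instead of properly parsing the trailer.

```python
import re

# Matches the /Encrypt key followed by an indirect reference ("8 0 R")
# or an inline dictionary ("<<"). /EncryptMetadata does not match.
ENCRYPT_ENTRY = re.compile(rb"/Encrypt\s*(\d+\s+\d+\s+R|<<)")


def declares_encryption(pdf_bytes: bytes) -> bool:
    """Return True if the raw bytes contain an /Encrypt entry.

    Naive by design (assumption: good enough for illustration): it scans
    the whole file instead of parsing the trailer or cross-reference
    stream, so the same byte sequence inside an uncompressed stream
    would also trigger it.
    """
    return ENCRYPT_ENTRY.search(pdf_bytes) is not None


# Hypothetical trailer fragments, invented for this example.
plain_trailer = b"trailer\n<< /Size 8 /Root 1 0 R >>\nstartxref\n512\n%%EOF"
flagged_trailer = (
    b"trailer\n<< /Size 9 /Root 1 0 R /Encrypt 8 0 R >>\nstartxref\n640\n%%EOF"
)

print(declares_encryption(plain_trailer))    # False
print(declares_encryption(flagged_trailer))  # True
```

A check like this only tells you that the file declares encryption. Whether the content can actually be read, for example because the user password is empty, is a separate question, and that is the gap this issue is about.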

j0k3r changed the title from "Not all PDF gets parsed" to "Encrypted but not really _encrypted_ PDF isn't parsed and detected as "secured pdf"" on Sep 1, 2020
j0k3r added the parsing fail label on Sep 1, 2020
@Sagar1219

Hey, I have encountered the same issue. Is this fixed? Also, in the meantime, is there an alternative solution?

@j0k3r
Collaborator

j0k3r commented Nov 8, 2021

No alternate solution, but contributions are welcome.

@k00ni
Collaborator

k00ni commented Dec 1, 2023

#653 should provide a valid workaround until there is a better solution/fix.

k00ni closed this as completed on Dec 1, 2023
@unixnut
Contributor

unixnut commented Jan 19, 2024

A fix is available in the unixnut/pdfparser fork, on the 'decryption' branch. I was going to create a draft pull request, but I already have an unrelated PR open for one of the commits on that branch, so I can't make another until that's merged. @k00ni can you please reopen this issue?

j0k3r reopened this on Jan 19, 2024
@matkozikowski

Hi,
Do you know if there is a pull request that continues the work on src/Smalot/PdfParser/Config::setIgnoreEncryption? I would like to use it, but if it is deprecated, is there an alternative method we can use?

@unixnut
Contributor

unixnut commented Jul 20, 2024

@matkozikowski @j0k3r @k00ni I am still interested in making a PR for my decryption branch. However, I have been struggling emotionally and need to get into counselling, which should happen soon.

@k00ni
Collaborator

k00ni commented Jul 22, 2024

@unixnut I wish you all the best.

@unixnut
Contributor

unixnut commented Sep 12, 2024

@k00ni I sent you an e-mail about this project. No rush to reply, but if you didn't see it can you please check your spam folder? Thanks.
