Uncaught Exception: Secured pdf file are currently not supported (Smalot/PdfParser) #2223

Closed
etoulas opened this issue Aug 19, 2016 · 5 comments · Fixed by #2224

@etoulas

etoulas commented Aug 19, 2016

Issue details

Hi, I just installed the latest version to migrate from Pocket.
I'm importing thousands of links, but very early on I get a 500 error when it tries to download a "secured PDF".

I would appreciate even a temporary workaround to get at least all the remaining articles imported...

From var/logs/prod.log

[2016-08-19 22:37:10] graby.DEBUG: Graby is ready to fetch [] []
[2016-08-19 22:37:10] graby.DEBUG: Fetching url: {url} {"url":"http://www.iarc.fr/en/media-centre/pr/2015/pdfs/pr231_E.pdf"} []
[2016-08-19 22:37:10] graby.DEBUG: Trying using method "{method}" on url "{url}" {"method":"head","url":"http://www.iarc.fr/en/media-centre/pr/2015/pdfs/pr231_E.pdf"} []
[2016-08-19 22:37:10] graby.DEBUG: Data fetched: {data} {"data":{"effective_url":"http://www.iarc.fr/en/media-centre/pr/2015/pdfs/pr231_E.pdf","body":"(only length for debug): 0","headers":"application/pdf","status":200}} []
[2016-08-19 22:37:11] request.CRITICAL: Uncaught PHP Exception Exception: "Secured pdf file are currently not supported." at /home/et/repositories/wallabag2/vendor/smalot/pdfparser/src/Smalot/PdfParser/Parser.php line 94 {"exception":"[object] (Exception(code: 0): Secured pdf file are currently not supported. at /home/et/repositories/wallabag2/vendor/smalot/pdfparser/src/Smalot/PdfParser/Parser.php:94)"} []

Environment

  • wallabag version: git tag 2.0.6 installed via git clone
  • php version: PHP 5.5.9-1ubuntu4.19
  • OS: Ubuntu 14.04.5 LTS (trusty)
  • type of hosting (shared or dedicated): dedicated
  • which storage system you choose at install (SQLite, MySQL/MariaDB or PostgreSQL): MySQL

Steps to reproduce/test case

Add the URL with the PDF that causes the issue: http://www.iarc.fr/en/media-centre/pr/2015/pdfs/pr231_E.pdf

NB: The error in the log file will look slightly different compared to importing from Pocket.

[2016-08-19 22:51:23] app.ERROR: Error while saving an entry {"exception":"[object] (Exception(code: 0): Secured pdf file are currently not supported. at /home/et/repositories/wallabag2/vendor/smalot/pdfparser/src/Smalot/PdfParser/Parser.php:94)","entry":"[object] (Wallabag\\CoreBundle\\Entity\\Entry: {})"} []
@tcitworld
Member

This is a graby issue (https://github.com/j0k3r/graby), @j0k3r

@j0k3r
Member

j0k3r commented Aug 19, 2016

Hmm, interesting. I haven't run into this kind of error recently.
I'll fix that.

And even though it's related to graby (since the exception is thrown from it), this should be fixed on the wallabag side.

@tcitworld
Member

Will be fixed in the next minor release. Workaround is a try/catch around the call to Graby (see PR).
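
For readers without the PR at hand, the idea of the workaround can be sketched roughly as follows. This is a language-neutral illustration in Python, not the actual wallabag code (wallabag is PHP): `fetch_content`, `SecuredPdfError` and `import_links` are hypothetical stand-ins for the call to Graby, the exception raised by the PDF parser, and the import loop.

```python
class SecuredPdfError(Exception):
    """Hypothetical stand-in for the exception raised by the PDF parser."""


def fetch_content(url):
    # Hypothetical stand-in for the call to Graby: here every PDF
    # is treated as "secured" so that the failure path is exercised.
    if url.endswith(".pdf"):
        raise SecuredPdfError("Secured pdf file are currently not supported.")
    return {"url": url, "title": "Example", "html": "<p>content</p>"}


def import_links(urls):
    """Import every URL; a failing fetch skips that entry instead of aborting."""
    imported, skipped = [], []
    for url in urls:
        try:
            imported.append(fetch_content(url))
        except Exception as exc:
            # Without this guard, one bad link ends the whole import.
            skipped.append((url, str(exc)))
    return imported, skipped
```

With the guard in place, a single unparseable PDF no longer stops the remaining links from being imported; the failed entry is only recorded as skipped.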

@etoulas
Author

etoulas commented Aug 20, 2016

Hi All,

Thanks for your super swift resolution! Very impressive.
Let me also send you some praise for your commitment and very good work. I also like the fact that you migrated to SF2.

Regarding the issue: I agree with @j0k3r that this should be fixed on the wallabag side so that it doesn't break the import.
And as far as I can tell, the issue was "worked around" by @tcitworld. Thanks again!

Looking at the PR and how the exception is handled, if FetchContent fails, the entry will be skipped during the import.

I'm not sure I understood it correctly, but in the case of a PDF, wouldn't it be enough to keep the link in the entry? That way it can still be downloaded later (if it's still available, of course).

What do you think?

Yeah, I know this sounds more like an improvement, and the actual problem of breaking the import was solved here 👍

Cheers,
Tim

@j0k3r
Member

j0k3r commented Aug 20, 2016

Well, in fact your little improvement can be applied to every link, not only PDFs.
This could be an option or the default, I don't know: instead of skipping an entry when fetching the content fails, just save the link.
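
A minimal sketch of that "save the link anyway" behaviour, again as a language-neutral Python illustration rather than wallabag code. All names here (`fetch_content`, `build_entry`, the entry fields) are hypothetical, and the sketch assumes an entry can exist with no fetched content.

```python
def fetch_content(url):
    # Hypothetical stand-in for the content fetcher; PDFs fail here
    # only to exercise the fallback path.
    if url.endswith(".pdf"):
        raise Exception("Secured pdf file are currently not supported.")
    return {"title": "Example", "html": "<p>content</p>"}


def build_entry(url, fetch=fetch_content):
    """Return an entry for url, falling back to a link-only entry on failure."""
    try:
        content = fetch(url)
    except Exception:
        # Keep the bare link so the user can still open or retry it later.
        return {"url": url, "title": url, "html": None, "fetched": False}
    return {
        "url": url,
        "title": content["title"],
        "html": content["html"],
        "fetched": True,
    }
```

A flag such as `fetched` would let the application show these entries differently or retry them later; whether this should be the default or an option is the open question raised above.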
