Incorrect mimetype when adding a link to a pdf file #4142
Comments
I see, so running the scenario on a public instance is maybe not the best idea. Still, I get the issue on my self-hosted instance, and the result is the same. I'm able to wget the file from the same server just fine, so I'm not sure that it's a "too many requests" issue. But what you're telling me is that if there is any error at any point during the parsing, the mimetype will no longer reflect that of the content? For example in my case, it looks to me like the pdf was downloaded correctly, but its parsing failed and raised an exception.
I just saw the details of how to make the logs more verbose, I'll do that this evening.
Looks like the parsing of the PDF failed.
I understand that the issue that triggers the error is out of the bounds of wallabag. What is exactly the purpose of the In the first case, I would understand the current result. An In the second case, either:
Wouldn't it make sense to display the initial The goal would be to make it easier to automate some tasks. For example I want to be able to tag all
Issue details
I'm trying to use the `mimetype` of an entry to know when to skip the conversion and go straight to a direct download. More specifically, I'm looking for "application/pdf" in order to download the PDF file directly.
When adding links to some pdf files, I get an error in the web interface (`fetching content failed`) and the mimetype is not set to "application/pdf" as expected. Looking at the logs and doing a manual `curl -I` on the link shows the correct mimetype. The logs also suggest that pdf parsing was attempted, so the mimetype was correctly detected at the beginning.
When testing on https://f43.me/feed/test, I also get an error at the parsing of the url. However, I expect the mimetype of the entry to still reflect that of the content of the url, regardless of an issue when parsing the link.
Should the mimetype only be updated when the content is correctly parsed? Should I do my own fetching on the original url to get its mimetype? I want to avoid doing that on my client if possible.
Thanks for your help! :)
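For reference, the client-side fallback I'd like to avoid would look roughly like the sketch below. It is only an illustration, not anything wallabag ships: the function names are made up, and it assumes the entry is a dict with `mimetype` and `url` keys. It trusts the entry's mimetype when it already says "application/pdf", and otherwise does the equivalent of `curl -I` on the original url.

```python
from urllib.request import Request, urlopen


def normalize_mimetype(value):
    """Drop parameters such as '; charset=utf-8' and lowercase the type."""
    if not value:
        return None
    return value.split(";", 1)[0].strip().lower() or None


def head_content_type(url, timeout=10):
    """Return the Content-Type from a HEAD request (similar to `curl -I`)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")


def should_download_directly(entry, probe=head_content_type):
    """Decide whether to skip the conversion and download the PDF itself.

    `entry` is assumed to be a dict with "mimetype" and "url" keys.
    `probe` is a hypothetical hook so the network call can be swapped out.
    """
    if normalize_mimetype(entry.get("mimetype")) == "application/pdf":
        return True
    try:
        # Fallback: ask the origin server, since the stored mimetype may
        # not reflect the content when parsing failed.
        return normalize_mimetype(probe(entry["url"])) == "application/pdf"
    except (OSError, ValueError):
        # Network errors, timeouts, or a malformed url: treat as "not a PDF".
        return False
```

This costs one extra request per entry whose mimetype is missing or unexpected, which is why I'd rather rely on the entry's `mimetype` directly.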
Environment
The issue is reproducible on the demo wallabag instance, which currently runs version 2.3.8.
Steps to reproduce/test case
`mimetype = "application/pdf"` to tag entries as `pdf` for example, just to make it easier to spot discrepancies.