Corruption in PDF files when DRM removed #104
Comments
I'd like to add that I've also tried 10.0.0 and 10.0.2 with the same results as 10.0.3.
Okay, that's not supposed to happen... Is the corruption deterministic? Meaning, when you take one PDF and run it through the plugin multiple times, is the corruption identical? Can you check whether this also occurs with Python3, meaning Calibre 5 or newer? If you can't update your main Calibre install, there are also portable versions of Calibre. The old 6.8.X versions were mainly for Python2, while the newer versions (and all 10.X ones) are mainly for Python3 but should in theory be backwards-compatible with Python2, so maybe there's a bug in that code. Would you mind sharing the original PDF (with DRM) and your DER key file with me so I can try to reproduce this on my machine? My email is on my GitHub profile.
Sorry for the delay in replying, real life got in the way. If the same PDF is run through the plugin multiple times, the visible corruption (i.e., title/author/ToC text) will be identical, but the file will have a different SHA256 hash each time (in contrast to when Calibre 6/Py3 is used, where the hash is identical each time). I installed an isolated version of Calibre 6 to test with, and for every PDF file I tried, DRM was successfully removed with no hint of corruption. I'm sorry, but I'm not comfortable sharing copyrighted material or my key, but of 14 random PDFs I tried: Two show corruption in Author So if you have a PDF with a TOC, that should hopefully allow you to replicate it with Py2. The logs of the three that failed to open after DRM was removed are below; they all seem to fail in the same manner, one that isn't present when using Calibre 6/Py3.
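The determinism check described above (run the same PDF through the plugin several times, then compare the SHA-256 hashes of the outputs) can be sketched roughly as follows. This is only an illustration: the helper names and file names are hypothetical and not part of the plugin, and it relies on nothing beyond Python's standard `hashlib`, so it should run under both Python 2.7 and Python 3.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_are_identical(paths):
    """True if all files in `paths` share one digest (also True for an empty list)."""
    digests = {sha256_of_file(p) for p in paths}
    return len(digests) <= 1
```

For example, `outputs_are_identical(["run1.pdf", "run2.pdf"])` (hypothetical file names) returns `False` when two runs over the same input produce different bytes, which is the behaviour reported here for Calibre 4/Py2.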
Thanks for the report. I have one PDF with a corrupted TOC, and 41df9ec fixes that for me. Would you mind re-testing your PDFs to see whether they work as well? You can download the new version from https://github.com/noDRM/DeDRM_tools/suites/7695663815/artifacts/321597181
The PDFs that were showing corruption are no longer corrupt with that version, although it doesn't help for the PDFs mentioned in comment 1193258014 - those are still broken and unable to be opened at all.
So all the corruption is gone, but you still have three PDFs that can't be opened at all after the DRM is removed in Calibre 4 with my plugin, while they can be opened without issues in Calibre 5 or 6, or in Calibre 4 with the original plugin from Apprentice Harper? Interesting ... Unfortunately this is really difficult to debug without having access to the PDF files in question. I will review the PDF-related code and compare it with the 6.8.0 version again to see if there are any obvious issues, but I don't think that has a high chance of success. I tested all the available Adobe test PDFs and they're now all working fine in Calibre 4. I will try to add a lot more logging output to the PDF code to hopefully figure out what's going wrong; I'll give you a couple more test versions soon. EDIT: For the broken PDFs, where do they "come from"? Fulfilled by an eReader (which one?), downloaded by Adobe Digital Editions (which version?), or downloaded with the ACSM calibre plugin? And is ADE able to open them (with DRM) just fine?
Would you mind testing if the issue is fixed with this version?
Still broken with that version:
Damnit. I don't think there's much I can do about that then, without having access to the PDF and key. I'm going to leave this issue open as this is definitely a bug, but until I either run into this myself with one of my PDF files, or someone else who is comfortable sharing the PDF and key with me reports it, this is unlikely to get solved. At least the bug with the silent corruption (where there's no error but the ToC is messed up) is fixed; bugs like these are even worse than just "it doesn't work"... Thanks for your testing, I will let you know if I find something else.
Question / bug report
I'm seeing corruption when removing DRM from PDF files, affecting at least the Title, Author, and Table of Contents. See images for an illustration. Also included are images (and a log) of the same files having DRM removed with Apprentice Harper's 6.8.x plugin, where the corruption doesn't occur.
NoDRM:
https://i.imgur.com/PnkoMMh.png
https://i.imgur.com/juubDMT.png
https://i.imgur.com/UDEf7Xz.png
Apprentice:
https://i.imgur.com/r2Vy9AR.png
https://i.imgur.com/ZntKU96.png
https://i.imgur.com/QinZsbT.png
Which version of Calibre are you running?
4.23
Which version of the DeDRM plugin are you running?
v10.0.3
If applicable, which version of the Kindle software are you running?
No response
Log output
Log when using NoDRM plugin
Log when using apprenticeharper plugin