wrong file embedding #176
@JorjMcKie, many thanks for your reply, and sorry for mixing up file embedding at the document level with file attachment annotations (it was my fault, since I knew the difference). The following document (include-attachments.pdf) shows what I mean. And Acrobat displays it fine: For that, no Just having a I hope this is something easy to implement. |
@JorjMcKie, on second thoughts, I would like to comment on your previous comment. First of all, I know that the implementation comes from MuPDF, so the Python bindings really have little to do with it.
The first screenshot from my original report shows what SumatraPDF does. Evince is the second screenshot, and Acrobat the third. Both SumatraPDF and Evince only display the list of embedded files (as in
It might be that Adobe failed to implement collections (“12.3.5. Collections”, from the PDF specification [sorry, I don’t know how to link to sections in a PDF document]) in Acrobat, both Reader and Professional versions. But it might be that MuPDF is missing something in the implementation of what Adobe Acrobat sells as portfolios. Collections are described as (in my opinion, something slightly different and more complex than
A glimpse of the result can be seen at https://acrobatusers.com/tutorials/how-create-pdf-portfolio (I didn’t watch the whole video). All other PDF viewers display attached documents fine, because of the following remark from the PDF spec:
Only Acrobat deals with the Why does Acrobat fail with the portfolios generated with MuPDF? My personal take is that, although all entries in a Besides avoiding the issues (no matter where they come from), implementing the display of the embedded files is a much more consistent and uncomplicated approach than using collections. Of course, this is something inherited from MuPDF. I wanted to comment on the issue with the common assumption among technical people implementing PDF solutions: “Adobe doesn’t follow its own specification”. Are there inconsistencies between the PDF specification and Acrobat? Yes, but there may not be as many as we assume. Sorry, this is not directed at you. In this particular case, we don’t know whether the implementation from MuPDF is correct (when MuPDF isn’t simply pursuing another feature). I remember a conversation with a PDF engineer (since this is a public issue, this may be a good description) about an inconsistency between what the PDF spec described and how Acrobat dealt with that feature. At the end of our discussion, I realized that the whole problem was that the engineer simply did not want to consider the requirement stated in another part of the PDF documentation. |
ok, understood.
Conclusion / Action |
addendum: |
The document I removed the Steps:
Versions:
Could you share the PDF document that works for you? |
this-works-pretty.pdf Your version seems definitely broken then. Semantically, there is no difference in having the complete |
BTW for future manual PDF manipulations, you can conveniently use PyMuPDF:
This will calculate all the object locations in the file for you. |
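To illustrate what "calculating the object locations" involves, here is a minimal stdlib-only sketch that rebuilds a classic cross-reference table for a hand-edited PDF body. It is not PyMuPDF's code; the function name `build_xref` is hypothetical, and the sketch assumes a simple file (no object streams, no incremental updates, contiguous object numbers, and no stream data that happens to contain an `N G obj` line).

```python
import re


def build_xref(pdf_body: bytes) -> bytes:
    """Return a classic xref table for the objects found in pdf_body.

    Assumption: every object starts at the beginning of a line as
    'N G obj' and object numbers run from 1 upwards without gaps.
    """
    offsets = {}
    for match in re.finditer(rb"^(\d+) (\d+) obj\b", pdf_body, re.MULTILINE):
        # Byte offset of the object header, plus its generation number.
        offsets[int(match.group(1))] = (match.start(), int(match.group(2)))
    if not offsets:
        raise ValueError("no objects found")

    size = max(offsets) + 1
    missing = [n for n in range(1, size) if n not in offsets]
    if missing:
        raise ValueError("object numbers are not contiguous: %r" % missing)

    # Entry 0 is always the head of the free list.
    lines = [b"xref\n", b"0 %d\n" % size, b"0000000000 65535 f \n"]
    for number in range(1, size):
        offset, generation = offsets[number]
        # Each entry is exactly 20 bytes: 10-digit offset, 5-digit generation.
        lines.append(b"%010d %05d n \n" % (offset, generation))
    return b"".join(lines)
```

After editing object contents by hand, every byte offset following the edit shifts, which is why the table has to be regenerated rather than patched.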
Here is another successfully "cleaned" example (the MuPDF one): |
You are right with your other admonition (re: storing attachment w/o compression). |
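As a rough illustration of the compression point, the sketch below serializes an embedded file stream with or without `/FlateDecode`. The helper name `embedded_file_stream` is hypothetical and this is not what MuPDF or PyMuPDF emit; it only shows the dictionary entries involved, assuming the payload is stored as a plain stream object.

```python
import zlib


def embedded_file_stream(payload: bytes, compress: bool = True) -> bytes:
    """Serialize payload as the body of an /EmbeddedFile stream object."""
    data = zlib.compress(payload) if compress else payload
    # /Length counts the bytes actually stored between stream/endstream.
    entries = b"/Type /EmbeddedFile /Length %d" % len(data)
    if compress:
        # FlateDecode uses the zlib format, so zlib.compress output fits.
        entries += b" /Filter /FlateDecode"
    # /Params /Size records the size of the uncompressed file.
    entries += b" /Params << /Size %d >>" % len(payload)
    return b"<< " + entries + b" >>\nstream\n" + data + b"\nendstream"
```

For text-like attachments the compressed form is usually much smaller; for files that are already compressed (ZIP, JPEG) the gain is negligible.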
@JorjMcKie, many thanks for your help and your fix. If my Acrobat version didn’t allow indirect objects, it would be totally broken (I agree). But it might be something different. Could you post screenshots with Acrobat displaying both With my Acrobat,
Even if I want to, I cannot get attachments displayed (I have access only to the navigation panels for pages, layers and comments [as Acrobat calls them]). At least, I need to remove the And there I have access to the attachments navigation panel: Replacing It might be that Acrobat in its newer versions isn’t so strict about how to add a
I’m glad to read it helped, but I cannot recall where I commented it. |
A comment on your Here is a sample:
Here you have the output from this little improvement (sorry, if that was obvious to you): |
Yeah, I am actually not sure, why MuPDF stated this comment about ASCII-only entries. Might be they just didn't want to be bothered with complaints from people not finding their successfully embedded files - just because they made errors with encoding ... I don't know. |
Could you also provide a screenshot from About the |
What I currently do in PyMuPDF, is inserting identical stuff in both entries, Your remark on not compressing embedded files was made in the issue system of MuPDF (bugzilla). |
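The entry names were lost from this comment; assuming they are the `/F` and `/UF` keys of a file specification dictionary, the following sketch shows one way to write the same file name into both. The helper names `pdf_text_string` and `filespec` are hypothetical, and this is not PyMuPDF's actual code.

```python
def pdf_text_string(text: str) -> str:
    """Serialize text as a PDF string object.

    ASCII text becomes a literal string with backslash and parentheses
    escaped; anything else becomes a hex string holding UTF-16BE data
    with a leading byte order mark (FE FF).
    """
    if text.isascii():
        escaped = (
            text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        )
        return "(" + escaped + ")"
    data = b"\xfe\xff" + text.encode("utf-16-be")
    return "<" + data.hex().upper() + ">"


def filespec(name: str, stream_ref: str) -> str:
    """Build a file specification dictionary with identical /F and /UF.

    stream_ref is an indirect reference such as '5 0 R' (assumed to point
    at the embedded file stream).
    """
    encoded = pdf_text_string(name)
    return (
        "<< /Type /Filespec /F %s /UF %s /EF << /F %s >> >>"
        % (encoded, encoded, stream_ref)
    )
```

With this approach an ASCII name such as `report.pdf` keeps its extension in both entries, while a name with accented characters is stored as UTF-16BE.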
I’m interested in having a fix for MuPDF itself. I hope you agree with me that it is weird to fix in bindings what is wrong in the parent library (or whatever it may be called). I can investigate the issue further and see whether Sorry, but I think that the minimal difference isn’t worth the more complex implementation (if this is all that portfolios have to show). You insert identical stuff in the three entries Of course, it is fine that both entries have ASCII only in the file names, but they need to have their proper extensions. I suspect that if files have extensions, Acrobat may be able to display them in a different way. |
No, I put in
I see, I don’t know how you generated the PDF document, but it seems you invoked How about generating and posting the resulting PDF document (plus screenshot) with Again, if portfolio is working on Acrobat DC, it should be able to display PDF documents in a different way. |
I think I disagree with your use of the word "complexity": The fullblown "portfolio" capability BTW is present in the C-library of MuPDF - but it is not used in |
Well, I always thought that portfolios were based on But in any case, I wonder why “Complexity” could be, in this case, adding code (in the PDF document) that brings no significant gain (and it isn’t clear that it is valid PDF code [at least, to me]). |
After reading the Adobe manual carefully again: "(Intermediate and leaf nodes only; required)" it seems to me that indeed there exists some ambiguity: |
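The quoted wording matches the `Limits` row of the name tree node table in the PDF specification, so, assuming that is the entry under discussion, the rules can be sketched as a small checker. The function `check_name_tree_node` is hypothetical and takes the strict reading (a root node carries no `Limits`); the ambiguity mentioned above is exactly whether "only" has to be read that strictly.

```python
def check_name_tree_node(node: dict, role: str) -> list:
    """Return a list of problems for a name tree node.

    node maps entry names ('Kids', 'Names', 'Limits') to their values;
    role is 'root', 'intermediate' or 'leaf'.
    """
    has_kids = "Kids" in node
    has_names = "Names" in node
    has_limits = "Limits" in node
    problems = []

    if role == "root":
        # Kids is present in the root if and only if Names is absent.
        if has_kids == has_names:
            problems.append("root needs exactly one of Kids or Names")
        if has_limits:
            problems.append("Limits is listed for intermediate and leaf nodes only")
    elif role == "intermediate":
        if not has_kids:
            problems.append("intermediate node requires Kids")
        if has_names:
            problems.append("Names is listed for root and leaf nodes only")
        if not has_limits:
            problems.append("intermediate node requires Limits")
    elif role == "leaf":
        if not has_names:
            problems.append("leaf node requires Names")
        if has_kids:
            problems.append("Kids is listed for root and intermediate nodes only")
        if not has_limits:
            problems.append("leaf node requires Limits")
    else:
        raise ValueError("unknown role: %r" % role)
    return problems
```

Under this reading, a root node holding both `Names` and `Limits` is flagged, while the same node without `Limits` passes.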
In any case, |
I read about |
Can you confirm that removing |
This is the main reason why one should be careful before stating that the PDF spec is inconsistent. In some cases, the spec is wrong. But it might be that not all wrongdoings come from the specification.
I have paged through the PDF spec (the previous ISO version) and I wish das käme mir Spanisch vor, as the German idiom goes (literally "that would seem Spanish to me"; I must admit that it’s all Greek to me [pun intended 😉]). I think the experience with the implementation of the PDF spec may be similar to the translation of a book. It is way too easy to criticize the translation of a passage. Of course, one may be fully right. The not-so-easy task is translating the full book, bearing the responsibility of making the decisions. Of course, I guess it may be easy to criticize Adobe for inconsistencies in a specification of over a thousand pages. These inconsistencies should be removed, I totally agree. But it is a totally different task to write the full spec.
I can confirm that removing
Many thanks for your help with this. It will fix a long-standing bug in But, please, don’t forget that in this case |
OK, just finished the update. As indicated, it will always modify the
I will now upload the corresponding version 1.13.9 and create the respective wheels. |
How do you know so much German? Obviously you live in a time zone compatible with mine (Venezuela) and not in Germany ...! |
From what I read in your PDF code, it looks fine to me. About your patch, I cannot say anything (again, it is all Greek to me, since I cannot code). Many thanks for your fix (I almost forgot to write). I don’t know whether you intended to keep the issue open. It would be great, if you could report the progress of the fix in MuPDF’s C library. (But that doesn’t require the issue to be open.) And many thanks for contributing the fixing code to the original library.
That is partly true. Germany is also in my time zone, but I have not lived in Germany for almost two decades. (Of course, I am in Spain now.) So please excuse my German. After so many years, I have simply forgotten it. |
Wow Pablo, my respect! Your German is much better than my Spanish! I have been living in Venezuela since 2009 and am very dissatisfied with my Spanish ... As for the other things: |
Jorj, your Spanish is very good. You write with accents, something that not many people in Spain do. You may deceive yourself into thinking that your Spanish is not good. You are right. Imagine that you find a small stain on a page of a book. You can obsess over the stain, but you can also look at the whole page. The stain will still be there, but its importance disappears (and it stops being noticed) when the whole page is contemplated. I forgot to mention, Fedora is going to ship PyMuPDF soon: https://bugzilla.redhat.com/show_bug.cgi?id=1586324. Once I compile the new release, I will confirm how it went with the fixed version. |
I have just checked it myself and it works perfectly fine (just invoking Many thanks for your help and your excellent work, Jorj. |
@JorjMcKie, many thanks for your comment at Artifex. Sorry for asking that, but doesn’t |
Oops, sorry about that - you are right. BTW, I am looking at enhancing the embedded-import.py script. I think it should be more flexibly controllable via its command line options. |
I am the one to blame for forgetting Many thanks for reporting that at Artifex. I agree with the three options, with only one being mandatory. |
Using examples/embedded-import.py, I get the following output document: embedded-pymupdf.pdf. Only Acrobat Reader has an issue displaying (or accessing) the content of the embedded document:
I get similar results when using mutool portfolio: embedded-mupdf.pdf. I reported the issue upstream. I suggested a simpler approach than the portfolio. I don’t know when this will be fixed.
In the meantime, would it be possible that simple file attachments are available, so that they also work with Adobe Acrobat?
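To make "simple file attachments" concrete, here is a minimal stdlib-only sketch of a one-page PDF whose catalog has an `/EmbeddedFiles` name tree and no `/Collection` entry. The function `minimal_pdf_with_attachment` is hypothetical, it is not how PyMuPDF or mutool build their output, and whether a given Acrobat version shows the attachment panel for such a file is not verified here. It assumes an ASCII file name without parentheses or backslashes.

```python
def minimal_pdf_with_attachment(filename: str, payload: bytes) -> bytes:
    """Build a tiny PDF with one document-level embedded file."""
    name = filename.encode("ascii")  # raises for non-ASCII names
    if any(c in name for c in b"()\\"):
        raise ValueError("file name must not contain parentheses or backslashes")

    bodies = [
        # 1: catalog with a name tree root that holds only /Names
        b"<< /Type /Catalog /Pages 2 0 R /Names << /EmbeddedFiles "
        b"<< /Names [(" + name + b") 4 0 R] >> >> >>",
        # 2: page tree
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        # 3: a single empty page
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] "
        b"/Resources << >> >>",
        # 4: file specification pointing at the embedded stream
        b"<< /Type /Filespec /F (" + name + b") /UF (" + name + b") "
        b"/EF << /F 5 0 R >> >>",
        # 5: the embedded file itself, stored uncompressed
        b"<< /Type /EmbeddedFile /Length %d >>\nstream\n" % len(payload)
        + payload + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset of 'N 0 obj'
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the result to disk, for example `open("attached.pdf", "wb").write(minimal_pdf_with_attachment("note.txt", b"hello"))`, gives a file that can be opened in different viewers to compare how they list the attachment.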
I guess my question may be related to #174, regarding the implementation of FileAttachment annotation type.