"Corrupted" files when converting from html to OpenDocument #1729

Closed
comete-upn opened this issue Oct 30, 2014 · 4 comments


@comete-upn

This only occurs when the HTML links to images without a file extension. For example, if an HTML file links to image.jpg, everything works fine. If it links to the same file renamed to image, LibreOffice complains when I open the resulting document, saying that it is corrupt and offering to "repair" it. The repair always succeeds and the document opens.

The only difference between the two resulting ODT files is that the second has no entry for the image in manifest.xml, whereas the first contains:

<manifest:file-entry manifest:media-type="image/jpeg" manifest:full-path="Pictures/0.jpg" manifest:version="1.2" />

Why does it matter? (I could name my files better.)

I only discovered this while using base64-encoded images embedded in the HTML, and these have no filename, since the source IS the file…
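
For illustration: an embedded image has no filename, but a data URI does state its media type in its own header. A minimal Python sketch of reading it follows. The function name parse_data_uri and the sample bytes are made up for this example, and it is not how pandoc handles such images.

```python
import base64

def parse_data_uri(uri):
    """Split a base64 data URI into (media_type, raw_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is the header, the rest is the payload.
    header, _, payload = uri[len("data:"):].partition(",")
    parts = header.split(";")
    # An omitted media type defaults to text/plain (RFC 2397).
    media_type = parts[0] or "text/plain"
    if "base64" not in parts[1:]:
        raise ValueError("only base64 data URIs are handled in this sketch")
    return media_type, base64.b64decode(payload)

# The first bytes of a JPEG file, padded with zeros; sample data only.
sample = b"\xff\xd8\xff\xe0" + b"\x00" * 4
uri = "data:image/jpeg;base64," + base64.b64encode(sample).decode("ascii")

media_type, data = parse_data_uri(uri)
print(media_type, len(data))
```

So the media type is available even without a filename; there is just no extension to derive it from.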


System: Ubuntu 14.04
Pandoc version: 1.12.2.1
LibreOffice version: 4.2.6.3

@jgm
Owner

jgm commented Oct 30, 2014

Which is the first, and which is the second? Is the repaired file the
one that contains the manifest entry above, or the un-repaired one?

@comete-upn
Author

Sorry, I wasn't clear.

  • the first document contains the manifest entry
  • the second one (the "corrupted" one) has no mention of the image in the manifest
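
The difference between the two documents can be checked without opening LibreOffice. Below is a rough Python sketch, assuming the ODT is an ordinary zip package with its manifest at META-INF/manifest.xml and its images under Pictures/. The function name unlisted_pictures is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by manifest:file-entry elements in an ODF manifest.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"

def unlisted_pictures(odt_path):
    """Return Pictures/ entries in the ODT zip that the manifest does not list."""
    with zipfile.ZipFile(odt_path) as odt:
        names = odt.namelist()
        root = ET.fromstring(odt.read("META-INF/manifest.xml"))
    # Collect every manifest:full-path value declared in the manifest.
    listed = {
        entry.get("{%s}full-path" % MANIFEST_NS)
        for entry in root.iter("{%s}file-entry" % MANIFEST_NS)
    }
    return sorted(
        name for name in names
        if name.startswith("Pictures/")
        and not name.endswith("/")   # skip directory entries
        and name not in listed
    )
```

Calling unlisted_pictures on the "corrupted" file should return the image that is missing from the manifest, while the working file should give an empty list.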

@jgm
Owner

jgm commented Oct 30, 2014

OK, so I think what is happening is that the ODT writer is not including
images in the manifest when the images lack extensions. That's because
the part of the code that builds the manifest relies on extensions to
determine the mime type. I think I can fix this pretty easily.
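
To illustrate the failure mode described above, here is a hedged Python sketch of an extension-based MIME lookup that falls back to inspecting the file's leading bytes when the name has no usable extension. This is not pandoc's code (pandoc is written in Haskell) and not necessarily what the fix does. The names SIGNATURES and guess_image_mime are invented for this example.

```python
import mimetypes

# Leading "magic" bytes for a few common image formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]

def guess_image_mime(name, data):
    """Guess a MIME type from the file name, falling back to the content."""
    mime, _ = mimetypes.guess_type(name)
    if mime is not None:
        return mime
    # No usable extension: compare the first bytes against known signatures.
    for magic, sniffed in SIGNATURES:
        if data.startswith(magic):
            return sniffed
    return None

# The first bytes of a JPEG file, padded with zeros; sample data only.
jpeg_header = b"\xff\xd8\xff\xe0" + b"\x00" * 4

print(guess_image_mime("image.jpg", jpeg_header))   # resolved from the extension
print(guess_image_mime("image", jpeg_header))       # resolved from the content
print(guess_image_mime("image", b"not an image"))   # unknown, so None
```

A lookup that stops at the extension step returns nothing for a file named image, which matches the missing manifest entry reported here.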


@jgm jgm closed this as completed in deaefb1 Oct 30, 2014
@comete-upn
Author

Great! Thanks.
