base64-urls don't work in epubs when the html file is in a subdirectory #3150

Closed
lep opened this issue Oct 6, 2016 · 2 comments

lep commented Oct 6, 2016

$ pandoc --version
pandoc 1.17.1
Compiled with texmath 0.8.6.4, highlighting-kate 0.6.2.1.
Syntax highlighting is supported for the following languages:
    abc, actionscript, ada, agda, apache, asn1, asp, awk, bash, bibtex, boo, c,
    changelog, clojure, cmake, coffee, coldfusion, commonlisp, cpp, cs, css,
    curry, d, diff, djangotemplate, dockerfile, dot, doxygen, doxygenlua, dtd,
    eiffel, elixir, email, erlang, fasm, fortran, fsharp, gcc, glsl,
    gnuassembler, go, hamlet, haskell, haxe, html, idris, ini, isocpp, java,
    javadoc, javascript, json, jsp, julia, kotlin, latex, lex, lilypond,
    literatecurry, literatehaskell, llvm, lua, m4, makefile, mandoc, markdown,
    mathematica, matlab, maxima, mediawiki, metafont, mips, modelines, modula2,
    modula3, monobasic, nasm, noweb, objectivec, objectivecpp, ocaml, octave,
    opencl, pascal, perl, php, pike, postscript, prolog, pure, python, r,
    relaxng, relaxngcompact, rest, rhtml, roff, ruby, rust, scala, scheme, sci,
    sed, sgml, sql, sqlmysql, sqlpostgresql, tcl, tcsh, texinfo, verilog, vhdl,
    xml, xorg, xslt, xul, yacc, yaml, zsh
Default user data directory: /home/user/.pandoc
Copyright (C) 2006-2016 John MacFarlane
Web:  http://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.

I am trying to convert some EPUBs to PDFs, and some of them fail with the following error:

$ pandoc -o failing.pdf failing.epub
pandoc: Could not find image `Text/data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAC0lEQVQIW2NkAAIAAAoAAggA9GkAAAAASUVORK5CYII=', skipping...
pandoc: Unable to convert `Text/data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAC0lEQVQIW2NkAAIAAAoAAggA9GkAAAAASUVORK5CYII=' for use with pdflatex.
! LaTeX Error: File `Text/data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAA
BCAYAAAAfFcSJAAAAC0lEQVQIW2NkAAIAAAoAAggA9GkAAAAASUVORK5CYII=' not found.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              

l.73 ...EQVQIW2NkAAIAAAoAAggA9GkAAAAASUVORK5CYII=}

pandoc: Error producing PDF

The issue is that if the EPUB's content is stored inside a subdirectory, e.g. Text/ch0001.xhtml, pandoc prefixes the URI of any base64-inlined image it encounters with Text/, and the resulting path obviously doesn't exist.
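
To make the mechanism concrete, here is a minimal sketch of what I assume happens to the URL. `prefixWithRoot` is a hypothetical stand-in of mine, not a pandoc function, and it leaves out the path collapsing that the real reader applies afterwards:

```haskell
import System.FilePath.Posix ((</>))

-- Hypothetical stand-in for the reader's behaviour (an assumption, not
-- pandoc's actual code): every image URL is joined onto the directory of
-- the XHTML file that contains it, whether or not it is a file path.
prefixWithRoot :: FilePath -> String -> String
prefixWithRoot root url = root </> url
```

In GHCi, `prefixWithRoot "Text" "images/cover.png"` gives `"Text/images/cover.png"`, which is what you want for a relative path, while `prefixWithRoot "Text" "data:image/png;base64,iVBORw0KGgo="` gives `"Text/data:image/png;base64,iVBORw0KGgo="`, the same shape as the bogus path in the error above.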

I have attached two files as a zip: a working and a failing EPUB file.
epubs.zip

lep commented Oct 6, 2016

OK, I have managed to compile pandoc from source:

$ git show-ref HEAD --abbrev=6
d8600d
$ ./dist/build/pandoc/pandoc --version
1.17.3
<snip>
$ ./dist/build/pandoc/pandoc -o failing.pdf failing.epub
pandoc: Could not find image `Text/data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAC0lEQVQIW2NkAAIAAAoAAggA9GkAAAAASUVORK5CYII=', skipping...
pandoc: Could not find image `Text/data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAIAAAACDbGyAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAARSURBVBhXY3gro4KMGCjkAwCr9R1miWOjyQAAAABJRU5ErkJggg==', skipping...

As you can see, the PDF builds now, but the images are ignored.
So I guess that's okay-ish, but if someone wants to investigate, it's still the same test case as above.

lep commented Oct 7, 2016

This patch seems to do the trick for me, but I have not compiled pandoc with tests enabled.

diff --git a/src/Text/Pandoc/Readers/EPUB.hs b/src/Text/Pandoc/Readers/EPUB.hs
index e547b84..ecbfa0b 100644
--- a/src/Text/Pandoc/Readers/EPUB.hs
+++ b/src/Text/Pandoc/Readers/EPUB.hs
@@ -109,7 +109,9 @@ iq _ = []

 -- Remove relative paths
 renameImages :: FilePath -> Inline -> Inline
-renameImages root (Image attr a (url, b)) = Image attr a (collapseFilePath (root </> url), b)
+renameImages root img@(Image attr a (url, b))
+  | "data:image" `isPrefixOf` url = img
+  | otherwise                     = Image attr a (collapseFilePath (root </> url), b)
 renameImages _ x = x

 imageToPandoc :: FilePath -> Pandoc
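
For anyone who wants to try the idea without building pandoc, the guard can be sketched on plain strings. This is only an illustration under my own assumptions: `resolveImageUrl` is a hypothetical name, it works on the URL alone instead of an `Image` inline, and it skips `collapseFilePath`:

```haskell
import Data.List (isPrefixOf)
import System.FilePath.Posix ((</>))

-- Hypothetical, simplified version of the patched renameImages: it only
-- looks at the URL string and does not collapse ".." segments.
resolveImageUrl :: FilePath -> String -> String
resolveImageUrl root url
  | "data:image" `isPrefixOf` url = url           -- inline data URI: leave untouched
  | otherwise                     = root </> url  -- relative path: resolve against root
```

With this, `resolveImageUrl "Text" "data:image/png;base64,iVBORw0KGgo="` returns the data URI unchanged, and `resolveImageUrl "Text" "images/cover.png"` still returns `"Text/images/cover.png"`. The prefix check mirrors the one in the patch, so it would not catch data URIs with a non-image media type.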

jgm closed this as completed in d2a6533 on Oct 22, 2016