Images with umlauts in image src-attribute are not loaded in PDF document #3271

Open
christianhueser opened this Issue Jan 6, 2017 · 3 comments

@christianhueser

It seems that wkhtmltopdf v0.13-alpha does not display images in PDF documents when the src attribute of an image contains umlauts. I tried both unencoded and encoded URLs. Is this a known issue, and is there a known workaround? Thanks in advance for your help.

@PhilterPaper

Is it only image src attributes, or do other file names written in non-ASCII (accented) text also fail? This could give a clue whether it's a general filename handling problem with accented characters, or that it is isolated to one or two special cases.

Of course, you've checked that you didn't write the file names in, say, Latin-1 while running wkHTMLtoPDF in UTF-8, or vice-versa. That would certainly cause problems with being unable to find files. I take it that your operating system handles such file names OK.
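To illustrate the kind of mismatch I mean, here is a minimal Python sketch. The file name is only an example, and nothing here is specific to wkhtmltopdf; it just shows that the same name becomes different bytes under UTF-8 and Latin-1, and what happens when bytes written in one encoding are read as the other.

```python
# Example file name with a precomposed "ä" (U+00E4); purely illustrative.
name = "\u00e4.jpg"

# The same name is stored as different byte sequences in each encoding.
utf8_bytes = name.encode("utf-8")      # two bytes for "ä"
latin1_bytes = name.encode("latin-1")  # one byte for "ä"

# Reading UTF-8 bytes as if they were Latin-1 yields a different name,
# so a lookup for the original file would fail.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes, latin1_bytes, repr(misread))
```

If the name a program looks up differs from the name on disk in this way, the file simply won't be found.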

@christianhueser
christianhueser commented Jan 6, 2017 edited

Thanks a lot for the quick response. With your comment I was able to narrow down the problem I encountered.

For some reason, if I put a file path containing umlauts, e.g. "file:///path/to/image/ä.jpg", in an image src attribute, the image is not loaded in the PDF document. If I replace the file path with a URL, e.g. "https://url.of.image/ä.jpg", the image is loaded. So wkhtmltopdf treats an image source differently depending on whether it is a URL or a local file path: in the latter case, the image is missing from the resulting PDF document if the path contains umlauts. The same behaviour can be observed for CSS file paths in href attributes of link tags and for JavaScript file paths in src attributes of script tags. File paths containing umlauts in other HTML tags may fail as well, but I haven't checked.

For my application, however, the workaround described above is sufficient.
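In case it helps others, this is roughly how the workaround could be scripted. It is only a sketch under assumptions: `to_http_url` is a hypothetical helper, the asset root `/path/to` and the base URL are placeholders, and it assumes the files under the root are already reachable over HTTP(S) at that base URL.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def to_http_url(local_path: str, root: str, base_url: str) -> str:
    """Map a local file path to a URL under base_url (hypothetical helper).

    Assumes everything under `root` is served at `base_url`.
    """
    # Path of the file relative to the served root directory.
    rel = PurePosixPath(local_path).relative_to(root)
    # Percent-encode non-ASCII characters as UTF-8; "/" is left as-is.
    return base_url.rstrip("/") + "/" + quote(str(rel))


# Placeholder values, not real locations.
url = to_http_url("/path/to/image/\u00e4.jpg", "/path/to", "https://url.of.image/")
print(url)
```

The returned URL would then be used in the src or href attribute in place of the "file:///" path.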

@PhilterPaper

Hmm. Maybe you found that local files are handled with different encoding than URLs. The bottom line is that whenever you use a character which is not straight ASCII (unaccented Latin alphabet), you have to be aware of what character encoding you're using, and what the system is expecting.
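As a small illustration of why "encoded" is ambiguous, the Python sketch below percent-encodes the same example name under two character encodings. Which form, if any, wkhtmltopdf expects for local file paths is not something this thread establishes; the point is only that the two results differ.

```python
from urllib.parse import quote

# Example file name with a precomposed "ä" (U+00E4); purely illustrative.
name = "\u00e4.jpg"

# Percent-encoding depends on the character encoding chosen for the bytes.
as_utf8 = quote(name, encoding="utf-8")
as_latin1 = quote(name, encoding="latin-1")

print(as_utf8)    # UTF-8 form
print(as_latin1)  # Latin-1 form
```

A consumer that decodes the path with a different encoding than the one used to produce it would end up looking for a different file name.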