UTF-8 characters passed as value to external header/footer html files not showing correctly. #2427
Comments
Are you on Windows? |
Also, without a minimal, reproducible test case as requested in the support page this issue cannot be investigated further. |
It's Ubuntu 14.04.1 LTS. I'm using phpwkhtmltopdf on apache server. The generated command is like this:
And in footer.html, I have the JS. The '中文字' in header-left is fine, but the '中文字' passed to footer.html turns into unrecognizable characters. If I place '中文字' directly in footer.html, it works fine. |
I have tried a different JavaScript function (from http://stackoverflow.com/questions/12049620/how-to-get-get-variables-value-in-javascript):
function subst() {
var vars={};
var query = document.location
.toString()
// get the query string
.replace(/^.*?\?/, '')
// and remove any existing hash string (thanks, @vrijdenker)
.replace(/#.*$/, '')
.split('&');
for(var i=0, l=query.length; i<l; i++) {
var aux = decodeURIComponent(query[i]).split('=');
vars[aux[0]] = aux[1];
}
var x=['frompage','topage','page','webpage','section','subsection','subsubsection'];
for (var i in x) {
var y = document.getElementsByClassName(x[i]);
for (var j=0; j<y.length; ++j) y[j].textContent = vars[x[i]];
}
}
Now if I go to footer.html?webpage=中文字, the UTF-8 characters display correctly, but it still does not work when the file is called by wkhtmltopdf. I have already set |
Same here. If the section contains non-ASCII characters, they are not shown correctly. Prehľad rozloženia => Prehľad rozloženia. My footer.html is similar to what @newpen described. |
@ashkulz Hey guys, my colleague found the catch. @newpen
|
@ddeath thanks for the hint. I have changed it to decodeURIComponent(), and the line is also included in footer.html, but it doesn't work for me. My simplified command is like: |
I worked around this issue by adding an ugly JavaScript hack to header.html. I am running wkhtmltopdf on CentOS 7. The header.html is stored as UTF-8 with BOM.
function subst() {
var vars={};
var x=window.location.search.substring(1).split('&');
for (var i in x) {var z=x[i].split('=',2);vars[z[0]] = decodeUTF8(unescape(z[1]));}
var x=['frompage','topage','page','webpage','section','subsection','subsubsection','title'];
for (var i in x) {
var y = document.getElementsByClassName(x[i]);
for (var j=0; j<y.length; ++j) y[j].textContent = vars[x[i]];
}
}
// This is an ugly hack for this bug:
// https://github.com/wkhtmltopdf/wkhtmltopdf/issues/2427
function decodeUTF8(text) {
var i=0;
var replacement = [
{'dec': 'Á', 'enc': 'Ã?'},
{'dec': 'Â', 'enc': 'Â'},
{'dec': 'Ä', 'enc': 'Ä'},
{'dec': 'É', 'enc': 'É'},
{'dec': 'Ó', 'enc': 'Ó'},
{'dec': 'Ô', 'enc': 'Ô'},
{'dec': 'Ö', 'enc': 'Ö'},
{'dec': 'Ú', 'enc': 'Ú'},
{'dec': 'Ü', 'enc': 'Ãœ'},
{'dec': 'ß', 'enc': 'ß'},
{'dec': 'á', 'enc': 'á'},
{'dec': 'â', 'enc': 'â'},
{'dec': 'ä', 'enc': 'ä'},
{'dec': 'ç', 'enc': 'ç'},
{'dec': 'é', 'enc': 'é'},
{'dec': 'ë', 'enc': 'ë'},
{'dec': 'î', 'enc': 'î'},
{'dec': 'ó', 'enc': 'ó'},
{'dec': 'ô', 'enc': 'ô'},
{'dec': 'ö', 'enc': 'ö'},
{'dec': 'ú', 'enc': 'ú'},
{'dec': 'ü', 'enc': 'ü'},
{'dec': 'è', 'enc': 'Ä?'},
{'dec': 'Ê', 'enc': 'Ę'},
{'dec': 'ê', 'enc': 'Ä™'},
{'dec': 'Ì', 'enc': 'Äš'},
{'dec': 'ì', 'enc': 'Ä›'},
{'dec': 'Ò', 'enc': 'Ň'},
{'dec': 'ò', 'enc': 'ň'},
{'dec': 'À', 'enc': 'Å”'},
{'dec': 'à', 'enc': 'Å•'},
{'dec': 'Ù', 'enc': 'Å®'},
{'dec': 'ù', 'enc': 'ů'},
{'dec': 'Û', 'enc': 'Å°'}
];
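for (; i < replacement.length; i++) {
// split/join replaces every occurrence; replace() with a string pattern only replaces the first one
text = text.split(replacement[i].enc).join(replacement[i].dec);
}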
return text;
} |
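The lookup table above only covers the characters listed in it. A more general sketch is possible if one assumes the garbling comes from UTF-8 bytes being read as Latin-1 (ISO-8859-1) characters. The thread does not confirm that cause, and some entries in the table (such as 'Ãœ') suggest a Windows-1252 reading instead, which this sketch would leave unchanged. The name fixLatin1Mojibake is hypothetical.

```javascript
// Hypothetical helper, assuming the text is UTF-8 bytes that were
// misread as Latin-1 characters (one character per byte).
function fixLatin1Mojibake(text) {
  var encoded = '';
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    // A character above 0xFF cannot be a single misread byte,
    // so the assumption does not hold; return the input untouched.
    if (code > 0xFF) {
      return text;
    }
    // Rebuild the original byte as a percent-escape, e.g. 0xC3 -> "%c3".
    encoded += '%' + ('00' + code.toString(16)).slice(-2);
  }
  try {
    // decodeURIComponent interprets the percent-escaped bytes as UTF-8.
    return decodeURIComponent(encoded);
  } catch (e) {
    // Not valid UTF-8 (for example, the text was already correct).
    return text;
  }
}
```

In the subst() function above, this would take the place of decodeUTF8(), but only if the Latin-1 assumption matches what wkhtmltopdf actually produces on the system in question.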
Hi, I am having the same issue. Characters passed in the --replace argument appear differently from the other query params for --header-html. I am using wkhtmltopdf 0.12.1 (with patched qt) on Ubuntu 16.04.
Query string obtained from the header HTML:
I tried using --title "á" instead of --replace, but in this case the character is dropped, as mentioned by other users. |
I have recently found a potential workaround: I URI-encode my UTF-8 characters before passing them to wkhtmltopdf, and then in my header file I decode the query string twice. This seems to get past the issue, but I am still testing it. I will update with a code example once I have confirmed. Has anyone else tried this approach? |
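The commenter did not post their code, so the following is only a minimal sketch of what the double-decoding approach could look like. The function name parseDoubleEncodedQuery is hypothetical, and the second encodeURIComponent call in the usage lines merely simulates the assumption that wkhtmltopdf percent-encodes the already-encoded value again when it builds the query string.

```javascript
// Hypothetical helper: parse a query string whose values were URI-encoded twice.
function parseDoubleEncodedQuery(search) {
  var vars = {};
  var pairs = search.replace(/^\?/, '').split('&');
  for (var i = 0; i < pairs.length; i++) {
    if (!pairs[i]) continue;
    // Split on the first '=' only, and before decoding,
    // so that an '=' inside a decoded value is preserved.
    var idx = pairs[i].indexOf('=');
    var key = idx === -1 ? pairs[i] : pairs[i].slice(0, idx);
    var raw = idx === -1 ? '' : pairs[i].slice(idx + 1);
    var once = decodeURIComponent(raw);
    var twice;
    try {
      twice = decodeURIComponent(once);
    } catch (e) {
      // The value was only encoded once and contains a literal '%'.
      twice = once;
    }
    vars[decodeURIComponent(key)] = twice;
  }
  return vars;
}

// Simulated round trip (assumption: the tool encodes the value a second time).
var encodedByCaller = encodeURIComponent('中文字');
var simulatedSearch = '?webpage=' + encodeURIComponent(encodedByCaller);
console.log(parseDoubleEncodedQuery(simulatedSearch).webpage); // 中文字
```

In a header or footer file, the call would be parseDoubleEncodedQuery(window.location.search). Values that were never double-encoded but contain escape-like sequences such as "%41" would be decoded one time too many, so this only suits parameters the caller controls.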
Is this still an issue in 0.12.5? |
As a workaround, I decode the params that come from --replace with
I am affected by this problem on 0.12.3; I don't know about 0.12.5. |
I had a similar problem and found this solution: I pass a base64-encoded string as the --replace parameter, and then decode it in footer.html with this JavaScript function:
function b64DecodeUnicode(str) {
return decodeURIComponent(Array.prototype.map.call(atob(str), function(c) {
return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2)
}).join(''))
}
Hope this helps someone. |
I am using --replace to pass some values (Chinese characters) to my external footer.html (using --footer-html), but the Chinese characters do not show correctly. Chinese characters hard-coded in footer.html itself (not passed as a variable) print with no problem, and using an option such as header-left to print the Chinese characters also works fine.
In my footer.html, I already have <meta charset="utf-8">. But since characters written in this file print correctly, I think it should not be an encoding problem. Any idea regarding this? Thanks!
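The comment above shows only the decoding side. As a hedged sketch, the matching encoder could look like the following. b64EncodeUnicode is a hypothetical name, and in practice the encoding would run wherever the wkhtmltopdf command is built (which may not be JavaScript at all). The block repeats the decoder from the comment so that it runs on its own in a browser or a recent Node.js, where btoa and atob are available.

```javascript
// Hypothetical counterpart to b64DecodeUnicode: UTF-8 encode, then base64.
function b64EncodeUnicode(str) {
  // encodeURIComponent yields "%XX" escapes for each UTF-8 byte;
  // turn each escape into a single character so btoa sees one char per byte.
  var bytes = encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
  return btoa(bytes);
}

// Decoder as posted in the thread.
function b64DecodeUnicode(str) {
  return decodeURIComponent(Array.prototype.map.call(atob(str), function (c) {
    return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
  }).join(''));
}

// Round trip: the base64 text is plain ASCII, so it avoids the
// non-ASCII characters that get garbled in the query string.
var b64Value = b64EncodeUnicode('中文字');
console.log(b64Value, b64DecodeUnicode(b64Value));
```

Standard base64 output can contain '+', '/' and '='. How wkhtmltopdf passes those characters through the query string is not verified here, so it is worth testing values that produce them.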