UTF-8 in header/footer (skipping non-ASCII characters) (Again) #4228
Comments
First, are you absolutely certain that the file you showed (with text containing "é") is UTF-8 and not Latin-1? It's very easy for an editor to slip into Latin-1 etc. mode without you realizing it. Second, if this is being processed by the shell command line, are you absolutely certain that it passes in UTF-8 characters uncorrupted? Often, command shells are single byte (Latin-1, etc.) and may have done something nasty to the text you're passing in. |
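One way to settle the first question is to look at the raw bytes of the source file rather than trusting the editor's display. The following is only a sketch: the helper name `non_ascii_lines` is hypothetical, and it assumes a file that fails UTF-8 decoding was most likely saved as Latin-1 or a similar single-byte encoding.

```python
import sys
from pathlib import Path


def non_ascii_lines(raw: bytes):
    """Yield (line_number, is_valid_utf8, line_bytes) for lines holding bytes >= 0x80."""
    for number, line in enumerate(raw.splitlines(), start=1):
        if all(byte < 0x80 for byte in line):
            continue  # pure ASCII looks the same in UTF-8 and Latin-1
        try:
            line.decode("utf-8")
            valid = True
        except UnicodeDecodeError:
            valid = False
        yield number, valid, line


if __name__ == "__main__":
    # Usage: python check_encoding.py path/to/file.rb
    for path in sys.argv[1:]:
        for number, valid, line in non_ascii_lines(Path(path).read_bytes()):
            verdict = "valid UTF-8" if valid else "NOT valid UTF-8 (Latin-1?)"
            print(f"{path}:{number}: {verdict}: {line!r}")
```

A UTF-8 "é" shows up as the two bytes `\xc3\xa9`; a Latin-1 "é" is the single byte `\xe9`, which fails the decode.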
Hi Phil; Thanks for your suggestions. Yes, I'm pretty certain everything is passed as UTF-8. The source file containing the "é" character is a ruby file, annotated with "# encoding: UTF-8", and edited with the Sublime Text editor. The character is passed to the wicked_pdf gem in a call such as this:
when I check the encoding on footer_text, it reports as UTF-8:
The wicked_pdf gem then generates the command line calling wkhtmltopdf:
and the bash shell in which wkhtmltopdf runs is operating in UTF-8:
So, I'm not sure what else I can control? Tim |
And, even if I explicitly reference "é" as "\u00E9", it's still being dropped by wkhtmltopdf 0.12.3. |
If an accented character is vanishing, my best guess is that it is reaching the engine as an invalid character (corrupted somewhere along the line), or even just dropped. If it definitely started out as a UTF-8 (two byte) character, most likely it was in the shell/command-line handling that something bad happened. Per one of the referenced older issues, have you checked the "locale" settings all the way down the line, to make sure you're properly handling UTF-8 and not treating it as Latin-1 or even ASCII? You might want to either insert some printout code in wkHTMLtoPDF or make up a little dummy program to see if the UTF-8 character is getting to wkHTMLtoPDF, or it's being dropped or corrupted somewhere earlier. Beyond that, I'm out of ideas. |
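The "little dummy program" idea could look roughly like the sketch below: a stand-in script that prints the exact bytes of each command-line argument it receives, so it can be run in place of wkhtmltopdf. The function names are hypothetical, and it assumes a POSIX system where `os.fsencode` recovers the original argument bytes.

```python
import os
import sys


def hex_bytes(raw: bytes) -> str:
    """Format bytes as space-separated hex pairs, e.g. 'c3 a9'."""
    return " ".join(f"{byte:02x}" for byte in raw)


def describe(raw: bytes) -> str:
    """Show an argument as text, as hex, and say whether it is valid UTF-8."""
    try:
        text = raw.decode("utf-8")
        status = "valid UTF-8"
    except UnicodeDecodeError:
        text = raw.decode("latin-1")  # Latin-1 can decode any byte sequence
        status = "NOT valid UTF-8 (shown as Latin-1)"
    return f"{text!r}: {hex_bytes(raw)} [{status}]"


if __name__ == "__main__":
    # os.fsencode undoes Python's argv decoding and returns the raw bytes.
    for index, arg in enumerate(sys.argv[1:], start=1):
        print(f"argv[{index}] {describe(os.fsencode(arg))}")
```

If the wrapper lets you configure the executable path, pointing it at this script for one run shows what actually arrives: `c3 a9` means the "é" came through as UTF-8, `e9` means it was converted to Latin-1, and a missing byte means it was dropped before the binary was ever invoked.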
To complicate matters, I'm finding that the accented character is NOT dropped when wkHTMLtoPDF generates the PDF in my development environment (Mac OS/Ruby 2.2.5). But in production, when running under RHEL 7.3/Ruby 2.2.5, the character is dropped, even with the locale of the shell clearly operating in en_US.UTF-8. Yes, I think I'm going to have to dig into the receiving end of wkHTMLtoPDF as you suggested, Phil. If I uncover any reasons for the character being dropped or misinterpreted, I'll post back here. |
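One thing that may be worth ruling out here: the `locale` output of an interactive shell is not necessarily what the application server process (and therefore the wkhtmltopdf child process) sees, since services are often started with a minimal environment. The sketch below applies the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) to an environment mapping; the function names are hypothetical, and falling back to "C" when nothing is set is an assumption about the platform default.

```python
import os


def effective_ctype(env) -> str:
    """Return the locale name that governs character handling (LC_CTYPE)."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:  # an empty value counts the same as unset
            return value
    return "C"  # assumed default when nothing is set


def looks_utf8(locale_name: str) -> bool:
    """True if the codeset part of a name like 'en_US.UTF-8' is UTF-8."""
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"


if __name__ == "__main__":
    name = effective_ctype(os.environ)
    print(f"effective LC_CTYPE: {name} (UTF-8: {looks_utf8(name)})")
```

To be meaningful, the check has to run inside the same process environment that launches wkhtmltopdf (for example, logged from the web application itself), not from a login shell.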
I had the exact same thing happen to me using footer-text. All German umlauts would disappear from the generated PDF. However, what *did* work for me was using footer-html with <meta charset="utf-8"> instead of footer-text. @enwood Might be worth a shot? It doesn't really fix the original error, but it might be a workaround. |
Much obliged, Kai! Thanks for the tip. I will give that a shot and report back. Tim |
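The footer-html workaround mentioned above can be sketched as follows: write the footer text into a small HTML file that declares its charset, then pass that file to wkhtmltopdf with `--footer-html` so the accented text never travels through the command line. The template, function names, and file names are illustrative assumptions, not what wicked_pdf or any other wrapper does internally.

```python
import html
from pathlib import Path

# Minimal footer document; the charset declaration is the important part.
FOOTER_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"></head>
<body><div style="text-align: center;">{text}</div></body>
</html>
"""


def write_footer_html(text: str, path: Path) -> Path:
    """Write an HTML-escaped footer to `path`, encoded as UTF-8."""
    path.write_text(FOOTER_TEMPLATE.format(text=html.escape(text)), encoding="utf-8")
    return path


def build_command(footer: Path, source: str, output: str) -> list:
    """Argument list for wkhtmltopdf using an HTML footer instead of footer text."""
    return ["wkhtmltopdf", "--footer-html", str(footer), source, output]


if __name__ == "__main__":
    footer = write_footer_html("Issued/Issu\u00e9", Path("footer.html"))
    print(build_command(footer, "input.html", "output.pdf"))
```

The resulting list can be handed to `subprocess.run`; only an ASCII file path crosses the command line, which sidesteps whatever is dropping the non-ASCII argument.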
I had the same problems with cyrillic symbols. |
see also #4777 |
I'm opening this issue again (formerly Issue 2002) as I'm still seeing this problem with version 0.12.3 (with patched qt) operating under Red Hat Enterprise Linux 7.3.
In my application, the wicked_pdf gem is being used to drive wkhtmltopdf.
When wicked_pdf passes UTF-8 characters to wkhtmltopdf via the command line, they are being dropped by wkhtmltopdf. In this example, I pass the text "Issued/Issué" as footer text. I can see the accented (UTF-8) footer text ("Issué") being passed in the command line:
but the resulting PDF continues to be missing the "é" in "Issué" when it appears in the rendered footer.
Red Hat Enterprise Linux Server release 7.3 (Maipo)
[apps@vapp02t lts]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Any guidance would be appreciated!