Encoding issue with unicode #144
Comments
Just what I was hoping for, a use case for fleshing out encoding support! Is this critical for 0.1.0, or can it be addressed after?
It would be a nice wedge for me to push on getting an update pushed.
On Tue, Jan 29, 2013 at 11:46 AM, Dan Allen <notifications@github.com> wrote:
On Tue, Jan 29, 2013 at 10:49 AM, Ryan Waldron <notifications@github.com> wrote:
Gotcha. Perhaps we can schedule it for a 0.1.x point release, which will be
Aha! I've got a fix that will work across Ruby versions. Turns out, this is ERB biting us again. Previously, I had added the magic encoding directive to all the block-level templates, thinking those were the only ones that would be invoked directly. However, the substitutions load the inline templates directly and concatenate their output to the string. That's where the encodings are getting mixed up. All I needed to do was add the magic encoding directive to all the ERB templates, and it all works. I also noticed that the link macro is catching the line ending as part of the URL, which it shouldn't, so I'm fixing that too. I can have a patch ready with a test shortly.
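For readers unfamiliar with the directive mentioned above, here is a minimal sketch of an ERB template that starts with a magic encoding comment. The template text, the constant `INLINE_TEMPLATE`, and the helper `render_inline` are hypothetical and are not taken from asciidoctor; the exact behavior of templates without the comment depends on the Ruby version in use.

```ruby
require 'erb'

# Hypothetical inline template (not asciidoctor's). The first line is the
# ERB magic comment that declares the encoding of the compiled template.
INLINE_TEMPLATE = "<%# encoding: UTF-8 %>\n<a href=\"<%= target %>\"><%= text %></a>"

# Hypothetical helper: render the template with two local variables.
def render_inline(target, text)
  ERB.new(INLINE_TEMPLATE).result(binding)
end

# "\u3078\u3068" is the two-character Japanese text from the reported content.
html = render_inline("https://github.com/foo-users/foo", "\u3078\u3068")
puts html.encoding
```

Because the rendered fragment is tagged UTF-8, it can be concatenated onto other UTF-8 text without an encoding clash.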
- needed to add magic encoding line to all erb templates
- add example from issue to encodings test case
resolves issue #144 - encoding issue w/ utf-8
Reported (and fix suggested) by @brianmario
For this content:
https://github.com/foo-users/foo
へと
vicmd
キーマップを足してみている試み、アニメーションgifです。
この辺りでやっています。
https://github.com/foo/bar/compare/master...tb;keymap
in erebor/asciidoctor@master:lib/asciidoctor/substituters.rb#L358
m[2] is "https://github.com/foo-users/foo\nへと"
Notice it matched the newline and first couple of unicode chars from the next line.
That's the first part.
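As a rough illustration of the first part, the sketch below uses a stand-in pattern that stops at whitespace, so the match ends at the line break. `LINK_PATTERN` is an assumption made for this example; it is not the regular expression used in substituters.rb.

```ruby
# Stand-in pattern (an assumption, not asciidoctor's): a URL runs until the
# first whitespace character, square bracket, or angle bracket.
LINK_PATTERN = %r{https?://[^\s\[\]<>]+}

# First two lines of the reported content; "\u3078\u3068" is the Japanese
# text that follows the URL on the next line.
text = "https://github.com/foo-users/foo\n\u3078\u3068\nvicmd"

matched = text[LINK_PATTERN]
puts matched
```

With this pattern the newline terminates the match, so the characters on the following line are not pulled into the URL.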
Second, in erebor/asciidoctor@master:lib/asciidoctor/substituters.rb#L378 the resulting string needs to be in the same encoding as the original (the result var in this case). Quick fix might be to do something like
r = "…"
r.force_encoding(result.encoding) if r.respond_to?(:force_encoding)
r
Brian says that fixed it for him locally.
What's happening there is the resulting string inside the block (the one we're building up) ends up tagged as US-ASCII but result is UTF-8, so when gsub goes to join the strings back together it blows up.
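The mismatch and the suggested quick fix can be reproduced outside asciidoctor. This is a minimal sketch under stated assumptions: the helper `retag_like` and the sample strings are hypothetical, and plain concatenation stands in for the join that gsub performs.

```ruby
# Hypothetical helper (not from asciidoctor): return a copy of the fragment
# tagged with the same encoding as the target string.
def retag_like(fragment, target)
  copy = fragment.dup
  copy.force_encoding(target.encoding) if copy.respond_to?(:force_encoding)
  copy
end

# UTF-8 string that contains a non-ASCII character.
result = "caf\u00e9 "
# The same kind of bytes, but mislabelled as US-ASCII.
fragment = "\u3078\u3068".dup.force_encoding("US-ASCII")

# Ruby reports that the two strings cannot be combined as they stand.
mismatch = Encoding.compatible?(result, fragment).nil?

# After retagging, the join succeeds and the result stays UTF-8.
joined = result + retag_like(fragment, result)
puts joined.encoding
```

`force_encoding` only changes the label on the bytes and does not transcode them, which is why it suits this case: the bytes were already valid UTF-8 and only the tag was wrong.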