unicode utf-8 encoding causing issue for xml parser #40
Comments
@drchainmail,
UTF-8 should be fine for any XML parser, as long as the string is actually encoded as UTF-8. Can you provide a simple example where Builder is not generating proper UTF-8 output (given UTF-8 input)?
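One way to check whether a string is "actually encoded as UTF-8" before handing it to Builder is to test its bytes for validity. This is a minimal sketch, assuming Ruby 1.9 or later (the `String#valid_encoding?` and `String#b` calls do not exist on 1.8.7); the helper name `utf8_clean?` is made up for illustration and is not part of Builder.

```ruby
# Hypothetical helper: true if the raw bytes of +str+ form valid UTF-8.
# Works on a copy so the caller's string keeps its original encoding tag.
def utf8_clean?(str)
  str.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Escaped form of the sample text used in this thread (valid UTF-8).
good = "I\u00F1t\u00EBrn\u00E2ti\u00F4n\u00E0l"

# The same leading characters as raw Latin-1 bytes, which are not valid UTF-8.
bad = "I\xF1t\xEBrn".b

puts utf8_clean?(good)
puts utf8_clean?(bad)
```

If the second kind of string reaches the XML output, a receiving parser that trusts the `encoding="UTF-8"` declaration may reject the document.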
I have a similar issue. In the XML file all special characters are converted, but I need them unchanged for further processing.
When I try this out, I get an XML file with I#241;t#235;rn#226;ti#244;n#224;l (I removed the "&" here, otherwise the character-reference conversion would not be visible) instead of Iñtërnâtiônàl.
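For downstream processing that needs the literal characters, decimal numeric character references such as `&#241;` can be turned back into the characters they stand for. The following is only a sketch: the helper name `decode_numeric_entities` is hypothetical, it handles decimal references only (not hexadecimal `&#x...;` or named entities), and it is not something Builder provides.

```ruby
# Hypothetical helper: replace decimal numeric character references
# (e.g. "&#241;") with the corresponding UTF-8 characters.
def decode_numeric_entities(text)
  text.gsub(/&#(\d+);/) { [$1.to_i].pack("U") }
end

escaped = "I&#241;t&#235;rn&#226;ti&#244;n&#224;l"
puts decode_numeric_entities(escaped)
```

A conforming XML parser performs this decoding itself, so both forms should parse to the same text; a step like this only matters for tools that read the file as plain text.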
I can't reproduce this.
# -*- coding: utf-8 -*-
require 'builder'
$KCODE = 'UTF8'
xml = Builder::XmlMarkup.new
xml.instruct!(:xml, :encoding => "UTF-8")
xml.sample("Iñtërnâtiônàl")
puts xml.target!
Gives:
What version of Ruby are you using? 2.0 reports that $KCODE is ignored.
I use Ruby 1.8.7.
Even with 1.8.7 I get:
Any other environmental issues that may affect this (OS, etc.)?
We use Windows Server with MS IIS as the web server, Ruby 1.8.7, and Rails 3.0.20. Please take a look at my post on Stack Overflow: I added a few screenshots; maybe the problem is the file encoding of the generated XML file.
You are using the latest version of Builder, right? (3.2.2)
OK, actually not, but I updated the builder gem to 3.2.2 and tested it again, with no change. The special characters are still converted into ASCII.
Wait, I got the following output:
C:\Appl_Ruby>gem dependency --reverse-dependencies builder
Gem builder-3.2.2
With version 2.1.2 I get the same output as you. It's a version issue. According to the CHANGELOG, you need at least version 2.2.0 to get the behavior you want.
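To guard against an older copy of the gem being picked up at runtime, the version comparison can be sketched with RubyGems' `Gem::Version`. The helper name `builder_new_enough?` is hypothetical, the 2.2.0 threshold is the one quoted from the CHANGELOG above, and how the running application reports its Builder version is left as an assumption.

```ruby
require 'rubygems'

# Hypothetical helper: true if +version_string+ is at least +minimum+.
# Gem::Version compares segment by segment, so "2.10.0" sorts after "2.2.0".
def builder_new_enough?(version_string, minimum = "2.2.0")
  Gem::Version.new(version_string) >= Gem::Version.new(minimum)
end

puts builder_new_enough?("2.1.2")
puts builder_new_enough?("3.2.2")
```

In an application, the string passed in would come from whatever version is actually loaded, which may differ from the newest version installed on the machine.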
Hi,
I am trying to chase down this issue, and it comes down to Builder. Rails ActiveResource uses ActiveSupport's to_xml method, which in turn calls Builder to generate the XML. Builder generates the XML with the encoding set to UTF-8 and emits the Unicode characters as-is, without escaping them to ASCII entities. However, this causes the XML parser on the receiving end to fail. I believe the reason is that if there are Unicode characters in the XML, the encoding must be UTF-16. See: http://www.w3schools.com/xml/xml_encoding.asp . The only time it works is when the encoding is set to UTF-16. Thoughts?
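For reference when comparing the two encodings: UTF-8 and UTF-16 can both represent every Unicode code point, so the same text should round-trip between them without loss. The sketch below assumes Ruby 1.9 or later; the helper name `encoded_sizes` is made up for illustration.

```ruby
# Hypothetical helper: byte sizes of +text+ in UTF-8 and UTF-16LE,
# plus whether converting UTF-16LE back to UTF-8 reproduces the original.
def encoded_sizes(text)
  utf8  = text.encode(Encoding::UTF_8)
  utf16 = text.encode(Encoding::UTF_16LE)
  {
    :utf8_bytes  => utf8.bytesize,
    :utf16_bytes => utf16.bytesize,
    :round_trips => utf16.encode(Encoding::UTF_8) == utf8
  }
end

# Escaped form of the sample text used in this thread.
sample = "I\u00F1t\u00EBrn\u00E2ti\u00F4n\u00E0l"
p encoded_sizes(sample)
```

If a receiving parser only succeeds with UTF-16, it may be worth checking whether the bytes on the wire really match the declared encoding, rather than assuming UTF-8 cannot carry these characters.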