String encoding #1198
Can someone tell me what's wrong with the following code? I'm developing an app on TorqueBox, but I'm running into weird encoding errors!
My system encoding is Windows-1252.
s = "OoaAçÇãÚú$%()"
@headius I tested against 1.7.5 and 1.7.6 and got the same results :(
Ok, I'm seeing the exact same output from MRI 2.0.0 as from JRuby on your example script (JRuby master, but 1.7.5+ should be the same). Not sure if this will paste right, but...
If you are actually seeing a difference between JRuby and MRI, perhaps you can add a screenshot to that repository? I can't reproduce here.
My system: OS X 10.8.x, JRuby 9000, Java 7u40, system encoding = UTF-8.
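Since the two machines in this thread differ mainly in system encoding, it may help to print the encodings Ruby itself reports on each one. The following is only a sketch; `encoding_report` is a hypothetical helper name, not part of JRuby or MRI, and it uses only the standard `Encoding` and `IO` APIs.

```ruby
# Hypothetical helper: collect the encodings that affect how strings
# are read from source files and written to the console.
def encoding_report
  {
    source: __ENCODING__,                         # encoding of this source file
    default_external: Encoding.default_external,  # default for IO data
    default_internal: Encoding.default_internal,  # nil unless explicitly set
    locale: Encoding.find("locale"),              # encoding derived from the system locale
    stdout_external: STDOUT.external_encoding     # may be nil if never set
  }
end

encoding_report.each { |name, enc| puts "#{name}: #{enc.inspect}" }
```

Running this under both JRuby and MRI on the same Windows machine would show whether the two runtimes disagree about any of these values.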
@headius You actually got the output I expected! I installed Ruby 2.0 on my Windows machine and I got the correct results.
Here's the screenshot for MRI
and here's the screenshot for JRuby
I'm getting all sorts of errors in my app when I use String methods (unpack, gsub, ...) and I think it's all related to this issue.
Ok so I am just throwing this out there since I went about this all wrong...
If I capture the output to a file and compare JRuby against MRI on both Windows and MacOS, those chars are identical. If I run it without capturing, all three lines look exactly the same on Windows, whereas when I view the saved output in an editor capable of displaying UTF-8, only the bottom one renders properly.
So I am convinced this is purely a terminal affordance thing. MRI is clearly doing something else when it writes to the console, because if I redirect MRI output on Windows to a file and cat it, and then cat what JRuby generates, they are identical as well. Sleuthing in MRI code now.
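To illustrate why identical bytes in a file can still render differently on a console, here is a small sketch using the string from the original report. `bytes_in` and `misread_as` are hypothetical helper names, and Windows-1252 stands in for whatever codepage the console actually assumes.

```ruby
# Bytes a string occupies once transcoded to the given encoding.
def bytes_in(str, encoding)
  str.encode(encoding).bytes.to_a
end

# What a reader sees when bytes written in `actual` are decoded as `assumed`.
# String#encode returns a copy, so the caller's string is not modified.
def misread_as(str, actual, assumed)
  raw = str.encode(actual).force_encoding(assumed)
  raw.encode("UTF-8", invalid: :replace, undef: :replace)
end

s = "OoaAçÇãÚú$%()"
puts bytes_in(s, "UTF-8").length         # 18: each accented char takes two bytes
puts bytes_in(s, "Windows-1252").length  # 13: one byte per char
puts misread_as(s, "UTF-8", "Windows-1252")
```

The last line shows the kind of mangling a console produces when it decodes UTF-8 output with a single-byte codepage, even though the bytes themselves match what MRI and JRuby both write to a file.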
Amazing if this is totally fixed but it seems to work and logically it seems like it should work. I discovered System.console() which seems capable of taking a Java String and converting it to the underlying codepage of the Windows console. I suspect the part which will fail is the facility for what to do on transcoding error (Java likes to print '?'). That can be a followup bug if someone can make that happen.
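For reference, the transcoding-error behavior mentioned above can be sketched at the Ruby level. This is not the System.console() change itself, only an illustration of converting a string to a console codepage and substituting '?' for characters the codepage cannot represent. `to_console_string` is a hypothetical name and Windows-1252 is an assumed codepage.

```ruby
# Hypothetical helper: transcode to an assumed console codepage,
# replacing anything unrepresentable with '?' instead of raising.
def to_console_string(str, codepage = "Windows-1252")
  str.encode(codepage, invalid: :replace, undef: :replace, replace: "?")
end

converted = to_console_string("ç→")
puts converted.encoding           # the target codepage
puts converted.bytes.to_a.inspect # the arrow has no Windows-1252 mapping, so it becomes '?'
```

Without the `undef: :replace` option, `String#encode` raises `Encoding::UndefinedConversionError` for such characters, which is the other policy a console writer could choose.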
Note in case this fails utterly...we can use WriteConsoleW and a couple of Windows methods using jnr-posix but that seems like an ugly set of code. Let's hope we don't need to go there.