Converting Chinese text encoded in GB18030 to UTF-8 doesn't work #3411
We're still having this issue with JRuby 9.0.3.0. Could this be related to our environment?
@naag Venturing a guess: we may be a little out of date in our Oniguruma translation tables, or perhaps there is a bug somewhere. JRuby 9k uses this port for all transcoding, and it should work as well as MRI.
I'll have a look at the GB18030 transcoding stuff and see if we're missing something.
<3 Thanks a lot guys!
Chinese characters encoded in GB18030 cannot be converted to UTF-8. If the source encoding is GB2312, everything works properly.
Here is an example:
Linux system:
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty
When I run these lines on ruby 2.1.6p336 (2015-04-13 revision 50298) [x86_64-linux] I get the following results:
As you can see, both lines return the same content.
But if I run these lines on jruby 9.0.1.0 (2.2.2) 2015-09-02 583f336 Java HotSpot(TM) 64-Bit Server VM 25.60-b23 on 1.8.0_60-b27 +jit [linux-amd64] I get a different result:
As you can see the second line is not able to encode the string in UTF-8.
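A rough sketch of the kind of comparison described above, for anyone who wants to reproduce it. This is not the snippet from the original report: the sample string and the `to_utf8` helper are hypothetical, and the only real API assumed is `String#encode` / `String#force_encoding` with the `GB2312` and `GB18030` encoding names.

```ruby
# Hypothetical sample text ("中文"), written with escapes so the
# source file's own encoding does not matter.
SAMPLE = "\u4E2D\u6587"

# Hypothetical helper: tag raw bytes with their source encoding,
# then transcode to UTF-8. Any encoding error is reported instead
# of raised, so both cases can be printed side by side.
def to_utf8(bytes, source_encoding)
  bytes.dup.force_encoding(source_encoding).encode("UTF-8")
rescue EncodingError => e
  "conversion failed: #{e.class}"
end

# Build the same text in both Chinese encodings, as raw bytes.
gb2312_bytes  = SAMPLE.encode("GB2312").b
gb18030_bytes = SAMPLE.encode("GB18030").b

puts to_utf8(gb2312_bytes,  "GB2312").inspect
puts to_utf8(gb18030_bytes, "GB18030").inspect
```

If transcoding behaves the same for both encodings, the two printed lines should match; a difference on the second line would correspond to the behaviour reported here.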
I would be very thankful for any help on this issue.