Issue when splitting an encoded string with specific characters #5714

chrisdlf opened this issue Apr 25, 2019 · 1 comment



commented Apr 25, 2019


$ bin/jruby -v
jruby (2.5.3) 2019-04-23 1679826 Java HotSpot(TM) 64-Bit Server VM 25.131-b11 on 1.8.0_131-b11 +jit [darwin-x86_64]

Expected Behavior

Splitting an encoded string on a null-character delimiter should return the expected array of strings.
Example script (test.rb):

str1 = "AA\0BB\0CC".encode('utf-16le')
str2 = "\0".encode('utf-16le')
array = str1.split(str2)
puts array.inspect

Expected result (CRuby):

$ ruby test.rb
["AA", "BB", "CC"]

Actual Behavior

JRuby does not properly split the string:

$ jruby test.rb
["", "", "CC"]

The issue is in the indexOf() method. This method looks for the index of a specified substring (or character) in a byte array without taking the byte width of the encoded characters into account, so it can report a match that starts in the middle of a character.

In this example, the byte array related to str1 is:
byte_array => [65, 0, 65, 0, 0, 0, 66, 0, 66, 0, 0, 0, 67, 0, 67, 0]
and the delimiter character (str2) is:
delim => [0, 0]
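
The two byte arrays above can be reproduced with String#bytes. This is only a small sketch to make the layout visible; the variable names byte_array and delim are taken from the listing above, not from JRuby's source.

```ruby
str1 = "AA\0BB\0CC".encode('utf-16le')
str2 = "\0".encode('utf-16le')

# In UTF-16LE each of these characters is one 2-byte code unit,
# stored low byte first, so "A" becomes [65, 0] and "\0" becomes [0, 0].
byte_array = str1.bytes
delim = str2.bytes

p byte_array  # => [65, 0, 65, 0, 0, 0, 66, 0, 66, 0, 0, 0, 67, 0, 67, 0]
p delim       # => [0, 0]
```

Note that the high byte of every "A" is also 0, which is what makes a byte-level search for [0, 0] ambiguous.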

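The first time it is called, indexOf() matches byte_array[3] and byte_array[4] and returns 3. That match straddles two characters: byte_array[3] is the high byte of the second "A" and byte_array[4] is the low byte of the null character. It should instead match byte_array[4] and byte_array[5] and return 4.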
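
The difference between the two offsets can be sketched in plain Ruby. The helpers naive_index and aligned_index below are hypothetical and written only for illustration; they are not JRuby's indexOf() and not necessarily how the fix is implemented. The aligned version assumes fixed 2-byte code units and an aligned start offset, which is a simplification.

```ruby
# Hypothetical helper: byte-level search that ignores character boundaries.
def naive_index(bytes, pattern, start = 0)
  (start..bytes.size - pattern.size).find do |i|
    bytes[i, pattern.size] == pattern
  end
end

# Hypothetical helper: same search, but only offsets that are a multiple
# of the code unit width (2 bytes for UTF-16LE) are considered.
def aligned_index(bytes, pattern, width, start = 0)
  (start..bytes.size - pattern.size).step(width).find do |i|
    bytes[i, pattern.size] == pattern
  end
end

byte_array = "AA\0BB\0CC".encode('utf-16le').bytes
delim = "\0".encode('utf-16le').bytes

p naive_index(byte_array, delim)       # => 3 (starts inside the second "A")
p aligned_index(byte_array, delim, 2)  # => 4 (starts at the null character)
```

Restricting candidate offsets to character boundaries is one way to avoid the false match; a real implementation would need to take the boundaries from the string's encoding rather than a fixed width.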


Contributor Author

commented Apr 25, 2019

I have submitted a PR with a possible fix (#5715).

@chrisdlf chrisdlf referenced this issue Apr 25, 2019


Add support to winreg main functions #149

4 of 4 tasks complete

@kares kares added this to the JRuby milestone May 7, 2019
