Issue when splitting an encoded string with specific characters #5714

Closed
n00tmeg opened this issue Apr 25, 2019 · 1 comment · Fixed by #5715

n00tmeg commented Apr 25, 2019

Environment

$ bin/jruby -v
jruby 9.2.8.0-SNAPSHOT (2.5.3) 2019-04-23 1679826 Java HotSpot(TM) 64-Bit Server VM 25.131-b11 on 1.8.0_131-b11 +jit [darwin-x86_64]

Expected Behavior

Splitting an encoded string on a null-character delimiter should return the expected array of strings.
Example script (test.rb):

str1 = "AA\0BB\0CC".encode('utf-16le')
str2 = "\0".encode('utf-16le')
array = str1.split(str2)
puts array.inspect

Expected result (CRuby):

$ ruby test.rb
["AA", "BB", "CC"]

Actual Behavior

JRuby does not properly split the string:

$ jruby test.rb
["", "", "CC"]

The issue is in the indexOf() method in RubyString.java (https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/RubyString.java#L4258). This method searches a byte array for the index of a given substring (or character) without taking the byte width of the encoded characters into account, so a match can start in the middle of a character.

In this example, the byte array related to str1 is:
byte_array => [65, 0, 65, 0, 0, 0, 66, 0, 66, 0, 0, 0, 67, 0, 67, 0]
and the delimiter character (str2) is:
delim => [0, 0]

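The first time it is called, indexOf() matches byte_array[3] and byte_array[4], a pair that straddles two characters (the high byte of the second "A" and the low byte of the delimiter). It should instead match byte_array[4] and byte_array[5] and return 4.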
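To make the off-by-one match concrete, here is a minimal Ruby sketch that contrasts a plain byte-wise search with one that only tests offsets on a 2-byte boundary. The helpers `naive_index` and `aligned_index` are hypothetical names written for this illustration; they are not JRuby's indexOf() and not necessarily the approach taken in #5715. The sketch also assumes the search starts on a character boundary and that stepping by one UTF-16 code unit (2 bytes) is enough for this input.

```ruby
# Hypothetical helpers for illustration only; not JRuby's implementation.

# Byte-wise search: tests every byte offset, ignoring character boundaries.
def naive_index(bytes, delim, start = 0)
  (start..bytes.length - delim.length).find do |i|
    bytes[i, delim.length] == delim
  end
end

# Boundary-aware search: only tests offsets that are a multiple of `width`
# bytes past `start` (assumes `start` itself is on a character boundary).
def aligned_index(bytes, delim, width, start = 0)
  (start..bytes.length - delim.length).step(width).find do |i|
    bytes[i, delim.length] == delim
  end
end

bytes = "AA\0BB\0CC".encode('utf-16le').bytes
delim = "\0".encode('utf-16le').bytes

puts naive_index(bytes, delim)      # 3 (straddles two characters)
puts aligned_index(bytes, delim, 2) # 4 (start of the delimiter character)
```

The byte-wise search stops at offset 3 because the trailing zero byte of "A" and the leading zero byte of the delimiter happen to form [0, 0]; restricting candidates to code-unit boundaries skips that false match.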


n00tmeg commented Apr 25, 2019

I have submitted a PR with a possible fix (#5715).
