String rindex does not work properly with matchdata #1688

Closed
zhaohanweng opened this Issue May 9, 2014 · 0 comments

@zhaohanweng

Case 1: message ends with the matched word

message = "I love this new status update"
=> "I love this new status update"
matched_string = message.match(/status update/)[0]
=> "status update"
message.rindex(matched_string)
=> 16
message.rindex("status update")
=> 16

Case 2: message does not end with the matched word

message = "I love this new status update..."
=> "I love this new status update..."
matched_string = message.match(/status update/)[0]
=> "status update"
message.rindex(matched_string)
=> 19
message.rindex("status update")
=> 16

I am not sure why this happens. matched_string is just a string with the same value, "status update", so how does it pick up the "..." from the end of the message?

I tried MRI Ruby 1.9.3-p545 and 2.1.1, and the issue does not occur there.
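For anyone who wants to check their own runtime, the two cases above can be folded into one script. This is only a minimal sketch: the helper name rindex_pair is made up for illustration and is not part of JRuby or of the original report. On a runtime without the bug, both numbers should agree for each message.

```ruby
# Hypothetical helper (not from the original report): returns the rindex
# computed with the MatchData-derived string and with an equivalent literal.
def rindex_pair(message)
  matched_string = message.match(/status update/)[0]
  [message.rindex(matched_string), message.rindex("status update")]
end

["I love this new status update", "I love this new status update..."].each do |message|
  via_match, via_literal = rindex_pair(message)
  status = via_match == via_literal ? "ok" : "MISMATCH"
  puts "#{message.inspect}: match=#{via_match} literal=#{via_literal} #{status}"
end
```

A MISMATCH on the second message would correspond to the behaviour described in Case 2.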

Also, I found that if I call message.rindex(matched_string.tr("", "")), the result is 16.
But the bytes look identical:

matched_string.to_java_bytes
=> byte[115, 116, 97, 116, 117, 115, 32, 117, 112, 100, 97, 116, 101]@543c5bbd
"status update".to_java_bytes
=> byte[115, 116, 97, 116, 117, 115, 32, 117, 112, 100, 97, 116, 101]@46ab007f
And
matched_string.bytesize
=> 13
"status update".bytesize
=> 13

I don't see any special characters in the matched_string.

@atambo added this to the JRuby 1.7.13 milestone May 24, 2014
@atambo self-assigned this May 24, 2014
@atambo added a commit that closed this issue Jun 14, 2014
@atambo String#rindex should handle matchdata strings
A string extracted from a MatchData object shares the same
byte array as the string it was matched against. When rindex is
called with a string from a MatchData on the string that MatchData
was matched against, the same byte array is passed into
ByteList.memcmp for both arguments, which always returns 0 because
ByteList.memcmp just checks the byte arrays for equality.

I fix this by passing bytes() instead of getUnsafeBytes() into the
ByteList.memcmp method so that the byte arrays are different.

Fixes #1688
5303e9a
@atambo closed this in 5303e9a Jun 14, 2014
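
The commit message above also explains why Case 2 returned exactly 19: the message is 32 bytes long and the needle is 13, so the backward scan starts at 32 - 13 = 19 and a comparison that short-circuits on "same array" reports a match right there. In Case 1 the message is 29 bytes, so the first position tried is 16, which happens to be correct. The following is a toy model of that behaviour in plain Ruby, not JRuby's actual code. Slice, buggy_memcmp, safe_memcmp and toy_rindex are all hypothetical names, and the early return in buggy_memcmp is an assumption based only on the commit description.

```ruby
# Toy model only: every name here is hypothetical and none of it is JRuby's API.
# A Slice is a view (offset, len) into a backing byte array, which models a
# matched substring sharing storage with the string it was matched against.
Slice = Struct.new(:bytes, :offset, :len)

# Assumed buggy behaviour: "same backing array" is treated as "equal contents",
# ignoring the offsets that were asked for.
def buggy_memcmp(a, a_start, b, b_start, len)
  return 0 if a.equal?(b)
  a[a_start, len] <=> b[b_start, len]
end

# Always compares the requested ranges, shared array or not.
def safe_memcmp(a, a_start, b, b_start, len)
  a[a_start, len] <=> b[b_start, len]
end

# Scan backwards from the last position where the needle could still fit.
def toy_rindex(haystack, needle, cmp)
  (haystack.len - needle.len).downto(0) do |pos|
    result = send(cmp, haystack.bytes, haystack.offset + pos,
                  needle.bytes, needle.offset, needle.len)
    return pos if result.zero?
  end
  nil
end

backing  = "I love this new status update...".bytes
haystack = Slice.new(backing, 0, backing.length)
shared   = Slice.new(backing, 16, 13)               # shares the backing array
copied   = Slice.new("status update".bytes, 0, 13)  # owns a separate array

puts toy_rindex(haystack, shared, :buggy_memcmp)  # => 19
puts toy_rindex(haystack, shared, :safe_memcmp)   # => 16
puts toy_rindex(haystack, copied, :buggy_memcmp)  # => 16
```

The copied case is roughly what the fix achieves by handing memcmp a distinct byte array, and it would also be consistent with the tr("", "") workaround if tr returns a string with its own storage.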