
Correctly handle offsets in Multibyte::Chars#index and #rindex.

The offset in codepoints was being passed directly to the wrapped string's index/rindex method. Now we translate the offset into bytes first.

[#3028 state:committed]

Signed-off-by: Jeremy Kemper <jeremy@bitsweat.net>
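
To illustrate the bug described above, here is a minimal sketch on modern Ruby. It is not the Rails code: `naive_index` and `codepoint_index` are hypothetical helpers, and `String#b` is used to emulate the byte-oriented strings of Ruby 1.8 that `@wrapped_string` represented at the time. The sketch assumes a string needle and a non-negative offset.

```ruby
# Hypothetical helpers for illustration only, not the ActiveSupport code.
# String#b returns a binary (byte-oriented) copy, emulating a Ruby 1.8 string.

# Buggy variant: the codepoint offset is used as if it were a byte offset.
def naive_index(str, needle, offset = 0)
  bytes = str.b
  byte_index = bytes.index(needle.b, offset)
  # Convert the byte position of the match back into a codepoint count.
  byte_index && bytes[0, byte_index].force_encoding(Encoding::UTF_8).length
end

# Fixed variant: translate the codepoint offset into a byte offset first.
def codepoint_index(str, needle, offset = 0)
  bytes = str.b
  # Byte length of the first `offset` characters (assumes offset >= 0).
  byte_offset = str[0, offset].bytesize
  byte_index = bytes.index(needle.b, byte_offset)
  byte_index && bytes[0, byte_index].force_encoding(Encoding::UTF_8).length
end

naive_index('ééxééx', 'x', 4)     # => 2 (search wrongly starts at byte 4)
codepoint_index('ééxééx', 'x', 4) # => 5
```

Each 'é' occupies two bytes in UTF-8, so codepoint offset 4 corresponds to byte offset 7; passing 4 straight through lands on the first 'x' instead of skipping it.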
eostrom authored and jeremy committed Aug 10, 2009
1 parent 25fe43b commit 4e014379a3c792281138c5a19a5bd74e9435fede
Showing with 28 additions and 1 deletion.
  1. +16 −1 activesupport/lib/active_support/multibyte/chars.rb
  2. +12 −0 activesupport/test/multibyte_chars_test.rb
@@ -205,7 +205,22 @@ def include?(other)
# 'Café périferôl'.mb_chars.index('ô') #=> 12
# 'Café périferôl'.mb_chars.index(/\w/u) #=> 0
def index(needle, offset=0)
- index = @wrapped_string.index(needle, offset)
+ wrapped_offset = self.first(offset).wrapped_string.length
+ index = @wrapped_string.index(needle, wrapped_offset)
+ index ? (self.class.u_unpack(@wrapped_string.slice(0...index)).size) : nil
+ end
+
+ # Returns the position of _needle_ in the string, counting in
+ # codepoints, searching backward from _offset_ or the end of the
+ # string. Returns +nil+ if _needle_ isn't found.
+ #
+ # Example:
+ # 'Café périferôl'.mb_chars.rindex('é') #=> 6
+ # 'Café périferôl'.mb_chars.rindex(/\w/u) #=> 13
+ def rindex(needle, offset=nil)
+ offset ||= length
+ wrapped_offset = self.first(offset).wrapped_string.length
+ index = @wrapped_string.rindex(needle, wrapped_offset)
index ? (self.class.u_unpack(@wrapped_string.slice(0...index)).size) : nil
end
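
The same offset translation applies when searching backward. The following is a rough sketch under the same assumptions as the hunk above (string needle, non-negative offset, `String#b` standing in for a byte-oriented wrapped string); `codepoint_rindex` is a hypothetical name, not part of ActiveSupport.

```ruby
# Hypothetical helper for illustration only, not the ActiveSupport code.
def codepoint_rindex(str, needle, offset = nil)
  offset ||= str.length            # default: search from the end of the string
  bytes = str.b                    # byte-oriented copy of the string
  # Translate the codepoint offset into a byte offset (assumes offset >= 0).
  byte_offset = str[0, offset].bytesize
  byte_index = bytes.rindex(needle.b, byte_offset)
  # Convert the byte position of the match back into a codepoint count.
  byte_index && bytes[0, byte_index].force_encoding(Encoding::UTF_8).length
end

codepoint_rindex('Café périferôl', 'é')    # => 6
codepoint_rindex('Café périferôl', 'é', 5) # => 3
```

With an offset of 5 the search starts before the second 'é', so the first one (codepoint 3) is returned.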
@@ -231,7 +231,19 @@ def test_index_should_return_character_offset
assert_nil @chars.index('u')
assert_equal 0, @chars.index('こに')
assert_equal 2, @chars.index('')
+ assert_equal 2, @chars.index('', -2)
+ assert_equal nil, @chars.index('', -1)
assert_equal 3, @chars.index('')
+ assert_equal 5, 'ééxééx'.mb_chars.index('x', 4)
+ end
+
+ def test_rindex_should_return_character_offset
+ assert_nil @chars.rindex('u')
+ assert_equal 1, @chars.rindex('')
+ assert_equal 2, @chars.rindex('', -2)
+ assert_nil @chars.rindex('', -3)
+ assert_equal 6, 'Café périferôl'.mb_chars.rindex('é')
+ assert_equal 13, 'Café périferôl'.mb_chars.rindex(/\w/u)
end
def test_indexed_insert_should_take_character_offsets
