
Deal with regex match groups in excerpt

The original implementation returns a wrong excerpt if the regex contains a match group.

Example:

    excerpt('This is a beautiful? morning', /\b(beau\w*)\b/i, :radius => 5)
    Expected: "...is a beautiful? mor..."
    Actual: "...is a beautifulbeaut..."

The original phrase was converted to a regex, and splitting on that regex
returned the text on either side of the phrase, as expected:

    'This is a beautiful? morning'.split(/beautiful/i, 2)
    # => ["This is a ", "? morning"]

When the regex contains groups, the matched group text is also returned in the array.

Quoting the Ruby docs: "If pattern is a Regexp, str is divided where the
pattern matches. [...] If pattern contains groups, the respective matches will
be returned in the array as well."

    'This is a beautiful? morning'.split(/\b(beau\w*)\b/iu, 2)
    # => ["This is a ", "beautiful", "? morning"]
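A minimal sketch of why the extra array element breaks the two-variable assignment used in `excerpt`. It uses only plain `String#split`; the variable names are illustrative and not taken from the Rails source.

```ruby
text = 'This is a beautiful? morning'

# Without a match group, split returns exactly two pieces.
plain_parts = text.split(/\bbeau\w*\b/i, 2)
# plain_parts => ["This is a ", "? morning"]

# With a match group, the captured text sits between the two pieces,
# so a two-variable assignment takes the match itself as the second part.
first_part, second_part = text.split(/\b(beau\w*)\b/i, 2)
# first_part  => "This is a "
# second_part => "beautiful"
```

This matches the "Actual" output in the example: with a radius of 5, the second part "beautiful" is cut down to "beaut" and appended after the phrase.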

If we assume we want to split on the first match (this fix makes that
assumption), we can split on the already assigned `phrase` variable instead of
the regex, because line 168 has already established that a match exists.
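The idea can be sketched outside Rails as follows. This is an illustration under assumptions, not the Action View code: `split_around_first_match` is a hypothetical helper, and it assumes that `phrase` holds the matched text by the time of the split, which the commit message implies but does not show.

```ruby
# Hypothetical helper, not part of Action View: splits text around the
# first match of pattern, ignoring any match groups in the pattern.
def split_around_first_match(text, pattern)
  match = text.match(pattern)
  return nil unless match

  # match[0] is the whole matched substring. Splitting on a String
  # never adds group captures to the resulting array.
  text.split(match[0], 2)
end

parts = split_around_first_match('This is a beautiful? morning', /\b(beau\w*)\b/i)
# parts => ["This is a ", "? morning"]
```

One caveat of this approach: splitting on the matched string cuts at its first literal occurrence in the text, which could come before the regex's own match position if the same substring appears earlier in a context the pattern rejects. `MatchData#pre_match` and `MatchData#post_match` would be an alternative that avoids this.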

Originally spotted by Louise Crow (@crowbot) at
mysociety/alaveteli#1557
1 parent a1bd00d, commit 124f88eaa22ef5b1851db505af9e639c96d5b1d8, garethrees committed with garethrees, Jun 9, 2014
2 actionview/lib/action_view/helpers/text_helper.rb
@@ -188,7 +188,7 @@ def excerpt(text, phrase, options = {})
end
end
- first_part, second_part = text.split(regex, 2)
+ first_part, second_part = text.split(phrase, 2)
prefix, first_part = cut_excerpt_part(:first, first_part, separator, options)
postfix, second_part = cut_excerpt_part(:second, second_part, separator, options)
10 actionview/test/template/text_helper_test.rb
@@ -280,8 +280,13 @@ def test_excerpt
end
def test_excerpt_with_regex
+ assert_equal('...is a beautiful! mor...', excerpt('This is a beautiful! morning', 'beautiful', :radius => 5))
+ assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', 'beautiful', :radius => 5))
+ assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', /\bbeau\w*\b/i, :radius => 5))
+ assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', /\b(beau\w*)\b/i, :radius => 5))
assert_equal("...udge Allen and...", excerpt("This day was challenging for judge Allen and his colleagues.", /\ballen\b/i, :radius => 5))
assert_equal("...judge Allen and...", excerpt("This day was challenging for judge Allen and his colleagues.", /\ballen\b/i, :radius => 1, :separator => ' '))
+ assert_equal("...was challenging for...", excerpt("This day was challenging for judge Allen and his colleagues.", /\b(\w*allen\w*)\b/i, :radius => 5))
end
def test_excerpt_should_not_be_html_safe
@@ -305,11 +310,6 @@ def test_excerpt_in_borderline_cases
assert_equal("...abc...", excerpt("z abc d", "b", :radius => 1))
end
- def test_excerpt_with_regex
- assert_equal('...is a beautiful! mor...', excerpt('This is a beautiful! morning', 'beautiful', :radius => 5))
- assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', 'beautiful', :radius => 5))
- end
-
def test_excerpt_with_omission
assert_equal("[...]is a beautiful morn[...]", excerpt("This is a beautiful morning", "beautiful", :omission => "[...]",:radius => 5))
assert_equal(
