Handle invalid UTF-8 strings when HTML escaping
Use `ActiveSupport::Multibyte::Unicode.tidy_bytes` to handle invalid UTF-8
strings in `ERB::Util.unwrapped_html_escape` and `ERB::Util.html_escape_once`.
Prevents user-entered input passed from a querystring into a form field from
causing invalid byte sequence errors.
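The failure mode and the fix described above can be sketched with the Ruby standard library alone. Everything named here is hypothetical: `ESCAPE_MAP`, `ESCAPE_REGEXP`, `repair_bytes`, `naive_escape`, and `tolerant_escape` are simplified stand-ins, not the Rails constants or methods, and `repair_bytes` only approximates `tidy_bytes` by reading each invalid byte as ISO-8859-1 (the real method also handles CP1252).

```ruby
# Simplified stand-ins for the escape table and pattern (assumptions, not the Rails constants).
ESCAPE_MAP = {
  "&" => "&amp;", ">" => "&gt;", "<" => "&lt;", '"' => "&quot;", "'" => "&#39;"
}.freeze
ESCAPE_REGEXP = /[&"'><]/

# Hypothetical approximation of tidy_bytes: reinterpret every invalid byte
# as an ISO-8859-1 character and re-encode it as UTF-8.
def repair_bytes(string)
  string.scrub { |bytes| bytes.unpack("C*").pack("U*") }
end

# Behaviour before the change: gsub with a regexp raises on invalid UTF-8.
def naive_escape(string)
  string.gsub(ESCAPE_REGEXP, ESCAPE_MAP)
end

# Behaviour after the change: repair the bytes first, then escape.
def tolerant_escape(string)
  repair_bytes(string).gsub(ESCAPE_REGEXP, ESCAPE_MAP)
end

input = "\251 <" # byte 0xA9 on its own is not valid UTF-8

begin
  naive_escape(input)
rescue ArgumentError => e
  puts "naive_escape raised: #{e.class}"
end

puts tolerant_escape(input)
```

Repairing the bytes before calling `gsub` keeps the escaping pattern unchanged and moves the encoding concern into a single step.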
greysteil committed Jun 8, 2015
1 parent a69e0a5 commit 05a2a6a
Showing 3 changed files with 19 additions and 4 deletions.
9 changes: 9 additions & 0 deletions activesupport/CHANGELOG.md
@@ -1,3 +1,12 @@
* Handle invalid UTF-8 strings when HTML escaping

Use `ActiveSupport::Multibyte::Unicode.tidy_bytes` to handle invalid UTF-8
strings in `ERB::Util.unwrapped_html_escape` and `ERB::Util.html_escape_once`.
Prevents user-entered input passed from a querystring into a form field from
causing invalid byte sequence errors.

*Grey Baker*

* Fix a range of values for parameters of the Time#change

*Nikolay Kondratyev*
@@ -37,7 +37,7 @@ def unwrapped_html_escape(s) # :nodoc:
if s.html_safe?
s
else
- s.gsub(HTML_ESCAPE_REGEXP, HTML_ESCAPE)
+ ActiveSupport::Multibyte::Unicode.tidy_bytes(s).gsub(HTML_ESCAPE_REGEXP, HTML_ESCAPE)
end
end
module_function :unwrapped_html_escape
@@ -50,7 +50,7 @@ def unwrapped_html_escape(s) # :nodoc:
# html_escape_once('&lt;&lt; Accept & Checkout')
# # => "&lt;&lt; Accept &amp; Checkout"
def html_escape_once(s)
- result = s.to_s.gsub(HTML_ESCAPE_ONCE_REGEXP, HTML_ESCAPE)
+ result = ActiveSupport::Multibyte::Unicode.tidy_bytes(s.to_s).gsub(HTML_ESCAPE_ONCE_REGEXP, HTML_ESCAPE)
s.html_safe? ? result.html_safe : result
end
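
The "escape once" behaviour combined with byte repair can be illustrated with a standalone sketch. The names `ONCE_MAP`, `ONCE_REGEXP`, and `escape_once_sketch` are hypothetical, and the pattern is an assumption about what an escape-once regexp looks like (an `&` is left alone when it already begins an entity), not a copy of `HTML_ESCAPE_ONCE_REGEXP`. The byte repair again treats invalid bytes as ISO-8859-1, which is simpler than `tidy_bytes`, and the `html_safe` handling is omitted because it depends on ActiveSupport.

```ruby
# Hypothetical escape table and pattern (assumptions, not the Rails constants).
ONCE_MAP = {
  "&" => "&amp;", ">" => "&gt;", "<" => "&lt;", '"' => "&quot;", "'" => "&#39;"
}.freeze
# Match quotes and angle brackets, plus any "&" that does not already start an entity.
ONCE_REGEXP = /["><']|&(?!(?:[a-zA-Z]+|#\d+|#[xX][\da-fA-F]+);)/

def escape_once_sketch(value)
  # Repair invalid bytes first so the regexp match cannot raise.
  repaired = value.to_s.scrub { |bytes| bytes.unpack("C*").pack("U*") }
  repaired.gsub(ONCE_REGEXP, ONCE_MAP)
end

once  = escape_once_sketch("\251 < &amp; &")
twice = escape_once_sketch(once)
puts once
puts once == twice
```

Because existing entities are skipped, applying the sketch to its own output leaves the string unchanged.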

10 changes: 8 additions & 2 deletions activesupport/test/core_ext/string_ext_test.rb
@@ -782,8 +782,8 @@ def to_s
end

test "ERB::Util.html_escape should correctly handle invalid UTF-8 strings" do
- string = [192, 60].pack('CC')
- expected = 192.chr + "&lt;"
+ string = "\251 <"
+ expected = "© &lt;"
assert_equal expected, ERB::Util.html_escape(string)
end

@@ -799,6 +799,12 @@ def to_s
assert_equal escaped_string, ERB::Util.html_escape_once(string)
assert_equal escaped_string, ERB::Util.html_escape_once(escaped_string)
end

test "ERB::Util.html_escape_once should correctly handle invalid UTF-8 strings" do
string = "\251 <"
expected = "© &lt;"
assert_equal expected, ERB::Util.html_escape_once(string)
end
end

class StringExcludeTest < ActiveSupport::TestCase