Fixed a problem in which ParseException error messages could not be retrieved if the error content contained Unicode characters.

## Why?
If the XML tag contains Unicode characters when the error occurs, an `Encoding::CompatibilityError: incompatible character encodings: UTF-8 and ASCII-8BIT` exception is raised and the ParseException error message cannot be retrieved.
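
The failure can be reproduced outside REXML with plain strings. The following is a minimal sketch, not REXML's actual code: `message`, `buffer`, and the `append` helper are hypothetical stand-ins for the error text built in `to_s` and for the unconsumed source buffer. It assumes the message is UTF-8 with a non-ASCII character and the buffer is tagged ASCII-8BIT with high bytes, which is the combination that `String#<<` rejects.

```ruby
# Hypothetical stand-ins: a UTF-8 message with a non-ASCII character and a
# binary-tagged buffer holding the same kind of bytes.
message = "Invalid attribute name: <:\u200B>"
buffer  = ":\u200B>".b # String#b returns a copy tagged ASCII-8BIT

# Hypothetical helper: append tail to err and report whether it worked.
def append(err, tail)
  err << tail
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

# Mixing UTF-8 and ASCII-8BIT strings that both contain non-ASCII bytes fails.
before_fix = append(message.dup, buffer)

# Retagging the accumulated message as ASCII-8BIT first makes the append succeed.
after_fix = append(message.dup.force_encoding("ASCII-8BIT"), buffer)

p before_fix # => :incompatible
p after_fix  # => :ok
```

Retagging with `force_encoding` does not change any bytes, so the message content is preserved; only the encoding label on the string changes.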

See: #29
naitoh committed May 3, 2024
1 parent 06be5cf commit 065bb91
Showing 2 changed files with 14 additions and 0 deletions.
1 change: 1 addition & 0 deletions lib/rexml/parseexception.rb
@@ -29,6 +29,7 @@ def to_s
   err << "\nLine: #{line}\n"
   err << "Position: #{position}\n"
   err << "Last 80 unconsumed characters:\n"
+  err.force_encoding("ASCII-8BIT")
   err << @source.buffer[0..80].force_encoding("ASCII-8BIT").gsub(/\n/, ' ')
 end

13 changes: 13 additions & 0 deletions test/parse/test_element.rb
@@ -47,6 +47,19 @@ def test_empty_namespace_attribute_name
   DETAIL
 end

+def test_empty_namespace_attribute_name_with_utf8_character
+  exception = assert_raise(REXML::ParseException) do
+    parse("<x :\xE2\x80\x8B>")
+  end
+  assert_equal(<<-DETAIL.chomp.force_encoding("ASCII-8BIT"), exception.to_s)
+Invalid attribute name: <:\xE2\x80\x8B>
+Line: 1
+Position: 8
+Last 80 unconsumed characters:
+:\xE2\x80\x8B>
+  DETAIL
+end
+
 def test_garbage_less_than_before_root_element_at_line_start
   exception = assert_raise(REXML::ParseException) do
     parse("<\n<x/>")
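
The new test tags its expected heredoc with `force_encoding("ASCII-8BIT")` because `to_s` now returns a binary-tagged string. As a rough illustration (the variable names here are hypothetical and not taken from the test suite), Ruby does not treat two strings as equal when they hold the same non-ASCII bytes under different encodings:

```ruby
utf8   = ":\u200B>" # UTF-8 with a non-ASCII character
binary = utf8.b     # same bytes, tagged ASCII-8BIT

same_bytes      = utf8.bytes == binary.bytes
equal_as_is     = utf8 == binary
equal_after_tag = utf8.dup.force_encoding("ASCII-8BIT") == binary

p same_bytes      # => true
p equal_as_is     # => false
p equal_after_tag # => true
```

Without the retagging, an equality assertion against a UTF-8 heredoc would fail even though the bytes match.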