Item11755: fix by completely disabling all entity processing
If we don't let HTML::TreeBuilder decode entities in the first place, we know
that all text nodes contain proper HTML entities and so we can skip
re-encoding things. This means we remove potentially fault-inducing steps of
processing which is a Good Thing™ and certainly beats the previous attempt at
fixing the problem in every way.

git-svn-id: http://svn.foswiki.org/branches/Release01x01@15832 0b4bb1d4-4e5a-0410-9cc4-b2b747904278
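
The approach described in the message can be tried in isolation. The following is a minimal sketch, not code from this commit: the helper name `round_trip` and the sample markup are made up for illustration, and it assumes an HTML::TreeBuilder release that provides `no_expand_entities`. It parses a fragment without decoding entities and serializes it with an empty entity list, so text nodes should come back as they were written.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of the commit): parse HTML without
# decoding entities, then serialize without re-encoding anything.
sub round_trip {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->no_expand_entities(1);    # leave "&amp;", "&eacute;" etc. untouched
    $tree->parse($html);
    $tree->eof;

    # '' as the first argument tells as_HTML to encode no characters;
    # {} as the third argument means no end tags are omitted.
    my $out = $tree->as_HTML( '', undef, {} );

    $tree->delete;                   # free the tree explicitly
    return $out;
}

# Sample input is an assumption chosen to exercise named entities.
print round_trip('<p>caf&eacute; &amp; bar</p>'), "\n";
```

Because decoding and encoding are both skipped, the two settings have to be used together: serializing with `''` after a normal (decoding) parse would emit raw `&` and `<` characters in text.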
GeorgeClark authored and GeorgeClark committed Nov 1, 2012
1 parent ffdf600 commit 2eb64fb
Showing 1 changed file with 5 additions and 3 deletions.
@@ -329,6 +329,7 @@ sub _getTree {

 my $tree = new HTML::TreeBuilder;
 $tree->implicit_body_p_tag(1);
+$tree->no_expand_entities(1); # Item11755
 $tree->p_strict(1);
 $tree->parse($text);
 $tree->eof;
@@ -429,7 +430,8 @@ sub _findSubChanges {
 sub _elementHash {

 # Purpose: Stringify HTML ELement for comparison in Algorithm::Diff
-my $text = ref( $_[0] ) eq $HTMLElement ? $_[0]->as_HTML('<>&') : "$_[0]";
+# Item11755: prevent entity mangling
+my $text = ref( $_[0] ) eq $HTMLElement ? $_[0]->as_HTML('') : "$_[0]";

 # Strip leading & trailing blanks in text and paragraphs
 $text =~ s/^\s*//;
@@ -521,8 +523,8 @@ sub _getTextWithClass {
 if ( ref($element) eq $HTMLElement ) {
 _addClass( $element, $class ) if $class;

-# Don't let HTML::Entities touch high-bit bytes (Item11755)
-return $element->as_HTML( '<>&', undef, {} );
+# Item11755: prevent entity mangling
+return $element->as_HTML( '', undef, {} );
 }
 elsif ($class) {
 return '<span class="' . $class . '">' . $element . '</span>';