$nswcr, the nonstartcharword regex, contains its own sets of parens and
matches a leading space already. Use those parens and don't try to match
additional leading space before it.
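
To make the message above concrete, here is a minimal sketch of the old and new substitutions from breakHtml side by side. The real definition of $nswcr is not shown in this diff, so the pattern below is a hypothetical stand-in that only mirrors the two properties the message describes: it has its own capture groups, and the first group already matches the leading whitespace. The helper names old_sub and new_sub are likewise made up for illustration.

```perl
use strict;
use warnings;

# HYPOTHETICAL stand-in for $nswcr (the real regex is not part of this diff).
# Group 1: leading whitespace, group 2: a character that should not start
# a line, group 3: the rest of the word.
my $nswcr = qr{(\s+)([.,;:!?])(\w+)};

# Old form: requires a literal space *before* $nswcr, and wraps the whole
# regex in an extra pair of parens, so $1 is the entire $nswcr match.
sub old_sub {
    my($text) = @_;
    $text =~ s{ ($nswcr)}{ &nbsp;$1}gs;
    return $text;
}

# New form: relies on the regex's own groups and its own leading space.
sub new_sub {
    my($text) = @_;
    $text =~ s{$nswcr}{$1&nbsp;$2$3}gs;
    return $text;
}

for my $in ("foo .bar", "foo  .bar") {
    print "in:  '$in'\n";
    print "old: '", old_sub($in), "'\n";
    print "new: '", new_sub($in), "'\n";
}
```

Under this assumed pattern, the old form leaves "foo .bar" untouched, because the single space is consumed by the literal space and nothing is left for the regex's own leading-space group. The new form inserts the &nbsp; directly before the punctuation character.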
commit 93c36439e0a95986fe20de9d6e8976fa80aeead2 (1 parent: 5e75721)
jamiemccarthy committed Jul 5, 2002
Showing with 4 additions and 3 deletions.
  1. +4 −3 Slash/Utility/Data/Data.pm
@@ -474,8 +474,6 @@ sub stripByMode {
$str =~ s/&/&amp;/g;
$str =~ s/</&lt;/g;
$str =~ s/>/&gt;/g;
- ### this is not ideal; we want breakHtml to be
- ### entity-aware
# attributes are inside tags, and don't need to be broken up
$str = breakHtml($str) unless $no_white_fix || $fmode == ATTRIBUTE;
@@ -624,6 +622,7 @@ C<approveTag> function, C<approveCharref> function.
sub stripBadHtml {
my($str) = @_;
+#print STDERR "stripBadHtml 1 '$str'\n";
$str =~ s/<(?!.*?>)//gs;
$str =~ s/<(.*?)>/approveTag($1)/sge;
@@ -649,9 +648,11 @@ sub stripBadHtml {
)
}{&lt;$1}gx;
+#print STDERR "stripBadHtml 2 '$str'\n";
my $ent = qr/#?[a-zA-Z0-9]+/;
$str =~ s/&(?!$ent;)/&amp;/g;
$str =~ s/&($ent);?/approveCharref($1)/ge;
+#print STDERR "stripBadHtml 3 '$str'\n";
return $str;
}
@@ -858,7 +859,7 @@ sub breakHtml {
# the mwl), but the algorithm would be too complicated to
# implement in a regex, at least practically speaking, and
# walking through the string is also fairly complex.
- $text =~ s{ ($nswcr)}{ &nbsp;$1}gs;
+ $text =~ s{$nswcr}{$1&nbsp;$2$3}gs;
#print STDERR "text 7 '$text'\n";
return $text;
