Item1096: Malformed header anchors if header contains non A-Za-z0-9_ characters - simple solution

This is the 1.5 and 1.1.0 compatibility solution that makes non-English TOC anchors work and
maintains compatibility with anchors that are A-Z0-9 and punctuation.
I still need to confirm the UTF-8 charset, but I think this also works well for Chinese etc.
The TestCases topic is not updated in trunk, as it will require more work to verify compatibility for the
target anchors. The TOC in trunk will use only the new anchors.


git-svn-id: http://svn.foswiki.org/trunk@3630 0b4bb1d4-4e5a-0410-9cc4-b2b747904278
KennethLavrsen authored and KennethLavrsen committed Apr 22, 2009
1 parent c469a83 commit 5f69e96
Showing 1 changed file with 14 additions and 9 deletions.
23 changes: 14 additions & 9 deletions core/lib/Foswiki/Compatibility.pm
@@ -437,21 +437,26 @@ sub _makeBadAnchorName {
     # filter '!!', '%NOTOC%'
     $anchorName =~ s/$Foswiki::regex{headerPatternNoTOC}//o;

-    # For most common alphabetic-only character encodings (i.e. iso-8859-*),
-    # remove non-alpha characters
-    if ( !defined( $Foswiki::cfg{Site}{CharSet} )
-        || $Foswiki::cfg{Site}{CharSet} =~ /^iso-?8859-?/i )
-    {
-        $anchorName =~ s/[^$Foswiki::regex{mixedAlphaNum}]+/_/g;
-    }
-    $anchorName =~ s/__+/_/g;    # remove excessive '_' chars
+    # No matter what character set we use, the HTML standard does not allow
+    # anything else than English alphanum characters in anchors
+    # So we convert anything non A-Za-z0-9_ to underscores
+    # and limit the number of consecutive underscores to 1
+    # This means that pure non-English anchors will become A, A_AN1, A_AN2, ...
+    # We accept anchors starting with 0-9. It is non RFC but it works and it
+    # is very important for compatibility
+    $anchorName =~ s/[^A-Za-z0-9]+/_/g;
+    $anchorName =~ s/__+/_/g;    # remove excessive '_' chars

     if ( !$compatibilityMode ) {
         $anchorName =~ s/^[\s#_]+//;    # no leading space nor '#', '_'
     }

     $anchorName =~ s/^$/A/;    # prevent empty anchor

     # limit to 32 chars
     $anchorName =~ s/^(.{32})(.*)$/$1/;
     if ( !$compatibilityMode ) {
-        $anchorName =~ s/[\s_]+$//;    # no trailing space, nor '_'
+        $anchorName =~ s/[\s_]+$//;    # no trailing space, nor '_'
     }
     return $anchorName;
 }
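
To see what the new rule does to a few headings, the substitutions in the hunk above can be sketched as a stand-alone script. This is only an illustration under stated assumptions: `sketch_anchor_name` is a hypothetical helper, not the Foswiki function; it skips the `headerPatternNoTOC` filter, which depends on Foswiki's own regex table; and it does not cover the uniqueness suffixes (`A_AN1`, `A_AN2`, ...) mentioned in the comment, which are not part of this hunk.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the substitutions in the diff above.
# Assumption: the '!!' / %NOTOC% filter has already been applied to $text.
sub sketch_anchor_name {
    my ( $text, $compatibility_mode ) = @_;
    my $anchor = $text;

    # Anything that is not A-Za-z0-9 becomes a single underscore
    $anchor =~ s/[^A-Za-z0-9]+/_/g;
    $anchor =~ s/__+/_/g;    # kept to mirror the diff; already collapsed above

    # Outside compatibility mode: no leading space, '#' or '_'
    $anchor =~ s/^[\s#_]+// unless $compatibility_mode;

    # Prevent an empty anchor
    $anchor =~ s/^$/A/;

    # Limit to 32 characters
    $anchor = substr( $anchor, 0, 32 );

    # Outside compatibility mode: no trailing space or '_'
    $anchor =~ s/[\s_]+$// unless $compatibility_mode;

    return $anchor;
}

print sketch_anchor_name('Hello, World!'), "\n";         # Hello_World
print sketch_anchor_name('1. Getting started'), "\n";    # 1_Getting_started
print sketch_anchor_name("\x{65e5}\x{672c}\x{8a9e}"), "\n";    # A
print sketch_anchor_name( 'Hello, World!', 1 ), "\n";    # Hello_World_
```

A heading made up entirely of non-ASCII characters collapses to a single underscore, is stripped to the empty string, and then falls back to `A`, which is why distinct non-English headings need the numbered suffixes to stay unique.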