
sax: Fix parsing bug with ']' in PCDATA

Fix bug reported in:
   #23

The SAX parser (which is used by the DOM) looks
out for ']]>' in parsed character data, because
such a string is not permitted there: it marks
the end of unparsed character data (a CDATA
section). While keeping track of the ']'
characters we forgot to reset the internal
counter when ']a' or ']]a' was found, and so
prepended ']' or ']]' to each subsequent
character. This is fixed.

Need to audit the code to check the other
cases: is the PCDATA case correct, for example?
Should add test cases to the sax tests too,
and fix up the external sax & dom tests.
1 parent 0894588 commit 5a936d7e849cfbbc7ff197c2e280260e5483ac8f @andreww committed Dec 30, 2012
Showing with 2 additions and 0 deletions.
  1. +2 −0 sax/m_sax_tokenizer.F90
@@ -376,9 +376,11 @@ subroutine sax_tokenize(fx, fb, eof)
   elseif (phrase==1) then
     call append_varstr( fx%token, ']' )
     call append_varstr( fx%token, c )
+    phrase = 0
   elseif (phrase==2) then
     call append_varstr( fx%token, ']]' )
     call append_varstr( fx%token, c )
+    phrase = 0
   else
     call append_varstr( fx%token, c )
   endif
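
The effect of the two added lines can be illustrated with a small model of the counter logic. This is only a sketch in Python, not the Fortran tokenizer itself: the function name scan_chars, the reset_counter switch, the handling of a third ']' and the flush at end of input are all assumptions made for illustration. Only the three branches on phrase mirror the hunk above.

```python
def scan_chars(text, reset_counter=True):
    """Copy character data into a token while watching for ']]>'.

    Hypothetical model: `phrase` counts pending ']' characters that
    have been seen but not yet copied to the token (0, 1 or 2).
    reset_counter=False reproduces the behaviour before this commit.
    """
    token = []
    phrase = 0
    for c in text:
        if c == ']':
            if phrase < 2:
                phrase += 1            # hold the ']' back for now
            else:
                token.append(']')      # assumed: emit one, keep two pending
        elif c == '>' and phrase == 2:
            raise ValueError("']]>' is not allowed in character data")
        else:
            # Mirrors the diff: flush pending ']' characters, then c.
            token.append(']' * phrase)
            token.append(c)
            if reset_counter:
                phrase = 0             # the line added by this commit
    token.append(']' * phrase)         # assumed: flush at end of input
    return ''.join(token)


if __name__ == "__main__":
    print(scan_chars("a]bc"))                       # with the fix
    print(scan_chars("a]bc", reset_counter=False))  # before the fix
```

In this model, scan_chars("a]bc") returns "a]bc", while the pre-fix variant returns "a]b]c]": the stale counter makes every later character drag an extra ']' along with it, which is the symptom described in the commit message.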
