Strip out illegal XML characters in escapeXMLString.
Closes jgm#5119.
jgm committed Dec 4, 2018
1 parent 48115fc commit 38200c0
Showing 2 changed files with 15 additions and 1 deletion.
7 changes: 6 additions & 1 deletion src/Text/Pandoc/XML.hs
@@ -57,7 +57,12 @@ escapeCharForXML x = case x of

 -- | Escape string as needed for XML. Entity references are not preserved.
 escapeStringForXML :: String -> String
-escapeStringForXML = concatMap escapeCharForXML
+escapeStringForXML = concatMap escapeCharForXML . filter isLegalXMLChar
+  where isLegalXMLChar c = c == '\t' || c == '\n' || c == '\r' ||
+                           (c >= '\x20' && c <= '\xD7FF') ||
+                           (c >= '\xE000' && c <= '\xFFFD') ||
+                           (c >= '\x10000' && c <= '\x10FFFF')
+        -- see https://www.w3.org/TR/xml/#charsets

 -- | Escape newline characters as &#10;
 escapeNls :: String -> String
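
The filtering step in this hunk can be tried in isolation. The following is a minimal standalone sketch, not the module itself: `isLegalXMLChar` is lifted to the top level for illustration, and `escapeCharForXML` is a simplified stand-in that is assumed, not copied, from `src/Text/Pandoc/XML.hs`.

```haskell
-- Standalone sketch of the change in this commit.
-- ASSUMPTION: this escapeCharForXML is a simplified stand-in for the
-- real function in src/Text/Pandoc/XML.hs, which is not shown in the diff.
escapeCharForXML :: Char -> String
escapeCharForXML c = case c of
  '&' -> "&amp;"
  '<' -> "&lt;"
  '>' -> "&gt;"
  '"' -> "&quot;"
  _   -> [c]

-- Characters allowed by the Char production of XML 1.0
-- (https://www.w3.org/TR/xml/#charsets): tab, LF, CR, and the
-- three ranges below. Other C0 controls, surrogates, U+FFFE and
-- U+FFFF are excluded.
isLegalXMLChar :: Char -> Bool
isLegalXMLChar c =
  c == '\t' || c == '\n' || c == '\r' ||
  (c >= '\x20' && c <= '\xD7FF') ||
  (c >= '\xE000' && c <= '\xFFFD') ||
  (c >= '\x10000' && c <= '\x10FFFF')

-- Drop illegal characters first, then escape what remains.
escapeStringForXML :: String -> String
escapeStringForXML = concatMap escapeCharForXML . filter isLegalXMLChar
```

Loaded into GHCi, `escapeStringForXML ['h', '\x4', 'i']` evaluates to `"hi"`, which mirrors the case in `test/command/5119.md`, and `escapeStringForXML "a<b"` evaluates to `"a&lt;b"`. Illegal characters are dropped silently instead of being replaced, so the output length can be shorter than the input.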
9 changes: 9 additions & 0 deletions test/command/5119.md
@@ -0,0 +1,9 @@
```
% pandoc -t docbook
h&#x4;i
^D
<para>
  hi
</para>
```
