This repository has been archived by the owner on Nov 9, 2017. It is now read-only.

Commit

Let the HTML filter encode all character entity references.
This is done by setting the 'escapeCharacters' parameter in the filter's default configuration, and by changing the encoding used to write the translated files from UTF-8 to ASCII.
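Why the ASCII encoding matters can be illustrated with a small, JDK-only sketch. It is not Okapi code: `escapeUnencodable` is a hypothetical helper, and emitting numeric character references is an assumption made for illustration. The idea is only that a character the target charset cannot represent has to be written as a character reference if it is to survive.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class AsciiEscapeSketch {

    // Hypothetical helper (not part of Okapi): replaces every code point that
    // US-ASCII cannot encode with a numeric character reference.
    static String escapeUnencodable(String text) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            if (encoder.canEncode(text.subSequence(i, i + len))) {
                out.appendCodePoint(cp);                   // plain ASCII, keep as-is
            } else {
                out.append("&#").append(cp).append(';');   // decimal reference
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: caf&#233; &#8212; &#8364;5
        System.out.println(escapeUnencodable("caf\u00e9 \u2014 \u20ac5"));
    }
}
```

With UTF-8 as the output encoding every character is encodable, so nothing in this sketch would be escaped; switching to ASCII is what forces the references.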
Carlos A. Munoz committed Jan 15, 2014
1 parent 319d18f commit 84f4cb8
Showing 2 changed files with 3 additions and 1 deletion.
@@ -332,7 +332,7 @@ public void writeTranslatedFile(OutputStream output, URI originalFile,
net.sf.okapi.common.LocaleId localeId =
net.sf.okapi.common.LocaleId.fromString(locale);
IFilterWriter writer = filter.createFilterWriter();
-writer.setOptions(localeId, "UTF-8");
+writer.setOptions(localeId, "ascii");

if (requireFileOutput) {
writeTranslatedFileWithFileOutput(output, originalFile,
@@ -35,6 +35,8 @@

assumeWellformed: false
preserve_whitespace: true
# Source: http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
escapeCharacters: "\"&'<> ¡¢£¤¥¦§¨©ª«¬ ®¯°±²³´µ¶·¸¹º»¼½¾¿Àgrave)ÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿŒœŠšŸƒˆ˜ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩαβγδεζηθικλμνξοπρςστυφχψωϑϒϖ   –—‘’‚“”„†‡•…‰′″‹›‾⁄€ℑ℘ℜ™ℵ←↑→↓↔↵⇐⇑⇒⇓⇔∀∂∃∅∇∈∉∋∏∑−∗√∝∞∠∧∨∩∪∫∴∼≅≈≠≡≤≥⊂⊃⊄⊆⊇⊕⊗⊥⋅⋮⌈⌉⌊⌋〈〉◊♠♣♥♦"

attributes:
# attributes that occur on many elements
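
The effect of an 'escapeCharacters' list can be sketched in plain Java. This is an illustration under stated assumptions, not the Okapi implementation: `escapeConfigured` is a hypothetical helper, and it writes decimal numeric references, whereas the real filter may emit named entities instead.

```java
import java.util.Set;
import java.util.stream.Collectors;

public class EscapeSetSketch {

    // Hypothetical helper (not part of Okapi): escapes every code point of
    // `text` that also appears in the configured `escapeCharacters` string.
    static String escapeConfigured(String text, String escapeCharacters) {
        Set<Integer> toEscape = escapeCharacters.codePoints()
                .boxed()
                .collect(Collectors.toSet());
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (toEscape.contains(cp)) {
                out.append("&#").append(cp).append(';');   // listed: escape it
            } else {
                out.appendCodePoint(cp);                   // not listed: keep it
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: 5 &#60; 6 &#169;
        System.out.println(escapeConfigured("5 < 6 \u00a9", "<>&\u00a9"));
    }
}
```

Listing every character that has an HTML entity, as the configuration above does, means each of them is escaped on output rather than written literally.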