When cleaning up an MSWord document,
<p class=CSP-ChapterBodyText><span lang=EN-US style='font-size:14.0pt'>“I thought—”</span></p>
is converted to
<p>
"I thought-"
</p>
_If you look closely, the — em dash in the original text has been changed to a hyphen, which introduces a grammatical error into the text. The original em dash should be preserved._
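To make the loss easy to confirm, here is a minimal sketch that counts typographic characters before and after tidying. The helper names (`count_typographic`, `lost_characters`) and the character table are illustrative assumptions, not part of Tidy; the two sample strings are the input and output quoted above.

```python
from collections import Counter

# Typographic characters worth tracking (an assumed, non-exhaustive list).
TYPOGRAPHIC = {
    "\u2014": "EM DASH",
    "\u2013": "EN DASH",
    "\u201c": "LEFT DOUBLE QUOTATION MARK",
    "\u201d": "RIGHT DOUBLE QUOTATION MARK",
    "\u2018": "LEFT SINGLE QUOTATION MARK",
    "\u2019": "RIGHT SINGLE QUOTATION MARK",
}


def count_typographic(text):
    """Return {character name: occurrences} for the tracked characters."""
    counts = Counter(ch for ch in text if ch in TYPOGRAPHIC)
    return {TYPOGRAPHIC[ch]: n for ch, n in counts.items()}


def lost_characters(original, tidied):
    """Return how many of each tracked character disappeared after tidying."""
    before = count_typographic(original)
    after = count_typographic(tidied)
    return {
        name: n - after.get(name, 0)
        for name, n in before.items()
        if n > after.get(name, 0)
    }


# The input and output from this report, written with escapes for clarity.
original = (
    "<p class=CSP-ChapterBodyText><span lang=EN-US "
    "style='font-size:14.0pt'>\u201cI thought\u2014\u201d</span></p>"
)
tidied = '<p>\n"I thought-"\n</p>'

for name, n in sorted(lost_characters(original, tidied).items()):
    print(f"{name}: {n} lost")
```

The same functions could be pointed at the contents of `original.html` and `tidied.html` (read as UTF-8) to check a whole document rather than one paragraph.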
I've included my config file and how I run the tool below. I'm not sure how to check, but I'm 99% sure this is version 5.6.0.
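For anyone else unsure which build they have, the `-version` flag (short form `-v`) prints it. The sketch below wraps that call; the function names are hypothetical, and the regular expression assumes the banner contains a dotted three-part version number, as in "HTML Tidy for Windows version 5.6.0".

```python
import re
import subprocess


def parse_tidy_version(banner):
    """Pull the first dotted three-part version number out of a banner string."""
    match = re.search(r"\b(\d+\.\d+\.\d+)\b", banner)
    return match.group(1) if match else None


def installed_tidy_version(exe="tidy"):
    """Run `<exe> -version` and return the parsed version, or None if not found.

    `exe` is an assumed path; on Windows it might be r".\\tidy.exe".
    """
    result = subprocess.run(
        [exe, "-version"], capture_output=True, text=True, check=False
    )
    return parse_tidy_version(result.stdout + result.stderr)
```

Calling `installed_tidy_version(r".\tidy.exe")` from the directory containing the executable should then return the version string for that build.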
Note: The change of the quote characters is interesting, and I'd be interested to know why that happens, but I'm not reporting that as an issue here.
`.\tidy.exe -config .\htmltidy.config -o .\tidied.html .\original.html`

htmltidy.config:
add-xml-space: no
add-meta-charset: no
anchor-as-name: yes
ascii-chars: no
assume-xml-procins: no
bare: yes
break-before-br: no
char-encoding: utf8
clean: no
coerce-endtags: yes
add-xml-decl: no
css-prefix: c
custom-tags: no
decorate-inferred-ul: no
doctype: auto
drop-empty-elements: yes
drop-empty-paras: yes
drop-proprietary-attributes: no
enclose-block-text: no
enclose-text: no
escape-cdata: no
fix-backslash: yes
escape-scripts: yes
fix-bad-comments: no
fix-style-tags: yes
fix-uri: yes
force-output: no
gdoc: no
gnu-emacs: no
hide-comments: yes
indent: yes
indent-attributes: no
indent-cdata: no
indent-spaces: 4
indent-with-tabs: yes
input-encoding: utf8
input-xml: no
join-classes: no
keep-tabs: no
keep-time: no
literal-attributes: no
join-styles: yes
logical-emphasis: no
lower-literals: yes
markup: yes
merge-divs: auto
merge-emphasis: yes
merge-spans: auto
mute-id: no
ncr: yes
new-blocklevel-tags: no
omit-optional-tags: no
output-bom: auto
output-encoding: utf8
output-html: no
output-xhtml: no
output-xml: no
preserve-entities: no
punctuation-wrap: no
quiet: no
quote-ampersand: yes
quote-marks: no
quote-nbsp: yes
repeated-attributes: keep-last
replace-color: no
show-body-only: no
show-errors: 6
show-info: yes
show-meta-change: no
show-warnings: yes
skip-nested: yes
sort-attributes: none
strict-tags-attributes: no
tab-size: 8
tidy-mark: yes
uppercase-attributes: no
uppercase-tags: no
vertical-space: no
warn-proprietary-attributes: yes
word-2000: yes
wrap: 68
wrap-asp: yes
wrap-attributes: no
wrap-jste: yes
wrap-php: yes
write-back: no
wrap-script-literals: no
wrap-sections: yes