dist-docs: Fix bugs in text to HTML conversion.
This fixes two bugs.  First, & has a special meaning in the replacement
text for a sed "s" command, so this escapes it.  Second, this code
misprocessed bold or underlined &<>: >^H> would become &gt;^H&gt; which
would display as &gt&gt; in most browsers.
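
A minimal sketch of the two bugs, assuming a POSIX shell and sed; the variable names (`broken`, `fixed`, `bs`, `unmatched`) are illustrative and are not taken from dist-docs:

```shell
# Bug 1: in a sed replacement, a bare & expands to the matched text,
# so the leading "&" of the entity is lost.  \& is a literal ampersand.
broken=$(printf 'a<b\n' | sed 's/</&lt;/g')
fixed=$(printf 'a<b\n' | sed 's,<,\&lt;,g')

# Bug 2: escaping before the bold rule runs turns >^H> into &gt;^H&gt;.
# The characters on either side of the backspace are now ";" and "&",
# so the "same character twice" bold pattern no longer matches.
bs=$(printf '\b')
unmatched=$(printf '>\b>\n' | sed -e 's,>,\&gt;,g' -e "s,\(.\)${bs}\1,<b>\1</b>,g")

printf '%s\n%s\n' "$broken" "$fixed"
```

Here `broken` holds `a<lt;b` and `fixed` holds `a&lt;b`, while `unmatched` still contains the raw backspace between two `&gt;` entities.
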

Finally, this improves the HTML output so that bold ABC becomes <b>ABC</b>
instead of <b>A</b><b>B</b><b>C</b>.

Reported-by: Nicolas Bouliane <nbouliane@digitalocean.com>
Reported-at: https://twitter.com/nicboul/status/1126959264772259842
Signed-off-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: 0-day Robot <robot@bytheb.org>
blp authored and ovsrobot committed May 10, 2019
1 parent d58b59c commit ba38cb4
Showing 1 changed file with 23 additions and 5 deletions.
28 changes: 23 additions & 5 deletions build-aux/dist-docs
@@ -69,11 +69,29 @@ EOF
 GROFF_NO_SGR=1 man -l -Tutf8 $manpage | sed 's/.//g' > $manpage.txt
 (echo '<html><head><meta charset="UTF-8"></head><body><pre>'
 GROFF_NO_SGR=1 man -l -Tutf8 $manpage | sed '
-s/&/&amp;/g
-s/</&lt;/g
-s/>/&gt;/g
-s,\(.\)\1,<b>\1</b>,g
-s,_\(.\),<u>\1</u>,g'
+# Change bold and underline via backspacing into bracketing with control
+# characters. We cannot directly translate them to HTML because <> need
+# to be escaped later. (We cannot escape <> first because bold or
+# underlined escaped characters would be mis-processed.)
+s,\(.\)\1,\1,g
+s,_\(.\),\1,g
+# Drop redundant font changes, to keep from having every character have
+# a separate tag pair.
+s,,,g
+s,,,g
+# Escape special characters.
+s,&,\&amp;,g
+s,<,\&lt;,g
+s,>,\&gt;,g
+# Translate control characters to HTML.
+s,,<b>,g
+s,,</b>,g
+s,,<u>,g
+s,,</u>,g
+'
 echo '</pre></body></html>'
 ) > $manpage.html
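
The ordering described in the comments of the diff above (bracket with control characters, merge adjacent runs, escape, then translate to tags) can be sketched as a standalone function. This is only an illustration under stated assumptions: the function name `overstrike_to_html` and the four sentinel bytes (octal 016, 017, 020, 021) are hypothetical choices, not necessarily the ones used in dist-docs.

```shell
# Hypothetical helper: convert backspace-overstruck text on stdin to HTML.
overstrike_to_html() {
    bs=$(printf '\b')
    # Assumed sentinels: any control bytes that never occur in man output.
    b_on=$(printf '\016'); b_off=$(printf '\017')
    u_on=$(printf '\020'); u_off=$(printf '\021')
    sed \
        -e "s,\(.\)${bs}\1,${b_on}\1${b_off},g" \
        -e "s,_${bs}\(.\),${u_on}\1${u_off},g" \
        -e "s,${b_off}${b_on},,g" \
        -e "s,${u_off}${u_on},,g" \
        -e 's,&,\&amp;,g' \
        -e 's,<,\&lt;,g' \
        -e 's,>,\&gt;,g' \
        -e "s,${b_on},<b>,g" \
        -e "s,${b_off},</b>,g" \
        -e "s,${u_on},<u>,g" \
        -e "s,${u_off},</u>,g"
}

# Bold ABC (each letter overstruck with itself) becomes one tag pair.
printf 'A\bAB\bBC\bC\n' | overstrike_to_html
```

Because the sentinels are single bytes that the escaping step never touches, a bold `>` is bracketed first and escaped afterwards, so it comes out as `<b>&gt;</b>`.
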

