Bullet list rendered as multiple <dl>s #67

Open
ilmari opened this issue Apr 10, 2017 · 11 comments

4 participants
@ilmari

commented Apr 10, 2017

Each bullet in the lists in capabilities(7) is rendered as a separate definition list, like this:

<dl>
    <dt>*</dt>
    <dd>…</dd>
</dl>

Instead each list should be a single unordered list:

<ul>
    <li>…</li>
    …
</ul>
@stapelberg

Contributor

commented Apr 13, 2017

This seems to be an upstream issue with mandoc:

$ mandoc -Thtml /usr/share/man/man7/capabilities.7.gz
[…]
<dl class="Bl-tag">
  <dt class="It-tag"><b>CAP_AUDIT_CONTROL</b> (since Linux 2.6.11)</dt>
  <dd class="It-tag">Enable and disable kernel auditing; change auditing filter
      rules; retrieve auditing status and filtering rules.</dd>
</dl>

Could you report it at http://mdocml.bsd.lv/contact.html please, or would you prefer if I relayed the report?

@lahwaacz


commented Aug 27, 2017

This is an inherent problem of converting the old man(7) language to HTML. The snippet from capabilities(7) is written with the .IP macro as follows:

.IP * 2
Bypass file read permission checks and
directory read and execute permission checks;
.IP *
Invoke
.BR open_by_handle_at (2).

Literally speaking, the mandoc output is correct, because the .IP macro is intended for definition lists. The thing is that with * as a header, it looks exactly like a bullet-point list in plain-text output, where the * is flushed into the left margin. The semantically correct solution is provided by the mdoc(7) language and its .Bl and .It macros.

To make existing manuals written in the man language more visually pleasing, I think it would be best to modify mandoc's HTML formatting to treat .IP *, .IP - etc. as unordered lists and produce the <ul> tags instead of <dl> tags. Alternatively you could do it in the post-processing phase or even style the dt and dd tags to appear on the same line, but you'd still need to recognize the bullet-definitions from other definitions.
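The post-processing option could look roughly like the following Python sketch. It is only an illustration, not anything mandoc or debiman does: the function name `merge_bullet_lists` is hypothetical, and the regular expressions assume the flat, non-nested markup quoted in this issue (one dt/dd pair per dl, with a lone bullet marker as the tag).

```python
import re

# One single-item <dl> whose tag is only a bullet marker (*, -, or a bullet).
# Assumption: no nested lists inside the <dd> body.
ONE = (
    r"<dl[^>]*>\s*<dt[^>]*>\s*(?:\*|-|\u2022|&#x2022;)\s*</dt>\s*"
    r"<dd[^>]*>((?:(?!</dd>).)*)</dd>\s*</dl>"
)
BULLET_DL = re.compile(ONE, re.DOTALL)
# A run of such lists separated only by whitespace.
BULLET_RUN = re.compile(ONE + r"(?:\s*" + ONE + r")*", re.DOTALL)


def merge_bullet_lists(html: str) -> str:
    """Replace each run of bullet-style <dl> blocks with a single <ul>."""

    def to_ul(match: re.Match) -> str:
        bodies = BULLET_DL.findall(match.group(0))
        items = "".join(f"  <li>{body.strip()}</li>\n" for body in bodies)
        return f"<ul>\n{items}</ul>"

    return BULLET_RUN.sub(to_ul, html)
```

Definition lists whose tag is anything other than a bullet marker, such as the CAP_AUDIT_CONTROL entry quoted earlier, are left untouched, which is the "recognize the bullet-definitions from other definitions" step.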

(As for contacting mandoc upstream, their contact page says that messages on all three mailing lists are publicly visible, but there is no link to a viewer. Do you know of one? I'd like to read some existing bug reports or discussions.)

@stapelberg

Contributor

commented Aug 27, 2017

@ischwarze


commented Sep 2, 2017

I agree there is room for improvement in mandoc, so i added an entry to my TODO list:

format ".IP *" etc. as <ul> rather than <dl>

I suspect that is feasible with a bit of heuristic inspection, but it's not completely trivial, so i'm not doing it right away, but i did mark it as relatively high priority: the impact is cosmetic, but the resulting ugliness is above average for a cosmetic issue.

In general, man(7) HTML formatting is less refined than mdoc(7) HTML formatting, and harder to implement nicely, but that's no excuse for not trying.

That said, i see a cosmetic issue with debiman as well. The upstream mandoc.css contains detailed CSS code to nicely format class "Bl-tag" lists, in particular to make sure that tags appear left of bodies if they fit, or above bodies otherwise - in fact, that's the part of mandoc.css that was hardest to tune. While debiman produces large amounts of CSS code - more than i would deem reasonable - this specific detail seems to be missing, resulting in ugly display of "Bl-tag" lists in general. In particular, the tags never seem to appear to the left of the respective body, not even if they are short.

@lahwaacz


commented Sep 10, 2017

A very similar case is in systemd.environment-generator(7), where the list is written as

.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are executed sequentially in the alphanumerical order of the final component of their name\&. The output of each generator output is immediately parsed and used to update the environment for generators that run after that\&. Thus, later generators can use and/or modify the output of earlier generators\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are run by every manager instance, their output can be different for each user\&.
.RE
.PP

and the HTML version is

<div style="margin-left: 4.00ex;">•Generators are executed sequentially in
  the alphanumerical order of the final component of their name. The output of
  each generator output is immediately parsed and used to update the environment
  for generators that run after that. Thus, later generators can use and/or
  modify the output of earlier generators.</div>
<div style="height: 1.00em;"> </div>
<div style="margin-left: 4.00ex;">•Generators are run by every manager
  instance, their output can be different for each user.</div>
<div class="Pp"></div>

which still looks rather ugly.

@ischwarze


commented Mar 2, 2019

I finally implemented this feature request in:
http://mandoc.bsd.lv/cgi-bin/cvsweb/man_html.c#rev1.173
The change will be contained in the next release, which will likely be called mandoc-1.14.5.

Here is an example with mandoc(1) from CVS HEAD:

$ mandoc -Thtml /co/linux-man-pages/man7/capabilities.7
<div class="Bd-indent">
<ul class="Bl-bullet">
  <li>Bypass file read permission checks and directory read and execute
      permission checks;</li>
  <li>invoke <b>open_by_handle_at</b>(2);</li>
  <li>use the <b>linkat</b>(2) <b>AT_EMPTY_PATH</b> flag to create a link to a
      file referred to by a file descriptor.</li>
</ul>
</div>
@ischwarze


commented Mar 2, 2019

> Very similar case is in systemd.environment-generator(7)

That isn't similar at all and i think putting it into the same bugtracking ticket is very misleading.

> where the list is written as

.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are executed sequentially in the alphanumerical order of the final component of their name\&. The output of each generator output is immediately parsed and used to update the environment for generators that run after that\&. Thus, later generators can use and/or modify the output of earlier generators\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are run by every manager instance, their output can be different for each user\&.
.RE
.PP

That is man(7) code of such low quality that it is kind of a stretch to even call it "man(7)"; calling it "low-level roff(7) trickery" would be more to the point. Such low-level stuff definitely has no place in a manual page. People can't really expect to get input semantically translated to HTML when they rely on manual horizontal movements, moving left and right on the printing paper. HTML simply contains no facilities to represent such manual printing head movements, and a formatter has very little chance to guess what the semantic intention of the author might be.

Please report the manual page as broken upstream and tell upstream to properly use .IP macros and to not use \h escapes.

Mandoc rendering still is:

<div class="Bd-indent">&#x2022;Generators are executed sequentially in the
  alphanumerical order of the final component of their name. The output of each
  generator output is immediately parsed and used to update the environment for
  generators that run after that. Thus, later generators can use and/or modify
  the output of earlier generators.</div>

I don't see any reasonable way to improve this.

@lahwaacz


commented Mar 2, 2019

> That isn't similar at all

There is .IP \(bu, a sequence approximately as common as .IP *, which can be found even in GNU's roff(7) itself. If you handle .IP * specially, you might as well handle .IP \(bu. The other macros/escapes which are ignored in the HTML conversion don't make this case dissimilar.

bob-beck pushed a commit to openbsd/src that referenced this issue Mar 2, 2019

Represent multiple subsequent .IP blocks having a consistent
head argument of *, \-, or \(bu as <ul> rather than as <dl>,
using a bit of heuristics.

Basic idea suggested by Dagfinn Ilmari Mannsaker <ilmari at github>
in Debian/debiman#67 and independently by
<Pali dot Rohar at gmail dot com> on <discuss at mandoc dot bsd dot lv>.
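
The grouping rule described in this commit message can be illustrated with a small Python sketch. This is not the mandoc implementation (that is C code in man_html.c, and its actual heuristics are not shown in this thread); `group_ip_blocks`, `BULLET_HEADS`, and `PARA_MACROS` are hypothetical names, and the set of paragraph-breaking macros is an assumption made for the example.

```python
# Head arguments treated as bullet markers, as listed in the commit message.
BULLET_HEADS = {"*", r"\-", r"\(bu"}
# Assumption: these macros end a run of consecutive .IP blocks.
PARA_MACROS = {".PP", ".LP", ".P", ".TP", ".SH", ".SS"}


def group_ip_blocks(lines):
    """Group consecutive .IP blocks that share a bullet head into one "ul".

    Returns a list of (kind, head, items) tuples where kind is "ul" or "dl"
    and items holds the unformatted body text of each block.
    """
    result = []      # (kind, head, [body_lines, ...])
    body = None      # body lines of the .IP block currently being read
    run_ok = False   # True while the previous entry may still be extended
    for line in lines:
        words = line.split()
        macro = words[0] if words else ""
        if macro == ".IP":
            head = words[1] if len(words) > 1 else ""
            body = []
            kind = "ul" if head in BULLET_HEADS else "dl"
            if kind == "ul" and run_ok and result[-1][:2] == ("ul", head):
                result[-1][2].append(body)   # same bullet: extend the list
            else:
                result.append((kind, head, [body]))
            run_ok = True
        elif macro in PARA_MACROS:
            body, run_ok = None, False       # run is interrupted
        elif body is not None:
            body.append(line)                # text or font macro in the body
    return [(k, h, [" ".join(b) for b in items]) for k, h, items in result]
```

Fed the capabilities(7) snippet quoted earlier in the thread, this yields a single "ul" entry with two items; a change of head argument or an intervening .PP starts a new list.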
@ischwarze


commented Mar 2, 2019

Hi @lahwaacz ,

i agree that ".IP *" and ".IP \(bu" are similar, and that there is nothing wrong with using "\(bu" in manual pages, and indeed the patch i recently committed - see the "bob-beck pushed a commit to openbsd/src" right above - handles both.

What i meant with "isn't similar at all" was this horrible code from systemd.environment-generator(7):

\h'-04'\(bu\h'+03'\c

If you look closely, you will see that the ".IP \(bu 2.3" in that manual page is in an inactive .el clause: "ie n" is always true for manual pages (except when formatting with a real typesetter for PostScript or PDF output), so the .el clause is never entered.
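
To make the branch selection concrete, here is a minimal Python sketch that keeps only the lines a formatter would actually process. It is a toy model, not a roff conditional evaluator: `active_lines` is a hypothetical helper that understands only the exact `.ie n \{\` ... `.\}` / `.el \{\` ... `.\}` shape used in that page, and the sample body text is abbreviated.

```python
SAMPLE = r"""
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are run by every manager instance.
.RE
""".strip().splitlines()


def active_lines(lines, nroff=True):
    """Return the lines that survive the .ie n / .el conditional.

    nroff=True models terminal and HTML output, nroff=False models a real
    typesetter (PostScript or PDF).
    """
    out = []
    keep = True  # whether the current line is inside an active branch
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(".ie n"):
            keep = nroff          # "n" condition: true in nroff mode
        elif stripped.startswith(".el"):
            keep = not nroff      # else branch: the opposite
        elif stripped == r".\}":
            keep = True           # end of a conditional block
        elif keep:
            out.append(line)
    return out


for line in active_lines(SAMPLE):
    print(line)
```

With the default `nroff=True`, the `.IP \(bu 2.3` line is dropped and only the `\h` movement line remains, which is why a formatter that special-cases `.IP \(bu` never sees it here.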

@lahwaacz


commented Mar 2, 2019

Oh, in that case you're right. On closer look, systemd seems to use xsltproc to generate their man pages from XML.

@ischwarze


commented Mar 3, 2019

Hi @lahwaacz ,

> On closer look, systemd seems to use xsltproc to generate their man pages from XML

... and more specifically, from DocBook 4.2:

https://github.com/systemd/systemd/blob/master/man/systemd-environment-d-generator.xml

So no wonder the output is crap. DocBook is by far the worst and lowest quality file format you can pick for documentation. It is absolutely notorious for generating abysmal man(7) output as well as for being full of bugs and almost unmaintained.
