Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add link to manpages in HTML builder #4235

Merged
merged 1 commit into from
Jan 13, 2018

Conversation

anarcat
Copy link
Contributor

@anarcat anarcat commented Nov 11, 2017

  • Feature

  • Create actual HTML links (<a href>) when the :manpage: role is
    used in RST markup.

  • Requires a new configuration setting, but is backwards-compatible
    with older configuration files.

It seems way more useful to have the HTML documentation builder
actually link to real rendered versions of HTML manpages in its
output. That way people can click on manpages to get the full
documentation. There are a few services offering this online, so we do
not explicitly enable one by default, but the Debian manpages
repository has a lot of them.

No unit tests because I couldn't quite figure out where to fit that
in.

Copy link
Member

@tk0miya tk0miya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me for your concept.
Could you add testcase for this please?

doc/config.rst Outdated
@@ -295,6 +295,14 @@ General configuration

.. versionadded:: 1.3

.. confval:: manpages_url_pattern
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel manpages_url is simpler. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care - but i felt that "pattern" implied it was a bit more than just a prefix...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't opposite, could you rename this please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. done.

doc/config.rst Outdated
.. confval:: manpages_url_pattern

A URL to cross-reference manpages directives. If this is defined to
``https://manpages.debian.org/%s``, the ``:manpage:ls(1)`` role
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, manpages.org uses only the name part (without section) like http://manpages.org/ls .
So I propose you to use named format arguments like {full}, {name} and {section}.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, it would be very useful to add a hyperlink to here onto an introduction of :manpage: role.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about doing a tuple - but how do you parse the manpage? Sometimes people don't put a section, sometimes it's page.section or page(section), it's not really well normalized.

But if you think that's important, i'll try to work on a regex and switch to format-style patterns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, so I got this now:

    def visit_manpage(self, node):
        # type: (nodes.Node) -> None
        self.visit_literal_emphasis(node)
        if self.manpages_url_pattern:
            manpage = ' '.join([str(x) for x in node.children
                                if isinstance(x, nodes.Text)])
            pattern = r'^(?P<full>(?P<name>.+)[\(\.](?P<section>[1-9]\w*)?\)?)$'
            r = re.match(pattern, manpage)
            if r:
                info = r.groupdict()
            else:
                info = {'full': manpage}
            node['refuri'] = self.manpages_url_pattern.format(**info)
            self.visit_reference(node)

but my concern is that I have to copy this over the html5.py builder again, and then this will be duplicated all over the place. what would be a better place to keep that functionality?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work. I like it :-)

what would be a better place to keep that functionality?

I think this is a kind of transform module.

# in sphinx/transform/__init__.py
class ManpageTransform(SphinxTransform):
    def apply(self):
        for node in self.document.traverse(addnodes.manpage):
            manpage = ' '.join([str(x) for x in node.children
                                if isinstance(x, nodes.Text)])
            pattern = r'^(?P<full>(?P<name>.+)[\(\.](?P<section>[1-9]\w*)?\)?)$'
            r = re.match(pattern, manpage)
            if r:
                info = r.groupdict()
            else:
                info = {'full': manpage}
            for key, value in info.items():
                node[key] = value

# => Add this transform class to the transforms list in sphinx/io.py

This allows you to access name, section and full as attributes of nodes.
And it also make writers simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see, done.

sphinx/config.py Outdated
@@ -123,6 +123,7 @@ class Config(object):
primary_domain = ('py', 'env', [NoneType]),
needs_sphinx = (None, None, string_classes),
needs_extensions = ({}, None),
manpages_url_pattern = (None, 'html'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be effect not to only html, but also all builders (in future). For example, it would be nice if LaTeX builder also support the hyperlinks. ^So last argument should be 'env'. It means the update of settings effects to all builders.

Copy link
Contributor Author

@anarcat anarcat Nov 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah. okay. done.

@@ -117,6 +117,9 @@ html_static_path = ['{{ dot }}static']
#
# html_sidebars = {}

# Cross-reference :manpage: roles to actual sites.
#
# manpage_url_pattern = 'https://manpages.debian.org/%s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make default conf.py simple, please do not add new comment-block her.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, what do you mean? should i add it elsewhere? or not at all? i added this because it says that should be done on top of the Config class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry. Please forget the comment of Config class. I will change it later.
Current conf.py contains a lot of comments. I think it confuses users. So we have a plan to reduce these comments from conf.py.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove this please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. it wasn't clear from your comment you actually wanted it removed (or if you would remove it yourself at a later point).

@@ -79,6 +79,7 @@ def __init__(self, builder, *args, **kwds):
self.highlightopts = builder.config.highlight_options
self.highlightlinenothreshold = sys.maxsize
self.docnames = [builder.current_docname] # for singlehtml builder
self.manpages_url_pattern = getattr(builder.config, 'manpages_url_pattern')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After adding an entry to sphinx.config, you can access it always. So you don't need to use getattr here.
BTW, why are you storing the value of manpage_url_pattern as an instance variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because that's the pattern used for other configuration, like highlightopts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the reason why i'm using getattr is so that older configuration files keep working, but maybe that's not necessary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's not necessary. It is always enabled even if with old config-files, because you'd added it to sphinx.config.

because that's the pattern used for other configuration, like highlightopts.

I see, either is okay to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, fixed.

manpage = ' '.join([str(x) for x in node.children
if isinstance(x, nodes.Text)])
node['refuri'] = self.manpages_url_pattern % manpage
self.visit_reference(node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you tell me the result if manpage_url_pattern enabled?
It seems this visitor method call both visit_literal_emphasis and visit_reference. So I can't guess it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This:

:manpage:`hdparm(8)`

... gets turned in:

<em class="manpage"><a class="manpage reference external" href="https://manpages.debian.org/hdparm(8)">hdparm(8)</a></em>

i figured i would keep the existing emphasis style.

@tk0miya tk0miya added this to the 1.7 milestone Nov 12, 2017
@anarcat
Copy link
Contributor Author

anarcat commented Nov 12, 2017

Could you add testcase for this please?

See, that's where I'm stuck. Where should i add it?

@anarcat
Copy link
Contributor Author

anarcat commented Nov 12, 2017

i re-rolled a new patch with improved pattern support and documentation. i would need help to figure out where to put those eventual tests and how to avoid code duplication.

doc/config.rst Outdated

A URL to cross-reference :rst:role:`manpage` directives. If this is
defined to ``https://manpages.debian.org/{full}``, the
``:manpage:man(1)`` role will like to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use

:literal:`:manpage:`man(1)``

instead. It allows to represent role notations correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's quite a mouthful! but okay, done. :)

@@ -117,6 +117,9 @@ html_static_path = ['{{ dot }}static']
#
# html_sidebars = {}

# Cross-reference :manpage: roles to actual sites.
#
# manpage_url_pattern = 'https://manpages.debian.org/%s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry. Please forget the comment of Config class. I will change it later.
Current conf.py contains a lot of comments. I think it confuses users. So we have a plan to reduce these comments from conf.py.

@@ -79,6 +79,7 @@ def __init__(self, builder, *args, **kwds):
self.highlightopts = builder.config.highlight_options
self.highlightlinenothreshold = sys.maxsize
self.docnames = [builder.current_docname] # for singlehtml builder
self.manpages_url_pattern = getattr(builder.config, 'manpages_url_pattern')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's not necessary. It is always enabled even if with old config-files, because you'd added it to sphinx.config.

because that's the pattern used for other configuration, like highlightopts.

I see, either is okay to me.

@tk0miya
Copy link
Member

tk0miya commented Nov 13, 2017

See, that's where I'm stuck. Where should i add it?

Please add testcase to tests/test_build_html.py.
test_numfig_with_numbered_toctree is a good example for you. It modifies configuration using confoverrides parameter of @pytest.mark.sphinx, and verifies the generated HTML with @pytest.mark.parametrize decorator.

@anarcat
Copy link
Contributor Author

anarcat commented Nov 13, 2017

alright, i pushed a new patch with the recommended changes. next step is for me to figure out unit tests.

@anarcat anarcat force-pushed the manpage-links branch 2 times, most recently from 48b83c9 to d7f8552 Compare November 13, 2017 20:25
@anarcat
Copy link
Contributor Author

anarcat commented Nov 13, 2017

i think i got the test suite right. i had some weird failures elsewhere in the test suite, but i suspect this may be my local environment that was broken... we'll see what travis says.

i'm not sure what the second argument in the xpath checks is. i left it empty, but i have tried man(1) and :manpage:... to no avail. any ideas?

@anarcat
Copy link
Contributor Author

anarcat commented Nov 13, 2017

whoa, and the test suite actually passes now. woot! :)

Copy link
Member

@tk0miya tk0miya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nits

doc/config.rst Outdated
@@ -295,6 +295,14 @@ General configuration

.. versionadded:: 1.3

.. confval:: manpages_url_pattern
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't opposite, could you rename this please?

doc/config.rst Outdated

* ``manpage_name`` - the manpage name (``man``)
* ``manpage_section`` - the manpage section (``1``)
* ``manpage_full`` - the whole string (``man(1)``)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel manpage_ suffix is a bit verbose. It is obvious that these variables are related with manpage, because this settings is used for URL of manpages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, but internally, the node parameter would be ambiguous otherwise. since i just pass the whole node attributes to the formatter, i figured it was better to have unambiguous parameters.

now, i could expand that formatter to have an explicit dictionary with only those values, but that didn't seem very useful. i will do that, however, if you insist on changing the formatter...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel manpage_ prefix is also too verbose for node attributes. manpage_node['name'] and manpage_node['section'] are much simpler for me.
What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

er. well i explained this earlier: i thought having a name parameter in the node would be ambiguous. but i don't know how the parsers work - if you're saying it's unambiguous i'm happy to fix this!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. Usually, we know the type of nodes when we manipulate them. For example, ManpageLink transform searches manpage node using node.traverse(). And each visit_* methods of writer also know the type of node.
So I feel manpage_name of manpage node is verbose. We already know the node represents a reference for manpage.
I understand any_node['name'] is ambiguous. But we would not access in such way.

In addition, I feel the format params of manpages_url is verbose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood, i'll fix this then! :)

@@ -117,6 +117,9 @@ html_static_path = ['{{ dot }}static']
#
# html_sidebars = {}

# Cross-reference :manpage: roles to actual sites.
#
# manpage_url_pattern = 'https://manpages.debian.org/%s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove this please?

doc/config.rst Outdated

This also supports manpages specified as ``man.1``.

.. versionadded:: 1.7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention to this confval at an introduction of :manpage: role.
http://www.sphinx-doc.org/en/stable/markup/inline.html#role-manpage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation only effect to HTML output. So we should mention users the limitation (or rename this to html_manpages_url).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a note.

@anarcat
Copy link
Contributor Author

anarcat commented Nov 15, 2017

i believe i fixed all your remaining concerns, except the manpage_ prefix in the node attributes and formatter. let me know if you still think that's important to change.

@shimizukawa
Copy link
Member

FYI. The code freeze day of 1.7 will come in a week.

Mid of Jan: Feature freeze, Call for translation and release 1.7b1
#4211 (comment)

@anarcat
Copy link
Contributor Author

anarcat commented Jan 7, 2018

i remerged with master here to resolve conflicts, and I believe that I addressed all the concerns that were formulated here by @tk0miya - so i'm not sure what the next step is here.

@tk0miya
Copy link
Member

tk0miya commented Jan 8, 2018

@anarcat Please check the result of Travis CI and Circle CI. It seems many tests are failed. I will review it after resolved.

sphinx/config.py Outdated
@@ -125,6 +125,7 @@ class Config(object):
primary_domain = ('py', 'env', [NoneType]),
needs_sphinx = (None, None, string_classes),
needs_extensions = ({}, None),
manpages_url_pattern = (None, 'env'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the error comes from here. This is not renamed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@anarcat
Copy link
Contributor Author

anarcat commented Jan 10, 2018

@anarcat Please check the result of Travis CI and Circle CI. It seems many tests are failed. I will review it after resolved.

i pushed another fix and conflict resolution, let's see what CI thinks.

i'd also like to hear from you about the ambiguity question...

@anarcat
Copy link
Contributor Author

anarcat commented Jan 10, 2018

@tk0miya CI passes, this is ready for a final review. i think the only question is the manage_ prefix in the node object, if it is redundant or if just having e.g. name there would be ambiguous.

thanks!

It is useful to have the HTML documentation builder actually link to
real rendered versions of HTML manpages in its output. That way people
can click on manpages to get the full documentation. There are a few
services offering this online, so we do not explicitly enable one by
default, but the Debian manpages repository has a lot of the manpages
pre-rendered, so it is used as an example in the documentation.

The parsing work is done by a transformer class that parses manpage
objects and extract name/section elements. Those then can be used by
writers to cross-reference to actual sites. An implementation is done
in the two HTML writers, but could also apply to ePUB/PDF writers as
well in the future.

This is not enabled by default: the `manpages_url` configuration item
needs to be enabled to point to the chosen site. The `page`, `section`
and `path` parameters are expanded through Python string formatting in
the URL on output.

Unit tests are fairly limited, but should cover most common use-cases.
@anarcat
Copy link
Contributor Author

anarcat commented Jan 11, 2018

okay, so I did the following renames:

  • manpage_name is now page, to fit with the naming convention in man(1)
  • manpage_section is now section
  • manpage_full is now path because full seemed strange without context

CI seems to pass, so I think this would be ready to merge if you agree with the new naming.

Thanks for the prompt review!

Copy link
Member

@tk0miya tk0miya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for update :-)

@tk0miya tk0miya merged commit 7e365d9 into sphinx-doc:master Jan 13, 2018
@tk0miya
Copy link
Member

tk0miya commented Jan 13, 2018

Merged. Thank you for your contribution!

tk0miya added a commit that referenced this pull request Jan 13, 2018
@anarcat anarcat deleted the manpage-links branch January 13, 2018 22:38
@anarcat
Copy link
Contributor Author

anarcat commented Jan 13, 2018

yay, thanks!

keszybz added a commit to systemd/python-systemd that referenced this pull request Jun 16, 2019
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 7, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants