[RFC 0064] New Documentation Format for nixpkgs and NixOS #64

Open
wants to merge 24 commits into
base: master
from

Conversation

@Infinisil
Member

Infinisil commented Jan 5, 2020

Many people from the Nix community are interested in evaluating alternatives to DocBook as the documentation format for nixpkgs/NixOS.
Through the process of this RFC, a potentially new doc format will be decided.

In short, with this RFC a set of requirements for a doc format is decided, after which candidates that fulfil them are collected and evaluated. A public poll is then held allowing everybody to vote on which formats they'd prefer. The result of this poll is then to be used by the shepherds to come to a final decision.

Rendered

Pinging some people who have shown interest in this topic and/or are knowledgeable in certain formats: @edolstra @grahamc @domenkozar @zimbatm @alyssais @peterhoeg @danbst @chreekat

@Infinisil Infinisil changed the title [RFC 0063] New Documentation Format [RFC 0064] New Documentation Format Jan 5, 2020
@edolstra

Member

edolstra commented Jan 5, 2020

FWIW, I don't want to move away from DocBook; at least not to the listed alternatives, which are all worse than DocBook for technical documentation.

@Infinisil

Member Author

Infinisil commented Jan 5, 2020

@edolstra As mentioned in the first paragraph, this is not the place for such discussions. If you have another format you'd prefer, feel free to suggest it, or feel free to write an objective overview of docbook for the RFC.

@grahamc

Member

grahamc commented Jan 5, 2020

Thank you Silvan for moving forward on this project. However, I feel this RFC is too early: This RFC is jumping to solutions, when we don't have a handle on (or agreement on) the problem's requirements. I feel this RFC is more akin to a technical survey of available documentation technology, potentially answering many questions about each individual tool -- without an idea of how to concretely determine if they meet our requirements. The RFC I would hope to see is an exploration and documentation of what our project requires out of its documentation tooling.

@Infinisil

Member Author

Infinisil commented Jan 5, 2020

I think all formats listed here are easily up to the task. If you have a concrete reason to believe one of them is infeasible, feel free to elaborate and we can remove it from the candidates if it's really a complete no-go. If in the end the chosen format is found to be infeasible, we will use the one in second place, and so on, as described in the RFC.

@grahamc

Member

grahamc commented Jan 5, 2020

I'm glad you think they are all up to the task! If we started first by evaluating requirements, we might agree. Evidently, Eelco strongly disagrees. I think this project is much more likely to succeed if we start with some requirements first.

Edit: I would not expect you to put any in the list which you didn't think were up to the task. However, we've discussed the problems with asciidoctor's closure size and it remains in this list. We cannot remove candidates if we don't have agreement on what the requirements are. We cannot evaluate candidates any more than superficially if we don't have agreement on what the requirements are, either.

@FRidh

Member

FRidh commented Jan 5, 2020

I agree with @grahamc that we should list requirements first. Additionally, an RFC should specify which projects' documentation it would cover because the different projects have, although there's overlap, their own communities.

@zimbatm

Member

zimbatm commented Jan 5, 2020

Just a reminder that an RFC's value is also in the associated discussion. It doesn't have to be perfect and can also change dramatically in its form. @Infinisil thanks for raising a subject which has been a recurring topic in the community.

@Infinisil

Member Author

Infinisil commented Jan 5, 2020

Sounds good, let's start with requirements and decide on candidates based on those. I'll start with some:

  • Supports standard stuff like bold, italics, code, links, references (all formats support this)
  • Inter-file references (I'm not sure if all formats support this)

I personally don't think small closure size is a hard requirement, as there's no way to define "small", and it can change over time too. I also think all formats have a passable closure size as of now. Asciidoctor is about 1GB, but Antora which can also handle Asciidoc is only like 300MB.

Edit: Also note that nixpkgs currently uses Pandoc to process markdown, which is over 2GB in closure size.

@grahamc

Member

grahamc commented Jan 5, 2020

I think the requirement gathering is significant enough to warrant its own RFC (and as @zimbatm pointed out, that could be this one), and quite probably a community video call about it.

In terms of closure size, I think the closer we can get to 50MB, the better. For one datapoint, using jing in the XML build process was no good at its existing size, and I'm pretty sure that was quite a bit smaller than 300MB.

My list of requirements would be something like:

  • errors while authoring documentation are trivial to spot, and don't require a debugger
  • has an existing ecosystem of tools which already:
    • produces searchable, multi-page documentation as HTML
    • produces man pages
    • supports translations at some level (i.e. internationalization)
    • takes 10 seconds or less to generate the full documentation set on a laptop, 1 minute or less on a Raspberry Pi 3B+
      • ideally, supports partial documentation rebuilding or even live-reloading for faster iteration

A nice-to-have would be: rendered well in GitHub's "visual diff".

@jtojnar

jtojnar commented Jan 5, 2020

Some features I like to use in our current toolchain:

  • Annotations/Links inside code (e.g. linking options in configuration snippets)
  • Ability to make an arbitrary command-line prompt in code listings non-copyable (see docbook.rocks example, I always hate it when I copy a multiline bash snippet and then have to delete the $)
  • Create anchors anywhere (if not inside paragraphs, at least for list items)
  • Automatic link labels (again, for options)
@FRidh

Member

FRidh commented Jan 5, 2020

> Additionally, an RFC should specify which projects' documentation it would cover because the different projects have, although there's overlap, their own communities.

To clarify my point here: I don't think the whole Nix/NixOS/Nixpkgs community should decide what the docs should be like for e.g. Nix, Hydra or NixOps. We have different projects composed of different contributors, and within those groups of contributors you have a subset that writes docs. Let the projects decide this for themselves.

Furthermore, I think that if we decide on a process for the format, we should also reconsider exactly which manuals we want to have. There has often been talk about user manuals, tutorials, and reference manuals. This is something that I think adds value.

@Infinisil

Member Author

Infinisil commented Jan 5, 2020

@FRidh I think we should at least have the same format for all of nixpkgs including NixOS, otherwise we won't get much out of it, since nixpkgs gets the majority of use and contributions.

Nix itself would benefit from the same format as nixpkgs too, at least for the builtins.* docs defined in it, which we could then easily include in the lib.* nixpkgs docs, which is where a lot of builtins.* are reexported.

Also to consider is that many projects define NixOS options, which are all processed and rendered in a single evaluation, so it would simplify things if all had the same format. However I think with some better doc tooling we should also be able to allow projects to choose their own format for options, even if they're rendered at the same time as the NixOS ones.

@Infinisil

Member Author

Infinisil commented Jan 5, 2020

For now I think keeping it to nixpkgs only (including NixOS) should be good.

@Infinisil Infinisil changed the title [RFC 0064] New Documentation Format [RFC 0064] New Documentation Format for nixpkgs and NixOS Jan 5, 2020
@Shados

Shados commented Jan 6, 2020

@Infinisil For what it's worth, markdown does have a standardized specification and test suite in CommonMark, which is fairly near a finalised 1.0 release.

@Infinisil

Member Author

Infinisil commented Jan 6, 2020

@Shados However, as even the page you linked mentions, the original Markdown does not have a standard. But yeah, CommonMark would almost certainly be the standardized Markdown to use because of what you said. I also just learned that it's what Sphinx supports (in addition to reST), and that GitHub's Markdown is a strict superset of CommonMark and they intend to have full 1.0 conformance. I'll change the RFC to reflect that CommonMark would be the Markdown to use for these reasons.

Infinisil added 2 commits Jan 6, 2020
@zimbatm

Member

zimbatm commented Jan 6, 2020

A few links to existing discussions so we don't go over the same arguments again:

Infinisil added 3 commits Jan 6, 2020
- Inter-file references for being able to link to options from anywhere
- Ability to create link anchors to most places such that we can link to e.g. paragraphs
- Errors are easily and quickly detectable, e.g. with a fast and good processor, a live-view, or highlighting editor plugins
- Is decently fast to fully generate, in the range of 10 seconds for the full documentation on an average machine

@domenkozar

domenkozar Jan 7, 2020

Member
  • supports syntax highlighting (of Nix)
  • wide editor integration (this could be added to the errors bullet point)
  • active community supporting the tooling infrastructure
  • good conversion story from docbook
  • ideally a good search integration
  • low work cost for integration (this one is really important as docbook fails hard)

@Infinisil

Infinisil Jan 7, 2020

Author Member

Nice, what do you mean by the last point though? That we don't need to write too many customizations and supporting code to make it work?

- A [Discourse](https://discourse.nixos.org/) post is created with these overviews, along with a poll such that people can vote on the formats they prefer. This poll will be open to the whole community and should be advertised as such
- Whatever format wins in the poll is chosen as the new default documentation format. If later it is discovered that the winner is infeasible for any reason, e.g. if it doesn't meet the requirements after all, the format in second place is chosen instead, and so on.

## Poll

@domenkozar

domenkozar Jan 7, 2020

Member

I think a poll is too vague; people should discuss on this RFC exactly what they'd like and then the shepherd team makes a decision.

@Infinisil

Infinisil Jan 7, 2020

Author Member

I think it's important to involve as many people from the community as possible, because they will be the ones writing and reading the docs. Unfortunately RFCs are known to be bad places for involving many people, since they tend to get long quickly and less-active people get overshadowed by very vocal people.

As you suggested to me though, having the poll happen before the RFC is accepted and using it as an input for the shepherd team might be a good idea.

@Infinisil

Infinisil Jan 11, 2020

Author Member

I changed the RFC to reflect this

@nixos-discourse

nixos-discourse commented Jan 7, 2020

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/documentation-format/4650/14

@emilazy

Member

emilazy commented Mar 5, 2020

> @emilazy would you feel up to filling in the overview for AsciiDoc in the RFC? You seem to have experience with it.

Not sure I'll have the time, but I'll try and take a look at some point soon.

@domenkozar

Member

domenkozar commented Mar 5, 2020

> We would like to propose @domenkozar as a leader for this RFC.

Since I'm being loud about this RFC making a huge difference, I'll put time where my mouth is and accept :) Thank you.

Add PowerDNS as ReST user example
@edolstra edolstra mentioned this pull request Mar 19, 2020
10 of 10 tasks complete
@nlewo

Member

nlewo commented Mar 20, 2020

I had the opportunity to experiment with converting the NixOS manual to ReST with Sphinx extensions.

The goal is to explore the feasibility of such a conversion and to show what it could look like, in order to help us choose a documentation format.

To convert DocBook files to ReST, I first tried Pandoc, but a lot of DocBook directives were removed during the conversion. The Sphinx documentation points to a Python script that converts a DocBook document to a ReST document (with Sphinx extensions) using lxml. I had to patch this script to improve the conversion.
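
To illustrate the kind of work such a conversion script does, here is a minimal sketch. It is not the script mentioned above: it uses the standard library's `xml.etree.ElementTree` rather than lxml, the `INLINE` mapping and the function names are made up for this example, and it only handles inline markup inside a single paragraph. Elements without a mapping keep their text but are reported, so that lost markup does not go unnoticed.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a few DocBook inline elements to reST markup.
INLINE = {
    "emphasis": "*{}*",
    "literal": "``{}``",
    "filename": ":file:`{}`",
}


def local_name(tag):
    """Strip an XML namespace prefix such as {http://docbook.org/ns/docbook}."""
    return tag.rsplit("}", 1)[-1]


def inline_to_rest(elem, dropped):
    """Render the mixed content of `elem` as a single line of reST."""
    parts = [elem.text or ""]
    for child in elem:
        inner = inline_to_rest(child, dropped)
        template = INLINE.get(local_name(child.tag))
        if template is None:
            # No mapping: keep the text, but record that markup was lost.
            dropped.add(local_name(child.tag))
            parts.append(inner)
        else:
            parts.append(template.format(inner))
        parts.append(child.tail or "")
    # Collapse the whitespace left over from the XML source layout.
    return " ".join("".join(parts).split())


def para_to_rest(xml_text):
    """Convert one DocBook <para> and return (reST text, dropped element names)."""
    dropped = set()
    text = inline_to_rest(ET.fromstring(xml_text), dropped)
    return text, dropped
```

Calling `para_to_rest` on a paragraph containing a `<command>` element returns the converted text together with `{"command"}`, which makes it visible that a mapping is still missing.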

Note that the closure size of Sphinx (the tool that generates the docs from ReST files) is currently 177MB.

Resources

The good

  • ReST is simple
  • ReST is rendered by GitHub: for trivial documentation changes, a
    reviewer doesn't need to generate the manual since the GitHub
    preview would be sufficient
  • Sphinx detects some inconsistencies in the document (dangling
    pointers for instance)
  • Sphinx is written in Python which is a language already used by
    the community (by the new NixOS test system)
  • Manpages can be written in ReST and are generated by Sphinx
  • Sphinx can generate XML for the next documentation conversion;)
  • Emacs ReST mode works well by default
  • Sphinx is able to generate source code documentation. Maybe we
    could develop a Sphinx extension to generate the documentation for
    our standard nixpkgs library: this however needs to be explored
  • Sphinx provides a builtin search engine (edited thx to Mic92 comment)
  • Online version works well without javascript enabled

Issues/Improvements

  • Implement a Sphinx cache: it takes about 5min (with an i7-8565U CPU) to generate the documentation with all NixOS options. When only the first 100 options are considered, it takes 15sec. It would then be nice to generate a cache for the option documentation in order to speed up the build of the hand-written part and make contributions more convenient
  • Improve the ReST output: I didn't really try to improve the generated ReST files. For instance, paragraphs are not properly wrapped at 80 characters
  • DocBook callouts: these are not supported yet (and I didn't try to)
  • To generate the nixos-rebuild manpage, the ReST file generated by the conversion script has been manually post-processed (~20min)
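
One way the caching idea from the first item could look is sketched below. This is only an assumption about a possible design, not an existing Sphinx feature: `render_option` is a stand-in for the expensive per-option step, and `render_cached` and the cache directory layout are hypothetical. Each option's rendered output is stored under a hash of its input, so unchanged options are not rendered again.

```python
import hashlib
import json
from pathlib import Path


def render_option(name, option):
    """Stand-in for the expensive per-option rendering step (hypothetical)."""
    underline = "-" * len(name)
    return f"{name}\n{underline}\n\n{option.get('description', '')}\n"


def render_cached(name, option, cache_dir, render=render_option):
    """Return rendered reST, reusing the cached copy if the input is unchanged."""
    # The cache key covers everything the output depends on.
    payload = json.dumps([name, option], sort_keys=True).encode("utf-8")
    key = hashlib.sha256(payload).hexdigest()
    path = Path(cache_dir) / f"{key}.rst"
    if path.exists():
        return path.read_text(encoding="utf-8")
    text = render(name, option)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return text
```

Because the key is derived from the option's content, editing one option only invalidates that option's entry. Whether this would actually remove the bottleneck depends on where Sphinx spends the 5 minutes, which the numbers above do not show.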
@toonn

toonn commented Mar 20, 2020

Could you clarify the following point? The part about the module system is news to me.

>   • Sphinx is written in Python which is a language already used by
>     the community (by the new NixOS module system)
@nlewo

Member

nlewo commented Mar 20, 2020

@toonn Sorry, it's a mistake. I wanted to mention the new "NixOS test system" which is now in Python instead of Perl. I fixed my comment.

@nixos-discourse

nixos-discourse commented Mar 22, 2020

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/comment-doc-praise-thread/6325/3

@domenkozar

Member

domenkozar commented Mar 23, 2020

@nlewo excellent work :)

@Mic92

Mic92 commented Mar 23, 2020

The good

  • sphinx provides a builtin search engine
@Mic92

Mic92 commented Mar 23, 2020

@domenkozar Has this RFC already had a meeting?

@domenkozar

Member

domenkozar commented Mar 23, 2020

Not yet, giving it a bit more time. I'd like to see more tooling compared before we start talking.

@nlewo

Member

nlewo commented Mar 23, 2020

@Mic92 Thanks. I edited my comment to add this item.

@emilazy

Member

emilazy commented Mar 23, 2020

> I'd like to see more tooling compared before we start talking.

FWIW, I was idly planning on trying a scratch conversion of the manual to AsciiDoc with DocBookRx or similar soon; @nlewo's great work on reST has made me mentally bump the priority of that up for comparison purposes (no promises though).

(That said, I definitely don't mean this in a competitive way – the conversion looks great and I'm confident reST would be an improvement over the status quo. "Anything but DocBook or Markdown" easily gets my vote!)

@nixos-discourse

nixos-discourse commented Mar 24, 2020

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/documentation-presentation-navigation/6401/4

@adisbladis

Member

adisbladis commented Mar 24, 2020

Like @nlewo I was also experimenting with documentation formats with the same goal but I picked a different one, namely Asciidoc.

The total closure size of Asciidoctor (used to convert from asciidoc to man pages, html & epub) is 142M.

The work can be found in the asciidoc branch on my Nixpkgs fork on Github.

Introduction

Asciidoc is a lightweight markup language.
Unlike most of these other lightweight formats Asciidoc supports structural elements necessary for writing technical documentation.

Notes

I started this effort using Pandoc, but it turns out it's woefully inadequate at converting from DocBook, to AsciiDoc, or both.

So we've packaged DocBookRx & Kramdown AsciiDoc, which in combination get us pretty close to a workable conversion. Even these tools are not able to do the conversion fully automatically, so we have a custom post-processor that touches up some of the shortcomings in the documentation generator.

This Python script should be dropped (via git rebase) before merging this work.

There is another Python script for man page conversion as this was not supported by any of the tools used.

The good

  • Building nixpkgs docs takes well under a second - even on my laptop

  • The documentation's build closure size is now 142M (thanks to work by @alyssais to reduce asciidoctor closure size)

  • The build machinery is much simplified

  • Asciidoc is more intuitive than DocBook
    This is more of a personal preference than anything else

  • Asciidoctor has good support for syntax highlighting
    Asciidoctor has support for a wide variety of highlighters:

    • Rouge - My current choice, "just works" and has good support for Nix
    • Pygments - One of the widest language support highlighters out there, in Python though so we may want to steer clear for closure size reasons.
    • coderay, prettify, highlight.js - Also supported, but I don't feel we need to get into these.
  • Has first-class support for epub

  • Composes nicely via includes

  • Supported & rendered nicely by Github

The bad

  • While converting the docs I've noticed a huge value in the old docs being XML (or some other well-understood format)
    It's trivial to parse an XML file using custom tooling, the same cannot be said for asciidoc (or other "human-friendly" formats).

  • Custom conversion tooling may have missed something
    As most data loss I've seen in doc conversion so far has been silent it's possible we're losing something important.
    Care needs to be taken and things manually checked.

  • No support for inline Markdown
    Unlike in DocBook where we can import Markdown files pretty seamlessly we have to convert Markdown to asciidoc.
    This can be seen both as a positive & a negative depending on your perspective.

Samples

Pre rendered samples can be found at:

You can also take a look directly at the adoc files in the Github UI.

Note that the samples are not 100% complete and have rendering issues; this should serve as a demo and is not a final polished implementation.

@domenkozar

Member

domenkozar commented Mar 24, 2020

On top of that we could use https://antora.org/ to aggregate all projects under a single docs.nixos.org and a single search!

Awesome work y'all :D

@toonn

toonn commented Mar 24, 2020

Isn't Python already part of the NixOS closure? nlewo mentioned it's used by the new test infrastructure.

@emilazy

Member

emilazy commented Mar 24, 2020

I'm very happy to see full conversions for my two favourite mainstream lightweight markup languages for comparison!

It would be great to get numbers for NixOS whole system closure size differences relative to the DocBook status quo; it looks like AsciiDoctor and Sphinx should be competitive in terms of closure size, but Python already being present on most systems by default might affect that.

@emilazy

Member

emilazy commented Mar 24, 2020

> On top of that we could use https://antora.org/ to aggregate all projects under a single docs.nixos.org and a single search!

This looks great! An example of it being used in the wild: https://apple.github.io/servicetalk/servicetalk/SNAPSHOT/index.html

Interestingly, this uses Asciidoctor.js, a compilation of the AsciiDoctor Ruby source to JavaScript; it might be worth taking a look at packaging that with node.js and comparing its closure size, too.

@michaelpj

michaelpj commented Mar 25, 2020

Having used Asciidoc a bit, I disagree with some of the points.

> Has first-class support for epub

I would say it's second-class at best. I spent a good week or so trying to make this work, and I found numerous bugs in the plugin that does the conversion. The lack of a semantic document model means that the implementation is a fragile hack that relies on introspecting the internals of the parser.

> Composes nicely via includes

It doesn't really "compose" so much as "smash together". The includes are textual includes, with all that entails. This means that e.g. getting multi-page structures out is hard because the processor doesn't even know that things come from different files.
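
A simplified model of what a purely textual include means is sketched below. It is not Asciidoctor's implementation: `expand_includes` and the in-memory `files` dictionary are made up for this example, and only the bare `include::path[]` form is handled. The point is that the result is one flat list of lines with no record of which file each line came from.

```python
import re

# Matches the bare form of an AsciiDoc include directive: include::path[]
INCLUDE = re.compile(r"^include::(?P<path>[^\[]+)\[\]$")


def expand_includes(name, files):
    """Textually splice included files into one flat list of lines."""
    out = []
    for line in files[name].splitlines():
        match = INCLUDE.match(line)
        if match:
            # The included text simply replaces the directive line.
            out.extend(expand_includes(match.group("path"), files))
        else:
            out.append(line)
    return out


files = {
    "manual.adoc": "= Manual\ninclude::install.adoc[]\ninclude::config.adoc[]",
    "install.adoc": "== Installing\nRun the installer.",
    "config.adoc": "== Configuring\nEdit the file.",
}
lines = expand_includes("manual.adoc", files)
```

After expansion, anything that wants to emit one page per source file has to recover the boundaries some other way, for example from heading levels.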

Note that the Sphinx documentation is naturally shown on multiple pages, but for Asciidoc you need a third-party plugin (which doesn't even support the newest version), or Antora.

> On top of that we could use https://antora.org/ to aggregate all projects under a single docs.nixos.org and a single search!

My experience of trying to use Antora was that it was much less polished than Sphinx, and has to do a lot of non-standard things to Asciidoc to get around its limitations, e.g. the lack of a semantic document model really hurts when trying to make inter-file links.

> Interestingly, this uses Asciidoctor.js, a compilation of the AsciiDoctor Ruby source to JavaScript

For bonus points: the plugins that I mentioned are for the Ruby version of Asciidoctor so won't even work with Asciidoctor.js.

Overall, I would not use it again and I would try reST next time.

@joepie91

joepie91 commented Mar 25, 2020

>   • While converting the docs I've noticed a huge value in the old docs being XML (or some other well-understood format)
>     It's trivial to parse an XML file using custom tooling, the same cannot be said for asciidoc (or other "human-friendly" formats).

This is not an entirely fair comparison; while the structural parsing is handled by an off-the-shelf XML parser, the format parsing is not. You still need to write an implementation that understands the format (DocBook, in this case) to meaningfully operate on the data.

Especially considering the relative verbosity of the format, that is not necessarily an easier task than even writing an AsciiDoc parser from scratch, using something like a PEG parser generator.

@asymmetric

asymmetric commented Mar 25, 2020

@adisbladis a couple of issues I've found:

  • the manual is rendered 3 times
  • some links to other sections are rendered as ???

This seems to confirm what you were saying about errors being silent.

@domenkozar

Member

domenkozar commented Mar 25, 2020

You need to use a link validator with asciidoc, it doesn't do validation by default. Sphinx is better in that sense.
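
As a rough idea of what such a validation step checks, here is a minimal sketch using only Python's standard library. It is not an existing tool: `AnchorCollector` and `dangling_links` are names made up for this example, and it only looks at same-page `#fragment` links in a single HTML string, not at links between files.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect element ids and same-page link targets from rendered HTML."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.targets.append(href[1:])


def dangling_links(html):
    """Return the sorted fragment targets that have no matching id."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return sorted(set(collector.targets) - collector.ids)


page = (
    '<h2 id="install">Install</h2>'
    '<p><a href="#install">ok</a> <a href="#missing">???</a></p>'
)
broken = dangling_links(page)
```

Running a check like this over the generated HTML and failing the build when the list is non-empty would turn silently broken cross-references into visible errors.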

@FRidh

Member

FRidh commented Mar 28, 2020

Good to see these experiments being done.

As announced on NixOS Weekly, I put together some tutorials that can be used interactively. This led me to a Sphinx package called ThebeLab that provides the possibility to include "code cells" which can be interactive; that is, after pressing "Activate" a Docker container is started, allowing users to run and modify the code. This is especially interesting for newcomers. It's already possible to make it use Nix (it's how my tutorials work).

@domenkozar domenkozar mentioned this pull request Apr 1, 2020
@nixos-discourse

nixos-discourse commented Apr 1, 2020

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update/6525/1

@Mic92 Mic92 mentioned this pull request Apr 2, 2020
10 of 10 tasks complete