Lightweight enhancements to documentation syntax #28378

Closed
risc26z opened this issue Apr 16, 2024 · 14 comments

@risc26z

risc26z commented Apr 16, 2024

Problem

I recently found a beautiful pdf version of Vim's documentation, and wondered what it would take to update it for Neovim. As far as I can tell, it would be a lot of work, and would be a maintenance nightmare, as every change to the master help files would need to be mirrored in the LaTeX sources.

Reading #329 and a few other issues, I see that others have already given vimdoc's limitations some thought. The problem is that, as a format, it's not rich enough to facilitate good-quality conversion to other formats. It's also clear that replacing it is not really realistic.

Expected behavior

Rather than approaching the gigantic task of moving Neovim's documentation to another format, I wonder if I might make an alternative suggestion: add some lightweight enhancements to vimdoc's syntax, which by itself is lacking in structural markup.

It occurs to me that any such markup could be layered over a 'comment' syntax, much like Doxygen. All the builtin help viewer needs to do with such markup is ignore it.

A program written to convert documentation to HTML or LaTeX, on the other hand, could make use of the markup. This would provide a way to incrementally enhance Neovim's documentation to support better online browsing or produce beautiful typeset manuals (which could be an additional source of funding for the project).

Simple markup examples could be: mark the beginning and end of ASCII art (and alternative art for converters), indicate that the following table should have invisible borders or right-justified columns, etc.

Any thoughts?

risc26z added the enhancement (feature request) label Apr 16, 2024
@justinmk
Member

justinmk commented Apr 16, 2024

from #329 (comment) :

We have a parser for vimdoc (:help) file format now, and it is used by gen_help_html.lua to generate HTML. It would be trivial to generate other formats like markdown.

gen_help_html.lua can easily be modified to produce a PDF. Just waiting for someone to take a few hours to do that. A proof of concept was mentioned at neovim/doc#14

add some lightweight enhancements to vimdoc's syntax, which by itself is lacking in structural markup.

We have already done that, e.g.:

  • codeblocks support language annotations. (TODO: update :help help-codeblock to mention this)
  • list items are supported

vimdoc's syntax, which by itself is lacking in structural markup.

Not sure what you mean. The vimdoc parser recognizes list items and "blocks" (as opposed to "inline" text).

The major missing piece (compared to markdown) is support for "tables".

A program written to convert documentation to HTML or LaTeX, on the other hand, could make use of the markup

gen_help_html.lua exists and is used to produce https://neovim.io/doc/user/

Closing this since it lacks specifics. And most of the suggestions are covered already.

PDF generation is tracked in neovim/doc#14

@risc26z
Author

risc26z commented Apr 16, 2024

gen_help_html.lua can easily be modified to produce a PDF. Just waiting for someone to take a few hours to do that. A proof of concept was mentioned at neovim/doc#14

There's a big difference between a pdf and a beautiful pdf. I tried the proof of concept you mentioned. Depending on the page, the output varied between adequate and terrible. It's poor compared to the link I provided above, and even that isn't publication-quality.

We have already done that, e.g.:

  • codeblocks support language annotations. (TODO: update :help help-codeblock to mention this)
  • list items are supported

vimdoc's syntax, which by itself is lacking in structural markup.

Sorry, I should have written "lacking in sufficient structural markup."

There's no support for alternative renderings of ASCII art: compare the 'hjkl' diagram in usr_02.txt with page 7 of Steve Oualline's book. What would it take to generate both from the same file (or at least one base file plus a few supporting files such as diagrams)?

Suppose we could include something like:

@@asciiart alt="hjkl.svg"
    k
h       l
    j
@@end

And the help viewer would just treat lines beginning with '@@' in the 1st column as a comment and simply not display them. (I'm not suggesting this is the ideal syntax, just that if any such syntax existed it'd really help).
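A minimal sketch of how both sides might treat such markers, assuming the hypothetical `@@name key="value"` ... `@@end` syntax floated above (it is not existing vimdoc syntax, and the function names are made up for illustration). The viewer side simply drops every `@@` line; the converter side collects each annotated block together with its attributes.

```python
import re

# Hypothetical directive syntax: a line starting with "@@" in column 1 either
# opens a block ("@@asciiart alt=...") or closes it ("@@end").
DIRECTIVE = re.compile(r'^@@(\w+)(.*)$')
# Attributes are key="quoted value" or key=bareword.
ATTR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def strip_directives(lines):
    """Viewer side: drop every '@@' line and keep everything else."""
    return [ln for ln in lines if not ln.startswith("@@")]


def extract_blocks(lines):
    """Converter side: collect each annotated block with its attributes."""
    blocks, current = [], None
    for ln in lines:
        m = DIRECTIVE.match(ln)
        if m and m.group(1) == "end":
            if current is not None:
                blocks.append(current)
            current = None
        elif m:
            attrs = {k: (q if q else u) for k, q, u in ATTR.findall(m.group(2))}
            current = {"kind": m.group(1), "attrs": attrs, "body": []}
        elif current is not None:
            current["body"].append(ln)
    return blocks


sample = ['@@asciiart alt="hjkl.svg"', "    k", "h       l", "    j", "@@end"]
print(strip_directives(sample))
print(extract_blocks(sample)[0]["attrs"])
```

A converter could then look at `attrs["alt"]` and substitute the named image, while the plain-text body stays untouched for the built-in viewer.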

The major missing piece (compared to markdown) is support for "tables".

There's a kind of implied layout already, using aligned columns. Imagine if we could do:

@@table cols="rl" border=true
 12  Some text
  3  Some more text
@@end

And it would signal that LaTeX conversion would have one right and one left justified column, with a border.
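To make that concrete, here is a rough sketch of the LaTeX side, assuming the hypothetical `cols` and `border` attributes from the example above have already been parsed. `table_to_latex` and `latex_escape` are illustrative names, and splitting cells on whitespace is a simplification (only the last column may contain spaces).

```python
import re


def latex_escape(text):
    """Escape a few LaTeX special characters likely to appear in help text."""
    return re.sub(r'([&%$#_{}])', r'\\\1', text)


def table_to_latex(rows, cols="rl", border=True):
    """Convert column-aligned help text into a LaTeX tabular environment.

    rows:   body lines between the hypothetical '@@table' and '@@end' markers
    cols:   one letter per column (r = right, l = left, c = centred)
    border: whether to draw rules around and between the columns
    """
    ncols = len(cols)
    sep = "|" if border else ""
    spec = sep + sep.join(cols) + sep
    out = [r"\begin{tabular}{" + spec + "}"]
    if border:
        out.append(r"\hline")
    for row in rows:
        # Split at most ncols-1 times so the last column keeps its inner spaces.
        cells = row.split(None, ncols - 1)
        cells += [""] * (ncols - len(cells))
        out.append(" & ".join(latex_escape(c) for c in cells) + r" \\")
    if border:
        out.append(r"\hline")
    out.append(r"\end{tabular}")
    return "\n".join(out)


print(table_to_latex([" 12  Some text", "  3  Some more text"]))
```

The point is only that two short attributes carry enough information for a converter to pick the column spec; the aligned text itself remains readable in the help viewer.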

Again, the underlying problem isn't that the syntax is bad, it's just that it's not rich enough.

@clason
Member

clason commented Apr 16, 2024

And the other underlying problem is that this is the syntax we have, both regarding the engine and the (massive!) documentation we inherited from Vim and added to. Since Neovim is and will remain the primary output of our documentation, this is what you will have to work with, I am afraid. You can either try to make lemonade from it or move to a clean slate project like Helix.

(This is not saying we can't make incremental, backward compatible, changes like we already discussed with codeblock annotations. But these have to be concrete individual suggestions that we will evaluate each on its own. If we want to make big breaking changes, we'll just switch to Markdown wholesale. (Which we have been talking about supporting as an additional format.))

@justinmk
Member

justinmk commented Apr 16, 2024

There's no support for alternative renderings of ASCII art

Is that really a strong justification? Also doesn't seem "lightweight".

Table support should cover quite a lot of ground, as well as enhancing the "flow" layout that is already half-supported.

@risc26z
Author

risc26z commented Apr 17, 2024

There's no support for alternative renderings of ASCII art

Is that really a strong justification? Also doesn't seem "lightweight".

Hmm. Could code block annotations be extended to support giving the block an arbitrary label like "fig13"? That way, a converter could use it as a key into a data structure to do something "smart"?

Table support should cover quite a lot of ground, as well as enhancing the "flow" layout that is already half-supported.

You seem to be indicating that table support is planned. Could you point me towards anywhere I can find info about it?

@clason
Member

clason commented Apr 17, 2024

Could code block annotations be extended to support giving the block an arbitrary label like "fig13"? That way, a converter could use it as a key into a data structure to do something "smart"?

You can already do that. We won't (as this would be an abuse and risk unintended language injections).

You seem to be indicating that table support is planned. Could you point me towards anywhere I can find info about it?

It's planned but not started; the difficulty is integration both with our tree-sitter parser and Vim's legacy syntax engine (and the :help renderer!), which requires careful planning. This is not a "lightweight" addition. (I've been thinking about this for a while, and honestly it seems simpler to just switch to Markdown for our documentation, for which we already bundle a parser that has support for tables (but not a renderer -- yet).)

@risc26z
Author

risc26z commented Apr 17, 2024

Could code block annotations be extended to support giving the block an arbitrary label like "fig13"? That way, a converter could use it as a key into a data structure to do something "smart"?

You can already do that. We won't (as this would be an abuse and risk unintended language injections).

I didn't mean a label masquerading as a language. I mean something like >lua:fig1 or >text:mylabel. Would that be problematic?

@clason
Member

clason commented Apr 17, 2024

Yes.

@justinmk
Member

There's a big difference between a pdf and a beautiful pdf. I tried the proof of concept you mentioned.

The POC just processes the HTML, so it's not great. To get a beautiful PDF, our existing gen_help_html.lua could be modified to generate a PDF. That's probably 1-2 days of work.

You seem to be indicating that table support is planned. Could you point me towards anywhere I can find info about it?

Created a tracking issue: neovim/tree-sitter-vimdoc#132

@risc26z
Author

risc26z commented Apr 17, 2024

There's a big difference between a pdf and a beautiful pdf. I tried the proof of concept you mentioned.

The POC just processes the HTML, so it's not great. To get a beautiful PDF, our existing gen_help_html.lua could be modified to generate a PDF. That's probably 1-2 days of work.

I'm trying to understand it right now. I'm also trying to work out whether it's possible to automagically decide when to reformat material as normal, proportionally-spaced text.

BTW, do you know if there's an 'official' way to dump tree-sitter data (node types, locations, node text) as json, xml, or some other form of structured text to a file/buffer? I've written a bit of lua to dump it as an s-expression sort of thing, but it's pretty horrible code.

You seem to be indicating that table support is planned. Could you point me towards anywhere I can find info about it?

Created a tracking issue: neovim/tree-sitter-vimdoc#132

Thanks.

@justinmk
Member

justinmk commented Apr 17, 2024

I'm also trying to work out whether it's possible to automagically decide when to reformat material as normal, proportionally-spaced text.

That's what the "flow" layout is trying to do. gen_help_html.lua currently hardcodes some filenames:

local new_layout = {
  ['api.txt'] = true,
  ['lsp.txt'] = true,
  -- ...
}

and for those filenames, it lets HTML treat the text as 1 block. For all other files, it treats the text as "fixed width".

The vimdoc parser recognizes "blocks" of text (i.e. lines not separated by a blank line(s)), as well as lists. Those can be treated as proportionally-spaced text (flow layout).

BTW, do you know if there's an 'official' way to dump tree-sitter data (node types, locations, node text) as json, xml, or some other form of structured text to a file/buffer?

To see the treesitter tree for a :help buffer, visit the buffer and run :InspectTree. If you want to pass data directly to vim.treesitter.inspect_tree(), I think we need to enhance it a bit.

@risc26z
Author

risc26z commented Apr 17, 2024

I'm also trying to work out whether it's possible to automagically decide when to reformat material as normal, proportionally-spaced text.

That's what the "flow" layout is trying to do. gen_help_html.lua currently hardcodes some filenames [...] lets HTML treat the text as 1 block. For all other files, it treats the text as "fixed width".

Whitelisting seems rather brittle and inflexible to me. Wouldn't a documentation comment in the file be cleaner & also help convey the nature of the file to unfamiliar editors? (It could be something like "@@safelayout" to use the style I suggested above, but there's already a match for /^@@/, so maybe %% would be a better prefix.)
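As a sketch of what that could look like in a converter, assuming a hypothetical `%%safelayout` marker line near the top of the file (the marker name, `uses_flow_layout`, and the `scan_limit` parameter are all made up for illustration): the in-file marker is checked first, and the hardcoded list is kept only as a fallback.

```python
# Filenames taken from the snippet quoted earlier in the thread; the real
# list in gen_help_html.lua may be longer.
HARDCODED_FLOW = {"api.txt", "lsp.txt"}

# Hypothetical opt-in marker, using the "%%" prefix suggested above.
FLOW_MARKER = "%%safelayout"


def uses_flow_layout(filename, lines, scan_limit=10):
    """Return True if the help file should be rendered with flow layout.

    The marker is only honoured within the first `scan_limit` lines, so it
    behaves like a file header rather than something buried in the text.
    """
    for ln in lines[:scan_limit]:
        if ln.rstrip() == FLOW_MARKER:
            return True
    # Fallback: the existing hardcoded list.
    return filename in HARDCODED_FLOW


print(uses_flow_layout("foo.txt", ["*foo.txt*  Example", "%%safelayout"]))
print(uses_flow_layout("foo.txt", ["*foo.txt*  Example"]))
```

Keeping the fallback means files could be migrated one at a time, and the list could be deleted once every flow-layout file carries the marker.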

The vimdoc parser recognizes "blocks" of text (i.e. lines not separated by a blank line(s)), as well as lists. Those can be treated as proportionally-spaced text (flow layout).

This can't be done reliably, as there's lots of surprise formatting. I'm experimenting with a couple of heuristics regarding tabs and sequences of spaces to see whether a particular paragraph can be reformatted. I'm not sure if this can be made 100% accurate, but it might be possible to automate most of it. (And if people decide to move to markdown eventually, it might be possible to semi-automate that process, too.)
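One possible shape for such a heuristic, as a deliberately conservative sketch rather than the exact rules being experimented with: treat a paragraph as safe to reflow only if it contains no tabs, no interior runs of two or more spaces (other than sentence spacing after `.`, `!` or `?`), and a single common indentation. The function name and thresholds are assumptions.

```python
import re

# A run of 2+ spaces between two non-space characters usually means
# hand-aligned columns. Runs that follow sentence-ending punctuation are
# tolerated, since two-space sentence spacing is ordinary prose.
INTERIOR_GAP = re.compile(r'[^\s.!?] {2,}\S')


def is_reflowable(paragraph):
    """Guess whether a paragraph (list of lines) is plain prose.

    Conservative by design: when in doubt, return False and leave the text
    in fixed-width layout.
    """
    lines = [ln for ln in paragraph if ln.strip()]
    if not lines:
        return False
    indents = set()
    for ln in lines:
        if "\t" in ln:
            return False  # tabs usually indicate column layout
        body = ln.lstrip(" ")
        if INTERIOR_GAP.search(body.rstrip()):
            return False  # looks like aligned columns
        indents.add(len(ln) - len(body))
    # Mixed indentation suggests a hanging indent or nested structure.
    return len(indents) == 1


print(is_reflowable(["Ordinary prose that wraps", "over two lines.  Fine."]))
print(is_reflowable([" 12  Some text", "  3  Some more text"]))
```

False negatives are cheap here (the paragraph just stays monospaced), so the rules can err on the strict side and be loosened as specific files are checked by hand.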

@justinmk
Member

Whitelisting seems rather brittle and inflexible to me. Wouldn't a documentation comment in the file be cleaner & also help convey the nature of the file to unfamiliar editors?

Of course. This is trivial to do, it just doesn't matter much until more docs start supporting flow layout (which just means they are valid vimdoc, nothing more).

This can't be done reliably, as there's lots of surprise formatting

Not sure what you mean. It works just fine on the hardcoded list that I linked. Of course it won't work on docs that are jumbled nonsense.

@risc26z
Author

risc26z commented Apr 20, 2024

To see the treesitter tree for a :help buffer, visit the buffer and run :InspectTree. If you want to pass data directly to vim.treesitter.inspect_tree(), I think we need to enhance it a bit.

No, that wasn't the issue. What I needed was a way to dump the tree (including position & text) to a structured file so that I could process it in an external program. It's a moot point for me, but it might be a useful feature to add if it's not already implemented somewhere.
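For illustration only, the kind of structured dump meant here might look like the sketch below. The `Node` class is a hypothetical stand-in for whatever the parser returns, not a tree-sitter API; populating it from a real parse is left out.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a syntax-tree node (hypothetical, for illustration)."""
    type: str
    start: tuple  # (row, column)
    end: tuple    # (row, column)
    text: str = ""
    children: list = field(default_factory=list)


def node_to_dict(node):
    """Recursively convert a node into plain dicts and lists for json."""
    d = {"type": node.type, "start": list(node.start), "end": list(node.end)}
    if node.children:
        d["children"] = [node_to_dict(c) for c in node.children]
    else:
        # Only leaves carry text, to avoid repeating it at every level.
        d["text"] = node.text
    return d


def dump_tree(root, indent=2):
    """Serialize the whole tree as a JSON string."""
    return json.dumps(node_to_dict(root), indent=indent)


leaf = Node("word", (0, 0), (0, 5), "hello")
root = Node("line", (0, 0), (0, 5), children=[leaf])
print(dump_tree(root))
```

An external program can then read the file with any JSON library and walk `children` without knowing anything about the parser that produced it.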
