boot.janet loop docstring to markdown #507
Format `loop` docstring as markdown, fenced with double-backtick quotes.
Is there any interest in formatting docstrings in markdown like that? Benefits:
Drawbacks:
Concern:
Note, in the PR commit above, there are two types of lists in there:
|
A couple of examples of other spots where markdown formatting in docstrings would improve readability:
For that last one, keywords like |
Looks better without each of those keywords in backticks.
Here's that docstring converted to html by Pandoc with zero css styling applied, and also as a pdf. |
There is some formatting to docstrings already, although it can be a bit hard to read just reading the code. Not really against improving this, but there needs to be some way of formatting this when using the

The current docstring formatting rules are very simple: any number of spaces or a single newline is considered a word break so we can wrap on them, but tabs are preserved for indentation. Multiple newlines in a row are also preserved. This is how we can do formatting without any complicated syntax - this lets you do

So yeah, I'm fine with merging this after we can make the doc macro handle markdown more intelligently, which means a markdown parser (at least a subset of one) in boot.janet. That might just be fixing/adding support for lists, and maybe even optional support for VT100 ANSI escape codes for nicer formatting.

As for preference, I think the bulleted list is much nicer looking than a definition list. |
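The wrapping rules described in the comment above (runs of spaces or a single newline count as a word break, several newlines in a row are preserved) can be sketched roughly as follows. This is an illustrative Python sketch, not the code in boot.janet: the function name `reflow` and the `width` parameter are made up for the example, and the tab-preservation rule is deliberately left out.

```python
import re
import textwrap


def reflow(docstring: str, width: int = 80) -> str:
    """Re-wrap text: spaces and single newlines are word breaks,
    runs of two or more newlines are kept exactly as written.

    Hypothetical helper for illustration; tabs are NOT handled here.
    """
    # Split on runs of 2+ newlines, keeping the separators in the result.
    parts = re.split(r"(\n{2,})", docstring)
    out = []
    for part in parts:
        if part.startswith("\n\n"):
            # A paragraph separator: preserve it verbatim.
            out.append(part)
        else:
            # Collapse spaces and single newlines, then wrap to width.
            words = part.split()
            out.append(
                textwrap.fill(
                    " ".join(words),
                    width=width,
                    break_long_words=False,
                    break_on_hyphens=False,
                )
            )
    return "".join(out)
```

With rules this simple, a list or table in a docstring is treated as ordinary words and reflowed, which is the motivation for teaching the doc macro a subset of markdown.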
Ah, ok, I see. That's why the doc macro currently works when each line is indented by 2 spaces. It's condensing that leading whitespace into a single space, then wrapping. One thing I like about the current docstrings is that you don't have to make them flush left (at column zero). They're indented (usually by 2 spaces), but that doesn't show up in the terminal window doc output. This makes the code look nice (docstrings are indented the same amount as the code starts below them). Yes, I think it would be an excellent idea to dedent docstrings --- find out whatever leading space all non-blank lines have in common, and lstrip that (and only that) away. That way, the author can be confident that, say, leading space before a list marker won't be corrupted. It sounds like it would also be good if the So, if you have
then that would leave any docstring markdown formatting alone and the result should look good in the terminal (as well as having the docstring be suitable for conversion to html via something like Pandoc). Note, I don't mean to suggest that the Regarding the Pandoc definition list format, while maybe not ideal, IIRC they went through a lot of discussions on how to choose something that:
I think it looks a little suboptimal here because:
But for list items where the definition takes up multiple lines, it works out pretty well. |
As plain (markdown) text, to make it less dense, you could go with a bullet list separated by blank lines:
but I think the definition list syntax, even though it contains that extra colon, still looks better:
Having those definitions all indented the same amount makes it look more consistent and easier to scan, IMO. And Pandoc (or other markdowns that support this syntax) can render it nicely as html. |
Another alternative (the best-looking one, IMO) to the definition list, is a table:
From which Pandoc yields what you'd expect (this is styled a bit on my system):

Though, any line-wrapping on that for a narrow terminal window would mess it up. What do you think about setting a hard minimum where no wrapping would happen for lines, say, 80 chars or shorter? That way, if an author wanted to make sure a table (or any docstring line) would remain untouched by line wrap, they could just keep lines to 80 columns or less.

As for comparison, the Python docs (in the terminal) are pretty difficult to read below 80 col width, and so I keep it much wider than that if I plan on reading any of its docs in the terminal. |
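One way to read the "hard minimum" proposal above is that the wrap width never drops below 80 columns, however narrow the terminal is, so any line of 80 characters or fewer passes through untouched. A minimal Python sketch of that reading follows; `wrap_with_floor`, `term_width`, and `floor` are hypothetical names chosen for the example, not anything in Janet.

```python
import textwrap


def wrap_with_floor(text: str, term_width: int, floor: int = 80) -> str:
    """Wrap only lines longer than max(term_width, floor).

    Hypothetical helper for illustration: lines at or under the
    effective width (tables, list items, code) are left as written.
    """
    width = max(term_width, floor)
    out = []
    for line in text.split("\n"):
        if len(line) <= width:
            out.append(line)  # short enough: leave untouched
        else:
            # textwrap.wrap returns [] for whitespace-only input,
            # so fall back to an empty line in that case.
            out.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(out)
```

Under this scheme an author who keeps table rows to 80 columns or less can rely on them surviving intact, at the cost of the terminal itself soft-wrapping them when it is narrower than 80.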
What do you think about not doing any wrapping at all in the terminal? If I have a really wide terminal window, and I open a man page, I don't really appreciate it line-wrapping to that huge width. I also just noticed that Python doesn't line-wrap its docs in the terminal. I opened a wide window, ran Could Janet simply not do any line-wrapping, and just spit out the docstring as-is? That way, if it looks good in the code, it will look good in the terminal, as long as the terminal is at least 80 chars wide. |
^^ That is, spit out the docstring after dedenting it. |
I'm sorry for the high volume of comments here. Hope they're useful. I just experimented some more with reading Janet docs in the terminal. And regardless of whether I make the terminal window narrow, or quite wide, the docs display at the same reading width. Whatever line-wrapping So, it seems even more reasonable now to suggest that the Janet doc system be simplified and just de-dent and spit out docstrings as-is. If a docstring needs formatting, it's easy enough to M-q or reflow text or whatever on the docstring next time you're editing that file. The only drawback to this I see is if some third-party Janet lib has docs that need formatting help, then they will display poorly in the terminal until tidied up --- which may actually work out as a net positive (more attention to docs). |
I don't know how to thread comments so it's a bit difficult to reply, but here's a bit. For: #507 (comment) I agree the final result looks a bit better with definition lists, but not enough to be worth using them over ordinary lists. So overall I prefer the first thing you showed in your comment (ordinary list with space between each list item) over the definition list. |
I looked through the PR and the specific "Markdown" constructs I found included:
Did I miss any? (I put "Markdown" in quotes because "definition lists" are not common to all flavors, while the other three were present in the original implementation and possibly exist in most/all flavors.) |
@utvc there is a dynamic binding As for making rendered output look nice with Pandoc, that should not really be a concern - the docstring format needs to be as universally consumable as possible (hence why we are using dumb strings to start with). Markdown is good because it looks nice even if you just treat it as dumb text. |
As for not doing line wrapping, the problem is that most docstrings will contain lots of leading white space. For example:

    (defn my-fn
      ``
      Some documentation here.
      ``
      [x]
      (+ x x))

Will end up EDIT: |
The backtick string dedent sounds very reasonable to me! I can remember |
The longstring-autoindent branch was created to try and deal with some of these underlying issues - it contains a change in the parser that gives the author-desired indentation for long strings (but probably breaks spork/fmt), as well as changes to the I think that has all of the changes that would be needed to get some of this markdown formatting into boot.janet. |
@bakpakin wrote:
Right. A core value of markdown is that it's a little more work to write, but it looks good and natural in plain text. It looks like what it means, and hardly looks like markup. So, that makes it excellent for docs viewable in the terminal.

That said, it looks even better when rendered as html. So if you're reading docs outside of the terminal, it's a nice bonus to have them in html (and easy if the docstring is already in markdown).

Another thing is, if you want to include mathematics in your markdown docstring, LaTeX isn't really readable in plain text, but as html Pandoc will use MathJax to get beautiful real math output. |
@bakpakin wrote:
Right. I expect a docstring will be written indented to match the code around it, so I'm expecting that the doc tools will do that smart "textwrap.dedent" behaviour automatically when displaying the docstring.

Another option might be to have a special string prefix literal that indicates the string should be automatically dedented when read in, e.g. something like:

    (defn ...
      dd``this string
      will automatically
      be dedented.``
      ...)

|
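For reference, the "textwrap.dedent" behaviour mentioned above is Python's standard-library function of that name: it strips only the leading whitespace that all non-blank lines have in common, so relative indentation (for example a nested list item) survives. A small Python sketch, where `dedent_docstring` is a hypothetical wrapper added for the example:

```python
import textwrap


def dedent_docstring(doc: str) -> str:
    """Remove the indentation common to all non-blank lines,
    then drop leading/trailing blank lines.

    Hypothetical wrapper for illustration around textwrap.dedent.
    """
    return textwrap.dedent(doc).strip("\n")


# A docstring written indented to match the surrounding code.
raw = """
    Options:

    * :a - first
        * nested
    """

print(dedent_docstring(raw))
```

The nested bullet keeps its four extra columns relative to its parent, which is the property the thread wants for list markers in docstrings.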
@sogaiu wrote:
Not sure how to answer this in just a few words... Markdown of course has two types of markup: inline (span) and block (div).
And there's two types of block syntax: side-marked and delimited (or "fenced").
And, IMO, another rule of well-written markdown is always indent things a multiple of 4 spaces --- or, more specifically, 4 places (column nums). That is:
(Edit: removed mistaken note about blockquotes and 4-space rule.) Sticking to that 4-space rule is consistent and lets you create lists with multiple paragraphs and nested sublists and not get confused. So, to answer your question, this particular PR contains some inline code, and also short (single line only) code blocks (indented, not fenced). It also contains some lists (unordered and definition). |
Thanks for spelling things out :)

The background for my comment about looking at the specifics of the PR is that "Markdown" is not a specific enough target: there is no spec, just an implementation, and there are many flavors. "CommonMark" is a specific target.

I would guess that if one can choose a good subset of features that works across many flavors, one may get the benefit of more tooling working well. Another aspect is that the fewer features are chosen, the less work there is for creation, maintenance, and testing. Of course one needs a sufficient set of features to accomplish enough :)

Does that make sense? |
Hi @sogaiu . You're welcome! I'm not sure I understand. It sounds like you may be implying that Janet should/would implement some amount of Markdown. But why? Why not instead just dedent and print out docstrings to the terminal as-is? They already look good --- and would just require some manual minor tweaking (ex, to replace backslash-escaped tabs and newlines with literal ones). My 2 cents, I expect CommonMark to become the de-facto standard markdown if it isn't already. It comes standard with C and JS reference implementations, with many more available. Note further that one of the principals behind CommonMark is also the author of Pandoc. And, aside, the extensions that Pandoc makes to markdown are generally conservative and very carefully thought out. The thread to add div syntax went on for 6 years! :) |
If it's not necessary for Janet to have any Markdown-ish thing in it, I'm not bothered by that :) Just reading the discussion, it seemed possible that support for some Markdown(-ish?) constructs in Janet was on the table. Maybe I misunderstood. |
With the merging of #511, I think this is ready to go. That patch contains the changes here so I think we can close this out. |
Thanks. Great to see this! Now I get it: you wanted Janet to know just enough markdown so it can linewrap docstrings that contain some markdown formatting. |