abbreviated syntax for docstrings #8965

Closed
stevengj opened this issue Nov 10, 2014 · 39 comments
Labels
domain:docs This change adds or pertains to documentation parser Language parsing and surface syntax
@stevengj
Member

Rather than @doc docstring -> foo(x) = ..., we have discussed allowing docstring foo(x) = ... in #8791. i.e. treat any literal string (including string macros!) immediately preceding a function declaration (either one-line or function foo...) as shorthand for @doc.
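For readers arriving later, here is a minimal sketch of the shorthand proposed above, written the way bare docstrings work in current Julia. The function names `double` and `triple` are made up for illustration.

```julia
# A string literal directly above a definition is treated as its docstring,
# with no explicit @doc or -> needed.

"Return twice `x`."
double(x) = 2x

"""
    triple(x)

Return three times `x`.
"""
function triple(x)
    return 3x
end

# Look the documentation up again; at the REPL, `?double` does the same.
@doc double
```

Both the one-line form and the `function ... end` form are covered, matching the two cases named in the proposal.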

@stevengj stevengj added feature domain:docs This change adds or pertains to documentation labels Nov 10, 2014
@ntessore ntessore mentioned this issue Nov 10, 2014
6 tasks
@prcastro
Contributor

Is the -> gonna be dropped sometime? I really would like to be able to write:

@doc """
... flowing rivers of technical poetry ...
"""
function foo() ...

@stevengj
Member Author

@brk00, we're hoping to drop the @doc too in most cases: the proposal was that any literal string immediately before a function (separated only by whitespace) will be parsed as documentation.

@tknopp
Contributor

tknopp commented Dec 20, 2014

+1

@prcastro
Contributor

Is this feature expected to arrive for 0.4?

@jakebolewski jakebolewski added the parser Language parsing and surface syntax label Jun 2, 2015
@jakebolewski jakebolewski added this to the 0.4.0 milestone Jun 3, 2015
@StefanKarpinski StefanKarpinski mentioned this issue Jun 5, 2015
13 tasks
@StefanKarpinski
Sponsor Member

I made a comment over here that is very relevant to this issue. The situation can be summarized as:

  1. we want to parse bare docstrings as part of Julia's syntax
  2. we want to render and interpret docstring content as Markdown text
  3. it is unacceptable to make the Julia parser depend on a Markdown parser

Therefore Markdown parsing must happen when docstring data is rendered as Markdown, rather than when docstrings are parsed as part of Julia's syntax.

Until this issue is resolved – i.e. until Markdown parsing is deferred to when the docstring data is used, rather than happening eagerly when the docstring is parsed – docstrings are going to remain "stuck in limbo as a clever macro hack rather than a first class language feature".
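A rough sketch of what "deferred" could look like, assuming the `Markdown` standard library is available. `LazyDoc` and `rendered` are hypothetical names for illustration, not part of any existing API.

```julia
using Markdown

# Hypothetical container: holds the raw docstring text and parses it at most once.
mutable struct LazyDoc
    raw::String
    parsed::Union{Nothing,Markdown.MD}
end

LazyDoc(raw::AbstractString) = LazyDoc(String(raw), nothing)

# Parse on first use, then reuse the cached result.
function rendered(doc::LazyDoc)
    if doc.parsed === nothing
        doc.parsed = Markdown.parse(doc.raw)
    end
    return doc.parsed
end

doc = LazyDoc("Compute `foo` of *x*.")   # no Markdown work happens here
md = rendered(doc)                       # Markdown.parse runs here
```

Storing the docstring costs nothing beyond keeping the string, so the code that records it never needs to know about Markdown.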

@StefanKarpinski
Sponsor Member

@one-more-minute, @MichaelHatherly, can we come up with a plan for this to happen? What do you see as the necessary steps to having docstrings in the language (without a dependency between parsing Julia code and Markdown parsing).

@MichaelHatherly
Member

@StefanKarpinski, Docile's already extracting bare docstrings from Base. In my mind the syntax doesn't really need to be part of the language, since I've managed pretty well so far. (That's not to say having it built in some way wouldn't be better, of course.)

The parsing of docstrings is also completely lazy and left until the need to render the raw text, which can be markdown or any other format that someone writes a package for.

I believe the package is already covering most of your suggestions as is, but could always do with more eyes and testers.

@MikeInnes
Member

Maybe I'm misunderstanding this, because parsing isn't exactly my area of expertise, but it seems like this should be a pretty trivial change. It's just a case of having the parser take

"foo"
bar

and create the AST equivalent of

@doc doc"foo" ->
bar

and it should just work. If someone like @JeffBezanson or @jakebolewski can give me a rough idea of where the code for that should go I'll happily do it myself.

@MichaelHatherly
Member

> and create the AST equivalent of

Is it necessary to do that transformation? Traversing the parsed Exprs and pulling out anything that has the form <<string>>; <<linenode>>; <<definition>> has been working quite well. I've not run into any major problems doing that.
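As an illustration of the traversal described here, a minimal sketch that scans a block for the string / line node / definition pattern. This is not Docile's code; `isdefinition` and `extract_docs` are hypothetical helpers, and the block is built by hand so the example does not depend on how any parser version treats bare strings.

```julia
# True for `function f(...) ... end` and for short-form `f(...) = ...`.
isdefinition(ex) = Meta.isexpr(ex, :function) ||
    (Meta.isexpr(ex, :(=)) && Meta.isexpr(ex.args[1], :call))

# Collect (docstring => definition) pairs from the arguments of a block.
function extract_docs(block::Expr)
    found = Pair{String,Expr}[]
    args = block.args
    for i in 1:length(args) - 2
        str, line, def = args[i], args[i + 1], args[i + 2]
        if str isa AbstractString && line isa LineNumberNode && isdefinition(def)
            push!(found, String(str) => def)
        end
    end
    return found
end

# Hand-built block standing in for parsed source.
block = Expr(:block,
    LineNumberNode(1), "Add one to `x`.",
    LineNumberNode(2), :(inc(x) = x + 1),
    LineNumberNode(3), :(y = 2))

docs = extract_docs(block)
```

The plain assignment `y = 2` is skipped because its left-hand side is not a call, so only `inc` is picked up.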

@ScottPJones
Contributor

@StefanKarpinski When do you see the Markdown parsing occurring?
I don't think the parsing always needs to wait until rendering, especially since it can produce errors.
For the base documentation it might be better to defer parsing until the docstrings are actually used
(for ? in the REPL, or @doc function), at which point they are rendered, while any doc strings in user code are parsed immediately.

@MikeInnes
Member

@MichaelHatherly I think it makes more sense to load doc strings dynamically as code is loaded, as opposed to re-parsing and traversing the code at a later point as Docile does. For one thing, the dynamic approach generalises immediately to user code – if loading docs is a separate step we either have to hack include to carry out that step or make the user do it, both of which are less than ideal.

@MichaelHatherly
Member

> re-parsing and traversing the code at a later point

Yes, that was a concern, but typically I've found that to be relatively fast:

julia> @time Docile.Cache.getraw(Base)
Docile: updating package list...
Docile: caching 73 modules from 'Base'.
 625.987 milliseconds (3892 k allocations: 183 MB, 21.02% gc time)
ObjectIdDict with 2 entries:
  checkstring        => "\nValidates and calculates number of characters in a UTF-8,UTF…
  unsafe_checkstring => "\nValidates and calculates number of characters in a UTF-8,UTF…

There's probably room for improvement in that as well, since I've not put much time into it yet.

> user code

Is this code at the REPL, a script someone might write, or a package? Do we need to be able to document the first two?

> For one thing, the dynamic approach generalises immediately to user code – if loading docs is a separate step we either have to hack include to carry out that step or make the user do it, both of which are less than ideal.

Apart from needing import Lexicon prior to using the ?-mode there's nothing that the user needs to do themselves, and there's also no hacking of include either anymore.

@MikeInnes
Member

> Is this code at the REPL, a script someone might write, or a package? Do we need to be able to document the first two?

Sure, both. E.g. if you evaluate a file/form in Juno, or write include("file.jl") in the REPL, then it should load docstrings as usual. I'm guessing Docile doesn't handle those cases?

Also, how do you assign docstrings to the right method of a function statically? I'm sure there are decent ways of doing it but my main concern is that stuff like that just wouldn't be as reliable.

@ScottPJones
Contributor

@MichaelHatherly The 2 entries it found were my recently merged UTF checking module?
I feel vindicated in trying to get documentation into Base 😀
Could using the ? mode for the first time do an implicit import Lexicon?
How does it assign docstrings currently? That's the sort of thing that I think could be handled in the parser (and could more accurately determine the information about the context of the docstring).
Finally, what would people think about accepting the first void string inside a method as the docstring, like Python? 😀

@MichaelHatherly
Member

@one-more-minute:

> I'm guessing Docile doesn't handle those cases?

Correct, my reasoning has been that the rendered docstrings are for users of packages. When writing a package I've not seen much need for evaluating a docstring and seeing it rendered right then – it's markdown and so quite readable as is, but that could just be me.

> Also, how do you assign docstrings to the right method of a function statically?

By file/line position and Function object. When a source file changes and a module is reloaded then we just re-parse that module. I've not come across funky behaviour yet, but having not tried your interactive approach there could be things that need to be straightened out.

@ScottPJones:

> Could using the ? mode for the first time do an implicit import Lexicon?

Not really. If you add it to your .juliarc then yes, but it will probably add a second or two to startup time. I think we'd need the long-talked-about default packages/standard library for that.

> How does it assign docstrings currently?

Just traversing each file in a module and looking for strings in the right places, then, taking the current module into account, finding the object related to that docstring. Is that what you were wanting to know?

> inside

I think it's more important to have a consistent place for docstrings to go. Since we have some declarations that don't have an "inside", (abstract, bitstype, f(x) = ...), placing them above is the best choice. An @doc macro probably couldn't work from inside I think without some trickery.
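To make the "no inside" point concrete, a small sketch using current Julia spellings, where `abstract` became `abstract type ... end` and `bitstype` became `primitive type ... end`. All names are hypothetical.

```julia
# None of these declarations has a body a docstring could sit in,
# so the docstring goes on the line above.

"Supertype for all shapes in this example."
abstract type Shape end

"An 8-bit flag type."
primitive type Flag8 8 end

"Number of sides of a square."
nsides_square() = 4
```

Placing the string above works uniformly for every kind of declaration, which is the consistency argument being made here.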

@ScottPJones
Contributor

About assigning the docstrings, don't you think it would be better if all that was handled directly when parsing the program? (this is thinking of the future, of course, not right away). I.e. just deciding that the string is a doc string, and saving it, along with the metadata, not doing all the Markdown processing.
OK, I asked about inside, because I already had to deal with one developer (who is coming from Python) who thought that docstrings would be like Python... (given the """ format, etc.).

@MichaelHatherly
Member

> don't you think it would be better if all that was handled directly when parsing the program?

I've got zero experience with the parser, so it might very well be better done there :)

> OK, I asked about inside, because I already had to deal with one developer (who is coming from Python) who thought that docstrings would be like Python... (given the """ format, etc.).

Yes, it is a gotcha for those coming from some languages, but there are other languages that also put their docstrings above; Elixir and Rust come to mind. We can't really please everybody, I think. If you'd like to give it a try adding them to Docile you're most welcome to.

@ScottPJones
Contributor

No, no, I'm not really a Python person myself, just wanted to raise the issue, and you've answered it just fine, wrt declarations that don't have an inside.
The only action item that comes to mind, is to add something to the Noteworthy differences documentation section for Python, about the differences in documentation strings.
(also good to talk about the differences in """ quoted strings, which are always dedented in Julia, and not with exactly the same rules as Python docstrings)
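A quick illustration of the dedenting mentioned here, as a minimal example. The variable name is arbitrary.

```julia
# The newline right after the opening """ is dropped, and the common leading
# indentation (here 4 spaces, set by the least-indented line including the
# closing delimiter's line) is removed from every line.
s = """
    Hello,
      world.
    """

# s == "Hello,\n  world.\n"
```

The extra two spaces in front of `world.` survive because only the shared indentation is stripped.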

@Ismael-VC
Contributor

How are the stdlib docs and REPL help generated for Base Julia? I'm concerned about redundancy: if we start including docstrings in Base, I think we should extract those docstrings to generate the final documentation.

Since we are already starting to use docstrings with markdown syntax (even if it's not parsed currently) in Base:

Does this mean I have the green light to make PRs to add these?

@ScottPJones
Contributor

Well, that one place was me, #11575 (and before it #11004), pushing for internal documentation, so I'm the one to blame for that!
(it was rather controversial... I'd originally tried to use Doxygen tagged syntax [which allows Markdown],
but got a lot of grief over that).

@MichaelHatherly
Member

@ScottPJones you weren't actually the first to use bare docstrings it seems :)

julia> Docile.Cache.getraw(Base.Markdown)
Docile: updating package list...
Docile: caching 73 modules from 'Base'.
ObjectIdDict with 9 entries:
  skipwhitespace(io::IO) at markdown/parse/util.jl:19    => "Skip any leading whitespace. Returns io.\n"
...

@Ismael-VC #11828 might be of interest to you since that moves all the docs over to the new system.

@Ismael-VC
Contributor

By the way the example Base.unsafe_checkstring above uses:

"
# ....
"

Instead of:

"""
# ...
"""

I think """ is more "visible" and thus a little better. I know that we don't have an official Style Guide yet, so how about letting me gather all our preferences and finally write one, so we can have consistent code styling across Julia that can serve as a reference? I have found things like:

some_func{with<:a, lot<:of, lot<:of, lot<:of, lot<:of, params}(with, a, lot, lot, lot, lot, of, args...) =
    # some code
    for # something
        # for code
end    # this is endfor not func!

Instead I would like to change those to:

function some_func{with<:a, lot<:of, lot<:of, lot<:of, lot<:of, params}(with, a, lot, lot, lot, lot, of, args...)
    # some code
    for # something
        # for code
    end
end

And things like that, which seems to be better, I know it's nice to be able to do the former (in the REPL this is helpful), but the latter is better for consistency IMHO. What do you think?

@ScottPJones
Contributor

That's because """ is not available that early in the build process... you are looking at something in Base, not user code. See #11815 for a solution that will hopefully be merged soon.

@ScottPJones
Contributor

@MichaelHatherly I had no choice! 😀
I was actually working on getting the triple-quoted string code and macro to load before my utf* code. I have a PR leading up to that, #11719, whose 2nd commit (now being superseded by @nolta's #11815) removed a lot of the dependencies that the original triple-quoted code had,
but #11815 will take care of that in a better way.

@ScottPJones
Contributor

And I was the first to use them with Markdown syntax in Base! ;-)

@MikeInnes
Member

Lazy.jl was my first package and uses markdown docstrings – I've been planning this stuff for a long time ;)

@ScottPJones
Contributor

I saw a lot of good markdown docstrings in packages, it just surprised me that there was pretty much nothing in Base. I'm a big fan of internal documentation (since most all of my programming is deep in the internals)

@MichaelHatherly
Member

Style-wise: triplequoted for multiline docstrings and single quotes for short one-liners seem the most readable to me. That's what packages have been doing with bare docstrings that I've come across.

If I'm reading #11815 correctly that would allow """ to be used earlier instead of Scott's single quoted docstrings workaround?

@StefanKarpinski
Sponsor Member

I think that we should shoot for a consistent, simple user experience, which seems to be that regardless of how you load code – whether from packages, include, or entered at the REPL – docstrings should work and be available via help immediately. It seems to me that the only way to make that work is for the parser to know about docstrings and stash their content somewhere that is immediately available upon parsing code. At the same time, the constraint that parsing Julia code should not depend on parsing Markdown (or any other document syntax) implies that the docstring data be stashed as plain, unparsed data. In non-interactive mode, the parser could just ignore docstrings. If the user asks for help and has a Markdown package – and it can be a standard module that ships with Julia but is not in Base – then it can be loaded automatically and used to render the docstring data nicely. Since Markdown is pretty readable without Markdown rendering, we could still display any docstrings as-is, even if the Markdown module is absent for some reason. That could very well be the case, for example, on a server where Julia has been installed with a minimal distribution, lacking bells and whistles. With this arrangement, even on such a system, you would get a usable, albeit less fancy REPL help experience.
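The arrangement described above could be sketched roughly as follows: raw text is stashed as plain data, and a renderer is applied only if one is supplied. `RAW_DOCS`, `record_doc!`, and `help_text` are hypothetical names for illustration, not an existing API.

```julia
# Hypothetical store of raw, unparsed docstrings keyed by name.
const RAW_DOCS = Dict{Symbol,String}()

record_doc!(name::Symbol, text::AbstractString) = (RAW_DOCS[name] = String(text))

# Return help text: pass the raw text through `render` when one is supplied,
# otherwise fall back to the raw text, which is still readable Markdown.
function help_text(name::Symbol; render = nothing)
    raw = get(RAW_DOCS, name, "No documentation found.")
    return render === nothing ? raw : render(raw)
end

record_doc!(:foo, "Compute **foo** of `x`.")

plain = help_text(:foo)    # minimal install: raw text shown as-is

using Markdown             # loaded only when a renderer is wanted
fancy = help_text(:foo; render = s -> Markdown.html(Markdown.parse(s)))
```

Recording a docstring touches only the dictionary, so nothing on the storage side depends on a Markdown parser being present.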

@StefanKarpinski
Sponsor Member

Both Base and pre-compiled modules will need some way of providing help info as well. I'm not sure if we want to make this some kind of data segment in a shared object file (i.e. .so, .dylib or .dll) or a separate data file. I'm kind of inclined to stick it in the shared object with the option to produce "stripped down" shared object files that don't include this data.

@ScottPJones
Contributor

Yep, precisely what I'd been trying to say, but said much more precisely and more completely. 👍 (to the 1st comment, also agree to the second comment)

@StefanKarpinski
Sponsor Member

If people agree with this view of how things should work then we just need to figure out how to get there.

@MichaelHatherly
Member

> If people agree with this view of how things should work then we just need to figure out how to get there.

+1 to finally concluding the docstring saga. Anything more fancy can live in packages.

@ScottPJones
Contributor

Long live docstrings! 👍

@jakebolewski
Member

Wouldn't then all docstrings have to be static? Lifting the parsing of docstrings into the parser is fine, but how do we add documentation for methods generated at a later stage (during macroexpansion for instance)?

for meth in (:foo, :bar, :baz)
    @eval begin
        doc"""
        $($meth) is a method
        """
        $(meth)() = "hello $($meth)"
    end
end

@StefanKarpinski
Sponsor Member

I guess the association could happen at eval time rather than parse time.
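One way eval-time association can look in current Julia, as a sketch rather than a statement of how the eventual implementation works: the docstring is built as ordinary data and attached when the generated expression is evaluated.

```julia
for name in (:foo, :bar, :baz)
    # Build the docstring text in ordinary code, before anything is evaluated.
    docstring = "`$name()` is a generated method."
    @eval begin
        $(name)() = $("hello $name")
        # Attach the docstring once the definition exists.
        @doc $docstring $name
    end
end
```

Because the string is interpolated into the expression before evaluation, each generated function gets its own text even though the source contains only one docstring.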

@ScottPJones
Contributor

Good point, definitely would want to be able to handle that.

@prcastro
Contributor

Closed by #11836

@IainNZ IainNZ closed this as completed Jun 26, 2015
@stevengj
Member Author

Hooray!
