There are a number of issues discussing documentation for Julia code (#762, #1619, #3407), but I'd like to separate this problem into two very distinct issues:

1. Associating text from source files – both comments and source code – with functions, methods, modules, and global bindings.
2. Interpreting and presenting this data to the world.

We keep getting bogged down in the combination of these two issues, but they can be tackled separately, and should, imo, remain decoupled – that is, the infrastructure for (1) should be reusable with different approaches to interpreting comments and different mechanisms for presenting documentation (help, sphinx, dexy, jocco, etc.).

This issue is for discussion of (1):

• What do we want to be able to associate with run-time objects like functions, methods, modules, and global bindings? It would be nice to have easy, queryable access to source code for things as well as inline comments associated with that source code.
• How do we associate that data with run-time objects? While it may be reasonable to have this kind of overhead in interactive situations, we also must be able to run programs non-interactively without paying that price.

Let's solve this first and then figure out how to interpret and present things.

The Julia Language member

I'm pretty excited about this; I think it'll make documentation easier as a first pass, but being able to attach data to functions in general can be used for quite a few neat things. The first time I wanted something like this was when I was developing the codespeed infrastructure; I wanted to annotate functions with metadata stating the name of the test that function ran, what units the resultant metric of that test would be (time, FLOPS, bytes/clock cycle, etc.), whether "less is better" for that particular unit, and so on. So I think whatever we come up with has the opportunity to be somewhat more than the only analogue I can think of right now (Python's docstrings), which is just a single string of data. We have the chance to make the data we attach highly structured, in the sense that it can be manipulated by other Julia code.

regards to all .. imho ..

@StefanKarpinski writes this right

What we want to be able to associate with run-time objects like functions, methods, modules, and global bindings? It would be nice to have easy, queryable access to source code for things as well as inline comments associated with that source code.
How to associate that data with run-time objects? While it may be reasonable to have this kind of overhead in interactive situations, we also must be able to run programs non-interactively without paying that price.

In any creatively powerful software paradigm, and so certainly with Julia, there is available a dynamism that at once allows a design to run well, go fast reliably, and harvest the deep accurately and at another affords development, investigation and playfulness the robust power and makes perspective, conception, and insight readily accessible as a newly realized design that runs well, goes fast and is reliably accurate.

As Stefan notes, it is entirely reasonable and sound that Julia offer the language user each modality's respective advantage; that is more compelling than a requirement that they operate in mutual simultaneity.

The Julia Language member

+1

To comment on (2), "Interpreting and presenting this data to the world", and more specifically in relation to the IPython notebook/qtconsole/console that can now be used with IJulia: I want to point out that we had a discussion in IPython about enabling "rich" docstrings. Since you integrated multimedia I/O (#3932) into IJulia core, maybe help() could return different MIME types for different frontends.

You are probably much more flexible in what you can do than us in IPython, and we will happily see what you come up with.

Be careful though: with rich MIME type representations of documentation, docs may become a security issue (injecting JavaScript into the notebook that can execute code in the kernel). It can also be an advantage, as you could have executable or dynamic docs, like runnable sample code. One thing we were not totally able to solve is how to have working cross-links in the live documentation in the notebook.

The Julia Language member

This issue has been neglected for too long. Let me make a concrete proposal to get the ball rolling. A basic starting point could be:

• Julia should include a global dictionary-like object DOC::DocDict <: Associative{Any,Any}.
• The keys are any Julia object (typically of type Function or Method, although we'll also want to document other Julia objects.)
• The values are any Julia type, and we will use the writemime machinery to convert this to various formats, e.g. reprmime("text/plain", DOC[x]) to get the text/plain documentation of x.
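A minimal sketch of this starting point, written against current Julia rather than the API of the time. It assumes `IdDict` as a stand-in for the proposed `DocDict`, and uses `show(io, mime, x)` and `repr(mime, x)`, which are the present-day replacements for `writemime` and `reprmime`. `PlainDoc` is a hypothetical value type invented for illustration only.

```julia
# Hypothetical sketch: a global documentation table keyed by object identity.
const DOC = IdDict{Any,Any}()

# Illustrative value type (an assumption, not part of the proposal):
# documentation stored as plain text.
struct PlainDoc
    text::String
end

# text/plain rendering: print the stored text as-is.
Base.show(io::IO, ::MIME"text/plain", d::PlainDoc) = print(io, d.text)

# Any Julia object can be a key: here a function and a constant.
DOC[sqrt] = PlainDoc("sqrt(x): return the square root of x.")
DOC[pi]   = PlainDoc("pi: the ratio of a circle's circumference to its diameter.")

# Ask for the text/plain representation of the documentation of sqrt.
plain = repr(MIME("text/plain"), DOC[sqrt])
```

A bare `String` would also work as a value, but `repr` with text/plain shows strings in quoted form, which is one reason a small wrapper type is used here.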

On top of this machinery, various pieces could be added:

• DOC[f::Function] would look up the general documentation for f, analogous to our help now (we would still have a help function, it will just use DOC). DOC[m::Method] would look up the documentation for a specific method signature. To get all of the documentation for a function f, you would call [DOC[f], [DOC[m] for m in methods(f)]].

• Some kind of macro could be defined to make it easier to add documentation for functions of a given signature. e.g.

@doc foo: f =   # equivalent to DOC[f] = foo, i.e. documentation for f independent of any method signature
@doc bar: function f(....)  # equivalent to DOC[method signature for f(....)] = bar
....
end
• Note that importing a module would execute all of its embedded DOC[foo] = bar statements, appending to the documentation.

• We could easily implement a "noninteractive" mode in which @doc and DOC[foo] = bar do nothing, to eliminate any overhead of storing/updating DOC in production code.
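One way the macro and the "noninteractive" switch could fit together, as a rough sketch in current Julia. The names `@attach`, `record!`, `defname`, and `DOC_ENABLED` are all hypothetical (`@attach` is used instead of `@doc` because current Julia already ships a macro called `@doc`), the documentation is passed as a separate macro argument rather than with the colon syntax above, and only the per-function case (not per-method signatures) is handled.

```julia
const DOC = IdDict{Any,Any}()

# Runtime switch standing in for the proposed "noninteractive" mode.
const DOC_ENABLED = Ref(true)

# Store `doc` for `obj` when recording is enabled; always return `obj`.
function record!(obj, doc)
    DOC_ENABLED[] && (DOC[obj] = doc)
    return obj
end

# Pull the function name out of `f(args...) = body` or `function f(args...) ... end`.
function defname(def::Expr)
    def.head in (:function, :(=)) || error("expected a function definition")
    sig = def.args[1]
    # Strip `where` clauses and return-type annotations around the call signature.
    while sig isa Expr && sig.head in (:where, :(::))
        sig = sig.args[1]
    end
    sig isa Expr && sig.head == :call || error("expected a call signature")
    return sig.args[1]
end

# Evaluate the definition in the caller's scope, then record its documentation.
macro attach(doc, def)
    name = defname(def)
    return quote
        $(esc(def))
        record!($(esc(name)), $(esc(doc)))
    end
end

@attach "Add one to x." addone(x) = x + 1

# With recording switched off, the function is still defined but nothing is stored.
DOC_ENABLED[] = false
@attach "Subtract one from x." subone(x) = x - 1
```

A runtime flag still evaluates the documentation expression; skipping it entirely would mean checking the mode at macro-expansion time instead.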

The simplest documentation would be in the form of strings, for which only the text/plain representation is available. However, we could define types to encapsulate higher-level information and formatted text. For example:

• DocDefinition(doc::Any, file::String, line::Integer, source::String, ....timestamp?....other?....) to store a documentation value doc along with metadata for a definition in a source file. The @doc macro could automatically use this wrapper type. One could define writemime(m::MIME, d::DocDefinition) = writemime(m, d.doc) to make this wrapper transparent.

• Various formatted-text or other container types. e.g. Markdown(s::String) which interprets its argument as markdown with embedded LaTeX equations, and defines writemime(::MIME"text/x-markdown", x::Markdown) along with other output formats. So one would do e.g.

@doc Markdown("""
.....
"""): foo(...) = ....

or there could be a @docmd shortcut for this.
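The transparent-wrapper idea can be sketched in current Julia as follows, with `show` standing in for `writemime`. `MarkdownText` is a hypothetical container type, `DocDefinition` carries only a subset of the fields listed above, and the file name and line number are made-up sample values.

```julia
# Hypothetical formatted-text container holding Markdown source.
struct MarkdownText
    source::String
end
Base.show(io::IO, ::MIME"text/markdown", d::MarkdownText) = print(io, d.source)
Base.show(io::IO, ::MIME"text/plain", d::MarkdownText) = print(io, d.source)

# Wrapper pairing a documentation value with where it was defined.
struct DocDefinition
    doc::Any
    file::String
    line::Int
end

# Forward each supported format to the wrapped value.
# (A single method on `m::MIME` would be ambiguous with Base's generic
# text/plain fallback, so the formats are listed explicitly.)
Base.show(io::IO, m::MIME"text/plain", d::DocDefinition) = show(io, m, d.doc)
Base.show(io::IO, m::MIME"text/markdown", d::DocDefinition) = show(io, m, d.doc)

d = DocDefinition(MarkdownText("Compute *foo* of `x`."), "foo.jl", 12)

md_text = repr(MIME("text/markdown"), d)           # rendered via the wrapped value
has_md  = showable(MIME("text/markdown"), d)       # wrapper advertises Markdown
has_png = showable(MIME("image/png"), d)           # no PNG rendering was defined
```

`showable` plays the role that `mimewritable` had at the time of this discussion.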

The Julia Language member

I like the simplicity of this approach.

@velicanu might be interested.

The Julia Language member

This is a really great idea.

This is interesting, I'll try to do it.

The Julia Language member

We also need some way of associating documentation with manual sections in a hierarchy (e.g. "Mathematical functions / Special functions / Bessel functions"). And in general we want a way to associate metadata with objects. One option, in line with the above proposal, would be to:

• Define our own "MIME types" for any desired metadata. e.g. metadata/author for author string, or metadata/section for an @ delimited string of section names in descending order of specificity, e.g. "Bessel functions@Special functions@Mathematical functions".

• Any DOC[x] value type that wants to provide any metadata could define the appropriate writemime function.

• The @doc macro could accept metadata as keyword-like arguments:

@doc section="Documentation@Awesomeness" author="Alyssa P. Hacker" """ ..... docs .... """: somefunction(...) = ...

and would store them in a "metadata" Dict inside DocDefinition. mimewritable for DocDefinition would then return true for metadata MIME types corresponding to keys in the metadata Dict.
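A small sketch of the metadata side of this, assuming current Julia. `document` and `section_path` are hypothetical helper names, and this `DocDefinition` keeps only the fields needed here. It stores keyword-style metadata in a Dict and turns the @-delimited section string into a path from the broadest section down to the most specific one; it does not implement the faux MIME types themselves.

```julia
# Documentation value plus free-form metadata (only the fields needed here).
struct DocDefinition
    doc::Any
    metadata::Dict{Symbol,String}
end

# Collect keyword-like arguments into the metadata dictionary.
function document(doc; meta...)
    return DocDefinition(doc, Dict{Symbol,String}(k => String(v) for (k, v) in meta))
end

# "Specific@...@General" -> ["General", ..., "Specific"]
section_path(s::AbstractString) = String.(reverse(split(s, '@')))

d = document("..... docs .....";
             section = "Documentation@Awesomeness",
             author  = "Alyssa P. Hacker")

path = section_path(d.metadata[:section])
```

Checking `haskey(d.metadata, :author)` is then the Dict-level equivalent of asking whether a given metadata type is available.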

The Julia Language member

Some thought should go into the @doc macro syntax to make the resulting code as human-readable as possible. One annoyance with using a macro for this is that you can't simply insert linebreaks wherever you want without breaking the parsing. But if this seems to be a problem I suppose that we could add a new keyword/syntax to Julia that parses as @doc or some kind of document(expr, ...) function call.

The Julia Language member

@loladiro, is there any missing functionality in the above proposal compared to what is needed to implement the REPL help?

Note that importing a module would execute all of its embedded DOC[foo] = bar statements, appending to the documentation.

Have you considered doing so only at install time for libraries? One would probably like to build all of a library's HTML docs at once when it is installed, because cross-links and the like may require building the docs for the whole library in one go. Also, in the notebook, we can probably have a link in the pager that opens file://path/to/julia/doc/module/function.html, which is browsable (runnable?).

The Julia Language member

@Carreau, on top of this one can build various tools, e.g. a tool to import a module and build documentation in some format. As @StefanKarpinski said at the top of this thread, however, that is conceptually separate from the task of associating the data with the objects in the first place.

@stevengj Sorry I wasn't clear. I was not worried about the external tool to build the docs; I was wondering about associating them back to the objects externally, like an external way to add a value to DOC::DocDict <: Associative{Any,Any}. But I guess you are right, this can be a layer on top of DOC.

The Julia Language member

I'm not sure what you mean by an "external way to add a value to DOC" ... any Julia program will be able to mutate the DOC contents.

I might have misunderstood something, and will re-read, but "global dictionary-like object" made me think of a per-session object that dies with the interpreter, which can make sense in an interactive environment. That impression was reinforced by:

Note that importing a module would execute all of its embedded DOC[foo] = bar statements, appending to the documentation.

I was thinking more of a persistent database of that info (for example, built at package installation time).
Then at some point I could, for example, run a local HTML doc build of JuMP that "registers" with this database, so that when I do help(some-function-of-jump) it knows how to access it.

The global dict DOC::DocDict proposed by @stevengj doesn't seem quite right to me. Shouldn't globals be avoided if possible? Why not put that info directly into the modules, methods and functions themselves? For instance, add a field data to the Function type and similarly to the Method type. Let data be a dict or contain a field data.doc. That way help(fn) could get to it and the hierarchy information is easily available too. Other data could be put into that dict as well, e.g. the source code or the annotations @staticfloat mentioned above.

What's missing is the possibility to associate data with globals. Either make all globals containers with a data field too, or resort to a global dict for those.

The Julia Language member

What's wrong with a global in this context?

• It's much easier to look up (and update) information in a single dictionary than in many. And a simpler implementation is easier to write, debug, and maintain.
• One module can extend a method defined e.g. in Base, so it's not obvious that segregating the documentation for that method is desirable.
• The hierarchy information is still easily available, because given any Method signature m, m.func.code.module gives the corresponding module (and given a module one can find the parent with module_parent.) It would be easy to add module information to the DOC dict for constants too if desired.
• You want to be able to document things other than Function or Method, e.g. constants, types, and perhaps macros. So adding fields to Function and Method is not sufficient, as you point out. And if you have a "global dict" for constants, it only adds complexity to have a completely separate data structure for function and method documentation.
• Adding fields and dictionaries to common types like Function adds runtime overhead. In something like Python this doesn't matter, but in Julia it is a big deal. You certainly don't want to slow down running code, and there should be a way to avoid storing the documentation entirely in production code.
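For readers trying the hierarchy lookup today: `m.func.code.module` and `module_parent` belong to the Julia of this discussion. A rough present-day equivalent, assuming current Julia and a throwaway function `square` defined only for illustration, is:

```julia
square(x) = x^2                      # illustrative function

m = first(methods(square))           # a Method object for square(x)
defining_module = m.module           # module in which this method was defined
owner = parentmodule(square)         # module that owns the generic function
outer = parentmodule(owner)          # enclosing module (a top-level module is its own parent)
```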

What is the concrete disadvantage of a global dictionary that overcomes its advantages in simplicity and functionality? Blanket prejudice against globals is not persuasive.

To me segregating the function/method metadata from the function is odd. There is plenty of (meta-)data already associated with methods/functions/modules (e.g. signature, module...), why treat the additional metadata differently? (See point 3 for the most important argument)

Say for instance, I have a method which is 'private' to my module, i.e. I don't export it. But I may still want to document it (for my own purpose) or I want to add other metadata like @staticfloat mentioned. Why should this metadata, which is private to a module, live in a global variable?

1. There is not much difference in looking up/modifying things using fn.data.doc or DOC[fn]. Also usually one would use help(fn) or some other function which would work with either. Also, as I mentioned above, there is plenty data already associated with functions/modules/... so it must be possible to maintain such machinery.

2. When extending a function with another method, that method is still contained in the generic function, so there is no segregation. Also, documentation writing will have to take into account multiple dispatch. I imagine that there should be some generic doc for the function, like + adds numbers; and specialized doc refining on that, like +(a::Integer, b::Rational) adds a + b and returns a Rational (a bit of a stupid example).

3. I think it would be awkward to get the namespacing right with a dict. Examples: DOC[:sin] should work and DOC[:(Base.sin)] should work too. Do you define it twice, or is only one valid? What if I do sinalias = sin; DOC[:sinalias]? What if two modules have a function of the same name? How is DOC updated after a using imports some names into the top level? All these namespace issues would come for free if the data was tacked onto the functions/methods/... This seems to me the most important argument against the dict.

4. I think for type-metadata it would also be fine to add another field to the DataType datatype. To annotate instances of a type, say pi a convention could be to define a field like _doc and put the documentation there. That leaves us with macros, not sure about those. Are they a type themselves? What are they?

5. I can't comment too much on performance. But I think in either approach it should be possible to tell the parser to fill the dict/field, if in the REPL, or not, if not interactive. Also, one could make the metadata immutable, that should help.

Well, either way, it will be good to have a way to associate metadata with functions etc., especially for docs.

The Julia Language member

Number 3 is actually not a problem, since you would use the function object itself etc. as the key: DOC[sin]. This works just the same way with namespacing as storing metadata inside the objects. Either approach will have trouble with macros however, since there doesn't seem to be any actual macro object to use as a key, or store metadata in.
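A quick illustration of keying on the object rather than on a name, assuming an identity-keyed `IdDict` as the table (the proposal itself does not fix the container type):

```julia
const DOC = IdDict{Any,Any}()

DOC[sin] = "sin(x): sine of x, with x in radians."

# An alias and the module-qualified name both refer to the same function object,
# so they reach the same entry without any extra bookkeeping.
sinalias = sin
via_alias     = DOC[sinalias]
via_qualified = DOC[Base.sin]
```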

The Julia Language member

Argh, GitHub thought that my 3. was a 1. I was talking about number 3, anyway.

Yes, you're right. Out goes what I thought was the best argument. Still, why store the data in several places if it could be in one?

The Julia Language member

Regarding point 1, the problem is that we aren't just dealing with functions. You want a metadata lookup procedure that works equally well for all types, and is uniform for all types.

Regarding point 2, the separation of generic and method-specific documentation was already provided for in my proposal: the former is provided by DOC[f::Function] and the latter by DOC[m::Method].
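The generic versus method-specific split could look roughly like this in current Julia. `area` and `alldocs` are hypothetical names made up for the example, and `which` is used here to obtain the `Method` object for a signature.

```julia
const DOC = IdDict{Any,Any}()

area(r::Real) = pi * r^2
area(w::Real, h::Real) = w * h

# Generic documentation, independent of any method signature.
DOC[area] = "area: compute the area of a shape."

# Method-specific documentation, keyed by the Method objects themselves.
DOC[which(area, Tuple{Real})]      = "area(r): circle of radius r."
DOC[which(area, Tuple{Real,Real})] = "area(w, h): rectangle of width w and height h."

# Gather the generic entry followed by every documented method.
function alldocs(f)
    docs = Any[]
    haskey(DOC, f) && push!(docs, DOC[f])
    for m in methods(f)
        haskey(DOC, m) && push!(docs, DOC[m])
    end
    return docs
end

docs = alldocs(area)
```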

The Julia Language member

Regarding the "why store the data in several places if it could be in one" argument, that is a matter of perspective. I think of a single DOC variable as "just one" place, whereas deciding type-by-type where to stick some field to store metadata seems like several places to me. (And if we forget to add a metadata field to some type, then no metadata is possible for instances of that type.)

Functionally both approaches should work.

What I don't see is why metadata should get treated differently from other data/metadata of function/types/modules. For instance, there is no global dict MOD which stores for each function/type/instance/... the module where it was defined. Now you argue that metadata is different because it should be treated the same for all function/types/modules. But metadata may well not be uniform for all types, e.g.: storing return types would only make sense for functions, having a hierarchy of docs makes sense for functions/methods but probably not for modules and instances, storing the source code may not make sense for instances, modules may want to store only that part of the source which is not in the functions & types to avoid duplication, etc.

Of course, this non-uniformity could be implemented in the global dict as well. But I think because the metadata belongs to the function/types/modules/... and can be specific, that is where it should be tacked on.
(I have no skills to implement this feature, so I cannot comment on specifics of the implementation. I have no frame of reference, really.)

The Julia Language member

I was hoping that Julia would adopt support for metadata similar to Clojure, where it can be defined for namespaced symbols and collections. Many powerful tools have been built on top of this, including a full-blown type checking system. Clojure's documentation system is also built upon metadata. However, this does incur overhead (in memory and in filling out the fields at runtime), so I can see that argument as well.

But why are we arguing about implementation details? Shouldn't we agree on an interface to get at specific metadata (doc(f::Function), doc(f::Module), etc.)? How it's implemented can always change later.

I like the @docmd idea.

I am personally interested in manual-style documentation, akin to Perl's =pod feature. I like @stevengj's suggestion of keyword-like arguments. My current approach is to write my Julia files like this:

#=doc
# Product Manual
# --------------
# Here I insert Markdown documentation ...
#=end
... my code ...

With @docmd and the metadata idea, I could write:

@docmd
section="Chapter 7"
author="Alyssa P. Hacker"
"""
How to Analyze Simulation Results
---------------------------------
Blah blah blah
""": somefunction(...) = ...

... my code ...

Here is an idea that I personally would love:

function suggestion()
"""@md
Suggestion
----------
Extend Python's doc-string syntax: Add an @ followed by a file extension
after the first triple quote. At the simplest level, anyone can write a
script that pulls out the triple-quoted strings and saves them in a file
with the correct extension. So I can document my code with @tex, @md, @rst,
@htm, @xml (e.g. docbook) or whatever. For example:

prompt$ juliadoc example.jl   # Saves example.tex
prompt$ pdflatex example.tex  # Saves example.pdf

Going a step further, you can talk about adding @md documentation to the
help() or extracting @html documentation for IJulia, or what-not.

- The idea seems simple to me.

- It lets users adopt any (text) format that they deem useful.

- That includes formats that we might not think of today.

- Anyone can write a useful script for their documentation pipeline.
"""
...
end

Opinions?

The Julia Language member

@dcarrera, Julia already has string macros, so the Julian thing would be md"""......""" rather than """@md ......""".

This should create a String subtype that has a writemime(io, "text/markdown", s) method for extracting markdown text, and also has writemime(io, "text/plain", s) for a plain-text representation, and perhaps writemime(io, "text/html", s) for HTML conversion. In the future we can define additional string types as needed.
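As a sketch of what such a string macro might look like in current Julia. The name `mdoc"..."` and the type `MarkdownDoc` are hypothetical (current Julia already has an `md"..."` macro in its Markdown standard library), a plain struct is used instead of a String subtype to keep the example short, `show` stands in for `writemime`, and the plain-text rendering is a deliberately crude assumption that only strips a couple of markers.

```julia
# Hypothetical container for Markdown source text.
struct MarkdownDoc
    source::String
end

# Defines the string macro mdoc"...", which wraps the literal in a MarkdownDoc.
macro mdoc_str(s)
    return :(MarkdownDoc($s))
end

# text/markdown: emit the source unchanged.
Base.show(io::IO, ::MIME"text/markdown", d::MarkdownDoc) = print(io, d.source)

# text/plain: crude fallback that drops emphasis and code markers.
Base.show(io::IO, ::MIME"text/plain", d::MarkdownDoc) =
    print(io, replace(d.source, r"[*`]" => ""))

d = mdoc"Return the *largest* element of `xs`."

as_markdown = repr(MIME("text/markdown"), d)
as_plain    = repr(MIME("text/plain"), d)
```

Additional formats can be added later by defining further `show` methods, including in user code.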

The Julia Language member

I like calling these "macro strings" or "string macros" rather than "non-standard string literals", which was the best name I could come up with when I was originally writing the manual.

Yeah. I remember seeing L" ... " in PyPlot for LaTeX. I don't know a lot about those, so I'm going to ask some possibly naive questions:

1) One would have to pre-define macros for all the formats that people are likely to use (tex, html, xml, md, rst) (or the user could write their own) right?

2) How would I use writemime(io, "text/markdown", s)? Would I read an entire Julia program as text and feed it into writemime, or am I still responsible for extracting all the md" ... " strings from the source?

3) In my example, you would have to remove the 4 spaces of indentation. Otherwise that could mess up white-space-significant formats like Markdown. The number of spaces depends on the indentation of the initial triple quote. Can string macros do this? Perhaps I'm trying to solve an unsolvable problem because people could use tabs for indentation and you can't know the tab-to-space ratio of their text editor.

All in all, the idea of string macros sounds good. I suppose one could start by making a JuliaDoc module to experiment with the features before incorporating into Julia.

The Julia Language member

1) I'm proposing that all of the documentation tools operate generically on Julia objects and use writemime to convert to other formats for output. So other classes could be added later as desired, including in user code; you certainly would not have to predefine all possible formats in advance.

2) When Julia code is loaded, the associated documentation objects would be stored in a data structure of some sort (e.g. a DOC dictionary in my proposal), and other tools (e.g. online help(), documentation generators, etc.) could process this data structure as needed. You would not need to do any parsing of source code yourself. (Note that this issue is only about documenting Julia objects (functions, constants, etc.), although of course similar types could be used for other sorts of documentation.)

3) Triple-quoted strings automatically dedent for you (though this is not yet documented; see #5135).

1) Thanks.

2) Ok. I suppose that for that you can either have a Python-style rule where a triple-quoted string following a function declaration is assumed to be documentation, or you could use the @doc macro you proposed, maybe like this:

function foo()
@doc md"""
Markdown documentation here...
"""

I just created a new issue ( #5200 ) that is about manual/tutorial type documentation. So I don't pollute this issue with off-topic ideas.

3) I just tried the dedentation feature. It works well. If you try hard enough you can "break" the Markdown, but it took an intentional effort on my part. I suspect it will work well in practice.

The Julia Language member

If the @doc macro goes before function (as I proposed) rather than after, then it can be implemented purely in Julia. Anything inside the function declaration would require changes to the parser. Also, it wouldn't go well with one-line f(x) = bar functions.

The Julia Language member

Another thing that will cause some trouble are all the functions created by metaprogramming, e.g., https://github.com/JuliaLang/julia/blob/master/base/array.jl#L931-L994. Adding documentation strings to these functions in a sensible manner will require some thought.

The Julia Language member

@dcarrera @stevengj I added quotes to the @docs in your comments. A public service reminder to always quote your macro invocations in GitHub issues. Many of our macros are the same as GitHub usernames.

@stevengj: Ok. I didn't pick up on that. I actually like it better that way -- documentation before the function. I guess that the function would become a parameter for the macro, or something like that... Does that mean that if we want to also have POD-style documentation we'll need to create a different macro besides @doc?

I have a question about macros. Going back to @stevengj's example:

@doc Markdown("""
.....
"""): foo(...) = ....

Is it possible to make the colon and the following function optional? So that @doc could be used both for documenting functions and for writing manual-style documentation, and the way you know whether a doc string refers to a specific function or object is simply that it ends in a colon. For example:

@doc md"""
Product Manual
--------------
Blah blah blah"""

@doc md"""
This is how function foo() works...""":
function foo()
...
end
The Julia Language member

@dcarrera, yes, macros can do different things depending on the number of arguments, so I think that would be a reasonable re-use of @doc for #5200. (And I'm not sure we want the colon anyway.)

I would like to present a different idea from what we've discussed so far:

1) Implement a useful subset of Asciidoc in Julia (easy for a subset).
2) Interpret @doc strings as Asciidoc by default.

Let me give you an example of what I mean:

@doc """
:Author:    Daniel Carrera
:Email:     <dcarrera@gmail.com>
:Date:      2 January 2014
:Revision:  3.2.3

Blah blah blah ... Asciidoc supports metadata.
""" function foo(x)
...
end

I have been thinking about this issue for the last several days. I think that some of the proposed features for the @doc macro (author, section, etc.) feel a bit like reinventing the wheel. At the same time, I have become impressed by Asciidoc: it seems as easy as or easier than Markdown for the things Markdown can do, yet it seems more complete than ReST.

I would not try to implement 100% of Asciidoc in Julia. I simply do not see the need. People can use external tools if they want to write a book in Asciidoc. What I think would make sense is to pick a subset of Asciidoc that matches what we would like Julia's help system to have available.

An additional idea is to use Asciidoc labels or headings to make the keys of the DOC[] object. For example:

@doc """
foo(x)::  This is how function foo works by default.
This is another line.

foo(x::Integer)::  Blah blah blah.
"""

This would allow you to separate documentation from the function declaration. Whether doing so is a good idea may depend on the context, but sometimes it might be. For example, Julia's current help system does exactly this, but using ReST.

Not reinventing the wheel sounds like a good idea, but I'd rather adopt the metadata schema of Doxygen or gtk-doc, which are precisely oriented towards this goal.

Why would we want a default format anyway? I think the better option is to define a standard interface for how the @doc foo"""my foo doc doc""" function should work, and let it be up to the community to develop different formatting solutions, and let the solutions with the best tools win (and be included in Base/standard distribution). A user will probably be able to read and update documentation written in any reasonable format, so the diversity will not be a problem.

Core Julia will then be responsible for the @doc macro, the global DOC dictionary, some guidelines for the object in the dictionary and a simple plain string implementation. It might be reasonable to require it to respond to writemime with MIME"text/html", MIME"text/plain" and so on. If we want author/date/revision to be accessible we might have a Base.Doc module where you can provide implementations for Doc.author(), Doc.date() and Doc.revision().

Just like there's a style guide, I think it would be better to recommend a documentation system to make collaboration easier. This would also allow the package system to check that the documentation is up-to-date, e.g. that a summary of what the function does is provided, and that all arguments and the return value are documented.

R provides such a system, and when the number of contributed packages gets large, it's very nice to have a way to enforce some degree of consistency and quality of the documentation -- or at least to provide a tool helping maintainers to check that their documentation is up to some standard.

The Julia Language member

As @StefanKarpinski said at the top, the first thing is to decide how to associate data with Julia objects, in a way that allows many different kinds of data to be attached. Deciding on a standard format for documentation data is a somewhat separate issue (not completely independent, but it's important not to get too bogged down in the latter problem before we solve the former).

I have mixed feelings on diversity. I think that a default documentation system has a lot of value. My impression is that Perl, Python and Java have all benefited from their respective standards for documentation. I think @nalimilan raised some good points that I hadn't thought of.

I like Doxygen for the topic of this issue ("help" style documentation). I was hoping to use something that would also be useful for manual-style documentation, without having to use a different format for manuals.

@stevengj: For associating data with Julia objects, what's wrong with the global DOC dictionary you proposed? That seems like a natural solution. I'm probably missing something, but it seems to me that most of the difficulty is in the API (including data format), like what should the @doc macro do? What should be the input to @doc and what should @doc do with that input? No?

The Julia Language member

@dcarrera, I don't think anything is wrong with my proposal. :-) But others have to agree and someone has to implement it.

Ok. Here is my attempt at a slightly more concrete proposal:

## Part I -- Definition of DOC

DOC is a global dictionary object, where the keys are any object one wishes to document, and the value is any object that implements writemime with at least the following MIME types:

writemime("meta/summary", DOC[f] )
writemime("meta/author", DOC[f] )
writemime("meta/date", DOC[f] )
writemime("text/plain", DOC[f] )
writemime("text/html", DOC[f] )

In addition, DOC[f].meta must be an array listing all the metadata available for the object.

## Part II - @doc macro

Anyone can write a macro for documentation, as long as it fills the DOC object correctly, as indicated in Part I. Julia can come with a default @doc macro. Personally, I might be warming to the idea of something based on Doxygen, but I need to think more. This provides a type of default, while allowing the freedom for people to document things differently without losing features provided by DOC.

As an example, an @doc macro inspired by Doxygen could look like this:

@doc """
One sentence summary of what the function does.

A longer description of what the function does.
This part can span multiple lines.

* Bullet.
* List.
* Etc.

@author Daniel Carrera
@param  ...
@param  ...
@return ...
""" function foo()
...
end

Same example again, now using AsciiDoc:

@doc """
:author: Daniel Carrera
:summary: One sentence summary of what the function does.
:param:  ...
:param:  ...
:return: ...

The rest of the docstring is a more detailed description
of the function. Everything in the docstring is processed
by some http://www.asciidoc.org[AsciiDoc] parser.

* Include.
* Bullet.
* Lists.

|=======================
|Col 1|Col 2      |Col 3
|1    |Item 1     |a
|2    |Item 2     |b
|3    |Item 3     |c
|6    |Three items|d
|=======================

""" function foo()
...
end
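To make the Doxygen-style variant concrete, here is a minimal sketch of how a docstring could be split into free text and `@tag value` metadata. It is written for current Julia (1.x) rather than the syntax of this thread's era, and `parsetags` is a hypothetical helper, not part of any proposal above.

```julia
# Hypothetical helper: split a docstring into free text and `@tag value` metadata.
function parsetags(doc::AbstractString)
    meta = Dict{Symbol,Vector{String}}()
    body = String[]
    for line in split(doc, '\n')
        m = match(r"^@(\w+)\s+(.*)$", line)
        if m === nothing
            push!(body, line)                        # ordinary documentation text
        else
            tag, value = Symbol(m.captures[1]), m.captures[2]
            push!(get!(meta, tag, String[]), value)  # tags such as @param may repeat
        end
    end
    return strip(join(body, '\n')), meta
end

text, meta = parsetags("One sentence summary.\n\n@author Daniel Carrera\n@param x first input\n@param y second input")
```

The resulting `meta` dictionary is the kind of value a documentation macro could store alongside the rendered text.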

NOTE: This post was edited from the original version.

@dcarrera, I think there is some value in specifying a difference between a DOC[f::Function] (generic documentation for all methods of a function) and DOC[m::Method] (documentation specific to a particular method signature).

Also, I'm not sure I like the DOC[f].meta pattern, since . cannot be overloaded. I would suggest instead that:

• DOC[f] <: Associative{Symbol,String}.
• DOC[f][foo] gives the metadata (String) for the symbol foo. e.g. DOC[f][:author] is an author string.
• keys(DOC[f]) gives an iterator over the metadata keys as usual.
• No metadata is required (keys(DOC[f]) may be empty), although :summary at least is recommended. And we standardize a few (optional) metadata names like :author, :date, and so on.

Then we wouldn't use writemime for "meta/foo" metadata faux MIME types. Instead, we would only use it for outputting the documentation itself, requiring only text/plain and text/html.
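A rough sketch of that shape, assuming current Julia (1.x), where `show(io, mime, x)` plays the role that `writemime` plays in this thread. `DocEntry` and its fields are hypothetical names chosen for illustration, and the HTML output does no escaping.

```julia
# Hypothetical entry type: documentation text plus optional string metadata.
struct DocEntry
    body::String
    meta::Dict{Symbol,String}
end

# Dictionary-style access to the metadata, e.g. entry[:author].
Base.getindex(d::DocEntry, k::Symbol) = d.meta[k]
Base.keys(d::DocEntry) = keys(d.meta)

# Only the documentation itself is rendered through MIME types.
Base.show(io::IO, ::MIME"text/plain", d::DocEntry) = print(io, d.body)
Base.show(io::IO, ::MIME"text/html", d::DocEntry) = print(io, "<p>", d.body, "</p>")

const DOC = IdDict{Any,DocEntry}()

# Documentation for the generic function ...
DOC[sin] = DocEntry("Sine of x.", Dict(:author => "someone", :summary => "Sine of x."))

# ... and for one particular method of it.
m = which(sin, Tuple{Float64})
DOC[m] = DocEntry("Sine of a floating-point number.", Dict{Symbol,String}())
```

Because no metadata key is required, the method entry above simply carries an empty dictionary.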

I prefer Markdown to asciidoc, since:

• Markdown is more widely known thanks to github (as well as IPython/IJulia in the case of Julia users).
• Markdown with embedded LaTeX equations is already directly supported in IPython and IJulia.
• Embedded LaTeX (which can be processed via mathjax as in IPython) is extremely useful for documenting math functions.

I strongly agree with a preference for Markdown. AsciiDoc is just kind of clunky. Sure it's more complete, but complete is not really what we need. We need something minimal but useful – which is exactly what Markdown gives. I've also been tossing around the idea of having string types with embedded formatting that know how to render themselves to various outputs. This kind of makes sense given that we're supporting display and MIME types right in Julia base. Basically, the idea is that instead of writing error messages with plain text and then awkwardly marking them up after the fact, the original error message should have some markup in it and then if it gets written to plain text, you just drop the markup, but if it gets written to HTML or some other richer medium, you transform the markup into the appropriate form. I think that Markdown gets the subset that you want to support about right, whereas AsciiDoc has way too much. We probably want to support things like special link schemas for linking to source files – which are then translated upon presentation to the appropriate external link.
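As an illustration of the "string that knows how to render itself" idea, here is a minimal sketch in current Julia (1.x). `MarkedString` is a hypothetical type, and the only markup it understands is `*bold*`; a real implementation would cover far more.

```julia
# Hypothetical string type carrying a tiny subset of markup: *bold* only.
struct MarkedString
    src::String
end

# Plain text: drop the markup.
Base.show(io::IO, ::MIME"text/plain", s::MarkedString) =
    print(io, replace(s.src, r"\*(.+?)\*" => s"\1"))

# HTML: translate the markup (no escaping is attempted in this sketch).
Base.show(io::IO, ::MIME"text/html", s::MarkedString) =
    print(io, replace(s.src, r"\*(.+?)\*" => s"<b>\1</b>"))

msg = MarkedString("argument must be *positive*")
plain = sprint(show, MIME("text/plain"), msg)
html = sprint(show, MIME("text/html"), msg)
```

The same message object produces either form, so the caller never marks anything up after the fact.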

@stevengj: I think your suggestions are a good improvement over my initial sketch. I agree that DOC[m::Method] is very valuable.

@StefanKarpinski: I don't understand why you feel AsciiDoc is clunky compared to Markdown. AFAICT, for the things Markdown does, AsciiDoc looks almost the same or AsciiDoc looks better (e.g. links). I do agree that AsciiDoc is big and has many features that are unlikely to be relevant. I was planning to suggest that we pick a convenient subset of AsciiDoc. I think that this subset-of-AsciiDoc idea is better than Markdown because you have room to grow, without creating yet another incompatible extension of markdown because you want to do something that Gruber was not interested in. In particular, I would like to have from Asciidoc:

1. The NOTE / WARNING / TIP / etc blocks.
2. AsciiDoc tables.
3. Definition lists.

NOTE: I should mention that Github supports AsciiDoc just as well as it supports Markdown. One of the Github developers is a developer of Asciidoctor.

I was reading PEP 287 on the plane. That's the PEP that proposes ReST as the standard format for Python. It makes a good case that a documentation format should NOT be too minimal. It looks like the Python community initially tried the "minimal but useful" approach and had problems and went through some growing pains before they settled for ReST.

Let me show you a sketch of a subset of AsciiDoc that I think would work well (meaning, it has only the features that I would definitely use):

=========

---------

~~~~~~~~~

*bold*

_italics_

monospace

TIP: ...

NOTE: ...

WARNING: ...

CAUTION: ...

IMPORTANT: ...

* bullet
** lists

. ordered
.. lists

foo:: definition for foo

image::diagram.png[Images]

[source,julia]
----
function foo()
...
end
----

|=======================
|Col 1|Col 2      |Col 3
|1    |Item 1     |a
|2    |Item 2     |b
|=======================

Whether the format is Markdown or AsciiDoc, I would like to make a proposal: Use backticks in a heading or definition list to indicate that something should go into the DOC object to become part of the online documentation. For example, stdlib/linalg.rst could look like this:

`inv(M)`:: Matrix inverse

`pinv(M)`:: Moore-Penrose inverse

`null(M)`:: Basis for null space of M.

or like this,

=== `repmat(A, n, m)`

Construct a matrix by repeating the given matrix n times
in dimension 1 and m times in dimension 2.

=== `repeat(A, inner = Int[], outer = Int[])`

Construct an array by repeating the entries of A. The i-th element
of inner specifies the number of times that the individual entries
of the i-th dimension of A should be repeated. The i-th element of
outer specifies the number of times that a slice along the i-th
dimension of A should be repeated.

The objective is to allow people to separate the documentation from the code if they wish. This same text can be parsed later by some external Markdown or Asciidoc tool to become part of the manual.

> a subset of AsciiDoc

> I would like to make a proposal: Use backticks in a heading or definition list to indicate that something should go into the DOC

I don't think that defining Yet Another Standard is the goal. We had this discussion in IPython, and we already have problems because we support MathJax in Markdown. So, as the others have said, I think it is wise to first focus only on the way to associate data, not the way to write it.

> Github supports AsciiDoc

Often the question goes further than just where the format can be rendered. In particular, Markdown has the huge advantage that it can be rendered browser-side. AFAIK, there is no good library to do this client-side for AsciiDoc.
But this is just one example.

> That's the PEP that proposes ReST as the standard format for Python. It makes a good case that a documentation format should NOT be too minimal.

But ReST also has a good number of drawbacks, and with all the respect I have for the people who wrote ReST and chose to use it, it is broken in many ways. The first one being that you cannot render a partial document. Because the marking of headers is context-sensitive, seeing ~~~ under a text does not tell you whether it is H1, H2, or H3 unless you have parsed the rest of the doc.

So I think the way of writing has to be chosen really carefully, using criteria that have never been brought up before: documentation that can be statically built, where build time is (almost) irrelevant, and the capacity to extract only partial information from the doc and render it on the fly (Just In Time for doc, so Doc In Time?). Obviously the two have different constraints, and the latter will have much more difficulty getting cross-references and so on, especially in an interactive environment.

Lastly, even with all the good faith in the world, I doubt we can solve this problem with only pen and paper;
only by trial and error and natural selection can we start to see what survives and what are the things we did not think of.

> I don't think that defining Yet Another Standard is the goal.

I am not sure that using a subset of a standard is the same as making yet-another-standard. I would be more concerned about extending a standard, which is my concern with Markdown. Markdown doesn't do much. If we go for Markdown, sooner or later we will extend it. I have listed some features that AFAIK Markdown lacks that I would like to use (e.g. NOTE / TIP / WARNING and definition lists).

> AFAIK, there is no good library to do this client-side for AsciiDoc.

There is asciidoctor.js from the Asciidoctor project.

> But ReST also has a good number of drawbacks, and with all the respect I have for the people who wrote ReST and chose to use it, it is broken in many ways. The first one being that you cannot render a partial document.

That is indeed a pretty major drawback. In addition, I am not a big fan of the ReST syntax. I much prefer the Markdown / Asciidoc syntax. That is why I have not proposed ReST for Julia. I think that the Python community is more or less stuck with ReST due to inertia, and Julia seems to be moving in the direction of ReST for manuals.

The Julia Language member
commented Jan 6, 2014

> Julia seems to be moving in the direction of ReST for manuals

That's almost entirely because ReadTheDocs/Sphinx is such a nice solution for them.

To help get the ball rolling, I wrote a small documentation module:

https://github.com/dcarrera/Doc.jl

I tried to mostly follow the spec proposed by @stevengj. This module includes a DOC object, and sample minimalist implementations of help(), apropos() and @doc. The only macro is @doc. I did not implement any of the string macros I proposed earlier. See the README for more details.

One place where this module deviates slightly from @stevengj's proposal is that DOC is indexed by strings only. Methods are documented in the form DOC["foo(x::Real)"]. I consider this a backend detail, since the user is supposed to use the help() function to access DOC and help(m::Method) is fully defined. Ditto for help(f::Function).

Please have a look and offer improvements. I will be happy to give commit access to anyone that wants it (you might need to explain to me how to do that).

I would prefer to see this work done on a branch or fork of Julia rather than in a module. And I think we have certainly got to index by Julia objects rather than by strings. Not only do strings have all sorts of limitations when it comes to documenting things other than methods (e.g. how would one document constants like π?), but constructing a unique string from a method signature seems inherently fragile.

@stevengj Reloading a module while testing and developing does not take 45 seconds, which is how long make takes on my machine after changing a file in base/. I would be happy to see the functionality in Base, but currently we need to make it easy for adventurous users to test the functionality and find edge cases without requiring them to run a forked julia, or requiring us to merge a half-baked solution into master.

@stevengj: Could you have a look at my @doc macro and tell me how to write it so I can get the method object? My inability to do that is the only reason I went for strings.

I will be happy to branch/fork when there is an agreement that I should. This is my first contribution to Julia. I don't know the process.

As @jakebolewski suggested some time ago, I also think that it would make sense to define a function-interface to access the metadata. This means instead of accessing the DOC dict directly (or whatever else), using a function for it: meta(obj::Any) would return DOC[obj] (in case of @stevengj's backend).

This abstraction would allow a few things:

• makes it possible to change the backend, if ever desired.
• lets users choose a different backend for the metadata associated with their types
• lets users do additional stuff when meta is called for their types.

Using the DOC dict backend (renamed to META as it may contain other things as well...) this is (trivially):

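# Each metadata entry is just a dict Any=>Any:
META = Dict{Any,Any}()   # global store: object => metadata dict

# function interface to META
function meta(obj; create=false)
    if create
        # start an empty entry unless obj already has one
        haskey(META, obj) || (META[obj] = Dict{Any,Any}())
        META[obj]
    else
        META[obj]        # errors if obj has no metadata
    end
end
# (primitive) documentation-system functions:
function help(obj)
    helpstr = meta(obj)[:doc]
    show(helpstr)
end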
#####
# example: associate doc with cos generic function:
meta(cos, create=true)[:doc] = """Calculates the cosine of the single argument"""
meta(cos)[:author] = """Cosinus"""  # some other metadata
# associate doc with a particular method
cos_float64 = methods(cos, (Float64,))[1]
meta(cos_float64, create=true)[:doc] = """for a 64-bit floating point number."""
# call help
help(cos)

This should work fine with @dcarrera's approach in his gist.

@mauro3: I am having trouble getting methods() to work correctly:

julia> bar(x) = 3x
julia> bar(x::Real) = 4x
julia> methods(bar)
# 2 methods for generic function "bar":
bar(x::Real) at none:1
bar(x) at none:1

julia> methods(bar, (Real,) )
1-element Array{Any,1}:
(x::Real) at none:1

julia> methods(bar, (Any,) )
2-element Array{Any,1}:
(x::Real) at none:1
(x) at none:1

In the last instance, I see no obvious way to pick out the bar(x) method.
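For what it is worth, in current Julia (1.x) one way to single out a method is `which`, which returns the `Method` that would be invoked for a given tuple of argument types. The sketch below assumes that newer API; it is not something the `methods` output in the session above offers directly.

```julia
bar(x) = 3x
bar(x::Real) = 4x

# For an abstract argument type, `which` returns the method that `invoke`
# would call, so Tuple{Any} selects the catch-all definition bar(x).
general = which(bar, Tuple{Any})

# For a concrete type, ordinary dispatch applies: an Int goes to bar(x::Real).
specific = which(bar, Tuple{Int})
```

Each result is a `Method` object, so it can serve directly as a dictionary key for per-method documentation.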

I'm not sure about the best way either. We probably need to write an enhanced methods function, but maybe someone more knowledgeable can comment on this. (The main point of my last post was the meta function; the rest is just some simple illustration, and you're much further along in that respect.)

Another problem I encountered, when trying to document the META-dict itself, was that using a dict as its own key leads to a stackoverflow: a=Dict();a[a]=5;a[a]. In general using mutable objects as dict keys seems to have its pitfalls, ObjectIdDict helps a bit but not all the way.
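A small sketch of the identity-keyed alternative, assuming current Julia (1.x), where `IdDict` is the successor of `ObjectIdDict`. Keys are compared by object identity rather than by content, so a table can hold an entry for itself without hashing its own contents.

```julia
# Identity-keyed table: lookups use object identity (===), not content hashing.
meta = IdDict{Any,Any}()

# The table can therefore document itself.
meta[meta] = "documentation for the metadata table itself"

# A mutable key stays findable after it is mutated.
v = [1, 2, 3]
meta[v] = "docs for v"
push!(v, 4)
```

This covers the self-reference case but not every pitfall mentioned above: two equal but distinct objects are treated as different keys.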

Whoo hoo! \o/

I think I might have the macro working now. The trick is to take the method list from methods( f, signature) and locate the least specific method from that list. Using your META dict, here is a full implementation, not including meta() and help():

macro doc(s,e)
    if typeof(e) == Expr

        # Expr => Get method

        f = eval(e)  #  Function.

        #
        # e.head                   is  :(=) or :function
        # e.args[1]                is  :(foo(x,y::Real))
        # e.args[1].args[1]        is  :foo
        # e.args[1].args[2:end]    is  [ :x , :(y::Real) ]
        #
        params = e.args[1].args[2:end]

        #
        # Get the signature as a tuple.
        #
        sig = map( x -> isa(x,Expr) ? x.args[2] : Any , params)
        sig = tuple(sig...)          # Convert [Symbol,...] -> (Symbol,...)
        sig = map(x -> eval(x), sig) # Convert (Symbol,...) -> (DataType,...)

        #
        # Method list -- all methods that match the signature.
        #
        ml = methods(f, sig)

        #
        # Does this work? -- Look for the most general method in ml.
        #
        while length(ml) > 1
            if ml[1].sig <: ml[2].sig
                splice!(ml, 1) # ml[1] is more specific => remove it.
            elseif ml[2].sig <: ml[1].sig
                splice!(ml, 2) # ml[2] is more specific => remove it.
            else
                # Neither ml[1] nor ml[2] can be the most general method in ml.
                splice!(ml, 1:2)
            end
        end

        #
        # The last man standing should be the correct method.
        #
        key = ml[1]
    else
        # Symbol => Get function

        # FIXME: Test this!
        key = eval(e)
    end

    docstr    = typeof(s) == Expr ? eval(s) : s
    META[key] = { :docstr => docstr }
end

@doc "Hello world 1" foo(x::Real)     = 3x;
@doc "Hello world 2" foo(x::Int)      = 4x;
@doc "Hello world 3" foo(x::Number)   = 5x;
@doc "Hello world 4" foo(x::Rational) = 6x;
@doc "Hello world 5" foo(x) = 7x;

At this point, you can confirm that length(META) == 5 and that the contents of META are correct.

I will add this to my test module (https://github.com/dcarrera/JDoc.jl) later today.

... And of course, 20min later I realize that my round-robin loop is ridiculously over-complicated and a simple filter would do what I need:

ml = methods(f, sig)
-
-
-       #
-       # Does this work? -- Look for the most general method in ml.
-       #
-       while length(ml) > 1
-           if ml[1].sig <: ml[2].sig
-               splice!(ml, 1) # ml[1] is more specific => remove it.
-           elseif ml[2].sig <: ml[1].sig
-               splice!(ml, 2) # ml[2] is more specific => remove it.
-           else
-               # Neither ml[1] nor ml[2] can be the most general method in ml.
-               splice!(ml, 1:2)
-           end
-       end
+       ml = filter(m -> m.sig == sig, ml)
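
The same filtering idea can be sketched in current Julia (1.x), where a method's `sig` field also includes the function's own type as its first parameter. `exactmethods` is a hypothetical helper name, and this is an illustration rather than a port of the macro above.

```julia
foo(x::Real) = 3x
foo(x::Int) = 4x
foo(x) = 7x

# Hypothetical helper: keep only the methods whose signature is exactly `argtypes`.
function exactmethods(f, argtypes::Tuple)
    wanted = Tuple{typeof(f), argtypes...}
    return filter(m -> m.sig == wanted, collect(methods(f)))
end

real_methods = exactmethods(foo, (Real,))
any_methods = exactmethods(foo, (Any,))
```

Matching on the exact signature avoids having to rank the candidate methods by specificity at all.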

Here is the demo for the above pull request (updated 29 Jan):

####################
md[:something] = 5
setmeta!(Base, md) # associate md with Base

getmeta(Base) # retrieve it

# or equivalently
md = getmeta!(sin)
md[66] = -5

getmeta(sin)

### existing help-system
########################
# the REPL-help system is ported over to the new system
help(cos)
help(*)
help(ENV)
help("getting around")
# macros can only be referred to by a string:
# (otherwise they are evaluated right away)
help("@which")

###########################
# (note: these are just the low-level functions and are not intended to
# be used by the user.  We'll need some macro/function-magic here along the
# lines of @dcarrera's work)
hd = Base.Help.HelpDict();  hd[:categories] = ["my stuff"]

f(x) = 5+x
hd[:desc] ="Adds 5 to its argument"
Base.Help.setdoc!(f, hd)

type MyType end
hd = copy(hd)
hd[:desc] ="My type"
Base.Help.setdoc!(MyType, hd)
at = MyType()
hd = copy(hd)
hd[:desc] ="Instance of MyType"
Base.Help.setdoc!(at, hd)

help(f)
help(MyType)
help(at)
help("my stuff")

Maybe off topic, but have you guys seen devdocs.io?

Added a @doc macro to pull request #5572:

julia> @doc """
some fine doc
""" function f(x)
2x
end

julia> ?f
some fine doc

(it's a bit ugly that function has to be on the same line as """. Any way around that apart from using parentheses?)

> (it's a bit ugly that function has to be on the same line as """. […])

Yes, it totally is. :( That way documented and undocumented functions would look different and would be significantly harder to see when skimming through code.

(Disclaimer: Python absolutely gets docstrings right, and I'd love to see a comparable approach in Julia.)

I think that ultimately we will want to have some kind of doc keyword in Julia, with the macro only as a stopgap. A keyword will allow more flexible parsing.

Python doc strings for function/classes are good. Although, I think in Python, there is nothing to provide documentation for other objects.

It would be a lot prettier if there was a line continuation character. But there is none, and none planned, I think. But there should probably be some special syntax down the line, and I would be happy with a Python-like approach for functions, types and modules.

Well, ugly or not, we should settle on a metadata implementation. Once that is in place we can play around with and discuss how to write documentation.

I too would like to see something like docstrings in Python, definitely the best approach I've come across so far.

Thanks @ErikBjare, and thanks for pinging this issue. Even though the details of the pull request #5572 are not too popular, we should discuss whether the interface I used there to access the metadata is usable.

It's based on the interface to dictionaries (http://docs.julialang.org/en/latest/stdlib/base/#associative-collections) and looks like:

typealias MetaData Dict{Any,Any} # Each individual metadata entry is just a dict Any=>Any:

hasmeta(obj)                     # return true if obj has associated metadata
getmeta(obj)                     # returns metadata of obj or errors if it has none
getmeta(obj, default)            # returns metadata of obj, or default if hasmeta(obj)==false
getmeta!(obj)                    # returns metadata of obj or an empty MetaData which is associated with obj
getmeta!(obj, md::MetaData)      # returns metadata of obj or md, which is also associated with obj

setmeta! is needed as setindex! and the syntactic sugar [] used for Dicts are not possible here. We should also add deletemeta!(obj). The other functions for Associative Collections are probably not applicable, e.g. iteration is not possible in general.
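
A minimal sketch of this interface in current Julia (1.x), assuming a single identity-keyed table as the backend. The store name `META` and the bodies below are illustrative assumptions, not the code from the pull request.

```julia
const MetaData = Dict{Any,Any}          # each metadata entry is a Dict Any=>Any
const META = IdDict{Any,MetaData}()     # hypothetical backend, keyed by object identity

hasmeta(obj) = haskey(META, obj)
getmeta(obj) = META[obj]                               # errors if obj has none
getmeta(obj, default) = get(META, obj, default)
getmeta!(obj) = get!(META, obj, MetaData())            # attach an empty entry if needed
getmeta!(obj, md::MetaData) = get!(META, obj, md)      # attach md if obj has none
setmeta!(obj, md::MetaData) = (META[obj] = md)
deletemeta!(obj) = (delete!(META, obj); nothing)

# Usage, following the demo earlier in the thread:
md = getmeta!(sin)
md[:doc] = "sine"
setmeta!(Base, MetaData(:something => 5))
```

Hiding the table behind functions keeps callers independent of the backend, which is the point of the function interface.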

I agree with @homeworkprod and @ErikBjare. Python's docstrings are invaluable and are a great asset to the language esp. the version used by NumPy/SciPy. The ability to see extensive documentation and examples without leaving the REPL really helps exploratory coding. I think Julia would greatly benefit from implementing something comparable.

We really need help for packages. I believe this issue covers it. Added to 0.4.

+1000

👍. And what about help for methods? :)

+1 I am also missing Python's docstrings a lot. I think that NumPy's docstrings are especially helpful for beginners. Docstrings should also dispatch on types, but could there be some kind of inheritance? Here I am thinking of something like a plot function in Gadfly, where many are just variations of a base function.

While agreeing on the infrastructure that makes documentation of packages possible is good, isn't it even more important to agree on the form in which documentation is written? By form I don't mean what is in the doc string but how to define it. The @doc macro is one possibility. What would IMHO be important is to support both use cases:
- Inline doc in front of the function/type that is to be documented. Here the function/type should be parsed to get all the rich information
- Offline doc where the function and the doc are separated as it is currently done in base.

If we have agreed upon the syntax I would vote for merging a simple prototype early to master and start migrating base docs and use the global DOC that @stevengj proposed.

Agree with @tknopp that both inline and out-of-line are needed. Inline is good because it's in the programmer's face, so it is likely to get updated. But out-of-line is needed when the documentation gets a bit complex. Having large discussions of preferred usage, performance tips, etc. inline is very distracting. In this case it should be possible to put it elsewhere, but referenced from inline so there is still something indicating that it exists and that the programmer should update it.

In both cases the function/type should be parsed for as much as possible.

I didn't see an explicit mention, but I assume the data on the function/type will be automagically added to the central repository at compile time (or startup for pre-compiled code).

For generating documentation I'd also go for Markdown since
1) it's simple
2) it's neutral and allows generating different formats like HTML, PDF, ... from it.

Rstudio seems to be doing great things with (their version of) Markdown for generating documents.

One standard structured documentation format for code is way better than every possible format.

I'd also go for putting the metadata/doc string before a function, not in the body. If you have a long, detailed docstring with e.g. examples, a bullet point list, or a table, it will make a function body less readable because you have a split between your function header with the arguments and the rest of your body.

As far as pure metadata is concerned, it might be worthwhile to look at how other languages do it, e.g. Java annotations (also heavily used in Groovy) or C#. I definitely would not want a runtime cost for processing these every time a function is called.

@ssagaert I've been prototyping some of the ideas you mention in a package https://github.com/MichaelHatherly/Docile.jl. Feel free to have a look at it if you're interested.

There was a lengthy discussion about help/documentation/etc on the mailing list recently, worth referencing here:

Of course, no consensus was reached but a few interesting things were discussed:

• string-based vs comment-based documentation
• how complex or simple should it be
• whether the documentation should become part of the AST

Jeff and I just talked about this today and a bare string literal in void context followed by a definition seems like the way to go. This should be lowered by the parser something like this:

"frob(x) frobs the heck out of x."

function frob(x)
# commence frobbing
end

becomes the moral equivalent of this:

let doc = "frob(x) frobs the heck out of x."
    if haskey(__DOC__, :frob)
        __DOC__[:frob] *= doc
    else
        __DOC__[:frob] = doc
    end
end

function frob(x)
# commence frobbing
end

1. parsing has no side-effects – the construction of the documentation structure still occurs when the code is actually evaluated, not when it is parsed.
2. each module has its own const __DOC__ = Dict{Symbol,UTF8String} dictionary; this is important for reloading modules.
3. This ends up just appending all the docs for a given name, including separate doc strings for a single generic function.
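
The appending behaviour in point 3 can be sketched as follows, in current Julia (1.x) with `String` standing in for `UTF8String`. `DocDemo` and `adddoc!` are hypothetical names; the calls to `adddoc!` stand in for what the parser-generated code would do when the module is evaluated.

```julia
module DocDemo

# Per-module documentation table, as in point 2.
const __DOC__ = Dict{Symbol,String}()

# Hypothetical helper performing the lowered steps at evaluation time.
function adddoc!(name::Symbol, doc::AbstractString)
    if haskey(__DOC__, name)
        __DOC__[name] *= doc      # append to earlier docs for the same name
    else
        __DOC__[name] = doc
    end
    return nothing
end

adddoc!(:frob, "frob(x) frobs the heck out of x.")
frob(x) = x

adddoc!(:frob, "\nfrob(x, y) frobs x with y.")
frob(x, y) = (x, y)

end # module
```

Both method docs end up under the single key `:frob`, which is why the key for methods added from another module remains an open question.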

An open issue is how to handle adding methods to functions from other modules. Does the definition go into the current module's __DOC__ dict? What symbol is used for the doc key then?

[cross-posted from here]

I don't like the fact that docs come before the function, but that's probably from being used to Python.

I agree. The thing I most look at documentation for is to see the signature and the one-line summary. Having those right next to each other (as in Python) is really nice. In this proposal, will they often be separated by a lot of detailed documentation?

One way around this is to reproduce the signature, as in the above example, which seems a bit silly given that the perfectly good signature (often with great type information) is available right at the start of the function. I guess another way around this is to put a one-line summary at the end of the docs, which seems weird to me.

@StefanKarpinski, two things:

• I think it would be better to use the actual object as the dict key and not a symbol. For instance when using a module, with just symbols it would be hard to figure out where the binding came from. This would also solve adding docs to other modules: it just adds to the __doc__ of the module where the entity is defined. (but macros would need to be treated separately)
• why not work with a function, say setdoc! which does all of the parser-inserted code?

Your example would then look like:

"frob(x) frobs the heck out of x.":
function frob(x)
# commence frobbing
end

becomes the moral equivalent of this:

function frob(x)
# commence frobbing
end
setdoc!(frob,  "frob(x) frobs the heck out of x.")

Two more things:

• the documentation for a module itself, would that go into its own __doc__ or in the __doc__ of the parent module?
• I like the : (as I inserted into above example) as a way to bind the docstring to the thing following it, but that really is just another bikeshed.

One motivation for this design is that it extends to simple variables, e.g.

"doc for X"
const X = ...

Supporting that also seems to preclude attaching the doc string to the object.
I imagine the actual lowering would produce something like setdoc!(current_module(), frob, :frob, string) so that setdoc! has enough information to do whatever might be necessary.
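
A sketch of what such a `setdoc!` could look like, assuming current Julia (1.x), where `@__MODULE__` takes the place of `current_module()`. The `DOCS` table and the function body are illustrative assumptions only.

```julia
# Hypothetical store: one name => docstring table per module.
const DOCS = Dict{Module,Dict{Symbol,String}}()

function setdoc!(mod::Module, obj, name::Symbol, doc::AbstractString)
    table = get!(DOCS, mod, Dict{Symbol,String}())
    table[name] = get(table, name, "") * doc   # append to any earlier doc for this name
    return obj
end

# For a plain binding the object is just the value 42, so the name is the
# only usable key; this is why the symbol is passed along as well.
X = 42
setdoc!(@__MODULE__, X, :X, "doc for X")
```

Passing module, object, and symbol together leaves the storage decision to `setdoc!` instead of fixing it in the parser.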

If you play around with this syntax in the presence of multiple dispatch, you can see immediately that the doc string inside approach just doesn't work: you want to have the doc string for a generic function before a series of method definitions for the same generic function, not inside one of the method definitions. We're also going to use this for things like globals, which don't have an "inside".

Some other examples:

"doc for overall function f"
function f;    # possible syntax for defining a generic function without adding a method yet

"doc for g"
g(x) = 2x

# add docs to a function without defining anything
"doc for h"
h

What would this do?

"doc for a,b"
a,b = 1,2

At first that might have to be a parse error, or we just ignore the doc string if we don't know how to attach it to the following expression.
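
One way to picture the "do we know how to attach it" decision is a small function over the parsed expression, sketched here for current Julia (1.x). `docname` is a hypothetical helper; it returns the name a docstring would attach to, or `nothing` when the expression is not something it recognizes.

```julia
# Hypothetical helper: the name a preceding docstring would attach to, or `nothing`.
function docname(ex)
    ex isa Symbol && return ex                        # bare name, e.g. `h`
    ex isa Expr || return nothing
    if ex.head === :const                             # const X = ...
        return docname(ex.args[1])
    elseif ex.head === :(=) || ex.head === :function  # g(x) = 2x, function f(x) ... end
        lhs = ex.args[1]
        lhs isa Symbol && return lhs                  # X = ... or `function f end`
        lhs isa Expr && lhs.head === :call && return docname(lhs.args[1])
        return nothing                                # e.g. a, b = 1, 2
    end
    return nothing
end
```

With this sketch, `a,b = 1,2` yields `nothing`, so a caller could either raise an error or silently drop the docstring, which are the two options mentioned above.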

It would be nice if one could refer to args by name in the doc. I find this really useful for more elaborate explanations. Like

"blabla @arg blabla"
function f(arg)
...
end

If you think @ clashes too much with macros then just use another character.

See #8514; @StefanKarpinski's preference is that the documentation be inserted into external docs by some kind of {{myfunction}} manual template, so that they can be mixed with proper narrative documentation. That obviates the question of automatic ordering.

Is it fair to close this with the recent work on @doc and discuss specific details in separate issues?

Cc: @one-more-minute @MichaelHatherly

For reference: #8514

Yes, this issue seems to be basically concerned with the core doc system (storing + displaying metadata), which we have now. We do still want things like syntax, but now that those have their own issues this doesn't seem that relevant.

Yeah, the general concerns in this issue seem to be covered by @one-more-minute's recent work.

close this

@ViralBShah looks like you performed a drive-by tag removal, but you didn't close--what is the remaining work here?

