I am looking into the possibility of revising Go's doc comment syntax, specifically adjusting headings and adding lists and links. This discussion is meant to gather feedback before writing an official proposal.
The current Go doc comment format has served us well since its introduction in 2009. There has only been one significant change, which was the addition of headings in 2011. But there are also a few long-open issues and proposals about doc comments, including:
It makes sense, as we approach a decade of experience, to take what we've learned and make one coherent revision, setting the syntax for the next 10 or so years.
Goals and non-goals
The primary design criterion for Go doc comments was to make them readable as ordinary comments when viewing the source code directly, in contrast to systems like C#'s Xmldoc, Java's Javadoc, and Perl's Perlpod. The goal was to prioritize readability, avoiding syntactic ceremony and complexity. This remains a primary goal.
Another concern, new since 2009, is backwards compatibility. Whatever changes we make, existing doc comments must generally continue to render well. Less important, but still something to keep in mind, is forward compatibility: keeping new doc comments rendering well in older Go versions, for a smoother transition.
Another goal for the revamp is to write a separate, standalone web page explaining how to write Go doc comments. Today that information is squirreled away in the doc.ToHTML comment and is not easily found or widely known.
Within those constraints, the focus I have set for this revamp is to address the issues listed above. Specifically:
I believe it also makes sense to add another goal:
It is not a goal to support every possible kind of documentation or markup. For example:
Markdown is not the answer, but we can borrow good ideas
An obvious suggestion is to switch to Markdown, especially since this discussion is hosted on GitHub, where all comments are written in Markdown. I am fairly convinced Markdown is not the answer, for a few reasons.
First, there is no single definition of Markdown, as explained on the CommonMark site. CommonMark is roughly what is used on GitHub, Reddit, and Stack Overflow (although even among those there can be significant variation). Even so, let's define Markdown as CommonMark and continue.
Second, Markdown is not backwards compatible with existing doc comments. Go doc comments require only a single space of indentation to start a <pre> block, while Markdown requires more. Also, it is common for Go doc comments to use Go expressions like `raw strings` or formulas like a*x^2+b*x+c. Markdown would instead interpret those as syntactic markup and render as “
Third, many features in Markdown are not terribly readable. The basics of Markdown can be simple and punctuation-free, but once you get into more advanced uses, there is a surfeit of notation which directly works against the goal of being able to read (and write) program comments in source files without special tooling. Markdown doc comments would end up full of backquotes and underscores and stars, along with backslashes to escape punctuation that would otherwise be interpreted specially. (Here is my favorite recent example of a particularly subtle issue.)
Fourth, Markdown is surprisingly complex. Markdown, befitting its Perl roots, provides more than one way to do just about anything: _i_, *i*, and <em>i</em>; Setext and ATX headings; indented code blocks and fenced code blocks; three different ways to write a link; and so on. There are subtle rules about exactly how many spaces of indentation are required or allowed in different circumstances. All of this harms not just readability but also comprehensibility, learnability, and consistency. The ability to embed arbitrary HTML adds even more complexity. Developers should be spending their time on the code, not on arcane details of documentation formatting.
Of course, Markdown is widely used and therefore familiar to many users. Even though it would be a serious mistake to adopt Markdown in its entirety, it does make sense to look to Markdown for conventions that users already know and that we can tailor to Go's needs. If you are a fan of Markdown, you can view this revision as making Go adopt a (very limited) subset of Markdown. If not, you can view it as Go adopting a couple of extra conventions that can be defined separately from any Markdown implementation or spec.
Headings
The current rule is:
I can never remember the details of this exact rule, despite having chosen it. Every time I write a heading, I worry about whether it's going to be recognized as such. Others clearly have the same problem (#7349, #31739, #34377). The rule avoided the need for visible syntax, but in retrospect visible syntax would have been simpler. As Markdown shows us, that syntax can be very lightweight: a single “#” would suffice. Therefore I suggest the following:
New Rule: If a span of non-blank lines is a single line beginning with # followed by a space or tab and then additional text, then that line is a heading.
Here are some examples of variations that do not satisfy the rule and are therefore not headings:
Transition: The old heading rule will remain valid, which is acceptable since it mainly has false negatives, not false positives.
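As a minimal sketch of the new heading rule only (not the old rule, and not the actual go/doc code), the check could look like the following. The function name isHeading is hypothetical, and the sketch assumes the caller has already stripped comment markers and split the text into spans of non-blank lines.

```go
package main

import (
	"fmt"
	"strings"
)

// isHeading reports whether span, a run of consecutive non-blank lines,
// is a heading under the proposed rule: the span is a single line that
// begins with "#", followed by a space or tab, followed by more text.
// (Hypothetical helper; comment markers are assumed to be stripped.)
func isHeading(span []string) bool {
	if len(span) != 1 {
		return false // a heading must be alone in its span
	}
	line := span[0]
	if len(line) < 3 || line[0] != '#' {
		return false
	}
	if line[1] != ' ' && line[1] != '\t' {
		return false // "#" must be followed by a space or tab
	}
	// There must be some text after the "#" and the separator.
	return strings.TrimSpace(line[2:]) != ""
}

func main() {
	spans := [][]string{
		{"# Numeric Conversions"},
		{"#Numeric Conversions"},
		{"# Numeric Conversions", "continued on a second line"},
		{"#"},
	}
	for _, s := range spans {
		fmt.Printf("%q heading=%v\n", s, isHeading(s))
	}
}
```

Because the test is purely syntactic, a writer can tell at a glance whether a line will be treated as a heading, which is the property the old rule lacked.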
Lists
There is no support for lists today. As noted before, documentation needing lists uses indented <pre> blocks instead.
For example, here are the docs for cookiejar.PublicSuffixList:
And here are the docs for url.URL.String:
Ideally, we'd like to adopt a rule that makes these into bullet lists without any edits at all. (Markdown's space-counting rules would make these <pre> blocks, not lists.)
Today, a span of lines all indented by one or more spaces or tabs is always a <pre> block.
New Rule: In a span of lines all blank or indented by one or more spaces or tabs (which would otherwise be a <pre> block),
Using this rule, the two doc comments above are both recognized and formatted as bullet lists, not as <pre> blocks.
Note that the rule means that a list item followed by a blank line followed by additional indented text continues the list item (regardless of comparative indentation level):
Note also that there are no code blocks inside list items—any indented paragraph following a list item continues the list item, and the list ends at the next unindented line—nor are there nested lists. This avoids all of the space-counting subtlety of Markdown.
To re-emphasize, a critical property of this definition of lists is that it makes existing doc comments written with pseudo-lists turn into doc comments with real lists.
Transition: Gofmt will rewrite recognized bullet and numbered lists to use a standard format. For example, the two doc comments above would reformat to:
The specific formatting rules are discussed in the Formatting section below.
Markdown recognizes three different bullets: -, *, and +. In the main Go repo, the dash is dominant: in comments of the form
Markdown also recognizes two different numeric list item suffixes: “1.” and “1)”. In the main Go repo, 66% of comments use “1.” (versus 34% for “1)”). In the external corpus, “1.” is again the dominant choice, 81% to 19%.
We have two conflicting goals: handle existing comments well, and avoid needless variation. To satisfy both, all three bullets and both forms of numbers will be recognized, but gofmt (see below) will rewrite them to a single canonical form: dash for bullets, and “N.” for numbers. (Why dashes and not asterisks? Proper typesetting of bullet lists sometimes does use dashes, but never uses asterisks, so using dashes keeps the comments looking as typographically clean as possible.)
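To make the canonicalization concrete, here is a rough sketch of rewriting a single list-item line so that any of the three bullets becomes a dash and "N)" becomes "N.". The function name canonicalItem is hypothetical. The sketch assumes the caller has already decided that the line sits inside an indented span, and it assumes a marker must be followed by a space or tab; it does not model the indentation gofmt would choose.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalItem rewrites the list marker on one line to the canonical
// form: "-" for bullets and "N." for numbers. It returns the line
// unchanged and false when no marker is recognized. Leading indentation
// is preserved as-is. (Hypothetical helper, not gofmt's actual logic.)
func canonicalItem(line string) (string, bool) {
	text := strings.TrimLeft(line, " \t")
	indent := line[:len(line)-len(text)]

	// Bullet markers: "-", "*", or "+" followed by a space or tab.
	if len(text) >= 2 && strings.ContainsRune("-*+", rune(text[0])) &&
		(text[1] == ' ' || text[1] == '\t') {
		return indent + "- " + strings.TrimLeft(text[1:], " \t"), true
	}

	// Numbered markers: digits followed by "." or ")" and a space or tab.
	i := 0
	for i < len(text) && text[i] >= '0' && text[i] <= '9' {
		i++
	}
	if i > 0 && i+1 < len(text) && (text[i] == '.' || text[i] == ')') &&
		(text[i+1] == ' ' || text[i+1] == '\t') {
		return indent + text[:i] + ". " + strings.TrimLeft(text[i+1:], " \t"), true
	}
	return line, false
}

func main() {
	for _, line := range []string{"  * first", "  + second", "  1) third", "  a*x^2+b*x+c"} {
		out, ok := canonicalItem(line)
		fmt.Printf("%q -> %q (list item: %v)\n", line, out, ok)
	}
}
```

Requiring whitespace after the marker keeps expressions such as a*x^2+b*x+c and values such as -5 or 2.5 from being mistaken for list items in this sketch.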
Links to URLs
Documentation is more useful with clear links to other web pages. For example, the encoding/json package doc today says:
There is no link to the actual RFC 7159, leaving the reader to Google it. And the link to the “JSON and Go” article must be copied and pasted. Loosely following the Markdown shortcut reference link format, I suggest the following:
New Rule: A span of unindented non-blank lines defines link targets when each line is of the form “[Text]: URL”. In other text, “[Text]” represents a link to URL using the given text—in HTML, <a href="URL">Text</a>.
Note that the link definitions can only be given in their own “paragraph” (span of non-blank unindented lines), which can contain more than one such definition, one per line. If there is no corresponding URL declaration, then (except for doc links, described in the next section) the text is not a hyperlink, and the square brackets are preserved.
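The following is a minimal sketch of how the link rule could be applied, not the go/doc implementation. The names splitLinkDefs and renderLine are hypothetical, the input is assumed to be already split into paragraphs of unindented lines, the URL is assumed to contain no spaces, and the RFC URL is included purely as sample data. Doc links (next section) are not handled here.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// linkDef matches one unindented definition line of the form "[Text]: URL".
var linkDef = regexp.MustCompile(`^\[([^\]]+)\]:\s+(\S+)$`)

// linkUse matches a bracketed "[Text]" reference in running text.
var linkUse = regexp.MustCompile(`\[([^\]]+)\]`)

// splitLinkDefs separates paragraphs (each a slice of lines) into link
// definitions and ordinary text. A paragraph counts as a definition
// block only if every one of its lines is a "[Text]: URL" line.
func splitLinkDefs(paras [][]string) (defs map[string]string, text [][]string) {
	defs = make(map[string]string)
	for _, p := range paras {
		var matches [][]string
		for _, line := range p {
			m := linkDef.FindStringSubmatch(line)
			if m == nil {
				break
			}
			matches = append(matches, m)
		}
		if len(p) > 0 && len(matches) == len(p) {
			for _, m := range matches {
				defs[m[1]] = m[2]
			}
			continue
		}
		text = append(text, p)
	}
	return defs, text
}

// renderLine converts one line of text to HTML, turning "[Text]" into a
// link when Text has a definition and preserving the brackets otherwise.
func renderLine(line string, defs map[string]string) string {
	var b strings.Builder
	last := 0
	for _, loc := range linkUse.FindAllStringSubmatchIndex(line, -1) {
		name := line[loc[2]:loc[3]]
		url, ok := defs[name]
		if !ok {
			continue // no definition: the brackets stay in the output
		}
		b.WriteString(html.EscapeString(line[last:loc[0]]))
		fmt.Fprintf(&b, `<a href="%s">%s</a>`, html.EscapeString(url), html.EscapeString(name))
		last = loc[1]
	}
	b.WriteString(html.EscapeString(line[last:]))
	return b.String()
}

func main() {
	defs, text := splitLinkDefs([][]string{
		{"Package json implements JSON as defined in [RFC 7159]."},
		{"[RFC 7159]: https://tools.ietf.org/html/rfc7159"},
	})
	for _, p := range text {
		for _, line := range p {
			fmt.Println(renderLine(line, defs))
		}
	}
}
```

Treating a paragraph as definitions only when every line matches keeps an ordinary sentence that happens to start with a bracket from being swallowed as a link target.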
This format only minimally interrupts the flow of the actual text, since the URLs are moved to a separate section. As already noted, it also roughly matches the Markdown shortcut reference link format, without the optional title text.
Transition: Gofmt will move link definitions to the end of the overall doc comment. Go vet will flag unused link targets. Older versions of Go will show the text verbatim, which is fairly readable.
Links to Go API documentation
Documentation is also more useful with clear links to other documentation, whether it's one function linking to another, preferred version of itself or a top-level doc comment summarizing the overall API of the package, with links to the key types and functions. Today there is no way to do this. Names can be mentioned, of course, but users must find the docs on their own.
Following discussion on #45533, I suggest treating doc links like the links in the previous section, without target definitions. Specifically:
New Rule: Doc links are links of the form “[Name1]” or “[Name1.Name2]” to refer to exported identifiers in the current package, or “[pkg]”, “[pkg.Name1]”, or “[pkg.Name1.Name2]” to refer to identifiers in other packages.
For example, if the current package imports encoding/json, then “[json.Decoder]” can be written in place of “[encoding/json.Decoder]” to link to the docs for encoding/json's Decoder.
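One way to picture the resolution step is the sketch below. It is an illustration under stated assumptions rather than the proposed implementation: the docLink type and resolveDocLink function are hypothetical, the imports map (local package name to import path) is assumed to be supplied by the caller, and the sketch requires every identifier component to be exported, since only exported names have documentation to link to.

```go
package main

import (
	"fmt"
	"go/token"
	"strings"
)

// docLink describes where a doc link points. (Hypothetical type.)
type docLink struct {
	ImportPath string // "" means the current package
	Name       string // "Name1" or "Name1.Name2"; "" means the package itself
}

// resolveDocLink interprets the text between the square brackets.
// imports maps local package names to import paths, for example
// "json" -> "encoding/json".
func resolveDocLink(text string, imports map[string]string) (docLink, bool) {
	// Split at the first dot after the last slash, so that
	// "encoding/json.Decoder" yields "encoding/json" and "Decoder".
	slash := strings.LastIndex(text, "/")
	head, rest := text, ""
	if dot := strings.Index(text[slash+1:], "."); dot >= 0 {
		head, rest = text[:slash+1+dot], text[slash+1+dot+1:]
		if rest == "" {
			return docLink{}, false // trailing dot
		}
	}
	var names []string
	if rest != "" {
		names = strings.Split(rest, ".")
	}

	path := ""
	if strings.Contains(head, "/") {
		path = head // full import path written out
	} else if p, ok := imports[head]; ok {
		path = p // imported package referred to by its local name
	} else {
		names = append([]string{head}, names...) // name in the current package
	}

	if len(names) > 2 {
		return docLink{}, false
	}
	for _, n := range names {
		if !token.IsIdentifier(n) || !token.IsExported(n) {
			return docLink{}, false
		}
	}
	return docLink{ImportPath: path, Name: strings.Join(names, ".")}, true
}

func main() {
	imports := map[string]string{"json": "encoding/json"}
	for _, text := range []string{"Decoder", "json.Decoder", "encoding/json.Decoder", "json", "see below"} {
		link, ok := resolveDocLink(text, imports)
		fmt.Printf("[%s] -> %+v (doc link: %v)\n", text, link, ok)
	}
}
```

Bracketed text that does not resolve is simply reported as not being a doc link, so ordinary uses of square brackets are left alone.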
The implications and potential false positives of this implied URL link are presented by Joe Tsai here. In particular, the false positive rate appears to be low enough not to worry about.
To illustrate the need for the punctuation restriction, consider:
Transition: Older versions of Go will show the text verbatim, which is still fairly readable.
Formatting
Along with the changes, I suggest we add to go/doc a function
that reformats a doc comment in the conventional presentation, and then to have go/printer and gofmt invoke this formatter.
The formatter would canonicalize the input so that it renders exactly as before, but with the following properties:
The formatter would not reflow paragraphs, so as not to prohibit use of the semantic linefeeds convention.
This canonical formatting has the benefit for Markdown aficionados of being compatible with the Markdown equivalents. The output would still not be exactly Markdown, since various punctuation would not be (and does not need to be) escaped, but the block structure Go doc comments and Markdown have in common would be rendered as valid Markdown.
The current doc.ToHTML is given only the comment text and therefore cannot implement import-based links to other identifiers. To address this, we would need to add ToHTML and ToText methods to the Package type, and define that the top-level functions are as though calling the methods on a zero value of the struct. The ToHTML method will need to take a new Config struct that, at the least, allows specifying the URL prefix of the documentation server (for example,
There is an accepted proposal to add doc.ToMarkdown for easy conversion of Go doc comments to Markdown, and we would implement and update that as part of this work. It too would be added to the Package type.