New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intra Rustdoc Links #1946

Merged
merged 16 commits into from Jul 25, 2017

Conversation

@killercup
Member

killercup commented Mar 7, 2017

Add a notation how to create relative links in documentation comments (based on Rust item paths) and extend Rustdoc to automatically turn this into working links.

The important bits:

Additions to the documentation syntax

  1. [Iterator](std::iter::Iterator)
  2. [Iterator][iter], and somewhere else in the document: [iter]: std::iter::Iterator
  3. [Iterator] or [`Iterator`] with automatically generated link reference definition

All additions are valid CommonMark syntax.

Rendered

Revisions

March 9: Dropped <Iterator>-style links, added implied reference links (#1946 (comment))

May 28: Add more possible extensions, add more syntax for dealing with path ambiguities (#1946 (comment))

Intra Rustdoc Links
Add a notation how to create relative links in documentation comments
(based on Rust item paths) and extend Rustdoc to automatically turn this
into working links.
@GuillaumeGomez

This comment has been minimized.

Member

GuillaumeGomez commented Mar 7, 2017

Awesome! \o/

@fbstj

This comment has been minimized.

fbstj commented Mar 7, 2017

Bikeshed: would it be useful to have an additional <std::iter::Iterator> style syntax/marker that desugars to [Iterator](std::iter::Iterator) for convenience?

@killercup

This comment has been minimized.

Member

killercup commented Mar 7, 2017

an additional <std::iter::Iterator> style syntax/marker that desugars to [Iterator](std::iter::Iterator)

Hm, I'm wary of adding more cases, and even more so to add magical new syntax. I'd prefer to go with explicit links if you want to set the link text to something other than the full path.

(But: If Iterator is in scope, <Iterator> will be a valid link!)

@sgrif

This comment has been minimized.

Contributor

sgrif commented Mar 7, 2017

This will need a way to disambiguate between functions, traits, etc. It's completely valid to have a struct and function with the same name. Perhaps we could use URIs for this? function://path::to::fn, struct://path::to::struct, etc.

Also should this specify whether super works or not? Are the paths always relative to the crate root? Is there ever a case where it would matter whether the link is based on where the struct appears in the documentation vs where it was defined?

@kennytm

This comment has been minimized.

Member

kennytm commented Mar 7, 2017

Previous discussions 🚲

  • #1661 (Relative link syntax for rustdoc, closed)
    • <::/std/cmp/trait.Ord.html>
  • #792 (Links to Rust items in documentation text, active issue)
    • <<<std::cmp::Ord>>
  • https://internals.rust-lang.org/t/2720 (Pre-RFC: standard form to reference things)
    • {std::cmp::Ord#cmp}
    • [cmp]{@link std::cmp::Ord.cmp}
    • <rustdoc:std::cmp::Ord>
    • <rustdoc://std::cmp::Ord>
    • <rust:std::cmp::Ord>
    • ` std::cmp::Ord` (note leading space)
    • [std::cmp::Ord]
  • https://internals.rust-lang.org/t/968 (Rustdoc: Link to other types from doc comments)
    • ``std::cmp::Ord``
    • <rust://std/cmp/Ord.trait>
@frewsxcv

This comment has been minimized.

Member

frewsxcv commented Mar 7, 2017

Relevant rust-lang/rust issues: rust-lang/rust#36417, rust-lang/rust#38400

@kennytm

This comment has been minimized.

Member

kennytm commented Mar 7, 2017

@sgrif It only needs to distinguish between values (functions, constants) and types (structs, enums, traits, modules, type-aliases, primitives). Maybe:

  • <syntax::ptr::P> — First search for type-like items. If not found, search for value-like items
  • <syntax::ptr::P()>Only search for functions.
  • <std::stringify!>Only search for macros.
@killercup

This comment has been minimized.

Member

killercup commented Mar 7, 2017

This will need a way to disambiguate between functions, traits, etc.

Oh, I had completely missed that this is valid (damn you, Rust!). Is there are an existing way to disambiguate this in paths? Or is it always resolved by the position it appears in?

Playing around on the playpen a bit, it seems that (assuming all have the same name):

  • unit and tuple structs conflict with functions
  • unit and tuple structs conflict with enums
  • regular structs, enums, and traits conflict with each other (but not functions)

I think @kennytm idea with a () suffix to disambiguate functions (#1946 (comment)) works well. I propose it is only needed when the non-suffixed link is actually ambiguous. (I assume most code bases follow Rust's style lints and use snake_case functions names and PascalCase for type/trait names.)
Macros should alway use ! suffix.

Are the paths always relative to the crate root?

Good point, that's totally missing from the RFC. Path are relative to the position of item in whose documentation they appear. (See complex example for relative paths.)

Previous discussions

Thanks, @kennytm! Reading #1661 (comment), it only feels right to cc @JohnHeitmann.

@frewsxcv

This comment has been minimized.

Member

frewsxcv commented Mar 7, 2017

There is talk about extending CommonMark to support syntactic extensions, but it looks like progress has stalled a bit.

If we're going to design our own hyperlink destinations (as proposed in this RFC and the comments above), I'd strongly recommend staying within the bounds of what is acceptable for a "link destination" as defined by CommonMark.

@kennytm

This comment has been minimized.

Member

kennytm commented Mar 7, 2017

How will cross-crate links be handled? Like,

  • linking from num-traits to <std::u32> (linking to standard library)
  • linking from num-complex to <num_traits::One> (crates in the same workspace)
  • linking from num-bigint to <serde::Serialize> (third-party crates, and only linked when a feature is defined)

BTW there is also this thread on talk.commonmark.org discussing the issue from the Markdown side, but it is pretty inconclusive (one suggestion to api:std.cmp.Ord and one to {std::cmp::Ord}).

@killercup

This comment has been minimized.

Member

killercup commented Mar 7, 2017

How will cross-crate links be handled?

@kennytm, rustdoc is already able to generate links to other crates' types, and I would assume we can use this mechanism. For std (and core), it links to an official URL, e.g. https://doc.rust-lang.org/nightly/core/fmt/trait.Debug.html. For other external crates, it links to the docs it generated for the crate itself.

Update: Sorry, I totally missed

linking from num-bigint to serde::Serialize (third-party crates, and only linked when a feature is defined)

I'm not sure how to approach this best. I'd say we don't generate the link and issue a warning. Rendered docs should probably always contain all possible features. It will be difficult to do this when features are exclusive, i.e. you can't use both serde_json and json. This seems to be an unusual practice, though.

Update intra_rustdoc_links
- Resolving paths
- Path ambiguities
- Linking to external crates
@killercup

This comment has been minimized.

Member

killercup commented Mar 7, 2017

Updated the RFC with sections on

  • Resolving paths
  • Path ambiguities
  • Linking to external crates

I don't think we need to use a potential Markdown extension for this.

If we're going to design our own hyperlink destinations (as proposed in this RFC and the comments above), I'd strongly recommend staying within the bounds of what is acceptable for a "link destination" as defined by CommonMark.
#1946 (comment)

I think Rust paths already comply with that. Even the () suffix should not be problem, as the fall into part (b) of:

[…] includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses […]
http://spec.commonmark.org/0.27/#link-destination

Update intra_rustdoc_links
- Mention this is valid Markdown
- Add more alternatives

@killercup killercup force-pushed the killercup:intra-rustdoc-links branch from cacaa79 to 2dd1ed2 Mar 7, 2017

@steveklabnik

This comment has been minimized.

Member

steveklabnik commented Mar 7, 2017

/cc @jonathandturner who has been working on a rough rustdoc re-write

@durka

This comment has been minimized.

Contributor

durka commented Mar 7, 2017

Good point, that's totally missing from the RFC. Path are relative to the position of item in whose documentation they appear. (See complex example for relative paths.)

This is a problem for docs that get cross-included by Deref.

@killercup

This comment has been minimized.

Member

killercup commented Mar 7, 2017

This is a problem for docs that get cross-included by Deref. (#1946 (comment))

Can you give an example? I found serde::bytes::ByteBuf which has a Deref impl, and the docs from trait Deref itself are rendered.

Including docs from traits should lead to the docs being evaluated relative to the trait item they are defined on, i.e. relative to core::ops::Deref, not serde::bytes::ByteBuf.

@nrc

This comment has been minimized.

Member

nrc commented Mar 7, 2017

@killercup, @sgrif, @kennytm

This will need a way to disambiguate between functions, traits, etc. It's completely valid to have a struct and function with the same name. Perhaps we could use URIs for this? function://path::to::fn, struct://path::to::struct, etc.

It only needs to distinguish between values (functions, constants) and types (structs, enums, traits, modules, type-aliases, primitives)

Rust has three namespaces - values, types, and macros. Everything is in one of those three and you don't need any more namespaces than that (it is a bit more complicated because some things can be in more than one namespace, tuple structs, iirc can be used both as type names and as function names). There are already alternate URLs in Rustdoc that use this scheme, unfortunately via redirects (I would love to deprecate the crazy ones with trait etc in the name, but there were back compat worries), these are easy to synthesise, e.g., https://doc.rust-lang.org/nightly/std/path/Path.t.html for https://doc.rust-lang.org/nightly/std/path/struct.Path.html, so you don't need to know the random rustdoc categorisation of a thing, only its namespace.

@steveklabnik

This comment has been minimized.

Member

steveklabnik commented Mar 7, 2017

Thanks @killercup ! Here's some thoughts....

  1. For me, the priority here is to make sure that we're within regular Markdown rules. It appears that we are. 👍
  2. A significant proposal from before that doesn't seem to be addressed here is a rust:// URI. Putting this in alternatives at least should happen; it's not clear to me that we shouldn't do this.
  3. In general, an RFC like this is largely split between two things: how do I define the links, and how does rustdoc actually figure them out. The former is definitely RFC-worthy for me, but I wonder if the latter can't be avoided as some sort of "you use the canonical path for the item linking to", and then free rustdoc to figure that out itself.
  4. In contradiction to point 4, "how does rustdoc figure this out" is a pretty significant pain point, and the hardest part of the RFC, IMHO.
  5. @jonathandturner and I (among others at various times) have wondered if rustdoc in the future shouldn't be built on top of RLS. If so, that split would mean "rustodc proper asks RLS what the URL is", I think.
  6. thoughts about migrating to this new style might be helpful, I wonder how hard it would be to create a tool to convert existing links like this.

That's all (lol, I feel like it's a lot) I've got at the moment, I'm sure I'll have more eventually...

@durka

This comment has been minimized.

Contributor

durka commented Mar 7, 2017

@killercup

For example, there is a link from str::len to char, but this link is also present (and broken) on the String page because it is a relative link. There are open issues about this.

It's definitely solvable, but it could require "absolute" (i.e. crate-relative) paths in some places.

@nrc

This comment has been minimized.

Member

nrc commented Mar 8, 2017

If so, that split would mean "rustodc proper asks RLS what the URL is", I think.

FWIW, the RLS already creates concrete URLs from paths like this so that when you have a reference to something from std, you can go to the rustdoc page. It might not work exactly how you need it to for this application though.

@kennytm

This comment has been minimized.

Member

kennytm commented Mar 8, 2017

How should Rustdoc know that <P> is an HTML tag or a link to a P item?

@jonathandturner

This comment has been minimized.

Contributor

jonathandturner commented Mar 8, 2017

That's great!

Thanks for the ping. I think for a "rustdoc 2.0" idea, we're still a little ways off, yet. Yeah, it will probably consume metadata rather than trying to work against the compiler directly, but there's still a fair bit of research we need to do to explore what we can do there. That said, I'm currently doing the rls stuff, and what we're doing there will help make a "rustdoc 2.0" possible.

I say go for it and keep improving this rustdoc. It just sets the bar that much higher for the next one (which is a good thing!)

@eddyb

This comment has been minimized.

Member

eddyb commented Mar 8, 2017

@jonathandturner IMO, instead of trying to serialize piecewise, we could have sources for everything, which combined with warm incremental recompilation caches would allow introspecting entire crates.

Even if we don't get incremental reparsing, we could still skip parsing and expansion for, say, a libstd that was compiled once and distributed, so you get almost instant access to the HIR and type information.
To keep it slim, we could drop anything that's not exported (i.e. function bodies' MIR) from the caches.

@solson

This comment has been minimized.

Member

solson commented Mar 8, 2017

  1. [Iterator][iter], and somewhere else in the document: [iter]: std::iter::Iterator

@killercup Markdown has another trick that would make this even sweeter at the use sites:

[Iterator], and somewhere else in the document: [Iterator]: std::iter::Iterator

@llogiq

This comment has been minimized.

Contributor

llogiq commented Mar 8, 2017

The 'easy' alternative is to auto-link all items in scope, as in [String]. This is trivially supported by pulldown-cmark with extending the links map.

@killercup

This comment has been minimized.

Member

killercup commented Jun 19, 2017

@est31

This comment has been minimized.

Contributor

est31 commented Jun 19, 2017

for many primitives there are also types with the same name (stuff like i32)

i32 there are two doc pages, one for a module std::i32, and one for the primitive type. There has been talk in the lang team to get rid of the module and replace it with associated consts (cc @eddyb ), so in future there might only be the primitive type.

@ollie27

This comment has been minimized.

ollie27 commented Jun 19, 2017

I'm pretty sure rustc special cases the primitive types, I know rustdoc does. rustdoc already has code specifically for generating links to the primitive types so I doubt there will be any issues.

As long as there are no types or module named i32 in scope is seems reasonable for [i32] to link to the primitive page.

@nrc

This comment has been minimized.

Member

nrc commented Jun 20, 2017

@rfcbot fcp cancel

@rfcbot

This comment has been minimized.

rfcbot commented Jun 20, 2017

@nrc proposal cancelled.

@nrc

This comment has been minimized.

Member

nrc commented Jun 20, 2017

@rfcbot fcp merge

@rfcbot

This comment has been minimized.

rfcbot commented Jun 20, 2017

Team member @nrc has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@japaric

This comment has been minimized.

Member

japaric commented Jun 21, 2017

@rfcbot reviewed

2 similar comments
@michaelwoerister

This comment has been minimized.

michaelwoerister commented Jun 22, 2017

@rfcbot reviewed

@QuietMisdreavus

This comment has been minimized.

Member

QuietMisdreavus commented Jun 22, 2017

@rfcbot reviewed

@nrc

This comment has been minimized.

Member

nrc commented Jul 10, 2017

ping @brson for fcp approval

@rfcbot

This comment has been minimized.

rfcbot commented Jul 13, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot

This comment has been minimized.

rfcbot commented Jul 23, 2017

The final comment period is now complete.

@nrc nrc referenced this pull request Jul 25, 2017

Open

Tracking issue for RFC 1946 - intra-rustdoc links #43466

12 of 17 tasks complete

@nrc nrc merged commit 6af994d into rust-lang:master Jul 25, 2017

@colin-kiegel colin-kiegel referenced this pull request Aug 31, 2017

Closed

September 2017 #37

10 of 10 tasks complete
@killercup

This comment has been minimized.

Member

killercup commented Sep 6, 2017

Just came up:

`[Iterator]`

should probably not be rendered as a link inside a code block.

@Manishearth

This comment has been minimized.

Member

Manishearth commented Jan 10, 2018

We're proposing some amendments in #2285

bors added a commit to rust-lang/rust that referenced this pull request Jan 22, 2018

Auto merge of #47046 - Manishearth:intra-doc-links, r=eddyb,Guillaume…
…Gomez,QuietMisdreavus,Manishearth

Implement RFC 1946 - intra-rustdoc links

rust-lang/rfcs#1946 #43466

Note for reviewers: The plain line counts are a little inflated because of how the markdown link parsing was done. [Read the file diff with "whitespace only" changes removed](https://github.com/rust-lang/rust/pull/47046/files?w=1) to get a better view of what actually changed there.

This pulls the name/path resolution mechanisms out of the compiler and runs it on the markdown in a crate's docs, so that links can be made to `SomeStruct` directly rather than finding the folder path to `struct.SomeStruct.html`. Check the `src/test/rustdoc/intra-paths.rs` test in this PR for a demo. The change was... a little invasive, but unlocks a really powerful mechanism for writing documentation that doesn't care about where an item was written to on the hard disk.

Items included:

 - [x] Make work with the hoedown renderer
 - [x] Handle relative paths
 - [x] Parse out the "path ambiguities" qualifiers (`[crate foo]`, `[struct Foo]`, `[foo()]`, `[static FOO]`, `[foo!]`, etc)
 - [x] Resolve foreign macros
 - [x] Resolve local macros
 - [x] Handle the use of inner/outer attributes giving different resolution scopes (handling for non-modules pushed to different PR)

Items not included:

 - [ ] Make sure cross-crate inlining works (blocked on refactor described in #47046 (comment))
 - [ ] Implied Shortcut Reference Links (where just doing `[::std::iter::Iterator][]` without a reference anchor will resolve using the reference name rather than the link target) (requires modifying the markdown parser - blocked on Hoedown/Pulldown switch and raphlinus/pulldown-cmark#121)
 - [ ] Handle enum variants and UFCS methods (Enum variants link to the enum page, associated methods don't link at all)
 - [ ] Emit more warnings/errors when things fail to resolve (linking to a value-namespaced item without a qualifier will emit an error, otherwise the link is just treated as a url, not a rust path)
 - [ ] Give better spans for resolution errors (currently the span for the first doc comment is used)
 - [ ] Check for inner doc comments on things that aren't modules

I'm making the PR, but it should be noted that most of the work was done by Misdreavus 😄

(Editor's note: This has become a lie, check that commit log, Manish did a ton of work after this PR was opened `>_>`)

bors added a commit to rust-lang/rust that referenced this pull request Jan 22, 2018

Auto merge of #47046 - Manishearth:intra-doc-links, r=eddyb,Guillaume…
…Gomez,QuietMisdreavus,Manishearth,killercup

Implement RFC 1946 - intra-rustdoc links

rust-lang/rfcs#1946 #43466

Note for reviewers: The plain line counts are a little inflated because of how the markdown link parsing was done. [Read the file diff with "whitespace only" changes removed](https://github.com/rust-lang/rust/pull/47046/files?w=1) to get a better view of what actually changed there.

This pulls the name/path resolution mechanisms out of the compiler and runs it on the markdown in a crate's docs, so that links can be made to `SomeStruct` directly rather than finding the folder path to `struct.SomeStruct.html`. Check the `src/test/rustdoc/intra-paths.rs` test in this PR for a demo. The change was... a little invasive, but unlocks a really powerful mechanism for writing documentation that doesn't care about where an item was written to on the hard disk.

Items included:

 - [x] Make work with the hoedown renderer
 - [x] Handle relative paths
 - [x] Parse out the "path ambiguities" qualifiers (`[crate foo]`, `[struct Foo]`, `[foo()]`, `[static FOO]`, `[foo!]`, etc)
 - [x] Resolve foreign macros
 - [x] Resolve local macros
 - [x] Handle the use of inner/outer attributes giving different resolution scopes (handling for non-modules pushed to different PR)

Items not included:

 - [ ] Make sure cross-crate inlining works (blocked on refactor described in #47046 (comment))
 - [ ] Implied Shortcut Reference Links (where just doing `[::std::iter::Iterator][]` without a reference anchor will resolve using the reference name rather than the link target) (requires modifying the markdown parser - blocked on Hoedown/Pulldown switch and raphlinus/pulldown-cmark#121)
 - [ ] Handle enum variants and UFCS methods (Enum variants link to the enum page, associated methods don't link at all)
 - [ ] Emit more warnings/errors when things fail to resolve (linking to a value-namespaced item without a qualifier will emit an error, otherwise the link is just treated as a url, not a rust path)
 - [ ] Give better spans for resolution errors (currently the span for the first doc comment is used)
 - [ ] Check for inner doc comments on things that aren't modules

I'm making the PR, but it should be noted that most of the work was done by Misdreavus 😄

(Editor's note: This has become a lie, check that commit log, Manish did a ton of work after this PR was opened `>_>`)

bors added a commit to rust-lang/rust that referenced this pull request Jan 23, 2018

Auto merge of #47046 - Manishearth:intra-doc-links, r=eddyb,Guillaume…
…Gomez,QuietMisdreavus,Manishearth

Implement RFC 1946 - intra-rustdoc links

rust-lang/rfcs#1946 #43466

Note for reviewers: The plain line counts are a little inflated because of how the markdown link parsing was done. [Read the file diff with "whitespace only" changes removed](https://github.com/rust-lang/rust/pull/47046/files?w=1) to get a better view of what actually changed there.

This pulls the name/path resolution mechanisms out of the compiler and runs it on the markdown in a crate's docs, so that links can be made to `SomeStruct` directly rather than finding the folder path to `struct.SomeStruct.html`. Check the `src/test/rustdoc/intra-paths.rs` test in this PR for a demo. The change was... a little invasive, but unlocks a really powerful mechanism for writing documentation that doesn't care about where an item was written to on the hard disk.

Items included:

 - [x] Make work with the hoedown renderer
 - [x] Handle relative paths
 - [x] Parse out the "path ambiguities" qualifiers (`[crate foo]`, `[struct Foo]`, `[foo()]`, `[static FOO]`, `[foo!]`, etc)
 - [x] Resolve foreign macros
 - [x] Resolve local macros
 - [x] Handle the use of inner/outer attributes giving different resolution scopes (handling for non-modules pushed to different PR)

Items not included:

 - [ ] Make sure cross-crate inlining works (blocked on refactor described in #47046 (comment))
 - [ ] Implied Shortcut Reference Links (where just doing `[::std::iter::Iterator][]` without a reference anchor will resolve using the reference name rather than the link target) (requires modifying the markdown parser - blocked on Hoedown/Pulldown switch and raphlinus/pulldown-cmark#121)
 - [ ] Handle enum variants and UFCS methods (Enum variants link to the enum page, associated methods don't link at all)
 - [ ] Emit more warnings/errors when things fail to resolve (linking to a value-namespaced item without a qualifier will emit an error, otherwise the link is just treated as a url, not a rust path)
 - [ ] Give better spans for resolution errors (currently the span for the first doc comment is used)
 - [ ] Check for inner doc comments on things that aren't modules

I'm making the PR, but it should be noted that most of the work was done by Misdreavus 😄

(Editor's note: This has become a lie, check that commit log, Manish did a ton of work after this PR was opened `>_>`)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment