Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support intra-doc links on cross-crate reexports #65983

Closed
Manishearth opened this issue Oct 30, 2019 · 27 comments · Fixed by #73101
Closed

Support intra-doc links on cross-crate reexports #65983

Manishearth opened this issue Oct 30, 2019 · 27 comments · Fixed by #73101
Labels
A-cross-crate-reexports Area: Documentation that has been re-exported from a different crate A-intra-doc-links Area: Intra-doc links, the ability to link to items in docs by name C-enhancement Category: An issue proposing an enhancement or a PR with one. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@Manishearth
Copy link
Member

Manishearth commented Oct 30, 2019

Previously: #60883, #59833

This is the last major issue blocking the stabilization of intra doc links (#43466)

Currently, if you use intra-doc links on some item:

// crate "foo"

/// This is not [Baz]
pub struct Bar;

pub struct Baz;

the docs will not turn up on any crate reexporting it

pub use foo::Bar;

The docs will have a bare [Baz] instead of the appropriate intra-doc link, because we resolve intra-doc links locally. (More examples in #60883, #59833)


This is tricky to fix; we throw away module resolution information before packaging up crate metadata, so it's not possible to "unseal" a crate and rerun the resolution within it. One solution would be to pre-detect intra doc links and store resolutions in metadata, but this would be expensive since it involves parsing everything as markdown unconditionally.

A better solution is to save resolution information in the metadata, as some kind of tree keyed to the module.

@eddyb's comment outlines the correct approach to use here

What you actually want to persist cross-crate is the scope information that lets you resolve names in a certain (module) scope, after the fact. It'd also allow not doing all the the resolver hacks.

And there's a very good reason to want it in the compiler, at least same-crate: we could (finally) tell if a type is in scope somewhere so we can actually print Vec<_> instead of std::vec::Vec<_> etc.

Basically, instead of throwing away resolver metadata, we should serialize it into some form and store it in the metadata, so that it can be reloaded as necessary.

@Manishearth Manishearth added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. A-intra-doc-links Area: Intra-doc links, the ability to link to items in docs by name labels Oct 30, 2019
@kinnison
Copy link
Contributor

I'm interested in helping on this, but right now I'm not confident in the requisite parts of the compiler. I will take a poke around, but please don't let that stop someone who actually knows what they're doing getting it done :D

@eddyb
Copy link
Member

eddyb commented Oct 31, 2019

At this point I really wish rustdoc always took a set of crates' sources and never relied on metadata from rlibs.
It would make so many things easier, including this.

@kinnison
Copy link
Contributor

kinnison commented Nov 3, 2019

Okay, I've spent a bit of time trying to find my way around the libraries which comprise the compiler, looking for things associated with the metadata and/or name resolution. I'm starting to get a feel for things, but I'm still not certain I know quite what I'm trying to achieve. Since this is E-hard I wasn't expecting to work it out right away fortunately, and I'm prepared to keep learning, but it'd be incredibly helpful if someone were prepared to help point me toward the right places to be learning about. Once I feel like I've done enough basic learning, I'll probably post a rough outline of what work I think needs doing to complete this (kinda as an expansion of Manish's original description) and then ask for review of that.

@Manishearth or @eddyb I'm guessing would be best to ask for that review, so this is me pre-asking to see if the above approach would work for either of you?

@Manishearth
Copy link
Member Author

This would work for me, I think @eddyb is far more familiar with some of the stuff here than I am.

librustc_resolve contains a resolution tree that effectively needs serializing, but ideally a reduced form of it.

@eddyb
Copy link
Member

eddyb commented Nov 6, 2019

Probably the best way to do this would be in two steps:

  1. come up with some data structure to place in librustc
    • it only needs to handle being from the same crate, to begin with
    • librustc_resolve would export this, and librustdoc would use it to resolve (some?) links, potentially removing the need for "late Resolver usage" in librustdoc
    • @petrochenkov is most likely to know how and what to exfiltrate from librustc_resolve, I can only guess what parts are relevant
  2. once you have that, you can make librustc_metadata encode/decode it
    • the actual serialization is based on #[derive]s so all you really need to do is add the plumbing around that
    • this is the part I can actually help with

@Manishearth
Copy link
Member Author

come up with some data structure to place in librustc

Basically needs to be a map of module DefId to all contents (imported and otherwise) by name.

Something like HashMap<DefId, ModuleNames>, struct ModuleName {types: HashMap<Ident, DefId>, values: .., macros: ..}

idk how heavy that would be, or how much work that would be to compute

@kinnison
Copy link
Contributor

kinnison commented Nov 6, 2019

That all sounds super-useful. I guess I need to start learning what a DefId is etc. properly :D

@eddyb
Copy link
Member

eddyb commented Nov 6, 2019

@Manishearth Hmm, sounds like the way ExportMap works, except not just for exports. I thought rustdoc would need something more fine-grained than mods, but if not that's great.

EDIT: just realized after sending my comment, that since only exports from modules are documented, only module-level imports would be relevant.
Overall this seems pretty easy, I wonder what's the worst part. Glob imports?

@Manishearth
Copy link
Member Author

Manishearth commented Nov 6, 2019

@eddyb precisely.

Also, currently we don't do inner imports anyway, the following will not work (except for modules):

fn foo() {
   //! [baz]
   use bar::baz;
}

It's debatable if we want that to work, and in that case we can use local resolution though i'd prefer if we only used one way of resolving.

Maybe we can just make ExportMap contain all the module level items, with a pub/priv tag?

Of course this doesn't 100% solve your other problem of being able to spew scoped diagnostics, but it's a step.

@kinnison
Copy link
Contributor

kinnison commented Nov 9, 2019

I have been thinking quite hard about this and I wonder if it might be sufficient to detect any [...] sequence in a docstring (should be cheap to do) and then do the path resolution work inside the compiler rather than deferring to when rustdoc begins? It would then be able to append (or prepend) reference resolution lines to the doc block, thereby not needing any changes to the serialisation or resolver etc.

The argument @Manishearth made in the original posting that full parsing of the markdown would be expensive wouldn't have to happen for every docstring, only ones which contain the [ character, and we could blithely create reference resolution lines for any [...] in the text without consideration of markdown if we wanted to be even faster.

I appreciate this really wouldn't help @eddyb but it strikes me as a potentially much simpler pathway to completion of this task.

cc @GuillaumeGomez

@Manishearth
Copy link
Member Author

that ... could work!

i would prefer we do the harder thing, but this would let us ship.

However, note that this means we can no longer emit warnings for broken links since we're not sure which things are meant to be intra doc links without preparsing markdown

@kinnison
Copy link
Contributor

Oh good point about the lint/warning. Hmm. I guess that we really do need that independent lookaside unless we start doing daft things around embedded nulls in the docs to separate input from autogenerated resolutions, and that would be HORRIBLE. I'll continue to think about possibly simpler solutions while also trying to continue to learn enough to do the "right" thing instead.

@JohnTitor JohnTitor added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Feb 2, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 20, 2020
Update pattern docs.

A few changes to help clarify string pattern usage:

* Add some examples and stability information in the `pattern` module.
* Fixes the links at https://doc.rust-lang.org/std/str/pattern/ because intra-doc-links don't work with re-exported modules (rust-lang#65983 I think?).
* Consistently use the same phrasing for `str` methods taking a pattern.
    * Also mention that array of `char` is also accepted.

When `Pattern` is stabilized, the phrasing in the `str` methods can be updated to be more general to reflect the exact behavior. I'm reluctant to do this now because the stability story for `Pattern` is uncertain. It may perhaps look something like:

> The pattern can be any type that implements the [`Pattern`] trait. Notable examples are `&str`, [`char`], arrays of [`char`], or functions or closures that determines if a character matches. Additional libraries might provide more complex patterns like regular expressions.

This is complicated because methods like `trim_matches` have bounds, which for example don't support `str`, so those methods may need more elaboration.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 20, 2020
Update pattern docs.

A few changes to help clarify string pattern usage:

* Add some examples and stability information in the `pattern` module.
* Fixes the links at https://doc.rust-lang.org/std/str/pattern/ because intra-doc-links don't work with re-exported modules (rust-lang#65983 I think?).
* Consistently use the same phrasing for `str` methods taking a pattern.
    * Also mention that array of `char` is also accepted.

When `Pattern` is stabilized, the phrasing in the `str` methods can be updated to be more general to reflect the exact behavior. I'm reluctant to do this now because the stability story for `Pattern` is uncertain. It may perhaps look something like:

> The pattern can be any type that implements the [`Pattern`] trait. Notable examples are `&str`, [`char`], arrays of [`char`], or functions or closures that determines if a character matches. Additional libraries might provide more complex patterns like regular expressions.

This is complicated because methods like `trim_matches` have bounds, which for example don't support `str`, so those methods may need more elaboration.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 20, 2020
Update pattern docs.

A few changes to help clarify string pattern usage:

* Add some examples and stability information in the `pattern` module.
* Fixes the links at https://doc.rust-lang.org/std/str/pattern/ because intra-doc-links don't work with re-exported modules (rust-lang#65983 I think?).
* Consistently use the same phrasing for `str` methods taking a pattern.
    * Also mention that array of `char` is also accepted.

When `Pattern` is stabilized, the phrasing in the `str` methods can be updated to be more general to reflect the exact behavior. I'm reluctant to do this now because the stability story for `Pattern` is uncertain. It may perhaps look something like:

> The pattern can be any type that implements the [`Pattern`] trait. Notable examples are `&str`, [`char`], arrays of [`char`], or functions or closures that determines if a character matches. Additional libraries might provide more complex patterns like regular expressions.

This is complicated because methods like `trim_matches` have bounds, which for example don't support `str`, so those methods may need more elaboration.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 20, 2020
Update pattern docs.

A few changes to help clarify string pattern usage:

* Add some examples and stability information in the `pattern` module.
* Fixes the links at https://doc.rust-lang.org/std/str/pattern/ because intra-doc-links don't work with re-exported modules (rust-lang#65983 I think?).
* Consistently use the same phrasing for `str` methods taking a pattern.
    * Also mention that array of `char` is also accepted.

When `Pattern` is stabilized, the phrasing in the `str` methods can be updated to be more general to reflect the exact behavior. I'm reluctant to do this now because the stability story for `Pattern` is uncertain. It may perhaps look something like:

> The pattern can be any type that implements the [`Pattern`] trait. Notable examples are `&str`, [`char`], arrays of [`char`], or functions or closures that determines if a character matches. Additional libraries might provide more complex patterns like regular expressions.

This is complicated because methods like `trim_matches` have bounds, which for example don't support `str`, so those methods may need more elaboration.
@Manishearth
Copy link
Member Author

Manishearth commented May 26, 2020

We should see if we can close this out, this is basically the one thing blocking us from stabilizing intra-doc-links, and folks have been asking for this feature for years.

There are roughly three ways to implement this:

  • librustc_resolve emits a module-imports map -- a structure mapping each module to a list of its imports. this gets stored in the metadata, we consume this in rustdoc
  • the rustc parser actually parses markdown in doc comments, stashes away all the unresolved paths, and librustc_resolve fixes them up with def ids, which get stored in
  • rustdoc itself has its own form of metadata, and when it does intra-doc-link resolution it stashes the metadata somewhere

The last option is the one that appeals to me most because it has zero perf impact, but I don't know how best to do it architecturally (perhaps just a json file in doc/ somewhere? idk how best to invalidate it). The first option is probably the easiest.

@rust-lang/compiler @rust-lang/rustdoc do y'all have opinions on how best to do this? Do you think adding some form of metadata to rustdoc itself is doable?

@kinnison
Copy link
Contributor

For anyone reading this, I ended up unable to continue with this work, so the job is still entirely open for anyone to take.

@nikomatsakis
Copy link
Contributor

Huh, interesting. This feels related to our long-standing efforts to extact name resolution into a shareable code based that both rust-analyzer and rustc can make use of. Not that we've made much progress on that.

@Manishearth
Copy link
Member Author

@jyn514 this is still not the actual issue at hand. it's part of it, but the actual issue involves multiple crates, and the documentation being resolved needs to come from the dependency crate. This is not about documentation on reexports itself.

Here are some test cases that should work:

//crate a

pub struct Bar;

/// Link to [Bar]
pub struct Foo;

// crate b
pub use a::Foo;

and the more complicated case of

// crate a

pub mod bar {
   pub struct Bar;
}

pub mod foo {
  use crate::bar;
  /// link to [bar::Bar]
  pub struct Foo;
}

// crate b

pub use a::foo::Foo;

Part of the issue is indeed fixing things so that reexports within a crate resolve correctly, but that's not all.

Manishearth added a commit to Manishearth/rust that referenced this issue Jul 16, 2020
…anishearth

Resolve items for cross-crate imports relative to the original module

~~Blocked on rust-lang#73103 and rust-lang#73566

Closes rust-lang#65983.

I tested on the following code (as mentioned in rust-lang#65983 (comment)):

```
pub use rand::Rng;
```
and rustdoc generated the following link: https://rust-random.github.io/rand/rand_core/trait.RngCore.html
Manishearth added a commit to Manishearth/rust that referenced this issue Jul 16, 2020
…anishearth

Resolve items for cross-crate imports relative to the original module

~~Blocked on rust-lang#73103 and rust-lang#73566

Closes rust-lang#65983.

I tested on the following code (as mentioned in rust-lang#65983 (comment)):

```
pub use rand::Rng;
```
and rustdoc generated the following link: https://rust-random.github.io/rand/rand_core/trait.RngCore.html
Manishearth added a commit to Manishearth/rust that referenced this issue Jul 16, 2020
…anishearth

Resolve items for cross-crate imports relative to the original module

~~Blocked on rust-lang#73103 and rust-lang#73566

Closes rust-lang#65983.

I tested on the following code (as mentioned in rust-lang#65983 (comment)):

```
pub use rand::Rng;
```
and rustdoc generated the following link: https://rust-random.github.io/rand/rand_core/trait.RngCore.html
Manishearth added a commit to Manishearth/rust that referenced this issue Jul 17, 2020
…anishearth

Resolve items for cross-crate imports relative to the original module

~~Blocked on rust-lang#73103 and rust-lang#73566

Closes rust-lang#65983.

I tested on the following code (as mentioned in rust-lang#65983 (comment)):

```
pub use rand::Rng;
```
and rustdoc generated the following link: https://rust-random.github.io/rand/rand_core/trait.RngCore.html
@bors bors closed this as completed in ec93d56 Jul 17, 2020
jyn514 added a commit to jyn514/rust that referenced this issue Aug 29, 2020
 rust-lang#58972 ignored extern_traits because before rust-lang#65983 was fixed, they
would always fail to resolve, giving spurious warnings.
This undoes that change, so extern traits are now seen by the
`collect_intra_doc_links` pass. There are also some minor changes in
librustdoc/fold.rs to avoid borrowing the extern_traits RefCell more
than once at a time.
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 30, 2020
…trochenkov

Fix intra-doc links for cross-crate re-exports of default trait methods

The original fix for this was very simple: rust-lang#58972 ignored `extern_traits` because before rust-lang#65983 was fixed, they would always fail to resolve, giving spurious warnings. So the first commit just undoes that change, so extern traits are now seen by the `collect_intra_doc_links` pass. There are also some minor changes in `librustdoc/fold.rs` to avoid borrowing the `extern_traits` RefCell more than once at a time.

However, that brought up a much more thorny problem. `rustc_resolve` started giving 'error: cannot find a built-in macro with name `cfg`' when documenting `libproc_macro` (I still haven't been able to reproduce on anything smaller than the full standard library). The chain of events looked like this (thanks @eddyb for the help debugging!):

0. `x.py build --stage 1` builds the standard library and creates a sysroot
1. `cargo doc` does something like `cargo check` to create `rmeta`s for all the crates (unrelated to what was built above)
2. the `cargo check`-like `libcore-*.rmeta` is loaded as a transitive dependency *and claims ownership* of builtin macros
3. `rustdoc` later tries to resolve some path in a doc link
4. suggestion logic fires and loads "extern prelude" crates by name
5. the sysroot `libcore-*.rlib` is loaded and *fails to claim ownership* of builtin macros

`rustc_resolve` gives the error after step 5. However, `rustdoc` doesn't need suggestions at all - `resolve_str_path_error` completely discards the `ResolutionError`! The fix implemented in this PR is to skip the suggestion logic for `resolve_ast_path`: pass `record_used: false` and skip `lookup_import_candidates` when `record_used` isn't set.

It's possible that if/when rust-lang#74207 is implemented this will need a more in-depth fix which returns a `ResolutionError` from `compile_macro`, to allow rustdoc to reuse the suggestions from rustc_resolve. However, that's a much larger change and there's no need for it yet, so I haven't implemented it here.

Fixes rust-lang#73829.

r? @GuillaumeGomez
@jyn514 jyn514 added the A-cross-crate-reexports Area: Documentation that has been re-exported from a different crate label Aug 30, 2020
bors bot added a commit to crossbeam-rs/crossbeam that referenced this issue Sep 6, 2020
559: Use intra-doc links r=stjepang a=taiki-e

Switch to use [intra-doc links](rust-lang/rfcs#1946) in all crates. Previously there was a big bug on cross crate re-exports (rust-lang/rust#65983), but it has been fixed, so we can use this in crossbeam.

This also adds checks of the docs to CI.

560: Use collect::<Box<[_]>>() directly in ShardedLock::new() r=stjepang a=taiki-e



Co-authored-by: Taiki Endo <te316e89@gmail.com>
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Sep 24, 2020
…Manishearth

Stabilize intra-doc links

Fixes rust-lang#43466

Thanks to the great work of @jyn514 in getting the [cross-crate reexport issue](rust-lang#65983) in intra-rustdoc links fixed, I think we're now in a position to stabilize this feature.

The tracking issue currently has two unresolved issues:

 - <s>behavior around doc(hidden): This is fixed in rust-lang#73365, which is just waiting for CI and should land tomorrow. It's also a pretty niche bug so while I expect it to land soon I don't think we need to block stabilization on it anyway.</s>
 - Non-identifier primitive types like slices: This was not a part of the original RFC anyway, and is a pretty niche use case

The feature itself, sans rust-lang#65983, has been shipped on nightly for three years now, with people using it on docs.rs. rust-lang#65983 itself is not an overwhelmingly central bit of functionality; the reason we elected to block stabilization on it was that back in 2017 it was not possible to fix the issue without some major refactorings of resolve, and we did not want to stabilize something that had such a potentially unfixable bug.

Given that we've fixed it, I see no reason to delay stabilization on this long awaited feature. It's possible that the latest patches have problems, however we _have_ done crater runs of some of the crucial parts. Furthermore, that's what the release trains are for, we will have a solid three months to let it ride the trains before it actually hits the stable compiler.

r? @rust-lang/rustdoc
bors added a commit to rust-lang-ci/rust that referenced this issue Sep 24, 2020
…nishearth

Stabilize intra-doc links

Fixes rust-lang#43466

Thanks to the great work of `@jyn514` in getting the [cross-crate reexport issue](rust-lang#65983) in intra-rustdoc links fixed, I think we're now in a position to stabilize this feature.

The tracking issue currently has two unresolved issues:

 - <s>behavior around doc(hidden): This is fixed in rust-lang#73365, which is just waiting for CI and should land tomorrow. It's also a pretty niche bug so while I expect it to land soon I don't think we need to block stabilization on it anyway.</s>
 - Non-identifier primitive types like slices: This was not a part of the original RFC anyway, and is a pretty niche use case

The feature itself, sans rust-lang#65983, has been shipped on nightly for three years now, with people using it on docs.rs. rust-lang#65983 itself is not an overwhelmingly central bit of functionality; the reason we elected to block stabilization on it was that back in 2017 it was not possible to fix the issue without some major refactorings of resolve, and we did not want to stabilize something that had such a potentially unfixable bug.

Given that we've fixed it, I see no reason to delay stabilization on this long awaited feature. It's possible that the latest patches have problems, however we _have_ done crater runs of some of the crucial parts. Furthermore, that's what the release trains are for, we will have a solid three months to let it ride the trains before it actually hits the stable compiler.

r? `@rust-lang/rustdoc`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cross-crate-reexports Area: Documentation that has been re-exported from a different crate A-intra-doc-links Area: Intra-doc links, the ability to link to items in docs by name C-enhancement Category: An issue proposing an enhancement or a PR with one. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants