Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc: builds including multiple crates with the same name are always dirty #61378

Open
cramertj opened this issue May 30, 2019 · 9 comments
Open
Labels
T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@cramertj
Copy link
Member

I haven't figured out exactly what's going on here yet, but to repro:

mkdir /tmp/rustdoc-dirty && cd /tmp/rustdoc-dirty && cargo init
echo 'futures-preview = { version = "=0.3.0-alpha.16", features = ["io-compat"] }' >> Cargo.toml
cargo doc
cargo doc

(tested on rustc 1.36.0-nightly (a784a8022 2019-05-09))

The second cargo doc should do no work and immediately exit, however it instead proceeds to document tokio-io and everything after it. This only happens when the io-compat feature is enabled, but does not happen when depending directly on tokio-io.

@cramertj cramertj added the T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. label May 30, 2019
@ehuss
Copy link
Contributor

ehuss commented May 31, 2019

This is because the dependency tree has both "futures-preview" and "futures". "futures-preview" has a [lib] named "futures", and cargo gets a little confused (it doesn't know that there is a name collision with two crates with the same name). This is similar in spirit to #56169, where there are multiple crates with the same name. For normal builds, Cargo separates these with a unique hash in the filename, but the rustdoc directories are not differentiated.

I can't think of any good solutions. Fundamentally rustdoc would need to organize crate directory names differently, but I can't think of a way that isn't awful (like package_name-version/crate_name/). There are other reasons a crate can be built multiple times with the same name, so there could be other collision opportunities.

@cramertj
Copy link
Member Author

Why is package_name-version/crate_name/ awful? That seems pretty reasonable to me (or package_name/version/crate_name). If there's any interest at all in doing this, I'd love to make this change, since right now this makes building rustdocs for Fuchsia without --no-deps basically impossible.

@ehuss
Copy link
Contributor

ehuss commented May 31, 2019

I think proper cross-crate linking will be more complicated. When rustdoc resolves a path to the futures crate, how would it know which one to link to? Cargo would likely need to feed more information to rustdoc (such as the version number). rustdoc already has something similar (root-url, how docs.rs works), but that's a bit different because not all crates are combined.

I don't really know much about rustdoc so I'm not really the right person to ask. People more familiar with it may have better ideas. Perhaps it's not so difficult.

@cramertj
Copy link
Member Author

@cramertj cramertj changed the title rustdoc: futures-preview doc builds including tokio-io are always dirty rustdoc: builds including multiple crates with the same name are always dirty May 31, 2019
@cramertj
Copy link
Member Author

TBH I don't care even as much if they're broken and one crate overwrites another-- that would still be more workable than the current status quo where that happens and the doc builds are always dirty, forcing a full rebuild every time.

@GuillaumeGomez
Copy link
Member

I thought it was handled on cargo side? Or did I miss something?

@ehuss
Copy link
Contributor

ehuss commented Jun 1, 2019

Not really. It checks for documenting duplicate workspace crates, and raises an error, but it doesn't check for duplicate dependencies. I posted a PR yesterday to issue a warning for duplicate dependencies (rust-lang/cargo#6998), to at least make it easier to discover/debug this. But issuing errors or warnings isn't really a good user experience. In all of these cases, it would be nice if rustdoc could support duplicate crate names. But as I mentioned, it seems tricky, since there would need to be a new directory layout, and cargo would need to pass extra information to rustdoc to inform it the additional information to distinguish them (such as package name and version, not sure if there are other components necessary).

And I'm not sure if rustdoc can fundamentally handle multiple crates with the same name. I would imagine the sidebar and the search index might struggle with it.

@GuillaumeGomez
Copy link
Member

It's not supposed to... The only thing we could do is to change the crate name of the dependencies by adding the version/commit. Like this, we would be able to build the search index.

bors added a commit to rust-lang/cargo that referenced this issue Jun 3, 2019
Catch filename output collisions in rustdoc.

This does some general cleanup so that rustdoc output is computed correctly.  This allows the collision detection code to work.  There are some known issues with rustdoc creating collisions (rust-lang/rust#61378, rust-lang/rust#56169).  This is related to issue #6313.

This includes a change that `CompileMode::is_doc` no longer returns `true` for doc tests. This has bothered me for a while, as various points in the code was not handling it correctly. Separating it into `is_doc` and `is_doc_test` I think is better and more explicit.

Note for reviewing: There is a big diff in `calc_outputs`, but this is just rearranging code. Everything in `calc_outputs_rustc` is unchanged from the original.

The only functional changes should be:
- A warning is emitted for colliding rustdoc output.
- Don't create an empty fingerprint directory for doc tests.
@QuietMisdreavus
Copy link
Member

Whenever rustdoc tries to link to another crate, it calls the function html::render::extern_location:

fn extern_location(e: &clean::ExternalCrate, extern_url: Option<&str>, dst: &Path)
-> ExternalLocation
{
// See if there's documentation generated into the local directory
let local_location = dst.join(&e.name);
if local_location.is_dir() {
return Local;
}
if let Some(url) = extern_url {
let mut url = url.to_string();
if !url.ends_with("/") {
url.push('/');
}
return Remote(url);
}
// Failing that, see if there's an attribute specifying where to find this
// external crate
e.attrs.lists(sym::doc)
.filter(|a| a.check_name(sym::html_root_url))
.filter_map(|a| a.value_str())
.map(|url| {
let mut url = url.to_string();
if !url.ends_with("/") {
url.push('/')
}
Remote(url)
}).next().unwrap_or(Unknown) // Well, at least we tried.
}

So any alternate name we would choose would have to make its way there. Looking at it again, it looks like passing --extern-html-root-url won't even help until rustdoc is changed, because the local directory will always override the flag.

I did some testing, and here are some notes:

  • Rustdoc uses the package name for all dependencies once it takes over from the compiler. This is apparently pulled from the dependency's metadata, since it calls tcx.crate_name(CrateNum) for all externs (tcx.crates()). Pulling the dependency name could be a matter of matching the --extern flags against the crates loaded in by rustc, but that might not work out in the end, since the dependency name can differ by crate in this kind of situation. Since version numbers are a purely Cargo-side construct, rustdoc has no way to pull them itself, that i know of. (Is there a method to match a CrateNum to the name passed to --extern? 🤔)
  • Because of this, things like title-text for types from different crates still use the original package name (e.g. even though a crate may call a type nalgebra_crate_0_15::Matrix, rustdoc will show nalgebra::Matrix instead).
  • Since rustdoc uses CrateNums to track dependencies internally, it shouldn't be a huge architectural change to thread this information through. Adding the information to the clean::ExternCrate as it's cleaned should allow you to use it later in the process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants