Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rustdoc-json] JSON no longer inlines #99287

Merged
merged 3 commits into from
Jul 16, 2022

Conversation

GuillaumeGomez
Copy link
Member

Fixes #98007.
Fixes #96161.
Fixes #83057.
Fixes #83720.

I took over #93518 and applied the comments and added more tests.

There was one thing missing (which is in the second commit): if a non-exported item was used in a public API but not reexported, it was still missing.

cc @CraftSpider @Urgau @Enselic

r? @notriddle

@GuillaumeGomez GuillaumeGomez added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. A-rustdoc-json Area: Rustdoc JSON backend labels Jul 15, 2022
@rustbot
Copy link
Collaborator

rustbot commented Jul 15, 2022

rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing src/librustdoc/json/conversions.rs; otherwise, make sure you bump the FORMAT_VERSION constant.

cc @CraftSpider, @aDotInTheVoid

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 15, 2022
@CraftSpider
Copy link
Contributor

This looks great to me, thank you for picking it up!

@rust-log-analyzer

This comment has been minimized.

@GuillaumeGomez
Copy link
Member Author

Fixed the tidy issue.

Copy link
Member

@Enselic Enselic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for working on this! ❤️

When I ran cargo public-api regression tests with this change, I caught a seemingly big semantic change, because some tests started failing.

Consider this code in src/lib.rs (I am using a type but e.g. fn has the same problem, and I imagine any other item type.):

#![feature(no_core)]
#![no_core]
#![no_std]
mod stripped {
    pub type T = usize;
}
pub use stripped::T;

before this PR, the pub use did not emit any JSON item by itself. Instead, there were only two items in rustdoc JSON:

  • The crate module ("kind": "module")
  • The typedef ("kind": "typedef")

After this change, the pub use itself is also emitted. This runs contrary to rustdoc HTML output, where the pub use will not be present by default. There are now four items:

  • The crate module ("kind": "module", "is_crate": true)
  • The typedef ("kind": "typedef")
  • The stripped module ("kind": "module", "is_stripped": true)
  • The pub use ("kind": "import")

I don't think we want the pub use JSON item in the output by default? Because that is not what cargo doc HTML shows by default.

...on the other hand, what should we do instead? We do not inline any longer.

Silly idea maybe, but can't we inline JSON as a very last step? So that the output becomes the same as before, but internally we do not do any inlining, so that all ICEs are fixed.

In short, this PR breaks my tool, so if we could go a bit gently forward, I would appreciate it :)

src/rustdoc-json-types/lib.rs Show resolved Hide resolved
clean::KeywordItem(_) => return None,
clean::StrippedItem(ref inner) => {
match &**inner {
// We document non-empty stripped modules as a special StrippedModule type, to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StrippedModule is mentioned, but does not exist any longer, needs to be updated for Module::is_stripped.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh indeed, good catch!

@GuillaumeGomez
Copy link
Member Author

GuillaumeGomez commented Jul 15, 2022

The original PR did it the same way, so I'm surprised it's become an issue in-between. Inlining after will very likely be quite tricky and bring back most of the previous issues (conflicting IDs).

Also, the JSON output cannot be the same as the HTML one. Let me take an example:

mod private_mod {
    pub struct Foo;
}

pub fn foo() -> Foo {}

With this code, the HTML will not create a link for Foo because, since it's a private item, it doesn't have a page for itself. However, with the JSON output, we need to also get Foo so that ID doesn't link to a nonexistent item. Both are somewhat the same but the details actually make them very different. The JSON output needs to be more complete because it has to provide all the information to all its items.

Does it make sense? 😅

EDIT: I just thought about something: the JSON output was already showing pub use x::*; reexports whereas they're not displayed in the HTML output. So we're kinda making it more coherent. 😆
EDIT2: because they're directly inlined in the HTML output (which was the root of our issues in the JSON output).

@Enselic
Copy link
Member

Enselic commented Jul 16, 2022

I've further looked into this change from my side this morning and have no blocking concerns any longer. Feel free to go ahead with merging this (as if I have any authority over that 😅).

(After this has landed I will experiment with letting my tool inline the JSON itself to get back the old behaviour. Seems like that could work. If it doesn't, it is not the end of the world.)

Looking forward to seeing those ICEs going away! 💪

The original PR did it the same way, so I'm surprised it's become an issue in-between.

This is probably not a new issue, it's simply that I never ran my test suite on the old PR, because

  • I didn't have a test suite back then
  • It didn't seem finished enough to test against

@GuillaumeGomez
Copy link
Member Author

I've further looked into this change from my side this morning and have no blocking concerns any longer. Feel free to go ahead with merging this (as if I have any authority over that 😅).

Not an "authority" but you're a user providing feedback, which is very important for us. If there is no more issue for you, then I think we can indeed go forward. Thanks a lot for your help!

(After this has landed I will experiment with letting my tool inline the JSON itself to get back the old behaviour. Seems like that could work. If it doesn't, it is not the end of the world.)

I guess you already thought about it but if the type is a reexport, why not linking all its items directly into the top level on your side? That shouldn't require too much code hopefully.

Since @notriddle approved it too, let's go. :)

@bors r+

@bors
Copy link
Contributor

bors commented Jul 16, 2022

📌 Commit 451e2f1bbfce876113cf7565015b219eb7e3e8d4 has been approved by GuillaumeGomez

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 16, 2022
@GuillaumeGomez
Copy link
Member Author

@bors r=notriddle

@bors
Copy link
Contributor

bors commented Jul 16, 2022

💡 This pull request was already approved, no need to approve it again.

@bors
Copy link
Contributor

bors commented Jul 16, 2022

📌 Commit 451e2f1bbfce876113cf7565015b219eb7e3e8d4 has been approved by notriddle

It is now in the queue for this repository.

@GuillaumeGomez
Copy link
Member Author

Just realized that I forgot to apply the comments...

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 16, 2022
@GuillaumeGomez
Copy link
Member Author

Let's go again!

@bors r=notriddle

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 16, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 16, 2022
…askrgr

Rollup of 10 pull requests

Successful merges:

 - rust-lang#98582 (Allow destructuring opaque types in their defining scopes)
 - rust-lang#99213 (migrate some of `rustc_passes::check_attr`'s diagnostics and derive improvements)
 - rust-lang#99258 (Provide structured suggestion for dropped temp value)
 - rust-lang#99259 (interpret/visitor: support visiting with a PlaceTy)
 - rust-lang#99287 ([rustdoc-json] JSON no longer inlines)
 - rust-lang#99290 (Revert "Highlight conflicting param-env candidates")
 - rust-lang#99316 (docs: add missing word)
 - rust-lang#99317 (Borrow Vec<T, A> as [T])
 - rust-lang#99323 (Fix flakyness of GUI tests)
 - rust-lang#99342 (Avoid some `Symbol` to `String` conversions)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 02b9701 into rust-lang:master Jul 16, 2022
@rustbot rustbot added this to the 1.64.0 milestone Jul 16, 2022
@GuillaumeGomez GuillaumeGomez deleted the rustdoc-json-double-export branch July 17, 2022 09:39
@Enselic
Copy link
Member

Enselic commented Jul 17, 2022

As I was working on doing the inlining myself in my tool, I think I bumped into an unintended regression from this change.

Step-by-step:

Paste the following in either bash or zsh:

# Create a library with a trait with one associated function
echo '
pub trait A {
    fn trait_a_fn();
}
' > a.rs

# Create another library that depends on the first library
echo '
pub use a::A;
' > b.rs

# Check with what toolchains rustdoc JSON is generated for `trait_a_fn`
for toolchain in nightly-2022-07-16 nightly-2022-07-17; do
    # Create a directory with output from each toolchain
    mkdir -p $toolchain

    # Compile the library
    rustc +$toolchain --edition=2021 \
            --crate-type rlib -o $toolchain/liba.rlib a.rs

    # Generate rustdoc JSON for both libraries
    rustdoc +$toolchain --edition=2021 \
            --extern a=$toolchain/liba.rlib \
            -Z unstable-options --output-format json \
            -o $toolchain b.rs

    # See if rustdoc JSON for `trait_a_fn` was generated
    grep --silent trait_a_fn $toolchain/b.json && echo "rustdoc JSON for 'trait_a_fn' was generated by '$toolchain'"
done

Expected output:

rustdoc JSON for 'trait_a_fn' was generated by 'nightly-2022-07-16'
rustdoc JSON for 'trait_a_fn' was generated by 'nightly-2022-07-17'

Actual output:

rustdoc JSON for 'trait_a_fn' was generated by 'nightly-2022-07-16'

Would you agree this looks like an unintentional regression? Can you reproduce? Without rustdoc JSON for trait_a_fn I am not able to do the inlining myself.

Let me know if you would like me to file a separate issue for this, and I will happily do it.

@GuillaumeGomez
Copy link
Member Author

No, it seems to be exactly the point of this PR (but maybe I misunderstood your issue?).

@Enselic
Copy link
Member

Enselic commented Jul 17, 2022

Sorry for being unclear. My impression was that we always generated one big chunk of rustdoc JSON for an entire set of crates. But given your response, I'm starting to think that rustdoc JSON always was intended to generate rustdoc JSON only for one crate at a time. But then, the boundaries are not clear to me. The rustdoc JSON for the small example above contains literally thousands of items that does not have "crate_id": 0,, i.e. that comes from another crate. Looks like they come from libstd or similar though. Which maybe makes sense.

If rustdoc JSON is only designed and intended to be built for for one crate at a time, that means no tool can inline items of imports across crate boundaries, since the IDs are not stable across different builds of rustdoc JSON. If that is how it is supposed to be, that's fine I guess. At least for now :)

@GuillaumeGomez
Copy link
Member Author

If a foreign type is present in the public API, it has its "definition" in the JSON as well.

@Enselic
Copy link
Member

Enselic commented Jul 17, 2022

Only partial definition, it seems? Because even though the trait a::A in the sample code above is used by crate b, the rustdoc JSON for crate b only includes an item for the trait a::A, but not for the associated function fn trait_a_fn(). The rustdoc JSON for the trait A looks like this:

"20:3:1569": {
    "id": "20:3:1569",
    "crate_id": 20,
    "name": "A",
    "span": null,
    "visibility": "public",
    "docs": null,
    "links": {},
    "attrs": [],
    "deprecation": null,
    "kind": "trait",
    "inner": {
        "is_auto": false,
        "is_unsafe": false,
        "items": [
            "20:4:38964"
        ],
        "generics": {
            "params": [],
            "where_predicates": []
        },
        "bounds": [],
        "implementations": []
    }
},

but there is no rustdoc JSON item for the associated function fn trait_a_fn(). It's ID is there: "20:4:38964". But it has no corresponding item in the JSON. It is missing.

However, I just noticed that if crate b starts using fn trait_a_fn(), then it ends up in the rustdoc JSON.

To be crystal clear: This do give me the expected output:

# Create a library with a trait with one associated function
echo '
pub trait A {
    fn trait_a_fn();
}
' > a.rs

# Create another library that depends on the first library
echo '
pub use a::A;
pub struct S;
impl A for S {
    fn trait_a_fn() {}
}
' > b.rs

# Check with what toolchains rustdoc JSON is generated for `trait_a_fn`
for toolchain in nightly-2022-07-16 nightly-2022-07-17; do
    # Create a directory with output from each toolchain
    mkdir -p $toolchain

    # Compile the library
    rustc +$toolchain --edition=2021 \
            --crate-type rlib -o $toolchain/liba.rlib a.rs

    # Generate rustdoc JSON for both libraries
    rustdoc +$toolchain --edition=2021 \
            --extern a=$toolchain/liba.rlib \
            -Z unstable-options --output-format json \
            -o $toolchain b.rs

    # See if rustdoc JSON for `trait_a_fn` was generated
    grep --silent trait_a_fn $toolchain/b.json && echo "rustdoc JSON for 'trait_a_fn' was generated by '$toolchain'"
done

It could probably be argued that we should not emit rustdoc JSON only partially even in the initial case, but suddenly the current behaviour seems like much less of a problem.

@GuillaumeGomez
Copy link
Member Author

So foreign items in "imported" foreign items are not in the JSON. Not sure if a bug or not at this point... Seems logical to have direct foreigns but not foreigns of foreigns I suppose.

@Enselic
Copy link
Member

Enselic commented Jul 17, 2022

Actually, I draw conclusions too quickly. There is an item generated for fn trait_a_fn() for the impl, but not for the trait. I think this can only be seen as the rustdoc JSON being incomplete. I will file an issue for this at some point in the future, unless someone beats me to it.

Enselic added a commit to Enselic/cargo-public-api that referenced this pull request Jul 17, 2022
This change was triggered by
rust-lang/rust#99287, but it would have made
sense to do even without that upstream change.
@Urgau
Copy link
Member

Urgau commented Jul 17, 2022

This new behavior feels "weird" but isn't really; you don't see the definition of a trait without clicking on it in the HTML and if it's a external crate it jumps in that crate you are not staying in the first crate.

My impression was that we always generated one big chunk of rustdoc JSON for an entire set of crates.

This is currently (was ?) the case but I don't think that something that we want. I think @CraftSpider talked about it sometimes ago, I remember something about pruning unnecessary stuff to reduce the size of the JSON and also to avoid this kind of problem.

Enselic added a commit to Enselic/cargo-public-api that referenced this pull request Jul 17, 2022
This change was triggered by
rust-lang/rust#99287, but it would have made
sense to do even without that upstream change.
aDotInTheVoid added a commit to aDotInTheVoid/rustdoc-types that referenced this pull request Jul 18, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 15, 2022
…uillaumeGomez

Rustdoc-Json: Don't remove impls for items imported from private modules

After rust-lang#99287, items in private modules may still be in the json output, if a public import accesses them. To reflect this, items that are imported need to be marked as retained in the `Stripper` pass, so their impls arn't removed by `ImplStripper`.

[More context on zulip](https://rust-lang.zulipchat.com/#narrow/stream/266220-rustdoc/topic/Populating.20cache.2Eimpls), thanks to @ jyn514 for helping debug this.

`@rustbot` modify labels: +A-rustdoc-json +T-rustdoc

r? `@GuillaumeGomez`

Fixes rust-lang#100252
Fixes rust-lang#100242
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 15, 2022
…uillaumeGomez

Rustdoc-Json: Don't remove impls for items imported from private modules

After rust-lang#99287, items in private modules may still be in the json output, if a public import accesses them. To reflect this, items that are imported need to be marked as retained in the `Stripper` pass, so their impls arn't removed by `ImplStripper`.

[More context on zulip](https://rust-lang.zulipchat.com/#narrow/stream/266220-rustdoc/topic/Populating.20cache.2Eimpls), thanks to @ jyn514 for helping debug this.

``@rustbot`` modify labels: +A-rustdoc-json +T-rustdoc

r? ``@GuillaumeGomez``

Fixes rust-lang#100252
Fixes rust-lang#100242
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rustdoc-json Area: Rustdoc JSON backend S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
9 participants