Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add benchmarks for rustdoc #675

Merged
merged 9 commits into from
Jul 14, 2020
Merged

Conversation

jyn514
Copy link
Member

@jyn514 jyn514 commented Jun 27, 2020

This runs rustdoc by default and requires that you set rustdoc for every perf run. This is probably not the desired behavior, I can change it once everything is working.

Closes #673
Status: Waiting on review

Example usage:

$ cargo run --release --bin collector -- --filter hyper --db doc.db bench_local --rustc $RUSTUP_HOME/toolchains/stage1/bin/rustc --rustdoc $RUSTUP_HOME/toolchains/stage1/bin/rustdoc --cargo $RUSTUP_HOME/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo no-loops --builds Doc

I got it to show up on the site for bench_commit but not for bench_local. I'm not sure where in the code to look. Suggestions welcome.

image
image

The way this works is by having rustc-fake also pretend to be rustdoc if called as rustdoc-fake. Additionally there is some hacking around to make the subcommand rustdoc instead of rustc since currently the subcommand can only be set by a Profiler, not by a BuildKind.

r? @Mark-Simulacrum

@jyn514
Copy link
Member Author

jyn514 commented Jun 27, 2020

I already noticed that this is doing a lot more work than expected. I thought since rustdoc doesn't check item-bodies it would be faster than check, but in some cases it's almost twice as slow. Might be because it has to output the HTML.

The stats also show that rustdoc isn't helped at all by incremental. Not sure if this is an issue with the benchmark or with rustdoc itself. instructions:u:

image

@jyn514
Copy link
Member Author

jyn514 commented Jun 27, 2020

The CI failure looks unrelated to this change, maybe a regression in nightly?

@jyn514
Copy link
Member Author

jyn514 commented Jun 27, 2020

Getting thread 'tokio-runtime-worker' panicked at 'InvalidColumnType(0, "min(value)", Null): series=46, aid=ArtifactIdNumber(2)', database/src/pool/sqlite.rs:336:37 when I try to compare a bench_local to a bench_commit. Not sure why, select * from pstat where series=46; and select * from pstat where aid=2; both have output but not the two combined.

@Mark-Simulacrum
Copy link
Member

Heh, turns out CI here is failing due to an upstream soundness fix, though not one that affected us (relevant code was a panic! function anyway). Fixed in 72d38b6, and closing/reopening to hopefully re-trigger CI.

I'll take a look at your questions here in a bit.

Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some initial comments on the implementation, will take a look at that panic you're getting probably tomorrow

collector/src/bin/rustc-fake.rs Outdated Show resolved Hide resolved
collector/src/bin/rustc-perf-collector/execute.rs Outdated Show resolved Hide resolved
collector/src/bin/rustc-perf-collector/main.rs Outdated Show resolved Hide resolved
@jyn514
Copy link
Member Author

jyn514 commented Jun 29, 2020

@jyn514
Copy link
Member Author

jyn514 commented Jun 29, 2020

Still not sure how to get bench_local results to show up on the site. Other than that things seem to be working though.

@Mark-Simulacrum
Copy link
Member

You mean on the graph pages? That's just not possible since we don't have an ordering between them. I guess we could plausibly record the time they were benchmarked -- but that seems like something for a future PR.

Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much happy with this, two more nits

database/src/pool/sqlite.rs Outdated Show resolved Hide resolved
@Mark-Simulacrum
Copy link
Member

I guess this doesn't touch the UI at all - in particular, the primary graphs page still has 3 columns. But I think it makes sense for now to keep it that way. We might want to show rustdoc performance on the dashboard, but that can wait several release cycles (until we have a beta and such that also has that data).

@jyn514
Copy link
Member Author

jyn514 commented Jun 29, 2020

You mean on the graph pages?

I meant on any page, including the compare page. Strangely it seems to be working now, maybe I had a typo in the ID before (??) The stats are all over the place though, I'm not sure what's wrong but I can't believe there was an 80% increase in instructions from a PR that only touched rustc_resolve.

image

@Mark-Simulacrum
Copy link
Member

That looks like you're comparing an official run -- from perf.rust-lang.org's collection server -- to a local run. That won't work well -- the official run, amongst other things, is built with a more optimally configured rustc than is typical for local building. In particular, it has LLVM ThinLTO enabled, and no debug assertions etc.

@jyn514
Copy link
Member Author

jyn514 commented Jun 30, 2020

Ooh, that looks way better!
image

Thanks for the help :) I addressed the review comments too.

@jyn514 jyn514 changed the title [WIP] Add benchmarks for rustdoc Add benchmarks for rustdoc Jun 30, 2020
@Mark-Simulacrum
Copy link
Member

Okay, having done some more thinking here, I think that before we merge this in I'd like to do some work to move the initial build outside of each build kind, to cache build dependencies between check/debug/opt/doc builds. For some crates that do codegen etc through build.rs that could be significant. #678 should implement that; I plan to merge it once CI is passing.

It would be great if you could also rebase this on master once that PR merges and add doc builds to CI -- should be fairly easy, adjusting this line:

&[BuildKind::Check], // no Debug or Opt builds
. I suspect some of the crates will run into trouble, though can't be certain -- maybe things will just work out.

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

I'm running bench_local --builds Doc --runs Full locally to see if any problems pop up, if I haven't reported back in an hour feel free to bump me to post the results.

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

It's been stuck on Running coercions: Doc + [Full] for an hour, not sure what the issue is. I'll try rebasing but I doubt it will help.

@Mark-Simulacrum
Copy link
Member

It's been stuck on Running coercions: Doc + [Full] for an hour, not sure what the issue is. I'll try rebasing but I doubt it will help.

Hm, that's odd. cargo doc in that directory completed in 3 seconds for me. Maybe try upping to RUST_LOG=collector=trace? That way at least you'll know the exact command being run and can hopefully reproduce outside.

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

Ok, the error is

[2020-07-02T18:19:33Z WARN  collector::execute] failed to deserialize stats, retrying; output: Output { status: ExitStatus(ExitStatus(0)), stdout: "", stderr: " Documenting issue-32278-big-array-of-strings v0.1.0 (/tmp/.tmpPEFu6R)\n[collector/src/bin/rustc-fake.rs:13] name = \"/home/joshua/.local/lib/cargo/target/release/rustdoc-fake\"\n    Finished dev [unoptimized + debuginfo] target(s) in 4.22s\n" }

So I just have some debug logging gone rogue, oops. The real bug is that it will retry infinitely many times.

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

And a secondary error: Since I used hard links and not symbolic links, rustdoc-fake will not be updated on changes to rustc-fake. I'll fix that.

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

Removing the logging did not fix the issue. I think MeasureProcessor is depending on some output that's different between rustdoc and rustc. I'm not sure why this didn't show up until now.

@Mark-Simulacrum
Copy link
Member

My guess is that we're not picking up the rustdoc run as special perhaps, and not dumping perf stat output as such?

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

When I run rustdoc-fake without the collector things work fine:

$ RUSTC_REAL=$(which rustc) RUSTDOC_REAL=$(which rustdoc) $CARGO_TARGET_DIR/release/rustdoc-fake --wrap-rustc-with perf-stat ~/src/test-rustdoc/primitive-anchor.rs 
2075987279;;instructions:u;466317390;100.00;1.35;insn per cycle
1539796372;;cycles:u;466317390;100.00;3.302;GHz
466.29;msec;task-clock:u;466285639;100.00;0.959;CPUs utilized
466.28;msec;cpu-clock:u;466285639;100.00;0.959;CPUs utilized
28773;;faults:u;466285639;100.00;0.062;M/sec
207432;;max-rss;3;100.00
0.556541778;;wall-time;4;100.00

I'm not sure what's going on.

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

Ok, we're telling cargo to pass --wrap-rustc-with, but it never gets passed.

[2020-07-02T19:41:00Z DEBUG collector::execute] "/home/joshua/.local/lib/rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo" "rustdoc" "--manifest-path" "Cargo.toml" "-p" "file:///tmp/.tmp7c3bbt#issue-32278-big-array-of-strings:0.1.0" "--" "--wrap-rustc-with" "perf-stat"
[collector/src/bin/rustc-fake.rs:10] args_os.collect::<Vec<_>>() = [
    "--crate-type",
    "bin",
    "--crate-name",
    "issue_32278_big_array_of_strings",
    "src/main.rs",
    "-o",
    "/tmp/.tmp7c3bbt/target/doc",
    "--error-format=json",
    "--json=diagnostic-rendered-ansi",
    "--document-private-items",
    "-L",
    "dependency=/tmp/.tmp7c3bbt/target/debug/deps",
]

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

Ugh, I can't replicate at all without going through collector. I'm out of ideas.

@jyn514
Copy link
Member Author

jyn514 commented Jul 2, 2020

This is a bug in cargo. cargo rustdoc does not pass the arguments if documenting a binary 🤦

@jyn514
Copy link
Member Author

jyn514 commented Jul 7, 2020

I ran bench_local --builds Doc --runs Full with the cargo fix (rust-lang/cargo#8449) and it finished successfully. However the site refused to start up, I think because there were no Check runs? I don't think it will impact anything if Doc is run alongside other performance measurements.

$ cargo +nightly run --release --bin site -- --db tmp.db
Starting server with port=2346
Loading complete but no data identified; exiting.

@nnethercote
Copy link
Contributor

To start the site I just run ./target/release/site $DATABASE. I.e. no -- or --db arguments.

@jyn514
Copy link
Member Author

jyn514 commented Jul 7, 2020

There we go!

image

I'm not sure how often cargo publishes ... I updated nightly this morning but I'm still seeing cargo 1.46.0-nightly (fede83ccf 2020-07-02).

@jyn514
Copy link
Member Author

jyn514 commented Jul 10, 2020

The cargo fix landed but hasn't been distributed on nightly yet. So this should be good to go when that happens sometime tomorrow.

@Manishearth
Copy link
Member

@jyn514 does rustc-perf use nightly cargo or the cargo in each master release?

@jyn514
Copy link
Member Author

jyn514 commented Jul 10, 2020

@Manishearth nightly (AFAIK)

- Distinguish between rustdoc and rustc
- Use `rustdoc` instead of `doc` subcommand

This only profiles the last crate and also avoids special casing
RUSTDOCFLAGS.

- Add a fake rustdoc
- Create symlink automatically if it doesn't already exist
- Add 'Doc' to CLI help
- Make Doc a default BuildKind when --builds is not passed
- Don't panic if looking at two local builds in the dashboard
- Don't panic if there are no pstat results available
- Change `subcommand()` to take BuildKind and return Option
- Remove `is_build_kind_supported`
This ensures that the doc build will succeed when run later, since
rustdoc and rustc are known to have different ICEs.
This avoids rustdoc-fake getting out of date when rustc-fake is updated.
@jyn514
Copy link
Member Author

jyn514 commented Jul 11, 2020

Same error as on #680 (comment).

@Mark-Simulacrum
Copy link
Member

Okay, that should have triggered CI to run again which should be fixed now. I'll hopefully be merging this soon after that passes (if I haven't feel free to ping me).

@Mark-Simulacrum
Copy link
Member

@jyn514 Actually, could we revert the coercions change from binary to library now that the Cargo patch has landed? I'd prefer to not land that together with this since it could affect perf in theory.

@jyn514
Copy link
Member Author

jyn514 commented Jul 13, 2020

@jyn514 Actually, could we revert the coercions change from binary to library now that the Cargo patch has landed? I'd prefer to not land that together with this since it could affect perf in theory.

Oops, good catch. Should be reverted now.

@jyn514
Copy link
Member Author

jyn514 commented Jul 13, 2020

@Mark-Simulacrum mind looking at this when you get a chance? I'd love to land this before merging rust-lang/rust#73566 so we have something to compare that to.

@Mark-Simulacrum Mark-Simulacrum merged commit 264d20f into rust-lang:master Jul 14, 2020
@jyn514 jyn514 deleted the rustdoc branch July 14, 2020 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add opt-in way to run perf on rustdoc
4 participants