Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Implement profiling subcommands #209
Conversation
It's redundant with respect to -is-final-crate -- there is no case in which we want one enabled but not the other. It's also not necessary for the `cargo pkgid` invocations. This change requires only setting -is-final-crate when perf is required, rather than setting it unconditionally. The change also requires changing run_rustc() to handle the case where perf output is not present. (This change was necessary because rustc-fake was buggy -- it always ran perf, even if USE_PERF was set to 0.) In rustc-fake, we can now combine the three separate conditions in main() into one, simplifying it. Finally, the patch also renames -is-final-crate as --wrap-rustc-with, and gives it an argument -- the only valid argument is currently "perf-stat", though the next patch will change that. This new name better reflects how it is used.
| assert!(has_valgrind); | ||
| cmd.arg("--tool=cachegrind") | ||
| .arg("--cache-sim=no") | ||
| .arg("--branch-sim=no") |
Mark-Simulacrum
Apr 11, 2018
Member
I see "Note that you cannot specify --cache-sim=no and --branch-sim=no together, as that would leave Cachegrind with no information to collect." in the manpage for valgrind; is that information wrong/misleading?
I see "Note that you cannot specify --cache-sim=no and --branch-sim=no together, as that would leave Cachegrind with no information to collect." in the manpage for valgrind; is that information wrong/misleading?
This patch implements three new subcommands, each one invoking a
different profiler. Currently each one profiles a single build ("clean",
non-incremental, Debug) of each benchmark. Command line invocation is
the same as `bench_local`. Per-build output files are put into the
--output-repo directory.
Currently the invocations of perf-record, Cachegrind and DHAT are
hard-wired. We can consider a mechanism for allowing variations later
(e.g. a config file) if it's necessary.
Some specific details:
- rustc-fake's --wrap-rustc-with flag now accepts three additional
argument: "perf-record", "cachegrind", "dhat".
- `CargoProcess::perf` is renamed `profiler`, and it has the new
`Profiler` type. It's now specified by the profiler() modifier
function, instead of via the argument to mk_cargo_process().
- run_rustc() is split into two: run_rustc() and
run_rustc_and_process_perf_stat_output().
f8be80f
to
a1e412d
|
Well spotted on the Cachegrind docs! That info is indeed wrong, with --cache-sim=no and --branch-sim=no it still collects instruction counts. That's by far the most useful stat, and disabling the cache-sim and branch-sim make it run faster. I've added a comment about this, and I told the Valgrind maintainer about the doc error. I also changed the |
9e39ee9
into
rust-lang:master
perf-record, Cachegrind, and DHAT are the profilers I've used in the past on rustc. This branch adds support for running each of them on the rustc-benchmarks.