New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement profiling subcommands #209

merged 2 commits into from Apr 12, 2018


None yet
2 participants

nnethercote commented Apr 11, 2018

perf-record, Cachegrind, and DHAT are the profilers I've used in the past on rustc. This branch adds support for running each of them on the rustc-benchmarks.

Remove the USE_PERF env var.
It's redundant with respect to -is-final-crate -- there is no case in
which we want one enabled but not the other. It's also not necessary for
the `cargo pkgid` invocations.

This change requires only setting -is-final-crate when perf is required,
rather than setting it unconditionally.

The change also requires changing run_rustc() to handle the case where
perf output is not present. (This change was necessary because
rustc-fake was buggy -- it always ran perf, even if USE_PERF was set to

In rustc-fake, we can now combine the three separate conditions in
main() into one, simplifying it.

Finally, the patch also renames -is-final-crate as
--wrap-rustc-with, and gives it an argument -- the only valid argument
is currently "perf-stat", though the next patch will change that. This
new name better reflects how it is used.

This comment has been minimized.


Mark-Simulacrum Apr 11, 2018


I see "Note that you cannot specify --cache-sim=no and --branch-sim=no together, as that would leave Cachegrind with no information to collect." in the manpage for valgrind; is that information wrong/misleading?

Implement `profile_{perf_record,cachegrind,dhat}` commands.
This patch implements three new subcommands, each one invoking a
different profiler. Currently each one profiles a single build ("clean",
non-incremental, Debug) of each benchmark. Command line invocation is
the same as `bench_local`. Per-build output files are put into the
--output-repo directory.

Currently the invocations of perf-record, Cachegrind and DHAT are
hard-wired. We can consider a mechanism for allowing variations later
(e.g. a config file) if it's necessary.

Some specific details:

- rustc-fake's --wrap-rustc-with flag now accepts three additional
  argument: "perf-record", "cachegrind", "dhat".

- `CargoProcess::perf` is renamed `profiler`, and it has the new
  `Profiler` type. It's now specified by the profiler() modifier
  function, instead of via the argument to mk_cargo_process().

- run_rustc() is split into two: run_rustc() and

This comment has been minimized.


nnethercote commented Apr 11, 2018

Well spotted on the Cachegrind docs! That info is indeed wrong, with --cache-sim=no and --branch-sim=no it still collects instruction counts. That's by far the most useful stat, and disabling the cache-sim and branch-sim make it run faster. I've added a comment about this, and I told the Valgrind maintainer about the doc error.

I also changed the out_file closure to use format! instead of multiple push_str() calls.

@Mark-Simulacrum Mark-Simulacrum merged commit 9e39ee9 into rust-lang-nursery:master Apr 12, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@nnethercote nnethercote deleted the nnethercote:branch_cachegrind branch Apr 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment