Split `Liveness::users` into three. #54211

Merged
merged 1 commit into from Sep 20, 2018


@nnethercote
Contributor

@nnethercote nnethercote commented Sep 14, 2018

This reduces memory usage on some benchmarks because no space is wasted
for padding. For a `check-clean` build of `keccak` it reduces `max-rss`
by 20%.

r? @nikomatsakis, but I want to do a perf run. Locally, I had these results:

  • instructions: slight regression
  • max-rss: big win on "Clean" builds
  • faults: big win on "Clean" and "Nll" builds
  • wall-time: small win on "Clean" and "Nll" builds

So I want to see how a different machine compares.
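
To make the padding argument concrete, here is a minimal sketch. It assumes that one combined entry held two `u32`-sized `LiveNode` indices plus a `bool`; that layout is an assumption for illustration, since only the three split fields appear in the diff. The helper function names are hypothetical.

```rust
use std::mem::size_of;

/// Stand-in for the compiler's `LiveNode` index (assumed to wrap a `u32`).
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct LiveNode(u32);

/// Assumed shape of one combined entry before the split.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Users {
    reader: LiveNode,
    writer: LiveNode,
    used: bool,
}

/// Bytes per entry when all three fields live in one `Vec<Users>`.
fn combined_bytes_per_entry() -> usize {
    // 4 + 4 + 1 = 9 bytes of data, rounded up to the 4-byte alignment.
    size_of::<Users>()
}

/// Bytes per entry when each field lives in its own `Vec`.
fn split_bytes_per_entry() -> usize {
    2 * size_of::<LiveNode>() + size_of::<bool>()
}

fn main() {
    println!("combined: {} bytes per entry", combined_bytes_per_entry());
    println!("split:    {} bytes per entry", split_bytes_per_entry());
}
```

Under these assumptions the combined layout needs 12 bytes per entry and the split layout 9, so a quarter of the combined storage is padding.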

@nnethercote
Contributor Author

@nnethercote nnethercote commented Sep 14, 2018

@bors try

@bors
Contributor

@bors bors commented Sep 14, 2018

Trying commit efae70c with merge 7da277b...

bors added a commit that referenced this pull request Sep 14, 2018
Split `Liveness::users` into three.

// separate `Vec`s so that no space is wasted for padding.
users_reader: Vec<LiveNode>,
users_writer: Vec<LiveNode>,
users_used: Vec<bool>,


@Mark-Simulacrum

Mark-Simulacrum Sep 14, 2018
Member

Is it worth also making this a bitset/bitvec?


@nnethercote

nnethercote Sep 14, 2018
Author Contributor

#54208 (comment) discusses this. It would reduce memory more but increase instruction counts. Vec<bool> might be the best middle ground; let's see how the perf results look with it.
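
The trade-off mentioned here can be illustrated with a minimal sketch. `TinyBitVec` below is a hypothetical type written for this example, not the compiler's own bit-set implementation. It stores one bit per flag instead of one byte, but every access pays for a division, a shift and a mask, whereas `Vec<bool>` is a plain byte load or store.

```rust
/// Hypothetical one-bit-per-flag container, for illustration only.
struct TinyBitVec {
    words: Vec<u64>,
}

impl TinyBitVec {
    /// Allocates enough 64-bit words to hold `len` flags, all cleared.
    fn new(len: usize) -> Self {
        TinyBitVec { words: vec![0; (len + 63) / 64] }
    }

    /// Sets or clears flag `i`: locate the word, then mask the bit.
    fn set(&mut self, i: usize, value: bool) {
        let mask = 1u64 << (i % 64);
        if value {
            self.words[i / 64] |= mask;
        } else {
            self.words[i / 64] &= !mask;
        }
    }

    /// Reads flag `i`: locate the word, then test the bit.
    fn get(&self, i: usize) -> bool {
        self.words[i / 64] & (1u64 << (i % 64)) != 0
    }

    /// Bytes of heap storage used for the flags.
    fn heap_bytes(&self) -> usize {
        self.words.len() * std::mem::size_of::<u64>()
    }
}

fn main() {
    let n = 1000;
    let mut bits = TinyBitVec::new(n);
    let mut bools = vec![false; n];

    bits.set(3, true);
    bools[3] = true; // direct byte store, no shifting or masking
    assert_eq!(bits.get(3), bools[3]);

    println!("Vec<bool>: {} bytes, bit vector: {} bytes", bools.len(), bits.heap_bytes());
}
```

For 1000 flags the sketch uses 128 bytes against 1000 for `Vec<bool>`, at the cost of the extra arithmetic on each `get` and `set`.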

@bors
Contributor

@bors bors commented Sep 14, 2018

☀️ Test successful - status-travis
State: approved= try=True

@nnethercote
Contributor Author

@nnethercote nnethercote commented Sep 14, 2018

@rust-timer
Collaborator

@rust-timer rust-timer commented Sep 14, 2018

Success: Queued 7da277b with parent 4f921d7, comparison URL.

Contributor

@nikomatsakis nikomatsakis left a comment

r=me if the perf results look good; note that we would like to rewrite this completely to operate on MIR. But that's a bigger job (@wesleywiser was interested, I think).

@nikomatsakis
Contributor

@nikomatsakis nikomatsakis commented Sep 14, 2018

Perf results look like a small perf hit (5-6%) but a big reduction in memory use (20% or so in some cases). Interesting.

@nnethercote
Contributor Author

@nnethercote nnethercote commented Sep 14, 2018

I think this is a rare case where instruction counts are misleading!

Note that keccak is the one most clearly affected; inflate and clap-rs are also affected, and nothing else is. So only look at the results for those three benchmarks; the rest is noise. Also, keccak-check is the only one that measures NLL.

With all that in mind, here are the results just for keccak, including check, debug and opt.

  • cpu-clock: 0-6% better, 0.7% better for nll-check
  • cycles: 0-4% better, 0.6% better for nll-check
  • faults: 4-18% better, 6.6% better for nll-check
  • instructions: 0-3% worse, 0.3% worse for nll-check
  • max-rss: 0-20% better, no change for nll-check
  • task-clock: 0-6% better, 0.7% better for nll-check
  • wall-time: 0-6% better, 0.7% better for nll-check

Instructions gets worse, but everything else gets better. And we have a simple theoretical explanation for this: less memory traffic. So I think we should land this, but I am happy to defer to @nikomatsakis's decision.
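
The "less memory traffic" explanation can be illustrated with a rough calculation. This is only a sketch under stated assumptions: a 64-byte cache line, a padded combined entry of 12 bytes, and a linear scan that reads just the one-byte `used` flag of each entry. The thread does not describe the pass's real access pattern, and `cache_lines_touched` is a hypothetical helper.

```rust
/// Cache-line size assumed for this illustration.
const CACHE_LINE_BYTES: usize = 64;

/// Number of cache lines covered by `entries` contiguous elements,
/// each occupying `stride_bytes`, rounded up to whole lines.
fn cache_lines_touched(entries: usize, stride_bytes: usize) -> usize {
    (entries * stride_bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES
}

fn main() {
    let entries = 1_000_000;

    // Combined layout: reading one flag per 12-byte entry still pulls in
    // every cache line of the array, because the stride is below 64 bytes.
    let combined = cache_lines_touched(entries, 12);

    // Split layout: the flags are packed one byte apart in their own Vec.
    let split = cache_lines_touched(entries, 1);

    println!("combined: {} lines, split: {} lines", combined, split);
}
```

With these numbers the scan covers 187,500 cache lines in the combined layout and 15,625 in the split one, which is the kind of effect that can improve cycles and faults even when the instruction count rises slightly.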

@wesleywiser
Member

@wesleywiser wesleywiser commented Sep 14, 2018

I think the other important bit to note is that it seems to be mostly the clean incremental builds which are showing regressions. A 5% regression on the clean incremental time is usually a very, very small amount of clock time since clean incremental builds are usually very fast.

@nnethercote
Contributor Author

@nnethercote nnethercote commented Sep 15, 2018

> A 5% regression on the clean incremental time is usually a very, very small

At the risk of laboring the point: it's a win, not a regression on the time. It's only a regression on instruction counts.

@wesleywiser
Member

@wesleywiser wesleywiser commented Sep 15, 2018

@nikomatsakis
Contributor

@nikomatsakis nikomatsakis commented Sep 18, 2018

@bors r+

@bors
Contributor

@bors bors commented Sep 18, 2018

📌 Commit efae70c has been approved by nikomatsakis

@nikomatsakis
Contributor

@nikomatsakis nikomatsakis commented Sep 18, 2018

Sorry, forgot I hadn't already written that. Thanks @nnethercote for the extra details.

@bors
Contributor

@bors bors commented Sep 20, 2018

Testing commit efae70c with merge 1d33aed...

bors added a commit that referenced this pull request Sep 20, 2018
…akis

Split `Liveness::users` into three.

@bors
Contributor

@bors bors commented Sep 20, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing 1d33aed to master...

@bors bors merged commit efae70c into rust-lang:master Sep 20, 2018
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
homu Test successful
@nnethercote nnethercote deleted the nnethercote:keccak-Liveness-memory branch Sep 20, 2018
7 participants