Split `Liveness::users` into three. #54211

nnethercote · 2018-09-14T03:02:56Z

This reduces memory usage on some benchmarks because no space is wasted for padding. For a `check-clean` build of `keccak` it reduces `max-rss` by 20%.

r? @nikomatsakis, but I want to do a perf run. Locally, I had these results:

- instructions: slight regression
- max-rss: big win on "Clean" builds
- faults: big win on "Clean" and "Nll" builds
- wall-time: small win on "Clean" and "Nll" builds

So I want to see how a different machine compares.
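
To make the padding argument concrete, here is a minimal sketch of the two layouts. It assumes `LiveNode` is a `u32` newtype and that the old combined element was a struct with `reader`, `writer` and `used` fields; both are assumptions for illustration, not taken from the diff.

```rust
use std::mem::size_of;

// Assumption: a stand-in for rustc's `LiveNode`, modelled here as a `u32` newtype.
#[derive(Clone, Copy)]
struct LiveNode(u32);

// Combined layout: one struct per entry, stored in a single `Vec<Users>`.
// The struct and field names are illustrative.
#[derive(Clone, Copy)]
struct Users {
    reader: LiveNode,
    writer: LiveNode,
    used: bool,
}

// Bytes per entry in the combined layout. The struct has 4-byte alignment,
// so its 9 bytes of payload are rounded up to 12.
fn bytes_per_entry_combined() -> usize {
    size_of::<Users>()
}

// Bytes per entry when each field lives in its own `Vec`:
// 4 (reader) + 4 (writer) + 1 (used), with no padding between elements.
fn bytes_per_entry_split() -> usize {
    2 * size_of::<LiveNode>() + size_of::<bool>()
}

fn main() {
    let entry = Users { reader: LiveNode(0), writer: LiveNode(0), used: false };
    let _ = (entry.reader.0, entry.writer.0, entry.used);
    println!("combined: {} bytes/entry", bytes_per_entry_combined());
    println!("split:    {} bytes/entry", bytes_per_entry_split());
}
```

Under these assumptions the split layout stores 9 bytes per entry instead of 12. How much of that shows up in `max-rss` depends on how large this table is relative to everything else the compiler allocates.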

nnethercote · 2018-09-14T03:03:07Z

@bors try

bors · 2018-09-14T03:03:19Z

⌛ Trying commit efae70c with merge 7da277b...

Mark-Simulacrum · 2018-09-14T03:13:13Z

src/librustc/middle/liveness.rs

+    // separate `Vec`s so that no space is wasted for padding.
+    users_reader: Vec<LiveNode>,
+    users_writer: Vec<LiveNode>,
+    users_used: Vec<bool>,


Is it worth also making this a bitset/bitvec?

nnethercote · 2018-09-14

#54208 (comment) discusses this. A bitset would reduce memory usage further but increase instruction counts. `Vec<bool>` might be the best middle ground; let's see how the perf results look with it.
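
The trade-off mentioned here can be sketched with a toy bitset. This is a hypothetical minimal type written for illustration, not the bitset or bitvec type available in rustc: it packs 64 flags per `u64` word, so it needs roughly one eighth of the memory of a `Vec<bool>`, but every access pays for a division, a shift and a mask.

```rust
// Hypothetical minimal bitset, for illustration only.
struct BitSet {
    words: Vec<u64>,
    len: usize,
}

impl BitSet {
    // Allocates enough 64-bit words to hold `len` flags, all initially false.
    fn new(len: usize) -> Self {
        BitSet { words: vec![0; (len + 63) / 64], len }
    }

    // Sets or clears flag `i`: locate the word, then update one bit in it.
    fn set(&mut self, i: usize, value: bool) {
        assert!(i < self.len);
        let mask = 1u64 << (i % 64);
        if value {
            self.words[i / 64] |= mask;
        } else {
            self.words[i / 64] &= !mask;
        }
    }

    // Reads flag `i` by masking the relevant word.
    fn get(&self, i: usize) -> bool {
        assert!(i < self.len);
        (self.words[i / 64] & (1u64 << (i % 64))) != 0
    }

    // Heap bytes used by the flag storage.
    fn heap_bytes(&self) -> usize {
        self.words.len() * std::mem::size_of::<u64>()
    }
}

fn main() {
    let n = 1000;
    let mut bits = BitSet::new(n);
    bits.set(3, true);
    let bools = vec![false; n];
    println!("bitset:    {} bytes, bit 3 = {}", bits.heap_bytes(), bits.get(3));
    println!("Vec<bool>: {} bytes", bools.len() * std::mem::size_of::<bool>());
}
```

A `Vec<bool>` access is a single byte load or store, which is presumably why it looked like a reasonable middle ground before the perf results were in.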

bors · 2018-09-14T05:28:16Z

☀️ Test successful - status-travis
State: approved= try=True

nnethercote · 2018-09-14T08:33:11Z

@rust-timer build 7da277b

rust-timer · 2018-09-14T08:33:12Z

Success: Queued 7da277b with parent 4f921d7, comparison URL.

nikomatsakis

r=me if the perf results look good; note that we would like to rewrite this completely to operate on MIR. But that's a bigger job (@wesleywiser was interested, I think).

nikomatsakis · 2018-09-14T15:01:59Z

Perf results look like a small perf hit (5-6%) but a big memory use win (20% or so in some cases). Interesting.

nnethercote · 2018-09-14T23:48:18Z

I think this is a rare case where instruction counts are misleading!

Note that keccak is the one most clearly affected, inflate and clap-rs are also affected, and nothing else is. So only look at the results for those three benchmarks; the rest is noise. Also, keccak-check is the only one that measures NLL.

With all that in mind, here are the results just for keccak, including check, debug and opt.

- cpu-clock: 0-6% better, 0.7% better for nll-check
- cycles: 0-4% better, 0.6% better for nll-check
- faults: 4-18% better, 6.6% better for nll-check
- instructions: 0-3% worse, 0.3% worse for nll-check
- max-rss: 0-20% better, no change for nll-check
- task-clock: 0-6% better, 0.7% better for nll-check
- wall-time: 0-6% better, 0.7% better for nll-check

Instruction counts get worse, but everything else gets better. And we have a simple theoretical explanation for this: less memory traffic. So I think we should land this, but I am happy to defer to @nikomatsakis's decision.

wesleywiser · 2018-09-14T23:58:08Z

I think the other important bit to note is that it seems to be mostly the clean incremental builds which are showing regressions. A 5% regression on the clean incremental time is usually a very, very small amount of clock time since clean incremental builds are usually very fast.

nnethercote · 2018-09-15T01:40:23Z

> A 5% regression on the clean incremental time is usually a very, very small

At the risk of laboring the point: it's a win, not a regression on the *time*. It's only a regression on instruction counts.

wesleywiser · 2018-09-15T02:10:46Z

Yes. Sorry if I wasn't clear: I'm very much in favor of merging this PR 👍

nikomatsakis · 2018-09-18T17:18:09Z

@bors r+

bors · 2018-09-18T17:18:10Z

📌 Commit efae70c has been approved by nikomatsakis

nikomatsakis · 2018-09-18T17:18:21Z

Sorry, forgot I hadn't already written that. Thanks @nnethercote for the extra details.

bors · 2018-09-20T00:16:57Z

⌛ Testing commit efae70c with merge 1d33aed...

bors · 2018-09-20T02:51:50Z

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing 1d33aed to master...

rust-highfive assigned nikomatsakis Sep 14, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 14, 2018

nnethercote mentioned this pull request Sep 14, 2018: High memory usage compiling keccak benchmark #54208 (Closed)

Mark-Simulacrum reviewed Sep 14, 2018

nikomatsakis approved these changes Sep 14, 2018

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 18, 2018

bors merged commit efae70c into rust-lang:master Sep 20, 2018

nnethercote deleted the keccak-Liveness-memory branch September 20, 2018 03:26

nnethercote mentioned this pull request Sep 21, 2018: Compress Liveness data some more. #54420 (Merged)
