Split `Liveness::users` into three. #54211
This reduces memory usage on some benchmarks because no space is wasted for padding. For a `check-clean` build of `keccak` it reduces `max-rss` by 20%.
@bors try
r? @nikomatsakis, but I want to do a perf run. Locally, I had these results:
- instructions: slight regression
- max-rss: big win on "Clean" builds
- faults: big win on "Clean" and "Nll" builds
- wall-time: small win on "Clean" and "Nll" builds

So I want to see how a different machine compares.
    // separate `Vec`s so that no space is wasted for padding.
    users_reader: Vec<LiveNode>,
    users_writer: Vec<LiveNode>,
    users_used: Vec<bool>,
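To illustrate where the padding comes from, here is a minimal sketch. The `Users` struct and the `u32`-sized `LiveNode` below are assumptions modeled on the field names in the diff, not code copied from rustc. With two 4-byte fields and one `bool`, the struct's alignment of 4 rounds each element up from 9 to 12 bytes, whereas three separate `Vec`s store exactly 9 bytes per entry.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for rustc's `LiveNode`; assumed to be a `u32` newtype.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct LiveNode(u32);

/// Assumed shape of the combined entry before the split.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Users {
    reader: LiveNode,
    writer: LiveNode,
    used: bool,
}

/// Bytes of element storage for `n` entries held in one `Vec<Users>`.
fn combined_bytes(n: usize) -> usize {
    n * size_of::<Users>()
}

/// Bytes of element storage for `n` entries held in three separate `Vec`s
/// (two `Vec<LiveNode>` and one `Vec<bool>`).
fn split_bytes(n: usize) -> usize {
    n * (2 * size_of::<LiveNode>() + size_of::<bool>())
}

fn main() {
    println!("per entry, combined: {} bytes", size_of::<Users>());
    println!(
        "per entry, split: {} bytes",
        2 * size_of::<LiveNode>() + size_of::<bool>()
    );
    println!(
        "1M entries: {} vs {} bytes",
        combined_bytes(1_000_000),
        split_bytes(1_000_000)
    );
}
```

Under these assumptions the split layout saves 3 of every 12 bytes of element storage, at the cost of touching three allocations instead of one.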
Mark-Simulacrum
Sep 14, 2018
Member
Is it worth also making this a bitset/bitvec?
nnethercote
Sep 14, 2018
Author
Contributor
#54208 (comment) discusses this. It would reduce memory more but increase instruction counts. Vec<bool> might be the best middle ground; let's see how the perf results look with it.
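The trade-off mentioned here can be made concrete with a small sketch. `BitVec` below is a hypothetical hand-rolled type, not rustc's bitset. It packs 64 flags into each `u64` word, so it needs roughly one eighth of the memory of a `Vec<bool>`, but every access performs a shift and a mask on top of the load or store, which is where the extra instructions would come from.

```rust
/// Hypothetical minimal bit vector, for illustration only.
struct BitVec {
    words: Vec<u64>,
}

impl BitVec {
    /// Creates a bit vector with room for `len` flags, all cleared.
    fn new(len: usize) -> Self {
        BitVec { words: vec![0; (len + 63) / 64] }
    }

    /// Reads flag `i`: one load plus a shift and a mask.
    fn get(&self, i: usize) -> bool {
        (self.words[i / 64] >> (i % 64)) & 1 == 1
    }

    /// Writes flag `i`: a read-modify-write of the containing word.
    fn set(&mut self, i: usize, value: bool) {
        let mask = 1u64 << (i % 64);
        if value {
            self.words[i / 64] |= mask;
        } else {
            self.words[i / 64] &= !mask;
        }
    }

    /// Bytes of element storage used by the packed words.
    fn storage_bytes(&self) -> usize {
        self.words.len() * std::mem::size_of::<u64>()
    }
}

fn main() {
    let n = 1000;
    let flags = vec![false; n]; // one byte per flag
    let mut bits = BitVec::new(n); // one bit per flag
    bits.set(3, true);
    println!("bit 3 set: {}", bits.get(3));
    println!(
        "Vec<bool>: {} bytes, BitVec: {} bytes",
        flags.len(),
        bits.storage_bytes()
    );
}
```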
@rust-timer build 7da277b

Success: Queued 7da277b with parent 4f921d7, comparison URL.
r=me if the perf results look good; note that we would like to rewrite this completely to operate on MIR. But that's a bigger job (@wesleywiser was interested, I think).
Perf results look like a small perf hit (5-6%) but a big memory use win (20% in some cases). Interesting.
I think this is a rare case where instruction counts are misleading! Note that With all that in mind, here are the results just for
Instructions gets worse, but everything else gets better. And we have a simple theoretical explanation for this: less memory traffic. So I think we should land this, but I am happy to defer to @nikomatsakis's decision.
I think the other important bit to note is that it seems to be mostly the

At the risk of laboring the point: it's a win, not a regression on the time. It's only a regression on instruction counts.
Yes. Sorry if I wasn't clear: I'm very much in favor of merging this PR
…On Fri, Sep 14, 2018, 9:40 PM Nicholas Nethercote ***@***.***> wrote:
A 5% regression on the clean incremental time is usually a very, very small
At the risk of laboring the point: it's a win, not a regression on the
*time*. It's only a regression on instruction counts.
@bors r+

Sorry, forgot I hadn't already written that. Thanks @nnethercote for the extra details.