Pack u128 in the compiler to mitigate new alignment #120080

cuviper · 2024-01-18T02:26:33Z

This is based on #116672, adding a new #[repr(packed(8))] wrapper on u128 to avoid changing any of the compiler's size assertions. This is needed in two places:

- SwitchTargets, otherwise its SmallVec<[u128; 1]> gets padded up to 32 bytes.
- LitKind::Int, so that entire enum can stay 24 bytes.
  - This change definitely has far-reaching effects though, since it's public.
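
A minimal sketch of the size effect described above, not the compiler's actual definitions. `TaggedPlain` and `TaggedPacked` are hypothetical stand-ins for a small tag plus a 128-bit payload, the `get` helper is illustrative, and `#[repr(C)]` is used only so the layout is predictable for the illustration (the real rustc types use the default representation). On a target where `u128` is 16-byte aligned, the plain struct needs 32 bytes while the packed one fits in 24.

```rust
use std::mem::{align_of, size_of};

/// A `u128` whose alignment is capped at 8 bytes, as in the description above.
#[repr(packed(8))]
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Pu128(pub u128);

impl Pu128 {
    /// Copy the value out by value; taking a reference to a packed field
    /// could produce a misaligned reference, which the compiler rejects.
    pub fn get(self) -> u128 {
        self.0
    }
}

/// Hypothetical tagged payload holding a plain `u128` (assumption, not rustc code).
#[repr(C)]
pub struct TaggedPlain {
    pub tag: u8,
    pub value: u128,
}

/// The same shape using the packed wrapper.
#[repr(C)]
pub struct TaggedPacked {
    pub tag: u8,
    pub value: Pu128,
}

fn main() {
    // The wrapper keeps the full 16 bytes of data but lowers the alignment.
    println!("u128:         size {}, align {}", size_of::<u128>(), align_of::<u128>());
    println!("Pu128:        size {}, align {}", size_of::<Pu128>(), align_of::<Pu128>());

    // With repr(C), the payload starts at the first offset that satisfies its
    // alignment, so the padding after `tag` grows with the payload's alignment.
    println!("TaggedPlain:  size {}", size_of::<TaggedPlain>());
    println!("TaggedPacked: size {}", size_of::<TaggedPacked>());

    let lit = TaggedPacked { tag: 0, value: Pu128(42) };
    assert_eq!(lit.value.get(), 42);
}
```

The printed sizes depend on the target and toolchain, which is why the sketch prints them rather than hard-coding expected values.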

cuviper · 2024-01-18T02:26:54Z

@bors try @rust-timer queue

bors · 2024-01-18T02:28:04Z

⌛ Trying commit 6718116 with merge 966f88e...

bors · 2024-01-18T03:53:28Z

☀️ Try build successful - checks-actions
Build commit: 966f88e (966f88e8ebc1d23bfabb946b308fd0afb43d94a9)

rust-timer · 2024-01-18T06:49:24Z

Finished benchmarking commit (966f88e): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.7%, -0.4%]	2
Improvements ✅ (secondary)	-0.5%	[-1.1%, -0.1%]	8
All ❌✅ (primary)	-0.5%	[-0.7%, -0.4%]	2

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.3%	[0.9%, 1.7%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.7%	[-5.3%, -0.8%]	6
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 663.454s -> 665.105s (0.25%)
Artifact size: 308.30 MiB -> 308.35 MiB (0.02%)

nikic · 2024-01-18T08:38:26Z

@bors try @rust-timer queue

(With just the SwitchTargets change, to see how much each contributes to memory usage.)

bors · 2024-01-18T08:39:40Z

⌛ Trying commit 477f5f5 with merge f72022d...

bors · 2024-01-18T10:06:07Z

☀️ Try build successful - checks-actions
Build commit: f72022d (f72022d49fa49cf47479c9c9dd2da621a3740c0d)

rust-timer · 2024-01-18T12:18:51Z

Finished benchmarking commit (f72022d): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.7%	[0.4%, 1.4%]	6
Improvements ✅ (primary)	-0.5%	[-0.5%, -0.5%]	1
Improvements ✅ (secondary)	-0.5%	[-1.0%, -0.4%]	9
All ❌✅ (primary)	-0.5%	[-0.5%, -0.5%]	1

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.7%	[0.6%, 3.4%]	3
Regressions ❌ (secondary)	3.4%	[0.4%, 12.2%]	21
Improvements ✅ (primary)	-1.7%	[-1.7%, -1.7%]	1
Improvements ✅ (secondary)	-2.3%	[-2.3%, -2.3%]	1
All ❌✅ (primary)	0.9%	[-1.7%, 3.4%]	4

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 665.247s -> 665.866s (0.09%)
Artifact size: 308.30 MiB -> 308.34 MiB (0.01%)

nikic · 2024-01-18T12:57:34Z

Okay, seems pretty clear that the LitKind::Int change is the more important one. Given that we're going from a 10% max-rss regression to a 5% improvement (huh, why do we get an improvement over baseline...?) on deep-vector, I tend to think that doing this is worthwhile, even if it makes things slightly uglier.

cuviper · 2024-01-18T18:35:10Z

OK, should I add my commits to the other PR, or would you rather handle this separately?

nikic · 2024-01-18T20:36:39Z

@cuviper Let's do this one separately.

rustbot · 2024-01-20T04:19:02Z

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras

cuviper · 2024-01-20T04:19:31Z

r? nikic

RalfJung · 2024-01-20T07:04:27Z

compiler/rustc_data_structures/src/packed.rs

+
+#[repr(packed(8))]
+#[derive(Copy, Clone, Debug, Hash, PartialEq, Eq, PartialOrd, Ord)]
+pub struct Pu128(pub u128);


Is there any reason this needs to be 8-aligned? Is there anything to be gained from reducing alignment further?

I just chose that to match its former alignment, which should still be a nice "native" alignment (on 64-bit hosts). We could try smaller, but in a quick local check, even a full #[repr(packed)] didn't change these static_assert_sizes any further. Generally, I suspect there's almost always going to be a usize or pointer nearby keeping overall struct alignment up.

Maybe we should make it match target_pointer_width? But we don't test performance on other targets...
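
The point about a neighbouring `usize` or pointer keeping the overall alignment up can be sketched as follows. This is an illustration under stated assumptions: `Packed8`, `Packed1`, `WithPacked8`, and `WithPacked1` are hypothetical names, and `#[repr(C)]` is used only to make the layout deterministic. On a 64-bit host the two outer structs come out the same size, because the `usize` field already forces 8-byte alignment on the whole struct.

```rust
use std::mem::{align_of, size_of};

/// `u128` with alignment capped at 8 bytes (the choice made in this PR).
#[repr(packed(8))]
#[derive(Copy, Clone)]
pub struct Packed8(pub u128);

/// `u128` with alignment lowered all the way to 1 byte.
#[repr(packed)]
#[derive(Copy, Clone)]
pub struct Packed1(pub u128);

/// Hypothetical struct pairing a pointer-sized field with the 8-aligned wrapper.
#[repr(C)]
pub struct WithPacked8 {
    pub ptr: usize,
    pub value: Packed8,
}

/// The same struct using the fully packed wrapper.
#[repr(C)]
pub struct WithPacked1 {
    pub ptr: usize,
    pub value: Packed1,
}

fn main() {
    println!("Packed8: align {}", align_of::<Packed8>());
    println!("Packed1: align {}", align_of::<Packed1>());

    // The struct's alignment is the maximum over its fields, so the `usize`
    // keeps it pointer-aligned regardless of how tightly the u128 is packed.
    println!("WithPacked8: size {}, align {}", size_of::<WithPacked8>(), align_of::<WithPacked8>());
    println!("WithPacked1: size {}, align {}", size_of::<WithPacked1>(), align_of::<WithPacked1>());
}
```

On targets with a narrower pointer width the two sizes can differ, which is consistent with the question about matching target_pointer_width.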

cuviper · 2024-01-20T19:35:04Z

Let's give this a new perf run for the delta since #116672 merged.

@bors try @rust-timer queue

bors · 2024-01-20T19:36:15Z

⌛ Trying commit 33e0422 with merge 52da7ed...

bors · 2024-01-20T21:02:11Z

☀️ Try build successful - checks-actions
Build commit: 52da7ed (52da7edc6b6a6bb2373f1aa7f8d9e88cc4db990d)

rust-timer · 2024-01-20T22:22:51Z

Finished benchmarking commit (52da7ed): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.2%, 0.4%]	4
Regressions ❌ (secondary)	2.5%	[2.5%, 2.5%]	1
Improvements ✅ (primary)	-0.8%	[-0.8%, -0.8%]	1
Improvements ✅ (secondary)	-0.6%	[-0.7%, -0.4%]	5
All ❌✅ (primary)	0.1%	[-0.8%, 0.4%]	5

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.0%	[1.1%, 2.4%]	3
Improvements ✅ (primary)	-1.1%	[-1.7%, -0.6%]	3
Improvements ✅ (secondary)	-4.1%	[-10.9%, -1.3%]	20
All ❌✅ (primary)	-1.1%	[-1.7%, -0.6%]	3

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 665.65s -> 665.236s (-0.06%)
Artifact size: 308.31 MiB -> 308.29 MiB (-0.01%)

nikic · 2024-01-22T09:19:24Z

@bors r+ rollup=never

bors · 2024-01-22T09:19:27Z

📌 Commit 33e0422 has been approved by nikic

It is now in the queue for this repository.

bors · 2024-01-22T13:08:23Z

⌛ Testing commit 33e0422 with merge 3066253...

bors · 2024-01-22T15:07:11Z

☀️ Test successful - checks-actions
Approved by: nikic
Pushing 3066253 to master...

rust-timer · 2024-01-22T16:20:59Z

Finished benchmarking commit (3066253): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.2%, 0.5%]	12
Regressions ❌ (secondary)	1.2%	[0.4%, 2.1%]	2
Improvements ✅ (primary)	-0.7%	[-0.7%, -0.7%]	1
Improvements ✅ (secondary)	-0.6%	[-0.7%, -0.3%]	5
All ❌✅ (primary)	0.3%	[-0.7%, 0.5%]	13

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	1.8%	[1.5%, 2.0%]	3
Improvements ✅ (primary)	-1.3%	[-2.4%, -0.7%]	3
Improvements ✅ (secondary)	-3.8%	[-9.9%, -1.4%]	21
All ❌✅ (primary)	-1.3%	[-2.4%, -0.7%]	3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.2%	[2.9%, 3.3%]	4
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 663.975s -> 664.769s (0.12%)
Artifact size: 308.39 MiB -> 308.37 MiB (-0.01%)

Kobzol · 2024-01-23T08:34:37Z

The max-RSS improvements outweigh the instruction count losses.

@rustbot label: +perf-regression-triaged

rustbot added the S-waiting-on-review, T-compiler, and T-rustdoc labels Jan 18, 2024

rustbot added the S-waiting-on-perf label Jan 18, 2024

rustbot removed the S-waiting-on-perf label Jan 18, 2024

cuviper mentioned this pull request Jan 18, 2024

LLVM 18 x86 data layout update #116672

Merged

rustbot added the S-waiting-on-perf label Jan 18, 2024

rustbot added the perf-regression label and removed the S-waiting-on-perf label Jan 18, 2024

cuviper added 3 commits January 19, 2024 20:10

Add Pu128 = #[repr(packed(8))] u128

cb7d863

Pack the u128 in SwitchTargets

167555f

Pack the u128 in LitKind::Int

33e0422

cuviper force-pushed the 128-align-packed branch from 477f5f5 to 33e0422 Compare January 20, 2024 04:16

rustbot assigned nikic Jan 20, 2024

RalfJung reviewed Jan 20, 2024

rustbot added the S-waiting-on-perf label Jan 20, 2024

rustbot removed the S-waiting-on-perf label Jan 20, 2024

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Jan 22, 2024

bors added the merged-by-bors label Jan 22, 2024

bors merged commit 3066253 into rust-lang:master Jan 22, 2024

rustbot added this to the 1.77.0 milestone Jan 22, 2024

bors mentioned this pull request Jan 22, 2024

Rollup of 10 pull requests #120235

Closed

cuviper deleted the 128-align-packed branch January 23, 2024 04:34

rustbot added the perf-regression-triaged label Jan 23, 2024

celinval mentioned this pull request Jan 24, 2024

Upgrade Rust toolchain to nightly-2024-01-23 model-checking/kani#2983

Merged
