Rollup of 9 pull requests #123645

matthiaskrgr · 2024-04-08T20:06:25Z

Successful merges:

Fix argument ABI for overaligned structs on ppc64le #122781
Safe Transmute: Compute transmutability from `rustc_target::abi::Layout` #123367
Fix `ByMove` coroutine-closure shim (for 2021 precise closure capturing behavior) #123518
bootstrap: remove unused pub fns #123547
Don't emit divide-by-zero panic paths in `StepBy::len` #123564
Restore `pred_known_to_hold_modulo_regions` #123578
Remove unnecessary cast from `LLVMRustGetInstrProfIncrementIntrinsic` #123591
parser: reduce visibility of unnecessary public `UnmatchedDelim` #123632
CFI: Fix ICE in KCFI non-associated function pointers #123635

r? @ghost
@rustbot modify labels: rollup

The actual ABI implication here is that in some cases the values are required to be "consecutive", i.e. they must either all be passed in registers or all on the stack (without padding). Adjust the code to use either `Uniform::new()` or `Uniform::consecutive()` depending on which behavior is needed. Then, when lowering this in LLVM, skip the `[1 x i128]` to `i128` simplification if `is_consecutive` is set. `i128` is the only case I'm aware of where this is problematic right now. If we find other cases, we can extend this (either based on target information or possibly just by not simplifying for `is_consecutive` entirely).
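The lowering rule described above can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: `UniformModel` and `lower_to_llvm` are hypothetical names, the fields are simplified to bit counts, and the real `Uniform` type in rustc is structured differently.

```rust
/// Hypothetical, simplified stand-in for an argument made of equal-sized units.
/// This is NOT rustc's `Uniform`; it only models the flag discussed above.
#[derive(Clone, Copy, Debug)]
pub struct UniformModel {
    pub unit_bits: u32,
    pub total_bits: u32,
    /// True when all units must be passed together
    /// (all in registers or all on the stack).
    pub is_consecutive: bool,
}

impl UniformModel {
    /// Units that may be split between registers and stack.
    pub fn new(unit_bits: u32, total_bits: u32) -> Self {
        UniformModel { unit_bits, total_bits, is_consecutive: false }
    }

    /// Units that must stay together.
    pub fn consecutive(unit_bits: u32, total_bits: u32) -> Self {
        UniformModel { unit_bits, total_bits, is_consecutive: true }
    }
}

/// Render the LLVM-style type the model would lower to.
/// A one-element array is simplified to the bare unit type
/// unless `is_consecutive` is set.
pub fn lower_to_llvm(u: UniformModel) -> String {
    let count = (u.total_bits + u.unit_bits - 1) / u.unit_bits;
    if count == 1 && !u.is_consecutive {
        format!("i{}", u.unit_bits)
    } else {
        format!("[{} x i{}]", count, u.unit_bits)
    }
}
```

In this model, `lower_to_llvm(UniformModel::consecutive(128, 128))` keeps the array form, while `UniformModel::new(128, 128)` collapses to the bare integer type.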

Fix argument ABI for overaligned structs on ppc64le

When passing a 16 (or higher) aligned struct by value on ppc64le, it needs to be passed as an array of `i128` rather than an array of `i64`. This will force the use of an even starting doubleword.

For the case of a 16 byte struct with alignment 16 it is important that `[1 x i128]` is used instead of `i128` -- apparently, the latter will get treated similarly to `[2 x i64]`, not exhibiting the correct ABI. Add a `force_array` flag to `Uniform` to support this.

The relevant clang code can be found here:
https://github.com/llvm/llvm-project/blob/fe2119a7b08b6e468b2a67768904ea85b1bf0a45/clang/lib/CodeGen/Targets/PPC.cpp#L878-L884
https://github.com/llvm/llvm-project/blob/fe2119a7b08b6e468b2a67768904ea85b1bf0a45/clang/lib/CodeGen/Targets/PPC.cpp#L780-L784

I think the corresponding psABI wording is this:

> Fixed size aggregates and unions passed by value are mapped to as
> many doublewords of the parameter save area as the value uses in
> memory. Aggregates and unions are aligned according to their
> alignment requirements. This may result in doublewords being
> skipped for alignment.

In particular the last sentence. Though I didn't find any wording for Clang's behavior of clamping the alignment to 16.

Fixes rust-lang#122767.

r? `@cuviper`
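For readers who want a concrete picture of the kind of argument affected, here is a minimal sketch of a 16-byte struct with alignment 16 passed by value through the C ABI. The names `Overaligned`, `sum_halves`, and `layout` are made up for illustration and do not come from the PR or its tests. The leading `u64` parameter is there so that, on ppc64le, the struct would otherwise start at an odd doubleword.

```rust
use std::mem::{align_of, size_of};

/// A 16-byte struct whose alignment is raised to 16 (hypothetical example type).
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Overaligned {
    pub lo: u64,
    pub hi: u64,
}

/// Takes the struct by value across the C ABI. The first parameter occupies
/// one doubleword, so an aligned `value` has to skip the next one.
pub extern "C" fn sum_halves(_pad: u64, value: Overaligned) -> u64 {
    value.lo.wrapping_add(value.hi)
}

/// Reports (size, alignment) of the struct in bytes.
pub fn layout() -> (usize, usize) {
    (size_of::<Overaligned>(), align_of::<Overaligned>())
}
```

The code behaves the same on every target when called from Rust; the ABI difference only matters when the caller and callee are compiled by different compilers, for example rustc and clang.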

Safe Transmute: Compute transmutability from `rustc_target::abi::Layout`

In its first step of computing transmutability, `rustc_transmutability` constructs a byte-level representation of type layout (`Tree`). Previously, this representation was computed for ADTs by inspecting the ADT definition and performing our own layout computations. This process was error-prone, verbose, and limited our ability to analyze many types (particularly default-repr types).

In this PR, we instead construct `Tree`s from `rustc_target::abi::Layout`s. This helps ensure that layout optimizations are reflected in our analyses, and increases the kinds of types we can now analyze, including:

- default repr ADTs
- transparent unions
- `UnsafeCell`-containing types

Overall, this PR expands the expressivity of `rustc_transmutability` to be much closer to the transmutability analysis performed by miri. Future PRs will work to close the remaining gaps (e.g., support for `Box`, raw pointers, `NonZero*`, coroutines, etc.).

r? `@compiler-errors`
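To make "byte-level representation" more concrete, here is a rough sketch for a `#[repr(C)]` struct, where the layout rules are fully specified and can be worked out by hand. `Padded`, `Byte`, and `padded_bytes` are hypothetical names for illustration only; this is not the `Tree` type from `rustc_transmutability`, and it does not model default-repr types, whose layout only the compiler knows.

```rust
use std::mem::{align_of, size_of};

/// Example type: one byte of data, padding, then a `u32`.
#[repr(C)]
pub struct Padded {
    pub tag: u8,
    pub value: u32,
}

/// Hypothetical classification of each byte in a layout.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Byte {
    Data,
    Padding,
}

/// Builds the byte-by-byte view of `Padded` from the `repr(C)` rules.
pub fn padded_bytes() -> Vec<Byte> {
    let mut bytes = Vec::new();
    // `tag` sits at offset 0.
    bytes.push(Byte::Data);
    // `value` must start at a multiple of its alignment, so the gap is padding.
    let gap = align_of::<u32>() - 1;
    bytes.extend(std::iter::repeat(Byte::Padding).take(gap));
    // The bytes of `value` itself.
    bytes.extend(std::iter::repeat(Byte::Data).take(size_of::<u32>()));
    bytes
}
```

A transmutability check has to treat the padding bytes differently from data bytes, which is why an accurate layout source matters.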

…li-obk

Fix `ByMove` coroutine-closure shim (for 2021 precise closure capturing behavior)

This PR reworks the way that we perform the `ByMove` coroutine-closure shim to account for the fact that the upvars of the outer coroutine-closure and the inner coroutine might not line up due to the edition-2021 closure capture rule changes. Specifically, the number of upvars may differ *and/or* the inner coroutine may have additional projections applied to an upvar. This PR reworks the information we pass into the `ByMoveBody` MIR visitor to account for both of these facts.

I tried to leave comments explaining exactly what everything is doing, but let me know if you have questions.

r? oli-obk

bootstrap: remove unused pub fns

Looks dead, remove.

Don't emit divide-by-zero panic paths in `StepBy::len`

I happened to notice today that there are actually two such calls emitted in the assembly: <https://rust.godbolt.org/z/1Wbbd3Ts6>

Since they're impossible, hopefully telling LLVM that will also help optimizations elsewhere.
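As a general illustration of the idea (not the code from this PR), one way to let the compiler know a divisor is never zero is to carry it as a `NonZeroUsize`. The standard library implements `/` and `%` for `usize` with a `NonZeroUsize` right-hand side, and those operations have no divide-by-zero panic path. The function name `steps_remaining` is made up for this sketch.

```rust
use std::num::NonZeroUsize;

/// Counts how many items a stepped iteration over `len` elements yields,
/// i.e. the ceiling of `len / step`. Because `step` is `NonZeroUsize`,
/// neither the division nor the remainder can panic.
pub fn steps_remaining(len: usize, step: NonZeroUsize) -> usize {
    let quotient = len / step;
    if len % step == 0 {
        quotient
    } else {
        quotient + 1
    }
}
```

For example, 10 elements with a step of 3 yield 4 items (indices 0, 3, 6, 9).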

…errors

Restore `pred_known_to_hold_modulo_regions`

As requested by `@lcnr` in rust-lang#123275 (comment) this PR restores `pred_known_to_hold_modulo_regions` to fix that "unexpected unsized tail" beta regression.

This also adds the reduced repro from rust-lang#123275 (comment) as a sub-optimal test is better than no test at all, and it'll also cover rust-lang#108721. It still ICEs on master, even though https://github.com/phlip9/rustc-warp-ice doesn't on nightly anymore, since rust-lang#122493.

Fixes rust-lang#123275.

r? `@compiler-errors` but feel free to close if you'd rather have a better test instead

cc `@wesleywiser` who had signed up to do the revert

Will need a backport if we go with this PR: `@rustbot` label +beta-nominated

Remove unnecessary cast from `LLVMRustGetInstrProfIncrementIntrinsic`

(Noticed while reviewing rust-lang#123409.)

This particular cast appears to have been copied over from clang, but there are plenty of other call sites in clang that don't bother with a cast here, and it works fine without one.

For context, `llvm::Intrinsic::ID` is a typedef for `unsigned`, and `llvm::Intrinsic::instrprof_increment` is a member of `enum IndependentIntrinsics : unsigned`.

---

The formatting change in `unwrap(M)` is the result of manually running `clang-format` on this file, and then reverting all changes other than the ones affecting these lines.

…ity, r=compiler-errors

parser: reduce visibility of unnecessary public `UnmatchedDelim`

The `lexer::UnmatchedDelim` struct in `rustc_parse` is unnecessarily public outside of the crate. This commit reduces its visibility to `pub(crate)`. In addition, it removes the unnecessary field `expected_delim`, which causes warnings once the visibility is reduced.

…rrors

CFI: Fix ICE in KCFI non-associated function pointers

We oddly weren't testing the more usual case of casting non-methods to function pointers. The KCFI shim insertion logic would ICE on these due to asking for an irrefutable associated item if we cast a function to a function pointer without needing a traditional shim.

r? `@compiler-errors`
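The two kinds of cast mentioned above look like this in ordinary Rust. This is only a sketch of the source-level shape: it does not enable KCFI (that needs nightly sanitizer flags), and `Counter`, `add_one`, and `demo` are hypothetical names, not taken from the PR's tests.

```rust
pub struct Counter {
    pub value: i32,
}

impl Counter {
    /// A method, i.e. an associated item of `Counter`.
    pub fn get(&self) -> i32 {
        self.value
    }
}

/// A free function: it is not an associated item of any type or trait.
pub fn add_one(x: i32) -> i32 {
    x + 1
}

/// Casts both kinds of function to function pointers and calls them.
pub fn demo() -> (i32, i32) {
    // The "more usual case": a non-method cast to a function pointer.
    let free_ptr: fn(i32) -> i32 = add_one;
    // The method case, which was already covered by tests.
    let method_ptr: fn(&Counter) -> i32 = Counter::get;
    (free_ptr(41), method_ptr(&Counter { value: 7 }))
}
```

The ICE described above concerned the first cast, where there is no associated item to look up.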

matthiaskrgr · 2024-04-08T20:08:42Z

@bors r+ rollup=never p=9

bors · 2024-04-08T20:08:44Z

📌 Commit 0520200 has been approved by matthiaskrgr

It is now in the queue for this repository.

bors · 2024-04-08T20:31:11Z

⌛ Testing commit 0520200 with merge ab5bda1...

bors · 2024-04-08T22:33:00Z

☀️ Test successful - checks-actions
Approved by: matthiaskrgr
Pushing ab5bda1 to master...

rust-timer · 2024-04-08T22:35:30Z

📌 Perf builds for each rolled up PR:

PR#	Message	Perf Build Sha
#122781	Fix argument ABI for overaligned structs on ppc64le	`4d11f102478c26159e85fc7a18b5dda982a910af` (link)
#123367	Safe Transmute: Compute transmutability from `rustc_target:…	`d48eec570e52dd40d56dfe2ace23860a2f4d1bcc` (link)
#123518	Fix `ByMove` coroutine-closure shim (for 2021 precise closu…	`45462ccdbe38f5d714cbf7a2d20eba2b3cb764f8` (link)
#123547	bootstrap: remove unused pub fns	`20b02e2984f2bfacbd9f0fbb284a4152f295456f` (link)
#123564	Don't emit divide-by-zero panic paths in `StepBy::len`	`996a3d3a2519d88b7dfcd5a3338852d3c66353dc` (link)
#123578	Restore `pred_known_to_hold_modulo_regions`	`27f15405ff45d13bad79f35d0db9ff86b5077162` (link)
#123591	Remove unnecessary cast from `LLVMRustGetInstrProfIncrement…	`305df686c4203cd60c4addfcda683a529b18d463` (link)
#123632	parser: reduce visibility of unnecessary public `UnmatchedD…	`db7e4db29b614de21ca340391312c2e42fe2dcaa` (link)
#123635	CFI: Fix ICE in KCFI non-associated function pointers	`e5abc3f2c21db8dca36a9ef8b23adb22131b3818` (link)

previous master: 211518e5fb

In the case of a perf regression, run the following command for each PR you suspect might be the cause: @rust-timer build $SHA

rust-timer · 2024-04-08T23:48:34Z

Finished benchmarking commit (ab5bda1): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.6%, 0.6%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.3%	[-0.4%, -0.3%]	2
Improvements ✅ (secondary)	-1.8%	[-1.8%, -1.8%]	1
All ❌✅ (primary)	-0.0%	[-0.4%, 0.6%]	3

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.1%	[1.3%, 5.0%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.7%	[-2.7%, -2.7%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.7%	[-2.7%, 5.0%]	4

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.0%, 0.4%]	27
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.1%	[0.0%, 0.4%]	27

Bootstrap: 668.886s -> 669.403s (0.08%)
Artifact size: 318.49 MiB -> 318.50 MiB (0.00%)

pnkfelix · 2024-04-09T20:55:36Z

Looks like a temporary spike. Marking as triaged.

@rustbot label: +perf-regression-triaged

compiler-errors and others added 23 commits April 5, 2024 15:28

Rework the ByMoveBody shim to actually work correctly

3674032

Add some helpful comments

0f13bd4

Check the base of the place too!

49c4ebc

Account for an additional reborrow inserted by UniqueImmBorrow and Mu…

ad0fcac

…tBorrow

bootstrap: remove unused pub fns

e0af5ea

Don't emit divide-by-zero panic paths in StepBy::len

00bd247

add non-regression test for issue 123275

54f8db8

Revert "remove pred_known_to_hold_modulo_regions"

68b4257

This reverts commit 399a258.

Rollup merge of rust-lang#123547 - klensy:bs-pubs, r=onur-ozkan

5627497

bootstrap: remove unused pub fns Looks dead, remove.

bors added S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) and removed S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) labels Apr 8, 2024

bors added the merged-by-bors (This PR was explicitly merged by bors.) label Apr 8, 2024

bors merged commit ab5bda1 into rust-lang:master Apr 8, 2024
12 checks passed

rustbot added this to the 1.79.0 milestone Apr 8, 2024

bors mentioned this pull request Apr 8, 2024

Relocate coroutine upvars into Unresumed state #120168

Closed

rustbot added the perf-regression (Performance regression.) label Apr 8, 2024

rustbot added the perf-regression-triaged (The performance regression has been triaged.) label Apr 9, 2024

matthiaskrgr deleted the rollup-yd8d7f1 branch September 1, 2024 17:36
