Always generate GEP i8 / ptradd for struct offsets #121665

Merged · 5 commits · Mar 4, 2024

Conversation

@erikdesjardins (Contributor) commented Feb 27, 2024

This implements #98615, and goes a bit further to remove struct_gep entirely.

Upstream LLVM is in the beginning stages of migrating to ptradd. LLVM 19 will canonicalize all constant-offset GEPs to i8, which has roughly the same effect as this change.
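
At the IR level, a struct field access then becomes a plain byte offset from the base pointer rather than a typed struct GEP. A rough illustration in Rust terms (a hypothetical example, not code from this PR; `byte_add` plus `offset_of!` plays the role of the emitted `getelementptr inbounds i8` / `ptradd`):

#[repr(C)]
struct Pair {
    a: u32,
    b: u64,
}

// A pointer to `p.b` is just the base pointer plus the field's byte offset;
// no LLVM struct type is involved.
unsafe fn field_b_ptr(p: *const Pair) -> *const u64 {
    unsafe { p.byte_add(core::mem::offset_of!(Pair, b)).cast::<u64>() }
}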

Fixes #121719.

Split out from #121577.

r? @nikic

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 27, 2024
@rustbot (Collaborator) commented Feb 27, 2024

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

Comment on lines +97 to +115
// typedef struct va_list {
// void * stack; // next stack param
// void * gr_top; // end of GP arg reg save area
// void * vr_top; // end of FP/SIMD arg reg save area
// int gr_offs; // offset from gr_top to next GP register arg
// int vr_offs; // offset from vr_top to next FP/SIMD register arg
// } va_list;
let va_list_addr = list.immediate();
let va_list_layout = list.deref(bx.cx).layout;
let va_list_ty = va_list_layout.llvm_type(bx);

// There is no padding between fields since `void*` is size=8 align=8, `int` is size=4 align=4.
// See https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst
// Table 1, Byte size and byte alignment of fundamental data types
// Table 3, Mapping of C & C++ built-in data types
let ptr_offset = 8;
let i32_offset = 4;
let gr_top = bx.inbounds_ptradd(va_list_addr, bx.cx.const_usize(ptr_offset));
let vr_top = bx.inbounds_ptradd(va_list_addr, bx.cx.const_usize(2 * ptr_offset));
let gr_offs = bx.inbounds_ptradd(va_list_addr, bx.cx.const_usize(3 * ptr_offset));
let vr_offs = bx.inbounds_ptradd(va_list_addr, bx.cx.const_usize(3 * ptr_offset + i32_offset));
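
These hardcoded constants can be cross-checked against a repr(C) mirror of the structure described in the comment (a sketch for illustration only, assuming a 64-bit target; the struct below is not part of the PR):

use core::ffi::c_void;
use core::mem::offset_of;

#[repr(C)]
struct VaListSketch {
    stack: *mut c_void,  // offset 0
    gr_top: *mut c_void, // offset 8  = ptr_offset
    vr_top: *mut c_void, // offset 16 = 2 * ptr_offset
    gr_offs: i32,        // offset 24 = 3 * ptr_offset
    vr_offs: i32,        // offset 28 = 3 * ptr_offset + i32_offset
}

const _: () = {
    assert!(offset_of!(VaListSketch, gr_top) == 8);
    assert!(offset_of!(VaListSketch, vr_top) == 16);
    assert!(offset_of!(VaListSketch, gr_offs) == 24);
    assert!(offset_of!(VaListSketch, vr_offs) == 28);
};
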
@erikdesjardins (Contributor, Author):

Normally I'd feel bad about this kind of manual layout computation, but this file is already packed full of hardcoded offsets and assumptions.

At some point we should re-evaluate whether we can use the LLVM va_arg directly.

To that point, I'm confused about the comment:

> The LLVM va_arg instruction is lacking in some instances, so we should only use it as a fallback.

...does Clang also implement this logic manually?

Contributor:

Yes. LLVM types are not rich enough to encode all the necessary ABI requirements to make something like the va_arg instruction work correctly. It would probably be best to remove it entirely.

Comment on lines 105 to 109 (the old `struct_gep`-based match, followed by the new byte-offset replacement):
let llval = match self.layout.abi {
_ if offset.bytes() == 0 => {
// Unions and newtypes only use an offset of 0.
// Also handles the first field of Scalar, ScalarPair, and Vector layouts.
self.llval
}
Abi::ScalarPair(..) => {
// FIXME(nikic): Generate this for all ABIs.
bx.inbounds_gep(bx.type_i8(), self.llval, &[bx.const_usize(offset.bytes())])
}
Abi::Scalar(_) | Abi::Vector { .. } if field.is_zst() => {
// ZST fields (even some that require alignment) are not included in Scalar,
// ScalarPair, and Vector layouts, so manually offset the pointer.
bx.gep(bx.cx().type_i8(), self.llval, &[bx.const_usize(offset.bytes())])
}
Abi::Scalar(_) => {
// All fields of Scalar layouts must have been handled by this point.
// Vector layouts have additional fields for each element of the vector, so don't panic in that case.
bug!(
"offset of non-ZST field `{:?}` does not match layout `{:#?}`",
field,
self.layout
);
}
_ => {
let ty = bx.backend_type(self.layout);
bx.struct_gep(ty, self.llval, bx.cx().backend_field_index(self.layout, ix))
}
let llval = if offset.bytes() == 0 {
self.llval
} else {
bx.inbounds_ptradd(self.llval, bx.const_usize(offset.bytes()))
};
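
One practical effect of lowering every field projection to a constant-offset ptradd is that chains of projections collapse into a single constant offset in LLVM. A hypothetical example (the types and function here are illustrative, not from the PR):

#[repr(C)]
struct Inner {
    a: u32,
    b: u64,
}

#[repr(C)]
struct Outer {
    x: u16,
    inner: Inner,
}

fn read_b(o: &Outer) -> u64 {
    // `o.inner` is a ptradd of +8 and `.b` another +8; with byte-offset GEPs
    // LLVM can fold these into a single +16, with no struct types to reconcile.
    o.inner.b
}
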
@erikdesjardins (Contributor, Author):

The old code used non-inbounds GEP for ZSTs. But it's okay for inbounds to point one past the end:

> The base pointer has an in bounds address of an allocated object, which means that it points into an allocated object, or to its end.

...which handles the ZST being "outside" of e.g. struct Foo(u64, ()) (where it is at an offset of 8 bytes), so I think using inbounds for ZST fields is okay. Unless ZSTs can be arbitrarily far outside the bounds of a layout? (@oli-obk?)
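
For the struct Foo(u64, ()) case, a repr(C) version makes the "one past the end" situation concrete (a sketch only; the example above uses the default repr, so this is an analogy rather than a statement about its actual layout):

use core::mem::{offset_of, size_of};

#[repr(C)]
struct Foo(u64, ());

const _: () = {
    // The ZST field sits at offset 8, which equals the size of `Foo`,
    // so a pointer to it points exactly one past the end of the value.
    assert!(offset_of!(Foo, 1) == 8);
    assert!(size_of::<Foo>() == 8);
};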

Contributor:

Miri doesn't catch it yet: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=682be508e94b238472eccdb9e1fdef3b

And it seems to be an open question on whether it is ok or not: rust-lang/unsafe-code-guidelines#93

Tho looks like it's leaning towards it being ok I think?

@erikdesjardins (Contributor, Author):

Only the initial field offset in that example (x.1) will go through this code, so I think it would be fine... but I'll change it back to non-inbounds for ZSTs anyway, so that the only functional change in this PR is the GEP i8/ptradd change. ZST field offsets should never be performance-critical, so it doesn't really matter.

compiler/rustc_codegen_ssa/src/mir/place.rs (outdated review thread, resolved)
@Nilstrieb (Member):

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 27, 2024
Always generate GEP i8 / ptradd for struct offsets

This implements rust-lang#98615, and goes a bit further to remove `struct_gep` entirely.

Upstream LLVM is in the beginning stages of [migrating to `ptradd`](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699). LLVM 19 will [canonicalize](llvm/llvm-project#68882) all constant-offset GEPs to i8, which has roughly the same effect as this change.

Split out from rust-lang#121577.

r? `@nikic`
@bors (Contributor) commented Feb 27, 2024

⌛ Trying commit 4724cd4 with merge 85edff9...

@bors (Contributor) commented Feb 27, 2024

☀️ Try build successful - checks-actions
Build commit: 85edff9 (85edff996ef166238dcfa53fcba0386f7a6cc2f5)


@rust-timer (Collaborator):

Finished benchmarking commit (85edff9): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.4%] | 3 |
| Regressions ❌ (secondary) | 0.5% | [0.1%, 1.0%] | 15 |
| Improvements ✅ (primary) | -0.6% | [-1.6%, -0.3%] | 11 |
| Improvements ✅ (secondary) | -0.6% | [-1.2%, -0.2%] | 10 |
| All ❌✅ (primary) | -0.4% | [-1.6%, 0.4%] | 14 |

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.1% | [3.1%, 3.1%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.4% | [2.3%, 2.5%] | 2 |
| Improvements ✅ (primary) | -1.0% | [-1.6%, -0.6%] | 20 |
| Improvements ✅ (secondary) | -1.0% | [-1.6%, -0.7%] | 3 |
| All ❌✅ (primary) | -1.0% | [-1.6%, -0.6%] | 20 |

Binary size

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.1%, 1.0%] | 12 |
| Regressions ❌ (secondary) | 0.4% | [0.0%, 1.0%] | 18 |
| Improvements ✅ (primary) | -0.4% | [-0.7%, -0.0%] | 9 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.7%, 1.0%] | 21 |

Bootstrap: 649.3s -> 644.43s (-0.75%)
Artifact size: 311.08 MiB -> 311.01 MiB (-0.02%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 27, 2024
@nikic (Contributor) commented Mar 1, 2024

We did not encounter any significant issues when we started canonicalizing these upstream, so I think it's fairly safe to do this step now. We could wait until the LLVM 19 release, but it doesn't really seem necessary.

@bors r+ rollup=never

@bors (Contributor) commented Mar 1, 2024

📌 Commit 4724cd4 has been approved by nikic

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 1, 2024
@RalfJung (Member) commented Mar 2, 2024

This PR is at the top of the bors queue but not being processed...

@RalfJung RalfJung closed this Mar 2, 2024
@RalfJung RalfJung reopened this Mar 2, 2024
@nikic (Contributor) commented Mar 2, 2024

@bors r+

@bors (Contributor) commented Mar 2, 2024

📌 Commit 4016510 has been approved by nikic

It is now in the queue for this repository.

@rust-timer (Collaborator):

Finished benchmarking commit (70aa0b8): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.4%] | 3 |
| Regressions ❌ (secondary) | 0.5% | [0.2%, 1.0%] | 12 |
| Improvements ✅ (primary) | -0.7% | [-1.6%, -0.3%] | 14 |
| Improvements ✅ (secondary) | -0.5% | [-1.0%, -0.3%] | 12 |
| All ❌✅ (primary) | -0.5% | [-1.6%, 0.4%] | 17 |

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.6% | [-1.6%, -1.6%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.2% | [-1.6%, -0.9%] | 4 |
| Improvements ✅ (secondary) | -7.3% | [-7.4%, -7.2%] | 2 |
| All ❌✅ (primary) | -1.2% | [-1.6%, -0.9%] | 4 |

Binary size

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

|  | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.1%, 1.0%] | 12 |
| Regressions ❌ (secondary) | 1.0% | [0.0%, 1.3%] | 37 |
| Improvements ✅ (primary) | -0.4% | [-0.7%, -0.0%] | 9 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.7%, 1.0%] | 21 |

Bootstrap: 649.87s -> 644.02s (-0.90%)
Artifact size: 175.04 MiB -> 174.95 MiB (-0.06%)

@erikdesjardins erikdesjardins deleted the ptradd branch March 4, 2024 01:55
@nnethercote (Contributor):

Are the binary size increases expected?

@erikdesjardins (Contributor, Author):

It's not unexpected per se, since an intended consequence of this is more GEPs being merged, which can push things under inlining/vectorization thresholds.

It looks like almost all of the size regressions are nearly the exact same magnitude:

(screenshot: per-symbol size diffs)
JS diff script:

// For each result row on the perf comparison page, append a column containing (after - before).
[...document.querySelectorAll('tr.toggle')].forEach(e => {
    const [before, after] = [...e.querySelectorAll('td.numeric')];

    const td = document.createElement('td');
    const diff = after.querySelector('abbr').title - before.querySelector('abbr').title;

    td.textContent = diff > 0 ? '+' + diff : diff;
    
    e.append(td);
});

which suggests that something in the std "runtime" is getting slightly bigger.

I'll diff one of those and see if I can figure out what it is.

@Mark-Simulacrum Mark-Simulacrum added the perf-regression-triaged The performance regression has been triaged. label Mar 5, 2024
GuillaumeGomez pushed a commit to GuillaumeGomez/rust that referenced this pull request Mar 5, 2024
Always generate GEP i8 / ptradd for struct offsets

This implements rust-lang#98615, and goes a bit further to remove `struct_gep` entirely.

Upstream LLVM is in the beginning stages of [migrating to `ptradd`](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699). LLVM 19 will [canonicalize](llvm/llvm-project#68882) all constant-offset GEPs to i8, which has roughly the same effect as this change.

Fixes rust-lang#121719.

Split out from rust-lang#121577.

r? `@nikic`
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 6, 2024
Use GEP inbounds for ZST and DST field offsets

ZST field offsets have been non-`inbounds` since I made [this old layout change](https://github.com/rust-lang/rust/pull/73453/files#diff-160634de1c336f2cf325ff95b312777326f1ab29fec9b9b21d5ee9aae215ecf5). Before that, they would have been `inbounds` due to using `struct_gep`. Using `inbounds` for ZSTs likely doesn't matter for performance, but I'd like to remove the special case.

DST field offsets have been non-`inbounds` since the alignment-aware DST field offset computation was first [implemented](erikdesjardins@a2557d4#diff-04fd352da30ca186fe0bb71cc81a503d1eb8a02ca17a3769e1b95981cd20964aR1188) in 1.6 (back then `GEPi()` would be used for `inbounds`), but I don't think there was any reason for it.

Split out from rust-lang#121577 / rust-lang#121665.

r? `@oli-obk`

cc `@RalfJung` -- is there some weird situation where field offsets can't be `inbounds`?

Note that it's fine for `inbounds` offsets to be one-past-the-end, so it's okay even if there's a ZST as the last field in the layout:

> The base pointer has an in bounds address of an allocated object, which means that it points into an allocated object, or to its end. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)

For rust-lang/unsafe-code-guidelines#93, zero-offset GEP is (now) always `inbounds`:

> Note that getelementptr with all-zero indices is always considered to be inbounds, even if the base pointer does not point to an allocated object. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 8, 2024
Use GEP inbounds for ZST and DST field offsets

@erikdesjardins (Contributor, Author) commented Mar 10, 2024

The regression is all in backtrace-related code:

_ZN3std12backtrace_rs9symbolize5gimli7Context3new                                                              +288 bytes (17760 -> 18048) +047 insts (3525 -> 3572), +0 funcs ( 1 ->  1)
_ZN9addr2line16ResUnit$LT$R$GT$25find_function_or_location28_$u7b$$u7b$closure$u7d$$u7d$                       +256 bytes (11792 -> 12048) +023 insts (2507 -> 2530), +0 funcs ( 1 ->  1)
_ZN3std12backtrace_rs9symbolize5gimli7resolve                                                                  +240 bytes (14800 -> 15040) +025 insts (2709 -> 2734), +0 funcs ( 1 ->  1)
_ZN4core6option19Option$LT$$RF$T$GT$6cloned                                                                    +176 bytes (    0 ->   176) +056 insts (   0 ->   56), +1 funcs ( 0 ->  1)
_ZN5gimli4read8rnglists20RngListIter$LT$R$GT$4next                                                             +160 bytes ( 3248 ->  3408) +043 insts ( 813 ->  856), +0 funcs ( 1 ->  1)
_ZN9addr2line11render_file                                                                                     +112 bytes (  848 ->   960) +034 insts ( 201 ->  235), +0 funcs ( 1 ->  1)
_ZN93_$LT$gimli..read..line..LineProgramHeader$LT$R$C$Offset$GT$$u20$as$u20$core..clone..Clone$GT$5clone       +096 bytes ( 1120 ->  1216) +033 insts ( 244 ->  277), +0 funcs ( 1 ->  1)
_ZN5gimli4read5dwarf13Unit$LT$R$GT$3new                                                                        +064 bytes (10224 -> 10288) +016 insts (2063 -> 2079), +0 funcs ( 1 ->  1)
_ZN5gimli4read4unit18Attribute$LT$R$GT$5value                                                                  +032 bytes ( 1328 ->  1360) +006 insts ( 382 ->  388), +0 funcs ( 1 ->  1)
_ZN9addr2line30LoopingLookup$LT$T$C$L$C$F$GT$10new_lookup                                                      +032 bytes ( 1536 ->  1568) -011 insts ( 352 ->  341), +0 funcs ( 1 ->  1)
_ZN9addr2line5Lines5parse                                                                                      +016 bytes ( 9920 ->  9936) +012 insts (2144 -> 2156), +0 funcs ( 1 ->  1)
_ZN5gimli4read5dwarf14Dwarf$LT$R$GT$11attr_string                                                              +016 bytes (  592 ->   608) +004 insts ( 165 ->  169), +0 funcs ( 1 ->  1)
_ZN5alloc7raw_vec11finish_grow                                                                                 +016 bytes (  576 ->   592) +003 insts ( 188 ->  191), +0 funcs ( 4 ->  4)
_ZN3std2io5Write18write_all_vectored                                                                           +016 bytes ( 1328 ->  1344) +003 insts ( 347 ->  350), +0 funcs ( 2 ->  2)
_ZN9addr2line16ResUnit$LT$R$GT$18dwarf_and_unit_dwo                                                            +016 bytes ( 1376 ->  1392) +002 insts ( 289 ->  291), +0 funcs ( 1 ->  1)
_ZN11miniz_oxide7inflate4core10decompress                                                                      +016 bytes ( 7888 ->  7904) +001 insts (1913 -> 1914), +0 funcs ( 1 ->  1)
_ZN5alloc7raw_vec19RawVec$LT$T$C$A$GT$7reserve21do_reserve_and_handle                                          -016 bytes (  944 ->   928) +008 insts ( 232 ->  240), +0 funcs ( 5 ->  5)
_ZN5gimli4read6abbrev13Abbreviations6insert                                                                    -016 bytes ( 4864 ->  4848) +006 insts ( 918 ->  924), +0 funcs ( 1 ->  1)
_ZN9addr2line8function17Function$LT$R$GT$14parse_children                                                      -016 bytes ( 5376 ->  5360) +003 insts (1088 -> 1091), +0 funcs ( 1 ->  1)
_ZN3std4path7PathBuf14_set_extension                                                                           -016 bytes (  624 ->   608) +000 insts ( 167 ->  167), +0 funcs ( 1 ->  1)
_ZN5alloc3ffi5c_str7CString19_from_vec_unchecked                                                               -016 bytes (  352 ->   336) +000 insts (  97 ->   97), +0 funcs ( 1 ->  1)
_ZN91_$LT$std..sys_common..backtrace.._print..DisplayBacktrace$u20$as$u20$core..fmt..Display$GT$3fmt           -016 bytes (  592 ->   576) -002 insts ( 123 ->  121), +0 funcs ( 1 ->  1)
_ZN69_$LT$std..sys..pal..unix..stdio..Stderr$u20$as$u20$std..io..Write$GT$14write_vectored                     -016 bytes (   96 ->    80) -004 insts (  29 ->   25), +0 funcs ( 1 ->  1)
_ZN5alloc7raw_vec19RawVec$LT$T$C$A$GT$16reserve_for_push                                                       -032 bytes ( 3296 ->  3264) +027 insts ( 834 ->  861), +0 funcs (17 -> 17)

(Option::cloned doesn't get inlined...in _ZN9addr2line11render_file.)

(Generated with this ad-hoc script. There's probably some existing way to do this though, isn't there...)

It's kind of interesting that the regression is spread across a bunch of different functions but all in backtrace-related code. Although I guess it's just because one function was optimized differently and the butterfly effect cascaded to everything else in its call graph.

Looking at gimli7Context3new, it's very hard to see anything meaningful in the asm diff: even with all constants and registers normalized, hundreds of instructions are moved around. It doesn't look obviously worse, just... perturbed. Same with find_function_or_location.

So I don't really see anything actionable here.

It does make me wonder if we should be compiling backtrace-related crates with opt-level=s, though: there are a ton of enormous functions, and just scanning through (not necessarily in the changed functions) I see some unrolled loops. Size is probably more of a priority than runtime performance for backtrace code.

@Kobzol (Contributor) commented Mar 10, 2024

> (Generated with this ad-hoc script. There's probably some existing way to do this though, isn't there...)

$  cargo run --bin collector binary_stats --symbols <rustc>

But it will show slightly different stuff than your script.

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 10, 2024
Add #[inline] to Option::copied/cloned

In rust-lang#121665 (comment), I noticed that `Option::cloned` stopped being inlined in some backtrace code. Let's see if this helps.

r? `@ghost`
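
The change itself is tiny; functionally, the method in question looks roughly like this free-function sketch (illustrative only; the real edit adds `#[inline]` to `Option::cloned` and `Option::copied` in `core::option`):

// Sketch of what `Option::<&T>::cloned` does. The `#[inline]` attribute adds an
// inline hint so small combinators like this keep getting inlined at call sites.
#[inline]
fn cloned_sketch<T: Clone>(opt: Option<&T>) -> Option<T> {
    match opt {
        Some(t) => Some(t.clone()),
        None => None,
    }
}
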
@workingjubilee (Contributor):

> It does make me wonder if we should be compiling backtrace-related crates with opt-level=s, though: there are a ton of enormous functions, and just scanning through (not necessarily in the changed functions) I see some unrolled loops. Size is probably more of a priority than runtime performance for backtrace code.

Runtime performance isn't zero concern for backtrace code, oddly enough (mainly for the "forced capture" case), but backtrace-rs contains algorithmic optimizations which already make the appropriate trades, so yes, if you can get a significantly smaller code size by optimizing in that direction, that would be good.

We can also look at working with the addr2line and gimli maintainers to reduce the code size a bit, if we can identify the heaviest-hitters in those crates.

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Use ptradd for vtable indexing

Extension of rust-lang#121665.

After this, the only remaining usages of GEP are [this](https://github.com/rust-lang/rust/blob/cd81f5b27ee00b49d413db50b5e6af871cebcf23/compiler/rustc_codegen_llvm/src/intrinsic.rs#L909-L920) kinda janky Emscription EH code, which I'll change in a future PR, and array indexing / pointer offsets, where there isn't yet a canonical `ptradd` form. (Out of curiosity I tried converting the latter to `ptradd(ptr, mul(size, index))`, but that causes codegen regressions right now.)

r? `@nikic`
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 11, 2024
…=davidtwco

Remove the unused `field_remapping` field from `TypeLowering`

The `field_remapping` field of `TypeLowering` has been unused since rust-lang#121665. This PR removes it, then replaces the `TypeLowering` struct with its only remaining member `&'ll Type`.
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 11, 2024
…=davidtwco

Remove the unused `field_remapping` field from `TypeLowering`

workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 11, 2024
Use ptradd for vtable indexing

rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Rollup merge of rust-lang#122320 - erikdesjardins:vtable, r=nikic

rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Rollup merge of rust-lang#122166 - beetrees:remove-field-remapping, r=davidtwco

Labels
merged-by-bors (This PR was explicitly merged by bors), perf-regression, perf-regression-triaged, S-waiting-on-bors, T-compiler
Development

Successfully merging this pull request may close these issues:

Common branch folding for enum types fails when there are more than three enum variants.