Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the same DISubprogram for each instance of the same inlined function within a caller #114643

Merged
merged 2 commits into from Aug 22, 2023

Conversation

dpaoliello
Copy link
Contributor

@dpaoliello dpaoliello commented Aug 9, 2023

Issue Details:

The call to panic within a function like Option::unwrap is translated to LLVM as a tail call (as it will never return), when multiple calls to the same function like this is inlined LLVM will notice the common tail call block (i.e., loading the same panic string + location info and then calling panic) and merge them together.

When merging these instructions together, LLVM will also attempt to merge the debug locations as well, but this fails (i.e., debug info is dropped) as Rust emits a new DISubprogram at each inline site thus LLVM doesn't recognize that these are actually the same function and so thinks that there isn't a common debug location.

As an example of this when building for x86_64 Windows (note the lack of .cv_loc before the call to panic, thus it will be attributed to the same line at the addq instruction):

	.cv_loc	0 1 23 0                        # src\lib.rs:23:0
	addq	$40, %rsp
	retq
	leaq	.Lalloc_f570dea0a53168780ce9a91e67646421(%rip), %rcx
	leaq	.Lalloc_629ace53b7e5b76aaa810d549cc84ea3(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17h12e60b9063f6dee8E
	int3

Fix Details:

Cache the DISubprogram emitted for each inlined function instance within a caller so that this can be reused if that instance is encountered again, this also requires caching the DILexicalBlock and DIVariable objects to avoid creating duplicates.

After this change the above assembly now looks like:

	.cv_loc	0 1 23 0                        # src\lib.rs:23:0
	addq	$40, %rsp
	retq
	.cv_inline_site_id 5 within 0 inlined_at 1 0 0
	.cv_inline_site_id 6 within 5 inlined_at 1 12 0
	.cv_loc	6 2 935 0                       # library\core\src\option.rs:935:0
	leaq	.Lalloc_5f55955de67e57c79064b537689facea(%rip), %rcx
	leaq	.Lalloc_e741d4de8cb5801e1fd7a6c6795c1559(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17hde1558f32d5b1c04E
	int3

@rustbot
Copy link
Collaborator

rustbot commented Aug 9, 2023

r? @fee1-dead

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 9, 2023
@rustbot
Copy link
Collaborator

rustbot commented Aug 9, 2023

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo

@rust-log-analyzer

This comment has been minimized.

@fee1-dead
Copy link
Member

r? compiler

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@cjgillot
Copy link
Contributor

@bors try @rust-timer queue

@bors
Copy link
Contributor

bors commented Aug 10, 2023

⌛ Trying commit 450caba2c72fb9909449b7fc2d09f1b47672244c with merge 4f0955dc1384542c3a8e0630c446c01ee2ccbd6c...

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 10, 2023
@bors
Copy link
Contributor

bors commented Aug 10, 2023

💔 Test failed - checks-actions

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 10, 2023
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Aug 11, 2023

@dpaoliello: 🔑 Insufficient privileges: not in try users

@cjgillot
Copy link
Contributor

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Aug 12, 2023

⌛ Trying commit 687bffa with merge bc26305dffb3dc48c9018ff524fe40c907d35b06...

@bors
Copy link
Contributor

bors commented Aug 12, 2023

☀️ Try build successful - checks-actions
Build commit: bc26305dffb3dc48c9018ff524fe40c907d35b06 (bc26305dffb3dc48c9018ff524fe40c907d35b06)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (bc26305dffb3dc48c9018ff524fe40c907d35b06): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.4% [0.4%, 0.4%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.2% [-1.2%, -1.2%] 1
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-0.4%, -0.4%] 1
Improvements ✅
(secondary)
-3.0% [-4.1%, -2.1%] 6
All ❌✅ (primary) -0.4% [-0.4%, -0.4%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.0% [-1.0%, -1.0%] 1
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-0.8%, -0.0%] 62
Improvements ✅
(secondary)
-0.4% [-1.5%, -0.0%] 18
All ❌✅ (primary) -0.4% [-0.8%, -0.0%] 62

Bootstrap: 632.009s -> 631.987s (-0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 12, 2023
@dpaoliello
Copy link
Contributor Author

r? @wesleywiser

@rustbot rustbot assigned wesleywiser and unassigned cjgillot Aug 17, 2023
@wesleywiser
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Aug 22, 2023

📌 Commit 1097e09 has been approved by wesleywiser

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 22, 2023
@bors
Copy link
Contributor

bors commented Aug 22, 2023

⌛ Testing commit 1097e09 with merge 154ae32...

@bors
Copy link
Contributor

bors commented Aug 22, 2023

☀️ Test successful - checks-actions
Approved by: wesleywiser
Pushing 154ae32 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 22, 2023
@bors bors merged commit 154ae32 into rust-lang:master Aug 22, 2023
12 checks passed
@rustbot rustbot added this to the 1.74.0 milestone Aug 22, 2023
@dpaoliello dpaoliello deleted the inlinedebuginfo branch August 22, 2023 22:05
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (154ae32): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.1% [-1.1%, -1.1%] 1
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-0.4%, -0.4%] 1
Improvements ✅
(secondary)
-3.7% [-6.6%, -2.1%] 3
All ❌✅ (primary) -0.4% [-0.4%, -0.4%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-0.8%, -0.0%] 62
Improvements ✅
(secondary)
-0.4% [-1.6%, -0.0%] 18
All ❌✅ (primary) -0.4% [-0.8%, -0.0%] 62

Bootstrap: 635.929s -> 635.404s (-0.08%)
Artifact size: 347.06 MiB -> 347.06 MiB (-0.00%)

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 8, 2023
Use the same DISubprogram for each instance of the same inlined function within a caller

# Issue Details:
The call to `panic` within a function like `Option::unwrap` is translated to LLVM as a `tail call` (as it will never return), when multiple calls to the same function like this are inlined LLVM will notice the common `tail call` block (i.e., loading the same panic string + location info and then calling `panic`) and merge them together.

When merging these instructions together, LLVM will also attempt to merge the debug locations as well, but this fails (i.e., debug info is dropped) as Rust emits a new `DISubprogram` at each inline site thus LLVM doesn't recognize that these are actually the same function and so thinks that there isn't a common debug location.

As an example of this, consider the following program:
```rust
#[no_mangle]
fn add_numbers(x: &Option<i32>, y: &Option<i32>) -> i32 {
    let x1 = x.unwrap();
    let y1 = y.unwrap();

    x1 + y1
}
```

 When building for x86_64 Windows using 1.72 it generates (note the lack of `.cv_loc` before the call to `panic`, thus it will be attributed to the same line at the `addq` instruction):

```llvm
	.cv_loc	0 1 3 0                        # src\lib.rs:3:0
	addq	$40, %rsp
	retq
	leaq	.Lalloc_f570dea0a53168780ce9a91e67646421(%rip), %rcx
	leaq	.Lalloc_629ace53b7e5b76aaa810d549cc84ea3(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17h12e60b9063f6dee8E
	int3
```

# Fix Details:
Cache the `DISubprogram` emitted for each inlined function instance within a caller so that this can be reused if that instance is encountered again.

Ideally, we would also deduplicate child scopes and variables, however my attempt to do that with rust-lang#114643 resulted in asserts when building for Linux (rust-lang#115156) which would require some deep changes to Rust to fix (rust-lang#115455).

Instead, when using an inlined function as a debug scope, we will also create a new child scope such that subsequent child scopes and variables do not collide (from LLVM's perspective).

After this change the above assembly now (with <https://reviews.llvm.org/D159226> as well) shows the `panic!` was inlined from `unwrap` in `option.rs` at line 935 into the current function in `lib.rs` at line 0 (line 0 is emitted since it is ambiguous which line to use as there were two inline sites that lead to this same code):

```llvm
	.cv_loc	0 1 3 0                        # src\lib.rs:3:0
	addq	$40, %rsp
	retq
	.cv_inline_site_id 6 within 0 inlined_at 1 0 0
	.cv_loc	6 2 935 0                       # library\core\src\option.rs:935:0
	leaq	.Lalloc_5f55955de67e57c79064b537689facea(%rip), %rcx
	leaq	.Lalloc_e741d4de8cb5801e1fd7a6c6795c1559(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17hde1558f32d5b1c04E
	int3
```
antoyo pushed a commit to antoyo/rust that referenced this pull request Oct 9, 2023
…wiser

Use the same DISubprogram for each instance of the same inlined function within a caller

# Issue Details:
The call to `panic` within a function like `Option::unwrap` is translated to LLVM as a `tail call` (as it will never return), when multiple calls to the same function like this is inlined LLVM will notice the common `tail call` block (i.e., loading the same panic string + location info and then calling `panic`) and merge them together.

When merging these instructions together, LLVM will also attempt to merge the debug locations as well, but this fails (i.e., debug info is dropped) as Rust emits a new `DISubprogram` at each inline site thus LLVM doesn't recognize that these are actually the same function and so thinks that there isn't a common debug location.

As an example of this when building for x86_64 Windows (note the lack of `.cv_loc` before the call to `panic`, thus it will be attributed to the same line at the `addq` instruction):

```
	.cv_loc	0 1 23 0                        # src\lib.rs:23:0
	addq	$40, %rsp
	retq
	leaq	.Lalloc_f570dea0a53168780ce9a91e67646421(%rip), %rcx
	leaq	.Lalloc_629ace53b7e5b76aaa810d549cc84ea3(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17h12e60b9063f6dee8E
	int3
```

# Fix Details:
Cache the `DISubprogram` emitted for each inlined function instance within a caller so that this can be reused if that instance is encountered again, this also requires caching the `DILexicalBlock` and `DIVariable` objects to avoid creating duplicates.

After this change the above assembly now looks like:

```
	.cv_loc	0 1 23 0                        # src\lib.rs:23:0
	addq	$40, %rsp
	retq
	.cv_inline_site_id 5 within 0 inlined_at 1 0 0
	.cv_inline_site_id 6 within 5 inlined_at 1 12 0
	.cv_loc	6 2 935 0                       # library\core\src\option.rs:935:0
	leaq	.Lalloc_5f55955de67e57c79064b537689facea(%rip), %rcx
	leaq	.Lalloc_e741d4de8cb5801e1fd7a6c6795c1559(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17hde1558f32d5b1c04E
	int3
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants