Lower `Or` pattern without allocating place #111752

dingxiangfei2009 · 2023-05-19T07:53:15Z

While reviewing #111644, it occurs to me that while we directly lower conjunctive predicates, which are connected with &&, into the desirable control flow, today we don't directly lower the disjunctive predicates, which are connected with ||, in the similar fashion. Instead, we allocate a place for the boolean temporary to hold the result of evaluating the || expression.

Usually I would expect optimization at later stages to "inline" the evaluation of boolean predicates into simple CFG, but #111583 is an example where && is failing to be optimized away and the assembly shows that both the expensive operands are evaluated. Therefore, I would like to make a small change to make the CFG a bit more straight-forward without invoking the as_temp machinery, and plus avoid allocating the place to hold the boolean result as well.

rustbot · 2023-05-19T07:53:21Z

r? @davidtwco

(rustbot has picked a reviewer for you, use r? to override)

azizghuloum · 2023-05-19T09:16:58Z

@dingxiangfei2009 was going to come after || once done with the && case. There are also the ! and if cases that probably needlessly allocate as well. One step at a time I guess.

dingxiangfei2009 · 2023-05-19T09:29:09Z

@azizghuloum Okay, I will do the Not case too.

cjgillot

Could you add a few tests for this? For instance a mir-opt/built test to show the resulting MIR with temporaries to drop?
If you can, make the test a preceding commit, so we can see the diff with this change.

compiler/rustc_mir_build/src/build/matches/mod.rs

cjgillot · 2023-05-19T09:28:17Z

compiler/rustc_mir_build/src/build/matches/mod.rs

@@ -35,7 +35,7 @@ use std::mem;
 impl<'a, 'tcx> Builder<'a, 'tcx> {
    pub(crate) fn then_else_break(


Could you use the occasion to add a comment?

Yes, will do.

compiler/rustc_mir_build/src/build/matches/mod.rs

cjgillot · 2023-05-27T16:03:59Z

tests/mir-opt/building/logical_or_in_conditional.rs

+}
+
+fn main() {
+    test_or();


Could you add the ExprKind::And, ExprKind::Use, ExprKind::Not cases too?
And also a complex one with a let condition somewhere?

Will do. I will also incorprate the other one from #111644.

tests/mir-opt/building/logical_or_in_conditional.test_or.built.after.mir

dingxiangfei2009 · 2023-06-05T20:11:29Z

I am trying another way to reduce the basic block count.

dingxiangfei2009 · 2023-06-13T22:46:49Z

cc @cjgillot Good day! I am going to drop the draft status on this PR and requesting for another review.

The majority of excessive block production is due to scope maintenance. To reduce this, this new patch sees a bit of overhaul of the lowering process. To assist construction of scope maintenance, a helper ThenElseBreakPads is introduced, which keeps a stack of "landing pads" to be connected after exiting the scopes. These "landing pads" are the blocks at the tails of a chain of blocks produced in order to break out one or more scopes by dropping temps and storages. With this helper, we "recycle" clean blocks as much as we can and avoid creation of simple "goto" blocks if exiting some scopes are no-ops.

As a result, for the following Rust program,

if Droppy(0).0 > 0 || Droppy(1).0 > 1 {}

... we have this control flow in MIR, visualised.

I have to confess, this may not be the best way to do it. Any hint or pointer is very much welcome. 🙇

bors · 2023-06-14T18:12:34Z

☔ The latest upstream changes (presumably #112418) made this pull request unmergeable. Please resolve the merge conflicts.

bors · 2023-09-01T10:54:59Z

⌛ Testing commit 67553e8 with merge f4555ef...

bors · 2023-09-01T12:39:53Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing f4555ef to master...

Zalathar · 2023-09-02T04:47:07Z

tests/run-coverage/lazy_boolean.coverage

   LL|       |    if
-   LL|      1|        !
+   LL|       |        !
   LL|      1|        is_true
   LL|      0|    {


From a coverage perspective this seems wrong; the ! operator should have the same execution count as its operand.

However, at a glance I can't say whether this PR is actually doing something wrong, or just exposing a latent issue in the coverage instrumentor's heuristics.

(I'm not particularly concerned about this being a problem in practice; I just want to explicitly note it as something for me to keep my eye on.)

It might be the case that because ! in if condition is now directly lowered into switchInts, by-passing the operator evaluation, the profiler might not pick it up. In other contexts, such as boolean expressions, ! should still show up I think.

That makes sense. So if !cond { a } else { b } gets lowered as though it were if cond { b } else { a }. The ! doesn’t exist in MIR, and the coverage pass doesn’t have a special case to detect this, so it isn’t included in the coverage region.

This seems acceptable for now. The mappings are slightly worse, but they aren’t lying, and most coverage users probably won’t notice the difference.

rust-timer · 2023-09-02T07:40:45Z

Finished benchmarking commit (f4555ef): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.4%	[0.2%, 1.0%]	8
Regressions ❌ (secondary)	0.9%	[0.5%, 1.6%]	21
Improvements ✅ (primary)	-1.0%	[-3.7%, -0.3%]	10
Improvements ✅ (secondary)	-0.4%	[-0.5%, -0.3%]	3
All ❌✅ (primary)	-0.3%	[-3.7%, 1.0%]	18

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	4.6%	[1.3%, 11.6%]	4
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.4%	[-7.5%, -0.1%]	7
Improvements ✅ (secondary)	-2.8%	[-3.5%, -2.5%]	3
All ❌✅ (primary)	-0.5%	[-7.5%, 11.6%]	11

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.4%	[1.4%, 1.4%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.6%	[-2.7%, -0.5%]	2
Improvements ✅ (secondary)	-2.3%	[-2.4%, -2.2%]	3
All ❌✅ (primary)	-0.6%	[-2.7%, 1.4%]	3

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 0.4%]	3
Regressions ❌ (secondary)	0.3%	[0.1%, 1.3%]	13
Improvements ✅ (primary)	-0.5%	[-2.3%, -0.0%]	102
Improvements ✅ (secondary)	-0.4%	[-0.7%, -0.0%]	17
All ❌✅ (primary)	-0.5%	[-2.3%, 0.4%]	105

Bootstrap: 629.909s -> 631.77s (0.30%)
Artifact size: 316.39 MiB -> 316.73 MiB (0.11%)

dingxiangfei2009 · 2023-09-03T13:59:00Z

I also owe here an explanation about a change in codegen test slice-as_chunks.rs, which expects lshr exact or udiv exact call in the LLVM IR.

It turns out that because the emitted instructions are re-arranged, leading to the optimizer to recognize the udiv exact .. 4 instruction as a target for common expression elimination with the first udiv ... 4. When InstCombine is run, udiv instructions are rewritten with bit operations and a lshr, but unfortunately not a lshr exact.

The IR related to this issue is attached here.

nnethercote · 2023-09-03T21:10:25Z

Performance-wise, icounts/cycles/wall-times have a mix of wins and losses that more or less balance out. But binary size has lots of wins, which is nice.

pnkfelix · 2023-09-06T01:55:33Z

Performance-wise, icounts/cycles/wall-times have a mix of wins and losses that more or less balance out. But binary size has lots of wins, which is nice.

@rustbot label: +perf-regression-triaged

Make the success arms of `if lhs || rhs` meet up in a separate block Extracted from rust-lang#118305, where this is necessary to avoid introducing a bug when injecting marker statements into the then/else arms. --- In the previous code (rust-lang#111752), the success block of `lhs` would jump directly to the success block of `rhs`. However, `rhs_success_block` could already contain statements that are specific to the RHS, and the direct goto causes them to be executed in the LHS success path as well. This patch therefore creates a fresh block that the LHS and RHS success blocks can both jump to. --- I think the reason we currently get away with this is that `rhs_success_block` usually doesn't contain anything other than StorageDead statements for locals used by the RHS, and those statements don't seem to cause problems in the LHS success path (which never makes those locals live). But if we start adding meaningful statements for branch coverage (or MC/DC coverage), it's important to keep the LHS and RHS blocks separate.

Rollup merge of rust-lang#121784 - Zalathar:if-or-converge, r=Nadrieril Make the success arms of `if lhs || rhs` meet up in a separate block Extracted from rust-lang#118305, where this is necessary to avoid introducing a bug when injecting marker statements into the then/else arms. --- In the previous code (rust-lang#111752), the success block of `lhs` would jump directly to the success block of `rhs`. However, `rhs_success_block` could already contain statements that are specific to the RHS, and the direct goto causes them to be executed in the LHS success path as well. This patch therefore creates a fresh block that the LHS and RHS success blocks can both jump to. --- I think the reason we currently get away with this is that `rhs_success_block` usually doesn't contain anything other than StorageDead statements for locals used by the RHS, and those statements don't seem to cause problems in the LHS success path (which never makes those locals live). But if we start adding meaningful statements for branch coverage (or MC/DC coverage), it's important to keep the LHS and RHS blocks separate.

rustbot assigned davidtwco May 19, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 19, 2023

This comment has been minimized.

Sign in to view

dingxiangfei2009 force-pushed the lower-or-pattern branch from aabf291 to 3863c63 Compare May 19, 2023 09:28

cjgillot reviewed May 19, 2023

View reviewed changes

cjgillot self-assigned this May 19, 2023

cjgillot mentioned this pull request May 19, 2023

optimizing Use expressions inside if condition #111644

Closed

dingxiangfei2009 force-pushed the lower-or-pattern branch from 3863c63 to 12d5d8f Compare May 19, 2023 10:11

cjgillot reviewed May 19, 2023

View reviewed changes

compiler/rustc_mir_build/src/build/matches/mod.rs Outdated Show resolved Hide resolved

dingxiangfei2009 force-pushed the lower-or-pattern branch from 12d5d8f to 0ab73c0 Compare May 25, 2023 21:52

This comment has been minimized.

Sign in to view

dingxiangfei2009 force-pushed the lower-or-pattern branch from 0ab73c0 to 48eefc9 Compare May 27, 2023 13:18

cjgillot reviewed May 28, 2023

View reviewed changes

dingxiangfei2009 force-pushed the lower-or-pattern branch from 48eefc9 to c73f139 Compare June 5, 2023 12:42

dingxiangfei2009 marked this pull request as draft June 5, 2023 20:10

dingxiangfei2009 force-pushed the lower-or-pattern branch 3 times, most recently from 48643ab to 268c712 Compare June 13, 2023 06:00

This comment has been minimized.

Sign in to view

dingxiangfei2009 force-pushed the lower-or-pattern branch from 268c712 to ba91dfb Compare June 13, 2023 21:02

dingxiangfei2009 marked this pull request as ready for review June 13, 2023 22:46

dingxiangfei2009 requested a review from cjgillot June 13, 2023 22:46

dingxiangfei2009 force-pushed the lower-or-pattern branch from ba91dfb to 9a2dc19 Compare June 16, 2023 07:04

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 1, 2023

bors merged commit f4555ef into rust-lang:master Sep 1, 2023
12 checks passed

rustbot added this to the 1.74.0 milestone Sep 1, 2023

Zalathar mentioned this pull request Sep 2, 2023

coverage: Explicitly test the coverage maps produced by codegen/LLVM #114843

Merged

Zalathar reviewed Sep 2, 2023

View reviewed changes

Zalathar mentioned this pull request Sep 2, 2023

Coverage instrumentation sometimes thinks the negation operator isn't executable #115468

Closed

dingxiangfei2009 deleted the lower-or-pattern branch September 3, 2023 14:34

rustbot added the perf-regression-triaged The performance regression has been triaged. label Sep 6, 2023

Zalathar mentioned this pull request Sep 23, 2023

Add support for branch coverage in source-based coverage #79649

Open

Zalathar mentioned this pull request Nov 23, 2023

coverage: Use SpanMarker to improve coverage spans for if ! expressions #118198

Merged

This was referenced Feb 27, 2024

Draft of branch coverage support #118305

Closed

Make the success arms of if lhs || rhs meet up in a separate block #121784

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lower `Or` pattern without allocating place #111752

Lower `Or` pattern without allocating place #111752

dingxiangfei2009 commented May 19, 2023 •

edited

Loading

rustbot commented May 19, 2023

This comment has been minimized.

azizghuloum commented May 19, 2023

dingxiangfei2009 commented May 19, 2023

cjgillot left a comment

cjgillot May 19, 2023

dingxiangfei2009 May 21, 2023

This comment has been minimized.

cjgillot May 27, 2023

dingxiangfei2009 May 31, 2023

dingxiangfei2009 commented Jun 5, 2023

This comment has been minimized.

dingxiangfei2009 commented Jun 13, 2023

bors commented Jun 14, 2023

bors commented Sep 1, 2023

bors commented Sep 1, 2023

Zalathar Sep 2, 2023

dingxiangfei2009 Sep 3, 2023

Zalathar Sep 4, 2023

rust-timer commented Sep 2, 2023

dingxiangfei2009 commented Sep 3, 2023 •

edited

Loading

nnethercote commented Sep 3, 2023

pnkfelix commented Sep 6, 2023

		@@ -35,7 +35,7 @@ use std::mem;
		impl<'a, 'tcx> Builder<'a, 'tcx> {
		pub(crate) fn then_else_break(

Lower Or pattern without allocating place #111752

Lower Or pattern without allocating place #111752

Conversation

dingxiangfei2009 commented May 19, 2023 • edited Loading

rustbot commented May 19, 2023

This comment has been minimized.

azizghuloum commented May 19, 2023

dingxiangfei2009 commented May 19, 2023

cjgillot left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

This comment has been minimized.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dingxiangfei2009 commented Jun 5, 2023

This comment has been minimized.

dingxiangfei2009 commented Jun 13, 2023

bors commented Jun 14, 2023

bors commented Sep 1, 2023

bors commented Sep 1, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rust-timer commented Sep 2, 2023

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

dingxiangfei2009 commented Sep 3, 2023 • edited Loading

nnethercote commented Sep 3, 2023

pnkfelix commented Sep 6, 2023

Lower `Or` pattern without allocating place #111752

Lower `Or` pattern without allocating place #111752

dingxiangfei2009 commented May 19, 2023 •

edited

Loading

dingxiangfei2009 commented Sep 3, 2023 •

edited

Loading