[mir] or-patterns duplicate the same branch #75439

tesuji · 2020-08-12T04:27:13Z

#![feature(const_fn_transmute)]
#![feature(or_patterns)]

use std::mem::transmute;

pub const fn foo3(bytes: [u8; 16]) -> Option<[u8; 4]> {
    // big endian `u32`s
    let dwords: [u32; 4] = unsafe { transmute(bytes) };
    const FF: u32 = 0x0000_ffff_u32.to_be();
    if let [0, 0, 0 | FF, ip] = dwords {
        Some(unsafe { transmute(ip) })
    } else {
        None
    }
}

The or-pattern 0 | FF has this MIR:

    bb3: {
        switchInt(_2[2 of 4]) -> [0_u32: bb6, 4294901760_u32: bb7, otherwise: bb4]; // scope 1 at src/lib.rs:10:19: 10:20
    }

While bb6 and bb7 points to the same branch:

    bb6: {
        StorageLive(_4);                 // scope 1 at src/lib.rs:10:27: 10:29
        _4 = _2[3 of 4];                 // scope 1 at src/lib.rs:10:27: 10:29
        goto -> bb5;                     // scope 1 at src/lib.rs:10:5: 14:6
    }

    bb7: {
        StorageLive(_4);                 // scope 1 at src/lib.rs:10:27: 10:29
        _4 = _2[3 of 4];                 // scope 1 at src/lib.rs:10:27: 10:29
        goto -> bb5;                     // scope 1 at src/lib.rs:10:5: 14:6
    }

I would expect the or-pattern points the same branch.

The text was updated successfully, but these errors were encountered:

JulianKnodt · 2020-08-15T23:37:23Z

Was this resolved by #75537? I believe that this would actually match on it, just because all the statements are the same

tesuji · 2020-08-16T01:19:58Z

I've just checked. The MIR isn't improved with #75537 .

JulianKnodt · 2020-08-16T03:56:48Z

Ah woops yea I can't count, this has 3 branches not 2, which is the only permitted number of branches being optimized currently.

@lzutao

Add test for checking duplicated branch or-patterns This adds a regression test for checking `or-patterns` in MIR as shown in rust-lang#75439. This doesn't introduce a fix as I'm not sure where it would go(I suspect maybe here: src/librustc_mir_build/build/matches/mod.rs), and I'm not particularly able to fix it. cc: @lzutao

@lzutao

Add test for checking duplicated branch or-patterns This adds a regression test for checking `or-patterns` in MIR as shown in rust-lang#75439. This doesn't introduce a fix as I'm not sure where it would go(I suspect maybe here: src/librustc_mir_build/build/matches/mod.rs), and I'm not particularly able to fix it. cc: @lzutao

@lzutao

Add test for checking duplicated branch or-patterns This adds a regression test for checking `or-patterns` in MIR as shown in rust-lang#75439. This doesn't introduce a fix as I'm not sure where it would go(I suspect maybe here: src/librustc_mir_build/build/matches/mod.rs), and I'm not particularly able to fix it. cc: @lzutao

@lzutao

Add test for checking duplicated branch or-patterns This adds a regression test for checking `or-patterns` in MIR as shown in rust-lang#75439. This doesn't introduce a fix as I'm not sure where it would go(I suspect maybe here: src/librustc_mir_build/build/matches/mod.rs), and I'm not particularly able to fix it. cc: @lzutao

@lzutao

Add test for checking duplicated branch or-patterns This adds a regression test for checking `or-patterns` in MIR as shown in rust-lang#75439. This doesn't introduce a fix as I'm not sure where it would go(I suspect maybe here: src/librustc_mir_build/build/matches/mod.rs), and I'm not particularly able to fix it. cc: @lzutao

@lzutao

Add test for checking duplicated branch or-patterns This adds a regression test for checking `or-patterns` in MIR as shown in rust-lang#75439. This doesn't introduce a fix as I'm not sure where it would go(I suspect maybe here: src/librustc_mir_build/build/matches/mod.rs), and I'm not particularly able to fix it. cc: @lzutao

mark-i-m · 2020-11-07T21:45:18Z

This might be a suitable way to solve the same problem, but I'm not really sure:

rust/compiler/rustc_mir_build/src/build/matches/mod.rs

Line 1233 in b2d115f

fn merge_trivial_subcandidates(

i.e. combine them when building the MIR in the first place.

osa1 · 2021-01-24T11:33:30Z

I was able to repro this today (85e355e).

osa1 · 2021-02-14T09:53:52Z

@rustbot claim

I'm not sure if I'll be able to fix this, but I spent some time this weekend studying lowering to MIR and tried to understand the issue here.

First of all, the inefficiency here exists even in the final LLVM (with -O), so I think this may worth fixing. LLVM for the original repro:

issue75439 $ rustc +stage1_2 test.rs --crate-type=rlib --emit=llvm-ir -O
issue75439 $ cat test.ll
; ModuleID = 'test.3a1fbbbh-cgu.0'
source_filename = "test.3a1fbbbh-cgu.0"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; test::foo3
; Function Attrs: norecurse nounwind nonlazybind readnone uwtable
define i40 @_ZN4test4foo317h93f9bb6eba5df219E(i128 %0) unnamed_addr #0 {
start:
  %1 = and i128 %0, 18446744073709551615
  %2 = icmp eq i128 %1, 0
  br i1 %2, label %bb3, label %bb9

bb3:                                              ; preds = %start
  %_3.sroa.5.0.extract.shift = lshr i128 %0, 64
  %_3.sroa.5.0.extract.trunc = trunc i128 %_3.sroa.5.0.extract.shift to i32
  switch i32 %_3.sroa.5.0.extract.trunc, label %bb9 [
    i32 0, label %bb6
    i32 -65536, label %bb7
  ]

bb6:                                              ; preds = %bb3
  br label %bb9

bb7:                                              ; preds = %bb3
  br label %bb9

bb9:                                              ; preds = %bb7, %bb6, %start, %bb3
  %.sroa.0.0 = phi i40 [ 0, %bb3 ], [ 0, %start ], [ 1, %bb6 ], [ 1, %bb7 ]
  %3 = lshr i128 %0, 88
  %4 = trunc i128 %3 to i40
  %.sroa.3.0.insert.shift = and i40 %4, -256
  %.sroa.0.0.insert.insert = or i40 %.sroa.0.0, %.sroa.3.0.insert.shift
  ret i40 %.sroa.0.0.insert.insert
}

attributes #0 = { norecurse nounwind nonlazybind readnone uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }

!llvm.module.flags = !{!0, !1}

!0 = !{i32 7, !"PIC Level", i32 2}
!1 = !{i32 2, !"RtLibUseGOT", i32 1}

bb6 and bb7 are the targets of the or pattern and they're duplicates. I'm not sure why LLVM can't eliminate those.

Now, I looked at why these duplicate blocks are generated, in MIR. I believe the reason is the same as the reason why we generate duplicate blocks in this simpler program:

pub fn test(dwords: [u32; 2]) {
    match dwords {
        [0 | 1, _a] => (),
        _ => (),
    }
}

LLVM is able to optimize the duplicate blocks in this program, but in MIR we still have duplication very similar to the original repro:

fn test(_1: [u32; 2]) -> () {
    debug dwords => _1;
    let mut _0: ();
    let _2: u32;
    scope 1 {
        debug _a => _2;
    }

    bb0: {
        switchInt(_1[0 of 2]) -> [0_u32: bb2, 1_u32: bb3, otherwise: bb4];
    }

    bb1: {
        StorageDead(_2);
        goto -> bb4;
    }

    bb2: {
        StorageLive(_2);
        _2 = _1[1 of 2];
        goto -> bb1;
    }

    bb3: {
        StorageLive(_2);
        _2 = _1[1 of 2];
        goto -> bb1;
    }

    bb4: {
        return;
    }
}

Namely bb2 and bb3.

I think if we could generate better MIR for this simpler program that would also fix the problem with the original repro.

Here's the original MIR for this program (before any passes):

fn test(_1: [u32; 2]) -> () {
    debug dwords => _1;
    let mut _0: ();
    let _2: u32;
    scope 1 {
        debug _a => _2;
    }

    bb0: {
        FakeRead(ForMatchedPlace, _1);
        switchInt(_1[0 of 2]) -> [0_u32: bb1, 1_u32: bb2, otherwise: bb3];
    }

    bb1: {
        falseEdge -> [real: bb5, imaginary: bb2];
    }

    bb2: {
        falseEdge -> [real: bb6, imaginary: bb3];
    }

    bb3: {
        _0 = ();
        goto -> bb7;
    }

    bb4: {
        _0 = ();
        StorageDead(_2);
        goto -> bb7;
    }

    bb5: {
        StorageLive(_2);
        _2 = _1[1 of 2];
        goto -> bb4;
    }

    bb6: {
        StorageLive(_2);
        _2 = _1[1 of 2];
        goto -> bb4;
    }

    bb7: {
        return;
    }
}

The reason for this duplication is: originally for this program, in the first arm, we have a candidate with two "match pairs": the or pattern, and the binding (_a). We simplify this (simplify_candidate) to a candidate with a binding (_a) and two subcandidates, for the branches of the or pattern.

When generating MIR for a subcandidate, we first generate a block just to be able to chain blocks of arms for borrow checking (using "false eges"). This is why we create bb5 when generating bb1, and bb6 when generating bb2.

Then in those new blocks (bb5 and bb6) we introduce the bindings of the parent candidate, with some extra stuff for ascriptions and guards. This generates the duplicate code

        StorageLive(_2);
        _2 = _1[1 of 2];
        goto -> ...;

for all alternatives of the or pattern.

(The intermediate blocks bb1 and bb2 are removed after SimplifyBranches and the next SimplifyCfg)

I think it's possible to generate better code here when the subcandidates (transitively) don't have any bindings, ascriptions, and guards (I don't understand the guards in details yet so I don't know if there are room for optimization also). In those cases, before generating MIR for the subcandidates, we create a "target" block where we will generate the binding code. Then create branches for each subcandidate as we do today (with the false edges), but these subcandidates jump to the "target" block instead of to fresh blocks. The target block will then have the code for the binding in the parent candidate.

In other words we generate one block for the parent candidate's bindings instead of introducing the bindings it in each subcandidate.

I actually have an implementation of this, but it causes linker errors when compiling libraries. I'll submit a PR once it's working.

One problem is the idea is the check "do the subcandidates have any bindings, ascriptions, or guards" requires traversing the entire candidate tree. I don't know if we can implement this more efficiently somehow.

The MIR (before passes) with this implementation would look like:

fn test(_1: [u32; 2]) -> () {
    debug dwords => _1;
    let mut _0: ();
    let _2: u32;
    scope 1 {
        debug _a => _2;
    }

    bb0: {
        FakeRead(ForMatchedPlace, _1);
        switchInt(_1[0 of 2]) -> [0_u32: bb1, 1_u32: bb2, otherwise: bb3];
    }

    bb1: {
        falseEdge -> [real: bb5, imaginary: bb2];
    }

    bb2: {
        falseEdge -> [real: bb5, imaginary: bb3];
    }

    bb3: {
        _0 = ();
        goto -> bb6;
    }

    bb4: {
        _0 = ();
        StorageDead(_2);
        goto -> bb6;
    }

    bb5: {
        StorageLive(_2);
        _2 = _1[1 of 2];
        goto -> bb4;
    }

    bb6: {
        return;
    }
}

Because we no longer have duplicate blocks for or pattern alternatives the issue is fixed.

mark-i-m · 2021-02-14T14:58:27Z

It looks like you already figured out most of this, but here some notes I wrote from the last time I went through some of that code: rust-lang/rustc-dev-guide#943

osa1 · 2021-02-17T08:14:49Z

Thanks @mark-i-m, that's very helpful.

Update: I figured this out and have a working prototype. On the way I was also able to come up with examples where the duplication is much larger then the original repro or my simpler repro above, which my code fixes. I'm hoping to submit a PR this weekend.

osa1 · 2021-02-20T07:29:11Z

Update: my patch works mostly fine. There are two failures in the test suite, both are caused by disappeared StorageDead instructions after merging the duplicate blocks. I don't make any changes related to drop scheduling so I'm not sure what the reason is. I asked a question about this on Zulip. Once that's fixed it will be passing the test suite and I'll submit a PR.

In the meantime, if anyone knows about the question I linked above, that would be very helpful.

Nadrieril · 2024-02-07T00:01:50Z

One problem is the idea is the check "do the subcandidates have any bindings, ascriptions, or guards" requires traversing the entire candidate tree.

Isn't that already what we do in merge_trivial_subcandidates though? My surface understanding is that merge_trivial_subcandidates should solve exactly this, but clearly it doesn't :/

JulianKnodt · 2024-02-07T19:53:34Z

since this is reviving a ~~long dead~~ old post, you probably want to double check this behavior is still happening, and close this issue if not.

Even if it's not still happening, it might be good to close the issue and start a new one so it's more easily tracked (and doesn't ping people who aren't working on it actively anymore)

Nadrieril · 2024-02-07T21:09:47Z

I did check, it's still duplicated

This comment has been minimized.

Sign in to view

rustbot added A-mir Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-patterns Relating to patterns and pattern matching F-or_patterns `#![feature(or_patterns)]` labels Aug 12, 2020

jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 12, 2020

JulianKnodt mentioned this issue Aug 16, 2020

Add test for checking duplicated branch or-patterns #75580

Merged

wesleywiser added the A-mir-opt Area: MIR optimizations label Aug 18, 2020

rustbot assigned osa1 Feb 14, 2021

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

osa1 removed their assignment Feb 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[mir] or-patterns duplicate the same branch #75439

[mir] or-patterns duplicate the same branch #75439

tesuji commented Aug 12, 2020 •

edited by rustbot

This comment has been minimized.

JulianKnodt commented Aug 15, 2020

tesuji commented Aug 16, 2020

JulianKnodt commented Aug 16, 2020

mark-i-m commented Nov 7, 2020 •

edited

osa1 commented Jan 24, 2021

osa1 commented Feb 14, 2021

mark-i-m commented Feb 14, 2021

osa1 commented Feb 17, 2021

osa1 commented Feb 20, 2021

Nadrieril commented Feb 7, 2024

JulianKnodt commented Feb 7, 2024 •

edited

Nadrieril commented Feb 7, 2024

[mir] or-patterns duplicate the same branch #75439

[mir] or-patterns duplicate the same branch #75439

Comments

tesuji commented Aug 12, 2020 • edited by rustbot

This comment has been minimized.

JulianKnodt commented Aug 15, 2020

tesuji commented Aug 16, 2020

JulianKnodt commented Aug 16, 2020

mark-i-m commented Nov 7, 2020 • edited

osa1 commented Jan 24, 2021

osa1 commented Feb 14, 2021

mark-i-m commented Feb 14, 2021

osa1 commented Feb 17, 2021

osa1 commented Feb 20, 2021

Nadrieril commented Feb 7, 2024

JulianKnodt commented Feb 7, 2024 • edited

Nadrieril commented Feb 7, 2024

tesuji commented Aug 12, 2020 •

edited by rustbot

mark-i-m commented Nov 7, 2020 •

edited

JulianKnodt commented Feb 7, 2024 •

edited