Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3-way comparison is branchier after 1.71 #125338

Open
mqudsi opened this issue May 20, 2024 · 6 comments
Open

3-way comparison is branchier after 1.71 #125338

mqudsi opened this issue May 20, 2024 · 6 comments
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mqudsi
Copy link
Contributor

mqudsi commented May 20, 2024

Code

The following code contains two impls of the same (or what should be the same) logic. Prior to 1.71.0, the first impl (using Option::map() instead of match) would generate llvm ir that could be optimized to use cmov instructions, but the second impl (using match instead) would use jumps instead.

1.71.0+ generate the same (branchy) llvm ir for both implementations.
The regression goes away if compiling with -Zinline-mir=no, but the match implementation remains branchy.

#[derive(PartialEq)]
pub enum WSL {
    Any,
    V1,
    V2,
}

#[inline(never)]
pub fn is_wsl(wsl: Option<WSL>, v: WSL) -> bool {
    wsl.map(|wsl| v == WSL::Any || wsl == v).unwrap_or(false)
}

#[inline(never)]
pub fn is_wsl2(wsl: Option<WSL>, v: WSL) -> bool {
    match (wsl, v) {
        (Some(_), WSL::Any) => true,
        (Some(x), y) => x == y,
        _ => false
    }
}

Godbolt link

Optimized IR:

define noundef zeroext i1 @_ZN7example6is_wsl17h6965abaffb382d78E(i8 noundef %wsl, i8 noundef %0) unnamed_addr #0 {
start:
  %1 = icmp ne i8 %wsl, 3
  %_3.i.i = icmp eq i8 %0, 0
  %_4.i.i = icmp eq i8 %0, %wsl
  %.0.i.i = or i1 %_3.i.i, %_4.i.i
  %.0 = and i1 %1, %.0.i.i
  ret i1 %.0
}

compiling to following x86_64:

example::is_wsl::h6965abaffb382d78:
        cmp     dil, 3
        setne   cl
        test    sil, sil
        sete    dl
        cmp     sil, dil
        sete    al
        or      al, dl
        and     al, cl
        ret

vs unoptimized ir and asm:

define noundef zeroext i1 @_ZN7example7is_wsl217h164d1336a42bfb8dE(i8 noundef %wsl, i8 noundef %v) unnamed_addr #0 {
start:
  %.not = icmp eq i8 %wsl, 3
  br i1 %.not, label %bb5, label %bb2

bb2:                                              ; preds = %start
  %0 = icmp eq i8 %v, 0
  %1 = icmp eq i8 %wsl, %v
  %spec.select = or i1 %0, %1
  br label %bb5

bb5:                                              ; preds = %bb2, %start
  %.0 = phi i1 [ false, %start ], [ %spec.select, %bb2 ]
  ret i1 %.0
}
example::is_wsl2::h164d1336a42bfb8d:
        cmp     dil, 3
        jne     .LBB1_2
        xor     eax, eax
        ret
.LBB1_2:
        test    sil, sil
        sete    cl
        cmp     dil, sil
        sete    al
        or      al, cl
        ret

Version it worked on

It most recently worked on: 1.70.0

Version with regression

rustc --version --verbose:

1.71.0-stable

@rustbot modify labels: +regression-from-stable-to-stable -regression-untriaged +A-codegen +C-bug +I-slow +T-compiler

cc #91743

@mqudsi mqudsi added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels May 20, 2024
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. A-codegen Area: Code generation I-slow Issue: Problems and improvements with respect to performance of generated code. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed regression-untriaged Untriaged performance or correctness regression. labels May 20, 2024
@jieyouxu jieyouxu added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label May 20, 2024
@DianQK
Copy link
Member

DianQK commented May 21, 2024

Alive2: https://alive2.llvm.org/ce/z/VqNWdC

Related code: https://github.com/llvm/llvm-project/blob/3fa6b3bbdb0f9de18def8596d1f7fcec2ef77b5e/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L2992-L3012

There are many limitations to hoisting branch instructions. I am not sure if it is better to change these limitations.

@mqudsi
Copy link
Contributor Author

mqudsi commented May 21, 2024

It's actually a more fundamental issue than the case in the original post; even stripping down the code to just a binary enum still puts a branch in there. Godbolt

#[derive(PartialEq)]
pub enum WSL {
    Any,
    V1,
}

#[inline(never)]
pub fn is_wsl(wsl: Option<WSL>, v: WSL) -> bool {
    wsl.map(|wsl| wsl == v).unwrap_or(false)
}
example::is_wsl::h2f539bf0253b949c:
        cmp     dil, 2
        jne     .LBB0_2
        xor     eax, eax
        ret
.LBB0_2:
        test    dil, 1
        sete    al
        xor     al, sil
        ret

I'm going to leave the repro in the first post unmodified as the ideal fix would apply to that more difficult case as well, as this is just a contrived example.

@apiraino
Copy link
Contributor

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium

@rustbot rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels May 21, 2024
@mqudsi
Copy link
Contributor Author

mqudsi commented May 22, 2024

While this was tagged as being fallout from #91743, it's a much later regression. bisecting the nightlies gives me these possible commits as the source of the regression:

I'm guessing it's #110705

@saethlin
Copy link
Member

While this was tagged as being fallout from #91743, it's a much later regression. bisecting the nightlies gives me these possible commits as the source of the regression:

You can bisect MIR inliner issues to any number of MIR inliner heuristic tweaks depending on the details of your reproducer. The problem here is an interaction with LLVM and MIR inlining in general, and blaming any particular heuristic tweak is a mistake.

@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label May 22, 2024
@mqudsi
Copy link
Contributor Author

mqudsi commented May 22, 2024

Thanks for the explanation, @saethlin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants