llvm-20: Nightly compile time regression building comrak with release profile #137909

Closed
parasyte opened this issue Mar 3, 2025 · 6 comments · Fixed by #138695
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-compiletime Issue: Problems and improvements with respect to compile times. llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly.

Comments

@parasyte

parasyte commented Mar 3, 2025

Building comrak with cargo build --release seems to never finish on x86_64 (both on Windows and Linux). It normally builds on the stable channel on my machine in approximately 26 seconds.

Bisect results:

searched nightlies: from nightly-2025-02-15 to nightly-2025-02-25
regressed nightly: nightly-2025-02-18
searched commit range: 5bc6231...ce36a96
regressed commit: ce36a96

bisected with cargo-bisect-rustc v0.6.9

Host triple: x86_64-pc-windows-msvc
Reproduce with:

cargo bisect-rustc --start=2025-02-15 --end=2025-02-25 --script "C:\\Program Files\\Git\\usr\\bin\\bash.exe" -- ./regress.sh

regress.sh:

#!/bin/bash

set -eux -o pipefail

timeout 60 cargo build --release

@rustbot modify labels: +regression-from-stable-to-nightly -regression-untriaged

@parasyte parasyte added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Mar 3, 2025
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. and removed regression-untriaged Untriaged performance or correctness regression. labels Mar 3, 2025
@tgross35 tgross35 added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Mar 3, 2025
@saethlin saethlin added the I-compiletime Issue: Problems and improvements with respect to compile times. label Mar 3, 2025
@saethlin
Member

saethlin commented Mar 3, 2025

perf top during a compile of the above looks like this:

Overhead  Shared Object                            Symbol
  44.24%  libLLVM.so.20.1-rust-1.87.0-nightly      [.] programUndefinedIfUndefOrPoison(llvm::Value const*, bool) [clone .llvm.5409155288196509361]
  13.71%  libLLVM.so.20.1-rust-1.87.0-nightly      [.] isGuaranteedNotToBeUndefOrPoison(llvm::Value const*, llvm::AssumptionCache*, llvm::Instruction const*, llvm::DominatorTree const*, unsigned int, UndefPoisonKind) [clone .llvm.5409155288196509361]
   0.78%  libLLVM.so.20.1-rust-1.87.0-nightly      [.] canCreateUndefOrPoison(llvm::Operator const*, UndefPoisonKind, bool) [clone .llvm.5409155288196509361]
   0.62%  libLLVM.so.20.1-rust-1.87.0-nightly      [.] llvm::Operator::hasPoisonGeneratingAnnotations() const
   0.45%  libLLVM.so.20.1-rust-1.87.0-nightly      [.] llvm::propagatesPoison(llvm::Use const&)
   0.20%  libLLVM.so.20.1-rust-1.87.0-nightly      [.] llvm::getKnowledgeForValue(llvm::Value const*, llvm::ArrayRef<llvm::Attribute::AttrKind>, llvm::AssumptionCache*, llvm::function_ref<bool (llvm::RetainedKnowledge, llvm::Instruction*, llvm::CallBase::B

This is not a hang; the build completes after 20 minutes (phew).

@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Mar 3, 2025
@moxian
Contributor

moxian commented Mar 3, 2025

This can be somewhat "minimized" to just the scanners::autolink_email function, which has no external dependencies and can be tested in isolation (when decorated with #[no_mangle] or similar).

It's a huge match statement, and I would guess codegen takes polynomial time somehow?
Limiting the match to the first 70 cases (0-69) makes the function compile in 85 seconds on my machine (~instant on stable).
Limiting it to the first 60 cases (0-59) drops the time to 40 seconds.
Limiting it to the first 50 cases (0-49) drops it further to 16 seconds.

Keeping all 128 match arms but deleting either arm 2 (2=>{return None;}) or arm 11 (11=>{return Some(cursor);}) makes it compile near-instantly, which is probably not surprising since those are the only two return paths.
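As a rough check on the "polynomial time" guess, the three timings above can be turned into a local growth exponent. This is only a sketch: it assumes compile time behaves like c * n^k between neighbouring samples, and the helper names `growth_exponent` and `exponents` are made up for illustration. Three data points from one machine cannot establish the real complexity.

```rust
/// Local growth exponent between two (size, seconds) samples,
/// assuming time ~ c * size^k, so k = ln(t2 / t1) / ln(n2 / n1).
fn growth_exponent(a: (f64, f64), b: (f64, f64)) -> f64 {
    (b.1 / a.1).ln() / (b.0 / a.0).ln()
}

/// Exponents for consecutive pairs of the timings reported in this comment.
fn exponents() -> Vec<f64> {
    // (number of match arms kept, compile time in seconds)
    let samples = [(50.0, 16.0), (60.0, 40.0), (70.0, 85.0)];
    samples
        .windows(2)
        .map(|w| growth_exponent(w[0], w[1]))
        .collect()
}
```

Under that assumption both pairs give an exponent close to 5 (about 5.0 for 50 to 60 arms and about 4.9 for 60 to 70 arms), which fits a steep polynomial better than linear or quadratic growth.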


Minimized further:

#[no_mangle]
pub fn autolink_email(s: &[u8]) -> Option<usize> {
    let mut cursor = 0;
    let mut marker = 0;
    let len = s.len();

    let mut yych: u8 = 0;
    let mut yystate: usize = 0;
    'yyl: loop {
        match yystate {
            0 => {
                return None;
            }
            1 => {
                // changing this to `return cursor`, the above to `return 1234` and the return type to `->usize`
                // makes the code compile fast again
                return Some(cursor);
            }
            2 => match yych {
                _ => {
                    yystate = 6;
                    continue 'yyl;
                }
            },
            3 => {
                marker = cursor;
                continue 'yyl;
            }
            4 => {
                cursor = marker;
                yystate = 2;
                continue 'yyl;
            }


            // add some extra copies of the block if your machine is too fast to notice the slowdown
            10 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            11 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            12 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            13 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            14 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            15 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            16 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            17 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            18 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            19 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            20 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            21 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            22 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            23 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            24 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            25 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            26 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            27 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            28 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            29 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            30 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            31 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            32 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            33 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            34 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            35 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            36 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            37 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            38 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }
            39 => {
                yych = unsafe { if cursor < len { *s.get_unchecked(cursor) } else { 0 } };
                continue 'yyl;
            }


            // regular `panic!()` also works, but a `return` doesn't
            _ => unsafe{ core::hint::unreachable_unchecked() },
        }
    }
}

fn main() {}
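Since the reproducer asks for extra copies of the load arm on faster machines, a small generator can produce the file with any number of them. This is a sketch only: `generate_repro`, `HEADER` and `FOOTER` are hypothetical names, and the fixed arms 0-4 are condensed onto single lines, which should not change their meaning.

```rust
/// Fixed prologue of the reproducer, with arms 0-4 condensed.
const HEADER: &str = r#"#[no_mangle]
pub fn autolink_email(s: &[u8]) -> Option<usize> {
    let mut cursor = 0;
    let mut marker = 0;
    let len = s.len();

    let mut yych: u8 = 0;
    let mut yystate: usize = 0;
    'yyl: loop {
        match yystate {
            0 => { return None; }
            1 => { return Some(cursor); }
            2 => match yych {
                _ => { yystate = 6; continue 'yyl; }
            },
            3 => { marker = cursor; continue 'yyl; }
            4 => { cursor = marker; yystate = 2; continue 'yyl; }
"#;

/// Fixed epilogue: the unreachable default arm and an empty `main`.
const FOOTER: &str = r#"            _ => unsafe { core::hint::unreachable_unchecked() },
        }
    }
}

fn main() {}
"#;

/// Builds the reproducer source with `extra_arms` copies of the load arm,
/// numbered from state 10 upwards as in the original.
fn generate_repro(extra_arms: usize) -> String {
    let mut src = String::from(HEADER);
    for state in 10..10 + extra_arms {
        src.push_str(&format!(
            "            {state} => {{\n                yych = unsafe {{ if cursor < len {{ *s.get_unchecked(cursor) }} else {{ 0 }} }};\n                continue 'yyl;\n            }}\n"
        ));
    }
    src.push_str(FOOTER);
    src
}
```

Calling `generate_repro(30)` yields the same 30 load arms (states 10 to 39) as the version above; writing the result to src/main.rs with a larger count should lengthen the slow compile on an affected nightly.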

LLVM pass timings:

> rustc +nightly -Copt-level=0 src/main.rs --emit=mir,llvm-ir -C strip=debuginfo  -C extra-filename=-slow-0
> clang.exe .\main-slow-0.ll -O3 -ftime-report
warning: overriding the module target triple with x86_64-pc-windows-msvc19.42.34435 [-Woverride-module]
===-------------------------------------------------------------------------===
                          Pass execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 14.3594 seconds (14.3573 wall clock)

   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   7.2344 ( 50.4%)   7.2344 ( 50.4%)   7.2246 ( 50.3%)  EarlyCSEPass
   7.1250 ( 49.6%)   7.1250 ( 49.6%)   7.1274 ( 49.6%)  SROAPass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0012 (  0.0%)  SimplifyCFGPass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0007 (  0.0%)  IPSCCPPass
<.. nothing of interest in the rest of the output ..>

@apiraino
Contributor

apiraino commented Mar 3, 2025

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-high

@rustbot rustbot added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Mar 3, 2025
@dianqk dianqk self-assigned this Mar 3, 2025
@moxian
Contributor

moxian commented Mar 4, 2025

llvm bisects to llvm/llvm-project#96631 (with just opt main-slow-1.ll -S -O1 as the reproducer).
I wouldn't think this is the direct cause of the slowdown; it looks more like it accidentally uncovered other codegen deficiencies (although I'm rather clueless in matters of LLVM).
But still, cc @nikic just in case.

@dianqk
Member

dianqk commented Mar 4, 2025

> llvm bisects to llvm/llvm-project#96631 (with just opt main-slow-1.ll -S -O1 as the reproducer).

The bisection is correct this time. Basically, the slowdown comes from the cost of analyzing phi nodes where most incoming values are the phi itself, similar to %i1 = phi i64 [0, %bb], [%i1, %bb1], [%i1, %bb2], ... [%i1, %bbn], [%arg, %bbn+1].

The patch for the fix is roughly:

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index e3e026f7979d..9b098bcb28bc 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -7824,6 +7824,8 @@ static bool isGuaranteedNotToBeUndefOrPoison(
       unsigned Num = PN->getNumIncomingValues();
       bool IsWellDefined = true;
       for (unsigned i = 0; i < Num; ++i) {
+        if (PN == PN->getIncomingValue(i))
+          continue;
         auto *TI = PN->getIncomingBlock(i)->getTerminator();
         if (!isGuaranteedNotToBeUndefOrPoison(PN->getIncomingValue(i), AC, TI,
                                               DT, Depth + 1, Kind)) {
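To see why skipping self-referential incoming values can matter, here is a toy model that counts how often a depth-capped recursive query re-enters one phi node. It is not LLVM's implementation: it ignores early exits and every other check in the real function, and `visits`, `skip_self` and `max_depth` are hypothetical names chosen for this sketch.

```rust
/// Counts visits of a single phi node in a toy depth-capped recursion.
///
/// `self_edges`:  incoming values that are the phi itself.
/// `other_edges`: remaining incoming values (constants, arguments),
///                each modelled as one cheap leaf check.
/// `max_depth`:   assumed recursion cap of the toy model.
/// `skip_self`:   models the guard from the patch (skip `PN == incoming`).
fn visits(self_edges: u64, other_edges: u64, depth: u32, max_depth: u32, skip_self: bool) -> u64 {
    // Every call counts as one visit; at the cap the query gives up.
    if depth >= max_depth {
        return 1;
    }
    let mut total = 1 + other_edges;
    if !skip_self {
        // Without the guard, each self-edge re-enters the same phi one level deeper.
        total += self_edges * visits(self_edges, other_edges, depth + 1, max_depth, skip_self);
    }
    total
}
```

In this model, `visits(3, 2, 0, 2, false)` counts 21 visits while `visits(3, 2, 0, 2, true)` counts 3. Without the guard the count grows roughly like `self_edges` raised to the power `max_depth`; with it the count stays at `1 + other_edges`.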

@dianqk
Member

dianqk commented Mar 6, 2025

Upstream issue: llvm/llvm-project#130110.

@dianqk dianqk added the llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes label Mar 12, 2025
@bors bors closed this as completed in 87e60a7 Mar 20, 2025
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this issue Mar 24, 2025