Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM unrolls loops fully, leading to non-linear compilation time #74384

Open
io12 opened this issue Jul 16, 2020 · 15 comments
Open

LLVM unrolls loops fully, leading to non-linear compilation time #74384

io12 opened this issue Jul 16, 2020 · 15 comments
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-hang Issue: The compiler never terminates, due to infinite loops, deadlock, livelock, etc. ICEBreaker-LLVM Bugs identified for the LLVM ICE-breaker group P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@io12
Copy link

io12 commented Jul 16, 2020

I tried this code:

#[derive(Copy, Clone)]
pub enum Foo {
    A,
    B(u8),
}

pub fn foo() -> Box<[[[Foo; 50]; 50]; 50]> {
    Box::new([[[Foo::A; 50]; 50]; 50])
}

I expected to see this happen:

cargo build --release

(Above command eventually should terminate)

Instead, this happened: Compiler doesn't terminate.

Meta

rustc --version --verbose:

rustc 1.44.1 (c7087fe00 2020-06-17)
rustc 1.46.0-nightly (346aec9b0 2020-07-11)

A crate with the repro can be found here: https://github.com/io12/llvm-rustc-bug-repro.

It seems like this is an LLVM bug.

@io12 io12 added the C-bug Category: This is a bug. label Jul 16, 2020
@estebank estebank added I-nominated I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 16, 2020
@JohnTitor JohnTitor added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-hang Issue: The compiler never terminates, due to infinite loops, deadlock, livelock, etc. and removed I-slow Issue: Problems and improvements with respect to performance of generated code. labels Jul 16, 2020
@JohnTitor
Copy link
Member

It seems Box is unnecessary, slightly minimized:

#[derive(Copy, Clone)]
pub enum Foo {
    A,
    B(u8),
}

pub fn foo() -> [[[Foo; 50]; 50]; 50] {
    [[[Foo::A; 50]; 50]; 50]
}

When its nesting is a double, it will be terminated in seconds. But when it's a triple, it hangs. This doesn't seem to be technically infinite, but I would label it as I-hang. Correct me if it's wrong label.

@spastorino
Copy link
Member

@estebank I think this is worth a nomination as you have done, but I'm interested in hearing from you what's the reason for the nomination. What's exactly what you wanted to discuss during the meeting about this issue?.

@spastorino spastorino added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jul 16, 2020
@estebank
Copy link
Contributor

@spastorino wanted to flag it for prioritizing and assignment.

@spastorino
Copy link
Member

spastorino commented Jul 22, 2020

Assigning P-critical as discussed as part of the Prioritization Working Group procedure and removing I-prioritize.

We can remove I-nominated because it was requested to be assigned, but given that this is P-critical it will be tracked anyway.

@spastorino spastorino added P-critical Critical priority and removed I-nominated I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 22, 2020
@spastorino
Copy link
Member

spastorino commented Jul 22, 2020

This was reconsidered, per @Mark-Simulacrum's comments and we consider now P-medium.
So also re-adding I-nominated.

Anyway, given that we checked this issue slightly during our last meeting, I wanted also to check with @pnkfelix if we want this nomination up or not given that we hadn't assigned the issue yet.

@spastorino spastorino added P-medium Medium priority I-nominated and removed P-critical Critical priority labels Jul 22, 2020
@pnkfelix
Copy link
Member

@rustbot ping llvm

@rustbot rustbot added the ICEBreaker-LLVM Bugs identified for the LLVM ICE-breaker group label Jul 31, 2020
@rustbot
Copy link
Collaborator

rustbot commented Jul 31, 2020

Hey LLVM ICE-breakers! This bug has been identified as a good
"LLVM ICE-breaking candidate". In case it's useful, here are some
instructions for tackling these sorts of bugs. Maybe take a look?
Thanks! <3

cc @camelid @comex @cuviper @DutchGhost @hanna-kruppe @hdhoang @heyrutvik @JOE1994 @jryans @mmilenko @nagisa @nikic @Noah-Kennedy @SiavoshZarrasvand @spastorino @vertexclique @vgxbj

@pnkfelix
Copy link
Member

self-assigning to 1. verify this is an LLVM bug and if so, 2. isolate .bc file that we can use to file bug against LLVM itself.

@pnkfelix pnkfelix self-assigned this Jul 31, 2020
@ChrisJefferson
Copy link
Contributor

This seems to work fine in rustc 1.54.0 (a178d03 2021-07-26)

@estebank estebank added the E-needs-test Call for participation: Writing correctness tests. label Aug 18, 2021
@io12
Copy link
Author

io12 commented Aug 18, 2021

I can still reproduce the bug with rustc 1.54.0 (a178d03 2021-07-26).

@JOE1994
Copy link
Contributor

JOE1994 commented Sep 25, 2021

Compiling with opt-level=2 terminates quickly.

$ rustc --crate-type lib -C opt-level=2 lib.rs

Compiling with opt-level=3 hangs.

$ rustc --crate-type lib -C opt-level=3 lib.rs

Generating just the llvm-ir with opt-level=3 terminates quickly.

$ rustc --crate-type lib -C opt-level=3 --emit llvm-ir lib.rs

$ rustc --version --verbose

rustc 1.55.0 (c8dfcfe04 2021-09-06)
binary: rustc
commit-hash: c8dfcfe046a7680554bf4eb612bad840e7631c4b
commit-date: 2021-09-06
host: x86_64-pc-windows-msvc
release: 1.55.0
LLVM version: 12.0.1

@JOE1994
Copy link
Contributor

JOE1994 commented Sep 25, 2021

When filing a bug to upstream llvm,
I think we can provide two llvm-ir (or llvm-bc) to compare:

  • llvm-ir from opt-level=3 : Compilation hangs while lowering this IR code to machine code
  • llvm-ir from opt-level=2 : Full compilation terminates quickly.

@pnkfelix Would you mind if I go ahead and file a bug to upstream llvm?

@JOE1994
Copy link
Contributor

JOE1994 commented Oct 9, 2021

I did more investigation to better understand where the compiler is taking so much time.
I locally built rustc 1.55.0 along with LLVM version 12.0.1, and used gdb to inspect the call stack.

$ gdb --args rustc +stage1 --crate-type lib -C opt-level=3 lib.rs

I kept running & pausing rustc on gdb for >30 minutes (compiler didn't terminate in the meantime).
Call stack always looked the same from #7 llvm::InstructionCombiningPass::runOnFunction(llvm::Function&)
to the bottom (base). Below is one snapshot of the call stack printed from gdb.

#0  0x00007ffff2427143 in llvm::ValueHandleBase::AddToExistingUseList(llvm::ValueHandleBase**) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#1  0x00007ffff19be9a9 in llvm::WeakTrackingVH& llvm::SmallVectorImpl<llvm::WeakTrackingVH>::emplace_back<llvm::Instruction*&>(llvm::Instruction*&) ()
   from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#2  0x00007ffff1b18cc7 in llvm::InstCombinerImpl::visitAllocSite(llvm::Instruction&) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#3  0x00007ffff1b564b4 in llvm::InstCombinerImpl::visitCallBase(llvm::CallBase&) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#4  0x00007ffff1b5770b in llvm::InstCombinerImpl::visitCallInst(llvm::CallInst&) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#5  0x00007ffff1b1ec4f in llvm::InstCombinerImpl::run() () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#6  0x00007ffff1b20cd1 in combineInstructionsOverFunction(llvm::Function&, llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::ProfileSummaryInfo*, unsigned int, llvm::LoopInfo*) ()
   from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#7  0x00007ffff1b22df4 in llvm::InstructionCombiningPass::runOnFunction(llvm::Function&) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#8  0x00007ffff23cd2d7 in llvm::FPPassManager::runOnFunction(llvm::Function&) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#9  0x00007ffff1e6bf31 in (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#10 0x00007ffff23cc64f in llvm::legacy::PassManagerImpl::run(llvm::Module&) () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#11 0x00007ffff231c419 in LLVMRunPassManager () from /home/lonelyjoe/.rustup/toolchains/stage1/lib/librustc_driver-c1ce3dd11691b812.so
#12 0x00007fffefc92d04 in rustc_codegen_llvm::back::lto::run_pass_manager () at compiler/rustc_codegen_llvm/src/back/lto.rs:634
#13 0x00007fffefc9e7a7 in rustc_codegen_llvm::back::lto::optimize_thin_module () at compiler/rustc_codegen_llvm/src/back/lto.rs:862
#14 <rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_ssa::traits::write::WriteBackendMethods>::optimize_thin () at compiler/rustc_codegen_llvm/src/lib.rs:168
#15 0x00007fffefd0e490 in rustc_codegen_ssa::back::lto::LtoModuleCodegen<B>::optimize () at /home/lonelyjoe/workspace/rust/compiler/rustc_codegen_ssa/src/back/lto.rs:79
#16 0x00007fffefc2a177 in rustc_codegen_ssa::back::write::execute_lto_work_item () at /home/lonelyjoe/workspace/rust/compiler/rustc_codegen_ssa/src/back/write.rs:889
#17 rustc_codegen_ssa::back::write::execute_work_item () at /home/lonelyjoe/workspace/rust/compiler/rustc_codegen_ssa/src/back/write.rs:742
#18 0x00007fffefd0f75c in rustc_codegen_ssa::back::write::spawn_work::{{closure}} () at /home/lonelyjoe/workspace/rust/compiler/rustc_codegen_ssa/src/back/write.rs:1663
#19 std::sys_common::backtrace::__rust_begin_short_backtrace () at /home/lonelyjoe/workspace/rust/library/std/src/sys_common/backtrace.rs:125
#20 0x00007fffefd17077 in std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}} () at /home/lonelyjoe/workspace/rust/library/std/src/thread/mod.rs:476
#21 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once () at /home/lonelyjoe/workspace/rust/library/std/src/panic.rs:347
#22 std::panicking::try::do_call () at /home/lonelyjoe/workspace/rust/library/std/src/panicking.rs:401
#23 std::panicking::try () at /home/lonelyjoe/workspace/rust/library/std/src/panicking.rs:365
#24 std::panic::catch_unwind () at /home/lonelyjoe/workspace/rust/library/std/src/panic.rs:434
#25 std::thread::Builder::spawn_unchecked::{{closure}} () at /home/lonelyjoe/workspace/rust/library/std/src/thread/mod.rs:475
#26 core::ops::function::FnOnce::call_once{{vtable-shim}} () at /home/lonelyjoe/workspace/rust/library/core/src/ops/function.rs:227
#27 0x00007fffee8d9a67 in <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /home/lonelyjoe/workspace/rust/library/alloc/src/boxed.rs:1572
#28 <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /home/lonelyjoe/workspace/rust/library/alloc/src/boxed.rs:1572
#29 std::sys::unix::thread::Thread::new::thread_start () at library/std/src/sys/unix/thread.rs:74
#30 0x00007fffee05b6db in start_thread (arg=0x7fffe2625700) at pthread_create.c:463
#31 0x00007fffee59871f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

@nagisa
Copy link
Member

nagisa commented Oct 9, 2021

This is definitely not an infinite loop. Replacing the 50 with smaller values the compile times grow at a

11 = 1.94s
12 = 2.87s
13 = 4.20s
14 = 6.14s
15 = 8.41s
17 = 16.30s
19 = 29.05s

From what I can tell LLVM ends up unconditionally unrolling all 3 loops here, producing n³ instruction sets of

%n = getelementptr inbounds [5 x [5 x [5 x { i8, i8 }]]], [5 x [5 x [5 x { i8, i8 }]]]* %0, i64 0, i64 4, i64 4, i64 3, i32 1
store i8 ..., i8* %n, align 1

At N=50, there will be 250k instructions in a single function, which will naturally take quite a long time to process.

@nagisa
Copy link
Member

nagisa commented Oct 9, 2021

I filed https://bugs.llvm.org/show_bug.cgi?id=52123 but not holding my breath about this being resolved quickly – there are a couple similar issues on the tracker:

https://bugs.llvm.org/show_bug.cgi?id=51922
https://bugs.llvm.org/show_bug.cgi?id=51841
and a couple similar.

@nagisa nagisa changed the title Compiler doesn't terminate with --release LLVM unrolls loops fully, leading to non-linear compilation time Oct 9, 2021
@nikic nikic removed the E-needs-test Call for participation: Writing correctness tests. label Mar 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-hang Issue: The compiler never terminates, due to infinite loops, deadlock, livelock, etc. ICEBreaker-LLVM Bugs identified for the LLVM ICE-breaker group P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants