LLVM unreachable instruction #11

hudson-ayers · 2020-09-25T00:44:34Z

While symbolically executing a function, Haybale threw the following error:

'UnreachableInstruction': Reached an LLVM 'Unreachable' instruction

Should I interpret this to mean the code under analysis is somehow invalid?

cdisselkoen · 2020-09-25T03:46:09Z

This means that LLVM doesn't think the instruction should be reachable, based on LLVM's own semantics, but Haybale was able to reach it. That's unexpected, as Haybale is more-or-less intended to provide exactly LLVM's semantics. I think it's safe to assume that any LLVM code generated by a production compiler (clang, rustc, etc.) is valid, so the problem is somewhere in Haybale or maybe in whatever code you have that's sitting on top of Haybale. Off the top of my head, here are a few possible causes:

- A bug in Haybale.
- There are some known limitations with how Haybale handles LLVM Invoke/Resume that may cause Haybale to explore paths which LLVM thinks aren't possible. If this is the case in your example, then the path with this error should just be ignored.
- A call to an external function that never returns (e.g., a function related to panic handling in whatever system/language you're analyzing), but the hook in your Config returned something other than ReturnValue::Abort.
- Maybe LLVM has a function parameter marked nonnull, and therefore thinks some code is unreachable because the parameter would have to be null in order to get there. Haybale doesn't currently pay attention to the nonnull attribute.
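
To make the never-returning-call cause concrete, here is a toy model of it. Every type and name below (HookResult, PathOutcome, Inst, run_path, panic_handler) is hypothetical and made up for illustration; none of it is Haybale's actual API. It only shows why a hook that lets a never-returning call "return" makes the executor walk into the unreachable that the compiler placed after the call.

```rust
// Toy model only: every type and name here is hypothetical, not Haybale's API.

#[derive(Debug, PartialEq)]
enum HookResult {
    Return, // the hooked call returns normally and execution continues
    Abort,  // the hooked call never returns, so the path ends at the call
}

#[derive(Debug, PartialEq)]
enum PathOutcome {
    Finished,
    Aborted,
    ReachedUnreachable,
}

enum Inst {
    Call(&'static str),
    Unreachable,
}

// A hook that (incorrectly) treats every callee as returning.
fn returning_hook(_callee: &str) -> HookResult {
    HookResult::Return
}

// A hook that knows the (hypothetical) panic handler never returns.
fn aborting_hook(callee: &str) -> HookResult {
    if callee == "panic_handler" {
        HookResult::Abort
    } else {
        HookResult::Return
    }
}

// Walk a straight-line path, consulting the hook at each call.
fn run_path(insts: &[Inst], hook: fn(&str) -> HookResult) -> PathOutcome {
    for inst in insts {
        match inst {
            Inst::Call(callee) => {
                if hook(*callee) == HookResult::Abort {
                    return PathOutcome::Aborted;
                }
            }
            Inst::Unreachable => return PathOutcome::ReachedUnreachable,
        }
    }
    PathOutcome::Finished
}

// The shape a compiler typically emits after a call it knows cannot return.
fn panic_then_unreachable() -> Vec<Inst> {
    vec![Inst::Call("panic_handler"), Inst::Unreachable]
}

fn main() {
    println!("{:?}", run_path(&panic_then_unreachable(), returning_hook));
    println!("{:?}", run_path(&panic_then_unreachable(), aborting_hook));
}
```

In this model, the only difference between the two runs is what the hook reports for the panic handler; the instruction sequence is identical.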

hudson-ayers · 2020-09-25T19:51:39Z

Thanks for your response! I will look into this some more and see if I can figure out which of these causes is responsible.

hudson-ayers · 2022-05-11T19:24:20Z

I dug into one of the examples where I was hitting this issue. Here is the function being executed (disable()):

#[derive(Copy, Clone, Debug)]
pub enum Clock {
    HSB(HSBClock),
    PBA(PBAClock),
    PBB(PBBClock),
    PBC(PBCClock),
    PBD(PBDClock),
}

impl ClockInterface for Clock {
    fn disable(&self) {
        match self {
            &Clock::HSB(v) => mask_clock!(HSB_MASK_OFFSET: hsbmask & !(1 << (v as u32))),
            &Clock::PBA(v) => mask_clock!(PBA_MASK_OFFSET: pbamask & !(1 << (v as u32))),
            &Clock::PBB(v) => mask_clock!(PBB_MASK_OFFSET: pbbmask & !(1 << (v as u32))),
            &Clock::PBC(v) => mask_clock!(PBC_MASK_OFFSET: pbcmask & !(1 << (v as u32))),
            &Clock::PBD(v) => mask_clock!(PBD_MASK_OFFSET: pbdmask & !(1 << (v as u32))),
        }
    }
}

And here is the LLVM IR generated for that function:

; Function Attrs: minsize nofree norecurse nounwind optsize
define internal fastcc void @"_ZN75_$LT$sam4l..pm..Clock$u20$as$u20$kernel..platform..chip..ClockInterface$GT$7disable17h5694fc505bd2f03dE"({ i8, i8 }* noalias nocapture noundef readonly align 1 dereferenceable(2) %0) unnamed_addr #8 !dbg !77370 {
  call void @llvm.dbg.value(metadata { i8, i8 }* %0, metadata !77372, metadata !DIExpression()), !dbg !77393
  %2 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %0, i32 0, i32 0, !dbg !77394
  %3 = load i8, i8* %2, align 1, !dbg !77394, !range !77395
  %4 = zext i8 %3 to i32, !dbg !77394
  switch i32 %4, label %5 [
    i32 0, label %6
    i32 1, label %12
    i32 2, label %18
    i32 3, label %24
    i32 4, label %30
  ], !dbg !77396

5:                                                ; preds = %1
  unreachable, !dbg !77394

6:                                                ; preds = %1
  %7 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %0, i32 0, i32 1, !dbg !77397
  %8 = load i8, i8* %7, align 1, !dbg !77397, !range !77398
...

Haybale is reaching the "unreachable" in basic block 5. As you can see, LLVM uses basic block 5 as the default label for the switch statement, since it should be impossible for the input integer (%4) to be anything other than 0-4 based on the definition of the enum. However, Haybale is apparently unaware of this constraint on the input integer, and thus considers bb %5 a reachable path. Notably, I have not tried executing Haybale on just this function; I am reaching it as part of a larger execution (not sure if that matters). Any thoughts on why this might be happening?

hudson-ayers · 2022-05-11T23:30:15Z

I tried executing just this method directly and got the same result.

cdisselkoen · 2022-05-12T00:28:37Z

There is nothing in the LLVM IR (other than the unreachable itself) that communicates the restriction that %4 must be in the range 0-4. So I don't see how Haybale could know this. As Haybale is designed to follow LLVM IR semantics, Haybale is correct in reporting that bb %5 is reachable.
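
As a rough illustration, the following sketch brute-forces the switch over every possible value of the loaded discriminant byte. It is a toy model, not Haybale code: Target, switch_target, and values_reaching_default are names made up here, and the "enum invariant" flag stands in for the Rust-level knowledge that the discriminant is always 0-4.

```rust
// Toy model of the five-way switch with an `unreachable` default; not Haybale code.

#[derive(Debug, PartialEq, Clone, Copy)]
enum Target {
    Variant(u8),        // one of the five explicit switch cases
    DefaultUnreachable, // the default label, which holds `unreachable`
}

// Mirrors the switch: cases 0 through 4 are explicit, everything else is the default.
fn switch_target(discriminant: u8) -> Target {
    match discriminant {
        0..=4 => Target::Variant(discriminant),
        _ => Target::DefaultUnreachable,
    }
}

// Count how many 8-bit discriminant values lead to the default label.
// `assume_valid_enum` models knowing that only 0..=4 can ever be stored.
fn values_reaching_default(assume_valid_enum: bool) -> usize {
    (0..=u8::MAX)
        .filter(|&d| !assume_valid_enum || d <= 4)
        .filter(|&d| switch_target(d) == Target::DefaultUnreachable)
        .count()
}

fn main() {
    // prints "unconstrained: 251"
    println!("unconstrained: {}", values_reaching_default(false));
    // prints "with enum invariant: 0"
    println!("with enum invariant: {}", values_reaching_default(true));
}
```

An executor that treats the byte as an arbitrary 8-bit value sees 251 inputs that take the default branch; only with the extra invariant does that branch become infeasible.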

This seems to be a compelling example to motivate squashing the UnreachableInstruction errors in your code. We could add a setting to Haybale to have it squash them itself, but that would raise the question of whether we should have a similar setting for all the other error types, or perhaps a user-defined lambda that takes an error and returns a bool indicating whether to squash it. That quickly becomes a slippery slope.

It seems much simpler to me to leave this outside of the scope of Haybale. Haybale iterates over all the paths in the LLVM IR, which includes this one; and it's up to Haybale's caller to decide what to do with each path. Callers are free to do anything they want with paths that end in errors, based on the particular error type or any other information they might know. In your case, I might recommend that your calling code just ignore paths that resulted in UnreachableInstruction because they are impossible (assuming that this example generalizes).
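
A minimal sketch of what that caller-side filtering could look like. The types below (PathError, PathResult, drop_unreachable_paths) are hypothetical stand-ins for whatever per-path results your code collects; they are not Haybale's actual types, so the error match would need adapting to the real error value you get back.

```rust
// Hypothetical stand-ins for per-path results collected by the caller; not Haybale's types.

#[derive(Debug, Clone, PartialEq)]
enum PathError {
    UnreachableInstruction,
    Other(String),
}

// Ok holds some description of the path's return value (a String here for simplicity).
type PathResult = Result<String, PathError>;

// Keep every path except those that ended at an `unreachable` instruction.
// Other errors are kept so that the caller can still inspect or report them.
fn drop_unreachable_paths(paths: Vec<PathResult>) -> Vec<PathResult> {
    paths
        .into_iter()
        .filter(|p| !matches!(p, Err(PathError::UnreachableInstruction)))
        .collect()
}

fn main() {
    let paths = vec![
        Ok("void".to_string()),
        Err(PathError::UnreachableInstruction),
        Err(PathError::Other("solver timeout".to_string())),
        Ok("void".to_string()),
    ];
    for path in drop_unreachable_paths(paths) {
        println!("{:?}", path);
    }
}
```

Filtering on the specific error kind, rather than dropping all errors, keeps genuinely unexpected failures visible.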

hudson-ayers · 2022-05-13T15:59:39Z

Thanks, this makes a lot of sense. I will try to take a look at a couple more examples to confirm this generalizes, then go forward with ignoring those paths.

hudson-ayers · 2022-06-01T20:44:52Z

Ignoring these paths has been sufficient for my purposes. Thanks for the guidance.

hudson-ayers closed this as completed Jun 1, 2022
