Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGSEGV in LLVM when linking to extern "Rust" { fn main(); } #67946

Closed
jonas-schievink opened this issue Jan 6, 2020 · 7 comments · Fixed by #69379
Closed

SIGSEGV in LLVM when linking to extern "Rust" { fn main(); } #67946

jonas-schievink opened this issue Jan 6, 2020 · 7 comments · Fixed by #69379
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@jonas-schievink
Copy link
Contributor

jonas-schievink commented Jan 6, 2020

This code crashes in LLVM during codegen on stable, beta, and nightly:

#[no_mangle]
fn whatever() {
    extern "Rust" {
        fn main();
    }

    unsafe { main(); }
}

This originally happened in real-world code (a code base derived from cortex-m-rt).

This issue has been assigned to @jumbatm via this comment.

@jonas-schievink jonas-schievink added I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-nominated T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-bug Category: This is a bug. labels Jan 6, 2020
@Mark-Simulacrum
Copy link
Member

We generally don't guarantee behavior with no_mangle, I think. I guess a segfault in llvm is unexpected though :)

@jonas-schievink
Copy link
Contributor Author

I think the #[no_mangle] is a red herring, since the name of that function doesn't actually matter (ie. it can be malloc or something completely harmless). I think it's just needed to make sure the function is codegen-ed.

@pnkfelix
Copy link
Member

pnkfelix commented Jan 9, 2020

I keep seeing this in my output when I play around with this on the playpen:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Here's a variant that does not use #[no_mangle], adding credence to @jonas-schievink 's point that it is probably a red herring.

@pnkfelix pnkfelix changed the title SIGSEGV in LLVM when calling main from #[no_mangle] fn SIGSEGV in LLVM when linking to extern "Rust" { fn main(); } Jan 9, 2020
@pnkfelix
Copy link
Member

pnkfelix commented Jan 9, 2020

triage: P-high. Leaving nominated in hopes that someone at triage meeting has a bright idea has to what's going on here (and what we might do about it).

@pnkfelix pnkfelix added the P-high High priority label Jan 9, 2020
@Mark-Simulacrum
Copy link
Member

Hm, bad_alloc implies OOM to me, but maybe that's not always the case.

@jumbatm
Copy link
Contributor

jumbatm commented Feb 15, 2020

Compiling with LLVM debug symbols + assertions, I get the same assertion failure for both:

rustc: /home/jon/src/rust/src/llvm-project/llvm/include/llvm/Support/Casting.h:264: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = llvm::Function; Y = llvm::Value; typename llvm::cast_retty<X, Y*>::ret_type = llvm::Function*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

Tracing the callstack with gdb, the assertion is triggered on the unwrap call in this function:

extern "C" void LLVMRustRemoveFunctionAttributes(LLVMValueRef Fn,
unsigned Index,
LLVMRustAttribute RustAttr) {
Function *F = unwrap<Function>(Fn);
Attribute Attr = Attribute::get(F->getContext(), fromRust(RustAttr));

As the error suggests, this means that the downcast from llvm::Value to llvm::Function failed for Fn.

Printing out what type Fn is (with some casting because gdb doesn't know what the return types are):

(gdb) call (void)LLVMDumpType((LLVMTypeRef)LLVMTypeOf(Fn))
i32 (i32, i8**)*

That's strange -- it's reported as a function pointer, which is the same as what's reported for actual llvm::Functions, but we're still failing the downcast.

Usually, dumping an llvm::Function should give you the contents of the function. For example:

    auto *mainTy = llvm::FunctionType::get(llvm::Type::getInt32Ty(theContext), { llvm::Type::getInt32Ty(theContext), llvm::Type::getInt8PtrTy(theContext)->getPointerTo() }, false);

    auto *mainFn = llvm::Function::Create(mainTy, llvm::GlobalValue::LinkageTypes::ExternalLinkage, "main", module);
    auto *entry = llvm::BasicBlock::Create(theContext, "entry", mainFn);

    llvm::IRBuilder<> builder(theContext);
    builder.SetInsertPoint(entry);
    auto *inst = builder.CreateRet(llvm::ConstantInt::get(llvm::Type::getInt32Ty(theContext), 0));

gives us

define i32 @main(i32, i8**) {
entry:
  ret i32 0
}

But dumping Fn gives this:

(gdb) p (void)LLVMDumpValue(Fn)
i32 (i32, i8**)* ret (void ()* @main)

So it appears to be pointing to a ret instruction? I'm not 100% sure -- I've ever seen a ret instruction being prefixed with a type, but I've taken it to be indicating the type of @main, the operand.

I'll do a bit more digging.

@rustbot claim

@rustbot rustbot assigned rustbot and unassigned rustbot Feb 15, 2020
@jumbatm
Copy link
Contributor

jumbatm commented Feb 16, 2020

After manually removing the LLVM build to force a complete rebuild with debug symbols, I was able to get a proper dump which makes far more sense:

...
(gdb) call $5->dump()
i32 (i32, i8**)* bitcast (void ()* @main to i32 (i32, i8**)*)
(gdb) 

Note that this is not a bitcast instruction but a constant expr bitcast -- ie, one generated at compile time.

As it turns out, in the event that a function exists with athe same name but a different type in a module, llvm::Module::getOrInsertFunction(llvm::StringRef, llvm::FunctionType, llvm::AttributeList) will return a compile time bitcast to that function:

llvm/lib/IR/Module.cpp:157-161

  // If the function exists but has the wrong type, return a bitcast to the
  // right type.
  auto *PTy = PointerType::get(Ty, F->getAddressSpace());
  if (F->getType() != PTy)
    return {Ty, ConstantExpr::getBitCast(F, PTy)};

which is the overload called here

extern "C" LLVMValueRef LLVMRustGetOrInsertFunction(LLVMModuleRef M,
const char *Name,
LLVMTypeRef FunctionTy) {
return wrap(
unwrap(M)->getOrInsertFunction(Name, unwrap<FunctionType>(FunctionTy))
#if LLVM_VERSION_GE(9, 0)
.getCallee()
#endif
);
}

So, later, when we attempt to use any LLVMRust* functions that do a downcast to llvm::Function,
we trigger the assertion failure because the downcast from constant expr bitcast to llvm::Function is not valid.

We run into this in both your cases because when codegen attempts to insert a declaration for extern "Rust" { fn main(); } with signature void ()*, the program's main already exists in the module with the signature i32 (i32, i8**)*, so the bitcast is returned instead of an llvm:Function *.

This issue can be reproduced without causing a crash by forcing a name collision with a function that's not main (as we don't end up trying to cast the bitcast to a Function in this scenario):

#[no_mangle]
pub fn bar(b: bool) {
    // ...
}

pub fn foo() {
    extern "Rust" {
        fn bar();
    }
    unsafe { bar(); }
}

Compiling with

rustc +stage1 foo.rs --crate-type=rlib --emit=llvm-ir -Copt-level=0

we get this bitcode here:

...
; Function Attrs: nonlazybind uwtable
define void @bar(i1 zeroext %b) unnamed_addr #0 {
start:
  ret void
}

; foo::foo
; Function Attrs: nonlazybind uwtable
define void @_ZN3foo3foo17he8a114b84f989db4E() unnamed_addr #0 {
start:
  call void bitcast (void (i1)* @bar to void ()*)()
  br label %bb1

bb1:                                              ; preds = %start
  ret void
}
...

which I think exposes the problem more clearly: In an extern block which declares a function with the same name as some existing function within the module, we'll end up generating a call to the function in the module (potentially with an unsafe bitcast), instead of shadowing the module's function and generating a call to the external function (which, to me, appeared to be the intent of the example).

You can see that this example shows that the #[no_mangle] in the original example isn't quite so much a red herring as it might seem -- in my example particularly, it's needed to ensure the name collision which demonstrates this behaviour. The examples without #[no_mangle] work because we induce a name collision on the unmangled main function instead (not the user-defined one; the one which calls rt::lang_start).

At this point, I'm not sure how to proceed. Should we error out here because of the name collision to avoid generating crazy function pointer bitcasts? Or should we do some kind of trickery to rename the shadowing symbol so that it correctly calls the extern function? Or is it as @Mark-Simulacrum says -- is the current codegen into bitcast fine because we don't make any behaviour guarantees for #[no_mangle] symbols?

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Feb 27, 2020
Fail on multiple declarations of `main`.

Closes rust-lang#67946.

Previously, when inserting the entry function, we only checked for
duplicate _definitions_ of `main`.  However, it's possible to cause
problems even only having a duplicate _declaration_. For example,
shadowing `main` using an extern block isn't caught by the current
check, and causes an assertion failure down the line in in LLVM code.

r? @pnkfelix
@bors bors closed this as completed in 350491d Feb 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants