Demangle ThinLTO-generated symbols as well
alexcrichton committed Nov 25, 2017
1 parent 6c94e7f commit 48646c6
Showing 1 changed file with 26 additions and 1 deletion.
27 changes: 26 additions & 1 deletion src/lib.rs
@@ -80,7 +80,25 @@ pub struct Demangle<'a> {
// Note that this demangler isn't quite as fancy as it could be. We have lots
// of other information in our symbols like hashes, version, type information,
// etc. Additionally, this doesn't handle glue symbols at all.
pub fn demangle(s: &str) -> Demangle {
pub fn demangle(mut s: &str) -> Demangle {
// During ThinLTO LLVM may import and rename internal symbols, so strip out
// those endings first as they're one of the last manglings applied to symbol
// names.
let llvm = ".llvm.";
if let Some(i) = s.find(llvm) {
let candidate = &s[i + llvm.len()..];
let all_hex = candidate.chars().all(|c| {
match c {
'A' ... 'F' | '0' ... '9' => true,
_ => false,
}
});

if all_hex {
s = &s[..i];
}
}

// First validate the symbol. If it doesn't look like anything we're
// expecting, we just print it literally. Note that we must handle non-rust
// symbols because we could have any function in the backtrace.
@@ -358,4 +376,11 @@ mod tests {
// Not a valid hash, has a non-hex-digit.
t_nohash!("_ZN3foo17hg5af221e174051e9E", "foo::hg5af221e174051e9");
}

#[test]
fn demangle_thinlto() {
// One element, no hash.
t!("_ZN3fooE.llvm.9D1C9369", "foo");
t_nohash!("_ZN9backtrace3foo17hbb467fcdaea5d79bE.llvm.A5310EB9", "backtrace::foo");
}
}
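
For readers who want to try the suffix check outside the crate, here is a minimal standalone sketch of the same logic in current Rust syntax. The function name `strip_llvm_suffix` is hypothetical (the commit inlines this logic in `demangle`), and `matches!` with `..=` ranges stands in for the older `...` range patterns shown in the diff.

```rust
/// Hypothetical helper mirroring the check added in this commit:
/// if the text after the first `.llvm.` consists only of uppercase
/// hex digits, drop `.llvm.` and everything after it.
fn strip_llvm_suffix(s: &str) -> &str {
    const LLVM: &str = ".llvm.";
    if let Some(i) = s.find(LLVM) {
        let candidate = &s[i + LLVM.len()..];
        // Same character set as the diff: 'A'..'F' and '0'..'9' only.
        let all_hex = candidate
            .chars()
            .all(|c| matches!(c, 'A'..='F' | '0'..='9'));
        if all_hex {
            return &s[..i];
        }
    }
    // No suffix, or the suffix is not pure hex: leave the name alone.
    s
}

fn main() {
    for sym in [
        "_ZN3fooE.llvm.9D1C9369",
        "_ZN3fooE.llvm.not_hex",
        "_ZN3fooE",
    ] {
        println!("{} -> {}", sym, strip_llvm_suffix(sym));
    }
}
```

Like the committed code, this sketch accepts only uppercase hex digits and treats an empty tail after `.llvm.` as a match.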

9 comments on commit 48646c6

@m4b
Contributor

@m4b m4b commented on 48646c6 Dec 7, 2017

I just added an issue about this and other issues here rust-lang/rust#46552 and then randomly checked this repo for updates.

So this seems intentional; but why? c++filt no longer works out of the box :/ and now comments like this https://github.com/m4b/rust/blob/383e313d181eceb3155eb1089d448144f830ee23/src/librustc_trans/back/symbol_names.rs#L339-L351 are invalid/misleading, and this is very likely causing trouble in debugger backends...

@alexcrichton
Member Author

> So this seems intentional; but why?

LLVM generates them and the job of this crate is to "just work"? Tools like c++filt never were at 100% parity with our name mangling before anyway?

@m4b
Contributor

@m4b m4b commented on 48646c6 Dec 7, 2017

I never had any trouble with c++filt until recently.

I find it hard to believe LLVM is just spitting out symbols willy-nilly without any say in the matter. C++ also does LTO with LLVM, and this format would break their name mangler as well and could potentially cause huge problems, so it seems there has to be a way to get it to output well-formed names...

Also my last two points are still valid I think, which makes the existence of these names troublesome.

I understand this crate is supposed to just work, and that’s good, but I guess:

  1. I don’t understand Rust's name mangling scheme anymore
  2. I was just trying to get some clarification about what’s going on, what can be done.

Since this crate seems somewhat canonical wrt rust names it worried me a little bit that these names got implemented (and hence validated).

I dunno if that makes sense?

@alexcrichton
Member Author

Er to be clear it's not rustc doing this mangling, it's llvm's ThinLTO passes.

@m4b
Contributor

@m4b m4b commented on 48646c6 Dec 8, 2017

Yes, that LLVM was emitting this was the understanding I was operating under, but I assumed the rustc compiler was incorrectly passing mangled names (but it seems it's using the interface fine, iiuc).

After investigating this, I believe that this is a pretty serious bug. As expected, I am unable to emit C++ mangled names with .llvm and a hash (like above) appended with ThinLTO for clang 5.0. (Doing so would break C++ symbol names, and if it were a global symbol, could cause serious linking errors, in addition to breaking all debug info.)

However, I was able to do this with clang 3.9 in some cases.

That rustc (or technically LLVM) is emitting these names is a bug; in the cases where they are emitted, all debug information either disappears from the binary for those symbols, or the debug information references a nonexistent symbol (since it's not expecting .llvm.<hash> appended).

A simple repro is:

m4b@efrit ::  [ ~/tmp/bad_debug ] cat hello.rs
fn main() {
    println!("hello");
}
m4b@efrit ::  [ ~/tmp/bad_debug ] rustc --version
rustc 1.23.0-nightly (e97ba8328 2017-11-25)
m4b@efrit ::  [ ~/tmp/bad_debug ] rustc -g -o hello-nightly hello.rs
m4b@efrit ::  [ ~/tmp/bad_debug ] rustup default stable
  stable-x86_64-unknown-linux-gnu unchanged - rustc 1.22.1 (05e2e1c41 2017-11-22)
m4b@efrit ::  [ ~/tmp/bad_debug ] rustc -g -o hello hello.rs
m4b@efrit ::  [ ~/tmp/bad_debug ] bingrep -t150 hello | grep rust_panic
             131c0    LOCAL      FUNC        rust_panic                                                                                                                                                0x57      .text(14)                0x0    
m4b@efrit ::  [ ~/tmp/bad_debug ] bingrep -t150 hello-nightly | grep rust_panic
              afc0    GLOBAL     FUNC        rust_panic.llvm.526D1984                                                                                                                                  0x64      .text(14)                0x2    

You can then see that the DWARF debug information does not match:

m4b@efrit ::  [ ~/tmp/bad_debug ] /usr/bin/llvm-dwarfdump hello | grep rust_panic
                    DW_AT_linkage_name [DW_FORM_strp]	( .debug_str[0x00050f10] = "_ZN3std9panicking10rust_panicE")
                    DW_AT_name [DW_FORM_strp]	( .debug_str[0x00050f2f] = "rust_panic")
m4b@efrit ::  [ ~/tmp/bad_debug ] /usr/bin/llvm-dwarfdump hello-nightly | grep rust_panic
                    DW_AT_linkage_name [DW_FORM_strp]	( .debug_str[0x00017502] = "_ZN3std9panicking10rust_panicE")
                    DW_AT_name [DW_FORM_strp]	( .debug_str[0x00017521] = "rust_panic")

(i.e., the linkage_name field matches the properly mangled rust name in both cases)
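
To make the mismatch concrete, here is a small sketch of why a consumer that looks symbols up by exact name misses the renamed one, and what a suffix-tolerant comparison could look like. The helper `matches_symbol` is hypothetical and not part of any tool mentioned here; the two symbol names are the ones from the `bingrep` output above.

```rust
/// Hypothetical helper: true if `symtab_name` equals `wanted`,
/// optionally followed by `.llvm.` and uppercase hex digits.
fn matches_symbol(symtab_name: &str, wanted: &str) -> bool {
    match symtab_name.strip_prefix(wanted) {
        // Exact match, as in the stable binary.
        Some("") => true,
        // Something follows the name: accept only a `.llvm.<HEX>` tail.
        Some(rest) => rest.strip_prefix(".llvm.").map_or(false, |hex| {
            hex.chars()
                .all(|c| c.is_ascii_digit() || ('A'..='F').contains(&c))
        }),
        None => false,
    }
}

fn main() {
    let wanted = "rust_panic";
    for symtab_name in ["rust_panic", "rust_panic.llvm.526D1984"] {
        println!(
            "{}: exact = {}, suffix-tolerant = {}",
            symtab_name,
            symtab_name == wanted,
            matches_symbol(symtab_name, wanted)
        );
    }
}
```

This only illustrates the name comparison; it does not address debug information that is missing from the binary altogether.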

When I use my compiler patched with m4b/rust@383e313, the debug information isn't even emitted at all.

@m4b
Contributor

@m4b m4b commented on 48646c6 Dec 8, 2017

It might be that upgrading to LLVM 5.0 will magically fix these issues; I don't know, I hope so. Otherwise, this is a pretty big bug imho, and should probably be worked out before enabling Rust's ThinLTO by default.

@alexcrichton
Member Author

That looks like an excellent comment for upstream! (I'm sure others have thoughts on this as well...)

@m4b
Contributor

@m4b m4b commented on 48646c6 Dec 9, 2017

Do you mean Rust upstream or LLVM upstream?

If Rust, I already opened an issue regarding invalid symbol mangling in Rust (linked above); if LLVM upstream, I suppose I could open a bug, but if it’s not reproducible in 5.0 my guess is they won’t care?

@alexcrichton
Member Author

I don't really have an opinion, I just wanted to point out that ongoing discussion on this particular commit is probably not actually going to cause any change to happen.
