non well-formed symbol mangling and bad symbol tracking issue #46552

Open
m4b opened this issue Dec 7, 2017 · 2 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

m4b commented Dec 7, 2017

So, here's what I'd ultimately like to see, so that stuff like this doesn't happen again:

  1. There is a single, canonical name mangler in Rust. Anything that ever emits a binary symbol uses this mangler, without exception, and can't modify the bytes; the mangler alone is responsible for the ABI-facing names.
  2. Even better, this is enforced in the types, and only at the last minute is the raw str pointer or byte slice passed to the backend to emit ABI names.
  3. The Rust test suite verifies that every symbol in a generated binary is validly mangled.
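Points 1 and 2 could be sketched roughly as follows. This is only an illustration of the idea: `mangle`, `MangledName`, and `mangle_path` are hypothetical names that do not exist in rustc, and the hash component that real legacy symbols carry is left out.

```rust
// Hypothetical sketch; none of these names exist in rustc.
mod mangle {
    /// A symbol name that can only be produced by `mangle_path`.
    /// The inner field is private, so code outside this module can
    /// neither construct a `MangledName` nor modify its bytes.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct MangledName(String);

    impl MangledName {
        /// Read-only view, meant to be handed to the backend at the last minute.
        pub fn as_bytes(&self) -> &[u8] {
            self.0.as_bytes()
        }

        /// Read-only view as a string slice.
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }

    /// Legacy-style shape: `_ZN`, then length-prefixed components, then `E`.
    /// Simplification: the trailing hash component is not generated here.
    pub fn mangle_path(components: &[&str]) -> MangledName {
        let mut out = String::from("_ZN");
        for c in components {
            out.push_str(&c.len().to_string());
            out.push_str(c);
        }
        out.push('E');
        MangledName(out)
    }
}
```

With this shape, `mangle::mangle_path(&["std", "sys_common", "thread_info"])` yields `_ZN3std10sys_common11thread_infoE`, and the only way for other code to obtain a `MangledName` is to go through the mangler.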

You might be surprised to learn that 1 is not true in practice. So I've opened this issue for tracking various cases I've found, and what follows are the specimens:

Examples

As of rustc 1.22, on the following program compiled via rustc hello.rs:

fn main() {
    println!("hello");
}

I am seeing the following issues:

Spurious thread locals

symbol: .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E

40    LOCAL      TLS         .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E  0x0       .tdata(20)              0x0

Example source location: https://github.com//m4b/rust/blob/383e313d181eceb3155eb1089d448144f830ee23/src/libstd/sys_common/thread_info.rs#L21

Why incorrect

  1. Does not start with _ZN.

  2. This looks like it's meant to be the section name. The exact same TLS variable does occur later on, with the same address:

40    LOCAL      TLS         std::sys_common::thread_info::THREAD_INFO::__getit::__KEY::hc69464ef038d7e85                            0x30      .tdata(20)              0x0

but correctly mangled.

This could be an LLVM bug. I have verified that it only seems to occur for variables generated via the thread_local! macro, and that the spurious symbol is already present inside the crate's .rlib static archive. You can extract the object file with something like:

ar p libstd-fe0b1b991511fcaa.rlib std-fe0b1b991511fcaa.std0.rust-cgu.o > libstd.o

(Your libstd-<hash> will vary, of course.)

E.g. here is every symbol with either .tdata in its name or referencing that section in rust 1.22 libstd object file (inside .rlib):

sections:
  3473  .tdata._ZN3std11collections4hash3map11RandomState3new4KEYS7__getit5__KEY17h98644cd8ad1049dbE  SHT_PROGBITS  WRITE ALLOC TLS  0x42030  0x0  0x20  0x0  0x8
  3543  .tdata._ZN3std2io5stdio12LOCAL_STDOUT7__getit5__KEY17h53b08df14c3cb33dE  SHT_PROGBITS  WRITE ALLOC TLS  0x42440  0x0  0x28  0x0  0x20
  3627  .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E  SHT_PROGBITS  WRITE ALLOC TLS  0x42840  0x0  0x30  0x0  0x20
  3647  .tdata._ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h715a8958c4cd11efE  SHT_PROGBITS  WRITE ALLOC TLS  0x42980  0x0  0x28  0x0  0x20
  3659  .tdata._ZN3std9panicking18update_panic_count11PANIC_COUNT7__getit5__KEY17h01a9f669bb84595fE  SHT_PROGBITS  WRITE ALLOC TLS  0x42a18  0x0  0x18  0x0  0x8
  3665  .tdata._ZN3std4rand10thread_rng14THREAD_RNG_KEY7__getit5__KEY17h8ec4cb227256fe90E  SHT_PROGBITS  WRITE ALLOC TLS  0x42a80  0x0  0x10  0x0  0x8
symbols:
  0  LOCAL  TLS  .tdata._ZN3std10sys_common11thre…  0x0  .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E(3627)  0x0
  0  LOCAL  TLS  .tdata._ZN3std11collections4hash…  0x0  .tdata._ZN3std11collections4hash3map11RandomState3new4KEYS7__getit5__KEY17h98644cd8ad1049dbE(3473)  0x0
  0  LOCAL  TLS  .tdata._ZN3std2io5stdio12LOCAL_S…  0x0  .tdata._ZN3std2io5stdio12LOCAL_STDOUT7__getit5__KEY17h53b08df14c3cb33dE(3543)  0x0
  0  LOCAL  TLS  .tdata._ZN3std4rand10thread_rng1…  0x0  .tdata._ZN3std4rand10thread_rng14THREAD_RNG_KEY7__getit5__KEY17h8ec4cb227256fe90E(3665)  0x0
  0  LOCAL  TLS  .tdata._ZN3std9panicking12LOCAL_…  0x0  .tdata._ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h715a8958c4cd11efE(3647)  0x0
  0  LOCAL  TLS  .tdata._ZN3std9panicking18update…  0x0  .tdata._ZN3std9panicking18update_panic_count11PANIC_COUNT7__getit5__KEY17h01a9f669bb84595fE(3659)  0x0
  0  LOCAL  TLS  _ZN3std10sys_common11thread_info…  0x30  .tdata._ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17hc69464ef038d7e85E(3627)  0x0
  0  LOCAL  TLS  _ZN3std11collections4hash3map11R…  0x20  .tdata._ZN3std11collections4hash3map11RandomState3new4KEYS7__getit5__KEY17h98644cd8ad1049dbE(3473)  0x0
  0  LOCAL  TLS  _ZN3std2io5stdio12LOCAL_STDOUT7_…  0x28  .tdata._ZN3std2io5stdio12LOCAL_STDOUT7__getit5__KEY17h53b08df14c3cb33dE(3543)  0x0
  0  LOCAL  TLS  _ZN3std4rand10thread_rng14THREAD…  0x10  .tdata._ZN3std4rand10thread_rng14THREAD_RNG_KEY7__getit5__KEY17h8ec4cb227256fe90E(3665)  0x0
  0  LOCAL  TLS  _ZN3std9panicking12LOCAL_STDERR7…  0x28  .tdata._ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h715a8958c4cd11efE(3647)  0x0
  0  LOCAL  TLS  _ZN3std9panicking18update_panic_…  0x18  .tdata._ZN3std9panicking18update_panic_count11PANIC_COUNT7__getit5__KEY17h01a9f669bb84595fE(3659)

The spurious symbols will never be gc'd by the linker, and they will never get referenced; so they're just taking up space (albeit not much).

backtrace.rs nested static

symbol: _ZN3std10sys_common9backtrace11log_enabled7ENABLED17hc187c5b3618ccb2eE.0.0

Example source location: https://github.com//m4b/rust/blob/383e313d181eceb3155eb1089d448144f830ee23/src/libstd/sys_common/backtrace.rs#L148

Why incorrect

No mangled symbol is allowed to have characters after the final E, but this one ends in .0.0:

2622e8    LOCAL      OBJECT      _ZN3std10sys_common9backtrace11log_enabled7ENABLED17hc187c5b3618ccb2eE.0.0 0x8       .bss(27)                0x0 

I have definitely seen other examples of this, and with different numbers at the end; I think it has to do with nested statics somehow.
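The two criteria used so far (the name starts with _ZN, and nothing follows the final E) can be turned into a small check, which is roughly what point 3 of the wish list would need. The following is a minimal sketch, assuming the legacy shape `_ZN`, one or more length-prefixed components, then `E`; the function name `is_well_formed_legacy` is made up for illustration, and the check deliberately ignores which characters are allowed inside a component.

```rust
/// Returns true if `sym` looks like `_ZN (<len><ident>)+ E` with nothing
/// after the final `E`. Simplified: component contents are not inspected.
fn is_well_formed_legacy(sym: &str) -> bool {
    // Criterion 1: must start with `_ZN`.
    let mut rest = match sym.strip_prefix("_ZN") {
        Some(r) => r.as_bytes(),
        None => return false,
    };
    let mut components = 0;
    loop {
        // Parse the decimal length prefix of the next component.
        let digits = rest.iter().take_while(|b| b.is_ascii_digit()).count();
        if digits == 0 {
            break;
        }
        let len: usize = match std::str::from_utf8(&rest[..digits])
            .ok()
            .and_then(|s| s.parse().ok())
        {
            Some(n) => n,
            None => return false,
        };
        rest = &rest[digits..];
        if len == 0 || rest.len() < len {
            return false;
        }
        // Skip over the component itself.
        rest = &rest[len..];
        components += 1;
    }
    // Criterion 2: exactly one `E` must remain, with nothing after it.
    components > 0 && rest == &b"E"[..]
}
```

Under these assumptions the check accepts `_ZN3std10sys_common9backtrace11log_enabled7ENABLED17hc187c5b3618ccb2eE` but rejects both the `.0.0` variant above and the `.tdata.`-prefixed names from the previous section.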

Nightly

On nightly it looks like there has been a pretty substantial regression w.r.t. valid symbol names being output:

bingrep -D  -t 65 hello-nightly  | grep -e "E...[[:digit:]] "
26ce98    GLOBAL     OBJECT      ref.7.llvm.D64EB761                                                  0x18      .data.rel.ro(23)        0x2
26e160    GLOBAL     OBJECT      _ZN3std3sys4unix2os8ENV_LOCK17hbf5ac5d1fa9db31cE.llvm.D64EB761       0x28      .data(26)               0x2    
 5bf60    GLOBAL     OBJECT      str.4.llvm.6C0E7CF1                                                  0x1a      .rodata(16)             0x2    
 5bf7a    GLOBAL     OBJECT      str.5.llvm.6C0E7CF1                                                  0x0       .rodata(16)             0x2    
 548f0    GLOBAL     FUNC        _ZN4core3ptr13drop_in_place17hd0b6a86080ab42c4E.llvm.F74E5798        0x6       .text(14)               0x2    
 5bf80    GLOBAL     OBJECT      ref.7.llvm.6C0E7CF1                                                  0x40      .rodata(16)             0x2    
 5d1d0    GLOBAL     OBJECT      str.a.llvm.F74E5798                                                  0x1f      .rodata(16)             0x2    
26d920    GLOBAL     OBJECT      panic_bounds_check_loc.e.llvm.F74E5798                               0x18      .data.rel.ro(23)        0x2  

This runs the whole gamut: functions, global memory, and read-only strings all (sometimes) have extra characters appended.

Special Mentions

{{closure}} components in symbol names are useless, and very hard to print in debuggers.

E.g.:

47d80    LOCAL      FUNC        core::fmt::Formatter::pad_integral::{{closure}}::h6acabc645f5ef2ad 0x10f     .text(14)               0x0

which is from the use of this closure:

https://github.com//m4b/rust/blob/383e313d181eceb3155eb1089d448144f830ee23/src/libcore/fmt/mod.rs#L1108

The compiler knows the line number (it will even emit it sometimes, as in @[closure; mod.rs:1108] or whatever); why not just output that instead of {{closure}}?

m4b referenced this issue in rust-lang/rustc-demangle Dec 7, 2017
@pietroalbini pietroalbini added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jan 23, 2018
@pnkfelix pnkfelix added A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 20, 2018
bjorn3 commented Mar 4, 2022

The .tdata.* sections are used to store TLS data. The respective symbols point to the start of the sections. The same is done for all other kinds of sections. If you use -Zfunction-sections=no, there should be a single .tdata section and thus a single .tdata symbol. The .llvm.* suffixes are added by LLVM's ThinLTO to prevent conflicts between symbols that were private but had to be made public due to cross-cgu inlining. The v0 symbol mangling scheme has been ratified to explicitly allow this. The str.*, ref.* and panic_bounds_check_loc.* symbols are probably unnamed local data objects given a unique name. I don't think there is any bug here.
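For tools that want to demangle such names, one option is to split off the ThinLTO suffix first. Here is a minimal sketch, assuming the suffix always has the form `.llvm.<hex digits>` as in the nightly listing above; `split_llvm_suffix` is a hypothetical helper, not an existing rustc or LLVM API.

```rust
/// Splits a trailing `.llvm.<hex>` suffix off a symbol name.
/// Returns the base name and, if a suffix was found, its hex id.
/// Assumption: the id consists only of hex digits, as in the examples.
fn split_llvm_suffix(sym: &str) -> (&str, Option<&str>) {
    const MARKER: &str = ".llvm.";
    if let Some(pos) = sym.rfind(MARKER) {
        let id = &sym[pos + MARKER.len()..];
        if !id.is_empty() && id.bytes().all(|b| b.is_ascii_hexdigit()) {
            return (&sym[..pos], Some(id));
        }
    }
    // No recognizable suffix: return the name unchanged.
    (sym, None)
}
```

For `_ZN3std3sys4unix2os8ENV_LOCK17hbf5ac5d1fa9db31cE.llvm.D64EB761` this returns the base name ending in `E` together with the id `D64EB761`, so the base can then be passed to a demangler unchanged.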

bjorn3 commented May 16, 2022

I think this can be closed as expected.
