New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for the `linkage` feature #29603

Open
aturon opened this Issue Nov 5, 2015 · 12 comments

Comments

Projects
None yet
10 participants
@aturon
Member

aturon commented Nov 5, 2015

Tracks stabilization for the linkage attribute.

@aturon

This comment has been minimized.

Member

aturon commented Nov 5, 2015

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 5, 2015

Currently this translates to the various linkage models of LLVM, for example the values of the attribute can be:

match name {                                                         
    "appending" => Some(llvm::AppendingLinkage),                     
    "available_externally" => Some(llvm::AvailableExternallyLinkage),
    "common" => Some(llvm::CommonLinkage),                           
    "extern_weak" => Some(llvm::ExternalWeakLinkage),                
    "external" => Some(llvm::ExternalLinkage),                       
    "internal" => Some(llvm::InternalLinkage),                       
    "linkonce" => Some(llvm::LinkOnceAnyLinkage),                    
    "linkonce_odr" => Some(llvm::LinkOnceODRLinkage),                
    "private" => Some(llvm::PrivateLinkage),                         
    "weak" => Some(llvm::WeakAnyLinkage),                            
    "weak_odr" => Some(llvm::WeakODRLinkage),                        
    _ => None,                                                       
}                                                                    

Some worries about this are:

  • These are very LLVM specific, it's unclear how applicable they are to other backends.
  • Beyond external or weak, I've never seen a reason to use the other attributes.
  • These linkage methods are platform-specific and aren't guaranteed to work everywhere.

That being said, it's the only way to do weak symbols on Linux and it's also convenient for exporting a known symbol without worrying about privacy on the Rust side of things. I would personally want to reduce the set of accepted linkage types and then state "well of course linkage is platform-specific!"

@aturon

This comment has been minimized.

Member

aturon commented Nov 5, 2015

cc #29629

@mahkoh

This comment has been minimized.

Contributor

mahkoh commented Jan 11, 2016

This feature is fundamentally broken. Consider the following code:

static unsigned long B = 0;
unsigned long *A = &B;
void f(void) { }
#![feature(linkage)]

#[link_name = "c_part"]
extern {
    #[linkage = "extern_weak"] static A: *const usize;
    fn f();
}

fn main() {
    unsafe {
        f();
        println!("{:x} @ {:x} @ {:p}", *(*A as *const usize), *A, A);
    }
}

Which prints 0 @ 5617765adaa8 @ 0x5617765ad890 meaning that the A seen by the rust code contains the address of A and not A itself. It's also easy to get LLVM to abort with this.

@mahkoh

This comment has been minimized.

Contributor

mahkoh commented Jan 11, 2016

This attribute is also used incorrectly here.

@brson

This comment has been minimized.

Contributor

brson commented Jun 1, 2017

#18804 needs to be resolved before stabilization.

@nagisa

This comment has been minimized.

Contributor

nagisa commented Jan 6, 2018

I think we should at least work on stabilising the weak and extern_weak linkage. Both are very useful and a fairly widespread linkage options that seem to be supported (unlike many other options currently exposed) by at least all of the tier 1 targets.

To give an example of a use-case for weak linkage in Rust code (I need weak linkage for libloading), one would be for global statics unique even between multiple versions of the same crate. Consider that, currently, if a binary links log = ^0.3 and log = ^0.4 some way, or another, it will end up having two distinct global loggers. This could trivially be resolved with some use of the weak linkage option (as it ensures – at link time – only one instance of global with the same name).

That being said, #[linkage] should always infect whatever it applies to with unsafety. Consider for example these two crates:

// crate older version
#[linkage="weak"]
static mut FOO: u32 = !0;
// crate newer version
#[linkage="weak"]
static mut FOO: char = 'a';

when linked together, all uses of FOO as seen from the newer-version crate would be UB if the linker happened to choose to link-in the value from the older version. This is functionally a transmute without size checks.

@mahkoh

This comment has been minimized.

Contributor

mahkoh commented Jan 13, 2018

extern_weak is broken in such a dubious way that even though I pointed out two years ago that it was being used incorrectly in the stdlib, the bug still hasn't been fixed.

~$ cat strong.rs 
extern {
    static __progname: *const u8;
}

fn main() {
    unsafe {
        println!("  __progname:\t\t{:?}", __progname);
        println!(" &__progname:\t\t{:?}", &__progname as *const _);
        if !__progname.is_null() {
            println!(" *__progname:\t\t{:?}", *__progname as char);
        }
    }
}
~$ cat weak.rs 
#![feature(linkage)]

extern {
    #[linkage = "extern_weak"]
    static __progname: *mut *const u8;
}

fn main() {
    unsafe {
        println!("  __progname:\t\t{:?}", __progname);
        println!(" &__progname:\t\t{:?}", &__progname as *const _);
        if !__progname.is_null() {
            println!(" *__progname:\t\t{:?}", *__progname);
            if !(*__progname).is_null() {
                println!("**__progname:\t\t{:?}", **__progname as char);
            }
        }
    }
}
~$ rustc strong.rs 
~$ rustc weak.rs 
~$ ./strong 
  __progname:		0x7ffdfc7438ac
 &__progname:		0x7f10560a5360
 *__progname:		's'
~$ ./weak 
  __progname:		0x7f730b21f360
 &__progname:		0x5574a33bb008
 *__progname:		0x7fffe4ca38b2
**__progname:		'w'

gcc and clang handle this attribute correctly.

PS: The address of __dso_handle isn't actually too significant in __cxa_thread_atexit_impl and &__dso_handle is only a few bytes off.

PSS: Wow, looks like I already explained this above (also two years ago). Maybe the NSA is trying to keep this potential remote code execution unfixed. 🤔

@cramertj

This comment has been minimized.

Member

cramertj commented Jan 17, 2018

Ping @alexcrichton @nagisa what's the status here? Is there a bug here that can be solved / mentored?

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Jan 17, 2018

@cramertj I personally consider this a perma-unstable issue for now. In general symbol visibility and ABIs are something that historically rustc hasn't done much to specify and has had a lot of freedom over. We relatively regularly tweak ABIs, symbol names, etc. There's a very thin layer at the end (like a C ABI) which is pretty stable but even that gets sketchy sometimes (#[no_mangle] deep in a module hierarchy?).

I think we've benefitted quite greatly from the symbol visibility flexibility we've had historically in terms of compiler refactorings and heading off regressions. It's hard to introduce a regression when you can't rely on the feature in the first place!

Along those lines I think there's definitely some select use cases where using something like #[linkage] is critical, but from what I've seen they tend to be few and far between and somewhat esoteric. A blanket and general #[linkage] attribute I think is way too powerful to solve this use case and it'd be better to poke around at various motivational use cases to see if there's a more narrow solution.

(plus that and the whole #[linkage] is incredibly platform/LLVM specific and I don't think we full understand all the linkage modes in LLVM and how they apply to all platforms as well)

@acmcarther

This comment has been minimized.

acmcarther commented Mar 28, 2018

Given that crate owners can't control how many instances of their crate will be included in a given binary, it seems that we really need a mechanism at least for weak linkage in stable Rust.

I got bit by the fallout from #29603 (comment), where rust-libloading gained a cc complation step to workaround this missing feature.

@mjbshaw

This comment has been minimized.

Contributor

mjbshaw commented Apr 25, 2018

I would love some mechanism to merge statics and variables with the same value (and name, possibly).

Consider the following code:

// In my actual code, this is a more complicated proc macro.
macro_rules! special_number {
  ($value: expr) => {
    {
      // In my actual code, these static variables also have
      // the special `export_name` of
      // "\x01L_special_number_<unique_id>", where
      // `<unique_id>` is a unique identifier to avoid
      // symbol conflicts.
      #[link_section = ".data,__custom_special_section"]
      static SPECIAL_NUMBER: usize = $value;
      &SPECIAL_NUMBER
    }
  };
}

extern {
  fn consume_special_number(value: &usize);
}

pub fn main() {
  unsafe {
    consume_special_number(special_number!(42));
    consume_special_number(special_number!(42));
    consume_special_number(special_number!(42));
  };
}

This will generate three different static variables. I would love it if I could get Rust to merge these static variables into one single static variable. Using a const doesn't work (it does merge the values, but you can't provide link attributes on a const).

There are two ways the merging could be performed:

  1. By value. Statics with the same value (that opt-in to merging) would be merged into a single static variable.
  2. By name. Statics with the same export_name (that opt-in to merging) would be merged into a single static variable (and wouldn't result in duplicate symbol errors).

I would be prefer option 2 (merging by name).

Perhaps this is what the linkonce_odr linkage type is for, but using the same export_name causes

The linkonce_odr and weak_odr linkage types are similar to this, I think, but don't work (in Rust) for merging globals/statics within a single translation unit. Rust could either extend them or introduce a new linkage type that does ODR merging within translation units.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment