Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability from the top-level of the compilation not to mark #[no_mangle] items exported from shared library #73958

Open
hsivonen opened this issue Jul 2, 2020 · 33 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@hsivonen
Copy link
Member

hsivonen commented Jul 2, 2020

Firefox currently builds all its Rust code into a static artifact called gkrust which is then statically linked, with cross-language LTO, with C++ artifacts to form the shippable libxul.so/dylib/dll.

Upon examining the exported symbols of libxul, the FFI functions that are meant for internal glue between Rust and C++ show up.

This is a problem for two reasons:

  1. It's a binary size issue, because cross-language LTO is supposed to inline the FFI functions into their callers. However, having them exported means also keeping those copies around. Also, unused FFI functions can't be eliminated as dead code.
  2. It gives problematic third-party software more opportunities to hook into libxul in unsupported ways.

Since there's a variety of third-party crates that go into libxul, it wouldn't be practical to reannotate each FFI function. Therefore, please add a way from the top level of the Rust compilation to turn off the shared library export metadata generation for #[no_mangle] items.

@jonas-schievink jonas-schievink added A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 2, 2020
@RReverser
Copy link
Contributor

Ran into this today again, because I defined my own malloc-like allocator function and added extern "C" for the calling convention (it's meant to be passed as a callback over FFI), but since I called it malloc, it ended up being a linker error due to a clash with an actual "malloc".

For now, I can try to prefix it with something reasonably unique, but I'd really rather have private functions stay private and not have to worry about mangling their names manually.

@bjorn3
Copy link
Member

bjorn3 commented Oct 9, 2020

#[no_mangle] is for when your functions are not private.

@RReverser
Copy link
Contributor

@bjorn3 Right, the situation is a bit different from this issue / my original comment it refers to, but the point is still the same in that #[no_mangle] shouldn't affect privacy but only name generation.

@bjorn3
Copy link
Member

bjorn3 commented Oct 9, 2020

Mangling is the way to achieve privacy. While ELF and other object formats technically have different visibility modes, those are less useful for Rust than for C. Rust splits up a single crate into multiple codegen units, so any function used in a different codegen unit from the one defining it has to have an external linkage, making it available to all object files used in the linker invocation, thus your malloc would affect all crates not just your own. In addition there is the concept of visbility in ELF, which allows you to hide symbols from the linked library, but this is still not invisible to other crates. If you want to override the memory allocator for other crates, then you should use #[global_allocator]. If you only override malloc for Rust code, that will cause inconsistencies where an explicit malloc is used in C/Rust and then free on the other end of ffi.

@RReverser
Copy link
Contributor

which allows you to hide symbols from the linked library, but this is still not invisible to other crates

This is all I want (as well as what @hsivonen describes). I want to be able to expose functions to the C library that is statically linked in as part of build.rs, but I don't want them to end up being publicly visible out of my final artifact.

@bjorn3
Copy link
Member

bjorn3 commented Oct 9, 2020

If you only override malloc for Rust code, that will cause inconsistencies where an explicit malloc is used in C/Rust and then free on the other end of ffi.

This problem extends to overriding it for both Rust code and statically linked C code without also overriding it for dynamically linked C code like libc.

@luser
Copy link
Contributor

luser commented Nov 4, 2020

To speak to what we did in C/C++ for Firefox, we used the -fvisibility=hidden compiler flag to default all symbols to hidden visibility, and then explicitly annotated public symbols with __attribute__ ((visibility ("default"))). We also implemented a system to generate wrappers for system headers whose declarations do not contain visibility annotations. The wrapper headers boil down to basically:

#pragma GCC visibility push(hidden)
#include_next <real_header>
#pragma GCC visibility pop

To @hsivonen's point—the Rust code in Firefox needs to provide a C API to the C/C++ code for FFI purposes, and #[no_mangle] is a part of that. Separating the concerns of name mangling and symbol visibility sounds useful so that in situations like this where the Rust compiler is only producing an intermediate static library that gets linked into the final binary, developers can fine-tune the behavior they need. Additionally, providing solutions that allow overriding behavior is useful because otherwise it's one more thing you need to audit in your dependency graph, and crate authors can't all be expected to understand the nuance here.

@bjorn3
Copy link
Member

bjorn3 commented Nov 4, 2020

I think this is basically what #[rustc_std_internal_symbol] does.

@Manishearth
Copy link
Member

It seems like a useful path forward would be to split #[no_mangle] into #[no_mangle(private)] and #[export_symbol] (alternatively, make #[used] apply to functions), and have #[no_mangle] imply both.

Something similar can be done with #[export_name], it feels strange that symbol renaming is only available via methods that export the symbol.

But also I think there should be a codegen flag for this.

@bjorn3
Copy link
Member

bjorn3 commented Nov 8, 2022

Isn't #[no_mangle(private)] basically #[rustc_std_internal_symbol]? It doesn't export the symbol from cdylib's while still disabling symbol mangling.

By the way why are you using #[no_mangle] if you don't want the symbol to be exported from the cdylib? The standard library needs this to break some cyclic dependencies, but I don't see any reason why regular user code needs it.

@hsivonen
Copy link
Member Author

hsivonen commented Nov 8, 2022

By the way why are you using #[no_mangle] if you don't want the symbol to be exported from the cdylib?

  1. In the Firefox case, the Rust code is compiled into a static lib that gets linked into a larger dylib that also contains C++ code. The Rust code needs to be visible to the C++ code during the static linking step but shouldn't remain visible from the combined dylib.
  2. #[no_mangle] is the idiomatic way how crates on crates.io provide FFI, so the Rust top-level compilation (the omnibus Rust static lib that gets linked with C++) doesn't get to control how the crate ecosystem annotates the things that are currently annotated #[no_mangle], which is why there's a need to be able to control the behavior of #[no_mangle] as it appears across dependencies from the top-level config.

@bjorn3
Copy link
Member

bjorn3 commented Nov 8, 2022

In the Firefox case, the Rust code is compiled into a static lib that gets linked into a larger dylib that also contains C++ code. The Rust code needs to be visible to the C++ code during the static linking step but shouldn't remain visible from the combined dylib.

For this rustc can't do anything as it doesn't control the linker invocation. If it links a cdylib itself it passes a list of all symbols to export to the linker. You did have to emulate this in firefox.

#[no_mangle] is the idiomatic way how crates on crates.io provide FFI, so the Rust top-level compilation (the omnibus Rust static lib that gets linked with C++) doesn't get to control how the crate ecosystem annotates the things that are currently annotated #[no_mangle], which is why there's a need to be able to control the behavior of #[no_mangle] as it appears across dependencies from the top-level config.

I did recommend putting the part of the library that exports the C api that wraps the rust api in a different crate from the one exporting the rust api. If you don't want the C api to be exported you can then simply depend only on the crate with the rust api.

@hsivonen
Copy link
Member Author

hsivonen commented Nov 8, 2022

#[no_mangle] is the idiomatic way how crates on crates.io provide FFI, so the Rust top-level compilation (the omnibus Rust static lib that gets linked with C++) doesn't get to control how the crate ecosystem annotates the things that are currently annotated #[no_mangle], which is why there's a need to be able to control the behavior of #[no_mangle] as it appears across dependencies from the top-level config.

I did recommend putting the part of the library that exports the C api that wraps the rust api in a different crate from the one exporting the rust api. If you don't want the C api to be exported you can then simply depend only on the crate with the rust api.

This doesn't solve the problem that this issue is about: hiding symbols in the resulting shared library formed by linking a static lib generated by Rust compilation with some other object code.

E.g. encoding_rs and encoding_c are a Rust crate and its C API crate. Since Firefox needs the C API, it builds encoding_c as part of its omnibus Rust artifact. Once that's linked with C++ object code to form libxul, having the C API be visible from outside libxul is unwanted.

@bjorn3
Copy link
Member

bjorn3 commented Nov 8, 2022

Once that's linked with C++ object code to form libxul, having the C API be visible from outside libxul is unwanted.

The only way to hide those symbols when linking a rust staticlib into a dylib is to pass the linker a list of symbols to hide. This is not something rustc has any control over. If rustc were to do the linking it does this for you, but in the case of libxul rustc doesn't do the linking and as such the build system that invokes the linker is responsible for making sure the symbols are hidden.

@luser
Copy link
Contributor

luser commented Nov 9, 2022

My previous comment here explained the Firefox solution for C/C++, where we encountered similar issues with third-party code.

@bjorn3
Copy link
Member

bjorn3 commented Nov 9, 2022

Right, hidden visibility doesn't work for rust as we don't yet know in advance if a crate ends up in a rust dylib or cdylib. In the former case hidden visibility is incorrect as a downstream crate may use a function marked with hidden visibility (for example because another function got inlined), while for a cdylib this is not the case. Having a global compiler flag that disables linking as rust dylib only solves part of the issue as the standard library will still not get hidden visibility and libstd is compiled as rust dylib too, so trying to use it with -Zbuild-std will error out.

@sffc
Copy link

sffc commented Nov 9, 2022

The other issue, as raised in #104130, is that we don't want to decide on which symbols to actually link in the cdylib until link time. The alternative is to remove unused APIs via cfgs or similar, but this doesn't scale when there are 500 exported symbols (like in ICU4X). A better workflow is to pass the list of desired symbols to export directly to the linker, but this only works if the symbols have non-default visibility.

@bjorn3
Copy link
Member

bjorn3 commented Nov 9, 2022

A better workflow is to pass the list of desired symbols to export directly to the linker, but this only works if the symbols have non-default visibility.

Rustc passes the list of symbol to link to the linker and this works just fine despite using default visibility. Rustc basically says make everything hidden except this specific list of symbols.

// Write an LD version script
let res: io::Result<()> = try {
let mut f = BufWriter::new(File::create(&path)?);
writeln!(f, "{{")?;
if !symbols.is_empty() {
writeln!(f, " global:")?;
for sym in symbols {
debug!(" {};", sym);
writeln!(f, " {};", sym)?;
}
}
writeln!(f, "\n local:\n *;\n}};")?;
};
if let Err(error) = res {
self.sess.emit_fatal(errors::VersionScriptWriteFailure { error });
}

@sffc
Copy link

sffc commented Nov 9, 2022

I can't speak for ld, but for wasm-ld I've found as in #104130 that even if I filter the list of exported symbols passed to the linker, all explicitly no_mangle symbols are still found and re-exported by the linker. The mechanics for this are explained in the wasm-ld docs page.

@Manishearth
Copy link
Member

@bjorn3 This does not work fine, see #104130, where it's necessary to be able to configure that.

I feel like this issue has had sufficient people with different real-world use cases justifying the need for this, can we move past questioning the need? A lot of these use cases come from complex projects with complex properties, and one thing I've noticed from my years of working on large codebases is that almost always when a large codebase wants feature X, people will suggest a laundry list of things to try instead, and they either won't work or suffer from "I bet that almost works" syndrome

I think the stdlib point is a good one and I suspect #104130 paired with a solution here is a good way to fix that.

@bjorn3
Copy link
Member

bjorn3 commented Nov 9, 2022

I can't speak for ld, but for wasm-ld I've found as in #104130 that even if I filter the list of exported symbols passed to the linker, all explicitly no_mangle symbols are still found and re-exported by the linker. The mechanics for this are explained in the wasm-ld docs page.

That is a wasm specific bug where we fail to hide symbols that are not in the rustc generated list of symbols to be exported. On Windows, macOS and any ELF based system the rustc can pass a specific list and the linker respects this without linking anything else AFAIK.

@bjorn3 This does not work fine, see #104130, where it's necessary to be able to configure that.

That is a different use case from the firefox one. In #104130 rustc is invoking the linker and proving an option when compiling the cdylib to only export specified symbols is realistic, albeit requiring a wasm-ld change to allow rustc to hide all symbols except those passed in through --export given that even wasm-strip+wasm-opt can't reclaim the extra size, I am now convinced that having this option for cdylib crates is fine.

The firefox use case is not something rustc can fix as it doesn't control the linker.

I feel like this issue has had sufficient people with different real-world use cases justifying the need for this, can we move past questioning the need?

I don't question the need, but I think that rustc can't provide the solution for the staticlib crate type, only for the cdylib crate type. For the staticlib crate type I think the only one that can provide a full solution is the thing that invokes the linker by specifying an explicit list of symbols to export. Sure rustc could provide a partial solution using hidden visibility for user crates, but if I understand the post you linked correctly, that is an "I Bet That Almost Works" solution.

@Manishearth
Copy link
Member

That is a different use case from the firefox one

Yes, I understand, I'm highlighting that there are different use cases for the same feature from varied backgrounds.

@Manishearth
Copy link
Member

, I am now convinced that having this option for cdylib crates is fine.

Thanks! Do you have ideas as to what is the best path forward here? @sffc can correct me if I'm wrong on this, but we can either have:

It seems like you're leaning towards option 3?

@bjorn3
Copy link
Member

bjorn3 commented Nov 9, 2022

I think wasm-ld should get a flag that makes it default to hiding all symbols and rustc should use this unconditionally even if none of the suggested flags about controlling #[no_mangle] exports are used. This makes rustc in control of the list of symbols to be exported by cdylib's just like it is on every other platform.

In addition we should get a way to allow the user to control the list of symbols exported from a cdylib that works on any platform. I'm not sure if it should be in the form of an allowlist, in the form of a denylist or if both should be supported. I think implementing it in the form of an allowlist would be best as it avoids accidentally exporting symbols you didn't mean to even as you update dependencies, but you probably know better what works than me.

@Manishearth
Copy link
Member

I guess the wasm-ld stuff occurs upstream, right?

@bjorn3
Copy link
Member

bjorn3 commented Nov 9, 2022

Indeed. It is part of LLVM.

@sffc
Copy link

sffc commented Nov 10, 2022

I found some new info:

rustc is passing --export-dynamic to wasm-ld:

--export-dynamic: When building an executable, export any non-hidden symbols. By default only the entry point and any symbols marked as exports (either via the command line or via the export-name source attribute) are exported.

If I manually remove the --export-dynamic from the wasm-ld invocation, then I get my expected behavior where I only get the symbols listed in --export arguments.

Related: ed56145

@Manishearth
Copy link
Member

Hmm, ideally that flag would be manually set via link-args but given that it's automatically set perhaps we need another flag to turn it off?

@sffc
Copy link

sffc commented Nov 10, 2022

So there is --no-export-dynamic, but it seems that the presence of --export-dynamic earlier in the call chain takes precedence. (EDIT: actually this works fine; --no-export-dynamic can turn off a previous --export-dynamic.) However, it doesn't really address the issue that in Rust we really should have a way to stop all the --export arguments from being added automatically.

If rustc changed such that I could turn off both the --export and the --export-dynamic flags, then I think this issue could be resolved

@RReverser
Copy link
Contributor

then I think this issue could be resolved

You're probably referring to #104130 not to this issue, as it wouldn't be resolved by Wasm fix alone.

@sffc
Copy link

sffc commented Nov 10, 2022

then I think this issue could be resolved

You're probably referring to #104130 not to this issue, as it wouldn't be resolved by Wasm fix alone.

Indeed.

@danakj
Copy link
Contributor

danakj commented Jul 5, 2023

It would seem that any FFI bindings tool is going to end up putting all its FFI functions into the exported symbol list due to this conflated use of #[no_mangle] causing export.

We're using CXX in Chromium, and it does the following:

  • CXX makes functions in Rust that will be called from a generated C++ file, they are #[no_mangle] so that C++ can find them, and use a custom mangling to not collide with other things.
  • The generated C++ files expose friendly C++ Apis and thunks that call through to the #[no_mangle] Rust functions.

We don't want to export all FFI functions from our chrome.dll, or our Mac framework. We can technically use an export list to keep them from being exported on Mac, but Windows linker does not provide the same affordance, so things are much more complicated there.

So we really need FFI functions to not be part of the chrome.dll export list. Right now it looks like this, but with many many more.

        145   90 0AA21FD0 cxxbridge1$std$weak_ptr$f64$clone = cxxbridge1$std$weak_ptr$f64$clone
        146   91 0AA22010 cxxbridge1$std$weak_ptr$f64$downgrade = cxxbridge1$std$weak_ptr$f64$downgrade 
        147   92 0AA220C0 cxxbridge1$std$weak_ptr$f64$drop = cxxbridge1$std$weak_ptr$f64$drop
        148   93 0AA21FB0 cxxbridge1$std$weak_ptr$f64$null = cxxbridge1$std$weak_ptr$f64$null
        149   94 0AA22050 cxxbridge1$std$weak_ptr$f64$upgrade = cxxbridge1$std$weak_ptr$f64$upgrade     
        150   95 0AA21390 cxxbridge1$std$weak_ptr$i16$clone = cxxbridge1$std$weak_ptr$i16$clone
        151   96 0AA213D0 cxxbridge1$std$weak_ptr$i16$downgrade = cxxbridge1$std$weak_ptr$i16$downgrade 
        152   97 0AA21480 cxxbridge1$std$weak_ptr$i16$drop = cxxbridge1$std$weak_ptr$i16$drop
        153   98 0AA21370 cxxbridge1$std$weak_ptr$i16$null = cxxbridge1$std$weak_ptr$i16$null
        154   99 0AA21410 cxxbridge1$std$weak_ptr$i16$upgrade = cxxbridge1$std$weak_ptr$i16$upgrade     
        155   9A 0AA216A0 cxxbridge1$std$weak_ptr$i32$clone = cxxbridge1$std$weak_ptr$i32$clone
        156   9B 0AA216E0 cxxbridge1$std$weak_ptr$i32$downgrade = cxxbridge1$std$weak_ptr$i32$downgrade 
        157   9C 0AA21790 cxxbridge1$std$weak_ptr$i32$drop = cxxbridge1$std$weak_ptr$i32$drop
        158   9D 0AA21680 cxxbridge1$std$weak_ptr$i32$null = cxxbridge1$std$weak_ptr$i32$null
        159   9E 0AA21720 cxxbridge1$std$weak_ptr$i32$upgrade = cxxbridge1$std$weak_ptr$i32$upgrade     
        160   9F 0AA219B0 cxxbridge1$std$weak_ptr$i64$clone = cxxbridge1$std$weak_ptr$i64$clone
        161   A0 0AA219F0 cxxbridge1$std$weak_ptr$i64$downgrade = cxxbridge1$std$weak_ptr$i64$downgrade 
        162   A1 0AA21AA0 cxxbridge1$std$weak_ptr$i64$drop = cxxbridge1$std$weak_ptr$i64$drop
        163   A2 0AA21990 cxxbridge1$std$weak_ptr$i64$null = cxxbridge1$std$weak_ptr$i64$null
        164   A3 0AA21A30 cxxbridge1$std$weak_ptr$i64$upgrade = cxxbridge1$std$weak_ptr$i64$upgrade
        165   A4 0AA21080 cxxbridge1$std$weak_ptr$i8$clone = cxxbridge1$std$weak_ptr$i8$clone
        166   A5 0AA210C0 cxxbridge1$std$weak_ptr$i8$downgrade = cxxbridge1$std$weak_ptr$i8$downgrade   
        167   A6 0AA21170 cxxbridge1$std$weak_ptr$i8$drop = cxxbridge1$std$weak_ptr$i8$drop
        168   A7 0AA21060 cxxbridge1$std$weak_ptr$i8$null = cxxbridge1$std$weak_ptr$i8$null
        169   A8 0AA21100 cxxbridge1$std$weak_ptr$i8$upgrade = cxxbridge1$std$weak_ptr$i8$upgrade       
        170   A9 0AA22640 cxxbridge1$std$weak_ptr$isize$clone = cxxbridge1$std$weak_ptr$isize$clone     
        171   AA 0AA22680 cxxbridge1$std$weak_ptr$isize$downgrade = cxxbridge1$std$weak_ptr$isize$downgrade
        172   AB 0AA226D0 cxxbridge1$std$weak_ptr$isize$drop = cxxbridge1$std$weak_ptr$isize$drop       
        173   AC 0AA22620 cxxbridge1$std$weak_ptr$isize$null = cxxbridge1$std$weak_ptr$isize$null

For clarity, there is a separate issue on Linux/Mac of all Rust symbols ending up in the DLL export list, but these are not #[no_mangle] so I think that should be tracked elsewhere (#37530 ?).

Chromium bug for this, which blocks Windows: https://bugs.chromium.org/p/chromium/issues/detail?id=1462356

@amyspark
Copy link

Hi all, I found that this is a big issue when attempting to build the pixbufloader_svg component of librsvg: https://gitlab.gnome.org/GNOME/librsvg/-/merge_requests/925

For the time being, I'm relying on manually link_whole ing the library into a shared version with a list of exports. (Alternatively, for macOS, single-object prelinking can be done to the static library, or stripping for Linux.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

9 participants