Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental compilation fails when a generic function uses a private symbol #53929

Open
mjbshaw opened this issue Sep 3, 2018 · 10 comments
Open
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mjbshaw
Copy link
Contributor

mjbshaw commented Sep 3, 2018

The way that incremental compilation works breaks code that uses private symbols. For example, the following code will fail to compile when using incremental compilation (e.g. the default cargo build), but will build successfully when incremental compilation is disabled (e.g. the default cargo build --release):

// I have reproduced this issue for the following targets:
// mips-unknown-linux-gnu
// x86_64-unknown-linux-gnu
// x86_64-apple-darwin
// aarch64-apple-ios
fn foo<T>() {
    // See m:<mangling> at https://llvm.org/docs/LangRef.html#data-layout
    // This is only reproducible when using ELF, Mips, or Mach-O mangling.
    #[cfg_attr(all(target_os = "linux", not(target_arch = "mips")),
               export_name = "\x01.L_this_is_a_private_symbol")]
    #[cfg_attr(any(target_os = "macos", target_os = "ios"),
               export_name = "\x01L_this_is_a_private_symbol")]
    #[cfg_attr(target_arch = "mips", export_name = "\x01$_this_is_a_private_symbol")]
    static X: usize = 0;

    // Use the static symbol in a way that prevents the optimizer from
    // optimizing it away.
    unsafe { std::ptr::read_volatile(&X as *const _); }
}

fn main() {
    foo::<usize>();
}

Output (for both cargo build and cargo build --release):

$ cargo build
   Compiling zzz v0.1.0 (file:///private/tmp/zzz)
LLVM ERROR: unsupported relocation of undefined symbol 'L_this_is_a_private_symbol'
error: Could not compile `zzz`.

To learn more, run the command again with --verbose.
$ cargo build --release
   Compiling zzz v0.1.0 (file:///private/tmp/zzz)
    Finished release [optimized] target(s) in 0.26s

I have reproduced this issue for the following targets:

  • mips-unknown-linux-gnu
  • x86_64-unknown-linux-gnu
  • x86_64-apple-darwin
  • aarch64-apple-ios

I'm interested in fixing this issue, and have been trying to better understand librustc_mir/monomorphize/partitioning.rs (which I assume is where the problem is). I'll update this issue if I've made any progress, but I would appreciate any guidance anyone might be able to offer.

These private symbols are useful for low-level FFI work (e.g. the linker on macOS/iOS will deduplicate selector names only if they have a private symbol name).

@mjbshaw
Copy link
Contributor Author

mjbshaw commented Sep 3, 2018

I forgot to mention what the cause of the error is:

The private symbol is partitioned into a different codegen unit than the generic function, which means the generic function is attempting to reference a different module's private symbol, which obviously fails.

I believe the correct fix would be to modify the partitioning code so that private symbols and functions that reference them are always partitioned together into a codegen unit.

@pnkfelix pnkfelix added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-incr-comp Area: Incremental compilation WG-incr-comp Working group: Incremental compilation labels Sep 4, 2018
@michaelwoerister
Copy link
Member

Yes, the partitioning code does not really handle setting explicit linkage very well yet.

Incremental compilation will create two compilation units for each Rust module, one for the generic code and one for the non-generic code. So I think we would need to treat private symbols the same as inline functions: instantiate them in every CGU they are accessed in. However, that could have surprising. Consider the following example:

mod foo {
    #[linkage="private"]
    static mut FOO: usize = 0;

    fn inc_non_generic() {
        unsafe { FOO += 1; }
    }

    fn inc_generic<T>(stuff: &Vec<T>) {
        unsafe { FOO += stuff.len(); }
    }
}

You'd think that these access the same FOO but the two functions would actually each access a private copy. That's very much not what one would expect. So anything that accesses something with private or internal linkage would have to be forced into the same CGU. It's doable but it would be a bit of work.

@mjbshaw
Copy link
Contributor Author

mjbshaw commented Sep 4, 2018

Yeah, as much as I would love to be able to inline a static (so it gets duplicated across codegen units), I too think that would be too surprising. I think Rust would need some kind of annotation that allowed developers to opt-in to that kind of behavior on a per-static basis.

For this particular issue at hand, I think a minimal solution would be acceptable. That is, a private static nested inside of a generic function should always go in the same codegen unit as the monomorphisations of that generic function. This minimal solution would suffice for my personal needs.

Extending that further, a slightly better solution (but I'm guessing more work) would be to make sure that private statics outside of the function (but within the same module) get put into the same codegen units as the functions that use them.

The "nuclear" solution is to make sure that all code (even separate modules) that uses private statics gets put into the same codegen unit. That sounds like the most work to me and I don't think it's currently worth it.

The implication of neither duplicating private statics nor merging code that uses them into a single codegen unit is that these functions cannot be inlined, or else linking will fail. The user must explicitly annotate them with #[inline(never)]. I think that's acceptable for the time being.

@michaelwoerister
Copy link
Member

One thing we could experiment with -- and which would make sure that things always work as long as they are in the same module -- is to make incremental compilation not split generic and non-generic code into different CGUs. That might make compilation slower (because the cache is hit less often during incr. comp.) but it might also make compilation faster (because less per-CGU work has to be duplicated, such as instantiating inline functions). That would be a simple change to make and we could test it on perf.rust-lang.org before merging.

@michaelwoerister
Copy link
Member

One thing we could experiment with -- and which would make sure that things always work as long as they are in the same module -- is to make incremental compilation not split generic and non-generic code into different CGUs.

I tried this out in #53963 but unfortunately the performance hit was too big.

@pnkfelix pnkfelix added the A-linkage Area: linking into static, shared libraries and binaries label Sep 20, 2018
@jonas-schievink jonas-schievink added the C-bug Category: This is a bug. label Aug 19, 2019
@wesleywiser wesleywiser self-assigned this Nov 17, 2020
@pnkfelix
Copy link
Member

discussed in today's meeting; wg-incr-comp thinks this might be fixed...

@wesleywiser
Copy link
Member

This does not seem to be fixed. cargo build now reports a slightly different error:

LLVM ERROR: Undefined temporary symbol .L_this_is_a_private_symbol

@wesleywiser wesleywiser removed their assignment Dec 1, 2020
@nvzqz
Copy link
Contributor

nvzqz commented Dec 16, 2020

I would like to use a \x01-prefixed static across crates with inlining. What's the best course of action right now to accomplish this?

In 1.48 I'm hitting LLVM ERROR: unsupported relocation of undefined symbol like the original post. However, I'm not using a generic function.

My use case is creating \x01L_OBJC_SELECTOR_REFERENCES_XX and \x01L_OBJC_METH_VAR_NAME_XX symbols that get picked up by the Objective-C runtime at program start.

nvzqz added a commit to nvzqz/fruity that referenced this issue Dec 17, 2020
This uses the same approach as native selectors in Objective-C. Swift
selectors are similar, and in this case would use `L_selector(class)`
for the selector reference and `L_selector_data(class)` for the "class"
string data.

This is a step towards #2.

However, this currently fails to compile when used externally. That is
because a private symbols (required by linker for selector sections)
cannot cross compilation units. This appears to be a bug in LLVM.
See rust-lang/rust#53929 for current status.
@madsmtm
Copy link
Contributor

madsmtm commented Jun 22, 2022

Yeah as @nvzqz noted, this problem is not specific to generics nor incremental compilation, it can also happen across crate boundaries if #[inline] is specified. The following code:

// crate foo
#[cfg_attr(
    all(target_os = "linux", not(target_arch = "mips")),
    export_name = "\x01.L_this_is_a_private_symbol"
)]
#[cfg_attr(
    any(target_os = "macos", target_os = "ios"),
    export_name = "\x01L_this_is_a_private_symbol"
)]
#[cfg_attr(target_arch = "mips", export_name = "\x01$_this_is_a_private_symbol")]
pub static X: usize = 42;

#[inline]
pub fn foo() -> usize {
    unsafe { core::ptr::read_volatile(&X) }
}
// crate bar
use foo::foo;

pub fn bar() {
    assert_eq!(foo(), 42);
}

Fails with unsupported relocation of undefined symbol (macOS/iOS) or Undefined temporary symbol (Linux).

@madsmtm
Copy link
Contributor

madsmtm commented Jun 22, 2022

I think the way to fix this is probably not in the partitioning code, it fundamentally can't do this across crates!

Instead, we would have to find a way to either:

  • Duplicate the statics as noted
  • Fix LLVM / our LLVM output, and make the code link as-is - while the symbol is somewhat special to linkers, I can't really see a reason why you shouldn't be able to access them from another codegen unit / object file? But perhaps someone else can give me some pointers here?

@saethlin saethlin removed A-incr-comp Area: Incremental compilation WG-incr-comp Working group: Incremental compilation labels Sep 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants