Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --symbol-prefix flag #1375

Open
joshlf opened this issue Aug 27, 2018 · 7 comments

Comments

@joshlf
Copy link
Contributor

@joshlf joshlf commented Aug 27, 2018

I'm trying to integrate a C library which supports symbol renaming. I want to use this feature in order to allow consumers to depend on multiple versions of my crate without encountering symbol conflicts during build.

In order to do this, I'd like the C symbol FOO to be exposed through Rust as FOO - the name people expect. However, I will invoke the C library's build system in a way that causes the emitted object file to expose the name CUSTOM_PREFIX_FOO. Thus, I need to also emit a #[link(name = "CUSTOM_PREFIX_FOO")] attribute on the Rust-side FOO declaration.

It would be great if bindgen supported this natively, e.g. via a --symbol-prefix flag. This feature would instruct bindgen to emit link attributes as described above. Concretely:

  • Original C name - FOO
  • C name after prefixing (this is what's exported from the object files) - CUSTOM_PREFIX_FOO
  • Rust name (what consumers of the crate see) - FOO
  • Attribute on Rust-side declaration - #[link(name = "CUSTOM_PREFIX_FOO")]
@emilio

This comment has been minimized.

Copy link
Collaborator

@emilio emilio commented Aug 28, 2018

We should use the C link_name if it differs from the function name. Do you have an example C code?

That being said, I can see a function in ParseCallbacks to tweak the mangled name of a function declaration, that'd be relatively trivial. Would you be interested in implementing it?

@joshlf

This comment has been minimized.

Copy link
Contributor Author

@joshlf joshlf commented Aug 28, 2018

We should use the C link_name if it differs from the function name. Do you have an example C code?

Unfortunately that probably won't work. The way the renaming happens on the C side is with a macro variable that is set if you want a symbol prefix. So if we set that macro variable for the invocation of bindgen, then bindgen will see symbols like CUSTOM_PREFIX_FOO, and will thus emit Rust names with the prefix (which we don't want). If, on the other hand, we invoke bindgen without the macro variable, so it sees symbols like FOO, then it will emit the Rust names without the prefix (which we want) but Rust will attempt to link against the same names, which won't work because when we actually compile the C library, we'll do it with the prefix enabled.

That being said, I can see a function in ParseCallbacks to tweak the mangled name of a function declaration, that'd be relatively trivial. Would you be interested in implementing it?

Sure thing. Could I get some basic mentoring instructions? I'm not familiar with the codebase. Also, would this enable us to provide a command-line flag? Right now, I'm invoking bindgen as a CLI tool.

@emilio

This comment has been minimized.

Copy link
Collaborator

@emilio emilio commented Aug 29, 2018

Unfortunately that probably won't work. The way the renaming happens on the C side is with a macro variable that is set if you want a symbol prefix. So if we set that macro variable for the invocation of bindgen, then bindgen will see symbols like CUSTOM_PREFIX_FOO, and will thus emit Rust names with the prefix (which we don't want). If, on the other hand, we invoke bindgen without the macro variable, so it sees symbols like FOO, then it will emit the Rust names without the prefix (which we want) but Rust will attempt to link against the same names, which won't work because when we actually compile the C library, we'll do it with the prefix enabled.

I see, that makes sense, thanks. You could re-export the rust items with different names, something like:

mod bindings {
    // Bindgen-generated code.
}

pub use bindings::{prefix_my_function as my_function, prefix_my_other_function as my_other_function, ...};

But agree it's not ideal.

Sure thing. Could I get some basic mentoring instructions? I'm not familiar with the codebase. Also, would this enable us to provide a command-line flag? Right now, I'm invoking bindgen as a CLI tool.

Sure, happy to!

The place we store the mangled name of a function is:

https://github.com/rust-lang-nursery/rust-bindgen/blob/0db9588e84463c211340f0bc2cb6da2b3a99c441/src/ir/function.rs#L80

That gets computed in:

https://github.com/rust-lang-nursery/rust-bindgen/blob/0db9588e84463c211340f0bc2cb6da2b3a99c441/src/ir/function.rs#L243

That function should already have all the arguments you need to get a hand to the options. The options you can get from ctx.options().

For the parse callbacks, you can see an example here for example:

https://github.com/rust-lang-nursery/rust-bindgen/blob/0db9588e84463c211340f0bc2cb6da2b3a99c441/src/ir/var.rs#L141

I think it would make sense an MVP API like:

pub fn link_name_override(&self, function_name: &str) -> Option<String>;

Where you can get the name passed by the caller, or using cursor.spelling() directly (we tweak a bit the spelling for destructors, not sure what's best there, but I think this API wouldn't be very useful for C++ so I think doing the simple thing for now makes sense).

To make it work for the CLI you need to add some kind of option. I guess only an option to prefix it for now given it suits your needs should be fine. You'd look at src/options.rs, fill around all the boilerplate and docs and add a Builder method as well. That should be a bit boring but should be fine.

Once you have that you could do something like the following in cursor_name (I made up the priority of the conditions, mixing the options and the callbacks would make little sense):

if let Some(callbacks) = ctx.parse_callbacks() {
    if let Some(override) = callbacks.link_name_override(&cursor.spelling()) {
         return Some(override);
    }
}

if let Some(prefix) = ctx.options().link_name_prefix {
    let mut mangled = cursor.spelling();
    mangled.insert_str(0, prefix);
    return Some(mangled);
}

// ...

That should do I'd think. Then you'd need to add a test for this, via the command line flags (see the tests in tests/headers for examples), an integration test (see the bindgen-integration directory), or both :)

Hope it helps, thanks!

@joshlf

This comment has been minimized.

Copy link
Contributor Author

@joshlf joshlf commented Aug 30, 2018

Awesome, those are fantastic instructions; thanks!

A few questions:

  • Is there a way to detect whether a symbol is a C++ symbol or not? It looks like there are different methods that return manglings (Cursor::cxx_manglings and Cursor::mangling), but neither seem like they can be used to reliably determine whether something is a C++ symbol.
  • How would you feel about this feature being C/asm-only for the time being? I personally have no use for C++ mangling at the moment, and it would probably require very different code (probably something like "give me an inline namespace and I'll assume that all C++ symbols were put in that namespace and then compute the appropriate mangled symbol name").

I think it would make sense an MVP API like:

pub fn link_name_override(&self, function_name: &str) -> Option<String>;

Where you can get the name passed by the caller, or using cursor.spelling() directly (we tweak a bit the spelling for destructors, not sure what's best there, but I think this API wouldn't be very useful for C++ so I think doing the simple thing for now makes sense).

I'm not sure I understand who would be calling that method, and what they'd be passing for function_name. And what type is this a method on?

@emilio

This comment has been minimized.

Copy link
Collaborator

@emilio emilio commented Aug 31, 2018

Is there a way to detect whether a symbol is a C++ symbol or not? It looks like there are different methods that return manglings (Cursor::cxx_manglings and Cursor::mangling), but neither seem like they can be used to reliably determine whether something is a C++ symbol.

Not really. Maybe using the parent ID of the function or something you can tell something is definitely C++, but... In any case given this is opt-in I don't think it's a big issue.

How would you feel about this feature being C/asm-only for the time being? I personally have no use for C++ mangling at the moment, and it would probably require very different code (probably something like "give me an inline namespace and I'll assume that all C++ symbols were put in that namespace and then compute the appropriate mangled symbol name").

I think doing the simple thing even if it isn't amazing for C++ so far makes sense.

I'm not sure I understand who would be calling that method, and what they'd be passing for function_name. And what type is this a method on?

It'd be a method inside the this trait: https://github.com/rust-lang-nursery/rust-bindgen/blob/0db9588e84463c211340f0bc2cb6da2b3a99c441/src/callbacks.rs#L26

Which is a customization hook for people that use bindgen at build.rs time. The example code in my comment above uses and calls that.

@wjakobczyk

This comment has been minimized.

Copy link

@wjakobczyk wjakobczyk commented Oct 22, 2019

Are there any chances of getting this solved in mainline?

@joshlf

This comment has been minimized.

Copy link
Contributor Author

@joshlf joshlf commented Oct 22, 2019

As you may have guessed by the fact that I haven't touched this in over a year, I didn't end up having time to work on it. If you'd like somebody to, I'd suggest posting this issue in This Week in Rust's Call for Participation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.