Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate C code to export static inline functions #1090

Closed
jrmuizel opened this issue Oct 21, 2017 · 22 comments · Fixed by #2335
Closed

Generate C code to export static inline functions #1090

jrmuizel opened this issue Oct 21, 2017 · 22 comments · Fixed by #2335

Comments

@jrmuizel
Copy link
Contributor

https://github.com/intelxed/xed has a bunch of inline functions as part of its interface. In order to make the xed bindings usable (valarauca/xed-sys#1) we need to export them somehow. If bindgen could generate C code that just wraps the static inline functions we would be able to use bindgen to generate bindings to that code which would solve the problem.

@emilio
Copy link
Contributor

emilio commented Oct 21, 2017

For now, you can also compile it with -fno-inline-functions, I think.

@jrmuizel
Copy link
Contributor Author

Won't that prevent inlining inside xed?

@jrmuizel
Copy link
Contributor Author

Further, -fno-inline-functions and .generate_inline_functions(true) doesn't actually seem to generate bindings for those functions. Perhaps because they are static inline?

@emilio
Copy link
Contributor

emilio commented Oct 21, 2017

Won't that prevent inlining inside xed?

Yup, I think so, which is unfortunate, I agree.

Further, -fno-inline-functions and .generate_inline_functions(true) doesn't actually seem to generate bindings for those functions. Perhaps because they are static inline?

Yeah, if they aren't visible in other translation units we can't generate symbols for them... I guess wrapping them is the best solution then

@jrmuizel
Copy link
Contributor Author

What's the best approach to implement this?

@jrmuizel
Copy link
Contributor Author

It looks like currently static inline functions end up in the ir with a name of None.

@fitzgen
Copy link
Member

fitzgen commented Oct 23, 2017

-fkeep-inline-functions should allow the functions to remain inlined at call sites, but still give us something to link to, IIUC.


This feature is something I've wanted for a while, but haven't had the need to actually implement. The biggest open questions I have are:

  • How are we going to integrate this feature with our test suite? Emit tests/expectations/tests/whatever.c next to tests/expectations/tests/whatever.rs?

  • What will the user ergonomics be like? Can we compile the emitted C for the user automatically from build.rs? As much as possible, I'd like to avoid users having to manually add new dependencies on things like gcc or cmake crates and add even more stuff to their build.rs.


What's the best approach to implement this?

A new pass over the final IR, probably called from inside ir::context::BindgenContext::gen right before or after our existing codegen. This should probably live in a new module next to codegen and ir named cgen or inlgen something like that but better :-p

It would be awesome if there was a c_quote! { ... } macro... but that's a pretty tall order if one doesn't exist already.

It looks like currently static inline functions end up in the ir with a name of None.

Unfortunately, I don't have any insight into why this might be off the top of my head.

@jrmuizel
Copy link
Contributor Author

It looks like currently static inline functions end up in the ir with a name of None.

Unfortunately, I don't have any insight into why this might be off the top of my head.

This is fixed with #1091

I'm probably not going to look at this again anytime soon as I've switched from using https://github.com/valarauca/xed-sys/ to https://github.com/zyantific/zydis-rs which doesn't have this problem.

@kmod-midori
Copy link

static inline functions are still not generated with the option enabled when I tested on mbedTLS.

@noahgibbs
Copy link

We have this trouble on YJIT quite extensively - commenting and subscribing here :-)

@chrysn
Copy link
Contributor

chrysn commented May 18, 2022 via email

@noahgibbs
Copy link

Just using C2Rust might not work well for us - one thing we need is more support for complex preprocessor defines than bindgen gives (e.g. operators on constants in a define.) In general, we use a lot of C preprocessor constants. Could work well for somebody else, though! Some method of integrating the two tools could be interesting. But it also sounds like it doesn't really exist yet. So we'd need to do a significant amount of tools work on our large existing codebase to make it happen. Not impossible, but not on our short-term roadmap, I don't think.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 11, 2022

I was thinking about tackling this issue. However, I'd like to understand a bit better what needs to be done first. Let's say we have a header file header.h like this:

static inline int foo();

In my head what would happen is that we would generate an intermediate c file:

#include <header.h>

int foo_wo_inline() {
    return foo();
}

and then this file would be processed by bindgen and produce something like this:

extern "C" {
    fn foo_no_inline() -> ::std::os::raw::c_int;
}

pub unsafe extern "C" fn foo() -> ::std::os::raw::c_int {
    foo_no_inline()
}

Despite of the name juggling, is this a good way to approach this issue?

@jschwe
Copy link
Contributor

jschwe commented Aug 15, 2022

@pvdrz That sounds basically like what we are currently doing manually in our project. Some comments:

  • It might be better to generate a C file and a header file. Bindgen just needs to process the new header file.
  • The generated C file should be put into a configurable location, since it needs to be integrated into the C build system - otherwise the new symbol will not be present.

On the Rust side I think using link_name like this might be an improvement:

extern "C" {
    #[link_name = "foo_no_inline"]
    fn foo() -> ::std::os::raw::c_int;
}

@pvdrz
Copy link
Contributor

pvdrz commented Aug 16, 2022

I started taking a deeper look at this today and I have a couple questions (@emilio, @fitzgen):

  • I was thinking on implementing this c_quote/cgen functionality at least to be able to generate the headers and extra C code we need. Would it be OK for you if this was done via a trait implementation for the elements of the IR we need? It would be something like:

    trait Emit {
        fn emit<W: io::Write>(&self, ctx: &BindgenContext, buf: W);
    }
  • If we generate this extra C code in BindgenContext::gen how do we get bindgen to process these new files and merge the output with the already existing output from the original header file? Should we create a new BindgenContext or even a new Builder, process the extra code and then merge the resulting items?

Update: I have a very incomplete but working code generator for C and is good enough to emit the wrapper functions we need. The only thing I need to move forward is knowing how to handle these "two stages" where we first find all the static inline functions and then run bindgen again over the generated files.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 26, 2022

I've done some progress this week.

Right now when a header file has inline functions like this one:

typedef int num;

static inline num foo(num arg) { return arg + 42; }
inline num bar(num arg) { return arg + 0x45; }
num baz(num arg);

the following is printed to stdout:

[bindgen-tutorial 0.1.0] cargo:rerun-if-changed=new_wrapper.h
[bindgen-tutorial 0.1.0] FOUND INLINED FUNCTION: foo
[bindgen-tutorial 0.1.0] FOUND INLINED FUNCTION: bar
[bindgen-tutorial 0.1.0] HEADERS
[bindgen-tutorial 0.1.0] headers: int foo__extern(int arg);
[bindgen-tutorial 0.1.0] headers: int bar__extern(int arg);
[bindgen-tutorial 0.1.0] WRAPPERS
[bindgen-tutorial 0.1.0] wrappers: int foo__extern(int arg) { return foo(arg); }
[bindgen-tutorial 0.1.0] wrappers: int bar__extern(int arg) { return bar(arg); }

@pvdrz
Copy link
Contributor

pvdrz commented Aug 30, 2022

@emilio I have a question before going any further on this issue.

It seems the most robust way of generating the necessary rust code for inlined functions is using bindgen to process the generated c code that wraps the inlined functions.

Regardless of how this c code was generated. We would need to compile the c code first and then "re-run" bindgen over it respecting all the BindgenOptions that were set by the user.

The only way to do this that I can think of is making BindgenOptions: Clone, clone the options and reuse them to "re-run" bindgen. But that's not possible right now because BindgenOptions hold non-clonable state that isn't actually provided by the user a options.

So the question is: Would it make sense to split BindgenOptions into the actual user input and the internal state, clone this user input and re-run bindgen to process the new header files?

@chrysn
Copy link
Contributor

chrysn commented Aug 31, 2022 via email

@pvdrz
Copy link
Contributor

pvdrz commented Aug 31, 2022

Regardless of how this c code was generated. We would need to compile the c code first and then "re-run" bindgen over it respecting all the BindgenOptions that were set by the user.
I'd be careful about compiling code ourselves (because the CFLAGs we see might not be fully accurate for compilation, eg. because C code is compiled by GCC, and the options were stripped down to support LLVM). At least in my use case (riot-sys), it would be prefereable to have the C file exported somewhere where the remaining build system could pick it up and build it together with the rest of the C code.

I think that one way to solve this would be to add some sort of callback so the user can specify "how" to compile this generated c code and then bindgen do the final step of generating the bindings including the generated header file. Does that sound like a good solution?

@jschwe
Copy link
Contributor

jschwe commented Aug 31, 2022

Regardless of how this c code was generated. We would need to compile the c code first and then "re-run" bindgen

Why does the C-code need to be compiled before bindgen can be run? Why can't bindgen just generate the new header files first, and then generate the bindings? I'm a bit confused why compilation suddenly is necessary.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 31, 2022

Regardless of how this c code was generated. We would need to compile the c code first and then "re-run" bindgen

Why does the C-code need to be compiled before bindgen can be run? Why can't bindgen just generate the new header files first, and then generate the bindings? I'm a bit confused why compilation suddenly is necessary.

Yeah you're right. I thought that compiling before would serve as a "sanity check" for the generated code but in principle bindgen should be able to generate bindings for the new header files without compiling the new source code beforehand.

@pvdrz
Copy link
Contributor

pvdrz commented Sep 2, 2022

Ok I have a very hacky PoC here. The pipeline is be the following:

  • Save the builder original state in case we need to rerun it later (this is done in a hacky way using the command line arguments 😬)
  • If the header files contain inlined function declarations and generate_inline_functions is enabled:
    • A new header and source files are created with wrappers for all the inlined functions (if the function is foo it is wrapped as foo_extern).
    • The builder original state is recreated but the generate_inline_functions option is disabled and the header inputs are updated to include generated header file.
    • Binding generation is run again. If a function contains the __extern suffix, the canonical name is modified to remove it but the link name is respected.

From here the user can compile the generated source files into a library and use the println!("cargo:...") options to link the library.


I think this is a good approach as it can be done incrementally without a huge PR:

  • Split BindgenOptions so we can keep the actual user input immutably and we can use it later to rerun binding generation.
  • Add options to specify the directory and name of the generated header and source files and the suffix we use for the wrappers.
  • Add a generator for the header and source files. Mine is pretty naive but it only relies on the bindgen IR and has no extra dependencies. I think this can be handled incrementally if it is OK to bail out when code cannot be generated. I suppose some code generation is better than no code generation.
  • Modify the generate methods so we can re-run bindgen with the new headers. Also use the suffix to fix the generated function's names.
  • Update the user manual with an example on how to compile and link against the generated library.

There are alternatives here like:

  • Try to use something like clang_reparseTranslationUnit and include the new headers here to avoid parsing everything from scratch but I'm not sure if that would work.
  • Use an existing code generator or try to use the clang AST directly instead of the bindgen's IR.

I'd love to have some input before committing to any path in particular :)

pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Oct 7, 2022
This adds a mechanism so `bindgen` is able to run `Bindings::generate`
multiple times with the same user input if the `generate_static_inline`
option is enabled and `GenerateError::ShouldRestart` is returned by
`Bindings::generate`.

This is done to eventually solve rust-lang#1090 which would require to check for
any static inline functions and generate a new header file to be used as
an extra input and run `Bindings::generate` again.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Oct 12, 2022
This adds a mechanism so `bindgen` is able to run `Bindings::generate`
multiple times with the same user input if the `generate_static_inline`
option is enabled and `GenerateError::ShouldRestart` is returned by
`Bindings::generate`.

This is done to eventually solve rust-lang#1090 which would require to check for
any static inline functions and generate a new header file to be used as
an extra input and run `Bindings::generate` again.
pvdrz added a commit that referenced this issue Nov 1, 2022
* Run `Bindings::generate` again if required.

This adds a mechanism so `bindgen` is able to run `Bindings::generate`
multiple times with the same user input if the `generate_static_inline`
option is enabled and `GenerateResult::ShouldRestart` is returned by
`Bindings::generate`.

This is done to eventually solve #1090 which would require to check for
any static inline functions and generate a new header file to be used as
an extra input and run `Bindings::generate` again.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Nov 4, 2022
If bindgen finds an inlined function and the
`--generate-inline-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Nov 4, 2022
If bindgen finds an inlined function and the
`--generate-inline-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Nov 8, 2022
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Nov 9, 2022
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Nov 22, 2022
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Nov 24, 2022
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Dec 9, 2022
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 16, 2023
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 16, 2023
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
amanjeev added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 19, 2023
integration test: document the GH issue

integration test (macro): same order in impl as the trait for consistency

integration test: move each setup into its own function, just to save face

inline static func integration test
amanjeev added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 20, 2023
amanjeev added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 20, 2023
- Because directory structure is confusing to the author of this commit

rust-langGH-1090
amanjeev added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 20, 2023
amanjeev added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 22, 2023
test(static inlined): pretty print the diff

Issue: rust-langGH-1090

test(static inlined): refactor paths once again

- Because directory structure is confusing to the author of this commit

rust-langGH-1090

test(static inlined): refactor test files

- Expected files should be under tests/expectations/generated
- Remove extern_stub.h because we can reuse the header from generate-extern-functions.h

test(static inlined): diff test; generated files

refactor(static inlined): integration test

Issue: rust-langGH-1090

rust-langGH-1090: Integration tests

integration test: document the GH issue

integration test (macro): same order in impl as the trait for consistency

integration test: move each setup into its own function, just to save face

inline static func integration test

resolve accidental conflict
pvdrz added a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 24, 2023
If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes rust-lang#1090.
pvdrz pushed a commit to ferrous-systems/rust-bindgen that referenced this issue Jan 24, 2023
test(static inlined): pretty print the diff

Issue: rust-langGH-1090

test(static inlined): refactor paths once again

- Because directory structure is confusing to the author of this commit

rust-langGH-1090

test(static inlined): refactor test files

- Expected files should be under tests/expectations/generated
- Remove extern_stub.h because we can reuse the header from generate-extern-functions.h

test(static inlined): diff test; generated files

refactor(static inlined): integration test

Issue: rust-langGH-1090

rust-langGH-1090: Integration tests

integration test: document the GH issue

integration test (macro): same order in impl as the trait for consistency

integration test: move each setup into its own function, just to save face

inline static func integration test

resolve accidental conflict
pvdrz added a commit that referenced this issue Feb 7, 2023
* Generate extern wrappers for inlined functions

If bindgen finds an inlined function and the
`--generate-extern-functions` options is enabled, then:

- It will generate two new source and header files with external
  functions that wrap the inlined functions.
- Rerun `Bindings::generate` using the new header file to include these
  wrappers in the generated bindings.

The following additional options were added:
- `--extern-function-suffix=<suffix>`: Adds <suffix> to the name of each
  external wrapper function (`__extern` is used by default).
- `--extern-functions-file-name=<name>`: Uses <name> as the file name
  for the header and source files (`extern` is used by default).
- `--extern-function-directory=<dir>`: Creates the source and header
  files inside <dir> (`/tmp/bindgen` is used by default).

The C code serialization is experimental and only supports a very
limited set of C functions.

Fixes #1090.

---------

Co-authored-by: Amanjeev Sethi <aj@amanjeev.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants