Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Functional C macros are not expanded. #753

Open
nicokoch opened this issue Jun 16, 2017 · 27 comments
Open

Functional C macros are not expanded. #753

nicokoch opened this issue Jun 16, 2017 · 27 comments
Labels
enhancement help wanted rust-for-linux Issues relevant to the Rust for Linux project

Comments

@nicokoch
Copy link

nicokoch commented Jun 16, 2017

So I'm generating bindings to <linux/uinput>. This works pretty well, but two c macros in particular do not produce any bindings in the output. Reduced input below:

Input C/C++ Header

#define UI_DEV_CREATE		_IO(UINPUT_IOCTL_BASE, 1)
#define UI_DEV_DESTROY		_IO(UINPUT_IOCTL_BASE, 2)

Bindgen Invocation

let bindings = bindgen::Builder::default()
        .no_unstable_rust()
        .header("wrapper/wrapper.h")
        .expect("Unable to generate bindings").

wrapper.h looks like this:

#include <linux/uinput.h>

Actual Results

No constant UI_DEV_CREATE or UI_DEV_DESTROY generated in output.

Expected Results

Constants UI_DEV_CREATE and UI_DEV_DESTROY generated in output.

I'm not really that familiar with the kernel macro _IO and not sure if this is technically even possible, but I figured to report just to get a more professional opinion. If it's not possible, workarounds welcome.

@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

How is _IO defined? Grepping define _IO in my /usr/include/linux dir didn't yield anything.

@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

As in... If we don't know how _IO is defined, how could we generate a sensible constant?

@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

That being said, even with that it doesn't work (huh, I'd swear we knew how to deal with macro expansions):

#define _IO(a_) (a_)

#define UI_DEV_CREATE _IO(1)
#define UI_DEV_DESTROY _IO(2)

@emilio emilio changed the title Nested c macro does not generate any output Functional C macros are not expanded. Jun 16, 2017
@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

This is probably a bug for rust-cexpr. Indeed, it's a dupe of jethrogb/rust-cexpr#3.

@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

I guess I can try to add to libclang a way to get whether the macro definition belongs to a functional macro...

@nicokoch
Copy link
Author

If it helps any:
_IO is defined in uapi/asm-generic/ioctl.h as:
#define _IO(type,nr) _IOC(_IOC_NONE,(type),(nr),0)
and _IOC in turn is defined as:

#define _IOC(dir,type,nr,size) \
      (((dir)  << _IOC_DIRSHIFT) | \
      ((type) << _IOC_TYPESHIFT) | \
      ((nr)   << _IOC_NRSHIFT) | \
      ((size) << _IOC_SIZESHIFT))

@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

Ok, so libclang provides a clang_Cursor_isMacroFunctionLike API. Perhaps we could use that.

@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

I've asked @jethrogb about what the best API for this would be. It shouldn't be a big deal to support it in bindgen once cexpr support is there.

@nicokoch
Copy link
Author

Meanwhile, I used the following in my wrapper.h to workaround the issue:

#include <linux/uinput.h>

const __u64 _UI_DEV_CREATE = UI_DEV_CREATE;
#undef UI_DEV_CREATE
const __u64 UI_DEV_CREATE = _UI_DEV_CREATE;

const __u64 _UI_DEV_DESTROY = UI_DEV_DESTROY;
#undef UI_DEV_DESTROY
const __u64 UI_DEV_DESTROY = _UI_DEV_DESTROY;

@emilio
Copy link
Contributor

emilio commented Jun 16, 2017

Yeah, note that that would only work in libclang 3.9+ though.

@jethrogb
Copy link
Contributor

jethrogb commented Jul 26, 2017

Current state: C preprocessor needed. @emilio can you add "help wanted" label?

@clia
Copy link

clia commented Oct 22, 2018

How about the progress of this?
I'm generating the PostgreSQL header files using bindgen, but in the output files there're a lot of macros being lost.

@jethrogb
Copy link
Contributor

@clia The state is unchanged as of my latest comment #753 (comment). This issue is marked "help wanted" and waiting for someone (anyone) to implement this functionality.

@iDawer
Copy link

iDawer commented Feb 1, 2019

Another workaround preserving original names:

wrapper.h
#include <linux/ioctl.h>
#include <linux/usbdevice_fs.h>

#define MARK_FIX_753(req_name) const unsigned long int Fix753_##req_name = req_name;

MARK_FIX_753(USBDEVFS_CONTROL);
MARK_FIX_753(USBDEVFS_BULK);
build.rs
...
#[derive(Debug)]
pub struct Fix753 {}
impl bindgen::callbacks::ParseCallbacks for Fix753 {
    fn item_name(&self, original_item_name: &str) -> Option<String> {
        Some(original_item_name.trim_start_matches("Fix753_").to_owned())
    }
}

fn main() {
    let bindings = bindgen::Builder::default()
        .header("wrapper.h")
        .parse_callbacks(Box::new(Fix753 {}))
        .generate()
        .expect("Unable to generate bindings");
    ...
}
output bindings.rs
...
pub const USBDEVFS_CONTROL: ::std::os::raw::c_ulong = 3222820096;
pub const USBDEVFS_BULK: ::std::os::raw::c_ulong = 3222820098;

Seems like this workaround shouldn't conflict with future implementation.

@vext01
Copy link

vext01 commented Apr 3, 2020

Not sure if this is the same problem, but I'm having issues getting at RTLD_DEFAULT from dlfcn.h.

It's defined thusly:

# define RTLD_DEFAULT   ((void *) 0)

Without hacks, it's totally absent from bindings.rs. Not sure why.

I've tried a few workarounds, including @iDawer's

wrapper.h:

#define _GNU_SOURCE                                                                                
#define __USE_GNU                                                                                  
#include <dlfcn.h>                                                                                 
#include <link.h>                                                                                  
                                                                                                                                                                                                                                        
// Import using another name, doesn't work.                                      
//const void *MY_RTLD_DEFAULT = RTLD_DEFAULT;                                                      
                                                                                                     
#define MARK_FIX_753(req_name) const void *Fix753_##req_name = req_name;                           
MARK_FIX_753(RTLD_DEFAULT);                                                                        
                                                                                                     
// Direct definition, works.                                                                       
//#define MY_RTLD_DEFAULT 0

build.rs:

extern crate bindgen;                                                                                
                                                                                                     
use std::{env, path::PathBuf};                                                                       
                                                                                                     
const BINDINGS_FILE: &'static str = "bindings.rs";                                                   
const WRAPPER_HEADER: &'static str = "wrapper.h";                                                    
                                                                                                     
#[derive(Debug)]                                                                                     
pub struct Fix753 {}                                                                                 
impl bindgen::callbacks::ParseCallbacks for Fix753 {                                                 
    fn item_name(&self, original_item_name: &str) -> Option<String> {                                
        Some(original_item_name.trim_start_matches("Fix753_").to_owned())                            
    }                                                                                                
}                                                                                                    
                                                                                                     
fn main() {                                                                                          
    // Rust target spec is needed for now so that auto-generated tests pass.                         
    // https://github.com/rust-lang-nursery/rust-bindgen/issues/1370#issuecomment-426597356          
    let bindings = bindgen::Builder::default()                                                     
        .header(WRAPPER_HEADER)                                                                    
        //.rust_target(bindgen::RustTarget::Stable_1_26)                                           
        .parse_callbacks(Box::new(Fix753 {}))                                                      
        .generate()                                                                                
        .expect("bindgen failed");                                                                 
                                                                                                   
    let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());                                    
    bindings                                                                                       
        .write_to_file(out_path.join(BINDINGS_FILE))                                               
        .expect("Couldn't write bindings!");                                                       
}

bindings.rs:

extern "C" {                                                                                            
    #[link_name = "\u{1}Fix753_RTLD_DEFAULT"]                                                           
    pub static mut RTLD_DEFAULT: *const ::std::os::raw::c_void;                                         
}

Notice the value is not expanded. It's referenced as an extern :\

Needless to say, trying to use RTLD_DEFAULT gives a linker error:

...
  = note: /usr/bin/ld: /home/vext01/research/yorick/phdrs/target/debug/deps/phdrs-51a0fdcb223a7a9a.1sf375tf7jcp7kb.rcgu.o: in function `phdrs::find_symbol':
          /home/vext01/research/yorick/phdrs/src/lib.rs:(.text._ZN5phdrs11find_symbol17h0db50af030d9c91cE+0x6c): undefined reference to `Fix753_RTLD_DEFAULT'                                                    
          collect2: error: ld returned 1 exit status

Questions:

  • Is RTLD_DEFAULT not expanded due to the problems discussed earlier in this thread?
  • Can anyone think of a workaround?

This is bindgen-0.52.

@diwic
Copy link
Contributor

diwic commented Feb 21, 2021

I encountered both of these problems, which I believe to be two different ones, neither

#define SNDRV_PCM_STATE_OPEN ((snd_pcm_state_t) 0)

nor

#define SNDRV_CTL_IOCTL_ELEM_LIST	_IOWR('U', 0x10, struct snd_ctl_elem_list)

...shows up in the output at all. Errors from cexpr are swallowed by bindgen and results in no output, and that's what happening here.
Anyhow, what I ended up with as a workaround was a regex to cover both these cases, as can be seen here.

The regexes are r"#define\s+(\w+)\s+\(\((\w+)\)\s*(\d+)\)" and r"#define\s+(\w+)\s+_IO([WR]*)\(([^,)]+),\s*([^,)]+),?\s*([^,)]*)\)" and the code handling these are good enough for me, but there might be special cases for other headers that it does not handle.

intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue May 16, 2022
bindgen (a tool which generates the "raw" C bindings for Rust) only
works (so far) with "simple" C `#define`s. In order to avoid having
to manually maintain these constants in the (potential) Rust side,
this patch converts them into an `enum`.

There may be support in the future for expanding macros that end up in
a "numeric" one: rust-lang/rust-bindgen#753.

Co-developed-by: Wedson Almeida Filho <wedsonaf@google.com>
Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
@jbaublitz
Copy link
Contributor

@jethrogb @emilio I now have two projects that I'm a part of that depend on this so I'd willing to help. Where do you want me to start? Does the preprocessor need to be added first at the cexpr level?

github-actions bot pushed a commit to gnoliyil/fuchsia that referenced this issue Sep 28, 2022
This adds helper functions to simplify calling zxio functions from
inet sockets.

This also adds stubs/missing_includes.h to the syncio library so
bindgen can generate zxio shutdown option constants. Without this,
bindgen runs into the issue where it can't expand C macros.
See: rust-lang/rust-bindgen#753

Change-Id: I8e3e9aa015212a2cb8dabdc678d5c08329060014
Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/729108
Reviewed-by: Adam Barth <abarth@google.com>
Commit-Queue: Vickie Cheng <vickiecheng@google.com>
andrzejtp added a commit to andrzejtp/v4l2r that referenced this issue Feb 8, 2023
rust-lang/rust-bindgen#753

As a result all V4L2_FWHT_FL_* macros have no bindings generated for them.
The github issue describes a suggested workaround, which is applied in
this patch which adds a dedicated "runbindgen" utility, and is centered
around using a wrapper C header file for the problematic header and
defining a custom parse callback.

This works for _BITUL(), but unfortunately not for GENMASK() found in
v4l2-controls.h, which is included from videodev2.h. However, there are
just 2 invocations of GENMASK, so in the fix753.h wrapper we can #undef
the 2 masks and define them manually.

To generate bindings - inside runbindgen directory:

cargo run -- -o ../videodev2_64.rs  -I /path/to/kerneldir/usr/include/
cargo run -- -o ../videodev2_32.rs  -I /path/to/kerneldir/usr/include/ -s /usr/i686-linux-gnu/ -t i686-linux-gnu
Gnurou pushed a commit to Gnurou/v4l2r that referenced this issue Mar 1, 2023
rust-lang/rust-bindgen#753

As a result all V4L2_FWHT_FL_* macros have no bindings generated for them.
The github issue describes a suggested workaround, which is applied in
this patch which adds a dedicated "runbindgen" utility, and is centered
around using a wrapper C header file for the problematic header and
defining a custom parse callback.

This works for _BITUL(), but unfortunately not for GENMASK() found in
v4l2-controls.h, which is included from videodev2.h. However, there are
just 2 invocations of GENMASK, so in the fix753.h wrapper we can #undef
the 2 masks and define them manually.

To generate bindings - inside runbindgen directory:

cargo run -- -o ../videodev2_64.rs  -I /path/to/kerneldir/usr/include/
cargo run -- -o ../videodev2_32.rs  -I /path/to/kerneldir/usr/include/ -s /usr/i686-linux-gnu/ -t i686-linux-gnu
@eliaxelang007
Copy link

Any updates to this issue?

I'm trying to create rust bindings for raylib, but bindgen isn't generating rust bindings for the color macros:

/* raylib.h */

/* ... */

/* These aren't being generated in the bindgen output. */
#define WHITE      CLITERAL(Color){ 255, 255, 255, 255 }   // White
#define BLACK      CLITERAL(Color){ 0, 0, 0, 255 }         // Black
#define BLANK      CLITERAL(Color){ 0, 0, 0, 0 }           // Blank (Transparent)
#define MAGENTA    CLITERAL(Color){ 255, 0, 255, 255 }     // Magenta
#define RAYWHITE   CLITERAL(Color){ 245, 245, 245, 255 }   // My own White (raylib logo)

/* ... */

koverstreet pushed a commit to koverstreet/bcachefs-tools that referenced this issue Apr 27, 2023
In an effort to rewrite `bch2_read_super` from C to Rust,
it is neccessary to have `opt_get!` macro defined, and some
FMODE_* consts (defined as macro in `include/linux/blkdev.h`)
defined as Rust const.

Bindgen is currently unable to exapnd C functional macro [1],
this this commit use the workaround as introduced in [2].

[1] rust-lang/rust-bindgen#753
[2] rust-lang/rust-bindgen#753 (comment)

Signed-off-by: TruongSinh Tran-Nguyen <i@truongsinh.pro>
truongsinh pushed a commit to truongsinh/bcachefs-tools that referenced this issue Apr 28, 2023
In an effort to rewrite `bch2_read_super` from C to Rust,
it is neccessary to have `opt_get!` macro defined, and some
FMODE_* consts (defined as macro in `include/linux/blkdev.h`)
defined as Rust const.

Bindgen is currently unable to exapnd C functional macro [1],
this this commit use the workaround as introduced in [2].

[1] rust-lang/rust-bindgen#753
[2] rust-lang/rust-bindgen#753 (comment)

Signed-off-by: TruongSinh Tran-Nguyen <i@truongsinh.pro>
@pustekuchen91
Copy link

pustekuchen91 commented Aug 11, 2023

Is there a possible solution in the near feature?

I'm trying to convert macros for pin definitions of the harmony framework

#define MyGPIO_Get()               (((PORT_REGS->GROUP[2].PORT_IN >> 0U)) & 0x01U)
#define MyGPIO_PIN                  PORT_PIN_PC00

@jbaublitz
Copy link
Contributor

@emilio I'd like to propose a temporary and longer term solution that will take a while due to it requiring LLVM.

In the short term, @ojeda found a workaround with clang where we should be able to use temporary intermediate files with the macros surrounded by parentheses to evaluate these sorts of expressions to a constant. This may slow down compile time, but it should have the benefit of allowing this particular case to be handled using current clang.

In the longer term, @ojeda talked with the clang maintainers and they are open to having someone add functionality to LLVM and clang to essentially keep C preprocessor macro information available for a given parsed file. As you probably already know, currently, a lot of the cpp information is dropped by the time you've parsed the file. The clang maintainer is open to changes that would effectively allow using a header file as a context from which you could evaluate arbitrary macro expressions as I understand it (correct me if I'm wrong, @ojeda, since I wasn't part of this conversation). This would provide bindgen with the ability to have much more robust macro evaluate through clang.

For now, would you be willing to accept the shorter term solution if I put up a PR and would you be interested in something like the longer term solution later on?

@pvdrz
Copy link
Contributor

pvdrz commented Aug 14, 2023

@jbaublitz could you expand a bit on how this workaround with intermediate files would work?

@jbaublitz
Copy link
Contributor

@pvdrz Sure! Just as a heads up, I think @ojeda and I have come up with two similar but slightly different solutions so we're emailing back and forth to get on the same page. However, I've had success in our test case around macros in cryptsetup that use function-like macros in their expansion to define constants using my solution.

I've looked at the bindgen code and I feel like we kind of have two routes here. Currently, it looks like we have cexpr doing a lot of the heavy lifting in the case of parsing #defines. I'll outline the two possibilities I see with my solution and the tradeoffs and hopefully one of them will be acceptable for bindgen while we work on the longer term work of LLVM API changes.

The first option is a more localized change and could possibly be used as a backup for cexpr if it fails even, but involves more temporary files and therefore potential impacts to compile time for a large number of function-like macro definitions. In this case, if cexpr fails to parse the expansion of the macro (for example if it uses function-like macros in the expansion), it is possible to write out a temporary file containing a use of the macro. Clang can read in this temporary file and parse the macro invocation and provide an integer literal as the output in cases like the one I liked above. The drawback is that there would like need to be a temporary file for each failure to parse the macro expansion. This may not be acceptable but I wanted to put it out there as an option.

The other option would be a larger change but would only require one temporary file per header being passed to bindgen. I've had success with the clang crate generating a list of macro definitions in the header file, putting them all in a temporary file, and then evaluating each one, mapping the evaluation to the original macro definition name, returning all of that information in a HashMap. I'm not entirely sure how this would work with cexpr, but it is definitely the more performant option, and I have example code using the clang crate that demonstrates that it would work.

Do either of these sound acceptable or at least like something we can iterate on to find a solution that would be acceptable for bindgen? I understand that both of them are definitely workarounds and not the ideal solution we hope to get to with LLVM changes, but I think it could provide some quality of life improvements for right now. For example, when we built our Rust bindings for libcryptsetup against a new version that used function-like macros in their #defines after previously not doing so, we had a build failure of our high level bindings because those constants suddenly disappeared from libcryptsetup-rs-sys. Either of these solutions would at least prevent that from happening in the future.

@ojeda
Copy link

ojeda commented Aug 16, 2023

In the short term, @ojeda found a workaround with clang where we should be able to use temporary intermediate files with the macros surrounded by parentheses to evaluate these sorts of expressions to a constant. This may slow down compile time, but it should have the benefit of allowing this particular case to be handled using current clang.

Yeah, to give an example:

void f(void) {
    (M);
}

Then one can query with a cursor placed at (, and Clang will resolve the expression. It seems to always give Kind: Int though, at least through c-index-test -- we should test corner cases up to 64-bit and see if it works for our purposes.

Another way is initializing a variable:

int m = M;

If one requests m, that seems to return the value after type conversion.

The clang maintainer is open to changes that would effectively allow using a header file as a context from which you could evaluate arbitrary macro expressions as I understand it (correct me if I'm wrong, @ojeda, since I wasn't part of this conversation).

@AaronBallman told me that the C indexing APIs (include/clang-c/) work by building an AST, but that AST does not keep the macro information. If we implemented the change, then Aaron offered to review the changes and so on.

The change would involve keeping enough information to be able to evaluate the macros (or perhaps directly saving a map of evaluations, similar to what you suggest later, but on Clang's side).

(Thanks a lot @jbaublitz for looking into this)

@AaronBallman
Copy link

FWIW, Clang's C indexing APIs have a TU-level flag named CXTranslationUnit_DetailedPreprocessingRecord. Passing that causes the indexing APIs to behave as if -Xclang -detailed-preprocessing-record was passed to the compiler which might get you what you need already. If it doesn't get you what you need, then it's quite likely in the correct area for where you'd make changes to expose the macro information you're after. (I'm not intimately familiar with this part of the code, but you may need to call clang_annotateTokens to get cursors that visit macros.)

@ojeda
Copy link

ojeda commented Aug 16, 2023

Yeah, in my case I just tested via c-index-test, which from a quick look it seems to set CXTranslationUnit_DetailedPreprocessingRecord for -evaluate-cursor-at (and no reparsing). But if it happens to e.g. work via the API without changes to libclang, that would be great. Thanks a lot @AaronBallman for taking a look!

@jbaublitz
Copy link
Contributor

@pvdrz Just a few more considerations:

  • Evaluating all macros in a single file could result in failure to parse any of the macros if some represent expansions that result in a syntax failure when prepended with a semicolon. If we take that approach, we might want to have a fallback mode if the single intermediary file fails to compile to evaluate each macro in its own intermediary file.
  • We could always make the intermediary file approach opt-in only. That way, we're not having anyone use it by default, but they can opt in if they find this issue and want to use this feature before Clang supports this kind of thing directly.
  • Corner cases where this macro evaluation doesn't work as well: char-cast integers will show up as integers still, the macro evaluation doesn't seem to work very well for 128-bit integers even when cast as (__uint128_t) and they appear to be truncated still.

@jbaublitz
Copy link
Contributor

@pvdrz Would you prefer to just have the support in LLVM as opposed to the workaround I proposed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement help wanted rust-for-linux Issues relevant to the Rust for Linux project
Projects
None yet
Development

No branches or pull requests