Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitfield methods not generated #816

Closed
glyn opened this issue Jul 17, 2017 · 26 comments
Closed

bitfield methods not generated #816

glyn opened this issue Jul 17, 2017 · 26 comments

Comments

@glyn
Copy link
Contributor

glyn commented Jul 17, 2017

Some bitfield getters and setter methods which were generated by bindgen 24.x are no longer generated by bindgen 26.x.

The following is a stripped down branch of my project which reproduces the problem with a single C struct.

https://github.com/glyn/jvmkill/tree/reproduce-bindgen-bug

Input C/C++ Header

typedef struct {
    unsigned int can_tag_objects : 1;
    unsigned int can_generate_field_modification_events : 1;
    unsigned int can_generate_field_access_events : 1;
    unsigned int can_get_bytecodes : 1;
    unsigned int can_get_synthetic_attribute : 1;
    unsigned int can_get_owned_monitor_info : 1;
    unsigned int can_get_current_contended_monitor : 1;
    unsigned int can_get_monitor_info : 1;
    unsigned int can_pop_frame : 1;
    unsigned int can_redefine_classes : 1;
    unsigned int can_signal_thread : 1;
    unsigned int can_get_source_file_name : 1;
    unsigned int can_get_line_numbers : 1;
    unsigned int can_get_source_debug_extension : 1;
    unsigned int can_access_local_variables : 1;
    unsigned int can_maintain_original_method_order : 1;
    unsigned int can_generate_single_step_events : 1;
    unsigned int can_generate_exception_events : 1;
    unsigned int can_generate_frame_pop_events : 1;
    unsigned int can_generate_breakpoint_events : 1;
    unsigned int can_suspend : 1;
    unsigned int can_redefine_any_class : 1;
    unsigned int can_get_current_thread_cpu_time : 1;
    unsigned int can_get_thread_cpu_time : 1;
    unsigned int can_generate_method_entry_events : 1;
    unsigned int can_generate_method_exit_events : 1;
    unsigned int can_generate_all_class_hook_events : 1;
    unsigned int can_generate_compiled_method_load_events : 1;
    unsigned int can_generate_monitor_events : 1;
    unsigned int can_generate_vm_object_alloc_events : 1;
    unsigned int can_generate_native_method_bind_events : 1;
    unsigned int can_generate_garbage_collection_events : 1;
    unsigned int can_generate_object_free_events : 1;
    unsigned int can_force_early_return : 1;
    unsigned int can_get_owned_monitor_stack_depth_info : 1;
    unsigned int can_get_constant_pool : 1;
    unsigned int can_set_native_method_prefix : 1;
    unsigned int can_retransform_classes : 1;
    unsigned int can_retransform_any_class : 1;
    unsigned int can_generate_resource_exhaustion_heap_events : 1;
    unsigned int can_generate_resource_exhaustion_threads_events : 1;
    unsigned int : 7;
    unsigned int : 16;
    unsigned int : 16;
    unsigned int : 16;
    unsigned int : 16;
    unsigned int : 16;
} jvmtiCapabilities;

Bindgen Invocation

    let bindings = bindgen::Builder::default()
        .header(input.h)
        .generate()
        .expect("Failed to generate bindings");

    bindings
        .write_to_file(output.rs)
        .expect("Failed to write bindings");
}

Actual Results

Incorrect bindings from bindgen 26.3:

/* automatically generated by rust-bindgen */

#[repr(C)]
#[derive(Debug, Copy)]
pub struct jvmtiCapabilities {
    pub _bitfield_1: [u16; 8usize],
    pub __bindgen_align: [u32; 0usize],
}
#[test]
fn bindgen_test_layout_jvmtiCapabilities() {
    assert_eq!(::std::mem::size_of::<jvmtiCapabilities>() , 16usize , concat !
               ( "Size of: " , stringify ! ( jvmtiCapabilities ) ));
    assert_eq! (::std::mem::align_of::<jvmtiCapabilities>() , 4usize , concat
                ! ( "Alignment of " , stringify ! ( jvmtiCapabilities ) ));
}
impl Clone for jvmtiCapabilities {
    fn clone(&self) -> Self { *self }
}

Expected Results

Correct bindings from bindgen 24.x:

/* automatically generated by rust-bindgen */

#[repr(C)]
#[derive(Debug, Copy)]
pub struct jvmtiCapabilities {
    pub _bitfield_1: [u8; 4usize],
    pub _bitfield_2: [u16; 2usize],
    pub _bitfield_3: [u16; 2usize],
    pub _bitfield_4: [u16; 2usize],
    pub __bindgen_align: [u32; 0usize],
}
#[test]
fn bindgen_test_layout_jvmtiCapabilities() {
    assert_eq!(::std::mem::size_of::<jvmtiCapabilities>() , 16usize , concat !
               ( "Size of: " , stringify ! ( jvmtiCapabilities ) ));
    assert_eq! (::std::mem::align_of::<jvmtiCapabilities>() , 4usize , concat
                ! ( "Alignment of " , stringify ! ( jvmtiCapabilities ) ));
}
impl Clone for jvmtiCapabilities {
    fn clone(&self) -> Self { *self }
}
impl jvmtiCapabilities {
    #[inline]
    pub fn can_tag_objects(&self) -> ::std::os::raw::c_uint {
        let mask = 1usize as u32;
        let field_val: u32 =
            unsafe { ::std::mem::transmute(self._bitfield_1) };
        let val = (field_val & mask) >> 0usize;
        unsafe { ::std::mem::transmute(val as u32) }
    }
    #[inline]
    pub fn set_can_tag_objects(&mut self, val: ::std::os::raw::c_uint) {
        let mask = 1usize as u32;
        let val = val as u32 as u32;
        let mut field_val: u32 =
            unsafe { ::std::mem::transmute(self._bitfield_1) };
        field_val &= !mask;
        field_val |= (val << 0usize) & mask;
        self._bitfield_1 = unsafe { ::std::mem::transmute(field_val) };
    }
    // etc. etc.
}

RUST_LOG=bindgen Output

Please see this gist since the log was too long for a github issue body.

@fitzgen fitzgen added the bug label Jul 17, 2017
@fitzgen
Copy link
Member

fitzgen commented Jul 17, 2017

Thanks for the regression report!

@emilio
Copy link
Contributor

emilio commented Jul 18, 2017

So this is because with the correct bitfield layout, that bitfield is actually > 64 bits, so we can't use a rust integer to represent it...

@glyn
Copy link
Contributor Author

glyn commented Jul 18, 2017

@emilio I see. How difficult would it be to use a representation which is not a single integer?

@emilio
Copy link
Contributor

emilio commented Jul 18, 2017

It'd require some rework I personally don't have the time to do right now... but it'd be nice, and it should be doable (you just need to calculate a window where you can do a load for said bitfield, and choose one of them with the proper bitmask)

@glyn
Copy link
Contributor Author

glyn commented Aug 23, 2017

Please note this bug is blocking our project from upgrading from bindgen v0.24.x

@fitzgen
Copy link
Member

fitzgen commented Aug 23, 2017

Please note this bug is blocking our project from upgrading from bindgen v0.24.x

Hi @glyn, if you're interested in unblocking your project, here are some tips for navigating around this code:

What I believe is involved in fixing this bug is

  1. Adding support for emitting bitfield allocation units that are arrays, eg [u64; 2].

  2. Adding support for loading logical bitfields out of such bitfield allocation units. This will probably mean that the Bitfield::mask method will need updating, as well as some code inside src/codegen/mod.rs dealing with bitfields.

@glyn
Copy link
Contributor Author

glyn commented Aug 23, 2017

Thanks @fitzgen - very helpful.

@glyn
Copy link
Contributor Author

glyn commented Sep 4, 2017

@emilio mentioned in #servo that #940 switches from using quote_ty! to using quote!.

@glyn
Copy link
Contributor Author

glyn commented Sep 7, 2017

Progress has been slow in understanding the code, partly because some of it appears to be based on an invalid assumption that masks can always be 64 bits wide and that bitfield allocation units can always be manipulated as 64 bit entities (see #954).

I'm not sure how performance critical the generated code is, but I'm inclined to say that we should make it correct first and optimise it later. So I'm thinking of treating bitfields as arrays of u8 and developing code to do masking etc. on those arrays.

Testing is a bit of a concern as that kind of code will be quite fiddly and would benefit from a good unit test suite. But I think that would be inconvenient to do through codegen. I wonder if we could develop some kind of helper type and make the generated code depend on that type.

Any comments on the general approach gratefully received from @emilio, @fitzgen, or others.

@pylbrecht
Copy link

I'll be doing my best to support @glyn with this issue.

@emilio
Copy link
Contributor

emilio commented Sep 7, 2017

I'm not sure how performance critical the generated code is, but I'm inclined to say that we should make it correct first and optimise it later. So I'm thinking of treating bitfields as arrays of u8 and developing code to do masking etc. on those arrays.

With this, you mean actually generating arrays of u8 on the structs? Or just treating them as-if? Because if the first, I think you'll find alignment issues and such.

@glyn
Copy link
Contributor Author

glyn commented Sep 7, 2017

My understanding, such as it is, is that bitfields are grouped into contiguous blocks of bytes which start on a byte boundary. Look at this for example to see that bindgen already generates arrays of u8.

@emilio
Copy link
Contributor

emilio commented Sep 7, 2017

The final alignment depends on the types specified as members. See:

https://github.com/rust-lang-nursery/rust-bindgen/blob/978b5316f79aae7a66fd7390d13276a005f9fc9a/tests/expectations/tests/struct_with_bitfields.rs#L12

For an example where we generate 4-byte-aligned bitfields.

@glyn
Copy link
Contributor Author

glyn commented Oct 12, 2017

A status update: @pylbrecht is unable to commit time to this issue. I am also finding it hard to get traction on this issue with the maximum of one day per week I get to work on it.

I have been exploring a generic overlay of an array of unsigned integers of any supported precision. The goal was to unit test masking and other bitfield operations thoroughly and independently of codegen. However that approach ran aground because of Rust restrictions in the way arrays are handled as generic type parameters.

I think the way forward is to generate a specific overlay and do a bit more work during codegen to tailor the code to the array length and integer precision. It will be messier to get good unit tests, although bindgen doesn't seem particularly reliant on unit tests, so maybe that would be acceptable.

To gain traction, I need to understand the codegen code in more detail. The code overview in CONTRIBUTING.md is great, but I'm still trying to figure out which pieces of code will need to be changed to implement this issue. Alternatively, it might be better to revisit this issue after #954 has been fixed as there will then be less confusion in the implementation.

@fitzgen
Copy link
Member

fitzgen commented Oct 12, 2017

Regarding the "generic overlay": were you thinking emitting in the bindings' prelude something like

union BindgenBitfieldUnit<Storage, Align>
where
    Storage: Copy + AsRef<[u8]>,
    Align: Copy
{
    storage: Storage,
    align: [Align; 0],
}

impl<Storage, Align> BindgenBitfieldUnit<Storage, Align>
where
    Storage: Copy + AsRef<[u8]>,
    Align: Copy
{
    #[inline]
    fn get(&self, bit_offset: usize, bit_width: usize) -> u64 {
        unimplemented!()
    }
    
    fn set(&mut self, bit_offset: usize, bit_width: usize, val: u64) {
        unimplemented!()
    }
}

Which only manages storage and reading/writing that storage? And then the generated code for bitfields would do something like this:

/*
This would be generated from:

struct TaggedPtr {
    uintptr_t ptr : 61;
    char      tag : 3;
};
*/
pub struct TaggedPtr {
    _bitfield_unit_1: BindgenBitfieldUnit<[u8; 8], u64>,
}

impl TaggedPtr {
    pub fn ptr(&self) -> usize {
        self._bitfield_unit_1.get(0, 61) as usize
    }
    
    pub fn set_ptr(&mut self, ptr: usize) {
        self._bitfield_unit_1.set(0, 61, ptr as u64)
    }
    
    pub fn tag(&self) -> ::std::os::raw::c_char {
        self._bitfield_unit_1.get(62, 3) as ::std::os::raw::c_char
    }
    
    pub fn set_tag(&mut self, tag: ::std::os::raw::c_char) {
        self._bitfield_unit_1.set(62, 3, tag as u64);
    }
}

This seems like a reasonable way to split responsibilities and break the problem down into more manageable chunks.

it might be better to revisit this issue after #954 has been fixed as there will then be less confusion in the implementation.

Agreed. Breaking things down into smaller blocks is almost always helpful.

@glyn
Copy link
Contributor Author

glyn commented Oct 12, 2017

@fitgen Thanks for your reply.

Regarding the "generic overlay": were you thinking emitting in the bindings' prelude something like ... [snip]

My approach was roughly like that, but yours is more likely to work with current Rust since it avoids dealing with the full array as a generic type parameter.

This seems like a reasonable way to split responsibilities and break the problem down into more manageable chunks.

it might be better to revisit this issue after #954 has been fixed as there will then be less confusion in the implementation.

Agreed. Breaking things down into smaller blocks is almost always helpful.

I'll take a look at #954 but, unfortunately, it won't be any time soon.

@glyn
Copy link
Contributor Author

glyn commented Nov 13, 2017

I've spent several hours reading the code and it's making more sense now. I fixed #954.

For this issue, I like the look of @fitzgen's union BindgenBitfieldUnit<Storage, Align> approach, but that would restrict the target Rust version to ≥1.19. Is that a problem?

@fitzgen
Copy link
Member

fitzgen commented Nov 13, 2017

I poked at this a little while ago. The union can be avoided, but I had other issues as well, that I didn't spend enough time digging into to get tot he bottom of.

  • llvm wasn't able to boil this away into the simplest loads and masking I'd hoped for. This could be my fault, and it desn't even work properly yet, so maybe this shouldn't even be a worry yet.

  • This is all very specific about endianity, but I guess our bitfield allocation unit computations are very x86[_64] PS ABI specific anyways, so shrug

  • It is just buggy and not handling endianity correctly anyways.

#[derive(Copy, Clone, Debug, Default, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub struct BindgenBitfieldUnit<Storage, Align>
where
    Storage: AsRef<[u8]> + AsMut<[u8]>,
{
    storage: Storage,
    align: [Align; 0],
}

impl<Storage, Align> BindgenBitfieldUnit<Storage, Align>
where
    Storage: AsRef<[u8]> + AsMut<[u8]>,
{
    #[inline]
    pub fn new(storage: Storage) -> Self {
        Self {
            storage,
            align: [],
        }
    }

    #[inline]
    pub fn get_bit(&self, index: usize) -> bool {
        debug_assert!(index / 8 < self.storage.as_ref().len());

        let byte_index = self.storage.as_ref().len() - (index / 8) - 1;
        let byte = self.storage.as_ref()[byte_index];

        let bit_index = index % 8;
        let mask = 1 << bit_index;

        byte & mask == mask
    }

    #[inline]
    pub fn set_bit(&mut self, index: usize, val: bool) {
        debug_assert!(index / 8 < self.storage.as_ref().len());

        let byte_index = self.storage.as_ref().len() - (index / 8) - 1;
        let byte = &mut self.storage.as_mut()[byte_index];

        let bit_index = index % 8;
        let mask = 1 << bit_index;

        if val {
            *byte |= mask;
        } else {
            *byte &= !mask;
        }
    }

    #[inline]
    pub fn get(&self, bit_offset: usize, bit_width: u8) -> u64 {
        debug_assert!(bit_width <= 64);
        debug_assert!(bit_offset / 8 < self.storage.as_ref().len());
        debug_assert!((bit_offset + (bit_width as usize)) / 8 <= self.storage.as_ref().len());

        let mut val = 0;

        for i in 0..(bit_width as usize) {
            if self.get_bit(i + bit_offset) {
                val |= 1 << i;
            }
        }

        val
    }

    #[inline]
    pub fn set(&mut self, bit_offset: usize, bit_width: u8, val: u64) {
        debug_assert!(bit_width <= 64);
        debug_assert!(bit_offset / 8 < self.storage.as_ref().len());
        debug_assert!((bit_offset + (bit_width as usize)) / 8 <= self.storage.as_ref().len());

        for i in 0..(bit_width as usize) {
            let mask = 1 << i;
            let val_bit_is_set = val & mask == mask;
            self.set_bit(i + bit_offset, val_bit_is_set);
        }
    }
}

WIP commit defining BindgenBitfieldUnit is here: fitzgen@e5bcf3b

WIP tree with integration into codegen is here: https://github.com/fitzgen/rust-bindgen/commits/bindgen-bitfield-unit

@glyn
Copy link
Contributor Author

glyn commented Nov 13, 2017

Thanks @fitzgen. I take it that you would prefer to avoid union then. ;-)

@fitzgen
Copy link
Member

fitzgen commented Nov 13, 2017

Yeah, if possible, and in this case it isn't too difficult :)

@glyn
Copy link
Contributor Author

glyn commented Nov 14, 2017

@fitzgen I'm exploring something similar to your WIP commit with get taking, and set returning, a u8 rather than a u64. The endianness considerations then disappear from that code and cluster in the unit constructors and field getters and setters.

Correction: re-reading your code, I see the u64 is just a way of passing a bit-strip. Please ignore the comment above.

@glyn
Copy link
Contributor Author

glyn commented Nov 17, 2017

@fitzgen I corrected a little bug in your WIP commit and now all the tests pass. Is this suitable for merging to master? We can then work on lifting the 64 bit restriction. Please see this commit.

@fitzgen
Copy link
Member

fitzgen commented Nov 17, 2017

@glyn Awesome! Thank you very much! Can you open a PR with the WIP commit and your fix squashed together with a nice commit message? Then I can do a final double check and merge it.

Thanks!

@glyn
Copy link
Contributor Author

glyn commented Nov 20, 2017

@fitzgen Since it was then relatively easy to lift the 64 bit unit restriction and that is the real motivation for the above code, I've submitted a PR with this all done together. Hope that works for you.

bors-servo pushed a commit that referenced this issue Nov 23, 2017
Support bitfield allocation units larger than 64 bits

Individual bitfields are still limited to at most 64 bits, but this restriction can be weakened when Rust supports `u128`.

This implements issue #816.

Usage notes:

* Since common code is added to each generated binding, a program which uses
  more than one binding may need to work around the duplication by including
  each binding in its own module.
* The values created by bitfield allocation unit constructors can be assigned
  directly to the corresponding struct fields with no need for transmutation.

Implementation notes:

`__BindgenBitfieldUnit` represents a bitfield allocation unit using a `Storage`
type accessible as a slice of `u8`. The alignment of the unit is inherited from
an `Align` type by virtue of the field:
```
align: [Align; 0],
```
The position of this field in the struct is irrelevant.

It is assumed that the alignment of the `Storage` type is no larger than the
alignment of the `Align` type, which will be true if the `Storage` type is, for
example, an array of `u8`. This assumption is checked in a debug assertion.

Although the double underscore (__) prefix is reserved for implementations of
C++, there are precedents for this convention elsewhere in bindgen and so the
convention is adopted here too.

Acknowledgement:

Thanks to @fitzgen for an initial implementation of `__BindgenBitfieldUnit` and
code to integrate it into bindgen.

r? @emilio
@fitzgen
Copy link
Member

fitzgen commented Nov 27, 2017

This can be closed now, right? @glyn @emilio

@glyn
Copy link
Contributor Author

glyn commented Dec 7, 2017

@fitzgen Yes. I checked the behaviour using the example at the start of this issue.

@glyn glyn closed this as completed Dec 7, 2017
fitzgen added a commit to fitzgen/rust-bindgen that referenced this issue Dec 7, 2017
bors-servo pushed a commit that referenced this issue Dec 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants