C# Bitfield struct support #465

Open
gafter opened this issue Apr 21, 2017 · 3 comments

@gafter
Member

commented Apr 21, 2017

@fanoI commented on Fri Oct 23 2015

Could support for struct bitfields be added to the language?
When talking to low-level hardware in C/C++, it is really natural to write something like:

typedef struct reg {
         unsigned char bit1: 1;
         unsigned char bit2: 1;
         unsigned char bit3: 1;
         unsigned char bit4: 1;
         unsigned char bit5: 1;
         unsigned char bit6: 1;
         unsigned char bit7: 1;
         unsigned char bit8: 1;
} reg_t;

reg_t reg;

To write to the sixth bit one simply does reg.bit6 = 1, and getting the value of the bit is just as easy.
When it is time to write to the device, you simply reinterpret the whole struct as a char (it is 8 bits, that is, one byte) and use the write method.
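
A minimal sketch of that workflow, written in C++ so it can be compiled as-is. The helper names to_byte, from_byte and write_bit6_demo are made up for illustration, and the sketch assumes the compiler packs the eight fields into a single byte (bit-field layout is implementation-defined, so which physical bit bit6 lands on is not guaranteed).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Eight one-bit fields, mirroring the reg_t struct above.
struct reg_t {
    unsigned char bit1 : 1;
    unsigned char bit2 : 1;
    unsigned char bit3 : 1;
    unsigned char bit4 : 1;
    unsigned char bit5 : 1;
    unsigned char bit6 : 1;
    unsigned char bit7 : 1;
    unsigned char bit8 : 1;
};

// Assumption of this sketch: the eight fields share one byte of storage.
static_assert(sizeof(reg_t) == 1, "expected reg_t to occupy one byte");

// Hypothetical helper: copy the struct's storage into a byte for a write routine.
inline std::uint8_t to_byte(const reg_t& r) {
    std::uint8_t raw;
    std::memcpy(&raw, &r, sizeof raw);
    return raw;
}

// Hypothetical helper: rebuild the struct from a byte read back from the device.
inline reg_t from_byte(std::uint8_t raw) {
    reg_t r;
    std::memcpy(&r, &raw, sizeof r);
    return r;
}

// Usage: set one field, obtain the byte to send, and read the field back.
inline void write_bit6_demo() {
    reg_t reg{};
    reg.bit6 = 1;
    std::uint8_t raw = to_byte(reg);  // the byte that would be written to the device
    assert(from_byte(raw).bit6 == 1);
}
```

The copy through memcpy is used here instead of a cast because a struct cannot be cast directly to a char.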

But how could this struct be represented in C#?
For example in this way:

 // BitfieldLength is the proposed attribute; it does not exist today.
 // C# has no "unsigned" keyword: byte is already unsigned.
 [StructLayout(LayoutKind.Sequential, Pack=0)]
 public struct reg {
         [BitfieldLength(1)]
         byte bit1;

         [BitfieldLength(1)]
         byte bit2;

         [BitfieldLength(1)]
         byte bit3;

         [BitfieldLength(1)]
         byte bit4;

         [BitfieldLength(1)]
         byte bit5;

         [BitfieldLength(1)]
         byte bit6;

         [BitfieldLength(1)]
         byte bit7;

         [BitfieldLength(1)]
         byte bit8;
}

The compiler would probably inject some hidden methods to simulate access to a reg bit (it would probably create a normal hidden 8-bit field and use bitmasks behind the scenes).

Obviously BitfieldLength could take any value; perhaps the only restriction would be that the size of the struct in bits must be a multiple of 8 (although Pack=0 is written in the struct attributes above, that should not prevent the compiler from adding a hidden padding field at the end of the struct to align it to the size of a native type if it needs to).
With this bitfield struct in place one could easily create exotic integer types such as Int24 or Int128; this, for example, is an Int24 struct:

 [StructLayout(LayoutKind.Sequential, Pack=0)]
 public struct Int24 {
         [BitfieldLength(24)]
         internal int m_value;
         public const int MaxValue = 0x7fffff;
         // The smallest 24-bit two's complement value; 0x800000 on its own is positive as an int.
         public const int MinValue = -0x800000;

         // Compares this object to another object, returning an integer that
         // indicates the relationship.
         // Returns a value less than zero if this object is less than value,
         // zero if they are equal, and greater than zero otherwise;
         // null is considered to be less than any instance.
         // If object is not of type Int24, this method throws an ArgumentException.
         //
         public int CompareTo(Object value) {
             if (value == null) {
                 return 1;
             }
             if (value is Int24) {
                 // Need to use compare because subtraction will wrap
                 // to positive for very large neg numbers, etc.
                 int i = ((Int24)value).m_value;
                 if (m_value < i) return -1;
                 if (m_value > i) return 1;
                 return 0;
             }
             throw new ArgumentException(Environment.GetResourceString("Arg_MustBeInt24"));
         }

         [...]
 }

At this point one could use it like a normal Int32: Int24 a = 42;. This would probably trigger an unwanted conversion from Int32, and for new integer types wider than 64 bits the values cannot be represented as literals, so this should probably be coupled with the ability to define our own constants.
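
To make the Int24 idea concrete without any new language feature, here is a rough C++ sketch of a 24-bit signed integer kept in three bytes. The byte order (least significant byte first) and the names to_int24 and from_int24 are assumptions of this sketch, not part of the proposal.

```cpp
#include <cstdint>

// Hypothetical 24-bit signed integer: three bytes, least significant byte first.
struct Int24 {
    std::uint8_t bytes[3];
};

constexpr std::int32_t kInt24Max = 0x7FFFFF;
constexpr std::int32_t kInt24Min = -0x800000;

// Keep the low 24 bits of value; callers are expected to stay within range.
constexpr Int24 to_int24(std::int32_t value) {
    return Int24{{
        static_cast<std::uint8_t>(static_cast<std::uint32_t>(value) & 0xFFu),
        static_cast<std::uint8_t>((static_cast<std::uint32_t>(value) >> 8) & 0xFFu),
        static_cast<std::uint8_t>((static_cast<std::uint32_t>(value) >> 16) & 0xFFu)}};
}

// Widen back to 32 bits. Flipping bit 23 and subtracting 0x800000
// sign-extends the 24-bit two's complement value.
constexpr std::int32_t from_int24(Int24 v) {
    return static_cast<std::int32_t>(
               (static_cast<std::uint32_t>(v.bytes[0]) |
                (static_cast<std::uint32_t>(v.bytes[1]) << 8) |
                (static_cast<std::uint32_t>(v.bytes[2]) << 16)) ^ 0x800000u) -
           0x800000;
}
```

The conversion from a 32-bit value is explicit here, which is one way to avoid the unwanted implicit conversion mentioned above.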


@Joe4evr commented on Fri Oct 23 2015

I'm pretty sure this is a dupe, but I can't find the previous issue off-hand.

Anyway, getting the value of a single bit in the CLR is actually not so easy. For reference here's a good piece about booleans in the CLR, the most relevant part of which is:

Performance is also the core reason why a bool is not a single bit. There are few processors that make a bit directly addressable, the smallest unit is a byte. An extra instruction is required to fish the bit out of the byte, that doesn't come for free. And it is never atomic.

The closest thing you're going to get to this is Binary Literals (#215, code for which is already checked in), but I wouldn't hold my breath for awkwardly sized integer types.


@HaloFour commented on Fri Oct 23 2015

@Joe4evr

It was over on the CoreCLR repo: #1635


@fanoI commented on Fri Oct 23 2015

Yes, I reported the issue in the CoreCLR repo, but they said it was better to do this in the language rather than modify the CLR, so I cross-posted it here.

However, the idea is that the Roslyn compiler transforms that struct into something more digestible by the CLR, giving the developer the illusion of accessing the bits directly when in reality bitmasking is done (on one byte in the reg example). I am not asking for .NET to make bits truly addressable, as that would not map to any existing CPU.
I imagine GCC itself does a similar transformation internally, as my x86 certainly does not support bit addressing.

This is a possible way to transform it:

  [StructLayout(LayoutKind.Sequential, Pack=0)]
  public struct reg {
         // Hidden backing field: the eight 1-bit fields fit in one byte
         internal byte data;

         /*  [BitfieldLength(1)]
          *  byte bit1;
          */
         public byte bit1
         {
               /*
                * Bitmask based on the position of the field in the structure and
                * the BitfieldLength attribute to get the value
                */
               get { return (byte)(data & ...);  }
               /*
                * Bitmask based on the position of the field and
                * the BitfieldLength attribute to set the value
                * (0x1 keeps a single bit, matching BitfieldLength(1))
                */
               set { data = (byte)((data & ~...) | (value & 0x1));  }
         }

         /*  [BitfieldLength(1)]
          *  byte bit2;
          */
         public byte bit2
         {
               /*
                * Bitmask based on the position of the field in the structure and
                * the BitfieldLength attribute to get the value
                */
               get { return (byte)(data & ...);  }
               /*
                * Bitmask based on the position of the field and
                * the BitfieldLength attribute to set the value;
                * the new bit must also be shifted to the field's position
                */
               set { data = (byte)((data & ~...) | ((value & 0x1) << ...));  }
         }

         [...]
}

Implemented this way it would be better than what GCC does: GCC retains the "illusion" that bit1 is a bit and does not like it if you try to pass a pointer to a "bit" to a function, whereas C# would pass a "bitmasked" byte as a reference / output parameter and should not have problems.
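
The masks and shifts left elided above can be sketched generically. The following C++ is only an illustration of the kind of code a compiler could generate; the names field_mask, get_field, set_field and Reg are hypothetical, and it assumes bit1 is the least significant bit, so bit6 sits at offset 5.

```cpp
#include <cstdint>

// Mask covering `length` bits starting at bit `offset` (offset 0 = least significant bit).
constexpr unsigned field_mask(unsigned offset, unsigned length) {
    return ((1u << length) - 1u) << offset;
}

// Read a field: isolate its bits, then shift them down to bit 0.
constexpr std::uint8_t get_field(std::uint8_t data, unsigned offset, unsigned length) {
    return static_cast<std::uint8_t>((data & field_mask(offset, length)) >> offset);
}

// Write a field: clear its bits, then OR in the new value shifted into place.
// Bits of `value` that do not fit in the field are discarded.
constexpr std::uint8_t set_field(std::uint8_t data, unsigned offset, unsigned length,
                                 std::uint8_t value) {
    return static_cast<std::uint8_t>(
        (data & ~field_mask(offset, length)) |
        ((static_cast<unsigned>(value) << offset) & field_mask(offset, length)));
}

// Property-like accessors for one field of the one-byte register.
struct Reg {
    std::uint8_t data;
    std::uint8_t bit6() const { return get_field(data, 5, 1); }
    void set_bit6(std::uint8_t v) { data = set_field(data, 5, 1, v); }
};
```

With these helpers, setting bit6 on an all-zero register yields 0x20, and clearing it on 0xFF yields 0xDF.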


@DerpMcDerp commented on Tue Feb 23 2016

C's bitfields aren't designed that well. I'd recommend basing a modern bitfield proposal on the BitField template ( https://github.com/v8/v8/blob/5.0.71/src/utils.h#L244 ) from Google Chrome's v8 JavaScript engine which fixes some of the problems:

e.g.

// specify the underlying member you want to use to back the bitfield
byte bit_field_;

// then create an overlay view over it
typedef BitField<char, 1, 2, decltype(bit_field_)> CharField;

1 is the starting bit position of the slice
2 is the bit length of the slice
char means how you want to interpret the bits of the bit slice as
decltype(bit_field_) is the underlying type of the slice

Then you can do stuff like:

CharField::decode((byte)bit_field_);
CharField::encode((char)value);
CharField::is_valid((char)value); // tests if value can fit in 2 bits
CharField::kMask; // returns (byte)0b0110, i.e. the mask
CharField::kShift; // returns 1, the starting bit position you specified
CharField::kNext; // returns the position of the next field, i.e.

    typedef BitField<char, 1, 2, decltype(bit_field_)> CharField;
    typedef BitField<int, CharField::kNext, 2, decltype(bit_field_)> IntField;
    // now IntField appears after CharField
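
For readers who want to experiment, here is a simplified C++ sketch modelled on the description above. It is not V8's actual code: the member names follow the comment, while update() and the is_valid implementation are assumptions of this sketch, which also requires the field size to be smaller than 64 bits.

```cpp
#include <cstdint>

// T: how the slice is interpreted, Shift: starting bit, Size: bit length,
// U: type of the underlying storage.
template <typename T, int Shift, int Size, typename U = std::uint32_t>
struct BitField {
    static constexpr int kShift = Shift;
    static constexpr int kSize = Size;
    static constexpr int kNext = Shift + Size;  // first bit after this field
    static constexpr U kMask = static_cast<U>(((U{1} << Size) - 1) << Shift);

    // True if value fits in Size bits (negative values never fit).
    static constexpr bool is_valid(T value) {
        return (static_cast<unsigned long long>(value) >> Size) == 0;
    }
    // Place value at the field's position; other bits are zero.
    static constexpr U encode(T value) {
        return static_cast<U>(static_cast<U>(value) << Shift);
    }
    // Extract the field from the storage word.
    static constexpr T decode(U storage) {
        return static_cast<T>((storage & kMask) >> Shift);
    }
    // Replace the field inside storage, keeping all other bits.
    static constexpr U update(U storage, T value) {
        return static_cast<U>((storage & ~kMask) | encode(value));
    }
};

// Two adjacent views over the same byte, chained through kNext.
using CharField = BitField<char, 1, 2, std::uint8_t>;
using IntField = BitField<int, CharField::kNext, 2, std::uint8_t>;
```

Because everything is a compile-time constant, the masks and positions cost nothing at run time, and the views never own the storage they describe.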

The reasons why this is far superior to C bitfields are:

  • You can overlay multiple BitFields over each other, which is useful for parallel operations and for interpreting a superset of bits as something else; there is no non-DRY way in C to do something like:
union object {
    uintptr_t data_;
    struct {
        uintptr_t ptr : 59;
        uintptr_t extra : 5;
    };
    struct {
        intptr_t fixnum : 62;
        bool : 1;
        bool isFixNum : 1;
    };
    struct {
        uintptr_t : 59;
        uint typecode : 3;
        bool gcbit : 1;
        bool : 1;
    };
};

C bitfields make this sort of thing such a pain that it's almost always better to resort to your own masking.

  • BitFields are ad-hoc. You can treat any arbitrary integer-like value as a bitfield just by overlaying the BitField view over it. With C, you're required to memcpy the data to the bitfield and then use the bitfield as if it were a view.
  • You can do fancy things like use the BitField view to have the compiler generate bitmasks and let you query things like min/max values. Doing it the C way would require something like new keywords or constexpr to recover that kind of info:
struct asdf {
    int foo : 2;
    int bar : 2;
    int baz : 3;
};

asdf a;

// these should be integral constant expressions:
// (values assume fields are allocated starting from the least significant bit)
bitmaskof(asdf::bar) | bitmaskof(a.bar); // 0b1100
bitsizeof(asdf::baz) | bitsizeof(a.baz); // 3
bitstartof(asdf::baz) | bitstartof(a.baz); // 4
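
The overlay and query points from the list above can be sketched with plain constexpr descriptors instead of new keywords. This is a hypothetical illustration: the Field type and the k-prefixed names are invented here, fields must be narrower than 64 bits, and the bit positions assume the union's members are allocated from the least significant bit, which C does not guarantee.

```cpp
#include <cstdint>

// A field is just a start position and a size; the storage is any 64-bit word.
struct Field {
    unsigned start;
    unsigned size;

    constexpr std::uint64_t mask() const {
        return ((std::uint64_t{1} << size) - 1) << start;
    }
    constexpr std::uint64_t get(std::uint64_t word) const {
        return (word & mask()) >> start;
    }
    constexpr std::uint64_t put(std::uint64_t word, std::uint64_t value) const {
        return (word & ~mask()) | ((value << start) & mask());
    }
};

// View 1: a 59-bit pointer plus five extra bits.
constexpr Field kPtr{0, 59};
constexpr Field kExtra{59, 5};

// View 2: the same five top bits read as type code, GC bit and fixnum flag.
constexpr Field kTypecode{59, 3};
constexpr Field kGcBit{62, 1};
constexpr Field kIsFixNum{63, 1};
```

Here kTypecode.mask(), kTypecode.size and kTypecode.start play the roles of bitmaskof, bitsizeof and bitstartof, and several views can describe the same word without a union.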

@GeirGrusom commented on Tue Feb 23 2016

With C, you're required to memcpy the data to the bitfield then use the bitfield as if it were a view.

Why couldn't you just use a pointer, which is a view in itself?


@DerpMcDerp commented on Wed Feb 24 2016

Type punning through a pointer/reference is undefined behavior since it violates C/C++'s strict type aliasing rules. Modern compilers break your code on high optimization levels if you do that.


@GeirGrusom commented on Sun Feb 28 2016

How is memcpy any less of a type violation than pointers? Aren't you just doing the same thing in a different storage location?

edit: never mind. Strict aliasing is the rule that lets C and C++ compilers assume that two pointers of different types don't point to the same data, so they may optimize away code that depends on them pointing to the same thing, producing a very hard-to-debug error. Copying resolves this.
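
A small C++ sketch of the difference, using a float and a 32-bit integer rather than a bitfield. The function names are made up for illustration, and the sketch assumes a 32-bit float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reading a float through a pointer to an unrelated type violates strict aliasing:
//   std::uint32_t bits = *reinterpret_cast<const std::uint32_t*>(&f);  // undefined behavior
// Copying the bytes instead is well defined.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// The reverse direction: reinterpret 32 bits as a float, again by copying.
inline float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Usage: the round trip preserves the value exactly.
inline void round_trip_demo() {
    assert(bits_to_float(float_bits(1.5f)) == 1.5f);
}
```

The copy happens between two distinct objects, so the compiler never sees one object accessed through two incompatible types.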

@yaakov-h

Contributor

commented Apr 21, 2017

Looks like this would largely address #457.

@fanoI


commented Apr 21, 2017

@gafter, does the fact that I see this issue here mean that you are considering adding this to the C# language in the future? Or does that only happen when I see "Champion"?

@sgf


commented Jun 1, 2019

I would like to request this too, like C/C++ bit fields:
https://docs.microsoft.com/en-us/cpp/c-language/c-bit-fields?view=vs-2019
https://docs.microsoft.com/en-us/cpp/cpp/cpp-bit-fields?view=vs-2019
This is very useful.
I am parsing IP and TCP headers, and there are many bit fields in them,
but it is hard to define them in C# right now.
One way is to use shift operations in property getters and setters,
but that code is hard to read.
If C# had this feature, it would become simple.

And why not just let C# use the same style as C/C++?

struct reg {
byte bit1: 1;
short bit3: 3;
ushort bit5: 5;
int bit32;
int bit64;
} 
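
For comparison, this is roughly what the shift-and-mask approach looks like for a few IPv4 and TCP header fields, sketched in C++. The function names are invented for this example; each function takes the single header byte that contains the field.

```cpp
#include <cstdint>

// First byte of an IPv4 header: version in the high nibble, IHL in the low nibble.
constexpr unsigned ipv4_version(std::uint8_t first_byte) {
    return first_byte >> 4;
}

// IHL counts 32-bit words, so the header length in bytes is IHL * 4.
constexpr unsigned ipv4_header_bytes(std::uint8_t first_byte) {
    return (first_byte & 0x0Fu) * 4u;
}

// Byte 12 of a TCP header: data offset (in 32-bit words) in the high nibble.
constexpr unsigned tcp_data_offset_words(std::uint8_t byte12) {
    return byte12 >> 4;
}

// Byte 13 of a TCP header holds the flags: SYN is 0x02 and ACK is 0x10.
constexpr bool tcp_syn(std::uint8_t flags) {
    return (flags & 0x02u) != 0;
}
constexpr bool tcp_ack(std::uint8_t flags) {
    return (flags & 0x10u) != 0;
}
```

Every field needs its own hand-written mask and shift, which is the readability cost described above.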