C# Bitfield struct support #465

gafter · 2017-04-21T00:41:14Z

gafter
Apr 21, 2017

@fanoI commented on Fri Oct 23 2015

Could support for struct bitfields be added to the language?
When talking to low-level hardware in C/C++, it is really natural to write something like:

typedef struct reg {
         unsigned char bit1: 1;
         unsigned char bit2: 1;
         unsigned char bit3: 1;
         unsigned char bit4: 1;
         unsigned char bit5: 1;
         unsigned char bit6: 1;
         unsigned char bit7: 1;
         unsigned char bit8: 1;
} reg_t;

reg_t reg;

To write the sixth bit one simply does reg.bit6 = 1, and reading the value of a bit is just as easy.
When it is time to write to the device, you simply cast the whole thing to char (the fields total 8 bits, that is, one byte) and use the write method.

But how could this struct be represented in C#?
For example in this way:

 [StructLayout(LayoutKind.Sequential, Pack=0)]
 public struct reg {
         [BitfieldLength(1)]
         unsigned byte bit1;

         [BitfieldLength(1)]
         unsigned byte bit2;

         [BitfieldLength(1)]
         unsigned byte bit3;

         [BitfieldLength(1)]
         unsigned byte bit4;

         [BitfieldLength(1)]
         unsigned byte bit5;

         [BitfieldLength(1)]
         unsigned byte bit6;

         [BitfieldLength(1)]
         unsigned byte bit7;

         [BitfieldLength(1)]
         unsigned byte bit8;
}

The compiler would probably inject some hidden methods to simulate access to a bit of reg (it would probably create a normal 8-bit hidden field and use bitmasks behind the scenes).

Obviously BitfieldLength could take any value. Maybe the only restriction would be that the total size of the struct must be a multiple of 8 bits (although Pack=0 is written in the struct attributes above, that should not imply that the compiler cannot add a hidden padding field at the end of the struct to align it to the size of a native type if it needs to).
With this bitfield struct in place one could easily create exotic integer types such as Int24 or Int128. For example, this is an Int24 struct:

 [StructLayout(LayoutKind.Sequential, Pack=0)]
 public struct Int24 {
         [BitfieldLength(24)]
         internal int m_value;
         public const int MaxValue = 0x7fffff;
         public const int MinValue = -0x800000;

         // Compares this object to another object, returning an integer that
         // indicates the relationship.
         // Returns a value less than zero if this object is less than value;
         // null is considered to be less than any instance.
         // If object is not of type Int24, this method throws an ArgumentException.
         //
         public int CompareTo(Object value) {
             if (value == null) {
                 return 1;
             }
             if (value is Int24) {
                 // Need to use compare because subtraction will wrap
                 // to positive for very large neg numbers, etc.
                 int i = ((Int24)value).m_value;
                 if (m_value < i) return -1;
                 if (m_value > i) return 1;
                 return 0;
             }
             throw new ArgumentException(Environment.GetResourceString("Arg_MustBeInt24"));
         }

         [...]
 }

At this point one could use it like a normal Int32: Int24 a = 42; would probably trigger an unwanted conversion from Int32, and for new integer types wider than 64 bits the values cannot be represented as literals, so this could probably be coupled with the ability to define our own constants.

@Joe4evr commented on Fri Oct 23 2015

I'm pretty sure this is a dupe, but I can't find the previous issue off-hand.

Anyway, getting the value of a single bit in the CLR is actually not so easy. For reference here's a good piece about booleans in the CLR, the most relevant part of which is:

Performance is also the core reason why a bool is not a single bit. There are few processors that make a bit directly addressable, the smallest unit is a byte. An extra instruction is required to fish the bit out of the byte, that doesn't come for free. And it is never atomic.

The closest thing you're going to get to this is Binary Literals (#215, code for which is already checked in), but I wouldn't hold my breath for awkwardly sized integer types.

@HaloFour commented on Fri Oct 23 2015

@Joe4evr

It was over on the CoreCLR repo: #1635

@fanoI commented on Fri Oct 23 2015

Yes, I reported the issue in the CoreCLR repo, but they said it would be better to do this in the language than to modify the CLR, so I cross-posted it here.

However, the idea is that the Roslyn compiler transforms that struct into something more digestible by the CLR, giving the developer the illusion of accessing the bits directly when in reality bitmasking is done (on one byte in the reg example). I don't expect .NET to make bits truly addressable, as indeed this would not map to any existing CPU.
I imagine GCC itself does a similar transformation internally, as my x86 certainly does not support bit addressing.

This is a possible way to transform it:

  [StructLayout(LayoutKind.Sequential, Pack=0)]
  public struct reg {
        internal uint data;

         /*  [BitfieldLength(1)]
          *  unsigned byte bit1;
           */
         public byte bit1
         {
               /* 
                * Bitmask based on the position of the field in the structure and 
                * the BitfieldLength attribute to get the value
                */
               get { return data & ...;  }
              /* 
               * Bitmask based on the position of the field and 
               * the BitfieldLength attribute to set the value
                */
               set { data = (data & ~... ) | (value & 0x1FFF);  }
         }

          /*  [BitfieldLength(1)]
          *  unsigned byte bit2;
           */
         public byte bit2
         {
               /* 
                * Bitmask based on the position of the field in the structure and 
                * the BitfieldLength attribute to get the value
                */
               get { return data & ...;  }
              /* 
               * Bitmask based on the position of the field and 
               * the BitfieldLength attribute to set the value
                */
               set { data = (data & ~... ) | (value & 0x1FFF);  }
         }

         [...]
}
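For comparison, here is a minimal hand-written sketch of what such a lowering could look like in today's C#, with the elided masks filled in for the single-bit case. The type name `Reg`, the property names, and the choice of putting bit1 in the least significant bit are assumptions made for illustration; they are not part of the proposal.

```csharp
using System;

// Hypothetical hand-written equivalent of the proposed bitfield struct.
// Assumption: bit1 maps to the least significant bit of the backing byte.
public struct Reg
{
    // Single backing byte; this is the value that would be written to the device.
    public byte Data;

    // Reads the bit at the given zero-based position (0 = least significant).
    private bool GetBit(int position) => (Data & (1 << position)) != 0;

    // Sets or clears the bit at the given zero-based position.
    private void SetBit(int position, bool value)
    {
        byte mask = (byte)(1 << position);
        Data = value ? (byte)(Data | mask) : (byte)(Data & ~mask);
    }

    public bool Bit1 { get => GetBit(0); set => SetBit(0, value); }
    public bool Bit2 { get => GetBit(1); set => SetBit(1, value); }
    public bool Bit6 { get => GetBit(5); set => SetBit(5, value); }
    public bool Bit8 { get => GetBit(7); set => SetBit(7, value); }
}
```

Setting `Bit6` on a fresh `Reg` leaves `Data` equal to 0x20, which is the byte one would then hand to the device write routine.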

Implemented this way it would be better than what GCC does: GCC retains the "illusion" that bit1 is a bit and does not like it if you try to pass a pointer to a "bit" to a function, whereas C# would pass a "bitmasked" byte as a reference / output parameter and should not have problems.

@DerpMcDerp commented on Tue Feb 23 2016

C's bitfields aren't designed that well. I'd recommend basing a modern bitfield proposal on the BitField template ( https://github.com/v8/v8/blob/5.0.71/src/utils.h#L244 ) from Google Chrome's v8 JavaScript engine which fixes some of the problems:

e.g.

// specify the underlying member you want to use to back the bitfield
byte bit_field_;

// then create an overlay view over it
typedef BitField<char, 1, 2, decltype(bit_field_)> CharField;

1 is the starting bit position of the slice
2 is the bit length of the slice
char means how you want to interpret the bits of the bit slice as
decltype(bit_field_) is the underlying type of the slice

Then you can do stuff like:

CharField::decode((byte)bit_field_);
CharField::encode((char)value);
CharField::is_valid((char)value); // tests if value can fit in 2 bits
CharField::kMask; // returns (byte)0b0110, i.e. the mask
CharField::kShift; // returns 1, the starting bit position you specified
CharField::kNext; // returns the position of the next field, i.e.

    typedef BitField<char, 1, 2, decltype(bit_field_)> CharField;
    typedef BitField<int, CharField::kNext, 2, decltype(bit_field_)> IntField;
    // now IntField appears after CharField
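A rough C# counterpart of that overlay idea might look like the sketch below. It is not a port of the v8 template: `BitSlice` and all of its members are hypothetical names, the backing value is fixed to `uint`, and the slice is described by runtime values rather than template arguments.

```csharp
using System;

// Hypothetical C# analogue of a bitfield "view": it describes a slice of a
// uint backing value by starting bit and length, and stores no data itself.
public readonly struct BitSlice
{
    public int Shift { get; }   // starting bit position of the slice
    public int Size { get; }    // bit length of the slice

    public BitSlice(int shift, int size)
    {
        if (shift < 0 || size <= 0 || shift + size > 32)
            throw new ArgumentOutOfRangeException(nameof(size));
        Shift = shift;
        Size = size;
    }

    // Mask covering the slice inside the backing value.
    public uint Mask => (uint)(((1UL << Size) - 1) << Shift);

    // Position where an adjacent following slice would start.
    public int Next => Shift + Size;

    // True if value fits in Size bits.
    public bool IsValid(uint value) => ((ulong)value >> Size) == 0;

    // Extracts the slice from a backing value.
    public uint Decode(uint backing) => (backing & Mask) >> Shift;

    // Returns backing with the slice replaced by value.
    public uint Update(uint backing, uint value)
    {
        if (!IsValid(value))
            throw new ArgumentOutOfRangeException(nameof(value));
        return (backing & ~Mask) | (value << Shift);
    }
}
```

With `new BitSlice(1, 2)`, `Mask` is 0b0110 and `Next` is 3, so a second slice can be declared as `new BitSlice(first.Next, 2)` without repeating the position by hand.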

The reasons why this is far superior to C bitfields are:

You can overlay multiple BitFields over each other, which is useful for parallel operations and for interpreting a superset of bits as something else. There is no non-DRY way in C to do something like:

union object {
    uintptr_t data_;
    struct {
        uintptr_t ptr : 59;
        uintptr_t extra : 5;
    };
    struct {
        intptr_t fixnum : 62;
        bool : 1;
        bool isFixNum : 1;
    };
    struct {
        uintptr_t : 59;
        uint typecode : 3;
        bool gcbit : 1;
        bool : 1;
    };
};

C bitfields make this sort of thing such a pain that it's almost always better to resort to your own masking.

BitFields are ad-hoc. You can treat any arbitrary integer like value as a bitfield just by overlaying the BitField view over it. With C, you're required to memcpy the data to the bitfield then use the bitfield as if it were a view.
You can do fancy things like use the BitField view to have the compiler generate bitmasks and let you query things like min/max values. Doing it the C way would require something like new keywords or constexpr to recover that kind of info:

struct asdf {
    int foo : 2;
    int bar : 2;
    int baz : 3;
};

asdf a;

// these should be integral constant expressions:
bitmaskof(asdf::bar) | bitmaskof(a.bar); // 0b1100
bitsizeof(asdf::baz) | bitsizeof(a.baz); // 3
bitstartof(asdf::baz) | bitstartof(a.baz); // 4

@GeirGrusom commented on Tue Feb 23 2016

With C, you're required to memcpy the data to the bitfield then use the bitfield as if it were a view.

Why couldn't you just use a pointer, which is a view in itself?

@DerpMcDerp commented on Wed Feb 24 2016

Type punning through a pointer/reference is undefined behavior since it violates C/C++'s strict type aliasing rules. Modern compilers break your code on high optimization levels if you do that.

@GeirGrusom commented on Sun Feb 28 2016

How is memcpy any less of a type violation than pointers? Aren't you just doing the same thing in a different storage location?

edit: nevermind. Strict aliasing is the rule that C and C++ compilers assume that two different pointers of different types don't point to the same data, so it may optimize away code that depends on them pointing to the same thing producing a very hard to debug error. Copying resolves this.

yaakov-h · 2017-04-21T07:11:51Z

yaakov-h
Apr 21, 2017

Looks like this would largely address #457.


fanoI · 2017-04-21T09:34:03Z

fanoI
Apr 21, 2017

@gafter does the fact that I see this here mean that you are considering adding this to the C# language in the future? Or does that only happen when I see "Champion"?


sgf · 2019-06-01T16:24:40Z

sgf
Jun 1, 2019

Request +1,
like C/C++ bit fields: https://docs.microsoft.com/en-us/cpp/c-language/c-bit-fields?view=vs-2019
https://docs.microsoft.com/en-us/cpp/cpp/cpp-bit-fields?view=vs-2019
This is very useful.
I am parsing IP, TCP and SCTP headers, and there are many bit fields in them.
It is hard to define those in C# right now.
One way is to use shift operations in property getters and setters,
but that code is hard to read.
If C# had this feature, it would become simple.

And why not just let C# have the same style as C/C++:

struct reg {
byte bit1: 1;
short bit3: 3;
ushort bit5: 5;
int bit32;
int bit64;
}
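To make the "shift operations in property getters and setters" approach concrete, here is a small sketch for the first byte of an IPv4 header, where the version sits in the high four bits and the header length (in 32-bit words) in the low four bits. The type and member names are made up for this example.

```csharp
using System;

// Hypothetical wrapper around the first byte of an IPv4 header.
public struct Ipv4FirstByte
{
    public byte Raw;

    // Version occupies the high nibble.
    public int Version
    {
        get => Raw >> 4;
        set => Raw = (byte)((Raw & 0x0F) | ((value & 0x0F) << 4));
    }

    // Internet Header Length, counted in 32-bit words, occupies the low nibble.
    public int HeaderLengthWords
    {
        get => Raw & 0x0F;
        set => Raw = (byte)((Raw & 0xF0) | (value & 0x0F));
    }

    // Convenience value derived from the field above.
    public int HeaderLengthBytes => HeaderLengthWords * 4;
}
```

Every field needs its own pair of mask and shift constants written out by hand, which is the readability cost described above.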

1 reply

This comment has been minimized.

Sign in to view

sgf · 2019-06-29T13:30:19Z

sgf
Jun 29, 2019

@gafter could this support be added as a C# 8.0 or C# 9.0 candidate?


gafter · 2019-06-29T19:02:14Z

gafter
Jun 29, 2019
Author

@sgf Since we're wrapping up C# 8, it doesn't make any sense to delay C# 8 by adding any features to its scope.

This proposal doesn't have any LDM champion, so there is no way to move it forward at this point. Until someone on the LDM takes it up and advocates it as important enough to invest in, it won't be planned for any particular release.


sgf · 2019-06-30T08:29:29Z

sgf
Jun 30, 2019

OK, thanks, got it.

@sgf Since we're wrapping up C# 8, it doesn't make any sense to delay C# 8 by adding any features to its scope.

This proposal doesn't have any LDM champion, so there is no way to move it forward at this point. Until someone on the LDM takes it up and advocates it as important enough to invest in, it won't be planned for any particular release.


Unknown6656 · 2019-07-28T11:58:08Z

Unknown6656
Jul 28, 2019

@sgf: I do like the syntax (and the proposal), ~~however, I am wondering whether some newbies to C# might find that confusing, e.g. the difference between S1 and S2 is only one character.~~

public struct S1
{
    public byte b : 1;
}

public struct S2
{
    public byte b = 1;
}

public struct S3
{
    public byte b : 1 = 1;
}

EDIT: Nevermind, one cannot initialize struct fields.


asimonf · 2021-08-13T18:51:22Z

asimonf
Aug 13, 2021

so... is this dead in the water? Talking with C code using packed structs is a boon in IoT applications with little memory. I already rely on DotNetCore on many Beaglebone projects of mine and those are usually fairly low-level.


fubar-coder · 2021-08-24T10:23:14Z

fubar-coder
Aug 24, 2021

Bit fields are highly OS/CPU specific and (IMO) only work well for programming languages whose code will always be compiled for a specific target platform. Here are some problems I see for a platform independent programming language like C#:

Ordering issues
- Does it start at the LSB or MSB?
- Should the bit order follow the current OS/CPU architecture the code runs on or should we always start at Bit 0 (as it would be masked by &1)?
- How do the bit order and/or values change between Linux on different platforms (x64 and ARM32 or ARM64, Apple Silicon, Little Endian/Big Endian)?
How are bit value overflows handled?
- Should it always be unchecked (more practical) or checked (depending on the current compiler settings) by default?

Explicit & masks and shifts should have the same performance (because this is what the compiler would do anyway), but they surely aren't as convenient.
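Both questions can be made explicit in hand-written code. The sketch below is only an illustration: `BitOrder`, `BitFieldOps` and their parameters are invented names, the backing value is assumed to be a 32-bit `uint`, and offsets are not validated.

```csharp
using System;

// Hypothetical switch for where field offsets are counted from.
public enum BitOrder { LsbFirst, MsbFirst }

public static class BitFieldOps
{
    // Converts a declared field offset into a shift from the least
    // significant bit of a 32-bit backing value.
    private static int ShiftFor(int offset, int length, BitOrder order)
        => order == BitOrder.LsbFirst ? offset : 32 - offset - length;

    // Mask with the low `length` bits set.
    private static uint MaskFor(int length)
        => length == 32 ? uint.MaxValue : (1u << length) - 1;

    public static uint Read(uint backing, int offset, int length, BitOrder order)
        => (backing >> ShiftFor(offset, length, order)) & MaskFor(length);

    // throwOnOverflow = true behaves like a "checked" store;
    // false silently truncates the value to the field width.
    public static uint Write(uint backing, int offset, int length,
                             BitOrder order, uint value, bool throwOnOverflow)
    {
        uint mask = MaskFor(length);
        if (throwOnOverflow && (value & ~mask) != 0)
            throw new OverflowException("Value does not fit in the field.");
        int shift = ShiftFor(offset, length, order);
        return (backing & ~(mask << shift)) | ((value & mask) << shift);
    }
}
```

Reading a 1-bit field at offset 0 from the value 1 yields 1 with `LsbFirst` and 0 with `MsbFirst`, which is the portability difference the first question is about.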

1 reply

asimonf Aug 24, 2021

Ordering can be solved by providing an attribute hint: Respect endianness, MSB-first or LSB-first. This would make it easy to write portable code that interacts with native libraries easily using platform defines.

Overflows don't have to be handled. This functionality could be limited to unsafe contexts because the primary use-case is interacting with native structures that have specific layouts. Making the ordering explicit would leave implementation details to the user of the code, since ordering is indeed not guaranteed to be the same everywhere. Managed solutions with small memory footprints don't have to rely on this.

When writing code bindings, making it inconvenient to map C structs to C# structs makes the code less readable and maintainable. This tool, while niche, would help reduce that maintenance burden. The language should have more unsafe niceties like this and the recent native ints. It makes interop code more readable, as messy as it already generally is. Using explicit & masks makes the code less readable and could lead to a lot of repetition for structures having a lot of bitfields.

Rekkonnect · 2021-11-13T17:23:58Z

Rekkonnect
Nov 13, 2021

I toyed around with a bitfield struct that I want for a project, and I could see a simplified version of that declaration. Specifically:

My current implementation involves declaring this struct:

private struct Info
{
    // Flags
    private const int
        HasDiedFlag = 1 << 31,
        HasPromotedFlag = 1 << 30,

        AllFlagMask = HasDiedFlag | HasPromotedFlag;

    private int bits;

    // Bring C back:tm:
    public int MoveTimes
    {
        get => bits & ~AllFlagMask;
        set => bits = (bits & AllFlagMask) | (value & ~AllFlagMask);
    }

    public bool HasDied
    {
        get => HasFlag(bits, HasDiedFlag);
        set => Toggle(ref bits, HasDiedFlag, value);
    }
    public bool HasPromoted
    {
        get => HasFlag(bits, HasPromotedFlag);
        set => Toggle(ref bits, HasPromotedFlag, value);
    }

    public void RegisterMoveTime() => bits++;
    public void UnregisterMoveTime() => bits--;

    private static bool HasFlag(int bits, int value) => (bits & value) != 0;
    private static void Toggle(ref int bits, int value, bool toggle) => bits = Toggle(bits, value, toggle);
    private static int Toggle(int bits, int value, bool toggle)
    {
        return toggle switch
        {
            true => bits | value,
            false => bits & ~value,
        };
    }
}

I came up with a more concise syntax for this declaration, like the following:

private bitfield struct Info
{
    public bool HasDied[^1];
    public bool HasPromoted[^2];
    public int MoveTimes[...]; // would become ..^2

    public void RegisterMoveTime() => bits++;
    public void UnregisterMoveTime() => bits--;
}

Within the brackets following the property, a constant System.Index or System.Range is provided, which reflects the index of the bits in the bitfield that the property represents. [...] denotes that the property's bits are the remaining unreserved ones. Conflicting ranges are by default enabled (potentially able to opt-out).

The bits is an auto-generated private field of the type, whose type is the corresponding unsigned primitive integer one for the struct's size (byte, ushort, uint, ulong for sizes 1, 2, 4, 8 respectively). For sizes other than 1, 2, 4 and 8, a fixed byte bits[Size] buffer is used instead.

In the above case, it is assumed that the type of the bitfield struct is int, like enums. For a different size, a primitive struct type representing the size would have to be specified, or a raw number. Where that number would be put is not something I've considered yet, though I believe it could follow enum's steps and have it be specified after a :. For example, for a bitfield of size 8, you would have the following options:

bitfield struct B : ulong
bitfield struct B : 8


sgf · 2022-02-12T17:54:26Z

sgf
Feb 12, 2022

None of your words have anything to do with a constructive discussion revolving around C#. It doesn't help shape its development, or provide any progress on the proposal itself. This is not the appropriate place for such conversations.

OK, go ahead. I don't think the .NET team will implement this feature until 2025. If we are not lucky enough, even in 2030 it may not be taken seriously.

6 replies

pedoc Dec 3, 2022

but it currently seems to have very little interest from the community.

@CyrusNajmabadi
The importance of this function is related to the country and the existing market environment.

In China, for example, there is clearly no demand for this feature in the web development world.

But the reality is that most .NET developers cannot compete with Java web developers, and most of them are now concentrated in lower-level fields (such as medical industry hardware, etc.), and these jobs need to interact frequently with C code or low-level protocols. For these people, this feature is very important, but they don't often comment here; that doesn't mean they are not interested.

CyrusNajmabadi Dec 3, 2022
Collaborator

For these people, this feature is very important, but they don't often comment here

This seems unlikely. I don't see what about that community would make them unlikely to give feedback to us through our myriad channels that are available.

pedoc Dec 3, 2022

This seems unlikely. I don't see what about that community would make them unlikely to give feedback to us through our myriad channels that are available.

This can be hard to empathize with. At least, in China, GFW prevents many people from accessing github. This is not a technical problem, but it exists objectively.

In addition, for most practitioners in industries such as medical care or hardware, they are more accustomed to using C, and currently C# lacks some key functions to attract them (such as bitfield).
At the same time, practitioners in traditional industries are not as active as developers in the Web field.

Of course, I can't speak for all of these people. But I'll try to make them appear here as well. 😄

sgf Dec 3, 2022

For these people, this feature is very important, but they don't often comment here

This seems unlikely. I don't see what about that community would make them unlikely to give feedback to us through our myriad channels that are available.

As for the feedback you mentioned, in fact, you may not know that I can now access github normally by purchasing an expensive VPN server.
Otherwise, the github website is intermittently unavailable where I am (unstable, often unable to connect, this ratio can usually reach 70% or more, not only the webpage, but git itself cannot be used normally).
I don't want to mention politics too much, and of course I don't like the hypocrisy of Chinese politics. But being born in one place and wanting to immigrate, etc., it's not easy, even for 99.9% of ordinary people.
I have already made a long description, and I don’t want to continue. Whether bitfield is realized or not, it’s up to you. After all, we are just ordinary people, and many things cannot be promoted.

CyrusNajmabadi Dec 3, 2022
Collaborator

The presumption here is that github is the only source for feedback. It is not. What I'm saying is that across all our avenues for feedback, bitfield interest seems middling.

CyrusNajmabadi · 2023-02-24T22:04:08Z

CyrusNajmabadi Feb 24, 2023
Collaborator

@sgf your post was removed for violating the .net organization's code of conduct. Please be constructive and keep your posts on topic and not personal. Thanks.

sgf · 2023-02-24T22:05:51Z

sgf Feb 24, 2023

It seems like this would be trivially solvable by just having helper methods (potentially generated by a tool) that most allow you to read/write these fields trivially and not have to do any of the manual work above.

A Japanese developer tried to implement the kind of easy-to-write class library you describe, one that supports bit fields. The author has worked on the repository for many years, but it is still not perfect. I tried it and found it to be barely there and not very useful. I think his technical ability is above mine. Here is the repository link: https://github.com/ufcpp/BitFields . If you're interested in trying and learning, you can try it out yourself.

CyrusNajmabadi · 2023-02-24T22:26:23Z

CyrusNajmabadi Feb 24, 2023
Collaborator

I tried it and found it to be barely there and not very useful.

Sounds like feedback to give them. Or consider contributing improvements as it's open source.

ufcpp · 2023-02-24T22:39:47Z

ufcpp Feb 24, 2023

I look forward to your contribution, @sgf.

msedi · 2022-12-03T13:11:52Z

msedi
Dec 3, 2022

Isn't this also a matter of performance? How is the C bitfield translated into assembler code? If the source generator only does bit fiddling identical to what C does and the resulting performance is the same, I would agree that a source generator would be sufficient. That is what the source generator could solve, but it looks a bit unnatural to do this with methods. If partial properties existed, that would be the better choice.

I have to read gigabytes of data that are bitfields, fiddling them out "works" but is really not nice (but I have never measured the difference between a direct read and the bit-fiddling).

I think currently in the case of this interop many people are using C++/CLI which brings its own complications... I have never met someone who really likes it.

Nevertheless, if performance could be improved there are few areas where it could help:

There are many structs in windows that deal with bitfields
File I/O. Reading the bytes directly into a struct would be for sure more performant than splitting them later.
Network communication. I think many network low level APIs would benefit

5 replies

Rekkonnect Dec 3, 2022

Isn't this also a matter of performance? How is the C bitfield translated into assembler code?

The bit operations are not elided; they're simply hidden behind syntax sugar for convenience. You're very prone to making mistakes with your bitfields, and they're too annoying to write manually once you write enough of them. You do not lose a lot of performance either: modern hardware includes features (instruction cache lines, pipelining, branching, etc.) that would barely even slow your performance down, and if that happens, you may be interested in microbenchmarking your application anyway for fine-tuning. Bit operations are the easiest for your computer; multiplications and divisions are significantly slower and more demanding at a clock cycle level (exact details per architecture may vary), and even additions are worse than bitwise operations.

A source generator can be as effective as the one for regex is. It can spit out the bitwise operators for each individual field, and optimize itself for aligned fields.

msedi Dec 3, 2022

You're very prone to making mistakes with your bitfields

I absolutely agree, so I'm fine with everything that helps on that

modern hardware includes features

Understood, the question then is whether the JIT produces the same "performance" as a C compiler

you may be interested in microbenchmarking

and you are right here. That would then be necessary.

Rekkonnect Dec 3, 2022

Understood, the question then is whether the JIT produces the same "performance" as a C compiler

That's a runtime concern which applies to the currently available bitfield support using FieldOffset and your own methods for masking the bits and retrieving the values of your fields. Introducing source generators will basically delegate the process of writing the necessary methods to get the values of the fields; they still produce C# at compile time that then gets compiled down to IL. The C# that the source generators produce should be carefully crafted to contain the fewest possible bitwise operations when getting or setting the fields in your bitfield struct. That's as far as source generators are concerned, and the maximum performance you can get when writing C# on your own.

CyrusNajmabadi Dec 3, 2022
Collaborator

Isn't this also a matter of performance?

Not at the c# level. We'll just emit the same il that you would write out by hand (or have a generator emit).


pawchen · 2023-03-26T04:52:38Z

pawchen
Mar 26, 2023

Recently I'm playing with Vulkan ray tracing with my own generated C# library. At first I was crashing the driver randomly. After days of API dumping and comparing I found that their VkAccelerationStructureInstanceKHR has bit fields defined. My buggy Marshal.SizeOf<AccelerationStructureInstanceKHR>() returned 72 while their Vulkan-Hpp sizeof(vk::AccelerationStructureInstanceKHR) returns 64.

My mapping struct was generated anyway, so I can add code in the generator to emit bit-fiddling property accessors. But it would be nice if C# had similar features so the code gen could be much more straightforward.
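A sketch of such a mapping with hand-written accessors is shown below. The C# type and member names are invented for this example. The layout follows the C declaration of VkAccelerationStructureInstanceKHR (a 3x4 float matrix, two 32-bit words that each hold a 24-bit and an 8-bit field, and a 64-bit reference), and it assumes the 24-bit fields occupy the low bits of their words.

```csharp
using System;
using System.Runtime.InteropServices;

// 3x4 row-major float matrix, 48 bytes.
[StructLayout(LayoutKind.Sequential)]
public struct TransformMatrix3x4
{
    public float M00, M01, M02, M03;
    public float M10, M11, M12, M13;
    public float M20, M21, M22, M23;
}

// Hypothetical managed mirror of the native instance struct.
[StructLayout(LayoutKind.Sequential)]
public struct AccelerationStructureInstance
{
    public TransformMatrix3x4 Transform;          // 48 bytes
    private uint customIndexAndMask;              // 24-bit index + 8-bit mask
    private uint sbtOffsetAndFlags;               // 24-bit offset + 8-bit flags
    public ulong AccelerationStructureReference;  // 8 bytes

    public uint InstanceCustomIndex
    {
        get => customIndexAndMask & 0x00FFFFFF;
        set => customIndexAndMask = (customIndexAndMask & 0xFF000000) | (value & 0x00FFFFFF);
    }

    public byte Mask
    {
        get => (byte)(customIndexAndMask >> 24);
        set => customIndexAndMask = (customIndexAndMask & 0x00FFFFFF) | ((uint)value << 24);
    }

    public uint ShaderBindingTableRecordOffset
    {
        get => sbtOffsetAndFlags & 0x00FFFFFF;
        set => sbtOffsetAndFlags = (sbtOffsetAndFlags & 0xFF000000) | (value & 0x00FFFFFF);
    }

    public byte Flags
    {
        get => (byte)(sbtOffsetAndFlags >> 24);
        set => sbtOffsetAndFlags = (sbtOffsetAndFlags & 0x00FFFFFF) | ((uint)value << 24);
    }
}
```

Because each pair of bitfields shares one `uint`, `Marshal.SizeOf<AccelerationStructureInstance>()` comes out at 64 bytes, matching the native size quoted above.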


tamlin-mike · 2023-04-01T10:34:37Z

tamlin-mike
Apr 1, 2023

Funny how it never rains, it pours. :-)

Just last week I found myself in a gfx lib situation where this issue came up, though with voxel generation instead of Vulkan.

I encountered a struct with a bunch of bools. No biggie, right? Well, if you deal with 100 million instances it can be. So, I changed those bools to properties (memory pressure had become more important than the performance hit) and was then about to add the backing stores as bitfields.

Oh...

Yeah, I was forced to write that usual ugly mass of repetitive manual code for each and every accessor, and thanks to the wonderful feature "no macros" one can't lessen the pain even a little bit.

That sucks.

Thinking back through C# code I've seen over the last decade, and the number of hand-rolled implementations of this language deficiency, I can state with 100% certainty that this deficiency has hurt people and code for a long time.

So yeah, this language feature def. has my
+1.

But I wonder... most if not all use-cases I have encountered in C# have been around treating data as packed bool's. This scenario is semantically different from e.g. interfacing with hardware or dealing with protocol data, where values can be represented by 2 or more bits, and some scenarios can allow for both signed and unsigned quantities (e.g. RLE-encoded data).

Could that differentiation possibly warrant two separate proposals, and data types? It could seriously lower the bar to specification and implementation of the simpler case.
@gafter, @CyrusNajmabadi comments?

In case someone is tempted to "solve" the packed-bools at a combination of library- + compiler level (like ValueTuple), forget it. You would have to manually write (at least) 64 overrides for each of the 1-64 generic's. 🤣
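For the packed-bools case specifically, one existing building block is `System.Collections.Specialized.BitVector32`, which packs up to 32 flags into a single `int`. The sketch below only illustrates that type; `VoxelFlags` and its property names are hypothetical, and each accessor still has to be written by hand.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical 4-byte flag holder built on BitVector32.
public struct VoxelFlags
{
    // CreateMask() returns the first mask; CreateMask(previous) returns the next one.
    private static readonly int SolidMask = BitVector32.CreateMask();
    private static readonly int VisibleMask = BitVector32.CreateMask(SolidMask);
    private static readonly int DirtyMask = BitVector32.CreateMask(VisibleMask);

    private BitVector32 bits;

    public bool IsSolid { get => bits[SolidMask]; set => bits[SolidMask] = value; }
    public bool IsVisible { get => bits[VisibleMask]; set => bits[VisibleMask] = value; }
    public bool IsDirty { get => bits[DirtyMask]; set => bits[DirtyMask] = value; }

    // Raw packed value, useful for serialization or debugging.
    public int Raw => bits.Data;
}
```

This keeps the mask arithmetic out of user code, but it does not remove the per-property boilerplate and it stops at 32 bits.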

A few random ideas on the design:

- Implementation must not require changes to the CLR; that would set the bar too high for a proposal to effectively ever be accepted.
- Allow byte-order specification at declaration. For the case I had, and many others I have seen, the packed bools' byte ordering was irrelevant, but for e.g. hardware interfacing or protocol data the byte order is vital.
- To not even suggest byte-order is an attribute, don't use attribute-like syntax for byte-order specification.
- No surprises on the bit-ordering within a byte. First "virtual field" within a byte is at bit 0, and consecutive fields are in consecutive bits. (I kinda like that definition, "virtual field")
- Use (require) inherits-syntax to specify backing-field size. Must be one of the predefined u* integral types. (to consider; allow future use of types from Numerics as base, to allow for e.g. AVX512 instructions to handle masking?) (for bit-sizes 8/16/32 the JIT would love MC68k - it has bitfield instructions).
- Allow syntactically direct access to the value (a.k.a. the "backing" field). This is vital, for numerous reasons.
- Force byte-order specification to be a compile-time constant, so you can write code like what's done in e.g. Numerics ( if (T == typeof(ushort)) {} else if (T == typeof(uint)) {} ...) to give the JIT enough information to do its job properly.
- The auto-generated properties (each with the same name as the "virtual field" it represents) always have public visibility to reduce useless clutter.
- Consideration: To simplify spec. and impl., would it make sense to disallow user-specified methods in the otherwise only-compiler-generated type (backing_t below)? I'd say yes, it does. Should the need turn out to overshadow the added complexity, it could be added at a later stage.

If limiting the following to the packed-bool's scenario, could something like the following provide food for thought, or even work? Since this scenario is much, much simpler than having variable-size and variable-type packed fields, it makes sense to me to explore this first.

Working name PackedBool.

Considerations:

- No changes to CLR. That limits us to play with attributes and typenames of generated types.
- The byte-order of every generated PackedBool type must be a compile-time constant that the user can access (see comment on Numerics above). The only (?) way we can encode this data is as a generic type-argument.
- Since we can't switch on enumeration values at compile-time, LittleEndian, BigEndian and Native must be types.
- Since (if?) we don't want to allow the user to make up willy-nilly endian types, those three types can not only be part of a namespace - they must be part of a finalizable construct, which gives us that they are inner types of a sealed type. Let's call the containing type ByteOrder, and the namespace it lives in System.PackedBool.
- Since we're forced to use generics and type-arguments, to limit endianness in a where clause the three types must inherit a common base. Since we can't inherit a static class in C#, this prevents them from being static, forcing them to be sealed.

(Please note that everything herein is written directly in a web browser. No code is tested in any way or form. Consider this a thought experiment.)

This leads to something like:

namespace System.PackedBool
{
  public static class ByteOrder
  {
    public class Endianness {}
    public sealed class LittleEndian : Endianness {}
    public sealed class BigEndian : Endianness {}
    public sealed class Native : Endianness {}
  }
}

I now might have gone overboard a bit in the refactoring and design. 😄 Fingers crossed it still makes sense.

namespace System.PackedBool {

// This allows for easier optimal assembler implementations in the
// (to-native) compilers for U = byte/ushort/uint/ulong,
// and makes potential later conversion-to-intrinsics almost trivial.
// NOTE: the bitwise operators on an unconstrained U below remain pseudocode;
// C# does not define them for an arbitrary type parameter.
internal struct Data<U>
{
  internal U val;

  // NOTE
  // No error checking for the value of bitNo. The calling
  // compiler-generated code is expected never to generate bitNo >= (sizeof(U)*8).

  // Instance (not static) because it reads val; internal because structs
  // cannot declare protected members.
  [PerfCritical, Inline]
  internal bool GetBoolBit(int bitNo) => GetBit(bitNo);

  [PerfCritical, Inline]
  internal void SetBoolBit(int bitNo, bool bSet) {
    if (bSet)
      SetBit(bitNo);
    else
      ClearBit(bitNo);
  }

  /// <remarks>
  /// Not written for speed, but for clarity (to match Clear/Set).<br/>
  /// For speed we would shift down, mask with 1, and use that bit as a bool.<br/>
  /// Maybe profiling will show a lookup table is even better?<br/>
  /// The seemingly repetitive typeof checks follow the pattern from Numerics,<br/>
  /// to allow the to-native compiler (be it JIT or AOT) to generate code only for<br/>
  /// the specific type.
  /// </remarks>
  // (U)1 so the shift happens at the width of U; an int 1 breaks for bitNo >= 32.
  [PerfCritical, Inline]
  private bool GetBit(int bitNo) => (val & ((U)1 << bitNo)) != 0;

  /// <remarks>
  /// Might need manual inlining in case the to-native compiler screws up.
  /// Might need Interlocked* atomicity, but that performance price is
  /// way too high to pay for single-thread access. Maybe we need an AtomicData
  /// and AtomicPackedBool<T,U>?
  /// </remarks>
  [PerfCritical, Inline]
  private void ClearBit(int bitNo) => val &= ~((U)1 << bitNo);

  [PerfCritical, Inline]
  private void SetBit(int bitNo) => val |= ((U)1 << bitNo);
}

/// <summary>Please Wait...</summary>
/// <remarks>
/// While we would like to add a constraint like 'where U : UnsignedIntegralNumber',<br/>
/// the language has no concept of even a Number constraint.<br/>
/// It is however required that U is one of the predefined unsigned integral types.<br/>
/// As we have no compile-time asserts, the best we could do is manually check<br/>
/// that U is one of the allowed types, else throw some kind of Domain exception.<br/>
/// It sucks to not be able to get a compiler error, though, for known invalid use.
/// </remarks>
class PackedBool<T, U> where T: System.PackedBool.ByteOrder.Endianness {
  Data<U> _value;

  U value { get => _value.val; set => _value.val = value;  }

  public static T ByteOrder;

  [PerfCritical, Inline]
  public bool GetBit(int bitNo) => _value.GetBoolBit(bitNo);

  [PerfCritical, Inline]
  public void SetBit(int bitNo, bool bSet) => _value.SetBoolBit(bitNo, bSet);
}

} // namespace System.PackedBool

Now let's explore how a concrete case could look:

// ByteOrder absence implies 'Native'
// packedbool (placeholder keyword named) is a new type-spec,
// sibling to 'struct' and 'class'.
// ByteOrder is a new context-limited keyword, like 'where' in generic's.
packedbool Backing_t : ulong, ByteOrder:Native /*LittleEndian|BigEndian|Native*/
{
  // User-specified identifiers.
  // Compiler auto-generates properties with these names, and get/set
  // with appropriate IL.
  // The generated properties have 'public' visibility. Not only
  // to reduce declaration clutter, but because nothing else
  // makes sense if user-defined methods are not allowed in this type.
  bool b0, bit1, blurbiBit2; // just to display it follows ordinary syntax
  bool bit3; /* ...b63 */
}

void AccessBits(Backing_t bits)
{
  bool b0 = bits.b0;
  bits.blurbiBit2 = true;
  var val = bits.value; // typeof(val) = ulong
  // The runtime ByteOrder branch taken decides what native code is
  // generated (by either JIT or AOT).
  // TODO: Has the JIT (and AOT) been updated to handle switch-expressions
  // for this idiom, removing unused code for non-matching types?
  if (typeof(bits.ByteOrder) == typeof(ByteOrder.Native)) {
    ...
  }
  else if (typeof(bits.ByteOrder) == typeof(ByteOrder.LittleEndian)) {
    ...
  }
  ...
}
void VerifySize() { // leftovers from before refactoring. Saved for historians.
  Assert(compiletime_sizeof(backing_t.base) == compiletime_sizeof(ulong));
  Assert(compiletime_sizeof(backing_t) == compiletime_sizeof(backing_t.base));
}

The source construct
packedbool Backing_t : ulong, ByteOrder:Native
could be parsed into something like
KEYWORD_PACKEDBOOL IDENTIFIER ':' TYPE
followed by optional
BYTEORDER = ',' ('ByteOrder:Native'|'ByteOrder:LittleEndian'|'ByteOrder:BigEndian')

generating the declaration

$"public sealed struct {IDENTIFIER} : System.PackedBool.PackedBool<{RemapByteorder(BYTEORDER)}, {TYPE}>"

that in the concrete example would result in

public sealed struct Backing_t : System.PackedBool.PackedBool<ByteOrder.Native, ulong>
{
  public bool b0 { get => GetBit(0); set => SetBit(0, value); }
  public bool bit1 { get => GetBit(1); set => SetBit(1, value); }
  ...
}

phew

Just my 0.02. 😄

14 replies

vladd Oct 24, 2023

@Rekkonnect As of now for the bitfields, there are no two or whole pack at the moment, right? Only one.

It's fine to rely on the open source magic when it works, but is there a guarantee that it will work long enough? As a counter-example, my work was relying on Unity container until the magic disappeared.

CyrusNajmabadi Oct 24, 2023
Collaborator

It's fine to rely on the open source magic when it works, but is there a guarantee that it will work long enough?

There are no guarantees, period.

But if it's open source, you can always fork and do what you want with it.

That remains true if we do the work or if someone else does.

vladd Oct 24, 2023

There are no guarantees, period.

That's exactly what I'm trying to say and that's what makes me skeptical about open source. Forking myself means that there must be a skilled and motivated developer within my organization, so the organization will have to care to maintain the feature. In this case, it's usually easier to write custom code which just solves the problem at hand for the organization, and doesn't implement the feature in all its generality. Which usually means "no bitfields but just manual shifts/ands/ors and some unit testing".

HaloFour Oct 24, 2023

Given the alternative of not having a solution, an open source approach sounds good. That's why Roslyn exists and offers all of these extension points, because Microsoft and the C# team are, by far, the biggest bottleneck. Better to offer some way to achieve a solution than none at all.

Rekkonnect Oct 24, 2023

@vladd this goes for every single bit of third-party open-source projects you actively rely on, if at all. In that sense, whatever functionality you want should come from a first party with official support, as you would not be able to trust some third party doing a good job at implementing it.

And let's not forget again, it's just a handy shorthand for implementing bitfield structs to improve your performance, one that generally follows a very basic pattern that many lower-level developers know and apply. So it should be trivial and easy to implement the source generator, and to spot bugs or mishaps in the provided solution, as you can also take a peek at the generated source for your use case. I would really not make any fuss about this specific feature.

As of now for the bitfields, there are no two or whole pack at the moment, right? Only one.

I don't see any restriction on this growing further. If you spot a bug, or it does not fit your use case, you can dive into that and make a fix yourself. That specific repo you mention does not use modern source generators, but T4 templates, which are even friendlier for newbies. If anybody, including you, feels that it needs to use the newer generators, they are free to fork the project and make the necessary adjustments, or create a new project for this purpose.

The only problem with this is the discoverability, which always happens with open-source projects. The only solution is for the devs to properly name and tag their projects to help discoverability, and for users to search for NuGet packages and GitHub repositories with keywords related to their goal.

nathan130200 · 2023-06-27T10:11:44Z

nathan130200
Jun 27, 2023

We need this ASAP. I'm tired of making huge workarounds to manage structs with bit fields.

11 replies

Rekkonnect Oct 9, 2023

Like I've said before, source generators have terrible docs

Yes but you can still see how other generators have been developed and make your own. I've written plenty, and getting past the point of setting it up, releasing it as a NuGet package and integrating it into your projects, it's really nothing complicated, if you got the hang of the language. And if you want to participate and express an opinion, it is more sensible to have a good sense of the language itself before diving head-first with your opinion.

As for the example you provided, I find it overly unusable. If you need to work with bits, you need to have performance in mind. Bit fiddling requires math, and if you absolutely have to do it, it means you have performance requirements. This means that ToString and Select are totally out of the question; you use a List / List instead of IEnumerable for arbitrarily sized bit sets, or use an existing bit table that somebody else might have developed. The example above is irrelevant to the proposal and provides a far worse and more nuanced solution to a bit field struct.

nathan130200 Oct 10, 2023

The point is, developing a source generator is not just copy/pasting other people's code. It doesn't make sense for me to simply look at other source generators and then not understand what each thing does. So there should be more decent documentation that addresses this.

Regarding bits, I still prefer to work with bool arrays in certain things, it is much more practical in certain scenarios.

Rekkonnect Oct 10, 2023

The point is, developing a source generator is not just copy/pasting other people's code. It doesn't make sense for me to simply look at other source generators and then not understand what each thing does.

I never said copy-paste code. If you wanted to learn how they work, you could very well clone a repo containing the source generator you discovered, debug some tests maybe, explore the available (not the unavailable) documentation and figure out what to do. A generator for bit field structs wouldn't be doing anything too complex, so the pain point will be setting it up to work, more than getting your generated code to be valid.

koszeggy Oct 11, 2023

@nathan130200 How is your BitSet better than BitVector32? Or, if you have more than 32 bits you still can use a BitArray and use its CopyTo to get the bits as a byte/int array.
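For readers following along, the BitArray route mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from the thread; the class and method names (BitArrayDemo, PackFlags) are made up for the example, and it only handles up to 8 flags.

```csharp
using System.Collections;

public static class BitArrayDemo
{
    // Packs up to 8 bools into one byte; flags[0] lands in the least significant bit.
    // Hypothetical helper for illustration only.
    public static byte PackFlags(params bool[] flags)
    {
        var bits = new BitArray(8);
        for (int i = 0; i < flags.Length && i < 8; i++)
            bits[i] = flags[i];

        // CopyTo accepts a byte[] (as well as int[] and bool[]) as the target.
        var buffer = new byte[1];
        bits.CopyTo(buffer, 0);
        return buffer[0];
    }
}
```

Note that BitArray is a heap-allocated class, so this is a convenience sketch rather than something suited to a hot path.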

nathan130200 Oct 11, 2023

I didn't even notice this before. I usually work with between 4 and 8 bits on average.

Rekkonnect · 2023-06-28T06:42:20Z

Rekkonnect
Jun 28, 2023

Seeing the recent surge in activity, alongside having given my opinion in the past, I've started to lean towards the generator side of the topic. Adding such a feature would be purely cosmetic for nothing but a very apparent case in a niche area, which involves low-level bit management. This feature would not enable previously unavailable constructs being expressed in legal C#, and thus a generator sounds more reasonable.

That being said, generators are barely being developed, and from what I've heard elsewhere they are still very immature in nature, with the major annoyances that I have also encountered myself being compilation performance, IDE compatibility and getting them to work in both the producer and the consumer side.

Once generators mature and the tooling becomes friendlier, this will be something completely trivial to build and package. For now my only guess would be that nobody bothered to write a generator for this, and many other proposals that deserve their generator for two reasons:

Microsoft didn't build a generator for that, like they did for INotifyPropertyChanged, Regex, etc.
users didn't build a generator for that because they were busy not dealing with successfully releasing a publicly consumable generator

2 replies

CyrusNajmabadi Jun 28, 2023
Collaborator

That being said, generators are barely being developed,

Generators are being actively developed. And there are many first class MS generators shipped in .net, and many more being added each release.

and from what I've heard elsewhere they are still very immature in nature, with the major annoyances that I have also encountered myself being compilation performance,

Compilation performance is fine if you properly use incremental generators. Which should be possible here as I would expect bitfields to use ForAttributeWithMetadataName to find the appropriate information to generate from.

IDE compatibility and getting them to work in both the producer and the consumer side.

The only part of generators that is a bit weak right now is that some authors prefer to develop them "live" in VS (versus writing unit tests for them). Namely, they want to iterate on the generator, while using that same generator in the same session of VS (i.e they don't want to restart VS). This is not supported due to VS running on .net framework. So our recommendation instead is to do this like analyzers and have a fast inner loop where you are iterating on the generator while validating it with unit tests.

am11 Jan 1, 2024

Generators are being actively developed. And there are many first class MS generators shipped in .net, and many more being added each release.

Also, the community supported generators are increasing, here is a good list: https://github.com/amis92/csharp-source-generators (dependency injection ones are especially very useful to speedup the webapps startup).

pedoc · 2023-06-28T06:50:03Z

pedoc
Jun 28, 2023

Out of interest, has anyone measured the difference between the hassle and complexity of implementing a primary constructor versus implementing a Bitfield struct?
In contrast, I think this feature is more attractive than the primary constructor.

1 reply

CyrusNajmabadi Jun 28, 2023
Collaborator

I would expect a bitfield struct to be less hassle to implement. Given that it needs no special language support. So it can be done without changing the c# language at all, and without having to update any part of the c# Roslyn compiler.

msedi · 2023-07-05T20:05:20Z

msedi
Jul 5, 2023

One goal is to create a source generator, but the runtime produces the same code as if it would be written manually. To improve performance, as said above, there are special CPU instructions for bit-fiddling. Would it make sense to enhance the BitOperations and add some methods for bit fiddling that the runtime can handle?

8 replies

CyrusNajmabadi Jul 5, 2023
Collaborator

So if the runtime can be assisted by using CPU bitfiddle operations this would even enhance the performance of the bitfields - and this is what I think people would like also to see.

These are all good questions for the runtime team :)

it's not really clear to me what operations these would be at the CPU level, and how they would be faster than just pure int-operations, but i'd be happy to learn more about your thoughts here.

msedi Jul 5, 2023

@CyrusNajmabadi: I'm not having deep insight and never benchmarked them. I think the runtime teams know much more about it. For reference (you surely know): https://de.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets. Some of them are already available (ABM, BMI1, BMI2) in System.Runtime.Intrinsics but I think TBM is missing.

CyrusNajmabadi Jul 5, 2023
Collaborator

If these are exposed as intrinsics... then any source generator would be able to use them.
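As a rough illustration of what "use the intrinsic where available" could look like in hand-written or generated code, here is a sketch built on Bmi1.BitFieldExtract from System.Runtime.Intrinsics.X86, with a plain shift-and-mask fallback. The BitExtract class and Extract method are hypothetical names, the sketch assumes 0 < length < 32 and start < 32, and whether the intrinsic path is actually faster than the fallback has not been measured here.

```csharp
using System.Runtime.Intrinsics.X86;

public static class BitExtract
{
    // Returns `length` bits of `value`, starting at bit `start`, right-aligned.
    // Assumes 0 < length < 32 and start < 32 (not validated in this sketch).
    public static uint Extract(uint value, byte start, byte length)
    {
        if (Bmi1.IsSupported)
        {
            // Maps to the BEXTR instruction on CPUs that expose BMI1.
            return Bmi1.BitFieldExtract(value, start, length);
        }

        // Portable fallback: shift the field down, then mask off the rest.
        return (value >> start) & ((1u << length) - 1u);
    }
}
```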

tannergooding Oct 9, 2023
Collaborator

@msedi, TBM was an AMD only instruction set that was deprecated with Jaguar and Zen based microprocessors

It won't be added because no future processors will support it.

msedi Oct 10, 2023

@tannergooding. That makes sense. Thanks for the info.

wanggangzero · 2023-12-27T01:53:33Z

wanggangzero
Dec 27, 2023

I wrote a library for bitfields. In the absence of syntactic sugar, it's fairly simple.
https://www.nuget.org/packages/wanggangzero.CSharpUtil.Bits.BitField

0 replies

wessupermare · 2024-02-07T19:46:02Z

wessupermare
Feb 7, 2024

For anyone who stumbles across this thread like I have, it seems the best solution for this nowadays would be to leverage a BitVector32 or a BitArray. It might not give the nice struct-y definition syntax, but it should at least address the problem with fewer bit-shifts and &s than the obvious solution of doing it oneself.
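A minimal sketch of the BitVector32 approach, assuming a made-up 8-bit layout (a 1-bit flag, a 3-bit mode, a 4-bit channel). The section and method names are invented for the example; CreateSection takes the maximum value a field must hold, and chains each section after the previous one.

```csharp
using System.Collections.Specialized;

public static class BitVector32Demo
{
    // Hypothetical layout: bit 0 = Enabled, bits 1-3 = Mode, bits 4-7 = Channel.
    static readonly BitVector32.Section Enabled = BitVector32.CreateSection(1);
    static readonly BitVector32.Section Mode = BitVector32.CreateSection(7, Enabled);
    static readonly BitVector32.Section Channel = BitVector32.CreateSection(15, Mode);

    // Builds the raw register value from the individual fields.
    public static int Pack(bool enabled, int mode, int channel)
    {
        var bits = new BitVector32(0);
        bits[Enabled] = enabled ? 1 : 0;
        bits[Mode] = mode;
        bits[Channel] = channel;
        return bits.Data;
    }

    // Reads the 3-bit Mode field back out of a raw value.
    public static int ReadMode(int raw) => new BitVector32(raw)[Mode];
}
```

The shifts and masks are still there, just hidden behind the Section indexer; the raw value is available through Data when it is time to write to the device.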

1 reply

Korporal Feb 27, 2024

Also given that C# has properties we can use properties to represent bit fields, the underlying implementation could be a BitVector or even a raw UInt32 or UInt64 or a pointer etc. Typically devices like microcontrollers expose a large number of (memory mapped) registers which are comprised of numerous fields of 1, 2, 3 and so on, bits.

Other than the overhead costs of access via properties, this is an effective way to work with such registers, so I'm not sure that a lower level support for true bit fields offers much.
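The property-based approach described above might look something like this for a raw UInt32. The register layout and the names (ControlRegister, Prescaler, Enable, Channel) are invented for illustration and do not correspond to any real device.

```csharp
// Hypothetical register: bits 0-2 = Prescaler, bit 3 = Enable, bits 4-7 = Channel.
public struct ControlRegister
{
    // The raw value that would be written to or read from the device.
    public uint Raw;

    public uint Prescaler
    {
        get => Raw & 0x7u;
        // Clear the field, then OR in the new value (masked so it cannot spill over).
        set => Raw = (Raw & ~0x7u) | (value & 0x7u);
    }

    public bool Enable
    {
        get => (Raw & (1u << 3)) != 0;
        set => Raw = value ? Raw | (1u << 3) : Raw & ~(1u << 3);
    }

    public uint Channel
    {
        get => (Raw >> 4) & 0xFu;
        set => Raw = (Raw & ~(0xFu << 4)) | ((value & 0xFu) << 4);
    }
}
```

Out-of-range values are silently truncated by the mask in each setter; a real implementation might prefer to throw instead.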

colejohnson66 · 2024-02-27T15:41:44Z

colejohnson66
Feb 27, 2024

Bitfields are a constant pain when doing P/Invoke. C programmers love their bitfields, and Win32 is no exception. When working with serial ports, one must use the DCB structure. Unfortunately, CsWin32 crushes that big bitfield down to just a single field: DCB._bitfield. This necessitates weird contortions that could be avoided.

4 replies

tannergooding Feb 27, 2024
Collaborator

That's a CsWin32 tooling problem and it should be generating additional helper properties to access the underlying bitfields, such as is done in TerraFX: https://source.terrafx.dev/#TerraFX.Interop.Windows/Windows/um/WinBase/DCB.cs,373b2c27e15c9fe9

I know a tracking issue for this exists in CsWin32, as I provided some guidance around common pitfalls/mistakes devs make when dealing with bitfields (for example, BOOL x : 1 is 0 or -1 as the result, not 0 or 1; due to BOOL being int which is signed).

-- TerraFX is generated directly via dotnet/ClangSharp using 1-to-1 blittable bindings. It includes many helper APIs around anonymous structs/unions, bitfields, etc.

-- CsWin32 is generated via microsoft/win32metadata which itself is generated via dotnet/ClangSharp + some post-processing. win32metadata opts to exclude the helper APIs in favor of attributes so that downstream tools can decide how to expose that in a way appropriate for their language (it's also used for Rust and non .NET languages).
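The 0-or-minus-1 pitfall mentioned above comes from sign extension: a signed 1-bit field has only a sign bit. A small sketch of how a helper could read a field either way, assuming a 32-bit backing value; SignedBitField, ReadSigned and ReadUnsigned are hypothetical names, and offset + width is assumed to be at most 32.

```csharp
public static class SignedBitField
{
    // Reads `width` bits at `offset` as a signed field: shift the field's top bit
    // into bit 31, then shift back arithmetically so the sign bit is replicated.
    public static int ReadSigned(int raw, int offset, int width)
        => (raw << (32 - offset - width)) >> (32 - width);

    // Same shifts on uint are logical, so the result is zero-extended instead.
    public static uint ReadUnsigned(uint raw, int offset, int width)
        => (raw << (32 - offset - width)) >> (32 - width);
}
```

With a raw value of 1, ReadSigned(1, 0, 1) yields -1 while ReadUnsigned(1u, 0, 1) yields 1, which is why comparing such a field against 1 goes wrong.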

Korporal Feb 27, 2024

That's a good example. That could be improved by creating a struct that contains just a bunch of fields corresponding to the DWORD, WORD and other members, and representing all of the 32 bit fields as just another DWORD. Then define properties that can set/clear the underlying bits of the fields.

So long as the struct has fields ordered and aligned as needed, this is how I would handle that.

Korporal Feb 27, 2024

Oh so there are tools for that, I didn't know that. A tool that could consume a C typedef with bitfields and create a C# struct with properties would be very neat too, is there anything like that to your knowledge @tannergooding ?

tannergooding Feb 27, 2024
Collaborator

dotnet/ClangSharp does this. It takes in C headers/source files and spits out 1-to-1 blittable bindings. There's a small learning curve around getting started with it, as Clang itself requires you to pass in a lot of info, but it is fast and reliable, being used by microsoft/win32metadata and other libraries like my own TerraFX.Interop.* libraries (Win32, DirectX 9/10/11/12, GDI, XAudio, Vulkan, Xlib, PulseAudio, etc).
-- Other tools also exist that do similar things, but I personally prefer/recommend ClangSharp. Noting that I am biased since I maintain ClangSharp 😆

TerraFX.Interop.Windows uses this to generate itself and contains bindings for a large portion of the Windows SDK. It is fully blittable, trimmable, and AOT compatible. So while the base binary is around 20MB, it trims down to just the APIs you need (typically a couple hundred kilobytes). It is used by things like Paint.NET and ComputeSharp. source; nuget

Replies: 24 comments · 75 replies
