RFC: bit fields and bit matching #29

Closed
@farcaller

farcaller commented Apr 4, 2014

No description provided.

@farcaller farcaller changed the title from "Added RFC on bit fields" to "RFC: bit fields and bit matching" Apr 4, 2014

0b00 => ...,
0b01 => ...,
0b02 => ...,
0b03 => ...

@cmr

cmr Apr 4, 2014

Member

2 and 3 are not valid binary digits. This would have to be 0b10 and 0b11.


# Alternatives

Provide a bit extraction macros that would perform the first part of this RFC. Doesn't solve the problem of second part.

@cmr

cmr Apr 4, 2014

Member

Is this possible to do with macros? If we can do it with macros, we don't need to add it to the language. I believe it could be done; I don't think anything really precludes it. I definitely think the match, and possibly the updating, could be done, though extraction might need to use methods, and it might not look as nice.

@farcaller

farcaller Apr 4, 2014

Author

I think it's possible to do things like val[4..5] with macros, maybe even val[0, 4..5] with recursive macros. I'm not sure how to deal with match, though; as noted on the ML, that would require new variable bit-sized ints.

@cmr

cmr Apr 4, 2014

Member

I don't think it'd actually require that. It can just extract it to the smallest uint size that fits and compare against constants, with a fall-through match arm.

When I say "macros", though, I mostly just mean "any syntax extension", which includes procedural ones (written in pure Rust)
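
The approach described in this comment can be sketched in present-day Rust (not the 2014 dialect used elsewhere in this thread). This is only an illustration: the function name `decode_mode`, the choice of bits 4..5, and the returned strings are all hypothetical.

```rust
/// Extracts bits 4..=5 of `val` into a `u8` and matches on the result.
/// `decode_mode` and the returned strings are hypothetical names.
fn decode_mode(val: u32) -> &'static str {
    // Two bits always fit in a u8, the smallest unsigned integer type.
    let bits = ((val >> 4) & 0b11) as u8;
    match bits {
        0b00 => "off",
        0b01 => "low",
        0b10 => "medium",
        0b11 => "high",
        // The compiler checks exhaustiveness against the type (u8),
        // so a fall-through arm is still required here.
        _ => unreachable!(),
    }
}
```

The fall-through arm is never taken at run time, but it has to be written because the type carries no information about the mask.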

@tari

tari Apr 4, 2014

Contributor

I would personally prefer a macro approach as well. I initially proposed arbitrary-width integers as a workaround due to some confusion about what constituted valid operations in a macro, and haven't been able to come up with a reasonable way to unify the required types with the existing ones in a pleasing way.

While the implementation of this feature would be easier with such arbitrary-width integers, I think it comes with effects on the semantics of too many other language items. In the simplest case, we clutter the 'standard' namespace with a lot of new types that few users will need (u4, i13, ...) and a more generic version would likely need its own unusual semantics (UIntN(4), IntN(13), perhaps).

```rust
let mut val: u32 = ...;
let bits1 = val[4..5]; // equivalent to bits = (val >> 4) & 3
let bits2 = val[0,4..5]; // equivalent to bits = ((val >> 4) & 3) | (val & 1)
```

@SimonSapin

SimonSapin Apr 4, 2014

Contributor

The previous line is fine, but this might be a bit too magical. It’s not obvious that , means |

@farcaller

farcaller Apr 28, 2014

Author

Still, it would be useful to extract non-continuous bits. Maybe using | instead of , is a better option?
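
For comparison, here is a rough sketch of what the extraction amounts to as ordinary functions in present-day Rust. The helper names `bits` and `bit0_and_bits4to5` are hypothetical. Note one assumption: the expansion quoted from the RFC ORs both groups into the low bits, where they overlap; the sketch instead shifts the upper group left by one so the two groups stay distinct, which may or may not be the intended meaning.

```rust
/// Returns bits `lo..=hi` of `val`, shifted down to bit 0.
/// Hypothetical helper; assumes `lo <= hi < 32`.
fn bits(val: u32, lo: u32, hi: u32) -> u32 {
    assert!(lo <= hi && hi < 32, "bit range out of bounds");
    let width = hi - lo + 1;
    // Build the mask in u64 so that a full 32-bit width does not overflow.
    let mask = ((1u64 << width) - 1) as u32;
    (val >> lo) & mask
}

/// Gathers bit 0 and bits 4..=5 into one 3-bit value
/// (bits 4..=5 end up above bit 0; this packing is an assumption).
fn bit0_and_bits4to5(val: u32) -> u32 {
    (bits(val, 4, 5) << 1) | bits(val, 0, 0)
}
```

With these helpers, `bits(val, 4, 5)` computes the same value as `(val >> 4) & 3`.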

let bits2 = val[0,4..5]; // equivalent to bits = ((val >> 4) & 3) | (val & 1)
val[2..7] = 10; // equivalent to val = (val & (0xffffffff ^ 0xfc)) | (10 << 2)
val[0] = 3; // doesn't compile, as you can't fit 0b11 into one bit place

@SimonSapin

SimonSapin Apr 4, 2014

Contributor

What if instead of a literal you have a variable or other expression whose value is not known at compile-time? Do too-big values trigger fail!()?

@farcaller

farcaller Apr 4, 2014

Author

I don't think it's reasonable to have run-time support for this feature. It could be solved by something like this: val[0] = data[0], where data is of integer type.
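
To make the run-time question concrete, here is one possible treatment, sketched in present-day Rust: a plain function that refuses values that do not fit instead of failing. This is not what the RFC proposes; the name `set_bits` and the `Option` return are assumptions made for illustration.

```rust
/// Writes `new` into bits `lo..=hi` of `val`, or returns `None` when `new`
/// is too wide for the field. Hypothetical helper; assumes `lo <= hi < 32`.
fn set_bits(val: u32, lo: u32, hi: u32, new: u32) -> Option<u32> {
    assert!(lo <= hi && hi < 32, "bit range out of bounds");
    let width = hi - lo + 1;
    // Build the mask in u64 so that a full 32-bit width does not overflow.
    let mask = ((1u64 << width) - 1) as u32;
    if new > mask {
        return None; // value does not fit into the field
    }
    // Clear the field, then OR in the new value at its position.
    Some((val & !(mask << lo)) | (new << lo))
}
```

Here `set_bits(val, 2, 7, 10)` matches the expansion given for `val[2..7] = 10`, and `set_bits(val, 0, 0, 3)` yields `None` rather than a compile error.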

@bill-myers

bill-myers commented Apr 4, 2014

I think that it's not necessary to change the language in this way to support this.

For bit access, it seems to me that a macro would work.

For matching, add a special case to the language so that matching an expression that, after trivial constant propagation/move elimination, is (expr & const), (expr % const), (expr << const), (expr >> const) or a combination of those (and possibly others) is considered exhaustive when all the possible output values are specified.

It's also possible to handle matching with integers of any bit-size, but that doesn't quite handle modulus and shifts and so on without adding those too in the typesystem, which seems worse.
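
As a point of reference, Rust as it exists today checks exhaustiveness against the type of the matched expression, so `match x & 0b11 { 0 => .., 1 => .., 2 => .., 3 => .. }` is rejected without a `_` arm. One workaround that needs no language change is to match on a tuple of booleans, sketched below; the function name `low_two_bits` is hypothetical.

```rust
/// Names the low two bits of `x` without needing a fall-through arm.
/// `low_two_bits` is a hypothetical name for illustration.
fn low_two_bits(x: u32) -> &'static str {
    // Each bit becomes a bool, so the four arms cover every possible
    // value of the tuple and the match is exhaustive as written.
    match ((x & 0b10) != 0, (x & 0b01) != 0) {
        (false, false) => "00",
        (false, true) => "01",
        (true, false) => "10",
        (true, true) => "11",
    }
}
```

This scales poorly beyond a few bits, which is part of why the special case proposed above is attractive.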

@erickt

erickt commented Apr 7, 2014

I'm going to vote against this RFC. While I love the idea of bitstring matching, @LeoTestard's awesome rustlex demonstrates it's possible to build a complex pattern matching macro. I don't really see any disadvantages of it being done externally, so I feel this should be done as an external project.

@nrc

Member

nrc commented Apr 28, 2014

I think something along these lines is necessary for systems programming. I don't really care if it is part of the language or a syntax extension, as long as it is nice. I would rather have this done well in the language than done awkwardly with a syntax extension. Given that this is a fairly core feature for our core audience, I don't see the attraction of having it outside the language if there is any reason not to.


```rust
let mut val: u32 = ...;
let bits1 = val[4..5]; // equivalent to bits = (val >> 4) & 3
```

@nrc

nrc Apr 28, 2014

Member

Could you explain this syntax please? It's not clear to me from the examples.

@pczarn

pczarn Apr 28, 2014

Consider 123u[32..63]. Would such access compile only on some platforms other than 32-bit?

@farcaller

farcaller Apr 28, 2014

Author

I think it should be limited to strictly sized types, e.g. 123u[32..62] wouldn't work, you must use 123u64[32..62].

Provide a bit extraction macros that would perform the first part of this RFC. Doesn't solve the problem of second part.

Erlang has an even better bit matching:

@nrc

nrc Apr 28, 2014

Member

Could you show what this would look like in Rust or explain it in words please? I don't understand the Erlang syntax.

@farcaller

farcaller Apr 28, 2014

Author

Well, this one is quite a complex example, actually; I got it from here. I think the Rust version would be something along the lines of

let IP_VERSION = 4;
let IP_MIN_HDR_LEN = 5;

let DgramSize = byte_size(Dgram);
match Dgram {
  (
    ref IPVers @ [0..3],
    ref HLen @ [4..7],
    ref SrvcType @ [8..15],
    ref TotLen @ [16..31],
    ref ID @ [32..47],
    ref Flgs @ [48..50],
    ref FragOff @ [51..63],
    ref TTL @ [64..71],
    ref Proto @ [72..79],
    ref HdrChkSum @ [80..95],
    ref SrcIP @ [96..127],
    ref DestIP @ [128..159],
    ref RestDgram @ [160..]
  ) if IPVers == IP_VERSION && HLen >= 5 && HLen*4 <= DgramSize => {
    // ...
  },
  _ => (),
}
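
For readers who want to see what the proposed match stands in for, the same header can be decoded today with shifts and masks. The following is a sketch in present-day Rust under stated assumptions: the struct `Ipv4Fields`, its field names, and `parse_ipv4` are hypothetical, only some of the fields are decoded, and bit 0 in the offsets above is taken to be the most significant bit of the first byte (network order).

```rust
/// A few IPv4 header fields decoded from the start of a datagram.
/// Struct and field names are hypothetical, loosely following the thread.
#[derive(Debug, PartialEq)]
struct Ipv4Fields {
    version: u8,          // bits 0..3
    header_len_words: u8, // bits 4..7, in 32-bit words
    total_len: u16,       // bits 16..31
    flags: u8,            // bits 48..50
    frag_offset: u16,     // bits 51..63
    ttl: u8,              // bits 64..71
    proto: u8,            // bits 72..79
}

/// Returns `None` when the guard from the match above would not hold.
fn parse_ipv4(dgram: &[u8]) -> Option<Ipv4Fields> {
    if dgram.len() < 20 {
        return None; // shorter than the minimum header
    }
    let version = dgram[0] >> 4;
    let hlen = dgram[0] & 0x0f;
    if version != 4 || hlen < 5 || usize::from(hlen) * 4 > dgram.len() {
        return None;
    }
    // Flags and fragment offset share one big-endian 16-bit word.
    let flags_frag = u16::from_be_bytes([dgram[6], dgram[7]]);
    Some(Ipv4Fields {
        version,
        header_len_words: hlen,
        total_len: u16::from_be_bytes([dgram[2], dgram[3]]),
        flags: (flags_frag >> 13) as u8,
        frag_offset: flags_frag & 0x1fff,
        ttl: dgram[8],
        proto: dgram[9],
    })
}
```

Every offset and mask here has to be kept in sync by hand, which is the kind of error the bit-matching syntax is meant to rule out.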

@edwardw

edwardw May 10, 2014

More formally, a bitstring in Erlang is of the form:

<<Segment1, ..., SegmentN>>

And a segment is:

Segment = Value | Value:Size | Value/TypeSpecifiers | Value:Size/TypeSpecifiers
TypeSpecifiers = Endianness-Sign-Type-Unit
Endianness = big | little | native
Sign = signed | unsigned
Type = integer | float | binary
Unit = 1 | 2 | ... | 256

Please let us Rustaceans have it :)
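
To relate the type specifiers above to something that exists in Rust today: for whole-byte segments, the endianness and sign parts correspond roughly to the standard `from_be_bytes` / `from_le_bytes` conversions on the integer and float types. The wrapper function names below are hypothetical, and the Erlang forms in the comments are an approximate mapping, not a claim about any proposed Rust syntax.

```rust
/// Roughly `Value:16/big-unsigned-integer`.
fn u16_big(b: [u8; 2]) -> u16 {
    u16::from_be_bytes(b)
}

/// Roughly `Value:16/little-unsigned-integer`.
fn u16_little(b: [u8; 2]) -> u16 {
    u16::from_le_bytes(b)
}

/// Roughly `Value:16/big-signed-integer`.
fn i16_big(b: [u8; 2]) -> i16 {
    i16::from_be_bytes(b)
}

/// Roughly `Value:32/big-float`.
fn f32_big(b: [u8; 4]) -> f32 {
    f32::from_be_bytes(b)
}
```

What these conversions do not cover is segments that are not a whole number of bytes, which is where the bit syntax goes further.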

@pczarn

pczarn May 10, 2014

@farcaller: I don't understand how to reference bit-aligned data. Also, how would you create and transmute bitstrings?

I'm convinced that bit matching should use structs.

struct Dgram {
  ip_vers: Uint<4>,
  hlen: Uint<4>,
  srvc_type: u8,
  total_len: u16,
  id: u16,
  flgs: Uint<3>,
  frag_off: Uint<13>,
  ttl: u8,
  proto: u8,
  hdr_chksum: u16,
  src_ip: u32,
  dest_ip: u32,
}

static IP_VERSION: uint = 4;
static IP_MIN_HDR_LEN: uint = 5;

// in fn(dgram: Dgram, rest: Vec<u8>)
let size = size_of::<Dgram>();
match dgram {
  Dgram {
    ip_vers: IP_VERSION as Uint<4>,
    hlen: hlen,
    srvc_type: srvc_type, total_len: total_len,
    id: id, flgs: flgs, frag_off: frag_off,
    ttl: ttl, proto: proto, hdr_chksum: hdr_chksum,
    src_ip: src_ip, dest_ip: dest_ip,
  } if hlen >= 5 && hlen*4 <= size => {
    let opts_len = 4 * (hlen - IP_MIN_HDR_LEN);
    let (opts, data) = rest.split_at(opts_len);
    // ...
  },
  _ => (),
}

@farcaller

farcaller May 10, 2014

Author

I like how this struct looks, but it's getting close to bitfields of C/C++ that are often frowned upon. I guess the main reason is that byte order is not defined in those, so if we can have structs with explicit alignment and byte order, that would work.

@pczarn

pczarn May 10, 2014

Struct fields with attributes are certainly possible. I propose the following syntax

struct MyData {
  a: u8,
  #[align(4)] b: u8,
  align(16) little { // little endian
    c: int,
    d: uint,
  }
}

@farcaller

farcaller May 10, 2014

Author

That would still require support for arbitrary-sized ints, right? In cases of Uint<4>.

@pczarn

pczarn May 10, 2014

Yes, and they require support for static generic parameters in turn. Another problem is, would all fields have bit alignment by default? What would happen when a Uint<4> was followed by u8?

@farcaller

farcaller May 10, 2014

Author

There's #[packed] for that.
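
For context, the attribute is spelled `#[repr(packed)]` in current Rust. A small sketch of its effect on byte-level layout follows; the struct names `Padded` and `Packed` and the helper `sizes` are hypothetical, and the sizes in the comments assume a typical target where `u32` has 4-byte alignment.

```rust
use std::mem::size_of;

/// C layout with natural alignment: 3 padding bytes follow `tag`.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    tag: u8,
    value: u32,
}

/// Same fields with padding removed.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

/// Returns (padded size, packed size) in bytes; 8 and 5 on typical targets.
fn sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Packed>())
}
```

Packing works at byte granularity only, so on its own it does not answer how a 4-bit field followed by a `u8` would be laid out.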

@nrc

Member

nrc commented Apr 28, 2014

I think it is essential to be able to refer to groups of bits (not just single bits; it's not clear to me if that is possible here) and to name them. Both things are possible in C++. I think there is too much potential for errors without them.

There is some good discussion on this reddit thread on what is necessary to have safe and portable bitfields - http://www.reddit.com/r/rust/comments/244yz6/bitfields_in_rust/


# Detailed design

The first part of this RFC is the definition of bit access for integer types. For the sake of simplicity, only unsigned integer types (uint, u8, u16, u32, u64) are supported.

@pczarn

pczarn Apr 28, 2014

only fixed width unsigned integer types (u8, u16, u32, u64)

@dobkeratops

dobkeratops commented Apr 28, 2014

Over the years I've done a lot of low-level work packing vertex and color formats, building DMA tags and so on. I'd always been perfectly happy with C/C++ on this front without bitfields (I always feared them for portability issues, and just used shifts/masks and abstractions of those). It's a long way from the issues that drove me to Rust, and I can think of many other things I'd rather have added to the language today.

Ints in generic type params (like C++) would extend what you can do with generic code, e.g. shift/mask values in constants (e.g. a smart pointer which is a compressed pointer with a shift value for accessing aligned objects within an arena). Better generic type inference (the equivalent of C++ decltype, even auto return type) would help. HKT. If Rust could get something that improves on struct inheritance (generalized delegation of fields and component methods? maybe on tuples? and coercions?), that would be great too.

Even array sugar foo[i,j] would be preferable to dedicated bitfields. If you get improved [] overloading, maybe that type of array sugar could be of use for bitfield access? Some people want slice syntax; maybe that would work.
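
Integer generic parameters did not exist in Rust when this was written; present-day Rust has them as const generics. As a rough sketch of the "shift/mask values in constants" idea, applied to a packed RGB565 color: the type `Field`, its methods, and the `Red`/`Green`/`Blue` aliases are all hypothetical, and the code assumes `SHIFT + WIDTH <= 32`.

```rust
/// A field of `WIDTH` bits starting at bit `SHIFT` of a `u32`.
/// Hypothetical type; assumes `SHIFT + WIDTH <= 32`.
struct Field<const SHIFT: u32, const WIDTH: u32>;

impl<const SHIFT: u32, const WIDTH: u32> Field<SHIFT, WIDTH> {
    /// Mask covering the field in place, computed from the type parameters.
    const MASK: u32 = (((1u64 << WIDTH) - 1) as u32) << SHIFT;

    /// Reads the field out of `word`, shifted down to bit 0.
    fn get(word: u32) -> u32 {
        (word & Self::MASK) >> SHIFT
    }

    /// Returns `word` with the field replaced by `value` (excess bits dropped).
    fn set(word: u32, value: u32) -> u32 {
        (word & !Self::MASK) | ((value << SHIFT) & Self::MASK)
    }
}

// RGB565 layout: 5 bits blue, 6 bits green, 5 bits red.
type Blue = Field<0, 5>;
type Green = Field<5, 6>;
type Red = Field<11, 5>;
```

For example, `Green::get(pixel)` reads the six green bits and `Red::set(pixel, r)` writes the red channel, with the shift and mask fixed by the type rather than repeated at each call site.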

@esbullington

esbullington commented Apr 28, 2014

I've been playing around using Rust with several binary network protocols. Pattern matching on bits would be incredibly useful. I've experienced no easier parsing of binary wire protocols than when I've used OCaml's bitstring pattern matching syntax extension (based on Erlang's bitstring matching). Pure bliss.

If this can be accomplished using macros, great (I still haven't explored Rust's macro system so I'm not sure). But given the interest in Rust from embedded and systems programmers, I would think this would be a great addition to the core language.

Bitfields would be great, too, particularly for the embedded programmers.

@bstrie

Contributor

bstrie commented May 6, 2014

Does the recent bitflags! macro alleviate some of the pressure here?

rust-lang/rust#13072

@cmr

Member

cmr commented May 6, 2014

No. Bit fields/bit matching are very unrelated to bitflags.
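
One way to picture the distinction, sketched with plain constants in present-day Rust rather than with bitflags! itself: a flag is a single independent bit that is tested for presence, while a field is a group of bits that together hold one number. All names below (`READ`, `WRITE`, `LEVEL_SHIFT`, `LEVEL_MASK`, `has_flag`, `level`) are hypothetical.

```rust
// Flags: each constant names one independent bit.
const READ: u8 = 0b0000_0001;
const WRITE: u8 = 0b0000_0010;

// Field: bits 2..=4 together hold one 3-bit number.
const LEVEL_SHIFT: u8 = 2;
const LEVEL_MASK: u8 = 0b111 << LEVEL_SHIFT;

/// True when the given flag bit is set in `reg`.
fn has_flag(reg: u8, flag: u8) -> bool {
    (reg & flag) != 0
}

/// Extracts the 3-bit level field from `reg`.
fn level(reg: u8) -> u8 {
    (reg & LEVEL_MASK) >> LEVEL_SHIFT
}
```

A flag set answers "is this bit on?", whereas the field accessor has to mask and shift to recover a value, which is the part this RFC is about.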

@glaebhoerl

Contributor

glaebhoerl commented May 26, 2014

I haven't read this in detail yet but I think rust-lang/rust#12642 might be related.

@brson

Contributor

brson commented Jun 5, 2014

Thank you for the RFC. Current policy is to not have bitfields in the language itself and use syntax extensions. We currently have a bitflags! extension that can be used for some of these use cases.

Closing.

@brson brson closed this Jun 5, 2014

@pnkfelix

Member

pnkfelix commented Jun 5, 2014

To elaborate on @brson's comment: If bitflags! does not cover a use case you care about, we encourage you to work with us to design a syntax extension that does cover your use case.

@bgamari bgamari referenced this pull request Aug 18, 2014

Closed

RFC: Proposal for bit data #205

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017

Depend on freshly published openssl on crates.io
No more git dependencies there!

Closes rust-lang#29