New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option<char> should be represented as just one 32 bit value #5977

Closed
Kimundi opened this Issue Apr 20, 2013 · 16 comments

Comments

Projects
None yet
@Kimundi
Copy link
Member

Kimundi commented Apr 20, 2013

A char is a u32, but Unicode is standardized to only take up 21bit, so there is more than enough space for that optimization.

Might be useful for specifying code page conversion tables with holes in them.

@msullivan

This comment has been minimized.

Copy link
Contributor

msullivan commented Jun 13, 2013

In the same vein as #6791. Is a reasonable optimization but probably not critical. We do similar optimizations for pointer types, I think.

@thestinger

This comment has been minimized.

Copy link
Contributor

thestinger commented Jul 4, 2013

Okay, this would mean representing Some(char) as the zero tag and None with the non-zero tag. If we fix char to actually always be 21-bit, that would work.

@thestinger

This comment has been minimized.

Copy link
Contributor

thestinger commented Sep 4, 2013

Once #8974 lands, we can go ahead with this.

@Thiez

This comment has been minimized.

Copy link
Contributor

Thiez commented Sep 4, 2013

If the compiler knew about possible values a type could take perhaps it could make these optimizations automatically? E.g. an ASCII character can have values in the range 0-127, so Option<ASCII> could automatically get optimized to 1 byte.
The following could be optimized to a single byte as well.

enum T {
    On,
    Off
}
enum Repeats {
    Zero
    One(T),
    Two(T,T),
    Three(T,T,T)
    Four(T,T,T,T)
    Five(T,T,T,T,T)
}

Because Repeats has 6 variants it can use 3 tag bits, and T uses 1 bit, so the largest variant uses 5 bits for the data, it all fits in a byte.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Sep 4, 2013

@Thiez what happens for:

match repeats {
    Five(ref x, _, _, _, _) => {}
    _ => {}
}

it'd have to be able to handle sub-word references.

@thestinger

This comment has been minimized.

Copy link
Contributor

thestinger commented Sep 4, 2013

This will only work for Option. The compiler knows the high bits will always be zero for char so it can reuse one of them as the tag, with zero representing a Some(char) and non-zero representing None. It can't actually set any of the bits or references won't work.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Sep 4, 2013

I think it would work for any enum with up to 2047 nullary variants and one variant containing a char.

@thestinger

This comment has been minimized.

Copy link
Contributor

thestinger commented Sep 4, 2013

If you set any of the bits in the char, a reference to the char will be corrupt.

@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented Sep 4, 2013

Can this case be extended to bool as well? it has 7 unused bits.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Sep 5, 2013

@thestinger: Yes. But if you have:

enum MaybeChar {
    HasChar(char),
    NoChar_0,
    NoChar_1,
    NoChar_2,
    ...,
    NoChar_N
}

You can freely use up to 11 bits to represent which variant it is, with the restriction that zero is reserved for HasChar.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 12, 2014

We now have range asserts on char, so this is more possible than it was before.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Sep 9, 2014

Similar to #14540.

@cmr cmr self-assigned this Mar 25, 2015

@cmr cmr removed their assignment Jan 5, 2016

@Yoric

This comment has been minimized.

Copy link
Contributor

Yoric commented Jan 22, 2016

If someone can mentor me through this bug, I'd be interested in giving it a try.

@pnkfelix

This comment has been minimized.

Copy link
Member

pnkfelix commented Jan 22, 2016

@Yoric one starting point to look would be src/librustc_trans/trans/adt.rs ; you might look over how PR #19765 changed that file to expand how we search for potential discriminant representations in an enum.

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 26, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 26, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 26, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 27, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 27, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 27, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 27, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 29, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 29, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 29, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Jan 29, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Feb 2, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Feb 2, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Feb 2, 2016

Yoric pushed a commit to Yoric/rust that referenced this issue Feb 3, 2016

@mrhota

This comment has been minimized.

Copy link
Contributor

mrhota commented Aug 18, 2016

FYI @pnkfelix I rebased @Yoric's work onto @eddyb's recent PR #35764

I'll wait for that to land, then I'll issue a PR for Yoric's rebased work.

mrhota added a commit to mrhota/rust that referenced this issue Sep 3, 2016

mrhota added a commit to mrhota/rust that referenced this issue Sep 3, 2016

mrhota added a commit to mrhota/rust that referenced this issue Sep 3, 2016

mrhota added a commit to mrhota/rust that referenced this issue Sep 3, 2016

@frewsxcv

This comment has been minimized.

Copy link
Member

frewsxcv commented Aug 15, 2017

For anyone wanting to see the status of this, here's code you can run:

fn main() {
    println!("{}", ::std::mem::size_of::<Option<char>>());
}

playpen

it currently prints '8' (bytes)

bors added a commit that referenced this issue Oct 14, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Oct 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Oct 24, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Oct 29, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 8, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 9, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 13, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 13, 2017

Auto merge of #45225 - eddyb:trans-abi, r=<try>
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 18, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

bors added a commit that referenced this issue Nov 19, 2017

Auto merge of #45225 - eddyb:trans-abi, r=arielb1
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.

@bors bors closed this in #45225 Nov 20, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment