Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split ty::layout::Layout into in-memory field placements, ABI and discriminant representation. #44426

Closed
eddyb opened this issue Sep 8, 2017 · 0 comments · Fixed by #45225
Closed
Assignees
Labels
C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@eddyb
Copy link
Member

eddyb commented Sep 8, 2017

This should allow us to represent newtypes and implement enum niche optimizations, without an explosion of cases that need to be handle. I'll attempt tackling it on a trans/ABI branch of mine.
cc @gankro @nox @rust-lang/compiler

@eddyb eddyb self-assigned this Sep 8, 2017
@TimNN TimNN added C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 17, 2017
bors added a commit that referenced this issue Oct 14, 2017
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.
bors added a commit that referenced this issue Oct 29, 2017
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.
bors added a commit that referenced this issue Nov 13, 2017
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.
bors added a commit that referenced this issue Nov 19, 2017
Refactor type memory layouts and ABIs, to be more general and easier to optimize.

To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
  * `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
  * `Tagged` has its variants discriminated by an integer tag, including C `enum`s
  * `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
  * `Union` has all its fields at offset `0`
  * `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
  * `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
  * `Uninhabited` corresponds to no values, associated with unreachable control-flow
  * `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
  * `ScalarPair` has two "scalar components", but only applies to the Rust ABI
  * `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
  * `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s

Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
  * `Option<!>` is 0 bytes
  * `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
  * `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
  * `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
  * `enum E { A(bool), B, C, D }` is 1 byte

Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.

Closes #44426, fixes #5977, fixes #14540, fixes #43278.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants