Space-optimize `Option<T>` for integral enum `T` #14540
Comments
tommit referenced this issue on May 30, 2014: Space-optimize `Option<T>` for integral enum `T` #84 (Closed)
huonw added the I-slow label on Jun 2, 2014
I whipped up a dummy example to see how this would optimize
I don't want to comment too much on what the language-defined method should be for choosing the one "invalid" bit pattern among all the possible ones. But I do believe that if at least one "invalid" bit pattern (interpreted as the underlying integral type of the enum) exists that is either larger than the largest enum variant or smaller than the smallest enum variant, then one of those should be guaranteed to be chosen. This enables users to create C-like enums that they use as integers bounded to a certain contiguous range of values (by providing their own unsafe iterators for such an enum type). For example, given the following enum:
The language should guarantee that the "invalid" bit pattern used for representing the [Edit]
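To make the constraint above concrete, here is a minimal sketch of one rule that would satisfy it for an enum backed by `u8`. The function name `pick_none_pattern` and the preference order (just above the largest variant first, then just below the smallest) are assumptions for illustration only; the comment deliberately leaves the exact rule open.

```rust
/// Hypothetical rule for choosing the "invalid" bit pattern of a `u8`-backed
/// C-like enum: prefer the value just above the largest discriminant,
/// otherwise the value just below the smallest one.
/// Returns `None` when no pattern lies outside the [min, max] range
/// (or when the enum has no variants at all).
fn pick_none_pattern(discriminants: &[u8]) -> Option<u8> {
    let max = *discriminants.iter().max()?;
    let min = *discriminants.iter().min()?;
    // `checked_add` / `checked_sub` fail exactly when we would leave `u8`.
    max.checked_add(1).or_else(|| min.checked_sub(1))
}

fn main() {
    // Variants 20 and 21: the first pattern above the range is free.
    println!("{:?}", pick_none_pattern(&[20, 21]));
    // Variants 254 and 255: nothing above, so fall back to just below.
    println!("{:?}", pick_none_pattern(&[254, 255]));
    // Variants 0 and 255: no pattern outside the range exists.
    println!("{:?}", pick_none_pattern(&[0, 255]));
}
```

When the variants span the whole range of the underlying type, this rule yields nothing, and a compiler would have to fall back to some gap inside the range, which the guarantee above does not cover.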
Don't you think it could be done transitively for all tagged unions and integral enum types? I have three optimizations in mind:
I think the bit pattern of a variant should be as close to 0 as possible in the order of declaration, just like in integral enums. It could stay undefined or become implementation-defined.
Why wouldn't this be the default? (That is, why would all optimisations be applied by default.)
I don't really understand the question. I'm sure that there are plenty of other possible optimizations, but they can be implemented irrespective of this proposed optimization, and I think they should get their own dedicated GitHub issues. This proposed optimization should happen automatically, just like the non-null based space-optimization of
@huonw, because it's a space-time tradeoff. Ideally, all possible values of an ADT could be represented within a minimal number of bits. However, values can be moved out of enums, so they should exist somewhere with proper alignment. Matching complex packed enums could still get expensive without simple discriminants:

```rust
enum Value {
    A = 20,
    B = 21,
}

// Option<Result<Value, IoErrorKind>>
match maybe_result {
    // matches on {22, _}
    None => a,
    // matches on {18, 0}
    Some(Err(ShortWrite(0))) => b,
    // matches on {x, _} if 20 <= x && x < 22
    Some(Ok(_)) => c,
}
```
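The checks described in the comments of that snippet can be written out by hand. The following is only a sketch of what matching on such a packed representation might look like, not what any compiler emits: the `Decoded` type, the `decode` function, and the `(tag, payload)` layout are hypothetical, and the tag values 18, 20, 21 and 22 are taken from the comments above. Treating every other tag as some other error kind is an additional assumption.

```rust
/// Hypothetical decoded view of a packed `Option<Result<Value, IoErrorKind>>`.
#[derive(Debug, PartialEq)]
enum Decoded {
    None,
    ErrShortWrite(usize),
    Ok(u8),
    OtherErr,
}

/// Classify a raw (tag, payload) pair. Because the tags are packed into one
/// shared integer, the `Ok` case needs a range check instead of a single
/// equality test against a simple discriminant.
fn decode(tag: u8, payload: usize) -> Decoded {
    match tag {
        22 => Decoded::None,                   // {22, _}
        18 => Decoded::ErrShortWrite(payload), // {18, n}
        20..=21 => Decoded::Ok(tag),           // {x, _} with 20 <= x < 22
        _ => Decoded::OtherErr,                // assumed: remaining error kinds
    }
}

fn main() {
    println!("{:?}", decode(22, 0));
    println!("{:?}", decode(18, 0));
    println!("{:?}", decode(21, 0));
}
```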
huonw referenced this issue on Jul 3, 2014: #[deriving(PartialOrd)] is O(N^2) code size for N enum variants #15375 (Closed)
huonw referenced this issue on Sep 9, 2014: Option<char> should be represented as just one 32 bit value #5977 (Closed)
@pczarn But there is no space-time tradeoff with the space-optimization that is being proposed here.
Of course, sorry, I was referring to something else.
Triage: no changes I'm aware of.
I'm going to close this in favor of rust-lang/rfcs#1230 since there's a lot of potential layout optimizations involving enums but this one specific issue isn't key enough that it should be tracked separately.
tommit commented May 30, 2014

_Summary_

I propose a space optimization for variables of type `Option<E>` when `E` is a nullary, integral enum type.

_Motivation_

There's no need to waste memory for storing a separate tag in variables of type `Option<E>` if `E` is an integral enum type and the set of valid values of `E` does not cover all possible bit patterns. Any bit pattern (of the size of `E`) that doesn't represent a valid value of type `E` could be used by the compiler to represent the `None` value of type `Option<E>`.

_Details_

Given a nullary, integral enum type `E`, the compiler should check if some bit pattern exists which does not represent a valid value of type `E` (the only valid values are the ones determined by the nullary enum variants of `E`). If such "invalid" bit patterns are found, the compiler should use one of them to represent the `None` value of type `Option<E>` and omit storing the tag in variables of type `Option<E>`. If more than one such "invalid" bit pattern exists, there should be a language defined method to deterministically determine which one of those bit patterns is used to represent the `None` value. I think the bit pattern of `None` should be language defined rather than implementation defined in order to make `Option<E>` values serialized to disk more stable between different compilers / compiler versions.

In determining whether a certain value of such space optimized type `Option<E>` is `None` or not, the algorithm should simply check whether or not the binary representation of said value is equal to the binary representation of the language defined "invalid" value.
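As a rough illustration of the proposed representation, the sketch below models it by hand in library code rather than in the compiler. The names `PackedOption` and `NONE_BITS`, the example enum `Value`, and the choice of 22 as the "invalid" pattern are all assumptions made for this example; the proposal itself does not fix any of them.

```rust
/// Example C-like enum: only the bit patterns 20 and 21 are valid.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    A = 20,
    B = 21,
}

/// Hypothetical hand-written stand-in for a tagless `Option<Value>`:
/// a single byte, with one invalid pattern reserved for `None`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct PackedOption(u8);

impl PackedOption {
    /// Assumed "invalid" pattern: the first value above the largest variant.
    const NONE_BITS: u8 = 22;

    fn none() -> Self {
        PackedOption(Self::NONE_BITS)
    }

    fn some(v: Value) -> Self {
        PackedOption(v as u8)
    }

    /// The `None` check is a single comparison against the reserved pattern.
    fn is_none(self) -> bool {
        self.0 == Self::NONE_BITS
    }

    /// Convert back to an ordinary `Option<Value>`.
    fn get(self) -> Option<Value> {
        match self.0 {
            20 => Some(Value::A),
            21 => Some(Value::B),
            _ => None,
        }
    }
}

fn main() {
    let x = PackedOption::some(Value::B);
    let y = PackedOption::none();
    println!("{:?} {:?}", x.get(), y.get());
    // No separate tag: the wrapper is exactly as large as the enum itself.
    println!(
        "{} {}",
        std::mem::size_of::<PackedOption>(),
        std::mem::size_of::<Value>()
    );
}
```

The point of the sketch is that `is_none` needs no stored tag, and that `PackedOption` occupies the same single byte as `Value`.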