Tracking issue for the to_bytes and from_bytes methods of integers #49792
Comments
SimonSapin
added
T-libs
S-waiting-on-team
C-feature-request
labels
Apr 8, 2018
|
I definitely like the idea of having these. Because of the multiple possible interpretations I'd lean towards inherent methods rather than From impls. I think to_bytes and from_bytes are reasonable names - to_native_endian is a bit confusing I agree. |
|
Two recent internals threads with thoughts around this area:
It seems to me like there's a general common theme here of "safe, but a bit weird and rather transmute-y" conversions: this thread's
So here's a sketch of an idea using
#[marker] unsafe trait InplaceReinterpretAs<T> {}
unsafe impl<T> InplaceReinterpretAs<T> for T {}
unsafe impl InplaceReinterpretAs<[u8; 4]> for u32 {}
unsafe impl InplaceReinterpretAs<i32> for u32 {}
unsafe impl InplaceReinterpretAs<u32> for i32 {}
unsafe impl<T, U> InplaceReinterpretAs<*const U> for *const T {}
unsafe impl<T, U> InplaceReinterpretAs<*mut U> for *mut T {}
unsafe impl InplaceReinterpretAs<u16x8> for u32x4 {}
unsafe impl InplaceReinterpretAs<u32x4> for u16x8 {}
#[marker] unsafe trait ReinterpretAs<T> {
    // Because it's a marker trait, these cannot be overridden,
    // and thus their behaviour is always predictable
    fn reinterpret(self) -> T {
        unsafe {
            let r = ptr::read_unaligned(&self as *const Self as *const T);
            mem::forget(self);
            r
        }
    }
    unsafe fn reinterpret_unchecked(x: T) -> Self {
        let r = ptr::read_unaligned(&x as *const T as *const Self);
        mem::forget(x);
        r
    }
}
unsafe impl<T, U> ReinterpretAs<U> for T where T: InplaceReinterpretAs<U> {}
unsafe impl<'a, T, U> ReinterpretAs<&'a U> for &'a T where T: InplaceReinterpretAs<U> {}
unsafe impl<'a, T, U> ReinterpretAs<&'a mut U> for &'a mut T where T: InplaceReinterpretAs<U> {}
unsafe impl ReinterpretAs<u32> for [u8; 4] {} // not ok in-place, but fine as memcpy
Certainly (Name inspired by C++'s |
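For readers who want to try the by-value half of the sketch above on stable Rust, here is a reduced version. It is only an approximation under stated assumptions: it drops the `#[marker]` attribute and the in-place trait, keeps a single hypothetical `ReinterpretAs<T>` trait, and uses `std::mem::transmute_copy` for the bitwise copy. None of these names are standard library APIs.

```rust
use std::mem;

/// Hypothetical trait for this sketch only; not part of the standard library.
///
/// Safety contract assumed here: `Self` and `T` have the same size, and every
/// bit pattern of `Self` is also a valid value of `T`.
unsafe trait ReinterpretAs<T>: Sized {
    fn reinterpret(self) -> T {
        // Guard against a mismatched impl before copying any bytes.
        assert_eq!(mem::size_of::<Self>(), mem::size_of::<T>());
        // Bitwise copy of `self` into a `T`; differing alignment is tolerated.
        let out = unsafe { mem::transmute_copy::<Self, T>(&self) };
        // Ownership has moved into `out`, so skip running `self`'s destructor.
        mem::forget(self);
        out
    }
}

// Each impl asserts the safety contract for one concrete pair of types.
unsafe impl ReinterpretAs<[u8; 4]> for u32 {}
unsafe impl ReinterpretAs<u32> for [u8; 4] {}
unsafe impl ReinterpretAs<i32> for u32 {}
unsafe impl ReinterpretAs<u32> for i32 {}
```

With these impls, `let b: [u8; 4] = 1u32.reinterpret();` yields the native-endian bytes, and the target type is chosen by the annotation, much like `Into`.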
SimonSapin
changed the title
Conversions between integers and typed arrays
Conversions between integers and byte arrays
Apr 9, 2018
|
@scottmcm There’s probably something interesting there, but this feels like the definition of scope creep and I’m not gonna personally pursue it today. Byte arrays/slices are "special" because they’re what’s used for I/O, let’s keep this particular issue specifically about them. |
|
@sfackler Alright,
|
|
It definitely makes sense for signed integers. The case for usize/isize is interesting. I think we should probably still have them and maybe just note in the docs that people should be careful about using literal sizes. |
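To illustrate the point about literal sizes for usize/isize: a sketch that spells the array length as `size_of::<usize>()` rather than a literal such as 8, so the same code compiles on 32-bit and 64-bit targets. The helper names are made up for this example, and it assumes the native-endian conversions as they exist in current stable Rust (`to_ne_bytes` / `from_ne_bytes`), not the method names discussed in this thread.

```rust
use std::mem::size_of;

// Hypothetical helpers for illustration. The array length follows the target's
// pointer width instead of being hard-coded.
fn usize_to_bytes(n: usize) -> [u8; size_of::<usize>()] {
    n.to_ne_bytes()
}

fn usize_from_bytes(bytes: [u8; size_of::<usize>()]) -> usize {
    usize::from_ne_bytes(bytes)
}
```

Writing `[u8; 8]` in these signatures would compile on a 64-bit target and fail on a 32-bit one, which is the portability hazard the docs would need to warn about.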
SimonSapin
changed the title
Conversions between integers and byte arrays
Tracking issue for the to_bytes and from_bytes methods of integers
Apr 11, 2018
SimonSapin
referenced this issue
Apr 11, 2018
Merged
Add to_bytes and from_bytes to primitive integers #49871
SimonSapin
added
C-tracking-issue
and removed
C-feature-request
S-waiting-on-team
labels
Apr 11, 2018
kennytm
added a commit
to kennytm/rust
that referenced
this issue
Apr 14, 2018
|
This landed almost two months ago. Let’s stabilize? @rfcbot fcp merge |
rfcbot
commented
Jun 6, 2018
|
Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
rfcbot
added
proposed-final-comment-period
disposition-merge
labels
Jun 6, 2018
rfcbot
added
the
final-comment-period
label
Jun 19, 2018
rfcbot
commented
Jun 19, 2018
|
|
rfcbot
removed
the
proposed-final-comment-period
label
Jun 19, 2018
|
Two comments:
trait ByteConversions {
    const N: usize;
    fn to_bytes(self) -> [u8; Self::N];
    fn from_bytes(bytes: [u8; Self::N]) -> Self;
} |
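As far as I can tell, an array length that depends on an associated const of `Self` is not accepted in a trait signature on stable Rust, so the trait above reads best as pseudocode. One possible way to approximate it is to swap the const for an associated type. In the sketch below, `Bytes`, `impl_byte_conversions!` and `round_trip` are hypothetical names, and the impls delegate to the native-endian methods available in current stable Rust.

```rust
// Variant of the proposed trait that uses an associated type for the array.
trait ByteConversions: Sized {
    type Bytes;
    fn to_bytes(self) -> Self::Bytes;
    fn from_bytes(bytes: Self::Bytes) -> Self;
}

// Implement the trait for several integer types at once.
macro_rules! impl_byte_conversions {
    ($($int:ty),*) => {$(
        impl ByteConversions for $int {
            type Bytes = [u8; std::mem::size_of::<$int>()];
            fn to_bytes(self) -> Self::Bytes {
                self.to_ne_bytes()
            }
            fn from_bytes(bytes: Self::Bytes) -> Self {
                <$int>::from_ne_bytes(bytes)
            }
        }
    )*};
}

impl_byte_conversions!(u8, u16, u32, u64, i32, i64);

// Generic code can now convert any implementing integer to bytes and back.
fn round_trip<T: ByteConversions>(x: T) -> T {
    T::from_bytes(x.to_bytes())
}
```

The trade-off is that generic callers only know `T::Bytes` as an opaque type unless further bounds (for example `AsRef<[u8]>`) are added to it.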
|
I think there are two separate relevant operations here: the type cast (which compiles to a no-op), and the byte-order normalization. They are often used together, but not necessarily always. For example, a serialization mechanism for communicating over a (byte-stream) pipe with a child process on the same machine might want to use native-endian to avoid the cost of swapping the byte order only to swap it back on the other end of the pipe. So if we want to provide separate APIs for each endianness I think we should still include native-endian, for a total of 6 new methods. As to a trait, I don’t think it is preferable for this specifically. Integer types already have plenty of "duplicated" inherent methods, why are these ones different? If we want to bring back an |
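The same-machine pipe scenario can be sketched as follows. The framing format and function names are invented for illustration. The example assumes the per-endianness methods as they exist in current stable Rust (`to_ne_bytes`, `to_be_bytes`, `from_ne_bytes`), which match the "one set per byte order, plus native" shape described above.

```rust
use std::convert::TryInto;

// Hypothetical framing: a 4-byte length prefix followed by the payload.
// Assumes the payload is shorter than 4 GiB, so the cast does not truncate.

/// Same-machine pipe: native byte order, so neither side swaps bytes.
fn frame_for_local_pipe(payload: &[u8]) -> Vec<u8> {
    let len = payload.len() as u32;
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&len.to_ne_bytes());
    out.extend_from_slice(payload);
    out
}

/// Cross-machine stream: fix the byte order (big-endian here).
fn frame_for_network(payload: &[u8]) -> Vec<u8> {
    let len = payload.len() as u32;
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Read back the native-endian prefix written by `frame_for_local_pipe`.
fn read_local_len(frame: &[u8]) -> Option<u32> {
    let prefix: [u8; 4] = frame.get(..4)?.try_into().ok()?;
    Some(u32::from_ne_bytes(prefix))
}
```

Only the network variant has a byte layout that is the same on every platform; the local variant is correct only when reader and writer share an architecture.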
|
You are correct that some users will want native Endianness. IMO it's wrong to make the easiest option platform-dependent, but I don't see a good, concise alternative. You are also right about these not being special with regards to integer operations. Having traits is very useful for generic code, so an Then I have no more concerns regarding this proposal. |
|
(The |
SimonSapin commented Apr 8, 2018 (edited)
This tracks the stabilization of two methods on each primitive integer type, added in PR #49871: to_bytes and from_bytes.
Previous issue message:
I’d like to propose adding to the standard library conversions between various integer types $Int and byte arrays [u8; size_of::<$Int>()] (which by the way is literally a valid type today). The implementation would be exactly transmute, but since the signature is much more restricted and all bit patterns are valid for each of the types involved, these conversions are safe.
Transmuting produces arrays with the target platform’s endianness. When something different is desired, the existing to_be/to_le/from_be/from_le methods can be combined with these new conversions. Keeping these concerns orthogonal (instead of multiplying ad-hoc conversions) keeps the API surface small.
Safe APIs that wrap specific forms of transmute make good candidates for the standard library IMO since they save users from needing to write (and review and maintain) unsafe code themselves. See Box::into_raw for example. Together with the existing {to,from}_{be,le} methods and TryFrom<&[T]> for &[T; $N] impls, these new conversions would cover much of the functionality of the popular byteorder crate with little code and a relatively small API surface.
What I’m less certain about (and why this isn’t a PR yet) is what API we should expose these conversions as. Options are:
From trait, or
f32::to_bits and f32::from_bits. The advantage over From is that we can give specific names to these conversions in order to communicate what they do. The downside is that we need to pick names.
to_native_endian_bytes and from_native_endian_bytes but that’s not great because:
to_be and friends which are much more abbreviated. (But maybe they shouldn’t be. Is it worth adding to_big_endian & co and deprecating the short ones?)
n.to_be().to_native_endian(): now "native endian" is inaccurate, but that’s partly the fault of to_be for changing the meaning of a value without changing its type.
to_bytes and from_bytes, but that’s uninformative enough that they could just as well be From impls.
@rust-lang/libs or anyone, any thoughts?
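A minimal sketch of the proposal as described in this message, written for u32 only. The free functions `to_bytes` and `from_bytes` are illustrative stand-ins named after the proposal, not the standard library methods themselves, and `to_big_endian_bytes` and `read_u32_be` are hypothetical helpers. The sketch shows the transmute-based conversion, its composition with the existing `to_be` / `from_be`, and the slice-to-array `TryFrom` impl for reading from an I/O buffer.

```rust
use std::convert::TryFrom;
use std::mem;

/// Integer to native-endian bytes. Sound because both types have the same
/// size and every bit pattern is a valid `[u8; 4]`.
fn to_bytes(n: u32) -> [u8; 4] {
    unsafe { mem::transmute::<u32, [u8; 4]>(n) }
}

/// Native-endian bytes to integer. Sound for the same reason in reverse.
fn from_bytes(bytes: [u8; 4]) -> u32 {
    unsafe { mem::transmute::<[u8; 4], u32>(bytes) }
}

/// Big-endian encoding obtained by composing `to_be` with the native conversion.
fn to_big_endian_bytes(n: u32) -> [u8; 4] {
    to_bytes(n.to_be())
}

/// Decode a big-endian u32 from the first four bytes of a buffer, if present.
fn read_u32_be(buf: &[u8]) -> Option<u32> {
    let head = buf.get(..4)?;
    // `TryFrom<&[T]> for &[T; N]` checks the length without copying.
    let array = <&[u8; 4]>::try_from(head).ok()?;
    Some(u32::from_be(from_bytes(*array)))
}
```

The `unsafe` stays confined to the two small conversion functions, which is the benefit the message argues for: callers such as `read_u32_be` need no `unsafe` of their own.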