chore: specialize `Encodable`/`Decodable` for bytes #4145
chunk_size: usize,
}

/// Read `opts.len` bytes from reader, where `opts.len` could potentially be
This reminds me I need to port https://github.com/rust-bitcoin/rust-bitcoin/blob/e2b9555070d9357fb552e56085fb6fb3f0274560/bitcoin/src/consensus/encode.rs#L304 to Fedimint ...
Codecov Report

Coverage diff, master vs. #4145:

| | master | #4145 | +/- |
|---|---|---|---|
| Coverage | 58.10% | 58.11% | |
| Files | 192 | 192 | |
| Lines | 42981 | 43054 | +73 |
| Hits | 24976 | 25020 | +44 |
| Misses | 18005 | 18034 | +29 |
Re fedimint#4111
Re rust-bitcoin/rust-bitcoin#2390

Due to Rust limitations, we can't have impls for both `Vec<T>` and `Vec<u8>`. Handling the byte case in a generic way is potentially *very* inefficient, and we do use it in some places, though it's hard to find all of them (and then prevent even more from being introduced).

The most appealing approach seems to be specializing that case using a `TypeId` check and `unsafe`.

Cons:

* `TypeId` requires a `'static` bound
* unsafe (but small and easy to reason about)

Pros:

* zero cost (at runtime)
* works automatically everywhere
* no loss in ergonomics
In a not very scientific test, this made running `fedimint-dbtool dump` 18% faster.
if TypeId::of::<T>() == TypeId::of::<u8>() {
    // unsafe: we've just checked that T is `u8` so the transmute here is a no-op
    return consensus_encode_bytes(unsafe { mem::transmute::<&[T], &[u8]>(self) }, writer);
}
This is a really cool pattern! 🤯
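A minimal, self-contained sketch of the pattern in the snippet above, under stated assumptions: `EncodeItem`, `encode_items`, and `encode_bytes` are hypothetical stand-ins invented for illustration (they are not Fedimint's `Encodable` or `consensus_encode_bytes`), and any length prefix is left out. It reinterprets the slice with `slice::from_raw_parts` rather than `mem::transmute`; once the `TypeId` check has passed, either should amount to the same no-op.

```rust
use std::any::TypeId;
use std::io::{self, Write};

/// Hypothetical bulk path: writes the whole byte slice in one call.
fn encode_bytes<W: Write>(bytes: &[u8], writer: &mut W) -> io::Result<usize> {
    writer.write_all(bytes)?;
    Ok(bytes.len())
}

/// Hypothetical per-item encoding trait (stand-in for a real `Encodable`).
trait EncodeItem {
    fn encode_item<W: Write>(&self, writer: &mut W) -> io::Result<usize>;
}

impl EncodeItem for u8 {
    fn encode_item<W: Write>(&self, writer: &mut W) -> io::Result<usize> {
        writer.write_all(&[*self])?;
        Ok(1)
    }
}

impl EncodeItem for u16 {
    fn encode_item<W: Write>(&self, writer: &mut W) -> io::Result<usize> {
        writer.write_all(&self.to_be_bytes())?;
        Ok(2)
    }
}

/// Encodes a slice, taking the bulk path when `T` is exactly `u8`.
/// The `'static` bound is what `TypeId::of` requires.
fn encode_items<T: EncodeItem + 'static, W: Write>(
    items: &[T],
    writer: &mut W,
) -> io::Result<usize> {
    if TypeId::of::<T>() == TypeId::of::<u8>() {
        // SAFETY: `T` is exactly `u8` (checked above), so the pointer and
        // length already describe a valid `&[u8]` with the same lifetime.
        let bytes: &[u8] =
            unsafe { std::slice::from_raw_parts(items.as_ptr().cast::<u8>(), items.len()) };
        return encode_bytes(bytes, writer);
    }
    // Generic fallback: one call per element.
    let mut written = 0;
    for item in items {
        written += item.encode_item(writer)?;
    }
    Ok(written)
}
```

Calling `encode_items` on a `&[u8]` takes the single `write_all` path, while a `&[u16]` falls through to the per-element loop; callers do not need to choose between them.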
let len: usize =
    usize::try_from(len).map_err(|_| DecodeError::from_str("size exceeds memory"))?;
This should only error on <64bit platforms, right?
This errors when trying to decode a byte array larger than the total theoretical virtual memory available to the process (which we wouldn't be able to allocate anyway).
Right, but for all reasonable platforms these days u64==usize
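To make the point of this thread concrete, here is a small sketch of the same conversion. The helper name `checked_len` and the `String` error are assumptions for illustration and do not reflect Fedimint's `DecodeError`. The conversion can only fail on targets where `usize` is narrower than 64 bits.

```rust
use std::convert::TryFrom;

/// Hypothetical helper mirroring the `u64` -> `usize` conversion in the diff.
fn checked_len(len: u64) -> Result<usize, String> {
    // Fails only if `len` does not fit in this target's `usize`.
    usize::try_from(len).map_err(|_| "size exceeds memory".to_string())
}
```

On a 64-bit target every `u64` fits, so `checked_len(u64::MAX)` succeeds there and the error branch is reachable only on narrower targets.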
I'm happy to hear you were able to confirm this improves something. :D
That is a pretty big con, right? In a real life program when would one decode static things? (Excuse my ignorance but the only …
AFAIR …
@@ -336,14 +431,29 @@ where
}
}

// From <https://github.com/rust-lang/rust/issues/61956>
unsafe fn horribe_array_transmute_workaround<const N: usize, A, B>(mut arr: [A; N]) -> [B; N] {
`#[inline(always)]` to make sure the array doesn't have to be copied
{
    fn consensus_decode<D: std::io::Read>(
        d: &mut D,
        modules: &ModuleDecoderRegistry,
    ) -> Result<Self, DecodeError> {
        if TypeId::of::<T>() == TypeId::of::<u8>() {
            // unsafe: we've just checked that T is `u8` so the transmute here is a no-op
            return Ok(unsafe { mem::transmute::<Vec<u8>, Vec<T>>(consensus_decode_bytes(d)?) });
wait, you can't transmute `Vec`; you have to go through the `{to, from}_raw` methods
actually it is correct, because T is u8
I think we can remove unsafe from this, using …
yes, I removed a lot of `unsafe` in 6ee8e87, but for `&[T]`, Rust complains about `'static`
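One safe way to get the same effect on the decode side is to go through `std::any::Any`. This is only a sketch under assumptions: `bytes_into_vec` is a hypothetical helper, and this is not necessarily what commit 6ee8e87 does. It relies on `Option<Vec<u8>>` being `Sized + 'static`, so it can be viewed as `dyn Any` and downcast to `Option<Vec<T>>`, which succeeds only when `T` is `u8`.

```rust
use std::any::Any;

/// Hypothetical helper: returns `Some` only when `T` is exactly `u8`.
/// No `unsafe` is needed because the downcast performs the type check.
fn bytes_into_vec<T: 'static>(bytes: Vec<u8>) -> Option<Vec<T>> {
    let mut slot = Some(bytes);
    // View the slot as `dyn Any`; this requires a `Sized + 'static` type.
    let any: &mut dyn Any = &mut slot;
    // If `Option<Vec<T>>` is the same type as `Option<Vec<u8>>`, move the
    // vector out of the slot without copying its contents.
    any.downcast_mut::<Option<Vec<T>>>()
        .and_then(|typed| typed.take())
}
```

For any `T` other than `u8` the helper returns `None`, and the caller would fall back to decoding element by element.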
.map_err(DecodeError::from_err)?;
Ok(bytes)
}

impl_encode_decode_tuple!(T1, T2);
impl_encode_decode_tuple!(T1, T2, T3);
impl_encode_decode_tuple!(T1, T2, T3, T4);

impl<T> Encodable for &[T]
we should impl for `[T]` instead of `&[T]`
but using `Any` still doesn't work, because you can't do `.downcast_ref::<[u8]>()`, because `downcast_ref` needs `Sized`
we could impl for `Vec` without unsafe, but not for `Box<[u8]>` or `Arc`, `Rc`
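A sketch of what the safe `Vec` case could look like on the encode side. `as_byte_vec` is a hypothetical helper, not code from this PR. It works because `Vec<u8>` is `Sized`, which `downcast_ref` requires of its target type. The same call is not available for a bare `[u8]`, which is the limitation described in the comments above.

```rust
use std::any::Any;

/// Hypothetical helper: borrows a `Vec<T>` as a `Vec<u8>` when `T` is `u8`.
/// Takes `&Vec<T>` (not `&[T]`) on purpose: the downcast target must be `Sized`.
fn as_byte_vec<T: 'static>(items: &Vec<T>) -> Option<&Vec<u8>> {
    // `Vec<T>` is `Sized + 'static`, so the reference coerces to `&dyn Any`.
    (items as &dyn Any).downcast_ref::<Vec<u8>>()
}
```

A caller could then pass `as_byte_vec(v).map(|b| b.as_slice())` to a bulk byte writer and fall back to the generic path on `None`.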
To shed more light on why …

fn downcast_ref<'a, T, U>(value: &'a T) -> Option<&'a U> { /* ... */ }

There needs to be a bound requiring that …

fn do_stuff<T>(value: &T, s: &mut &'static str) {
    *s = *downcast_ref(value).unwrap();
}

But such a bound is currently inexpressible. This PR would be sound even without the …
@Kixunil 100% matches my understanding. BTW. If Rust had something like …
@maan2003 I don't see a reason to worry about wrapping this …
Agreed
wow github doesn't like this pr, maybe we broke something
It already passed before the CI. It must be some flakiness on …
Edit: In idle `just mprocs` I can't tell if there's any performance difference.