New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid vector allocation in RawNetworkMessage
encoding
#1954
Conversation
provide accessor method and new for downstream libs. This is done in order to more easily change the struct without impacting downstream and also in order to add another field while preserving struct invariant in future commit.
Using it in RawNetworkMessage encoding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 23e5d3c
bitcoin/src/consensus/encode.rs
Outdated
pub struct CheckedData(Vec<u8>, [u8; 4]); | ||
|
||
impl CheckedData { | ||
/// Create a new `CheckedData` computing the checksum of given data | ||
pub fn new(vec: Vec<u8>) -> Self { | ||
let checksum = sha2_checksum(&vec); | ||
Self(vec, checksum) | ||
} | ||
/// Return raw data without checksum | ||
pub fn data(self) -> Vec<u8> { | ||
self.0 | ||
} | ||
/// Return the checksum of the data | ||
pub fn checksum(&self) -> [u8; 4] { | ||
self.1 | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should use names?
/// Data and a 4-byte checksum.
#[derive(PartialEq, Eq, Clone, Debug)]
pub struct CheckedData {
data: Vec<u8>,
checksum: [u8; 4],
}
impl CheckedData {
/// Creates a new `CheckedData` computing the checksum of given data.
pub fn new(data: Vec<u8>) -> Self {
let checksum = sha2_checksum(&data);
Self { data, checksum }
}
/// Returns a reference to the raw data without the checksum.
pub fn data(&self) -> &[u8] { &self.data }
/// Returns the raw data without the checksum.
pub fn into_data(self) -> Vec<u8> { self.data }
/// Returns the checksum of the data.
pub fn checksum(&self) -> [u8; 4] { self.checksum }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to use names, I have to dismiss review anyway because there is something to fix in the examples as the CI shows
bitcoin/src/network/message.rs
Outdated
/// Creates a [RawNetworkMessage] | ||
pub fn new(magic: Magic, payload: NetworkMessage) -> Self { | ||
let mut engine = sha256d::Hash::engine(); | ||
let payload_len = payload.consensus_encode(&mut engine).expect("engine doesn't error") as u32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This as
can silently drop bits if the payload is NetworkMessage::Alert
and it has an inner vector that is super long, right? Or are all messages guaranteed to be short enough that this cast is ok? Or is enforcing invariants on NetworkMessage
out of scope for this PR, just flagging in case it needs further investigation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to file an issue to check on where these invariants are enforced. But there is no way a message on the p2p network can exceed 32 MB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we should use a conversion function and expect("there is no way an p2p message can exceed 32MB")
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although there are other uses of as
in this module and the code is miles behind the rest of the codebase anyways so I'm totally ok to just use as
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to convert this with the expect for clarity.
Too be precise:
- u32 would be up to 4GB
- it looks the u32 is part of the specification of the network message, in other words, you have just 4 bytes to specify the lenght they cannot grow like a varint , UPDATE: I see in encoding could happen by using a vec greather than 4GB, it seems acceptable panicking in that case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Legend, cheers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK, love this!
RawNetworkMessage keep the payload_len and its checksum in the struct, thus is not needed to serialize the inner network message pub in fields of both RawNetworkMessage and CheckedData are removed so that invariant are preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 5c89330
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 5c89330
This PR removes the need to serialize the inner NetworkMessage in the RawNetworkMessage encoding, thus saving memory and reducing allocations.
To achieve this payload_len and checksum are kept in the RawNetworkMessage and checksum kept in CheckedData, to preserve invariants fields of the struct are made non-public.