Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store: make RawTrieNode borsh serialisable #8744

Merged
merged 6 commits into from
Apr 17, 2023
Merged

Conversation

mina86
Copy link
Contributor

@mina86 mina86 commented Mar 17, 2023

Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived. While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects. However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value. For example {4u8}{key_len:u32}{key}{value}
might be added for a leaf. This would essentially mean that value
comprises of the rest of encoded bytes. The issue for borsh is that
it would like to know length of {value} before it reads it. Either
redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications. Borsh might
actually be slower than hand-implemented decoder which operates on
a slice. If this is actually the case I haven’t measured.

Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value.  For example <4u8><key_len:u32><key><value>
might be added for a leaf.  This would essentially mean that value
comprises of the rest of encoded bytes.  The issue for borsh is that
it would like to know length of <value> before it reads it.  Either
redundant <value_len:u32> would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications.  Borsh might
actually be slower than hand-implemented decoder which operates on
a slice.  If this is actually the case I haven’t measured.
@mina86 mina86 requested a review from a team as a code owner March 17, 2023 02:03
@mina86 mina86 requested a review from mzhangmzz March 17, 2023 02:03
Copy link
Contributor

@robin-near robin-near left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much cleaner, thanks!

core/store/src/trie/raw_node.rs Outdated Show resolved Hide resolved
@mina86
Copy link
Contributor Author

mina86 commented Mar 29, 2023

Can I haz S-automerge?

@mina86
Copy link
Contributor Author

mina86 commented Apr 13, 2023

Friendly ping.

@near-bulldozer near-bulldozer bot merged commit 5fc27b6 into near:master Apr 17, 2023
@mina86 mina86 deleted the c branch April 17, 2023 19:56
nikurt pushed a commit that referenced this pull request Apr 18, 2023
Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value.  For example {4u8}{key_len:u32}{key}{value}
might be added for a leaf.  This would essentially mean that value
comprises of the rest of encoded bytes.  The issue for borsh is that
it would like to know length of {value} before it reads it.  Either
redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications.  Borsh might
actually be slower than hand-implemented decoder which operates on
a slice.  If this is actually the case I haven’t measured.
nikurt pushed a commit that referenced this pull request Apr 25, 2023
Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value.  For example {4u8}{key_len:u32}{key}{value}
might be added for a leaf.  This would essentially mean that value
comprises of the rest of encoded bytes.  The issue for borsh is that
it would like to know length of {value} before it reads it.  Either
redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications.  Borsh might
actually be slower than hand-implemented decoder which operates on
a slice.  If this is actually the case I haven’t measured.
nikurt pushed a commit that referenced this pull request Apr 25, 2023
Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value.  For example {4u8}{key_len:u32}{key}{value}
might be added for a leaf.  This would essentially mean that value
comprises of the rest of encoded bytes.  The issue for borsh is that
it would like to know length of {value} before it reads it.  Either
redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications.  Borsh might
actually be slower than hand-implemented decoder which operates on
a slice.  If this is actually the case I haven’t measured.
nikurt pushed a commit that referenced this pull request Apr 25, 2023
Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value.  For example {4u8}{key_len:u32}{key}{value}
might be added for a leaf.  This would essentially mean that value
comprises of the rest of encoded bytes.  The issue for borsh is that
it would like to know length of {value} before it reads it.  Either
redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications.  Borsh might
actually be slower than hand-implemented decoder which operates on
a slice.  If this is actually the case I haven’t measured.
nikurt pushed a commit that referenced this pull request Apr 25, 2023
Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value.  For example {4u8}{key_len:u32}{key}{value}
might be added for a leaf.  This would essentially mean that value
comprises of the rest of encoded bytes.  The issue for borsh is that
it would like to know length of {value} before it reads it.  Either
redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications.  Borsh might
actually be slower than hand-implemented decoder which operates on
a slice.  If this is actually the case I haven’t measured.
nikurt pushed a commit that referenced this pull request Apr 28, 2023
Replace Children::encode_into and decode methods by implementations of
BorshSerializable and BorshDeserializable traits and with that change
RawTrieNode into an enum for which borsh implementations can be
derived.  While the first step is quite straightforward, the second
requires RawTrieNode::Branch variant being split into two:
BranchNoValue and BranchWithValue.

The main benefit of this is less code (the commit has delta of -68
lines) and API consistent with other serialised objects.  However,
this doesn’t come without downsides.

Firstly, having separate BranchNoValue and BranchWithValue branches
means that it’s now a bit more awkward to handle branch as a single
case with common code paths having to be refactored.

Secondly, one might imagine in the future a new representation for
nodes which embed value.  For example {4u8}{key_len:u32}{key}{value}
might be added for a leaf.  This would essentially mean that value
comprises of the rest of encoded bytes.  The issue for borsh is that
it would like to know length of {value} before it reads it.  Either
redundant {value_len:u32} would need to be included or some hacks
employed (namely not using borsh for RawTrieNodeWithSize and then
consuming the rest of the reader when decoding RawTrieNode).

Lastly, I’m not certain about performance implications.  Borsh might
actually be slower than hand-implemented decoder which operates on
a slice.  If this is actually the case I haven’t measured.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants