-
Notifications
You must be signed in to change notification settings - Fork 889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(db): codec encoding/decoding #51
Conversation
/// BlockNumber concatenated with BlockHash. Used as a key for multiple tables. Having the first | ||
/// element as BlockNumber, helps out with querying/sorting. | ||
#[derive(Debug)] | ||
#[allow(non_camel_case_types)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we want non camel case here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted some representation of a concatenation of (BlockNumber, BlockHash, TxNumber), open to change it man_shrugging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo just using camel case is sufficient
crates/db/src/kv/tables.rs
Outdated
#[allow(non_camel_case_types)] | ||
type BNum_BHash_TxId = Vec<u8>; | ||
type RlpHeader = Vec<u8>; | ||
type BlockNumber_BlockHash_TxNumber = Vec<u8>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same q
crates/db/src/kv/codecs/scale.rs
Outdated
where | ||
T: ScaleOnly + parity_scale_codec::Decode + Sync + Send + std::fmt::Debug, | ||
{ | ||
fn decode(value: bytes::Bytes) -> Result<T, KVError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this take Into<Bytes>
?
0062f57
to
5b6f783
Compare
5b6f783
to
18b4345
Compare
crates/primitives/codecs/src/lib.rs
Outdated
if !path.path.segments.is_empty() { | ||
let _type = format!("{}", path.path.segments[0].ident); | ||
if compactable_types.contains(&_type.as_str()) { | ||
field.attrs.push(syn::parse_quote! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there's a better way to check the type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattsse may know, but seems fine to me, per L22 above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems fine for now
crates/db/src/kv/models/blocks.rs
Outdated
impl Decode for BlockNumHash { | ||
fn decode<B: Into<Bytes>>(value: B) -> Result<Self, KVError> { | ||
let value = value.into(); | ||
Ok(BlockNumHash((u64::decode(value.clone())?, H256::decode(value.slice(8..))?))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can avoid the clone here by doing the u64::decode(&value)
in a separate line from value.slice
? else let's impl decode for refs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the decode object is often provided as mu
, &mut BufMut
for example.
this way you can take what you want from a buffer and advance it once done.
crates/primitives/src/account.rs
Outdated
|
||
/// Account saved in database | ||
#[use_postcard] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be #[main_codec]`?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added it just to showcase that both can be used, but yea
crates/primitives/codecs/src/lib.rs
Outdated
if !path.path.segments.is_empty() { | ||
let _type = format!("{}", path.path.segments[0].ident); | ||
if compactable_types.contains(&_type.as_str()) { | ||
field.attrs.push(syn::parse_quote! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattsse may know, but seems fine to me, per L22 above
@@ -0,0 +1,58 @@ | |||
//! Block related models and types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also that similar XForStorage
types that Akula has, which we maybe could adopt as a pattern
This all looks good as it sets up the infra for encoding/decoding with configurable codec. Next steps:
Can be done in follow-up PR or any additional small changes you wanna do. |
crates/primitives/codecs/src/lib.rs
Outdated
@@ -0,0 +1,63 @@ | |||
use proc_macro::{self, TokenStream}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move this to derive
folder which is it's own proc-macro crate
crates/primitives/codecs/src/lib.rs
Outdated
if !path.path.segments.is_empty() { | ||
let _type = format!("{}", path.path.segments[0].ident); | ||
if compactable_types.contains(&_type.as_str()) { | ||
field.attrs.push(syn::parse_quote! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems fine for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understood it, this serves as a way to easily switch between codecs which we'll make use of for benchmarks?
once we settled on a codec this will be become obsolete, right?
can we add this to the readme?
@@ -11,6 +11,12 @@ description = "Staged syncing primitives used in reth." | |||
# reth | |||
reth-primitives = { path = "../primitives" } | |||
|
|||
# codecs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't these be optional?
@@ -0,0 +1,2 @@ | |||
mod postcard; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be feature gated?
reth-rlp = { path = "../common/rlp", features = ["derive"]} | ||
parity-scale-codec = { version = "3.2.1", features = ["derive", "bytes"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we're using scale, so when the macro #[main_codec]
gets expanded, it expects parity-scale-codec to be reachable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
couldn't find a straightforward way to include/exclude code & dependencies depending on a downstream dependency feature (primitives -> codecs!) . the if it's okay, i'll look into it another time |
Summary:
When data is to be stored or read from the database, the type should implement
table::Encoding
andtable::Decoding
so it knows how to transform from a structured type to a list of bytes.However, at the same time we have to care for how we encode this data. Either in a way that helps with sorting (
Table::Key
) or compactness (Table::Value
). That's why we add external codecs such as scale for example.Having them to be configurable through annotations allows us to contain their implementations/leakage to isolated portions of the project, and allows us to switch between them relatively easily (eg. benchmarking).
old
Added scale encoding, which allowed to remove most of the
impl
of some basic types.Added
Headers
example.Although, having to put#[codec(compact)]
in every integer is kinda of annoying. I was kinda hoping it would do it by default. Another limitation of the lib is that inner types are not getting compacted at all (Vec, Option, etc...)Added another one called
postcard
, which seemed to be good on the benchmarks. However, when I tested it withAccount
... it didn't perform as I expected it. I'd expect the maxsize to be 72 (u64 + 32bytes + 32bytes)... but it's encoding it to 144 bytes.. which is strange. Maybe I'm missing something.