Replace tables with maps in merkle masks #14595
Conversation
!ci-build-me
```ocaml
; accounts = Location_binable.Map.empty
; token_owners = Token_id.Map.empty
; hashes = Addr.Map.empty
; locations = Account_id.Map.empty
```
@nholland94 one thing I noticed a while back is that comparing and hashing fields performs quite a bit of work (and allocations). That (among many other things) affects tables and maps that use `Account_id.t` keys.

The hashing part is probably easily avoided by moving away from hash tables, as in this PR (although any composite type that contains field values will still pay the cost when hashed), but if the `Field.compare` function is improved, that would still help here (for example, for a lookup that has to go through k=290 masks, the key being queried needs to be converted 290 times with the current implementation).

What do you think? Could implementing `compare` as an `external` that directly compares the values in Rust help? I see that this is already done when implementing the custom pointer type's `compare` operation handler:
mina/src/lib/crypto/kimchi_bindings/stubs/src/arkworks/pasta_fp.rs
Lines 31 to 45 in 44af5c2
```rust
    unsafe extern "C" fn ocaml_compare(x: ocaml::Raw, y: ocaml::Raw) -> i32 {
        let x = x.as_pointer::<Self>();
        let y = y.as_pointer::<Self>();
        match x.as_ref().0.cmp(&y.as_ref().0) {
            core::cmp::Ordering::Less => -1,
            core::cmp::Ordering::Equal => 0,
            core::cmp::Ordering::Greater => 1,
        }
    }
}

ocaml::custom!(CamlFp {
    finalize: CamlFp::caml_pointer_finalize,
    compare: CamlFp::ocaml_compare,
});
```
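For reference, the limb-ordering semantics such a comparison relies on can be sketched as follows. This is a self-contained model, not the actual binding: `cmp_limbs_le` is a hypothetical name, and the `[u64; 4]` arrays stand in for arkworks' `BigInteger256`, whose `cmp` scans the little-endian limbs from most significant to least significant.

```rust
use std::cmp::Ordering;

// Model of BigInteger-style comparison: limbs are stored little-endian
// (limb 0 is least significant), so compare from the top limb down.
fn cmp_limbs_le(a: &[u64; 4], b: &[u64; 4]) -> Ordering {
    for i in (0..4).rev() {
        match a[i].cmp(&b[i]) {
            Ordering::Equal => (), // tie, look at the next lower limb
            ord => return ord,
        }
    }
    Ordering::Equal
}

fn main() {
    // A larger top limb wins regardless of the lower limbs:
    assert_eq!(cmp_limbs_le(&[5, 0, 0, 1], &[9, 0, 0, 0]), Ordering::Greater);
    assert_eq!(cmp_limbs_le(&[1, 2, 3, 4], &[1, 2, 3, 4]), Ordering::Equal);
}
```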
Forgot the link to the functions I am talking about:
mina/src/lib/crypto/kimchi_backend/common/field.ml
Lines 180 to 197 in 44af5c2
```ocaml
let to_bignum_bigint n =
  let rec go i two_to_the_i acc =
    if Int.equal i size_in_bits then acc
    else
      let acc' =
        if Bigint.test_bit n i then Bignum_bigint.(acc + two_to_the_i)
        else acc
      in
      go (i + 1) Bignum_bigint.(two_to_the_i + two_to_the_i) acc'
  in
  go 0 Bignum_bigint.one Bignum_bigint.zero

let hash_fold_t s x =
  Bignum_bigint.hash_fold_t s (to_bignum_bigint (to_bigint x))

let hash = Hash.of_fold hash_fold_t

let compare t1 t2 = Bigint.compare (to_bigint t1) (to_bigint t2)
```
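To make the cost concrete: the loop above performs one bignum doubling per iteration and one bignum addition per set bit, allocating intermediate bignums throughout. A self-contained model of the same bit-by-bit accumulation over a `u64` stand-in (illustrative only, not Mina code):

```rust
// Bit-by-bit accumulation mirroring to_bignum_bigint's recursion:
// acc' = acc + 2^i when bit i of n is set; two_to_the_i doubles each round.
// With a real 255-bit field element, every step is a bignum operation.
fn to_int_bitwise(n: u64) -> u64 {
    let mut acc: u64 = 0;
    let mut two_to_the_i: u64 = 1;
    for i in 0..64 {
        if (n >> i) & 1 == 1 {
            acc += two_to_the_i;
        }
        // wrapping_mul: the final doubling overflows but is never used.
        two_to_the_i = two_to_the_i.wrapping_mul(2);
    }
    acc
}

fn main() {
    // Summing the set powers of two reconstructs the input exactly.
    assert_eq!(to_int_bitwise(0xDEAD_BEEF), 0xDEAD_BEEF);
    assert_eq!(to_int_bitwise(0), 0);
}
```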
Nice catch. Yes, it should make a big difference to actually do the comparison in Rust instead of what we are currently doing in the kimchi backend. We have moved other data structures that were hashing many field elements to instead compare field elements, so this optimization would be very valuable overall.
We want to be careful to preserve the behaviour: the above should convert via `into_repr` in Rust before comparing, to ensure that we can still use it in the protocol and o1js.

More generally, though, the function is badly implemented: it should use bitmasks, and also let Rust return an array of `int`s that we can project to a bigint instead of going bitwise, which will also dramatically improve bin_prot performance.
The version I linked above already uses `into_repr()` internally in the implementation of `cmp` (EDIT: but I imagine an explicit `into_repr()` will provide a stronger guarantee), though I may be missing some details related to o1js:
https://github.com/arkworks-rs/algebra/blob/v0.3.0/ff/src/fields/macros.rs#L513-L518
Btw, `caml_pasta_fp/fq_compare` already exist, so changing `compare` is a few lines of diff (just created a very rough PR here). I will get back to it tomorrow and look into improving `to_bignum_bigint`.
Regarding `to_bignum_bigint`, I see that returning `int64` values is not currently supported by `ocaml-gen`, but returning regular ints with at most 32 bits filled works (assuming OCaml with 63-bit ints). What is missing is a way to construct a bigint at once from all the input ints, instead of having to create 8 temporary ones (from which additional temporary bigints are then created through shifts and `lor` operations). It is still better than going bit by bit, but still not ideal.

Something like this (incomplete and untested):
```ocaml
let combine_ints_to_bigint a b c d e f g h =
  let open Bignum_bigint in
  let a' = shift_left (of_int a) 224 in
  let b' = shift_left (of_int b) 192 in
  let c' = shift_left (of_int c) 160 in
  let d' = shift_left (of_int d) 128 in
  let e' = shift_left (of_int e) 96 in
  let f' = shift_left (of_int f) 64 in
  let g' = shift_left (of_int g) 32 in
  let h' = of_int h in
  a' lor b' lor c' lor d' lor e' lor f' lor g' lor h'
```
```rust
#[ocaml_gen::func]
#[ocaml::func]
pub fn caml_pasta_fp_to_ints(x: ocaml::Pointer<CamlFp>) -> Vec<i32> {
    let repr = x.as_ref().0.into_repr();
    let mut result: [i32; 8] = [0; 8];
    for (i, &value) in repr.0.iter().enumerate() {
        result[i * 2] = (value & 0xFFFFFFFF) as i32; // Lower 32 bits
        result[i * 2 + 1] = (value >> 32) as i32; // Upper 32 bits
    }
    result.to_vec()
}
```
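A self-contained check of the limb-splitting idea above, modelling `repr.0` as a plain `[u64; 4]` with no ocaml-rs dependency (the function names are invented, and `u32` is used instead of `i32` to sidestep sign questions): splitting each 64-bit limb into two 32-bit words, low word first, is lossless.

```rust
// Split each 64-bit limb into two 32-bit words, low word first,
// matching the order used in caml_pasta_fp_to_ints.
fn limbs_to_words(limbs: &[u64; 4]) -> [u32; 8] {
    let mut out = [0u32; 8];
    for (i, &limb) in limbs.iter().enumerate() {
        out[i * 2] = (limb & 0xFFFF_FFFF) as u32; // lower 32 bits
        out[i * 2 + 1] = (limb >> 32) as u32; // upper 32 bits
    }
    out
}

// Inverse: recombine the 32-bit words into 64-bit limbs.
fn words_to_limbs(words: &[u32; 8]) -> [u64; 4] {
    let mut out = [0u64; 4];
    for i in 0..4 {
        out[i] = (words[i * 2] as u64) | ((words[i * 2 + 1] as u64) << 32);
    }
    out
}

fn main() {
    let limbs = [0x0123_4567_89AB_CDEFu64, 1, u64::MAX, 42];
    // The round trip is the identity, so no information is lost.
    assert_eq!(words_to_limbs(&limbs_to_words(&limbs)), limbs);
}
```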
> What is missing is a way to construct a bigint at once from all the input ints instead of having to create 8 temporary ones (from which then additional temporary bigints are created through shifts and lor operations)

This can be done purely with Zarith, without intermediary bignums and shifts (`Bignum_bigint` doesn't seem to expose `of_bits`); note that, according to the docs, the sign is not included in that conversion:
```ocaml
let combine_ints_to_z a b c d e f g h =
  let buffer = Bytes.create 32 in
  Bytes.set_int32_le buffer 0 (Int32.of_int a) ;
  Bytes.set_int32_le buffer 4 (Int32.of_int b) ;
  Bytes.set_int32_le buffer 8 (Int32.of_int c) ;
  Bytes.set_int32_le buffer 12 (Int32.of_int d) ;
  Bytes.set_int32_le buffer 16 (Int32.of_int e) ;
  Bytes.set_int32_le buffer 20 (Int32.of_int f) ;
  Bytes.set_int32_le buffer 24 (Int32.of_int g) ;
  Bytes.set_int32_le buffer 28 (Int32.of_int h) ;
  Z.of_bits (Bytes.unsafe_to_string buffer)
```
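To illustrate the byte-order convention this relies on (a minimal model, not the Zarith API): `Z.of_bits` reads the buffer as a little-endian unsigned number, so a word written at a lower offset is less significant. The 256-bit value is modelled here as a `u128` built from four 32-bit words:

```rust
// Pack 32-bit words into a little-endian byte buffer, word i at offset 4*i,
// mirroring the Bytes.set_int32_le calls in combine_ints_to_z.
fn pack_words_le(words: &[u32; 4]) -> [u8; 16] {
    let mut buf = [0u8; 16];
    for (i, w) in words.iter().enumerate() {
        buf[i * 4..(i + 1) * 4].copy_from_slice(&w.to_le_bytes());
    }
    buf
}

fn main() {
    let words = [0xAAAA_0001u32, 0xBBBB_0002, 0xCCCC_0003, 0xDDDD_0004];
    // Interpreting the buffer little-endian (as Z.of_bits does) ...
    let n = u128::from_le_bytes(pack_words_le(&words));
    // ... is equivalent to summing words[i] << (32 * i):
    let expected: u128 = words
        .iter()
        .enumerate()
        .map(|(i, &w)| (w as u128) << (32 * i))
        .sum();
    assert_eq!(n, expected);
}
```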
Just pushed a version (different to what was discussed here) to #14599. It is a bit hackier in that it goes through bin_prot (because that is the only way exposed by `Bignum_bigint` to convert raw bytes into a bigint), but it should be more efficient.
!ci-build-me
!ci-nightly-me
!ci-nightly-me |
!ci-nightly-me |
Merged commit 6458ec8 into feature/sparse-ledger-wide-merkle-paths
This PR builds on #14594.

In this PR, the hash tables used in merkle masks are replaced with maps, in order to avoid hashing data at every single mask traversal when evaluating ledger reads. This showed a ~7.5x speedup in applying transactions to a chain of k=290 masks.
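As a rough illustration of the access pattern this optimizes (a toy model, not Mina's actual mask interface; the names here are invented): a read walks up the chain of masks, and with a tree-based map each level costs only key comparisons, whereas a hash table would rehash the full key at every one of the k levels.

```rust
use std::collections::BTreeMap;

// Toy model of a mask chain: each mask overlays its parent; a read checks
// the overlay first (key comparisons only, no hashing of the key), then
// falls through to the parent mask.
struct Mask<'a> {
    overlay: BTreeMap<String, u64>,
    parent: Option<&'a Mask<'a>>,
}

impl<'a> Mask<'a> {
    fn get(&self, key: &str) -> Option<u64> {
        match self.overlay.get(key) {
            Some(v) => Some(*v),
            None => self.parent.and_then(|p| p.get(key)),
        }
    }
}

fn main() {
    let base = Mask {
        overlay: BTreeMap::from([("alice".to_string(), 100)]),
        parent: None,
    };
    let child = Mask {
        overlay: BTreeMap::from([("bob".to_string(), 50)]),
        parent: Some(&base),
    };
    assert_eq!(child.get("bob"), Some(50)); // found in the top overlay
    assert_eq!(child.get("alice"), Some(100)); // falls through to the parent
    assert_eq!(child.get("carol"), None); // absent at every level
}
```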