Change in logic of `from_sub_trees_as_trees` constructor #98

storojs72 · 2022-03-28T12:12:28Z

This PR is rebased on top of #100 (that introduced framework for integration testing). It refactors from_sub_trees_as_trees constructor of MerkleTree (top-tree) instance in a more generic way, e.g. values of SubTreeArity and TopTreeArity can be either even or odd (previously only even numbers were supported and, with odd numbers, mentioned constructor panicked). This change has been evaluated using high-level seal-lifecycle tests for production sectors (32gib and 64gib). Logs of successful executing mentioned tests on proofs-buildbot are included.

This commit separates testing of MerkleTree's functions-constructors into separate test suite. This seems reasonable, since different constructors may use different datasets actually which in fact has impact on tree's root computation, expressing one more dimension that is a subject of additional testing. Analysis of unit-tests majority from test_xor128.rs source file one can see that a lot of tests have much in common in their internal logic, though they use different constructors for tree's instantiation and also different tree parameters (arity and storage type). In such circumstances we can use passing function as parameter to particular test. Here I tried to follow this approach and separate test logic into simple (yet generic over tree structure) function which takes instantiated tree (it doesn't matter how it was instantiated actually) and checks if tree's functionality (computing length, number of leaves, root and generating / verifying inclusion proofs) works as expected. Ultimately, I think that such approach will allow us having less number of unordered tests and consequently avoid duplication of test logic.

Constructor 'new' is basically a 'try_from_iter' with the only difference in representation of incoming dataset. It should be something that allows calling 'into_iter' on it. That's why we can extend existing 'try_from_iter' test by just replacing crafted 'new' constructor function.

As far as I understand, a 'from_byte_slice' constructor is an attempt to simplify instantiation of tree from raw bytes without requiring input dataset to be iterable Since Algorithm abstraction enforces particular way of hashing (as specified in RFC6962: https://datatracker.ietf.org/doc/html/rfc6962) to prevent some cryptographic attacks on Merkle Tree, random slice (even of correct length) can't be used as a source for input leaves. So we can use test suite for 'from_data' constructor, but keep in mind that preparation of the dataset is completely different (specified in 'instantiate_from_byte_slice' function).

This commit introduces testing of 'from_par_iter' constructor. It is very similar to its single-threaded friend 'from_iter' in a sense that it doesn't require any changes in expected values.

vmx

Only minor things.

vmx · 2022-03-28T15:28:17Z

src/merkle.rs

+    let sub_tree_arity = SubTreeArity::to_usize();
+    let base_tree_arity = BaseTreeArity::to_usize();
+
+    let base_tree_len = match get_merkle_tree_len(leaves, base_tree_arity) {


I think this can be written as let base_tree_len = get_merkle_tree_len(leaves, base_tree_arity)?;

Definitely! Thanks!

vmx · 2022-03-28T15:29:07Z

src/merkle.rs

@@ -1810,6 +1810,37 @@ impl Element for [u8; 32] {
    }
 }

+pub fn get_merkle_tree_len_generic<


Some documentation would be good as it is quite interesting in which cases which values are returned.

Added missing explanation

vmx · 2022-03-28T15:32:22Z

src/merkle.rs

+    assert!(get_merkle_tree_len_generic::<U4, U0, U0>(16).is_ok());
+    assert!(get_merkle_tree_len_generic::<U1, U0, U0>(3).is_ok());


You could also use assert_eq!() and test for the actual return value.

Sure. Added mentioned tests. They will never be redundant

vmx · 2022-03-28T15:39:37Z

src/test_merkle_constructors.rs

+use crate::hash::{Algorithm, Hashable};
+use crate::merkle::{
+    get_merkle_tree_len_generic, Element, FromIndexedParallelIterator, MerkleTree,
+};
+use crate::store::{Store, VecStore};
+use crate::test_common::{Item, Sha256Hasher, XOR128};
+use rayon::iter::{IntoParallelIterator, ParallelIterator};
+use typenum::{Unsigned, U0, U2, U3, U5};


I know it's not the case in this crate, but in most places we separate the imports (similar to what Python does, in case you're familiar with that). It's three blocks, separated by an empty line. First the std imports, than external dependencies, then crate dependencies.

In this case it would be

Suggested change

use crate::hash::{Algorithm, Hashable};

use crate::merkle::{

get_merkle_tree_len_generic, Element, FromIndexedParallelIterator, MerkleTree,

};

use crate::store::{Store, VecStore};

use crate::test_common::{Item, Sha256Hasher, XOR128};

use rayon::iter::{IntoParallelIterator, ParallelIterator};

use typenum::{Unsigned, U0, U2, U3, U5};

use rayon::iter::{IntoParallelIterator, ParallelIterator};

use typenum::{Unsigned, U0, U2, U3, U5};

use crate::hash::{Algorithm, Hashable};

use crate::merkle::{

get_merkle_tree_len_generic, Element, FromIndexedParallelIterator, MerkleTree,

};

use crate::store::{Store, VecStore};

use crate::test_common::{Item, Sha256Hasher, XOR128};

I mention it, as I think it makes sense to do it this way for new files. Though if you prefer consistency across this crate, feel free to keep it like this.

Got it. Introduced this change. Thanks for pointing. Will try to keep in mind

vmx · 2022-03-28T15:45:34Z

src/test_merkle_constructors.rs

+    let mut dataset = Vec::<E>::new();
+    for index in 0..leaves {
+        // we are ok with usize -> u8 conversion problems, since we need just predictable dataset
+        let vector: Vec<u8> = (0..E::byte_len()).map(|x| (index + x) as u8).collect();
+        let element = E::from_slice(vector.as_slice());
+        dataset.push(element);
+    }
+    dataset


I'd also write the outer loop with an functional-style iterator. This way you don't need mutability and it's easy to spot that you operate on all elements. I would write this function as:

(0..leaves).map(|index| { // we are ok with usize -> u8 conversion problems, since we need just predictable dataset let vector: Vec<u8> = (0..E::byte_len()).map(|x| (index + x) as u8).collect(); E::from_slice(vector.as_slice()) })

Added this refactoring. Thanks. That's a "negative" habit from procedural programming

vmx · 2022-03-28T15:45:59Z

src/test_merkle_constructors.rs

+    dataset
+}
+fn generate_vector_of_usizes(leaves: usize) -> Vec<usize> {
+    let mut dataset = Vec::with_capacity(leaves);


Same here, I'd use map().

ditto. Thanks

This commit integrates test of 'from_tree_slice' constructor of MerkleTree into existing test suite. It wasn't tested previously so, as it seems to me, there was a bug while getting value of root from raw tree's data, when we tried to extract last element from data using wrong indexing. Now this is fixed and tests are added. In subsequent commits I'm going to add testing of 'from_slices' constructor (for instantiating compound trees ) which internally uses 'from_tree_slice'.

This commits integrates 'from_data_store' constructor testing into existing test-suite. Function that instantiates tree from data store works as following: it creates new tree using already defined 'new' instantiator, then serializes its data and recreates new storage over existing one (note that we can't generically clone storage); after that new tree is instantiated using newly created storage. Since we use same dataset as for 'test_from_slice', we can easily combine these two constructors into single group in order to reduce number of lines for expressing new test.

cryptonemo · 2022-03-29T14:45:00Z

src/merkle.rs

@@ -447,7 +447,7 @@ impl<
        );

        let store = S::new_from_slice(tree_len, &data).context("failed to create data store")?;
-        let root = store.read_at(data.len() - 1)?;
+        let root = store.read_at(store.len() - 1)?;


Are you sure these are compatible? I thought one was in element counts and one was in data/byte counts(?)

Thanks, for pointing on this. This is a place, where I'm not sure completely, TBH. But two following thoughts caused me to introduce this change:

I have double-checked that from_tree_slice constructor doesn't have tests neither in this repository, nor in rust_fil_proofs - so it is impossible to check compatibility actually;

data from this constructor is actually a byte slice that, as far as I understand this, represents serialised storage in some custom format. But store is an instance of storage, created after deserialising data via S::new_from_slice, and if we want to get last element from store (which is root of tree), we need to use indexing of store - not data.

I added tests for this constructor, where base binary tree with vector storage is instantiated with 4 leaves (each leaf is a 16-bytes array) and basic functionality of the tree is evaluated. This tree has 7 elements in total (4 leaves + 2 nodes + 1 root), which gives 7 * 16 = 112 bytes slice for serialised tree (our data). With previous version of code I was getting panic:

thread 'test_merkle_constructors::test_from_tree_slice_group' panicked at 'index out of bounds: the len is 7 but the index is 111', src/store/vec.rs:86:12 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace test test_merkle_constructors::test_from_tree_slice_group ... FAILED

specifically at this place, when root was being read. And it is reasonable, since we use data which is 112 bytes slice, instead of store (of VecStore type) which is actually vector of elements (Element type, which in my case is Item - 16-bytes array), so according to the logic in mind, we need to take last Item from our store, which has index: [store.len() - 1].

storojs72 · 2022-03-30T10:39:45Z

@vmx, thanks for initial review. I will introduce your remarks, once finishing with other constructors

Previously compound trees were tested (in context of new test suite) redundantly. E.g. for 'from_trees' constructor we performed tests varying base tree constructors which is not needed actually to be ensure that 'from_trees' works as expected. It is sufficient to use some single base tree constructor and run tests one per each dataset group, which is actually introduced in this commit.

cryptonemo

I like the map usage suggestions, but looks good overall

storojs72 · 2022-04-02T10:25:39Z

@cryptonemo , take a look please at 46fb497. Here I introduced changes into from_sub_tree_as_trees construction of compound-compound tree. Previously it didn't have tests, so I couldn't definitely understand its intention. My suspicion regarding this is that given an ordered set of base trees, this constructor groups base trees according to provided sub-tree arity, then instantiates at first set of compound trees and finally combines them into compound-compound tree. I don't know for sure whether split_off was (and is now) used properly, so now I changed logic a bit, according to my understanding and added proper test for it. Let me know please, if I missed something

storojs72 · 2022-04-05T16:35:58Z

@vmx , @cryptonemo , I've finally come up with all constructors covered. And I added proposed changes. Take please a look once again. In a meanwhile, I will try to experiment with tests in rust-fil-proofs, in order to evaluate split_off-based changes in from_sub_tree_as_trees constructor, which is questionable. Thanks!

vmx

I haven't reviewed all constructors closely. If I should, please let me know.

I saw a few for loops that could be map()s, but in some cases I can also imagine that the type annotations might become crazy. So I'm OK with just keeping the current state.

storojs72 · 2022-04-06T11:12:52Z

I can also imagine that the type annotations might become crazy

That's true, you are right, @vmx ! I also tried to replace for loops with functional constructions where possible

storojs72 · 2022-04-11T13:46:18Z

Last commit is a result of investigation - why some tests of rust-fil-proofs (which is main consumer of merkletree dependency) was failing. The reason was in from_sub_trees_as_trees constructor, which is used here for building 8_8_2 compound-compound trees while running most of the tests with actual sectors. While introducing the change in 46fb497 I didn't consider that order of trees is critical, so after analysing some of rust-fil-proofs test execution (where mentioned constructor is involved), I've come up with workaround. I've also checked that tests successfully pass on CI of rust-fil-proofs if we pick this branch of merkletree dependency.

As all constructors are finally covered, and we investigated the reason of high-level fails - I think this PR is ready for merging, so I remove 'WIP' label in its name

storojs72 · 2022-04-11T16:59:54Z

As we agreed with @cryptonemo , it is worth to perform testing of this branch with 32/64 GiG sectors used in production. So let's no rush with merging it

…constructors_testing

cryptonemo · 2022-04-19T15:37:27Z

src/merkle.rs

-        let mut grouped_trees = Vec::with_capacity(sub_tree_count);
-        for _ in (0..sub_tree_count).step_by(trees.len() / sub_tree_count) {
-            grouped_trees.push(trees.split_off(trees.len() / sub_tree_count));
+        let mut grouped_trees = Vec::with_capacity(top_tree_arity);


Can you add a test (or otherwise point it out if it exists) that shows what cases the previous code here breaks in?

I created reproducible example that demonstrates mentioned caveat of from_sub_trees_as_trees constructor in this branch. If you pull it and run:

cargo test reproducible_example --release -- --nocapture

you should see that for from_sub_trees_as_trees successfully instantiates 2-4-2 compound-compound tree, but fails on 2-5-3. It is also visible on our CI.

This commit drops DiskStore-based trees from test_xor128.rs as they are presented in new test suite (in tests/test_base_constructors.rs) and also moves some missing tests from #98

storojs72 · 2022-04-23T16:42:14Z

@cryptonemo, I narrow down the scope of this PR to discussion of from_sub_trees_as_trees constructor, by introducing #100. It allows to unblock addition of integration tests.

This commit drops introduced constructors' in order to simplify merging #100

storojs72 · 2022-04-23T16:56:30Z

I upload full log of successful executing test_seal_lifecycle_32gib_porep_id_v1_1_top_8_8_0_api_v1_1 test from rust-fil-proofs: 32gib.txt. I'm still working on executing similar 64gib test in order to check compatibility of from_sub_trees_as_trees with sectors used in production

storojs72 · 2022-05-06T13:18:21Z

Uploading full log of successful executing test_seal_lifecycle_64gib_porep_id_v1_1_top_8_8_2_api_v1_1 test:
64gib.txt

…constructors_testing

vmx

The new code is way easier to follow. I think it could even be more clear using map instead of for loops.

vmx · 2022-06-17T14:43:08Z

I forgot to add that I think it would be good to squash it into a single commit with a good commit message explaining that the old code was wrong etc.

storojs72 · 2022-06-21T11:12:17Z

I think it could even be more clear using map instead of for loops.

Yes. Will keep in mind in further PRs.

storojs72 added 4 commits March 28, 2022 14:09

Add testing of 'from_par_iter' constructor

30f2384

This commit introduces testing of 'from_par_iter' constructor. It is very similar to its single-threaded friend 'from_iter' in a sense that it doesn't require any changes in expected values.

storojs72 added the testing Improvements in unit/integration tests label Mar 28, 2022

storojs72 requested review from vmx, dignifiedquire, DrPeterVanNostrand and cryptonemo March 28, 2022 12:12

storojs72 self-assigned this Mar 28, 2022

vmx reviewed Mar 28, 2022

View reviewed changes

storojs72 added 2 commits March 29, 2022 16:38

cryptonemo reviewed Mar 29, 2022

View reviewed changes

storojs72 added 4 commits March 29, 2022 18:24

Add testing of 'new_with_config' constructor

466f5dd

Add testing of 'try_from_iter_with_config' constructor

9cf1ce5

Add testing of 'from_data_with_config' constructor

bd966ad

Add testing of 'from_tree_slice_with_config' constructor

4b296d0

storojs72 force-pushed the constructors_testing branch from d89b2f3 to 4b296d0 Compare March 30, 2022 10:10

storojs72 added 5 commits March 30, 2022 13:47

Add testing of 'from_par_iter_with_config' constructor

c428080

Add testing of 'from_byte_slice_with_config' constructor

fae5eb3

Narrowing testing surface of compound-compound tree constructors

c797f2e

Add testing of 'from_stores' constructor

a61cc46

cryptonemo approved these changes Mar 30, 2022

View reviewed changes

Add testing of 'from_sub_tree_as_trees' constructor

46fb497

storojs72 added 2 commits April 2, 2022 15:07

Add testing of 'from_slices' constructor

91090d5

Add testing of 'from_slices_with_configs' constructor

849f919

storojs72 added 3 commits April 5, 2022 17:13

Add testing of 'from_store_configs_and_replica' constructor

f719aa9

Add testing of 'from_sub_tree_store_configs_and_replica' constructor

f860ec4

Add proposed refactoring

f730916

storojs72 requested review from vmx and cryptonemo April 5, 2022 16:31

vmx approved these changes Apr 6, 2022

View reviewed changes

Replace loops with functional style constructions where possible

63df75c

Modify 'from_sub_trees_as_trees' constructor to preserve compatibility

2be824d

storojs72 force-pushed the constructors_testing branch from 88a316a to 2be824d Compare April 8, 2022 16:56

storojs72 changed the title ~~[WIP] Introduce testing of MerkleTree constructors~~ Introduce testing of MerkleTree constructors Apr 11, 2022

Merge branch 'master' of github.com:filecoin-project/merkletree into …

7354be1

…constructors_testing

cryptonemo reviewed Apr 19, 2022

View reviewed changes

storojs72 mentioned this pull request Apr 23, 2022

Integration tests #100

Merged

storojs72 changed the title ~~Introduce testing of MerkleTree constructors~~ Change in logic of from_sub_trees_as_trees constructor Apr 23, 2022

Drop constructors testing

fafee7e

This commit drops introduced constructors' in order to simplify merging #100

storojs72 added 2 commits May 9, 2022 18:37

Merge branch 'master' of github.com:filecoin-project/merkletree into …

acfd7ff

…constructors_testing

Fix CI routine

237542f

vmx approved these changes Jun 17, 2022

View reviewed changes

storojs72 merged commit 6f3cbd5 into master Jun 21, 2022

storojs72 deleted the constructors_testing branch June 21, 2022 11:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change in logic of `from_sub_trees_as_trees` constructor #98

Change in logic of `from_sub_trees_as_trees` constructor #98

storojs72 commented Mar 28, 2022 •

edited

vmx left a comment

vmx Mar 28, 2022

storojs72 Apr 5, 2022

vmx Mar 28, 2022

storojs72 Apr 5, 2022

vmx Mar 28, 2022

storojs72 Apr 5, 2022

vmx Mar 28, 2022

storojs72 Apr 5, 2022

vmx Mar 28, 2022

storojs72 Apr 5, 2022 •

edited

vmx Mar 28, 2022

storojs72 Apr 5, 2022

cryptonemo Mar 29, 2022

storojs72 Mar 30, 2022

storojs72 commented Mar 30, 2022

cryptonemo left a comment

storojs72 commented Apr 2, 2022

storojs72 commented Apr 5, 2022

vmx left a comment

storojs72 commented Apr 6, 2022 •

edited

storojs72 commented Apr 11, 2022 •

edited

storojs72 commented Apr 11, 2022 •

edited

cryptonemo Apr 19, 2022

storojs72 Apr 19, 2022

storojs72 commented Apr 23, 2022 •

edited

storojs72 commented Apr 23, 2022 •

edited

storojs72 commented May 6, 2022

vmx left a comment

vmx commented Jun 17, 2022

storojs72 commented Jun 21, 2022

		assert!(get_merkle_tree_len_generic::<U4, U0, U0>(16).is_ok());
		assert!(get_merkle_tree_len_generic::<U1, U0, U0>(3).is_ok());

Change in logic of from_sub_trees_as_trees constructor #98

Change in logic of from_sub_trees_as_trees constructor #98

Conversation

storojs72 commented Mar 28, 2022 • edited

vmx left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

storojs72 Apr 5, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

storojs72 commented Mar 30, 2022

cryptonemo left a comment

Choose a reason for hiding this comment

storojs72 commented Apr 2, 2022

storojs72 commented Apr 5, 2022

vmx left a comment

Choose a reason for hiding this comment

storojs72 commented Apr 6, 2022 • edited

storojs72 commented Apr 11, 2022 • edited

storojs72 commented Apr 11, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

storojs72 commented Apr 23, 2022 • edited

storojs72 commented Apr 23, 2022 • edited

storojs72 commented May 6, 2022

vmx left a comment

Choose a reason for hiding this comment

vmx commented Jun 17, 2022

storojs72 commented Jun 21, 2022

Change in logic of `from_sub_trees_as_trees` constructor #98

Change in logic of `from_sub_trees_as_trees` constructor #98

storojs72 commented Mar 28, 2022 •

edited

storojs72 Apr 5, 2022 •

edited

storojs72 commented Apr 6, 2022 •

edited

storojs72 commented Apr 11, 2022 •

edited

storojs72 commented Apr 11, 2022 •

edited

storojs72 commented Apr 23, 2022 •

edited

storojs72 commented Apr 23, 2022 •

edited