Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Namefilter and HAMT Implementation and Tests #31

Merged
merged 6 commits into from
Jul 20, 2022

Conversation

appcypher
Copy link
Member

@appcypher appcypher commented Jun 28, 2022

Summary

Implement the updated HAMT and Namefilter specs. The new spec updates the the saturation and hash algorithm used by the namefilter. Some decisions were also made around what kind of degree WNFS HAMT should have among other things.

This PR implements the following features

  • Updated WNFS HAMT spec
  • Updated Namefilter spec

Test plan (required)

  • Testing the Rust core.

    cargo test -p wnfs --release
  • Testing the wasm bindings.

    cd crates/wasm
    yarn playwright test

Closing issues

Fixes #26
Fixes #24

@appcypher appcypher marked this pull request as ready for review July 8, 2022 11:26
@appcypher appcypher requested a review from matheus23 July 8, 2022 11:26
Copy link
Member

@matheus23 matheus23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HAMT

I have some comments about canonicity. It should be the case that given the same keys and values that your HAMT stores, you should always get back the same root CID, no matter what order the keys were inserted in or which intermediate operations were run. It's some quite tricky logic, but that's what gives us the really strong "if the CID is equal, the tree is equal" logic.

I think we've somewhat diverged from the original codebase now, but we can still take some inspiration. Apparently they just implement a clean() function, maybe that's the way to go for us, too?

https://github.com/filecoin-project/ref-fvm/blob/master/ipld/hamt/src/pointer.rs#L114

Before we put this into production we should definitely get some property tests going for this for sure, but that should be another PR ✌️

Namefilter

Saturation needs a slight change to be correct. It's described in the pseudocode in the spec branch.

Same thing with the property tests applies here, too. These structures have so many interesting mathematical properties, and we should write property tests to check them :) (In another PR)

crates/fs/private/hamt/node.rs Outdated Show resolved Hide resolved
crates/fs/private/hamt/node.rs Outdated Show resolved Hide resolved
crates/fs/private/namefilter/bloomfilter.rs Show resolved Hide resolved
crates/fs/private/namefilter/bloomfilter.rs Outdated Show resolved Hide resolved
crates/fs/private/namefilter/namefilter.rs Outdated Show resolved Hide resolved
crates/fs/private/namefilter/namefilter.rs Outdated Show resolved Hide resolved
crates/fs/private/namefilter/namefilter.rs Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Jul 8, 2022

Codecov Report

Merging #31 (059f37d) into main (0a6d8b6) will increase coverage by 2.75%.
The diff coverage is 83.06%.

@@            Coverage Diff             @@
##             main      #31      +/-   ##
==========================================
+ Coverage   77.11%   79.87%   +2.75%     
==========================================
  Files          17       18       +1     
  Lines         970     1262     +292     
==========================================
+ Hits          748     1008     +260     
- Misses        222      254      +32     
Impacted Files Coverage Δ
crates/fs/common/error.rs 0.00% <ø> (-50.00%) ⬇️
crates/fs/common/metadata.rs 68.18% <0.00%> (ø)
crates/fs/private/hamt/pointer.rs 66.15% <64.00%> (-2.14%) ⬇️
crates/fs/private/hamt/hamt.rs 80.95% <80.95%> (+3.17%) ⬆️
crates/fs/private/hamt/node.rs 83.90% <82.85%> (+14.67%) ⬆️
crates/fs/private/namefilter/bloomfilter.rs 88.88% <88.88%> (ø)
crates/fs/private/namefilter/namefilter.rs 90.90% <90.90%> (ø)
crates/fs/private/hamt/hash.rs 97.22% <95.83%> (-2.78%) ⬇️
crates/fs/common/link.rs 56.81% <100.00%> (+3.40%) ⬆️
... and 2 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

Copy link
Member

@matheus23 matheus23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a typo and a minor suggestion, but otherwise good to go 👍

I see that we're doing a bunch of values.iter().position(|p| p.key == *key) at the moment, an we've talked about "lookup-by-hash-of-key", so yeah that'd need to change to support that eventually, which seems like a time vs. space trade-off. I think in the worst case it's going from 3 256-byte equality checks to 3 256-byte sha3 + 32-byte equality checks. I'm leaning towards space in this case 🤔

👆 That's me musing about what we've talked about today, not really related to the PR.

let mut node = (**self).clone();
node.pointers[value_index] = Pointer::Link(Link::from(child));
if value.is_some() {
// If something has been deleted, we attempt toc canonicalize the pointer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo - toc -> to

if value.is_some() {
// If something has been deleted, we attempt toc canonicalize the pointer.
if let Some(pointer) =
Pointer::Link(Link::from(child)).canonicalize(store).await?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. This seems somewhat awkward. Would it help if canonicalize would work on a &mut self and you'd just call it on node.pointers with the child modified?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid a situation where mutation happens in a call to another function and doesn't happen in the local scope. I probably took the local mutation thing too far 😅. I don't mind making it an &mut self.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will merge this PR but I'm going to add the change to the other PR since I have made some other changes there too.

@appcypher appcypher merged commit ed608aa into main Jul 20, 2022
@matheus23 matheus23 deleted the appcypher/fix-hamt-namefilter branch July 20, 2022 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2048-bit Namefilters Private Partition File Index (HAMT)
3 participants