Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merkle tree abstraction #4626

Merged
merged 13 commits into from
May 24, 2020
Merged

merkle tree abstraction #4626

merged 13 commits into from
May 24, 2020

Conversation

arvidn
Copy link
Owner

@arvidn arvidn commented May 9, 2020

This is a proposed abstraction for the per-file merkle tree. Currently this tree is a std::vector<sha256_hash> that's allocated to fit the full tree.

The goal of this abstraction is to enable a sparse representation of this tree, but this is not implemented yet. The aux::merkle_tree class does not allow direct write access to the tree, but the tree can only be mutated via a few high level operations. I believe these operations are at somewhat reasonable levels.

I've divided up the patch in bite-sized commits that can be reviewed individually.

@ssiloti I'm interested in whether you think this is a reasonable abstraction, or if you anticipate problems I may run into going further in this direction.

@arvidn arvidn force-pushed the merkle-tree branch 2 times, most recently from 4c886d2 to 808b856 Compare May 9, 2020 14:21
src/merkle_tree.cpp Outdated Show resolved Hide resolved
@ssiloti
Copy link
Collaborator

ssiloti commented May 9, 2020

This seems like a reasonable approach to me. The one minor concern I have is that when the tree is sparse peers could craft a hash request which would require a large number of hash computations to generate proof hashes. E.g. request two leaf hashes and proofs all the way back to the file root.

I'm not sure if this would be a problem in practice. I don't think it would be very effective as a DOS attack. It may still be a good idea to rate limit hash requests which require a lot of hash (re)computation.

@arvidn
Copy link
Owner Author

arvidn commented May 9, 2020

thanks for the review. Yeah, I should spend more time on improving the names and comments. I will also add unit tests for the merkle_tree type before making it support sparse modes.

Maybe a (future) approach to the DoS potential would be to keep score of how many hashes have been computed in response to a hash request (i.e. proofs higher up the tree have a higher cost) and rate limit based on that.

@arvidn
Copy link
Owner Author

arvidn commented May 9, 2020

btw. To further justify this whole project; the reason I consider the memory usage a big deal is because I failed to run my performance test that I run every now and then (once a year or so). Where I generate 100'000 torrents and load them up in client_test under a profiler.

Running this experiment in master fails with out-of-memory, on my linux machine with 78.6 GiB of RAM.

src/merkle.cpp Outdated Show resolved Hide resolved
@arvidn arvidn force-pushed the merkle-tree branch 6 times, most recently from ce17c80 to a302906 Compare May 21, 2020 10:47
@arvidn arvidn changed the title WIP: merkle tree abstraction merkle tree abstraction May 21, 2020
}

std::tuple<int, int, int> merkle_find_known_subtree(span<sha256_hash const> const tree
, int const block_index, int const num_valid_leafs)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssiloti I'm impressed you got all this right without extensive unit tests :)

@arvidn arvidn force-pushed the merkle-tree branch 2 times, most recently from 7c0cf33 to 20a6957 Compare May 22, 2020 01:13
@arvidn
Copy link
Owner Author

arvidn commented May 24, 2020

I think this patch is big enough to not build more onto. But the main missing part is still replacing (or at least replacing most use of) merkle_tree::operator[]() in favor of a higher level function to respond to a hash request. Right now there's potentially a lot of redundant memory allocation and hashing to respond to a request. Since each individual node is computed independently.

@arvidn arvidn merged commit 8ca0186 into master May 24, 2020
@arvidn arvidn deleted the merkle-tree branch May 24, 2020 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants