Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deferred initialization of EmptyMerklePath structures till first use … #4131

Conversation

zebambam
Copy link

@zebambam zebambam commented Sep 9, 2019

…to reduce monolith startup penalty for fuzzing. These structures were previously filled statically at load time.

The performance cost of monolith startup even with an alternative main function is significant when you're loading it many times.

Notes on the approach to fuzzing (why do this patch?):

Yes there are other ways to write fuzzers, but their cost is also higher than necessary. This allows us to access any namespace we need within the monolith and know that at least the build requisites are available without having to do a lot of mocking, shimming, etc.. for each fuzzer.

Notes on the structure of the change:

Larry convinced me to use "LOCK" in the end, which meant bringing in another header. I've added optimization around the lock so that it's not needed every time.

I've added calls to initialize into the comparison operator as well as the accessor. I think that's all the places that are relevant.

…to reduce monolith startup penalty for fuzzing.
@zebambam
Copy link
Author

zebambam commented Sep 9, 2019

I tested this fix btw, and Zcash hits the code and seems stable.

Copy link
Collaborator

@LarryRuane LarryRuane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice PR!

src/zcash/IncrementalMerkleTree.hpp Outdated Show resolved Hide resolved
src/zcash/IncrementalMerkleTree.hpp Show resolved Hide resolved
src/zcash/IncrementalMerkleTree.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@daira daira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NACK due to undefined behaviour (see comments). Concept ACK though.

return empty_roots.at(depth);
}
template <size_t D, typename H>
friend bool operator==(const EmptyMerkleRoots<D, H>& a,
const EmptyMerkleRoots<D, H>& b);
private:
CCriticalSection cs_EmptyMerkleRoots_private_variables_write;
bool initialized;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be better as a std::atomic<bool>. We could then replace the locking with initialized.compare_exchange_strong(...).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template<size_t Depth, typename Hash>
class EmptyMerkleRoots {
public:
    EmptyMerkleRoots() : initialized(false) { }
    void initialize() const {
        bool unitialized = false;
        if (initialized.compare_exchange_strong(unitialized, true)) {
            empty_roots.at(0) = Hash::uncommitted();
            for (size_t d = 1; d <= Depth; d++) {
                empty_roots.at(d) = Hash::combine(empty_roots.at(d-1), empty_roots.at(d-1), d-1);
            }
        }
    }
    Hash empty_root(size_t depth) const {
        initialize();
        return empty_roots.at(depth);
    }
    template <size_t D, typename H>
    friend bool operator==(const EmptyMerkleRoots<D, H>& a,
                           const EmptyMerkleRoots<D, H>& b);
private:
    mutable std::atomic<bool> initialized;
    mutable std::array<Hash, Depth+1> empty_roots;
};

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a neat way to do it I guess. I'm not sure that it's so much better than what's proposed that we should hold up the PR though. Maybe someone else disagrees.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. The two ways should both be fine.

…only (no more double-locking), and simplified CCriticalSection variable name.
@zebambam zebambam force-pushed the defer_expensive_intialization_of_empty_merkle_trees branch from 33498e3 to cf89467 Compare September 11, 2019 19:51
@Eirik0
Copy link
Contributor

Eirik0 commented Sep 12, 2019

@zkbot try

@zkbot
Copy link
Contributor

zkbot commented Sep 12, 2019

⌛ Trying commit cf89467 with merge 7523619...

zkbot added a commit that referenced this pull request Sep 12, 2019
…_merkle_trees, r=<try>

Deferred initialization of EmptyMerklePath structures till first use …

…to reduce monolith startup penalty for fuzzing. These structures were previously filled statically at load time.

The performance cost of monolith startup even with an alternative main function is significant when you're loading it many times.

Notes on the approach to fuzzing (why do this patch?):

Yes there are other ways to write fuzzers, but their cost is also higher than necessary. This allows us to access any namespace we need within the monolith and know that at least the build requisites are available without having to do a lot of mocking, shimming, etc.. for each fuzzer.

Notes on the structure of the change:

Larry convinced me to use "LOCK" in the end, which meant bringing in another header. I've added optimization around the lock so that it's not needed every time.

I've added calls to initialize into the comparison operator as well as the accessor. I think that's all the places that are relevant.
@Eirik0 Eirik0 added this to Needs Prioritization in Arborist Team via automation Sep 12, 2019
@Eirik0 Eirik0 moved this from Needs Prioritization to Security Issue Backlog in Arborist Team Sep 12, 2019
@Eirik0 Eirik0 moved this from Security Issue Backlog to Current Sprint in Arborist Team Sep 12, 2019
@Eirik0 Eirik0 moved this from Current Sprint to In Review in Arborist Team Sep 12, 2019
@zkbot
Copy link
Contributor

zkbot commented Sep 12, 2019

💔 Test failed - pr-try

Copy link
Contributor

@Eirik0 Eirik0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialize() could be declared as private because there should be no need to call it outside of the EmptyMerkleRoots class. That being said, there is not necessarily any harm in leaving it public.

src/zcash/IncrementalMerkleTree.hpp Outdated Show resolved Hide resolved
src/zcash/IncrementalMerkleTree.hpp Outdated Show resolved Hide resolved
@LarryRuane
Copy link
Collaborator

LarryRuane commented Sep 13, 2019

What do you think about just precomputing these values? LarryRuane@df20788 No worries about locking or const. I did not measure performance.

But if this change was adopted, the test here

TEST(merkletree, emptyroots) {
should be modified. Currently, it compares the production empty roots with hard-coded blobs, and both could be wrong. It would be better to have the test compute the roots, and compare them with the production roots. In other words, currently, the production code computes and the test has hard-coded values. If instead the production code uses hard-coded values, then the test should compute.

@LarryRuane
Copy link
Collaborator

LarryRuane commented Sep 13, 2019

I updated the tests as explained in the previous comment: 1c54bd47d00c222613eee3ee63806b7408a3a076. To test the test, I modified one bit of the precomputed empty root data structure, and the new tests failed as expected.

@daira
Copy link
Contributor

daira commented Sep 14, 2019

I like Larry's approach. I reviewed master...LarryRuane:empty-roots and it looks correct to me (i.e. utACK for that branch).

Copy link
Contributor

@defuse defuse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK if the answer to my question is "yes"

Edit: GitHub's UI doesn't display my question because the conversation was marked as resolved. Here it is: https://github.com/zcash/zcash/pull/4131/files#r326359814

src/zcash/IncrementalMerkleTree.hpp Show resolved Hide resolved
@daira
Copy link
Contributor

daira commented Sep 21, 2019

I really prefer Larry's PR #4136 (it's much easier to see that that one is correct), and I think we should close this in favour of that one. It achieves exactly the same goal.

@Eirik0
Copy link
Contributor

Eirik0 commented Sep 26, 2019

@zebambam Are you ok if we close this ticket in favor of @LarryRuane's implementation?

Co-Authored-By: Eirik Ogilvie-Wigley <eirik@z.cash>
Co-Authored-By: Eirik Ogilvie-Wigley <eirik@z.cash>
@zebambam
Copy link
Author

initialize() could be declared as private because there should be no need to call it outside of the EmptyMerkleRoots class. That being said, there is not necessarily any harm in leaving it public.

There's a few reasons why some other code might want to force the initialization of this class, given that it was "supposed" to have been initialized already, and will only get initialized once, it seems only natural to give any code living anywhere in the monolith the opportunity to specify when this happens.

@zebambam zebambam closed this Oct 11, 2019
Arborist Team automation moved this from In Review to Released (Merged in Master) Oct 11, 2019
@zebambam
Copy link
Author

I believe that I started this work without a ticket, or at least I couldn't find one just now. Sorry about that. I've made one here: #4155

zkbot added a commit that referenced this pull request Oct 17, 2019
precompute EmptyMerklePath roots

This is an alternative implementation of #4131.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Arborist Team
  
Released (Merged in Master)
Development

Successfully merging this pull request may close these issues.

None yet

6 participants