
Optimize archival Mutator Sets, with a focus on IO #127

Open
aszepieniec opened this issue Apr 3, 2024 · 8 comments
Labels
optimization making stuff faster (or otherwise better)

Comments

@aszepieniec
Contributor

Depends on #126.

@Sword-Smith
Member

Sword-Smith commented Apr 29, 2024

@zkclay If you want to pick this up, this blog post may be a good starting point. If you understand the mechanics of Merkle Mountain Ranges (MMRs), you are 80% of the way toward understanding mutator sets. A mutator set contains two lists:

  1. an append-only commitment list which represents the outgoing UTXOs, and
  2. an unbounded Bloom filter which represents spent UTXOs.

Both of these lists are represented by MMRs, so the logic to update a mutator set is 80% MMR logic and 20% other stuff.
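To make the two-list idea concrete, here is a hypothetical toy sketch (all names, the bit-derivation scheme, and the hash are made up for illustration; in the real implementation both lists are MMR-backed and the Bloom filter is windowed and chunked):

```rust
use std::collections::HashSet;

/// Toy stand-in for a digest, for structure only.
type Digest = u64;

fn hash_pair(l: Digest, r: Digest) -> Digest {
    l.wrapping_mul(31).wrapping_add(r).wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Illustrative two-list mutator set.
struct ToyMutatorSet {
    aocl: Vec<Digest>,    // append-only commitment list: one entry per added UTXO
    swbf: HashSet<usize>, // "unbounded Bloom filter": the set of flipped bit indices
}

impl ToyMutatorSet {
    fn new() -> Self {
        Self { aocl: Vec::new(), swbf: HashSet::new() }
    }

    /// Adding an item appends a commitment to the AOCL and returns its index.
    fn add(&mut self, item: Digest, randomness: Digest) -> usize {
        self.aocl.push(hash_pair(item, randomness));
        self.aocl.len() - 1
    }

    /// Removing an item flips Bloom filter bits derived from its AOCL index
    /// (a toy derivation; the real one is pseudorandom and windowed).
    fn remove(&mut self, aocl_index: usize) {
        for i in 0..4 {
            self.swbf.insert(aocl_index * 4 + i);
        }
    }

    /// An item is in the set iff it was added and not all of its bits are flipped.
    fn contains(&self, aocl_index: usize) -> bool {
        aocl_index < self.aocl.len()
            && !(0..4).all(|i| self.swbf.contains(&(aocl_index * 4 + i)))
    }
}

fn main() {
    let mut ms = ToyMutatorSet::new();
    let idx = ms.add(17, 99);
    assert!(ms.contains(idx));
    ms.remove(idx);
    assert!(!ms.contains(idx));
    println!("ok");
}
```

Note how neither operation ever deletes from the AOCL: additions only append, and removals only flip filter bits, which is what makes both lists representable by append-only MMRs.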

If you don't quite understand the mechanics of MMRs, maybe this blog post of mine can help.

@zkclay
Contributor

zkclay commented Apr 29, 2024

Alright, I'll take a look at this! Thanks for the resources!

@zkclay
Contributor

zkclay commented May 2, 2024

Hey @Sword-Smith

I spent some time reading up on Merkle Mountain Ranges and the mutator set docs that you wrote, and taking a look at the code. The blog posts are dense and I had to reread them a few times, but I now have a much better understanding of the two concepts :) Thanks for the work you and the team have put into them!

However, I’m still a bit unsure about the optimization strategy we want to pursue here. Looking at the mutator set code, the MMR operations seem well encapsulated in the Archival MMR abstraction, and the bulk of the mutator set code is concerned with managing the sliding-window Bloom filter. Is that where we want to make the optimizations? I’m still a little hazy on the specifics of the SWBF logic and all the chunks we are manipulating, but I can look deeper into it if that’s the intended focus.

I also noticed that some functionality seems to be duplicated across the Mutator Set and the Mutator Set Accumulator. Is this what we want to look into? Or are you thinking of something else entirely?

@Sword-Smith Sword-Smith changed the title Optimize IO in archival Mutator Sets Optimize archival Mutator Sets, with a focus on IO May 3, 2024
@Sword-Smith
Member

Sword-Smith commented May 3, 2024

My motivation for this issue and #126 was to reduce the time the node spends updating its state with a new block. I looked into the archival-MMR and archival-mutator-set code and it didn't look optimal to me. I wrote the original implementation of all of the MMR code and the mutator set code, but multiple (beneficial) rewrites have occurred since then: all access to persistent storage has been made async, which benefits concurrency, and we got rid of two traits: one that tied MmrAccumulator and ArchivalMmr together, and one that tied MutatorSetAccumulator and ArchivalMutatorSet together. We dropped the two traits because we didn't want the accumulator versions to have to be implemented through async functions. Although these rewrites are a net benefit, I have the sense that some performance was lost along the way.

Notice that the accumulator versions store everything in RAM, whereas the archival versions handle so much data that they must live on disk, and they need to be persisted when the program shuts down.

With #126 behind us, I'm happy with the performance of ArchivalMmr.

A stylistic property I don't like about the current ArchivalMutatorSet is that it always goes through the logic from MutatorSetAccumulator. That looks like lazy programming to me. ArchivalMutatorSet has more knowledge than the accumulator version, so it should be able to handle everything itself. Inspecting the code, it looks like the archival version uses the accumulator version as a library; maybe something should be factored out so that's not necessary. Something to consider.

At the top of my list of priorities for this issue, though, is that the archival version is fast. So you could start by adding a relevant benchmark and then measure whether your changes make the code run faster. Since the archival version stores everything on disk, I would imagine that optimizing IO will yield the biggest gains.
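As a generic illustration of why batching IO tends to dominate (this is not neptune-core code, and the real archival storage goes through a database layer, not flat files), here is a std-only micro-benchmark comparing many small unbuffered writes against the same writes batched through a BufWriter:

```rust
use std::fs::File;
use std::io::{BufWriter, Write};
use std::time::Instant;

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir();
    let record = [0u8; 40]; // roughly one digest-sized record
    let n = 10_000;

    // Variant 1: one small write() call per record.
    let p1 = dir.join("unbuffered.bin");
    let t = Instant::now();
    let mut f = File::create(&p1)?;
    for _ in 0..n {
        f.write_all(&record)?;
    }
    f.sync_all()?;
    println!("unbuffered: {:?}", t.elapsed());

    // Variant 2: same data, batched through a BufWriter.
    let p2 = dir.join("buffered.bin");
    let t = Instant::now();
    let mut f = BufWriter::new(File::create(&p2)?);
    for _ in 0..n {
        f.write_all(&record)?;
    }
    f.into_inner()?.sync_all()?;
    println!("buffered:   {:?}", t.elapsed());

    // Both variants must produce identical amounts of data.
    assert_eq!(std::fs::metadata(&p1)?.len(), std::fs::metadata(&p2)?.len());
    Ok(())
}
```

For a proper benchmark of the archival mutator set itself, something like the existing criterion-style benches in the repo, run before and after each change, would give comparable numbers.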

I hope this addresses some of your questions :)

@Sword-Smith
Member

Sword-Smith commented May 3, 2024

Here's an example of a speedup I just found: prove for ArchivalMutatorSet goes through prove for the accumulator.

    pub async fn prove(
        &self,
        item: Digest,
        sender_randomness: Digest,
        receiver_preimage: Digest,
    ) -> MsMembershipProof {
        MutatorSetAccumulator::new(
            &self.aocl.get_peaks().await,
            self.aocl.count_leaves().await,
            &self.swbf_inactive.get_peaks().await,
            &self.swbf_active.clone(),
        )
        .prove(item, sender_randomness, receiver_preimage)
    }

And in the accumulator, prove is implemented as:

    pub fn prove(
        &self,
        item: Digest,
        sender_randomness: Digest,
        receiver_preimage: Digest,
    ) -> MsMembershipProof {
        // compute commitment
        let item_commitment = Hash::hash_pair(item, sender_randomness);

        // simulate adding to commitment list
        let auth_path_aocl = self.aocl.to_accumulator().append(item_commitment);
        let target_chunks: ChunkDictionary = ChunkDictionary::default();

        // return membership proof
        MsMembershipProof {
            sender_randomness: sender_randomness.to_owned(),
            receiver_preimage: receiver_preimage.to_owned(),
            auth_path_aocl,
            target_chunks,
        }
    }

So the only thing ArchivalMutatorSet::prove needs from the accumulator is the MMR authentication path for the next leaf (the one that hasn't been added yet). Like I said above, this is lazy programming (or programming optimized for developer time, not for runtime cost). We only need an accumulator version of the AOCL MMR for this, not the whole mutator set accumulator. If a new function were added to ArchivalMmr, we could even get this authentication path without needing an MmrAccumulator.
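To sketch why that next-leaf authentication path is cheap to get (toy code with a made-up hash, not the real ArchivalMmr API): in an MMR with n leaves, appending leaf n merges it with one existing peak per trailing 1-bit of n, and those peaks, smallest (rightmost) first, are exactly the new leaf's authentication path up to its peak. So the path is computable from the current peaks alone, with no accumulator in sight:

```rust
/// Toy digest and hash, for structure only.
type Digest = u64;

fn hash_pair(l: Digest, r: Digest) -> Digest {
    l.wrapping_mul(31).wrapping_add(r).wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Minimal MMR that only tracks its peaks, left to right, as (height, digest).
struct ToyMmr {
    leaf_count: u64,
    peaks: Vec<(u32, Digest)>,
}

impl ToyMmr {
    fn new() -> Self {
        Self { leaf_count: 0, peaks: Vec::new() }
    }

    fn append(&mut self, leaf: Digest) {
        let mut node = (0u32, leaf);
        // Merge with existing peaks of equal height, smallest first.
        while let Some(&(h, d)) = self.peaks.last() {
            if h != node.0 {
                break;
            }
            self.peaks.pop();
            node = (h + 1, hash_pair(d, node.1));
        }
        self.peaks.push(node);
        self.leaf_count += 1;
    }
}

/// Authentication path of the *next* leaf (not yet appended), read straight
/// off the current peaks: the trailing 1-bits of leaf_count name the peaks
/// the new leaf will merge with, rightmost peak first.
fn auth_path_for_next_leaf(mmr: &ToyMmr) -> Vec<Digest> {
    let k = mmr.leaf_count.trailing_ones() as usize;
    mmr.peaks[mmr.peaks.len() - k..]
        .iter()
        .rev()
        .map(|&(_, d)| d)
        .collect()
}

fn main() {
    let mut mmr = ToyMmr::new();
    for leaf in 1..=7u64 {
        mmr.append(leaf * 1000 + 1);
    }
    // 7 = 0b111 leaves: three peaks, of heights 2, 1, 0.
    let path = auth_path_for_next_leaf(&mmr);
    assert_eq!(path.len(), 3);

    // Append the new leaf and check the path hashes up to the resulting peak.
    // The new leaf is rightmost, so every path sibling sits on its left.
    let new_leaf = 42;
    let mut acc = new_leaf;
    for sib in &path {
        acc = hash_pair(*sib, acc);
    }
    mmr.append(new_leaf);
    assert_eq!(mmr.peaks, vec![(3, acc)]);
    println!("ok");
}
```

A hypothetical `ArchivalMmr::auth_path_for_next_leaf` along these lines would need only the cached peaks, i.e. no disk reads at all, versus building a full MutatorSetAccumulator per call.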

@zkclay
Contributor

zkclay commented May 6, 2024

@Sword-Smith ok, all this context is very helpful, I'll take another look. Thanks!

@zkclay
Contributor

zkclay commented May 13, 2024

Hey @Sword-Smith, I took another look at this, but unfortunately there's still a lot for me to wrap my head around in this mutator set, and I've been making much slower progress than I'd like. I will also be pretty busy with some other commitments over the next week, so I'd rather give this issue up for someone else to work on than hold the team and the roadmap back.

Also, as a note: I did test cutting out the MutatorSetAccumulator and moving the implementation into the MutatorSet itself, and I didn't see any speedups, so that could be left as-is depending on your preferences for the codebase.

@Sword-Smith
Member


No problem. Thanks for the commits you already contributed. Hope to see you back here at some point :)
