PDS: Adding new leaf to MST is O(n²) #22

Open
rudyfraser opened this issue Oct 12, 2024 · 6 comments

Comments

@rudyfraser
Member

See the graph below for a sample of 254 leaves being added.

[Image: Screenshot 2024-10-12 at 9 43 32 AM]

Discovered this issue while testing bulk record insertion (f6b1b26). Adding 1000 records should take milliseconds. The canonical code runs the same test in 0.717 s (estimated 2 s).

@DavidBuchanan314

DavidBuchanan314 commented Oct 12, 2024

So just to clarify, adding n nodes is costing O(n²) overall, meaning each individual add is O(n)?

Versus the ideal expectation of O(n log n) overall, and O(log n) for each add?
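To make the O(n²) versus O(n log n) distinction concrete, here is a minimal sketch that counts key comparisons instead of measuring wall-clock time. It is not the MST code from this repository: a sorted `Vec<u64>` stands in for the tree, and `linear_insert_cost` and `binary_insert_cost` are hypothetical helper names chosen for illustration. Only comparisons are counted; the element shifting done by `Vec::insert` is ignored.

```rust
/// Inserts every key into a sorted Vec, locating each slot by a linear scan.
/// Returns the total number of key comparisons (O(n) per insert, O(n^2) overall).
fn linear_insert_cost(keys: &[u64]) -> u64 {
    let mut sorted: Vec<u64> = Vec::new();
    let mut comparisons: u64 = 0;
    for &key in keys {
        let mut pos = 0;
        while pos < sorted.len() {
            comparisons += 1;
            if sorted[pos] >= key {
                break;
            }
            pos += 1;
        }
        sorted.insert(pos, key);
    }
    comparisons
}

/// Same workload, but each slot is located by binary search.
/// Returns the total number of key comparisons (O(log n) per insert, O(n log n) overall).
fn binary_insert_cost(keys: &[u64]) -> u64 {
    let mut sorted: Vec<u64> = Vec::new();
    let mut comparisons: u64 = 0;
    for &key in keys {
        let (mut lo, mut hi) = (0usize, sorted.len());
        while lo < hi {
            let mid = lo + (hi - lo) / 2;
            comparisons += 1;
            if sorted[mid] < key {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        sorted.insert(lo, key);
    }
    comparisons
}
```

Calling both functions with `n` and then `2n` ascending keys shows the difference in growth: doubling the input roughly quadruples the linear-scan count, while the binary-search count only slightly more than doubles. A per-add cost curve that grows linearly with tree size, as in the graph above, matches the first pattern.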

@DavidBuchanan314

DavidBuchanan314 commented Oct 12, 2024

I think this is your issue right here:

let mut new_root = MST::create(self.storage.clone(), Some(updated), Some(key_zeros))?;

I'm not a Rustacean, so forgive me if I'm misunderstanding, but if you're copying the whole MST storage then that's an O(n) operation (where n is the number of nodes currently in the tree).

Edit: ah yeah, I'm probably mistaken. I see storage is a SqlRepoReader, so that line doesn't actually copy the data.
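The difference between the two readings of `.clone()` can be sketched as follows. This is an illustration only: the thread says nothing about how `SqlRepoReader` is implemented, so `SharedStorage` and `OwnedStorage` are hypothetical stand-ins, not types from this repository. The first wraps its data in an `Arc`, so cloning copies a pointer; the second owns its data, so cloning copies every block.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical handle-style storage: `clone()` bumps a reference count
/// and both handles point at the same underlying map (O(1)).
#[derive(Clone)]
struct SharedStorage {
    blocks: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl SharedStorage {
    fn new() -> Self {
        SharedStorage {
            blocks: Arc::new(Mutex::new(HashMap::new())),
        }
    }

    /// Stores a block under `key`; visible through every clone of this handle.
    fn put(&self, key: &str, bytes: Vec<u8>) {
        self.blocks.lock().unwrap().insert(key.to_string(), bytes);
    }

    /// Number of blocks currently stored.
    fn len(&self) -> usize {
        self.blocks.lock().unwrap().len()
    }
}

/// Hypothetical owned storage: `clone()` deep-copies the whole map,
/// which costs O(n) in the number of stored blocks.
#[derive(Clone)]
struct OwnedStorage {
    blocks: HashMap<String, Vec<u8>>,
}
```

If the storage passed to `MST::create` behaves like `SharedStorage`, the clone on that line is cheap and the linear per-add cost has to come from somewhere else.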

@rudyfraser
Member Author

> So just to clarify, adding n nodes is costing O(n²) overall, meaning each individual add is O(n)?
>
> Versus the ideal expectation of O(n log n) overall, and O(log n) for each add?

Yes, I think that is exactly right. I appreciate you taking a look at this!

@mackuba

mackuba commented Oct 13, 2024

cc @steveklabnik

@steveklabnik

Sorry for the late response here; I've glanced at this a few times, and nothing really sticks out to me. I don't have time this exact minute to dig in further, but rather than look it over, go "nope, not immediately seeing it", and not leave a comment like the last two or three times, I figured I'd say something at least :)

@DavidBuchanan314

I just came across https://github.com/domodwyer/merkle-search-tree, a Rust MST implementation with very impressive perf numbers. It doesn't look like it was written with atproto in mind, so I'm not sure it'd be easy to drop in, but it might be useful for reference.
