Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Resharding] Estimate cost of the Trie Shallow Copying solution #9101

Closed
Tracked by #8992
robin-near opened this issue May 23, 2023 · 1 comment
Closed
Tracked by #8992

[Resharding] Estimate cost of the Trie Shallow Copying solution #9101

robin-near opened this issue May 23, 2023 · 1 comment
Assignees

Comments

@robin-near
Copy link
Contributor

Goal here is to see how fast "solution 2" would be, i.e. (1) prefix trie nodes with account ID (or prefix), and (2) during resharding, iterate through the trie from the root, copy everything except for subtrees that correspond to a single account, in which case we shallow copy the roots of those subtrees.

Part (1) requires a db migration (to remove shard UID and to add account prefix), but even without a db migration, we can already estimate the speed of (2), because the trie structure itself is not changing. The amount of trie iteration we would need to do before and after the db migration would be the same.

So, the way to do this can be, to make an offline tool that iterates the trie of a specific shard, stopping at any account subtrees, and see how long that takes. The sequential random reads should be the majority of the cost. We can also include writes in there - just as an estimate - there's no way to write correct resharded data because of the existing shard UID prefix.

The point is that if this takes only a few minutes, it would probably be preferrable to solution 1 (restructure trie) because this is much simpler. Solution 1 is neat but also requires maintaining the previous code, and that can be ugly.

@wacban wacban self-assigned this May 25, 2023
near-bulldozer bot pushed a commit that referenced this issue Jun 6, 2023
This is step 1 of the following plan to benchmark how long does it take to perform shallow trie iteration. For more details about the goal please see #9101. 

1. Add a neard subcommand that does full trie iteration. 
2. Extend it to actually measure and report the time it takes.
3. Extend it with shallow iteration. 

Once all three steps are done it will be possible to benchmark the full iteration, the shallow iteration and compare the time it takes for each.
nikurt pushed a commit that referenced this issue Jun 9, 2023
This is step 1 of the following plan to benchmark how long does it take to perform shallow trie iteration. For more details about the goal please see #9101. 

1. Add a neard subcommand that does full trie iteration. 
2. Extend it to actually measure and report the time it takes.
3. Extend it with shallow iteration. 

Once all three steps are done it will be possible to benchmark the full iteration, the shallow iteration and compare the time it takes for each.
@wacban
Copy link
Contributor

wacban commented Jun 15, 2023

This is finished, I shared the results in the PR.

@wacban wacban closed this as completed Jun 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants