Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EPIC: Node & key format #548

Closed
3 tasks done
tac0turtle opened this issue Sep 7, 2022 · 8 comments
Closed
3 tasks done

EPIC: Node & key format #548

tac0turtle opened this issue Sep 7, 2022 · 8 comments
Assignees
Labels

Comments

@tac0turtle
Copy link
Member

tac0turtle commented Sep 7, 2022

Currently IAVL does not take advantage of data locality on disk. The way IAVL store nodes on disk is with a hash that gets stored in a random location on disk and makes it so that IAVL needs to do a random search of disk in order to find the data. With the recent work from osmosis on fast node we saw a large performance improvement with a trade off on writes.

Secondly, there are two key formats, live keys and orphan keys referenced by node hash which also indexes the orphan nodes.

A next step would be to modify IAVL to structure data on disk in a logical manner to reduce random searches and reduce random compaction of disk data.

There are two things that can be done in order to take advantage of this.

Work Scope:

Phase 1:

  • modify the key format from a hash of the data to be write_version|node_path_in_tree.

Phase 2:

  • make every node store the version of its children.
  • remove orphan indexing and hashes

(Proposed by @ValarDragon)

@tac0turtle tac0turtle changed the title Node & key format EPIC: Node & key format Sep 7, 2022
@ValarDragon
Copy link
Contributor

ValarDragon commented Sep 7, 2022

Other steps:

  • To modify the key format, you have to update the node struct to add left_child_version and right_child_version in the nodes
  • Probably delete the root key format, and instead just make it a method that returns a key in your main node format space

Can be done later, but needed pre-integration for pruning:

  • Clarify versioning / historic delete API semantics from SDK
  • Make methods to prune (theres a couple approaches, if someones having trouble deriving them, can write up more)

@faddat
Copy link
Contributor

faddat commented Sep 9, 2022

modify the key format from a hash of the data to be write_version|node_path_in_tree.

where write_version is the left side and node_path_in_tree is the right side?

Or is it like, both sides have that?

@faddat faddat mentioned this issue Sep 9, 2022
@robert-zaremba
Copy link
Collaborator

NOTE: store/v2 solves that in a way similar to TurboGeth (erigon)

@robert-zaremba
Copy link
Collaborator

So this proposal will increase a node (and DB) size, right?

Few important questions

  • What is the definition of write_version?
  • Will that impact state proofs / merkle proofs? This change must be compatible with the existing merkle proof, otherwise we have trouble with IBC. My guess is that it won't, but want to double check.

@tac0turtle
Copy link
Member Author

NOTE: store/v2 solves that in a way similar to TurboGeth (erigon)

thank you, but as mentioned in a few places store/v2 work is unknown at this time for countless reasons.

@cool-develope
Copy link
Collaborator

cool-develope commented Sep 27, 2022

@marbar3778
It looks like we couldn't modify the key format as write_version|node_path_in_tree. Because when rotating the tree, the path would be changed.

@tac0turtle
Copy link
Member Author

this is expected, do you see a problem with this?

cc @ValarDragon

@catShaark
Copy link
Contributor

@marbar3778, I'm also working on this, can you assign me as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants
@robert-zaremba @ValarDragon @faddat @tac0turtle @catShaark @cool-develope and others