Skip to content
This repository has been archived by the owner on Mar 8, 2024. It is now read-only.

Implementation of Snap sync #95

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

alejoacosta74
Copy link

@alejoacosta74 alejoacosta74 commented Feb 16, 2024

No description provided.

Copy link

@gameofpointers gameofpointers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I think one thing you are missing is that the Prime being able to tell the zones to start this sync, you need to follow the API (set sync target) and make a new method to do that and then you can send the start snap sync message into the handler in zone using the new feed you have created

trie/snap.go Outdated
}

// CalculateNodeHash calculates the Keccak-256 hash of a trie node's RLP encoding.
func calculateNodeHash(n node) ([]byte, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure there isn't already a method defined somewhere to get node hashes? Hashing a node is fundamental to trie operation, so I know this logic must already exist somewhere.

If it doesn't exist as a method on the node type, I recommend you move it to a method and call that method here, so that we can be sure we are doing the exact same hashing as the PMT logic.

This is important, because we are going to change the PMT hash later, and it would be better to make that change in one place instead of N places

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I guess I can leverage on the functionality from trie/hasher.go .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to calculate the hash if the child node is type hashNode. See next comment

trie/snap.go Outdated
hashes := make([]common.Hash, 0, 17)
for _, child := range n.Children {
if child != nil {
hashBytes, err := calculateNodeHash(child)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you're hashing again here. IIUC the full node contains hashes of the child nodes, so it should be as simple as parsing those hashes from the node data

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good observation.
The child is actually an interface:

type node interface {
	fstring(string) string
	cache() (hashNode, bool)
}

And can be any of:

type (
	fullNode struct {
		Children [17]node // Actual trie node data to encode/decode (needs custom encoder)
		flags    nodeFlag
	}
	shortNode struct {
		Key   []byte
		Val   node
		flags nodeFlag
	}
	hashNode  []byte
	valueNode []byte
)

I think a more appropiated implementation would be something like:

        for _, child := range n.Children {
            // Directly append the hash if the child is a hashNode
            if hn, ok := child.(hashNode); ok {
                hashes = append(hashes, common.BytesToHash(hn))
            } else {
                // Logically, this case should not occur if we ensure that children transmitted over the network
                // are `hashNode` s containing only the hash
                return nil, fmt.Errorf("unexpected non-hash child node encountered")
            }

Then, we need to ensure that:

fullNode struct {
    Children [17]node // When transmitted over the network and unmarshalled, `node` is expected to be either `hashNode` or `nil`.
    flags    nodeFlag
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented as suggested above

}

// IsFullNode returns true if the trie node is a fullNode
func (t *TrieNode) IsFullNode() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this useful? It looks like you just use the type assertion directly, instead of calling this method.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type assertion cannot be used outside the trie package given that the node types are private.
The IsFullNode method is being used by the downloader to decide wether it should iterate over all children.

c.sl.subClients[header.Location().SubIndex(nodeLocation)].SetSyncTarget(context.Background(), header)
}
if subClient := c.sl.subClients[header.Location().SubIndex(nodeLocation)]; subClient != nil {
subClient.SetSyncTarget(context.Background(), header)
Copy link
Member

@wizeguyy wizeguyy Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this used? We need to tell the node which block to snap to, but this seems to tell the node which node to sync to.

e.g. if we have the following chain:

[block 200]<--[block 201]<--...[block 10000]<--[block 10001]<--[block 10002]
  ^
zone node is here

Region should use this method should tell the zone to sync to 10002, but when snap syncing we want to tell the node to start with a snapshot at 800 or something (some number of prime blocks before the tip).

Am I misunderstanding this method?

Copy link
Member

@wizeguyy wizeguyy Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I am trying to ensure is that all nodes in a slice snapshot to the same prime block. They cannot decide on independent snap targets

core/core.go Outdated
// TriggerSnapSync triggers snap sync at the zone level
func (c *Core) TriggerSnapSync(header *types.Header) {
if header == nil {
log.Global.Warn("TriggerSnapSync called with nil header")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a fatal error. This can only happen if we make a programming error and its non-nonsensical for the node to continue operation in that case

In this case, I think its fine not to check, and let the code panic later if this ever is actually nil

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Fixed

}

// VerifyNodeHash verifies a expected hash against the RLP encoding of the received trie node
func verifyNodeHash(rlpBytes []byte, expectedHash []byte) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed yesterday, I think this can be handled entirely by libp2p, since it hashes every packet it receives. Please leverage that

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to have @gameofpointers view on this as well. Since the above was agreed with him

func (h *handler) startSnapSync(blockNumber uint64) {
err := h.d.StartSnapSync(h.nodeLocation, blockNumber)
if err != nil {
panic("TODO: handle error" + err.Error())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Panics are fine here for now, but this sort of error should be handled before we merge the PR

Copy link
Member

@wizeguyy wizeguyy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few questions, but looks good so far

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants