trie: add difference iterator #3637
Thank you for your contribution! Your commits seem to not adhere to the repository coding standards
Please check the contribution guidelines for more details. This message was auto-generated by https://gitcop.com
accountHash common.Hash // Hash of the node containing the account
codeHash common.Hash // Hash of the contract source code
code []byte // Source code associated with a contract

Hash common.Hash // Hash of the current entry being iterated (nil if not standalone)
Entry interface{} // Current state entry being iterated (internal representation)
Did you delete this because it wasn't used anywhere? Any particular reason? Doesn't this make the iterator a lot less useful as there's no way to actually access the data being iterated?
Yes, I deleted it because it isn't being used anywhere. Because the entry types are all private anyway, this field wasn't usable outside the package.
trie/iterator.go
Outdated
nodeIt: NewNodeIterator(trie),
keyBuf: make([]byte, 0, 64),
Key: nil,
This is not needed; `nil` is the default value anyway.
Ha, just saw this was in the original code too. Anyway, probably should be deleted.
trie/iterator.go
Outdated
// FromNodeIterator creates a new key-value iterator from a node iterator
func FromNodeIterator(it NodeIterator) *Iterator {
	return &Iterator{
		nodeIt: it,
		Key: nil,
`Key: nil` shouldn't be needed.
trie/iterator.go
Outdated
}

// FromNodeIterator creates a new key-value iterator from a node iterator
func FromNodeIterator(it NodeIterator) *Iterator {
I think the correct name is `NewIteratorFromNodeIterator`. A bit unwieldy, but `NewSomething` should be part of the method signature to make it visible what it does. I'm open to suggestions though :D
NewIterator
Except that already exists
}
return decodeCompact(key)
// NodeIterator is an iterator to traverse the trie pre-order.
type NodeIterator interface {
Given that this interface is a public one, please document all the methods.
A few special mentions are probably warranted:
- The caller of the interface methods must not modify the returned values (at least for LeafBlob and Path), since those are not copied by the iterator, just returned as is.
- The caller should not retain the result of Path, or should make a copy, since the next call to Next may change its content.
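The second point can be illustrated with a minimal sketch. It assumes a hypothetical `pathIterator` that reuses one internal buffer between calls, which is the behaviour the comment warns about; it is not the iterator from this PR.

```go
package main

import "fmt"

// pathIterator is a hypothetical stand-in for a node iterator that reuses
// one internal buffer for the current path.
type pathIterator struct {
	paths [][]byte
	pos   int
	buf   []byte // overwritten on every call to Next
}

// Next advances the iterator and overwrites the shared path buffer.
func (it *pathIterator) Next() bool {
	if it.pos >= len(it.paths) {
		return false
	}
	it.buf = append(it.buf[:0], it.paths[it.pos]...)
	it.pos++
	return true
}

// Path returns the internal buffer without copying it.
func (it *pathIterator) Path() []byte { return it.buf }

// collectPaths retains every path safely by copying it before the
// following call to Next can overwrite the buffer.
func collectPaths(it *pathIterator) [][]byte {
	var out [][]byte
	for it.Next() {
		out = append(out, append([]byte(nil), it.Path()...))
	}
	return out
}

func main() {
	it := &pathIterator{paths: [][]byte{{1, 2}, {3, 4}}}
	for _, p := range collectPaths(it) {
		fmt.Println(p)
	}
}
```

Without the copy inside `collectPaths`, both retained slices would alias the same buffer and end up holding the last path.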
trie/iterator.go
Outdated
// Hash returns the hash of the current node
func (it *nodeIterator) Hash() common.Hash {
	if it.trie == nil {
This is up for debate, but I think it would be cleaner to have `if len(it.stack) == 0` here, rather than looking at `it.trie == nil`, because below you are indexing into the stack and you need an extra "mental jump" to realize what the connection between `it.trie` and `it.stack` is.
Same for the methods below.
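A minimal sketch of the suggested guard, using hypothetical `hash`, `frame` and `stackIterator` types in place of the real ones, might look like this. The point is only that the emptiness check sits right next to the index expression it protects.

```go
package main

import "fmt"

// hash and frame are hypothetical stand-ins for common.Hash and the
// iterator's internal stack entries.
type hash [32]byte

type frame struct{ hash hash }

type stackIterator struct {
	stack []frame
}

// Hash returns the hash of the node on top of the stack, or the zero hash
// when the stack is empty.
func (it *stackIterator) Hash() hash {
	if len(it.stack) == 0 {
		return hash{}
	}
	return it.stack[len(it.stack)-1].hash
}

func main() {
	empty := &stackIterator{}
	full := &stackIterator{stack: []frame{{hash: hash{0xaa}}}}
	fmt.Println(empty.Hash() == (hash{}), full.Hash()[0] == 0xaa)
}
```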
trie/iterator.go
Outdated
	return nil
}

if node, ok := it.stack[len(it.stack)-1].node.(valueNode); ok {
Not sure about it, but I think you could convert to `[]byte` directly here instead of `valueNode` first and then `[]byte` later.
You can't, unfortunately.
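The reason is a general Go rule: a type assertion matches the dynamic type exactly, so a value stored as a named slice type does not satisfy an assertion to `[]byte`. The sketch below redeclares a `valueNode` type purely for illustration (assumed here to be a named `[]byte`), and `leafBlob` is a hypothetical helper.

```go
package main

import "fmt"

// valueNode is redeclared here only for illustration; it is assumed to be
// a named byte-slice type.
type valueNode []byte

// leafBlob extracts the bytes from a node held in an interface value.
func leafBlob(n interface{}) ([]byte, bool) {
	// Asserting n.([]byte) would fail for a valueNode, because the
	// dynamic type must match exactly. Assert the named type first.
	if v, ok := n.(valueNode); ok {
		return []byte(v), true // conversion is legal on the concrete value
	}
	return nil, false
}

func main() {
	var n interface{} = valueNode("abc")
	_, direct := n.([]byte)
	blob, ok := leafBlob(n)
	fmt.Println(direct, string(blob), ok) // prints: false abc true
}
```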
trie/iterator.go
Outdated
func (it *NodeIterator) Next() bool {
// sets the Error field to the encountered failure. If `children` is false,
// skips iterating over any subnodes of the current node.
func (it *nodeIterator) Next(children bool) bool {
Nitpick. Maybe `descend` would be a more appropriate name than `children`? It's more verb-y, at least fits better with a `bool`.
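To show what such a flag does, here is a rough sketch of a pre-order walk over a plain tree where `Next(descend)` skips the current node's children when `descend` is false. The `treeNode`, `preorder` and `visit` names are hypothetical and the structure is far simpler than a trie.

```go
package main

import "fmt"

// treeNode is a hypothetical, simplified tree node.
type treeNode struct {
	name     string
	children []*treeNode
}

// preorder walks a tree in pre-order using an explicit stack.
type preorder struct {
	stack []*treeNode // nodes still to visit; the last element is next
	cur   *treeNode
}

func newPreorder(root *treeNode) *preorder {
	return &preorder{stack: []*treeNode{root}}
}

// Next moves to the next node. If descend is false, the children of the
// current node are skipped.
func (it *preorder) Next(descend bool) bool {
	if descend && it.cur != nil {
		// Push children in reverse so the first child is visited first.
		for i := len(it.cur.children) - 1; i >= 0; i-- {
			it.stack = append(it.stack, it.cur.children[i])
		}
	}
	if len(it.stack) == 0 {
		it.cur = nil
		return false
	}
	it.cur = it.stack[len(it.stack)-1]
	it.stack = it.stack[:len(it.stack)-1]
	return true
}

// visit returns the visited names, skipping the subtree below the node
// called skip.
func visit(root *treeNode, skip string) []string {
	var names []string
	it := newPreorder(root)
	descend := true
	for it.Next(descend) {
		names = append(names, it.cur.name)
		descend = it.cur.name != skip
	}
	return names
}

func main() {
	a := &treeNode{name: "a", children: []*treeNode{{name: "a1"}, {name: "a2"}}}
	root := &treeNode{name: "root", children: []*treeNode{a, {name: "b"}}}
	fmt.Println(visit(root, "a")) // prints: [root a b]
}
```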
hashes[it.Hash] = struct{}{}
for it := NewNodeIterator(trie); it.Next(true); {
	if it.Hash() != (common.Hash{}) {
		hashes[it.Hash()] = struct{}{}
It's fine here since it's only a test, but live code should cache the result of `.Hash()` since it allocates a new `common.Hash` memory area. Just saying to make sure any code after this will be ok :)
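A sketch of the caching pattern, with a hypothetical `hash` type standing in for `common.Hash` and a toy `hashIterator` that counts how often `Hash()` is called:

```go
package main

import "fmt"

// hash stands in for common.Hash; it is hypothetical here.
type hash [32]byte

// hashIterator is a toy iterator that records how many times Hash is called.
type hashIterator struct {
	hashes []hash
	pos    int
	calls  int
}

func (it *hashIterator) Next() bool {
	it.pos++
	return it.pos <= len(it.hashes)
}

func (it *hashIterator) Hash() hash {
	it.calls++
	return it.hashes[it.pos-1]
}

// collectHashes stores every non-zero hash, calling Hash once per node.
func collectHashes(it *hashIterator) map[hash]struct{} {
	seen := make(map[hash]struct{})
	for it.Next() {
		h := it.Hash() // call once and reuse the local copy
		if h != (hash{}) {
			seen[h] = struct{}{}
		}
	}
	return seen
}

func main() {
	it := &hashIterator{hashes: []hash{{}, {1}, {2}}}
	seen := collectHashes(it)
	fmt.Println(len(seen), it.calls) // prints: 2 3
}
```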
func (it *NodeIterator) retrieve() bool {
	// Clear out any previously set values
	it.Hash, it.Node, it.Parent, it.Leaf, it.LeafBlob = common.Hash{}, nil, common.Hash{}, false, nil
type differenceIterator struct {
Please add some dox. I know it's only an internal type, but often it helps when skimming the code if internals have at least slight docs.
trie/iterator.go
Outdated
a: a,
b: b,
eof: false,
count: 0,
You don't need to set `eof` to `false` and `count` to `0`; those are the default values.
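For reference, Go zero-initializes every field omitted from a composite literal. The `diffState` type below is a hypothetical stand-in used only to show this.

```go
package main

import "fmt"

// diffState is a hypothetical struct with a bool, an int and a slice field.
type diffState struct {
	eof   bool
	count int
	key   []byte
}

// newDiffState relies on zero values instead of spelling them out.
func newDiffState() *diffState {
	return &diffState{}
}

func main() {
	s := newDiffState()
	fmt.Println(s.eof, s.count, s.key == nil) // prints: false 0 true
}
```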
trie/iterator.go
Outdated
// If the iteration's done, return no available data
if it.trie == nil {
// NewDifferenceIterator constructs a NodeIterator that iterates over elements in b that
// are not in a.
Please also mention what the int pointer return is, and why it's a pointer.
trie/iterator.go
Outdated
it.Hash, it.Node, it.Parent, it.Leaf, it.LeafBlob = common.Hash{}, nil, common.Hash{}, false, nil
type differenceIterator struct {
	a, b NodeIterator
	eof bool
Please document that `eof` marks the end of `a`.
trie/iterator.go
Outdated
for {
	apath, bpath := it.a.Path(), it.b.Path()
	cmp := bytes.Compare(apath, bpath)
This would probably be nicer with:
switch bytes.Compare(it.a.Path(), it.b.Path()) {
case -1:
// ...
case 1:
// ...
default:
// ...
}
Maybe not, see and choose whichever :)
trie/iterator.go
Outdated
// a and b are identical; skip this whole subtree if the nodes have hashes
if !it.b.Next(it.a.Hash() == common.Hash{}) {
	return false
}
I think you ought to increment count by one here, and by one below. Otherwise you might miss a count if `a` goes EOF.
trie/iterator.go
Outdated
}

// a and b are identical; skip this whole subtree if the nodes have hashes
if !it.b.Next(it.a.Hash() == common.Hash{}) {
b.Hash()
Should be the same as the current one, but at least it doesn't look like an accident :P
I've refactored a bit to make the logic clearer.
SGTM
This PR implements a differenceIterator, which allows iterating over trie nodes that exist in one trie but not in another. This is a prerequisite for most GC strategies, in order to find obsolete nodes.
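The core idea can be sketched on flat, sorted lists of paths. This is a simplification and not the PR's implementation: the real iterator walks two tries and can skip whole subtrees when the hashes match, whereas the hypothetical `difference` function below only compares paths with `bytes.Compare`.

```go
package main

import (
	"bytes"
	"fmt"
)

// difference returns the entries of b that are absent from a.
// Both inputs must be sorted in ascending bytes.Compare order.
func difference(a, b [][]byte) [][]byte {
	var out [][]byte
	i := 0
	for _, pb := range b {
		// Advance a past everything that sorts before the current b entry.
		for i < len(a) && bytes.Compare(a[i], pb) < 0 {
			i++
		}
		if i < len(a) && bytes.Equal(a[i], pb) {
			continue // present in both, so not part of the difference
		}
		out = append(out, pb)
	}
	return out
}

func main() {
	a := [][]byte{{0x01}, {0x02}, {0x04}}
	b := [][]byte{{0x01}, {0x03}, {0x04}, {0x05}}
	fmt.Println(difference(a, b)) // prints: [[3] [5]]
}
```

For a GC use case along the lines the description mentions, `a` would be the trie to keep and `b` the older trie, so the result lists nodes that only the older trie references.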