WIP: Shard Chain Type With Useful Methods for Notary/Proposer Clients #100

Merged
merged 37 commits into from May 16, 2018

Contributor

rauljordan commented May 2, 2018

This is a PR referencing #99 to create a shard chain interface/struct that can be used across our various packages.

Context

This was worked on in Py-EVM ethereum/py-evm#570 and would also serve as a useful wrapper class to thin out the code we would otherwise have to write for notaries/proposers.

Requirements for Merge

  • Create a Shard struct/interface
  • Define common methods such as fetching collation by hash, checking availability, fetching x number of collations from within the shard
  • Define an in-memory db for now as the shardDB backend
  • Finalize all the tests for the methods below. Currently have 3 tests passing.

The design for an actual db backend used in our sharding client is beyond the scope of this PR. This PR just uses a simple k-v store.

Design Rationale

In this PR, we include both a transactions []*types.Transaction field along with a body []byte field in the Collation struct.

// Collation base struct.
type Collation struct {
	header *CollationHeader
	// body represents the serialized blob of a collation's transactions.
	body []byte
	transactions []*types.Transaction
}

The reason for this is that transactions serves as a useful slice for storing the deserialized chunks of the collation's body, which we can then manipulate programmatically. Every time this transactions slice is updated, the serialized body would need to be recalculated. This will be a useful property for proposers and executors in our sharding implementation down the line.
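One way to read the rationale above is that the body is derived from the transactions slice and has to be refreshed on every mutation. The following is a minimal sketch of that invariant, not the code in this PR: `Tx` is a hypothetical stand-in for `types.Transaction`, the header is omitted, and `serialize` uses a simple length-prefixed concatenation as a placeholder for the blob serialization that is still marked TODO.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Tx is a hypothetical stand-in for types.Transaction.
type Tx struct {
	Payload []byte
}

// Collation mirrors the shape of the struct in this PR (header omitted).
type Collation struct {
	body         []byte
	transactions []*Tx
}

// serialize is a placeholder for blob serialization: each payload is
// written with a 4-byte big-endian length prefix.
func serialize(txs []*Tx) []byte {
	var buf bytes.Buffer
	for _, tx := range txs {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(tx.Payload)))
		buf.Write(n[:])
		buf.Write(tx.Payload)
	}
	return buf.Bytes()
}

// AddTransaction appends tx and recalculates the body so that the two
// fields never drift apart.
func (c *Collation) AddTransaction(tx *Tx) {
	c.transactions = append(c.transactions, tx)
	c.body = serialize(c.transactions)
}

// Body returns the serialized body.
func (c *Collation) Body() []byte { return c.body }

func main() {
	c := &Collation{}
	c.AddTransaction(&Tx{Payload: []byte("a")})
	c.AddTransaction(&Tx{Payload: []byte("bc")})
	fmt.Printf("%d txs, %d body bytes\n", len(c.transactions), len(c.Body()))
}
```

Recomputing the whole body on each append is the simplest way to keep the invariant; a real implementation might serialize lazily instead.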

Functions Being Implemented

func (s *Shard) HeaderByHash(hash *common.Hash) (*CollationHeader, error) {}
func (s *Shard) CollationByHash(headerHash *common.Hash) (*Collation, error) {}
func (s *Shard) CanonicalCollationHash(shardID *big.Int, period *big.Int) (*common.Hash, error) {}
func (s *Shard) CanonicalCollation(shardID *big.Int, period *big.Int) (*Collation, error) {}
func (s *Shard) BodyByChunkRoot(chunkRoot *common.Hash) ([]byte, error) {}
func (s *Shard) CheckAvailability(header *CollationHeader) (bool, error) {}
func (s *Shard) SetAvailability(chunkRoot *common.Hash, availability bool) error {}
func (s *Shard) SaveHeader(header *CollationHeader) error {}
func (s *Shard) SaveBody(body []byte) error {}
func (s *Shard) SaveCollation(collation *Collation) error {}
func (s *Shard) SetCanonical(header *CollationHeader) error {}

rauljordan added this to the Ruby milestone May 2, 2018
rauljordan self-assigned this May 2, 2018
rauljordan added this to To do in Validator Client via automation May 2, 2018

// Shard base struct.
type Shard struct {
shardDB *shardKV
Member

Can you change this to a common interface? Right now, it only accepts a database of struct shardKV, so it’s not very swappable.

// MakeShard creates an instance of a Shard struct given a shardID.
func MakeShard(shardID *big.Int) *Shard {
// Swappable - can be makeShardLevelDB, makeShardSparseTrie, etc.
shardDB := makeShardKV()
Member

Shouldn’t this be a function argument? Why does a shard know which database to initialize? It feels a bit outside the responsibilities of initializing a shard.

I can imagine that a nontrivial database has many arguments so this method will quickly get out of control.
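The two comments above (depend on an interface, and pass the database in) can be sketched together as follows. This is an illustration under assumptions, not the PR's final code: `ShardBackend`, `memoryBackend`, and `NewShard` are hypothetical names, `Hash` stands in for `common.Hash`, and the interface is limited to `Get` and `Put` for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// Hash is a stand-in for common.Hash.
type Hash [32]byte

// ShardBackend is a hypothetical interface. A fuller version would
// probably also declare Has and Delete.
type ShardBackend interface {
	Get(k Hash) ([]byte, error)
	Put(k Hash, v []byte) error
}

// memoryBackend is a map-backed implementation suitable for tests.
type memoryBackend struct {
	kv map[Hash][]byte
}

func newMemoryBackend() *memoryBackend {
	return &memoryBackend{kv: make(map[Hash][]byte)}
}

func (m *memoryBackend) Get(k Hash) ([]byte, error) {
	v, ok := m.kv[k]
	if !ok {
		return nil, errors.New("key not found")
	}
	return v, nil
}

func (m *memoryBackend) Put(k Hash, v []byte) error {
	m.kv[k] = v
	return nil
}

// Shard depends only on the interface, so backends are swappable.
type Shard struct {
	shardID *big.Int
	shardDB ShardBackend
}

// NewShard receives its backend from the caller instead of deciding
// which database to construct.
func NewShard(shardID *big.Int, db ShardBackend) *Shard {
	return &Shard{shardID: shardID, shardDB: db}
}

func main() {
	shard := NewShard(big.NewInt(1), newMemoryBackend())
	key := Hash{0x01}
	if err := shard.shardDB.Put(key, []byte("header")); err != nil {
		fmt.Println("put failed:", err)
		return
	}
	v, err := shard.shardDB.Get(key)
	fmt.Println(string(v), err)
}
```

With this shape, a backend that needs many constructor arguments is configured by the caller, and the shard constructor's signature stays the same.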


// ShardID is the identifier for a shard.
func (c *Collation) ShardID() *big.Int { return c.header.shardID }
// Hash returns the hash of a collation's entire contents. Useful for tests.
Member

Is this only useful for tests? Maybe move to the testing package

shard := MakeShard(big.NewInt(3))

if err := shard.ValidateShardID(header); err == nil {
t.Fatalf("Shard ID validation incorrect. Function should throw error when shardID's do not match. want=%d. got=%d", header.ShardID().Int64(), shard.ShardID().Int64())
Member

This should be Errorf. You should only fatal when the test cannot possibly proceed forward.

Please change this everywhere in this test

sharding/db.go Outdated
fmt.Printf("Map: %v\n", sb.kv)
fmt.Printf("Key: %v\n", k)
fmt.Printf("Val: %v\n", sb.kv[k])
fmt.Printf("Ok: %v\n", ok)
Member

Remove all of this printing. If you feel that you really need it, use ethereum log with the appropriate level (debug).

// Transactions returns an array of tx's in the collation.
func (c *Collation) Transactions() []*types.Transaction { return c.transactions }
// Body returns the collation's byte body.
func (c *Collation) Body() []byte { return c.body }
Member

Shouldn’t this return a pointer to the byte array?

Contributor Author

Why would it? In case it's nil?

sharding/db.go Outdated
fmt.Printf("Val: %v\n", sb.kv[k])
fmt.Printf("Ok: %v\n", ok)
if !ok {
return nil, fmt.Errorf("Key Not Found")
Member

Change to ("key not found %v", k)

}

// GetHeaderByHash of collation.
func (s *Shard) GetHeaderByHash(hash *common.Hash) (*CollationHeader, error) {
Member

Please remove “Get” here and everywhere https://golang.org/doc/effective_go.html#Getters

// if collation.Hash() != dbCollation.Hash() {
// t.Fatalf("Collations do not match. want=%v. got=%v", collation, dbCollation)
// }
// }
Member

Uncomment or delete this test. No point in checking in dead code

return common.BytesToHash([]byte(key))
}

// dataAvailabilityLookupKey formats a string that will become a lookup key
Member

Wrong method name

Member

Still wrong!

)

// Collation base struct.
type Collation struct {
header *CollationHeader
body []byte
transactions []*types.Transaction
Member

+1. How are we handling a body and transactions?

Are we supposed to update the body every time we modify this array?

Member

Wouldn't transactions be called chunks? There's no TransactionRoot in the collation header, only ChunkRoot

Contributor Author

transactions are actual deserialized tx's that will be helpful for our implementation

// Period the collation corresponds to.
func (c *Collation) Period() *big.Int { return c.header.period }
// Transactions returns an array of tx's in the collation.
func (c *Collation) Transactions() []*types.Transaction { return c.transactions }
Member

Transactions here too. I thought that's replaced by chunks

Contributor Author

We still need them because we'll be deserializing the chunks into transactions. We'll need a place to store this deserialized data and keeping a tx's slice is good here.

sharding/db.go Outdated

"github.com/ethereum/go-ethereum/common"
)

Member

+1, a high-level comment on how this works would be nice

shard := MakeShard(big.NewInt(1))

if err := shard.SaveHeader(header); err != nil {
t.Fatal(err)
Member

Should the Fatal be more descriptive here? Like t.Fatalf("Save header failed: %v", err)

// It's being saved, but the .Get func doesn't fetch the value...?
dbHeader, err := shard.GetHeaderByHash(&hash)
if err != nil {
t.Fatal(err)
Member

More description for t.Fatal(err)?

Member

prestonvanloon left a comment

A lot of questions about design choices.
Could you add some details to the pull request description where necessary?


// DecodeRLP uses an RLP Stream to populate the data field of a collation header.
func (h *CollationHeader) DecodeRLP(s *rlp.Stream) error {
err := s.Decode(&h.data)
Member

return s.Decode(&h.data)


// CalculateChunkRoot updates the collation header's chunk root based on the body.
func (c *Collation) CalculateChunkRoot() {
// TODO: this needs to be based on blob serialization.
Member

Can you create a github issue for this?

Member

+1 There's a lot more to be done here. For proof of custody we need to split chunks (body) into chunk + salt and take the merkle root of that

sharding/db.go Outdated
@@ -0,0 +1,37 @@
package sharding
Member

Can/should we move this to its own package?

hash := common.BytesToHash(key.Bytes())
collationHashBytes, err := s.shardDB.Get(hash)
if err != nil {
return nil, fmt.Errorf("no canonical collation set for period, shardID pair: %v", err)
Member

This error seems to assume that the error was caused by non-existent data. See my other comment about this.

if err != nil {
return nil, fmt.Errorf("no canonical collation set for period, shardID pair: %v", err)
}
collationHash := common.BytesToHash(collationHashBytes)
Member

What happens when the data isn't there but there was no error? i.e. if collationHashBytes == nil
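The two comments above describe the same gap: a lookup can fail because the backend returned an error, or because nothing was stored under the key, and the caller should be able to tell these apart. A minimal sketch, assuming a backend that returns `(nil, nil)` for a missing key; `lookup`, `errNoCanonical`, and `canonicalHashBytes` are hypothetical names, and a plain map stands in for the shardDB.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoCanonical is a hypothetical sentinel for the "nothing stored" case.
var errNoCanonical = errors.New("no canonical collation set for period, shardID pair")

// lookup is a stand-in for s.shardDB.Get. It returns (nil, nil) when the
// key is absent, which is the case raised in the review.
func lookup(kv map[string][]byte, key string) ([]byte, error) {
	return kv[key], nil
}

// canonicalHashBytes separates backend failures from missing data.
func canonicalHashBytes(kv map[string][]byte, key string) ([]byte, error) {
	b, err := lookup(kv, key)
	if err != nil {
		// Backend failure: report it as such rather than as missing data.
		return nil, fmt.Errorf("could not read canonical collation hash: %v", err)
	}
	if b == nil {
		// No error, but nothing is stored under this key.
		return nil, errNoCanonical
	}
	return b, nil
}

func main() {
	kv := map[string][]byte{"shardID=1,period=2": {0xab}}

	_, err := canonicalHashBytes(kv, "shardID=1,period=3")
	fmt.Println(err == errNoCanonical)

	b, err := canonicalHashBytes(kv, "shardID=1,period=2")
	fmt.Println(len(b), err)
}
```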

if err != nil {
return nil, err
}
return &Collation{header: header, body: body}, nil
Member

return &Collation{header, body}, nil

Contributor Author

Getting "too few values in struct initializer"

Member

That doesn't sound right...
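A likely explanation for the compiler error in this thread: Go requires an unkeyed composite literal to list every field of the struct in order, and Collation has three fields (header, body, transactions), so `&Collation{header, body}` supplies too few values. The sketch below uses simplified stand-in types to show the keyed form, which may name any subset of fields.

```go
package main

import "fmt"

// CollationHeader and Tx are simplified stand-ins for the PR's types.
type CollationHeader struct{ period int }
type Tx struct{}

// Collation has three fields, as in the snippet under review.
type Collation struct {
	header       *CollationHeader
	body         []byte
	transactions []*Tx
}

// newCollation uses a keyed literal; transactions is left as nil.
//
// Unkeyed alternatives:
//   &Collation{header, body}      // does not compile: too few values
//   &Collation{header, body, nil} // compiles, but breaks if fields change
func newCollation(header *CollationHeader, body []byte) *Collation {
	return &Collation{header: header, body: body}
}

func main() {
	c := newCollation(&CollationHeader{period: 1}, []byte("body"))
	fmt.Println(c.header.period, string(c.body), c.transactions == nil)
}
```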

}

// Uses the hash of the header as the key.
hash := header.Hash()
Member

Do you need to declare this variable? It would be simpler as

s.shardDB.Put(header.Hash(), encoded)

if err != nil {
return nil, err
}

Member

Handle case for header == nil?

Contributor Author

This is covered by the HeaderByHash method. An error means that the header was nil.

// ProposerAddress is the coinbase addr of the creator for the collation.
func (c *Collation) ProposerAddress() *common.Address {
return c.header.data.ProposerAddress
}

// AddTransaction adds to the collation's body of tx blobs.
func (c *Collation) AddTransaction(tx *types.Transaction) {
// TODO: Include blob serialization instead.
Member

github issue here too

Contributor Author

What exactly should the issue be @terenc3t ?

Member

Use blob serialization instead of appending tx to c.transactions: c.transactions = append(c.transactions, tx)

rlp.Encode(hw, h)
hw.Sum(hash[:0])
return hash
}

// ShardID is the identifier for a shard.
Member

This comment is not exactly accurate. It should be something like ShardID the collation belongs to

period *big.Int //the period number in which collation to be included.
proposerAddress *common.Address //address of the collation proposer.
proposerSignature []byte //the proposer's signature for calculating collation hash.
data collationHeaderData
Member

why put a nested struct collationHeaderData?

Member

This is so that it acts as a read only property. You can only access the data via getters and cannot modify it

Member

Ah! got it. Thanks for the explanations
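The read-only pattern explained in this thread can be sketched as below. This is a simplified illustration, not the PR's code: the fields are `int64` rather than `*big.Int`, `NewCollationHeader` and `Encode` are hypothetical names, and `encoding/json` stands in for RLP as an example of an encoder that works on the exported fields of the inner struct. Within a single package the restriction is only a convention; it is enforced for code in other packages.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// collationHeaderData has exported fields so that a reflection-based
// encoder can see them.
type collationHeaderData struct {
	ShardID int64
	Period  int64
}

// CollationHeader keeps data unexported: code outside the package can
// read values through the getters but cannot assign to the fields.
type CollationHeader struct {
	data collationHeaderData
}

// NewCollationHeader is the only way for outside code to set the values.
func NewCollationHeader(shardID, period int64) *CollationHeader {
	return &CollationHeader{data: collationHeaderData{ShardID: shardID, Period: period}}
}

// ShardID returns the shard the collation belongs to.
func (h *CollationHeader) ShardID() int64 { return h.data.ShardID }

// Period returns the period the collation corresponds to.
func (h *CollationHeader) Period() int64 { return h.data.Period }

// Encode serializes only the inner data struct.
func (h *CollationHeader) Encode() ([]byte, error) { return json.Marshal(h.data) }

func main() {
	h := NewCollationHeader(1, 2)
	enc, err := h.Encode()
	fmt.Println(h.ShardID(), h.Period(), string(enc), err)
}
```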


sharding/db.go Outdated
"github.com/ethereum/go-ethereum/common"
)

type shardKV struct {
Member

+1 Some basic tests for Get, Has, Put and Delete would be nice
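A rough idea of what such basic checks could cover, written as a plain function so it runs on its own; in the repository it would more naturally be a `func TestShardKV(t *testing.T)` that reports failures with `t.Errorf`. The store below is reconstructed from the snippets in this review and is an assumption: `Hash` stands in for `common.Hash`, `Get` returns `[]byte` for simplicity, and `checkLifecycle` is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

// Hash is a stand-in for common.Hash.
type Hash [32]byte

// shardKV is a map-backed key-value store.
type shardKV struct {
	kv map[Hash][]byte
}

func makeShardKV() *shardKV { return &shardKV{kv: make(map[Hash][]byte)} }

// Get returns the value for k, or an error naming the missing key.
func (sb *shardKV) Get(k Hash) ([]byte, error) {
	v, ok := sb.kv[k]
	if !ok {
		return nil, fmt.Errorf("key not found: %v", k)
	}
	return v, nil
}

// Has reports whether k is present.
func (sb *shardKV) Has(k Hash) bool {
	_, ok := sb.kv[k]
	return ok
}

// Put stores v under k, overwriting any previous value.
func (sb *shardKV) Put(k Hash, v []byte) error {
	sb.kv[k] = v
	return nil
}

// Delete removes k; deleting a missing key is not an error.
func (sb *shardKV) Delete(k Hash) error {
	delete(sb.kv, k)
	return nil
}

// checkLifecycle runs a Put/Has/Get/Delete sequence and returns the
// first failed expectation, or nil if all of them hold.
func checkLifecycle(sb *shardKV) error {
	k := Hash{0x01}
	if sb.Has(k) {
		return errors.New("Has reported a key before Put")
	}
	if _, err := sb.Get(k); err == nil {
		return errors.New("Get returned no error for a missing key")
	}
	if err := sb.Put(k, []byte("v")); err != nil {
		return fmt.Errorf("Put failed: %v", err)
	}
	v, err := sb.Get(k)
	if err != nil || string(v) != "v" {
		return fmt.Errorf("Get after Put: got %q, err %v", v, err)
	}
	if err := sb.Delete(k); err != nil {
		return fmt.Errorf("Delete failed: %v", err)
	}
	if sb.Has(k) {
		return errors.New("key still present after Delete")
	}
	return nil
}

func main() {
	fmt.Println(checkLifecycle(makeShardKV()))
}
```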

return nil
}

// HeaderByHash of collation.
Member

can we make the function description comment more descriptive?

Member

prestonvanloon left a comment

This is looking really solid. Just a few comments for code optimization and to remove commented code.

// body represents the serialized blob of a collation's transactions.
body []byte
// transactions serves as a useful slice to store deserialized chunks from the
// collation's body. Every time this transactions slice is updated, the serialized
Member

Please update comment as described above.

// TODO: this test needs to change as we will be serializing tx's into blobs
// within the collation body instead.

// func TestCollation_AddTransactions(t *testing.T) {
Member

Delete all of this?

if err != nil {
return nil, fmt.Errorf("cannot fetch body by chunk root: %v", err)
}
// TODO: deserializes the body into a txs slice instead of using
Member

Please create github issue for this. It would be a nice place for someone to contribute after serialization is added.

Contributor Author

I'll create as soon as we merge

return nil, err
}
if body == nil {
return nil, fmt.Errorf("no corresponding body with chunk root found: %v", chunkRoot.String())
Member

Change this to %s and I think you don't have to call .String() since go will do that automatically.

if val == nil {
return false, fmt.Errorf("availability not set for header")
}
availability := *val
Member

I believe this makes a copy of the data. Can't you access val[0]?

Contributor Author

cause val is type *[]byte

Member

Yep! I am mistaken about copying the data. With a slice, like you have here, it is copying the internal slice pointers so this is not an expensive operation.

In fact, it is OK to pass in a slice as an argument since it's really just one pointer and two values (beginning, len, cap).

https://blog.golang.org/go-slices-usage-and-internals
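The point made in this thread can be checked with a few lines: dereferencing a `*[]byte` copies only the slice header, so the result still shares its backing array with the original. `derefShares` is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// derefShares reports whether a slice obtained by dereferencing a
// *[]byte still shares its backing array with the original slice.
func derefShares() bool {
	data := []byte{1}
	val := &data // *[]byte, the type mentioned in this thread

	availability := *val // copies the slice header (pointer, len, cap) only
	availability[0] = 0  // writes through to the shared backing array

	return data[0] == 0
}

func main() {
	fmt.Println(derefShares())
}
```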

}
// sets the key to be the canonical collation lookup key and val as RLP encoded
// collation header.
if err := s.shardDB.Put(key, encoded); err != nil {
Member

Return this
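The suggestion here, like the earlier `return s.Decode(&h.data)` comment, is to return the call's error directly when nothing else happens afterwards. A small sketch with a hypothetical `put` function standing in for `s.shardDB.Put`:

```go
package main

import (
	"errors"
	"fmt"
)

// put is a hypothetical stand-in for s.shardDB.Put.
func put(fail bool) error {
	if fail {
		return errors.New("write failed")
	}
	return nil
}

// saveVerbose checks the error only to return it unchanged.
func saveVerbose(fail bool) error {
	if err := put(fail); err != nil {
		return err
	}
	return nil
}

// saveDirect behaves identically with less code.
func saveDirect(fail bool) error {
	return put(fail)
}

func main() {
	fmt.Println(saveVerbose(true), saveDirect(true))
	fmt.Println(saveVerbose(false), saveDirect(false))
}
```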


// dataAvailabilityLookupKey formats a string that will become a lookup
// key in the shardDB.
func dataAvailabilityLookupKey(chunkRoot *common.Hash) common.Hash {
Member

Nevermind.

// dataAvailabilityLookupKey formats a string that will become a lookup
// key in the shardDB.
func dataAvailabilityLookupKey(chunkRoot *common.Hash) common.Hash {
key := fmt.Sprintf("availability-lookup:%s", chunkRoot.String())
Member

I don't think you need to call .String() it should be automatic for the formatter with %s.

// of the shard for ease of use.
func canonicalCollationLookupKey(shardID *big.Int, period *big.Int) common.Hash {
str := "canonical-collation-lookup:shardID=%s,period=%s"
key := fmt.Sprintf(str, shardID.String(), period.String())
Member

I don't think you need to call .String(), it should be automatic.

}

if canonicalHeaderHash.String() != headerHash.String() {
t.Errorf("header hashes do not match. want=%v. got=%v", headerHash.String(), canonicalHeaderHash.String())
Member

Remove .String() and change %v to %s

Member

terencechain left a comment

Great job on the well-written comments! Makes it really easy to understand the code

type collationHeaderData struct {
ShardID *big.Int // the shard ID of the shard.
ChunkRoot *common.Hash // the root of the chunk tree which identifies collation body.
Period *big.Int // the period number in which collation to be Pncluded.
Member

s/Pncluded/included

// Hash takes the keccak256 of the collation header's contents.
func (h *CollationHeader) Hash() (hash common.Hash) {
hw := sha3.NewKeccak256()
rlp.Encode(hw, h)
Member

Makes more sense to encode h.data instead of h?

Member

prestonvanloon left a comment

The best pull request I have seen all day. Great job

rauljordan merged commit 0228955 into prysmaticlabs:master May 16, 2018
Validator Client automation moved this from To do to Done May 16, 2018
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
WIP: Shard Chain Type With Useful Methods for Notary/Proposer Clients
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
WIP: Shard Chain Type With Useful Methods for Notary/Proposer Clients
Former-commit-id: d7f0bdf
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
Former-commit-id: 5febe72a5a1936ce643488067e0990da810f1f5e [formerly 74c85fc]
Former-commit-id: 0cc6d45
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
Former-commit-id: 9ab2595c130922bc49fc3c9a69f11dd93b6774fb [formerly 32c501e]
Former-commit-id: cfe18a1
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
Former-commit-id: 1fc773af9e07a79f63bf63a097526a7c3303cd14 [formerly 59b4763]
Former-commit-id: 6b8162c
prestonvanloon pushed a commit that referenced this pull request Jul 22, 2018
WIP: Shard Chain Type With Useful Methods for Notary/Proposer Clients
Former-commit-id: 6e12288539871b6e5077f21564a3161dfecd4c0b [formerly d7f0bdf]
Former-commit-id: bca3868
3 participants