feat: create square builder class #1659
Co-authored-by: Rootul P <rootulp@gmail.com>
app/estimate_square_size_test.go
Outdated
func txsToBytes(txs coretypes.Txs) [][]byte {
	e := make([][]byte, len(txs))
	for i, tx := range txs {
		e[i] = []byte(tx)
	}
	return e
}
note: this was moved out from the shares package because no one was using it. IIRC there's a similar function in core we could just use
pkg/square/builder.go
Outdated
//
// Note that the padding would actually belong to the namespace of the transaction before it, but
// this makes no difference to the total share size.
maxPadding: shares.SubTreeWidth(shareSize) - 1,
For the reviewer: Please make sure I have this assumption correct
[question][blocking] What is the relationship between subtree width and this rule from ADR013?
"Blobs start at an index that is equal to a multiple of the blob length divided by MaxSquareSize rounded up."
Related: should we extract a function that calculates the max padding for a blob and unit test it in isolation? Then the unit tests can use the table here to verify max padding matches what we expect: https://github.com/celestiaorg/celestia-app/blob/main/docs/architecture/assets/worst-case-padding-in-blob-size-range.png
"Blobs start at an index that is equal to a multiple of the blob length divided by MaxSquareSize rounded up."
Where do you see this in ADR013? I can't find it.
Now we use SubtreeRootThreshold to work out the height in the tree of the highest Merkle mountain, and thus the width of the leaves.
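A minimal sketch of how a subtree width, and from it the worst-case padding, could be derived from a threshold. This is one reading of ADR013, not the PR's code: the helper names `subtreeWidth` and `maxPadding` and the threshold value of 64 are assumptions for illustration, and the real implementation may additionally cap the width.

```go
package main

import "fmt"

// roundUpPowerOfTwo returns the smallest power of two that is >= v.
// Values below 1 yield 1.
func roundUpPowerOfTwo(v int) int {
	p := 1
	for p < v {
		p <<= 1
	}
	return p
}

// subtreeWidth is a hypothetical helper: it divides the blob's share count by
// the threshold (rounding up) and then rounds the result up to a power of two.
func subtreeWidth(shareCount, subtreeRootThreshold int) int {
	s := shareCount / subtreeRootThreshold
	if shareCount%subtreeRootThreshold != 0 {
		s++
	}
	return roundUpPowerOfTwo(s)
}

// maxPadding is the worst-case number of padding shares in front of a blob:
// one less than the width the blob must be aligned to.
func maxPadding(shareCount, subtreeRootThreshold int) int {
	return subtreeWidth(shareCount, subtreeRootThreshold) - 1
}

func main() {
	const threshold = 64 // illustrative value, assumed for this sketch
	for _, n := range []int{1, 64, 65, 129} {
		fmt.Printf("shares=%d width=%d maxPadding=%d\n",
			n, subtreeWidth(n, threshold), maxPadding(n, threshold))
	}
}
```

Extracting the calculation like this would also make it straightforward to unit test against a table of expected worst-case padding values.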
"Where do you see this in ADR013? I can't find it"
Sorry, I should've linked directly to the line:
celestia-app/docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md
Line 27 in f31ee58
Blobs start at an index that is equal to a multiple of the blob length divided by `MaxSquareSize` rounded up.
func isPowerOf2(v uint64) bool {
	return v&(v-1) == 0 && v != 0
}

func BlobsFromProto(blobs []core.Blob) ([]coretypes.Blob, error) {
	result := make([]coretypes.Blob, len(blobs))
	for i, blob := range blobs {
		if blob.ShareVersion > math.MaxUint8 {
			return nil, fmt.Errorf("share version %d is too large to be a uint8", blob.ShareVersion)
		}
		result[i] = coretypes.Blob{
			NamespaceID:  blob.NamespaceId,
			Data:         blob.Data,
			ShareVersion: uint8(blob.ShareVersion),
		}
	}
	return result, nil
}
Note: these functions are either not used (and probably shouldn't be) or redundant.
// AvailableBytesFromSparseShares returns the maximum amount of bytes that could fit in `n` sparse shares
func AvailableBytesFromSparseShares(n int) int {
	if n <= 0 {
		return 0
	}
	if n == 1 {
		return appconsts.FirstSparseShareContentSize
	}
	return (n-1)*appconsts.ContinuationSparseShareContentSize + appconsts.FirstSparseShareContentSize
}
Something that the node team might want to use
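For callers going the other way, here is a hedged sketch of the inverse calculation: how many sparse shares a payload of a given length would need. The function name `sparseSharesNeeded` and both constant values are placeholders assumed for this example; the real sizes live in appconsts and may differ.

```go
package main

import "fmt"

// Placeholder content sizes, assumed for illustration only.
const (
	firstSparseShareContentSize        = 478
	continuationSparseShareContentSize = 482
)

// availableBytes mirrors the shape of AvailableBytesFromSparseShares above,
// using the placeholder constants.
func availableBytes(n int) int {
	if n <= 0 {
		return 0
	}
	return firstSparseShareContentSize + (n-1)*continuationSparseShareContentSize
}

// sparseSharesNeeded is a hypothetical inverse: the smallest number of sparse
// shares whose combined capacity is at least dataLen bytes.
func sparseSharesNeeded(dataLen int) int {
	if dataLen <= 0 {
		return 0
	}
	if dataLen <= firstSparseShareContentSize {
		return 1
	}
	remaining := dataLen - firstSparseShareContentSize
	// Ceiling division over the continuation share capacity.
	extra := (remaining + continuationSparseShareContentSize - 1) / continuationSparseShareContentSize
	return 1 + extra
}

func main() {
	for _, n := range []int{1, 2, 3} {
		capacity := availableBytes(n)
		fmt.Printf("%d shares hold %d bytes; %d bytes need %d shares\n",
			n, capacity, capacity+1, sparseSharesNeeded(capacity+1))
	}
}
```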
// here we keep track of the actual data to go in a square
txs   [][]byte
pfbs  []*coretypes.IndexWrapper
blobs []*element
// Imagine if, according to the share commitment rules, a transaction took up 11 shares
// and had the merkle mountain tree commitment of 4,4,2,1. The first part of the share commitment
I'm having a tough time relating this example to the ascii diagram above b/c it discusses a blob that occupies 11 shares but there are only 8 possible share indexes in the ASCII diagram
The 8 possible share indexes are a snapshot of the tree. It's used to demonstrate that if the last transaction ended at index 1 (inclusive) and the SubtreeWidth() was 4, then the next one would start at index 4 and we would have 2 shares of padding.
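The arithmetic in that example can be sketched as follows. `nextAlignedIndex` is a hypothetical helper written for this illustration, not a function from the PR.

```go
package main

import "fmt"

// nextAlignedIndex rounds cursor up to the next multiple of width.
// width must be greater than zero.
func nextAlignedIndex(cursor, width int) int {
	if cursor%width == 0 {
		return cursor
	}
	return (cursor/width + 1) * width
}

func main() {
	lastEnd := 1           // previous transaction ends at index 1 (inclusive)
	cursor := lastEnd + 1  // first free share index
	width := 4             // subtree width of the next blob
	start := nextAlignedIndex(cursor, width)
	padding := start - cursor
	fmt.Println(start, padding) // prints "4 2"
}
```

The worst case occurs when the cursor sits one share past an aligned index, which gives width - 1 shares of padding and matches the `maxPadding` formula under review.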
I also think this example is outdated with the implementation of ADR013.
I copied this tree across from somewhere else just so I could explain why we have padding and why that formula gives us the maximum possible padding
Codecov Report
@@ Coverage Diff @@
## main #1659 +/- ##
==========================================
+ Coverage 51.56% 52.67% +1.11%
==========================================
Files 95 97 +2
Lines 5954 6185 +231
==========================================
+ Hits 3070 3258 +188
- Misses 2570 2598 +28
- Partials 314 329 +15
nice work!! overall this is a really big improvement
mostly non-blocking naming things
// Reconstruct takes a list of ordered transactions and reconstructs a square, validating that
// all PFBs are ordered after regular transactions and that the transactions don't collectively
// exceed the maxSquareSize. Note that this function does not check the underlying validity of
// the transactions.
func Reconstruct(txs [][]byte, maxSquareSize int) (Square, error) {
[not blocking]
I can see this naming scheme becoming confusing, since "reconstructing the square" is already a term we use to describe the process of collecting enough erasure-encoded shares to rebuild the EDS (for example, search for the word "reconstruct" in celestia-node).
I understand using the word reconstruct here, but perhaps we could use a synonym of construct? Build and Rebuild?
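Whatever the function ends up being called, the ordering rule in its doc comment (all PFBs come after regular transactions) could be checked roughly like this. It is a sketch only: `validateOrdering` and the `toyIsPFB` predicate are hypothetical stand-ins, and real code would have to decode each transaction to decide whether it is a PFB.

```go
package main

import "fmt"

// toyIsPFB is a stand-in predicate for this example: it treats any
// transaction whose first byte is 'P' as a PFB.
func toyIsPFB(tx []byte) bool {
	return len(tx) > 0 && tx[0] == 'P'
}

// validateOrdering returns an error if a regular transaction appears after
// a PFB, i.e. if the PFBs are not all grouped at the end of the list.
func validateOrdering(txs [][]byte, isPFB func([]byte) bool) error {
	seenPFB := false
	for i, tx := range txs {
		if isPFB(tx) {
			seenPFB = true
			continue
		}
		if seenPFB {
			return fmt.Errorf("regular transaction at index %d follows a PFB", i)
		}
	}
	return nil
}

func main() {
	ordered := [][]byte{[]byte("tx1"), []byte("tx2"), []byte("Pfb1")}
	mixed := [][]byte{[]byte("tx1"), []byte("Pfb1"), []byte("tx2")}
	fmt.Println(validateOrdering(ordered, toyIsPFB))
	fmt.Println(validateOrdering(mixed, toyIsPFB))
}
```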
pkg/square/builder.go
Outdated
blob      core.Blob
pfbIndex  int
blobIndex int
shareSize int
[not blocking naming thing]
Is it possible to use a different word besides shareSize that describes what this is? This is the number of shares used, correct? shareSize makes me think of
celestia-app/pkg/appconsts/appconsts.go
Line 21 in f31ee58
ShareSize = 512
size := iw.Size()
pfbShareDiff := c.pfbCounter.Add(size)
nice, this is a significant improvement to
celestia-app/app/estimate_square_size.go
Lines 74 to 105 in f31ee58
// maxWrappedTxOverhead calculates the maximum amount of overhead introduced by
// wrapping a transaction with a shares index
//
// TODO: make more efficient by only generating these numbers once or something
// similar. This function alone can take up to 5ms.
func maxIndexWrapperOverhead(squareSize uint64) int {
	maxTxLen := squareSize * squareSize * appconsts.ContinuationCompactShareContentSize
	wtx, err := coretypes.MarshalIndexWrapper(
		make([]byte, maxTxLen),
		uint32(squareSize*squareSize),
	)
	if err != nil {
		panic(err)
	}
	return len(wtx) - int(maxTxLen)
}

// maxIndexOverhead calculates the maximum amount of overhead in bytes that
// could occur by adding an index to an IndexWrapper.
func maxIndexOverhead(squareSize uint64) int {
	maxShareIndex := squareSize * squareSize
	maxIndexLen := binary.PutUvarint(make([]byte, binary.MaxVarintLen32), maxShareIndex)
	wtx, err := coretypes.MarshalIndexWrapper(make([]byte, 1), uint32(maxShareIndex))
	if err != nil {
		panic(err)
	}
	wtx2, err := coretypes.MarshalIndexWrapper(make([]byte, 1), uint32(maxShareIndex), uint32(maxShareIndex-1))
	if err != nil {
		panic(err)
	}
	return len(wtx2) - len(wtx) + maxIndexLen
}
I'm working on some changes locally for refactoring how the block size is limited, and I think this change pairs very well.
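For context on why the old helpers depended on the square size: the share index is varint-encoded, so its length in bytes grows with the largest possible index. A small standalone sketch using only `encoding/binary` (the `uvarintLen` helper and the square sizes chosen are assumptions for illustration):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uvarintLen returns how many bytes the unsigned varint encoding of x uses.
func uvarintLen(x uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, x)
}

func main() {
	// Illustrative square sizes; the largest share index is taken to be
	// squareSize * squareSize, as in the quoted code.
	for _, squareSize := range []uint64{8, 64, 128} {
		maxShareIndex := squareSize * squareSize
		fmt.Printf("squareSize=%d maxShareIndex=%d varintLen=%d\n",
			squareSize, maxShareIndex, uvarintLen(maxShareIndex))
	}
}
```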
Thought I'd post some relatively crude results comparing the square size estimation algorithm before and after (each PFB is 512 bytes)
Ref: #1214
This creates a builder struct for assembling squares from a (prioritised) list of transactions.
It also includes two functions, Construct and Reconstruct, as listed in ADR20.
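To illustrate the general idea of building a square from a prioritised list, here is a deliberately simplified toy. It is not the builder from this PR: `toyBuilder`, `newToyBuilder` and `Append` are invented names, and the sketch ignores padding, compact shares and namespaces, counting only whole shares against a capacity of maxSquareSize * maxSquareSize.

```go
package main

import "fmt"

// toyBuilder tracks how many shares have been used out of a fixed capacity.
type toyBuilder struct {
	maxShares int
	used      int
	accepted  []int // share counts of the items that were accepted, in order
}

// newToyBuilder creates a builder for a square of the given width.
func newToyBuilder(maxSquareSize int) *toyBuilder {
	return &toyBuilder{maxShares: maxSquareSize * maxSquareSize}
}

// Append tries to add an item occupying shareCount shares. It reports
// whether the item fit; items that do not fit are simply skipped.
func (b *toyBuilder) Append(shareCount int) bool {
	if shareCount <= 0 || b.used+shareCount > b.maxShares {
		return false
	}
	b.used += shareCount
	b.accepted = append(b.accepted, shareCount)
	return true
}

func main() {
	b := newToyBuilder(4) // capacity of 16 shares
	// Items are assumed to arrive in priority order.
	for _, shareCount := range []int{10, 8, 5, 2} {
		fmt.Printf("item of %d shares accepted: %v\n", shareCount, b.Append(shareCount))
	}
	fmt.Printf("used %d of %d shares\n", b.used, b.maxShares)
}
```

Skipping an item that does not fit, rather than stopping, lets lower-priority but smaller items still fill the remaining space.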