Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVM: Implement lsig size pooling #6057

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

giuliop
Copy link

@giuliop giuliop commented Jul 8, 2024

Implement pooling of logic signatures' size across a transaction group.

This PR addresses this issue by implementing pooling of logic signatures' size across a transaction group.
As discussed in the issue, given that currently it is possible to submit a group with 16 1KB lsigs, it seems logical to allow pooling of the total size across a group.

Implementation

The pooled max lsig length for the group is set at len(group) * LogicSigMaxSize, as done for lsig opcode budget pooling, for both node validation and goal clerk dryrun.

A group is valid if the sum of all lsigs' program and args size in the group does not exceed the pooled max lsig length.

The consensus change(EnableLogicSigSizePooling bool) is enabled in vFuture in this PR.

To make pass the TestMaxSizesCorrect, I had to change the value of TxnTagMaxSize.
Two observations on this:

  1. The comments of TestMaxSizesCorrect say:
// If this test fails, DO NOT JUST UPDATE THE CONSTANTS OR MODIFY THE TEST!
// Instead you need to introduce a new version of the protocol and mechanisms
// to ensure that nodes on different proto versions don't reject each others messages due to exceeding
// max size network protocol version

Beside updating the constant I did introduce a new version of the protocol (in vFuture) but I don't fully understand the comment so not sure if I was supposed to do something else as well.

  1. To calculate the value of TxnTagMaxSize, TestMaxSizesCorrect calls the function SignedTxnMaxSize which calculates the LogicSigMaxSize calling this function:
// MaxSize returns a maximum valid message size for this message type
func LogicSigMaxSize() (s int) {
	s = 1 + 2 + msgp.BytesPrefixSize + config.MaxLogicSigMaxSize + 4 + crypto.SignatureMaxSize() + 5 + crypto.MultisigSigMaxSize() + 4
	// Calculating size of slice: z.Args
	s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))
	return
}

config.MaxLogicSigMaxSize used to be 1KB and this PR changes it to 16KB.
What I don't undestand is why we multiply config.MaxLogicSigMaxSize by EvalMaxArgs which is 255, ending up with what looks like an exagerrated value for LogicSigMaxSize.
I did not want to touch this in case I am missing something.

Testing plan

The PR is marked as draft since it does not includes tests yet.
I've run separately integration tests testing success and failure for single transactions and groups, both for goal clerk dryrun and actually sending the transactions.

Before integrating the tests in the codebase I'd like to get some feedback:

  1. Is the overall approach ok?
  2. How would you like to see the tests? I suppose integration tests only? Any guidance on where to put the tests and which existing tests to use as reference would be very helpful

Copy link

codecov bot commented Jul 10, 2024

Codecov Report

Attention: Patch coverage is 29.62963% with 19 lines in your changes missing coverage. Please review.

Project coverage is 56.18%. Comparing base (37a9a18) to head (7953862).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
data/transactions/verify/txn.go 16.66% 8 Missing and 2 partials ⚠️
cmd/goal/clerk.go 0.00% 7 Missing ⚠️
data/transactions/logic/eval.go 66.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6057      +/-   ##
==========================================
- Coverage   56.20%   56.18%   -0.03%     
==========================================
  Files         494      494              
  Lines       69931    69945      +14     
==========================================
- Hits        39304    39297       -7     
- Misses      27953    27968      +15     
- Partials     2674     2680       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gmalouf gmalouf requested a review from jannotti July 10, 2024 19:10
cmd/goal/clerk.go Outdated Show resolved Hide resolved
data/transactions/verify/txn.go Outdated Show resolved Hide resolved
data/transactions/verify/txn.go Show resolved Hide resolved
@jannotti
Copy link
Contributor

I tentatively agree with you about you thinking about LogicSigMaxSize(). There is no explicit constant for a maximum argument size, so an argument can indeed by the whole 16kb in size. But, of course, you can't have 255 of them. So I think we want to capture the fact that the the msgp size can by 16kb plus the overhead of (up to) 255 arguments. Each additional argument causes the msgp serialization to be a bit bigger in order to encode the length of the argument. I think that means we just remove config.MaxLogicSIgMaxSize from the sum being multiplied.

protocol/tags.go Outdated Show resolved Hide resolved
@giuliop
Copy link
Author

giuliop commented Jul 20, 2024

I tentatively agree with you about you thinking about LogicSigMaxSize(). There is no explicit constant for a maximum argument size, so an argument can indeed by the whole 16kb in size. But, of course, you can't have 255 of them. So I think we want to capture the fact that the the msgp size can by 16kb plus the overhead of (up to) 255 arguments. Each additional argument causes the msgp serialization to be a bit bigger in order to encode the length of the argument. I think that means we just remove config.MaxLogicSIgMaxSize from the sum being multiplied.

I agree with your logic and my first attempt was to do:

type LogicSig struct {
	_struct struct{} `codec:",omitempty,omitemptyarray"`

	// Logic signed by Sig or Msig, OR hashed to be the Address of an account.
	Logic []byte `codec:"l,allocbound=config.MaxLogicSigMaxSize"`

	Sig  crypto.Signature   `codec:"sig"`
	Msig crypto.MultisigSig `codec:"msig"`

	// Args are not signed, but checked by Logic
-	Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize"
+	Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=0"`
}

This makes msgp generate:

func LogicSigMaxSize() (s int) {
	s = 1 + 2 + msgp.BytesPrefixSize + config.MaxLogicSigMaxSize + 4 + crypto.SignatureMaxSize() + 5 + crypto.MultisigSigMaxSize() + 4
	// Calculating size of slice: z.Args
-	s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))
+	s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + 0))
	return
}

which implements the idea that the size of the actual arguments is already captured in config.MaxLogicSigMaxSize and we only want to add the overhead of the arg len encoding times the max number of args

The issue is that we are saying that the max length on an argument is 0 and this is used also in the func (z *LogicSig) UnmarshalMsgWithState(...) which gets broken as shown by the resulting diff:

diff --git a/data/transactions/msgp_gen.go b/data/transactions/msgp_gen.go
index 7cc22db08..955c1812f 100644
--- a/data/transactions/msgp_gen.go
+++ b/data/transactions/msgp_gen.go
@@ -3306,8 +3306,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                               if zb0007 > config.MaxLogicSigMaxSize {
-                                       err = msgp.ErrOverflow(uint64(zb0007), uint64(config.MaxLogicSigMaxSize))
+                               if zb0007 > 0 {
+                                       err = msgp.ErrOverflow(uint64(zb0007), uint64(0))
                                        return
                                }
@@ -3395,8 +3395,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                                       if zb0011 > config.MaxLogicSigMaxSize {
-                                               err = msgp.ErrOverflow(uint64(zb0011), uint64(config.MaxLogicSigMaxSize))
+                                       if zb0011 > 0 {
+                                               err = msgp.ErrOverflow(uint64(zb0011), uint64(0))
                                                return
                                        }

My second attempt was to say instead that the max len of the arg array is one max size arg by doing:

Args [][]byte `codec:"arg,allocbound=1,allocbound=config.MaxLogicSigMaxSize"`

which generates
s += msgp.ArrayHeaderSize + ((1) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))

which also breaks func (z *LogicSig) UnmarshalMsgWithState(...) since we are saying the max number of args is 1:

+++ b/data/transactions/msgp_gen.go
@@ -3287,8 +3287,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                       if zb0005 > EvalMaxArgs {
-                               err = msgp.ErrOverflow(uint64(zb0005), uint64(EvalMaxArgs))
+                       if zb0005 > 1 {
+                               err = msgp.ErrOverflow(uint64(zb0005), uint64(1))
                                err = msgp.WrapError(err, "struct-from-array", "Args")
                                return
                        }
@@ -3376,8 +3376,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                               if zb0009 > EvalMaxArgs {
-                                       err = msgp.ErrOverflow(uint64(zb0009), uint64(EvalMaxArgs))
+                               if zb0009 > 1 {
+                                       err = msgp.ErrOverflow(uint64(zb0009), uint64(1))

I am out of quick fixes, would you have some other idea? Or shall we just leave it very generous as it is now?
Alternatively I would need to understand in depth how msgp works and what is go-algorand codec architecture to suggest some other alternatives but it's a non trivial time investment

@giuliop
Copy link
Author

giuliop commented Jul 29, 2024

OK, wasn't too bad, studying algorand/msgp I found an handy maxtotalbytes directive which I used to fix the max size of the argument array:

@@ -39,7 +39,7 @@ type LogicSig struct {
	// Args are not signed, but checked by Logic
-       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize"`
+       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize,maxtotalbytes=config.MaxLogicSigMaxSize"`
 }

I used config.MaxLogicSigMaxSize for the max size, which generates:

func LogicSigMaxSize() (s int) {
       s = 1 + 2 + msgp.BytesPrefixSize + config.MaxLogicSigMaxSize + 4 + crypto.SignatureMaxSize() + 5 + crypto.MultisigSigMaxSize() + 4
       // Calculating size of slice: z.Args
-       s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))
+       s += msgp.ArrayHeaderSize + config.MaxLogicSigMaxSize
       return
}

and leaves everything else intact.

Let me know if you like this or would consider something else instead of config.MaxLogicSigMaxSize.

I like it because that's very close to the max size for the args array (a bit more because the Logic cannot be zero, a bit less because we are not including the overhead from maxArgs BytesPrefixSize).

In any case in the overall LogicSigMaxSize() we are double counting MaxLogicSigMaxSize for both the Logic and the Args so we have a generous, but reasonable, margin.

@jannotti
Copy link
Contributor

I was thinking about another approach. How about we just keep the allocbound on Args at 1000? There's no particular reason to need very large args. And, in fact, it's totally pointless for it to be bigger than 4k, since teal values can't be bigger that 4k. If you tried to access an arg > 4k, you will panic anyway.

@giuliop
Copy link
Author

giuliop commented Jul 29, 2024

You made me realize that indeed we need to decide which is the max limit for a logicsig arg.
Before we had a combined cap for Logic + Args of 1000 which was lower than the AVM limit on teal args of 4096, so it implicitly set the limit at 1000.
Now that the combined cap is 16000 > 4096 we need to decide what is the limit for args.

My first reaction would be to say 4096 to match the AVM, why artificially lower it after all.

The argument I see in favor of using 1000 is not to change the old value of TxnTagMaxSize.
If there is a concern that we might inadvertently break something then it's more prudent to use 1000 and of course if someone needs to pass a larger arg they can always split it up over multiple args.

In sum, I see an allocbound on arg size of 4096 as the most logical choice, but if maintaining the old TxnTagMaxSize is valuable then 1000 is a fine choice too.

@jannotti jannotti self-assigned this Aug 26, 2024
@jannotti jannotti changed the title Implement lsig size pooling AVM: Implement lsig size pooling Aug 26, 2024
@jannotti
Copy link
Contributor

You made me realize that indeed we need to decide which is the max limit for a logicsig arg. Before we had a combined cap for Logic + Args of 1000 which was lower than the AVM limit on teal args of 4096, so it implicitly set the limit at 1000. Now that the combined cap is 16000 > 4096 we need to decide what is the limit for args.

My first reaction would be to say 4096 to match the AVM, why artificially lower it after all.

Yes, we ought to allow 4096 for individual args. I suppose it can be checked in transactions/data/logic/eval.go near here:

	if cx.txn.Lsig.Args != nil && len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
		return false, errTooManyArgs
	}

though it's unclear why we don't check this in txn.WellFormed. It can be an unconditional check (no need to have versioning) because old transactions are implicitly limited to < 1000 bytes.

I dislike that this causes a change in TxnTagMaxSize but I don't see a way around it. It was nice when state proof txn was so much bigger than anything else, we didn't have to adjust TxnTagMaxSize for a simple txn addition. In the future, I suppose we will have to do so, so the limit is hit by a "normal" txn, so any increase increases the limit. The comment explaining TxnTagMaxSize will need to be reworked.

There is a unit test that caught this right?

cmd/goal/clerk.go Outdated Show resolved Hide resolved
cmd/goal/clerk.go Show resolved Hide resolved
@jannotti
Copy link
Contributor

I'm really confused by that comment I mentioned. It says:

// TxnTagMaxSize is the maximum size of a TxnTag message. This is equal to SignedTxnMaxSize()
// which is size of just a single message containing maximum Stateproof. Since Stateproof
// transactions can't be batched we don't need to multiply by MaxTxnBatchSize.

But that doesn't seem right. TransactionMaxSize() seems to bound the size of a single transaction, not stateproof in particular, and it includes the use of all fields, though many are mutually exclusive.

But, it also hints that the TxnTag should take into account 16 "normal" transactions, but doesn't bother to do so, since it's the stateproof txn that really causes the number to blow up. (I think the TxnTag is used for groups, not single transactions)

@jannotti
Copy link
Contributor

After further discussion, we're confident that we can make changes to the max size tests so it doesn't fail so easily, without issue. I'll make those changes in a separate PR. If you address the rest in the meantime (limit arg length, maybe remove the assembly checks) I'll get the max length changes in.

@giuliop
Copy link
Author

giuliop commented Aug 28, 2024

Yes, we ought to allow 4096 for individual args. I suppose it can be checked in transactions/data/logic/eval.go near here:

	if cx.txn.Lsig.Args != nil && len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
		return false, errTooManyArgs
	}

though it's unclear why we don't check this in txn.WellFormed.

Done, and I also think that the checks in func eval on the number and size of lsig args ought to be in txn.WellFormed

data/transactions/logic/eval.go
...
+var errLogicSigArgTooLarge = errors.New("LogicSig argument too large")
...
@@ -1305,8 +1306,15 @@ func eval(program []byte, cx *EvalContext) (pass bool, err error) {
        if (cx.EvalParams.Proto == nil) || (cx.EvalParams.Proto.LogicSigVersion == 0) {
                return false, errLogicSigNotSupported
        }
-       if cx.txn.Lsig.Args != nil && len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
-               return false, errTooManyArgs
+       if cx.txn.Lsig.Args != nil {
+               if len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
+                       return false, errTooManyArgs
+               }
+               for _, arg := range cx.txn.Lsig.Args {
+                       if len(arg) > transactions.MaxLogicSigArgSize {
+                               return false, errLogicSigArgTooLarge
+                       }
+               }

data/transactions/logicsig.go
 // EvalMaxArgs is the maximum number of arguments to an LSig
 const EvalMaxArgs = 255

+// MaxLsigArgSize is the maximum size of an argument to an LSig
+// We use 4096 to match the maximum size of a TEAL value
+// (as defined in `const maxStringSize` in package logic)
+const MaxLogicSigArgSize = 4096
...
@@ -39,7 +44,7 @@ type LogicSig struct {
        Msig crypto.MultisigSig `codec:"msig"`

        // Args are not signed, but checked by Logic
-       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize,maxtotalbytes=config.MaxLogicSigMaxSize"`
+       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=MaxLogicSigArgSize,maxtotalbytes=config.MaxLogicSigMaxSize"`
 }

I dislike that this causes a change in TxnTagMaxSize but I don't see a way around it. It was nice when state proof txn was so much bigger than anything else, we didn't have to adjust TxnTagMaxSize for a simple txn addition. In the future, I suppose we will have to do so, so the limit is hit by a "normal" txn, so any increase increases the limit. The comment explaining TxnTagMaxSize will need to be reworked.

There is a unit test that caught this right?

Yes, this one

@giuliop
Copy link
Author

giuliop commented Aug 28, 2024

After further discussion, we're confident that we can make changes to the max size tests so it doesn't fail so easily, without issue. I'll make those changes in a separate PR. If you address the rest in the meantime (limit arg length, maybe remove the assembly checks) I'll get the max length changes in.

I did implement them before reading your comment, I spent some time thinking about it and my best idea in the end was to use the max block length as the max for the transaction / transaction group message tag (as per below).
I can remove that if you want.

protocol/tags.go
@@ -84,10 +84,10 @@ const StateProofSigTagMaxSize = 6378
 // Matches  current network.MaxMessageLength
 const TopicMsgRespTagMaxSize = 6 * 1024 * 1024

-// TxnTagMaxSize is the maximum size of a TxnTag message. This is equal to SignedTxnMaxSize()
-// which is size of just a single message containing maximum Stateproof. Since Stateproof
-// transactions can't be batched we don't need to multiply by MaxTxnBatchSize.
-const TxnTagMaxSize = 4394756
+// TxnTagMaxSize is the maximum size of a TxnTag message.
+// Matches current config.MaxTxnBytesPerBlock, the maximum lenght of a block,
+// since a transaction or transaction group cannot be larger than a block.
+const TxnTagMaxSize = 5 * 1024 * 1024

node/node_test.go
@@ -805,7 +804,7 @@ func TestMaxSizesCorrect(t *testing.T) {
        require.Equal(t, ppSize, protocol.ProposalPayloadTag.MaxMessageSize())
        spSize := uint64(stateproof.SigFromAddrMaxSize())
        require.Equal(t, spSize, protocol.StateProofSigTag.MaxMessageSize())
-       txSize := uint64(transactions.SignedTxnMaxSize())
+       txSize := uint64(config.MaxTxnBytesPerBlock)
        require.Equal(t, txSize, protocol.TxnTag.MaxMessageSize())

@jannotti
Copy link
Contributor

jannotti commented Sep 9, 2024

Ok, we merged a looser check that should allow this to pass tests. We came to pretty much the same result (5M limit) by another path. If you merge and get green checks, I'm ready to approve and I'll talk to others to get another final approval.

This reverts commit 91d6e09
which is not needed anymore after the merge of:
AVM: Derive looser, but more principled, checks of txn max size algorand#6114
We now let goal clerk compile app program and lsig of any sizes,
even it they will be rejected by the nodes.
@giuliop
Copy link
Author

giuliop commented Sep 10, 2024

I like it, i had to make two changes to pass all tests and make all checks green.

The first change was in TestMaxSizesCorrect:

node/node_test.go
@@ -824,10 +824,10 @@ func TestMaxSizesCorrect(t *testing.T) {
        maxCombinedTxnSize := uint64(transactions.SignedTxnMaxSize())
        // subtract out the two smaller signature sizes (logicsig is biggest, it can *contain* the others)
        maxCombinedTxnSize -= uint64(crypto.SignatureMaxSize() + crypto.MultisigSigMaxSize())
-       // the logicsig size is *also* an overestimate, because it thinks each
-       // logicsig arg can be big, but really the sum of the args and the program
-       // has a max size.
-       maxCombinedTxnSize -= uint64(transactions.EvalMaxArgs * config.MaxLogicSigMaxSize)
+       // the logicsig size is *also* an overestimate, because it thinks that the logicsig and
+       // the logicsig args can both be up to to MaxLogicSigMaxSize, but that's the max for
+       // them combined, so it double counts and we have to subtract one.
+       maxCombinedTxnSize -= uint64(config.MaxLogicSigMaxSize)

With lsig pooling the new max size for the lsig arg array is:
EvalMaxArgs (255) * MaxLogicSigArgSize (4096), capped at config.MaxLogicSigMaxSize (16 * 1000), that is 16000 which is added to 16000 for the lsig program size so it is double counted and we remove the double counting in TestMaxSizesCorrect

The second change is in the integration test test/e2e-go/cli/goal/expect/tealConsensusTest.exp where I had to comment out the tests for failing to compile an overly large lsig and an overly large app program since we took out those limitations as per this comment

@giuliop giuliop marked this pull request as ready for review September 10, 2024 11:55
Copy link
Contributor

@jannotti jannotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test you deleted is fine, as I mentioned. But I'm very surprised we don't have an e2e test that confirms the that a too big logicsig is rejected as malformed.

Maybe we do have such a test, but it is being run on a single txn group, so it retains the same behaviour as before. Could you look around for it, and if found, construct one that exercises a two txn group and the different limits?

data/transactions/logic/eval.go Show resolved Hide resolved
// MaxLsigArgSize is the maximum size of an argument to an LSig
// We use 4096 to match the maximum size of a TEAL value
// (as defined in `const maxStringSize` in package logic)
const MaxLogicSigArgSize = 4096
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we export that constant and use it directly?

Copy link
Author

@giuliop giuliop Sep 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice, but package logic already imports package transactions and we cannot have circular imports. We would need to define the constant in package transactions and then import it in package logic.

Or alternatively think if we want to make it a consensus parameter; in this case I guess we would also need to think about const EvalMaxArgs = 255

data/transactions/verify/txn.go Outdated Show resolved Hide resolved
test/e2e-go/cli/goal/expect/tealConsensusTest.exp Outdated Show resolved Hide resolved
@giuliop
Copy link
Author

giuliop commented Sep 14, 2024

Maybe we do have such a test, but it is being run on a single txn group, so it retains the same behaviour as before. Could you look around for it, and if found, construct one that exercises a two txn group and the different limits?

I did not find that test so I created it here
I also had to create this new network template to exercise consensus v18 when lsig max size of 1000 was introduced.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants