Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run multisig tests on CI, fix bug in account size computation #62

Merged
merged 12 commits into from
May 21, 2021

Conversation

ruuda
Copy link
Contributor

@ruuda ruuda commented May 11, 2021

  • Add a build step on CI to build the multisig BPF program and CLI, and run its tests.
  • For some reason, if a transaction fails, the error message is no longer in the output, only the numeric error code. I’ve edited the tests for now.
  • The computation of the size of the transaction account that holds a proposed transaction was wrong; the reserved size was insufficient to hold a change-multisig transaction. I don’t know what the correct size should be unfortunately, but adding 10 bytes of headroom makes the tests pass.

# there.
exclude = [
"multisig"
]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I resolved this by deleting the top Cargo.toml from the multisig repository.

@@ -668,6 +668,10 @@ fn propose_instruction(
.expect("Failed to serialize dummy transaction.");
account_bytes.extend(&multisig::Transaction::discriminator()[..]);

// Taking the just the length of the account is insufficient, apparently
// there is more going on than just this. Add a margin of 10 bytes.
let tx_account_size = account_bytes.len() + 10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 10 a random number or is there a reason to it? This way it seems like a magic number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally it should be the size of the data that will be stored in the account, but my computation of the data (the serialized struct + discriminator tag) is apparently not everything that's stored there, because if I take that, the test fails.

But you are right, this is a hack, I should try to understand what the missing piece is and compute the actual account size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I dumped the account_bytes here, and the result of solana account, and we can see that the account has an 8-byte prefix that is not in my account_bytes here:

$ solana account GYHeshSFA2SzSUT1rko2wzGM51LBrFr9Kh2BSeisJjwM

Public Key: GYHeshSFA2SzSUT1rko2wzGM51LBrFr9Kh2BSeisJjwM
Balance: 0.00242904 SOL
Owner: 6qz6Wg6cxN3hWhyRiNoSWcFCaLi4A6kw4VEYgnJHNBBj
Executable: false
Rent Epoch: 0
Length: 221 (0xdd) bytes
0000:   0b 18 ae 81  cb 75 f2 17  be a7 cb c4  49 fd 5a 54   .....u......I.ZT
0010:   c9 21 d6 31  98 fb ec b0  f7 72 69 70  8a 07 a7 72   .!.1.....rip...r
0020:   0c 39 55 4c  6e 46 d9 4d  56 d6 de fb  40 7e 2b a7   .9ULnF.MV...@~+.
0030:   0b a8 ab 98  da d5 68 5c  41 4f 15 db  d8 b8 4b ab   ......h\AO....K.
0040:   9e f4 bf a4  71 78 37 7e  02 00 00 00  be a7 cb c4   ....qx7~........
0050:   49 fd 5a 54  c9 21 d6 31  98 fb ec b0  f7 72 69 70   I.ZT.!.1.....rip
0060:   8a 07 a7 72  0c 39 55 4c  6e 46 d9 4d  00 01 d3 4a   ...r.9ULnF.M...J
0070:   e3 92 51 7b  44 0a 9f cc  e6 0a 4d ac  37 d6 e4 7e   ..Q{D.....M.7..~
0080:   f0 90 1d de  6e f1 2c 71  a6 40 2f d3  81 ba 01 00   ....n.,q.@/.....
0090:   34 00 00 00  7a 31 a8 b1  e7 1c a7 cc  01 00 00 00   4...z1..........
00a0:   a1 e9 d8 35  ba a5 e5 55  4b 43 5e ca  e2 31 cd 34   ...5...UKC^..1.4
00b0:   d9 fe 88 0f  60 a4 d0 7f  50 2b bd 7e  9a ae 48 2c   ....`...P+.~..H,
00c0:   01 00 00 00  00 00 00 00  01 00 00 00  01 00 00 00   ................
00d0:   00 00 00 00  00 00 00 00  00 00 00 00  00            .............

vs

$ xxd txdata
00000000: bea7 cbc4 49fd 5a54 c921 d631 98fb ecb0  ....I.ZT.!.1....
00000010: f772 6970 8a07 a772 0c39 554c 6e46 d94d  .rip...r.9ULnF.M
00000020: 56d6 defb 407e 2ba7 0ba8 ab98 dad5 685c  V...@~+.......h\
00000030: 414f 15db d8b8 4bab 9ef4 bfa4 7178 377e  AO....K.....qx7~
00000040: 0200 0000 bea7 cbc4 49fd 5a54 c921 d631  ........I.ZT.!.1
00000050: 98fb ecb0 f772 6970 8a07 a772 0c39 554c  .....rip...r.9UL
00000060: 6e46 d94d 0001 d34a e392 517b 440a 9fcc  nF.M...J..Q{D...
00000070: e60a 4dac 37d6 e47e f090 1dde 6ef1 2c71  ..M.7..~....n.,q
00000080: a640 2fd3 81ba 0100 3400 0000 7a31 a8b1  .@/.....4...z1..
00000090: e71c a7cc 0100 0000 a1e9 d835 baa5 e555  ...........5...U
000000a0: 4b43 5eca e231 cd34 d9fe 880f 60a4 d07f  KC^..1.4....`...
000000b0: 502b bd7e 9aae 482c 0100 0000 0000 0000  P+.~..H,........
000000c0: 0200 0000 0000 0000 0000 000b 18ae 81cb  ................
000000d0: 75f2 17                                  u..

What's also puzzling is that the suffix is different. The solana account data is 10 bytes longer, but that's because of the magic number you commented about. But the suffix being different is weird. This could be because I’m re-creating the transaction by hand, it is not using the same logic that the on-chain program uses. I’ll dive deeper ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so two differences:

  • I appended the discriminator, but it's prepended in the actual account.
  • The last part stores the signer info and owner set sequence number, which I set to false in my dummy transaction, but when proposing a transaction, it is signed automatically by the proposer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m still puzzled ... without the +10, some transactions can be proposed, but in the test, the third transaction it proposes the fails to be proposed with

Transaction simulation failed: Error processing Instruction 0: custom program error: 0x69", data: SendTransactionPreflightFailure(RpcSimulateTransactionResult { err: Some(InstructionError(0, Custom(105))), logs: Some(["Program 6qz6Wg6cxN3hWhyRiNoSWcFCaLi4A6kw4VEYgnJHNBBj invoke [1]", "Program log: Custom program error: 0x69", "Program 6qz6Wg6cxN3hWhyRiNoSWcFCaLi4A6kw4VEYgnJHNBBj consumed 11118 of 200000 compute units", "Program 6qz6Wg6cxN3hWhyRiNoSWcFCaLi4A6kw4VEYgnJHNBBj failed: custom program error: 0x69"

So it is the multisig program that refuses to write to the transaction account, I suppose because it is too small. When I add the +10 and look at the difference between the dummy transaction and the actual data in the account:

--- dummy_tx_serialized
+++ actual_tx_account_data
@@ -8,9 +8,9 @@
 00000070: 2aaa 0d76 20a4 aadc a4a4 2378 1e78 e0d5  *..v .....#x.x..
 00000080: a331 a62f 6351 9ee4 18e3 3e85 71fc 0100  .1./cQ....>.q...
 00000090: 5400 0000 7a31 a8b1 e71c a7cc 0200 0000  T...z1..........
 000000a0: d974 0d16 ba54 26a2 e13d e9eb c703 a79a  .t...T&..=......
 000000b0: 96bc e163 3b5e 1fab 576d 3465 6ec4 f4b7  ...c;^..Wm4en...
 000000c0: 0eba a3b3 a6e8 2461 576c 6ffa 6fe8 d181  ......$aWlo.o...
 000000d0: 1ce4 4e2d 89a0 f7e4 81d1 840c 8605 7acf  ..N-..........z.
-000000e0: 0200 0000 0000 0000 0200 0000 0101 0000  ................
-000000f0: 0000 00                                  ...
+000000e0: 0200 0000 0000 0000 0300 0000 0100 0000  ................
+000000f0: 0000 0000 0000 0000 0000 0000 00         .............

it’s a few differences, and that that really just additional zeros that make it work 😕

Could it be that this Borsh serialization is not fixed-size? I’ll try to make the dummy transaction match the real one exactly. I’ll also ask on the Anchor Discord if there’s a better way to know the size of an account.

Copy link
Contributor Author

@ruuda ruuda May 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered the problem! In the dummy transaction I was building to measure the size, I used the wrong signers. The "signers" of the multisig transaction are the owners of the multisig, not the accounts referenced by the inner instruction. If the inner instruction referenced fewer accounts than multisig owners, the account would be too small. Fixed in fe54a98. The magic number is now gone.

ruuda added 12 commits May 21, 2021 10:03
Also, ensure that CI fetches the newly added submodule.
The test is failing in the 'solana transfer' command on CI, but I can't
tell why. Presumably the command output contained more information,
let's print that.
After moving where the multisig program is, we need to change this.
Previously this did not work, because the submodule already contains a
workspace. But I added a commit to that repository to remove it, so now
we are able to embed this in the outer repository. This ensures that
they use the same dependency versions.
I am not sure why the output changed, after moving the program from its
own repository into the Solido one, but it did, so we need to change the
tests to match.
I changed that to size the account to the needs of the transaction
previously, but never tested it after that. Apparently the computed size
is not enough, the account needs to be larger. I don't know by how much
exactly, but adding 10 bytes is enough to make the tests pass again.
We want the binary, not the library / BPF binary.
By merging the "solido" and "multisig" commands, some flags moved or
were renamed; this fixes that.
If there is no error, I want the output to be clean.
Turns out the +10 on the size was not needed, the reason for the failure
was that I was passing in a wrongly sized list of signers. The "signers"
of the multisig transaction are the multisig owners, not the signers of
the inner inrstuction! If the inner instruction referenced more accounts
than owners, the size would work out, but the other way around, it
failed.
@ruuda ruuda merged commit c160ef7 into main May 21, 2021
@ruuda ruuda deleted the add-multisig-tests branch May 21, 2021 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants