Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache compiledMessage when a transaction has signatures #2377

Closed
wants to merge 5 commits into from

Conversation

mcintyre94
Copy link
Collaborator

When we sign a transaction, we actually sign the serialized bytes of its compiled message. This compiled message format is not very strictly defined by the spec, for example the order of accounts with the same role is not defined. This means that the message can be compiled in multiple valid ways. However, the signature bytes will only ever be valid with the exact same message bytes.

This PR adds compiledMessage to ITransactionWithSignatures, so that whenever we have a signed transaction we cache its compiledMessage. This is used instead instead of recompiling the message for future signatures, and when the transaction is compiled. If we do something that invalidates the signatures, then in addition to removing the signatures we also remove this cached compiledMessage.

More importantly, it is set when we decode a serialized transaction if it has existing signatures. This cached compiledMessage will be valid for the existing signature, and will be used for any future signatures/when we compile/serialize the transaction. This means that the existing signature will remain valid.

This allows us to deserialize, sign and reserialize a transaction created by code that compiles messages differently, eg. the legacy web3js library, without invalidating its signatures.

Fixes #2362

Copy link

changeset-bot bot commented Mar 25, 2024

⚠️ No Changeset found

Latest commit: 2d42f48

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Collaborator

steveluscher commented Mar 25, 2024

Even when I was building this the first time, I considered this just as a tradeoff of memory for performance. There's something that made me uncomfortable about having multiple copies of ‘the state’ lying around, though, begging to go out of sync.

This PR description describes what this fixes. Can you list some of the things that could go wrong?

@mcintyre94
Copy link
Collaborator Author

I guess there are probably two main categories: the cache getting out of sync, or the cached data not being used when it should

The cache could get out of sync if we modify the transaction but don't remove the compiledMessage. Then we'd re-use that incorrect compiledMessage next time we sign. I think the risk of this is quite low because we have a single shared getUnsignedTransaction which now removes compiledMessage as well as the signatures. We already use this for a bunch of functions that modify the transaction, and I don't think we'd be likely to write a function that doesn't. In short, anything that invalidates compiledMessage also invalidates the signatures, and we have a single place where we invalidate both which we've already used consistently.

There's also cases where we manually create a Transaction object and add signatures without using our helper functions , like the code in compat for converting a legacy transaction, or when we deserialize a transaction, or a bunch of tests. I think this is probably the most risky because it's easy to add signatures but not add compiledMessage and the type-checker doesn't help because generally these routes can return with or without signatures. You also won't know about the bug until you end up with an invalid signature. We could create a helper function that sets signatures + compiled message and use that for these cases to make it easier/less error prone.

I think the second case is less well handled by this PR, because we have two cases where we check if('compiledMessage' in transaction) and only compile the message if we need to. If we didn't add that check somewhere and compile the message instead, we could end up ignoring the previous message and compiling a new message that isn't valid for the existing signatures. I could instead modify compileMessage to check for an existing one and return it, and then it'd be transparent to callers and we'd be less likely to make that mistake.

Generally I think this is less risky than most cache/state issues, because we know it invalidates simultaneously with signatures and we already have well tested and isolated code for invalidating signatures.

@lorisleiva
Copy link
Collaborator

I'm literally 50/50 on that one.

The performance boost of this PR is not negligible. Even just by looking at the Signer API, this means having 500 KeyPairSigners applied to the same transaction creates just a single compilation of that transaction. We could fix that locally using WeakMaps and whatnot but this is an elegant lower-level solution that fixes this sort of issues everywhere.

On the other side, I am also nervous about introducing a duplicated state of the same data that as the potential to un-sync. No matter how much due diligence we apply in our exported functions, some popular third-party library could accept a transaction object and return a new one with an un-synchronised state and there's not much we can do about it.

My biggest concern is: are we introducing an attack vector? Say, an application or wallet uses the transaction data to display information to the UI but then uses the compiled message to send said transaction without checking that they are the same. Suddenly we (may?) allow a malicious actor to compose a transaction object that does something different than displayed by the UI. That being said, this is not an issue with the wallet standard as transactions are being passed as bytes and therefore must be reconstructed on the wallet-side.

We may need to think a bit deeper about that one before merging to quickly.

@steveluscher
Copy link
Collaborator

I've been noodling on a bunch of concrete ‘ways this could go wrong’ but have yet to get a chance to write them up. Give me a bit to write them out, and propose something slightly different.

@mcintyre94
Copy link
Collaborator Author

mcintyre94 commented Mar 26, 2024

@lorisleiva Definitely a good point on the attack vector. I think this is a tricky limitation of using simple objects for data to overcome, because there's really no way we can stop a third party library from de-syncing these. They're just fields on an object. I think currently a malicious third party could return a Transaction object with compiledMessage set and no signatures. It would need to return the object and not serialised bytes, but that's a lot easier to do with the new web3js. We would then use that compiledMessage to sign for the user and send the transaction.

There are a few things we could do to mitigate this in our code, with the usual caveat that it'll be easy for someone to write their own version that behaves differently:

  • We could improve the check of whether to use compiledMessage to only do so if there are existing signatures that we expect to keep valid. It's part of the ITransactionWithSignatures type, but of course you can just add compiledMessage to any Transaction object. It's only dangerous to have a mis-match if there are no signatures that become invalid.
  • We could write a helper function that validates the cached value semantically matches our calculated one. This would remove the performance improvement you highlighted if we did it on every sign call.

@steveluscher
Copy link
Collaborator

Thanks for this PR @mcintyre94! Of all the other reasons to do something like this (eg. performance) let me rewind way back to the original problem we need to solve: the sort order of accounts (ie. correctness).

I'll start by doing a bunch of talking out loud.

Option A (aspirational) – Specify the order

The sort order of accounts should be specified and the runtime should enforce it.

The fact that you can tweak the order in which you declare accounts to produce two different messages that represent the identical set of instructions thwarts the ‘one transaction per recent blockhash’ that the runtime otherwise tries to enforce through deduplication; you can land as many identical transactions per blockhash as you can generate permutations in the account order. In some ways you can already do this by reordering instructions, or by adding a memo instruction with a unique message, so maybe this doesn't matter.

In any case, enforcing a deterministic sort in the runtime would have ended this discussion before it began, allowing multiple systems and signers to share transactions more easily.

Option B – Make ITransactionWithSignatures and IFullySignedTransaction Just Bytes™

Right now, these types represent a transaction data structure that has a signatures map, and optionally one on which it's been asserted that the signatures map has an entry for all of a transaction's signers. We could instead make these types more opaque.

type TransactionMessageBytes = Uint8Array & { readonly __brand: unique symbol };

interface ITransactionWithSignatures<OrderedSignerAddresses extends Address[]> {
  messageBytes: TransactionMessageBytes,  // All the ‘source data’ from the transaction serialized and thrown out
  signatures: OrderedMap<OrderedSignerAddresses, SignatureBytes | undefined>,
}

This type would be a permitted input and de-facto output of signTransaction() and partiallySignTransaction(), would be compatible with sendTransaction(), and its existence would imply CompilableTransaction (ie. the message contains a fee payer and a lifetime constraint).

Any consumer that needs to reason about the contents of the message has to decode it. This feels closer to most people's use cases for signing an existing transaction; ‘is this what I expect it to be? OK, I'll sign the bytes.’

Notably this means that you will no longer have a signatures map attached to any data structure that looks like a transaction. BaseTransaction and ITransactionWithFeePayer et all have no idea what a signature is, and gone would be the whole concept of ‘unsetting’ the signatures when you make a change (eg. adding an instruction, changing the lifetime).

Option C – Represent the account order as data

Sort of like this PR, except instead of attaching the entire compiled message to the transaction, attach only the address order.

type BaseTransaction<
    TVersion extends TransactionVersion = TransactionVersion,
    TInstruction extends IInstruction = IInstruction,
    TExplicitAccountOrdering extends Address[] = Address[]
> = Readonly<{
    explicitAccountOrdering?: readonly TExplicitAccountOrdering,
    instructions: readonly TInstruction[];
    version: TVersion;
}>;

If there's no order, we do whatever we want. If there is then we indexOf each address to break ties instead of using the addressComparator(). If one of those orders violates some strict ordering rule (like the order of the fee payer, signers, and writables), or if one is missing, then maybe we throw. Otherwise we just do what it says.

As in this PR, we would erase the explicit ordering during any operation that would normally invalidate signatures.

This data structure would still be serializable (eg. to JSON) and would still demand recompilation of the message bytes from the current state of the object every time a signature needs to be made or verified. You could still infer facts about the transaction from the transaction data structure itself without worry that the APIs will vend you message bytes that represent something else.

Option D – Cache the IR between the *Transaction types and MessageBytes

This PR.


I'm wondering if Option C is the radical change we need. This might imply that ‘transactions’ as we know them would disappear from the API. The API would frankly be more accurately written:

const transactionMessage = pipe(
  createTransactionMessage({ version: 0 }),
  m => setTransactionMessageFeePayer(payerAddress, m),
  m => setTransactionMessageLifetimeUsingBlockhash(latestBlockhash, m),
  m => appendTransactionMessageInstruction(/* ... */, m),
);
const signedTransaction = signTransactionMessage([keyPair], transactionMessage); // { messageBytes, signatures }

That's literally closer to what's actually happening. We don't call it getFeeForTransaction, we call it getFeeForMessage.

@mcintyre94
Copy link
Collaborator Author

mcintyre94 commented Mar 27, 2024

I considered option C, but I don't think it's sufficient - it looks like the order of AddressTableLookup[] can also be varied and would also need to be stored (or added into the account ordering without actually being an account). At that point it felt less hacky to just compile the message.

I like option B a lot. I actually think it makes the API a lot clearer to have a much stronger distinction between a signed and unsigned transaction, in terms of what you can and can't do. The fact we have a bunch of places we strip signatures (albeit with a shared helper) if they're present points to that too. @lorisleiva I think this would work nicely for signers too? The transaction modifying signer would return {messageBytes, [signature]} and then the non-modifying signers would all just sign those messageBytes and add signatures. I'm not super familiar with that API though so would definitely be good to hear your thoughts especially from that perspective!

One point is that we couldn't as easily check if a transaction is fully signed. We'd need to decompile the message header to know how many signers are expected, and we'd need to decompile further if we wanted to know what those signer addresses should be to verify that too. We could write a nicely optimised decoder for that though, the message bytes are very conveniently structured!

Edit: Actually I see that you're using the OrderedMap to store addresses with missing signatures too, and we'd know all the expected signers when we compile the message. Clever!

For now I'm going to start working on option B and at least get a better understanding of the impact it would have.

@lorisleiva
Copy link
Collaborator

Whilst this is gonna change EVERYTHING, I am also a big fan of Option B! 🤩


@mcintyre94 Yes this is gonna change the Signer API a little but actually it's gonna make it closer to the Message Signers so we'll have an even more consistent API.


@steveluscher When you say:

I'm wondering if Option C is the radical change we need.

I think you mean option B, right? Because I do agree the proposed code snippet at the end of your last message makes a lot of sense for option B as it separates the concept of an "uncompiled transaction" i.e. a "transaction message" from the transaction itself.

We could even consider publishing @solana/transaction-messages for all the pre-signing part and @solana/transactions for all the types and helpers that require the new compiled transaction type with ordered signatures.

@mcintyre94
Copy link
Collaborator Author

Okay cool I'm going to close this and start a stack to get option B going. Will try to cause as little damage as I can 😅

Copy link
Contributor

Because there has been no activity on this PR for 14 days since it was merged, it has been automatically locked. Please open a new issue if it requires a follow up.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 11, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[experimental] Decoding and then re-compiling a transaction invalidates existing signatures
3 participants