Cache `compiledMessage` when a transaction has signatures #2377
Even when I was building this the first time, I considered this just as a tradeoff of memory for performance. There's something that made me uncomfortable about having multiple copies of ‘the state’ lying around, though, begging to go out of sync. This PR description describes what this fixes. Can you list some of the things that could go wrong? |
I guess there are probably two main categories: the cache getting out of sync, or the cached data not being used when it should be. The cache could get out of sync if we modify the transaction but don't remove the There are also cases where we manually create a I think the second case is less well handled by this PR, because we have two cases where we check Generally I think this is less risky than most cache/state issues, because we know it invalidates simultaneously with signatures and we already have well tested and isolated code for invalidating signatures.
I'm literally 50/50 on that one. The performance boost of this PR is not negligible. Even just by looking at the Signer API, this means having 500 On the other side, I am also nervous about introducing a duplicated state of the same data that has the potential to un-sync. No matter how much due diligence we apply in our exported functions, some popular third-party library could accept a transaction object and return a new one with an un-synchronised state and there's not much we can do about it. My biggest concern is: are we introducing an attack vector? Say, an application or wallet uses the transaction data to display information to the UI but then uses the compiled message to send said transaction without checking that they are the same. Suddenly we (may?) allow a malicious actor to compose a transaction object that does something different than displayed by the UI. That being said, this is not an issue with the wallet standard as transactions are being passed as bytes and therefore must be reconstructed on the wallet-side. We may need to think a bit deeper about that one before merging too quickly.
I've been noodling on a bunch of concrete ‘ways this could go wrong’ but have yet to get a chance to write them up. Give me a bit to write them out, and propose something slightly different. |
@lorisleiva Definitely a good point on the attack vector. I think this is a tricky limitation of using simple objects for data to overcome, because there's really no way we can stop a third party library from de-syncing these. They're just fields on an object. I think currently a malicious third party could return a There are a few things we could do to mitigate this in our code, with the usual caveat that it'll be easy for someone to write their own version that behaves differently:
Thanks for this PR @mcintyre94! Of all the other reasons to do something like this (eg. performance) let me rewind way back to the original problem we need to solve: the sort order of accounts (ie. correctness). I'll start by doing a bunch of talking out loud.
Option A (aspirational) – Specify the order
The sort order of accounts should be specified and the runtime should enforce it. The fact that you can tweak the order in which you declare accounts to produce two different messages that represent the identical set of instructions thwarts the ‘one transaction per recent blockhash’ that the runtime otherwise tries to enforce through deduplication; you can land as many identical transactions per blockhash as you can generate permutations in the account order. In some ways you can already do this by reordering instructions, or by adding a memo instruction with a unique message, so maybe this doesn't matter. In any case, enforcing a deterministic sort in the runtime would have ended this discussion before it began, allowing multiple systems and signers to share transactions more easily.
Option B – Make
I considered option C, but I don't think it's sufficient - it looks like the order of I like option B a lot. I actually think it makes the API a lot clearer to have a much stronger distinction between a signed and unsigned transaction, in terms of what you can and can't do. The fact we have a bunch of places we strip signatures (albeit with a shared helper) if they're present points to that too. @lorisleiva I think this would work nicely for signers too? The transaction modifying signer would return One point is that we couldn't as easily check if a transaction is fully signed. We'd need to decompile the message header to know how many signers are expected, and we'd need to decompile further if we wanted to know what those signer addresses should be to verify that too. We could write a nicely optimised decoder for that though, the message bytes are very conveniently structured! Edit: Actually I see that you're using the For now I'm going to start working on option B and at least get a better understanding of the impact it would have. |
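
To illustrate the remark above that the message bytes are conveniently structured, here is a minimal sketch of reading the signer count straight from serialized message bytes. It assumes the standard Solana layout: an optional version prefix byte with the high bit set, followed by a three-byte header. `readMessageHeader` and `MessageHeader` are hypothetical names for illustration, not part of the library.

```typescript
// Hypothetical helper, not a library API: reads only the fixed-size header.
interface MessageHeader {
  version: 'legacy' | number;
  numRequiredSignatures: number;
  numReadonlySignedAccounts: number;
  numReadonlyUnsignedAccounts: number;
}

const VERSION_FLAG = 0x80; // high bit set on the first byte marks a versioned message

function readMessageHeader(messageBytes: Uint8Array): MessageHeader {
  if (messageBytes.length === 0) {
    throw new Error('Empty message');
  }
  const first = messageBytes[0];
  const isVersioned = (first & VERSION_FLAG) !== 0;
  // Versioned messages carry one prefix byte before the header; legacy messages do not.
  const offset = isVersioned ? 1 : 0;
  if (messageBytes.length < offset + 3) {
    throw new Error('Message too short to contain a header');
  }
  return {
    version: isVersioned ? first & 0x7f : 'legacy',
    numRequiredSignatures: messageBytes[offset],
    numReadonlySignedAccounts: messageBytes[offset + 1],
    numReadonlyUnsignedAccounts: messageBytes[offset + 2],
  };
}
```

Checking whether a transaction is fully signed would then only need to compare `numRequiredSignatures` against the number of signatures present; verifying which addresses signed would require decoding further into the account list.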
Whilst this is gonna change EVERYTHING, I am also a big fan of Option B! 🤩 @mcintyre94 Yes this is gonna change the Signer API a little but actually it's gonna make it closer to the Message Signers so we'll have an even more consistent API. @steveluscher When you say:
I think you mean option B, right? Because I do agree the proposed code snippet at the end of your last message makes a lot of sense for option B as it separates the concept of an "uncompiled transaction" i.e. a "transaction message" from the transaction itself. We could even consider publishing |
Okay cool I'm going to close this and start a stack to get option B going. Will try to cause as little damage as I can 😅 |
Because there has been no activity on this PR for 14 days since it was merged, it has been automatically locked. Please open a new issue if it requires a follow up. |
When we sign a transaction, we actually sign the serialized bytes of its compiled message. This compiled message format is not very strictly defined by the spec, for example the order of accounts with the same role is not defined. This means that the message can be compiled in multiple valid ways. However, the signature bytes will only ever be valid with the exact same message bytes.
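
The ambiguity can be shown with a toy model. The sketch below is not the real compiler: `compileAccounts`, `AccountMeta`, and the byte encoding are all hypothetical, and a real compiled message also contains a header, a blockhash, and instructions. It only demonstrates that two compilers which agree on role grouping but break ties differently emit different bytes for the same set of accounts.

```typescript
// Toy model only: every name here is an assumption made for illustration.
type Role = 'writableSigner' | 'readonlySigner' | 'writable' | 'readonly';

interface AccountMeta {
  address: string;
  role: Role;
}

const ROLE_ORDER: Role[] = ['writableSigner', 'readonlySigner', 'writable', 'readonly'];

// Groups accounts by role, then orders accounts within a role using `tieBreak`.
function compileAccounts(
  accounts: AccountMeta[],
  tieBreak: (a: string, b: string) => number,
): Uint8Array {
  const sorted = [...accounts].sort((a, b) => {
    const byRole = ROLE_ORDER.indexOf(a.role) - ROLE_ORDER.indexOf(b.role);
    return byRole !== 0 ? byRole : tieBreak(a.address, b.address);
  });
  const joined = sorted.map(account => account.address).join(',');
  return new Uint8Array(joined.split('').map(char => char.charCodeAt(0)));
}

function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((byte, i) => byte === b[i]);
}

const ascending = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);
const descending = (a: string, b: string): number => ascending(b, a);

const accounts: AccountMeta[] = [
  { address: 'payer', role: 'writableSigner' },
  { address: 'B', role: 'writable' },
  { address: 'A', role: 'writable' },
];

// Both outputs respect the role grouping, yet the bytes differ.
const ascendingBytes = compileAccounts(accounts, ascending);
const descendingBytes = compileAccounts(accounts, descending);
```

Because a signature covers the exact bytes, a signature produced over `ascendingBytes` cannot verify against `descendingBytes`, even though both describe the same accounts.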
This PR adds `compiledMessage` to `ITransactionWithSignatures`, so that whenever we have a signed transaction we cache its `compiledMessage`. This is used instead of recompiling the message for future signatures, and when the transaction is compiled. If we do something that invalidates the signatures, then in addition to removing the signatures we also remove this cached `compiledMessage`.
More importantly, it is set when we decode a serialized transaction if it has existing signatures. This cached `compiledMessage` will be valid for the existing signature, and will be used for any future signatures and whenever we compile or serialize the transaction. This means that the existing signature will remain valid.
This allows us to deserialize, sign and reserialize a transaction created by code that compiles messages differently, eg. the legacy web3js library, without invalidating its signatures.
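
The caching behaviour described above might be sketched roughly as follows. This is a simplified illustration, not the PR's code: `SketchTransaction`, `getMessageBytes`, `withSignature`, `appendInstruction`, and the string-based instructions are all hypothetical stand-ins. It shows the two rules: reuse the cached bytes whenever they exist, and drop them together with the signatures on any modification.

```typescript
// Simplified illustration; all names and shapes here are assumptions.
type Bytes = Uint8Array;

interface SketchTransaction {
  readonly instructions: readonly string[];
  readonly signatures?: Readonly<Record<string, Bytes>>;
  readonly compiledMessage?: Bytes; // cached only while signatures exist
}

let compileCalls = 0; // counts recompilations so the cache's effect is observable

function compileMessage(tx: SketchTransaction): Bytes {
  compileCalls++;
  const joined = tx.instructions.join('|');
  return new Uint8Array(joined.split('').map(char => char.charCodeAt(0)));
}

// Prefer the cached bytes; compile only when nothing has been cached.
function getMessageBytes(tx: SketchTransaction): Bytes {
  return tx.compiledMessage ?? compileMessage(tx);
}

// Signing stores the exact bytes that were signed alongside the signature.
function withSignature(
  tx: SketchTransaction,
  signer: string,
  sign: (message: Bytes) => Bytes,
): SketchTransaction {
  const compiledMessage = getMessageBytes(tx);
  return {
    ...tx,
    compiledMessage,
    signatures: { ...tx.signatures, [signer]: sign(compiledMessage) },
  };
}

// Any modification drops the signatures and the cached bytes together.
function appendInstruction(tx: SketchTransaction, instruction: string): SketchTransaction {
  return { instructions: [...tx.instructions, instruction] };
}
```

In this sketch, a decoded transaction that arrives with signatures and foreign message bytes keeps those bytes through `withSignature`, so earlier signatures stay valid, while `appendInstruction` returns an object with neither field set.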
Fixes #2362