New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PLT-7583 Validator optimizations #12
Conversation
Script hashes all match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. The diffs where very useful too.
@paluh and I reviewed them, and we have some comments. One he will post himself.
We are not sure whether it is possible to keep makeIsDataIndexed
with the asData
, but if it is possible it would be desirable. So that is why I am asking for changes.
In addition to that, I've noticed that ScriptTypes.hs
corresponds to Script/Types.hs
, not sure if it would be better to have the same naming in both to make it easier to compare.
In the same line, we could order things in marlowe-cardano
in the same way than in marlowe-plutus
. But that would be a different PR, ofc.
| Assert Observation Contract | ||
deriving stock (Generic, Data) | ||
deriving newtype (ToData, FromData, UnsafeFromData, Haskell.Eq, Haskell.Ord, Haskell.Show) | ||
|] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot find a makeIsDataIndexed
for Contract
type. Should there be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for Action
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using asData
makes makeIsDataIndexed
redundant because the patterns include that information. I'll double-check the TH source code for makeIsDataIndexed
, just to be 100% sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the same line, we could order things in marlowe-cardano in the same way than in marlowe-plutus.
If you're referring to the ordering in the semantics files, we'd have to change those in marlowe-cardano
because the order in marlowe-plutus
is (unfortunately) constrained by the way GHC handles TH: i.e., identifiers don't resolve during compilation of the module unless the ordering declares them in a particular sequence.
I think we should reorder and clean up the marlowe
module in marlowe-cardano
, maybe when we upgrade to Conway near the start of PI5. Typeclass instances etc. are scattered about the file in a not very logical order. @jhbertra has some ideas for improvements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, if it is not possible to be explicit in marlowe-plutus
then it makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bwbush isn't makeIsDataIndexed
replaced in the TH block by deriving stock (Data,...)
which is "constructor order dependent" derivation of the encoding instance:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, makeIsDataIndexed
is replaced by asData
, whose splice expands to explicitly numbering the constructors in their order of appearance. The deriving stock Data
refers to the type Data.Data (Data)
in the base library and doesn't particularly relate to Plutus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, if it is not possible to be explicit in marlowe-plutus then it makes sense
I've added a comment documenting this in 392246d.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bwbush, we had an interesting discussion with @palas that seems pretty important and is related to the semantic changes after migration to asData
. I'm not sure if we understand asData
so please correct me if I'm wrong below.
Our understanding of asData
semantics:
-
asData
makes the decoding of theContract
lazy (without memoization) - we decodeBuiltinData
on demand when we pattern match on a value. -
If we consider a decoding failure, it will happen lazily as well.
-
If the above is true then we can probably say that every value of type
A
which was previously represented as strict data type (so the set of values did not contain⊥
, right?) in the context ofasData
is turned intoA ∪ { ⊥ }
. -
If the above is true then this is probably a problem from formalization point of view because we have to change the spec (semantics would work over
Contract ∪ { ⊥ }
instead ofContract
). -
What is also important is that we can show using a specific example that lazy validation which triggers exceptions can change semantics of the Marlowe program (example follows).
Change of the Marlowe Semantics - example:
-
Let's analyze specific example which causes violation of existing termination guarantees.
-
Let's compare evaluation of our specific example to the evaluation of the same but merkleized contract with strict data representation. Merkleization encodes laziness in a bit different manner but can probably be approximated as
Case (Contract ∪ { ⊥ })
meaning if the hash is invalid (or lost) then decoding of a specific branch can "stale forever". -
The difference with merkleization is that the timeout branch of
When
is always revealed in a merkleized contract. The timeout branch itself can contain a contract which may include aWhen
contract with a timeout which has to be revealed as well, and so on. Therefore, the timeout branch must contain a fully decoded "timeout based" path to aClose
. -
In the case of
asData
used overContract
, there's a possibility that a malicious actor forges a datum on the chain and puts an invalid piece of data in the timeout branch which actually has to containContract
value. Decoding of this timeout continuation will trigger runtime exception. -
The difference really boils down to
Contract ∪ { ⊥ }
vsCase (Contract ∪ { ⊥ })
in this case. -
Let's consider a hypothetical contract decoded lazily using
asData
:When [ Case (Deposit ...) (When [] timeout INVALID_DATA)] timeout Close
- If the user performs the
Deposit
the semantics won't enforce decoding of theINVALID_DATA
in the same step (we don't have to pattern match on this case when timeout is not reached). - If that is true then we end up in a continuation which locks the funds.
-
If we consider the same but strictly decoded merkleized contract:
When [ Case (Deposit ...) hashOfTheContinuatation] timeout Close
- When we are applying input we have to provide the next part of the contract corresponding to the
hashOfTheContinuation
fully decoded. - So in this case we actually have to decode contract
When [] timeout INVALID_DATA
together withINVALID_DATA
. - So if the
INVALID_DATA
doesn't encodeContract
and is part of the initial hash then the input application will not happen because we cannot provide a contract which is correctly decoded and is corresponding to the hash value. - So merkleization doesn't cause locking in this particular case because we it is impossible to perform the
Deposit
.
Discussion:
-
Is the above understanding of
asData
correct? -
If that is true shouldn't migration to
asData
be considered as a breaking change and enforce changes in formalization of the semantics? -
We can say that the above attack vector is unrealistic because datum decoding and inspection of the contract is a natural step which every user or DApp should perform before applying any input. So is this a really harmful change in the semantics? Should we allow this new semantics on the chain?
-
Should we perhaps apply
asData
only to theCase
data type so it preserves similar timeout branch semantics to the merkleized contract? -
If we do this (apply to the
Case
data type only), maybe we can just usedata Contract = ... | When [Case BuiltinData] Timeout Contract | ...
(which somewhat mimics merkleized case):- It requires full decoding of the timeout branch (all the way down to the
Close
) - It explicitly says which parts of the contract decoding can crash in the evaluation loop of the semantics at runtime (with
asData
we have the same problem but encoded implicitly I think).
- It requires full decoding of the timeout branch (all the way down to the
CC: @jhbertra
The above might not be correct because the Marlowe-Cardano specification requires that that output contract from |
…as used. In response to review comment #12 (comment).
I started wondering - maybe |
Here is evidence that corrupt data is only detected later in the contracts's lifecycle. |
1. Removed `PlutusTx.asData` for `Contract`. 2. Added `PlutusTx.asData` for `Cases`, with cabal flag default `False`. 3. Added cabal flag for `PlutusTx.asData` for `Action`, with default `True`.
Fixed in c0b42fc. |
@palas, @paluh, @ramsay-t in 69decf6 I removed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me - we are retaining the option to turn on the faster asData stuff, and the risk seems no worse than with bad Merkelisation. We can revisit this in the New Year if we look at some sort of pre-checking for Marlowe contracts.
Cool. Thanks! |
See https://nbviewer.org/gist/bwbush/eb372f077a444021e9ede95699e442b5/PLT-9055.ipynb for test results. |
PlutusTx.asData
forState
.PlutusTx.asData
forAction
.Value.geq
, but with fallback.marlowe-cardano
repository, but tests still rely on that (for consistency).Note to reviewers
Here is an efficient way to compare the validator files where were renamed or relocated: