CIP-0046? | Merkelised Plutus Scripts #385

Open
wants to merge 3 commits into base: master

Conversation

@L-as (Contributor) commented Nov 29, 2022

Currently, the hash of a script is simply the hash of its serialisation.
This CIP proposes changing this such that the hash of a script (term)
is a function of its immediate children's hashes, forming a Merkle tree from the AST.
This allows one to verify a script's hash shallowly, which is useful on Cardano
because it allows a script to check that a given script hash is an instantiation of a parameterised script.

In addition, a blake2b_224 built-in function must be added.

This is inspired by BIP-144,
but the motivations are very different.

Rendered
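For illustration, a rough sketch of the idea in Haskell (the toy `Term` type, the prefix byte values, and the placeholder hash below are assumptions for this sketch, not the constructors and prefixes specified in the CIP):

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS

-- Toy AST; the real proposal covers all constructors of the UPLC Term type.
data Term
  = Var Int
  | LamAbs Term
  | Apply Term Term
  | Constant ByteString
  -- remaining constructors elided

-- Stand-in for the actual hash function (script hashes on Cardano are blake2b_224,
-- which is why the CIP also asks for a blake2b_224 builtin).
blake2b_224 :: ByteString -> ByteString
blake2b_224 = undefined

-- Each node hashes a one-byte constructor prefix followed by the hashes
-- (or encodings) of its children, so the script hash is the Merkle root of the AST.
merkleHash :: Term -> ByteString
merkleHash (Var i)      = blake2b_224 (BS.cons 0 (encodeIndex i))
merkleHash (LamAbs t)   = blake2b_224 (BS.cons 1 (merkleHash t))
merkleHash (Apply f x)  = blake2b_224 (BS.cons 2 (merkleHash f <> merkleHash x))
merkleHash (Constant c) = blake2b_224 (BS.cons 3 c)

encodeIndex :: Int -> ByteString
encodeIndex = BS.singleton . fromIntegral  -- placeholder; assumes the index fits in one byte
```

With such a scheme, showing that a hash corresponds to `Apply f x` requires only the two child hashes and one hash call, which is what makes the on-chain instantiation check cheap.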

@L-as L-as changed the title Merkelised Plutus Scripts CIP-???? | Merkelised Plutus Scripts Nov 29, 2022
@zygomeb (Contributor) commented Nov 30, 2022

This proposal is a low-cost way to let us design protocols that autonomously make use of staking validators. For that alone it is worth pushing forward, since staking validators are currently not very useful apart from one fancy technique.

I'm also thinking it may find other uses, but I'll sit on it for a while.

@micahkendall

I've withdrawn my CIP in favour of this.

### Relation with BIP-144

BIP144 uses this trick to avoid submitting the parts of the script that aren't used.
Given that reference scripts are common in Haskell, this isn't a big win for efficiency,
Contributor:

> Given that reference scripts are common in Haskell

Not sure what this means; do you just mean "Given that Cardano supports reference scripts"?

Contributor Author:

brainfart

BIP144 uses this trick to avoid submitting the parts of the script that aren't used.
Given that reference scripts are common in Haskell, this isn't a big win for efficiency,
but it might be worth implementing for the sake of scripts used only once.
This CIP however doesn't require that that be implemented.
Contributor:

We looked into MAST during the development of Plutus Core, but we concluded that it wasn't worth it because the size of the hashes corresponding to omitted subtrees cancelled out the saving from omitting the subtree. You can read some notes on it here: https://github.com/input-output-hk/plutus/tree/master/doc/notes/plutus-core/Merklisation

Contributor:

Yes, we tried a very similar Merklisation scheme, but for different reasons. We were looking at ways to reduce script sizes and the idea of using Merklisation to let us omit unused parts of the AST in fully applied validators seemed promising. It turned out that that involved replacing subtrees of the AST with hashes which were large (32 bytes) and incompressible, and that meant that we couldn't get any worthwhile size reductions, so we abandoned that idea. However that was for an entirely different purpose, so I don't think it's too relevant here.

However, it is arguably not the optimal solution due to the reference
script problem described above. Even if the reference script problem
is solved as described above, it seems logical to allow supplying a datum
to a staking validator, or somehow combining the payment address and staking address for scripts,
Contributor:

The problem with supplying a datum to anything is where does the datum live? For a validator script it lives on the output. Where could it live for a staking validator? If we can come up with a sane answer to that, then in principle we could just give staking validators datums.

### Staking

This makes staking validators much more powerful, since a single protocol can
now manage many rewards accounts (by instantiating the script with a numeric identifier).
Contributor:

Please can you write out this use case in more detail. You've alluded to it a few times but I'd really like to see more detail because I'm not familiar with it and I'm trying to back-infer the actual details, probably wrongly. And it seems to be the load-bearing example here.

Contributor Author:

Will do

This CIP however doesn't require that that be implemented.

The argument for privacy doesn't apply, private smart contracts can be achieved through
the use of non-interactive zero-knowledge probabilistic proofs.
Contributor:

Not today they can't. So I think it is still quite relevant.

Contributor Author:

Wdym? They definitely can once we have at least bitwise primitives.

### Reference scripts

Currently, different instances of the same script will need their own reference inputs
since their hashes are different. It seems feasible to allow sharing of a single reference script,
Contributor:

... or they can put them in the datum?

- A script address + datum can't fit in an address,
if you want that you also need this (or need to change what an address is).

## Specification
@michaelpj (Contributor) Nov 30, 2022:

This would also need changes to the ledger spec. At the moment, the ledger doesn't deserialise Plutus scripts at all, it passes them to the evaluator still serialised, and despite this it can still hash them etc, straightforwardly. This CIP would probably require changing that in the spec and the implementation, so that the ledger has deserialized scripts around (one reason for this is that deserialization can fail, whereas hashing cannot). It might be good to have at least a sketch of those changes here.

I also don't know whether it violates any principles of the ledger to not have the hash of an item be the hash of its serialised bytes. I think that's true for everything else, it's possible that there's a reason for that (e.g. making it possible to check hashes without having to know the serialization).

currently has 8 constructors. On-chain, annotations are always the unit type,
and are hence ignored for this algorithm. Each case/constructor is generally handled by
hashing the concatenation of a prefix (single byte corresponding to the
constructor index) along with the hashes of the arguments passed to the constructor.
Contributor:

This is slightly different to what @kwxm wrote here (https://github.com/input-output-hk/plutus/blob/master/doc/notes/plutus-core/Merklisation/Merklisation-notes.md#modified-merklisation-technique), which I think also included the serialized versions of the nodes in the value that gets hashed. Not sure if that's important; Kenneth, do you remember?

Contributor:

> which I think also included the serialized versions of the nodes in the value that gets hashed. Not sure if that's important

I'm not quite sure what you mean. It talks about "[serialising] all of the contents of the node into bytestrings", but I think by "contents" I meant all of the fields (things like variable names) except subnodes: you wouldn't serialise those and calculate hashes, but instead recursively apply the Merkle hash process. I think the overall process is basically similar to what's going on here.


In pseudocode: `hash $ prefix <> blake2b_256 (serialiseData d)`

## Rationale
Contributor:

We might need some discussion of the cost of this kind of hashing. Our experiments suggested it was ~10x more expensive (https://github.com/input-output-hk/plutus/blob/master/doc/notes/plutus-core/Merklisation/Merklisation-notes.md#the-cost-of-calculating-merkle-hashes); it's unclear whether this will have a meaningful impact, but it might.

@kwxm (Contributor) Dec 8, 2022:

I think that the potential cost of this is my main concern. Calculating the hash involves traversing the entire AST (although as the document points out it can be fused with the deserialisation process), but also calling the underlying hash function(s) at every node, which could become expensive compared with just feeding the serialised script directly to a hashing function in one go. I'd really like to see some figures for this: it's conceivable that computing the Merkle hash might be more expensive than executing the actual scripts, and that might make this proposal impractical. The estimates from our earlier experiments (which were maybe three years ago) were entirely theoretical though, and things have changed a lot since we did that: for one thing, we're using flat instead of CBOR now, which makes the serialised scripts a lot smaller. I think some experiments would really be needed to decide whether the extra cost is a real issue.

it is not clear how much/less to Merkelise the hashing.
E.g., the hashing of data itself could be Merkelised. This is not done in this CIP.
The hashing of a `Data` constant could also prepend the prefix directly to the serialisation,
rather than to the hash of the `Data`. It is not clear what is best.
Contributor:

I think stopping the merkelization at the constants is the right place.

Hence, they have been included.
They use Merkle-tree hashing since that's the simplest and most useful in this case.

## Path to Active
Contributor:

I think this should have some Acceptance Criteria a la the new CIP-001. Perhaps:

  • The ledger specification is updated as necessary
  • The Plutus Core specification is updated with the new hashing scheme
  • Performance assessment has been performed
  • Necessary hashing builtins have been added to PLC and costed
  • Example use cases have been validated to run within an acceptable budget, considering the increased use of hashing builtins

Contributor Author:

This seems reasonable to me, but calculating a few hashes (see example pseudocode) is well within the budget last time I checked.

Contributor:

> calculating a few hashes (see example pseudocode) is well within the budget last time I checked.

Is that really true? If you're referring to the pseudocode here (under Rationale), then you need the hashes `original` and `script`, and I think those have to be calculated on the chain (or at least one of them does, no?), so there's a potentially large cost that has to be paid before you even get to that pseudocode.

Contributor Author:

`original` and `script` are constants here. `original` is a fixed script, so its hash can be computed beforehand and inlined into the script. `script` comes from the `ScriptContext`.
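
Concretely, a sketch of that check under the same placeholder prefixes as the illustration in the description (the function name, the prefix byte, and the stand-in `blake2b_224` are assumptions, not definitions from the CIP):

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS

-- Stand-in for the blake2b_224 builtin this CIP proposes to add.
blake2b_224 :: ByteString -> ByteString
blake2b_224 = undefined

-- `originalHash` is fixed and inlined into the checking script;
-- `scriptHash` is read from the ScriptContext at run time.
-- The prefix byte 2 for Apply is a placeholder value.
isInstantiationOf
  :: ByteString  -- hash of the parameterised script (the "original")
  -> ByteString  -- Merkle hash of the applied argument
  -> ByteString  -- script hash taken from the ScriptContext
  -> Bool
isInstantiationOf originalHash argHash scriptHash =
  scriptHash == blake2b_224 (BS.cons 2 (originalHash <> argHash))
```

So the only on-chain cost of the check itself is a single hash over two child hashes and a prefix byte.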

@thaddeusdiamond (Contributor):

This may be a n00b question and I may have forgotten my PL coursework, but doesn't this make the hash of the script reliant on the parser/lexer now? This is not worse than the way it works for compilation today, I am just wondering if the exact same AST will be generated between different platform bindings (e.g., Plutus/Haskell, Helios, Aiken) or if the writer/deployers will have to be cognizant of the platform choice.

@matiwinnetou (Contributor):

@michaelpj
Use Case:

  • I was contacted during Cardano Summit 2022 in Lausanne by a member of the MartifyLabs team: our current model of an off-chain dApp registry (https://github.com/Cardano-Fans/crfa-offchain-data-registry) doesn't work for parameterised scripts. Basically, what the Martify Labs folks are doing is generating NFT marketplaces... on the fly. With such a model, each time any param changes, they generate new script hashes. It is not currently possible to identify whether a script hash in fact belongs to the same root. This in itself does not seem like an issue, but it could be a big problem for things like dApp audits / certification, where a certain certification firm / company will audit certain script hashes and provide proof on chain for doing so: see CPS: CPS-???? | On chain dApp and script audits #393

@michaelpj (Contributor) commented Dec 1, 2022

@matiwinnetou why can't they use a datum?

@L-as (Contributor Author) commented Dec 2, 2022

Thanks for the review Michael.

@kwxm (Contributor) left a comment:

The basic idea here looks sensible, but I'd like to see some figures comparing the time taken to compute the Merkle hash of a script with the time taken by the current method (computing the hash of a serialised script), and perhaps with the time taken to actually evaluate the script. There's a cost that has to be paid somewhere and I'm worried that it might be prohibitively expensive. I could be totally wrong about that though! We really need to see some numbers. We have a bunch of validation scripts here which we use for benchmarking and which we believe to be reasonably realistic: they'd be good candidates for benchmarking Merklisation costs.

Apart from that, the main things that I find myself wondering about are (a) how much this will affect the work that the node needs to do preparing a transaction for validation, and (b) how compelling the use case given here is in comparison with existing techniques (and with forthcoming extensions to the ledger model). I'll leave those issues to better-qualified people though.

The universe of types used on-chain is always `DefaultUni`.
Each possible data type is handled differently, with each having
a different prefix. The total number of prefixes does not exceed
255. If it did, the prefix would have to be increased to two bytes.
Contributor:

Is it 255 or 256? I think any unsigned byte is a valid prefix, but I could be wrong.

Contributor Author:

You are right. I'm dumb.

The serialisation according to [CIP-58](https://github.com/cardano-foundation/CIPs/blob/a1b9ff0190ad9f3e51ae23e85c7a8f29583278f0/CIP-%3F/README.md#representation-of-builtininteger-as-builtinbytestring-and-conversions),
prefixed with the two-byte prefix, is hashed.

In pseudocode: `hash $ prefix <> prefix' <> serialiseCIP58 n`
Contributor:

What's going on here? Is it that prefix tells you that you've got an integer and prefix' tells you the sign?

Contributor:

Oh yes, I guess that's what the sentence on line 121 means.

Contributor Author:

Actually, I think this is a mistake. This is a previous scheme I had, but there's no reason not to collapse it into one byte.

but it has to be proven to be random, hence hashing the prefix byte
is the best option.

In pseudocode: `hash prefix`
Contributor:

At first I found it a little confusing that everything used `prefix` (further complicated by the fact that earlier on it mentions that there's a version prefix attached to serialised Plutus Core scripts). It might be clearer if it said `error_prefix`, `lamabs_prefix` and so on, like it does later. You could even propose concrete values for the prefixes and use those. We might introduce more Term constructors in the future, but I don't think that's a problem.

Contributor Author:

Yes.


The hash of a `Builtin` is the hash of the prefix prepended to the base-256 encoded
(i.e. serialised to bytestring) index of the built-in function.
Because there are less than 256 built-ins, this is currently the same
Contributor:

Less than 256 or less than 257? I think that if we had 256 you could still get away with one byte here.

Contributor Author:

257


@@ -0,0 +1,270 @@
---
CIP: ?
Title: Merkelised Plutus Scripts
@kwxm (Contributor) Dec 8, 2022:

To be annoyingly pedantic, I'll point out that the process is named after Ralph Merkle, so it's Merklisation (or Merklization) rather than Merkelisation (which sounds like something from German politics).

@L-as (Contributor Author) Dec 8, 2022:

I realise this. I had thought of this, but Merklisation looks odd.

@micahkendall commented Dec 8, 2022

For the efficiency argument: instead of merklising every node of the AST, could you make each node optionally either Merkle-hashed or just the hash of its serialisation?

Since usually only specific parts of the tree need to be referenced (such as in the case of parameterisation), fully merklising is wasteful, but with partial merklisation you could introduce a 'merkle' keyword in whatever high-level language, erase it from the UPLC and use it to generate a tree.

This also means the tree used for this can be provided on-chain for referencing, for example so you can identify a parameterised contract without actually knowing the parameters.

@L-as (Contributor Author) commented Dec 10, 2022

> For the efficiency argument: instead of merklising every node of the AST, could you make each node optionally either Merkle-hashed or just the hash of its serialisation?

There are two ways of doing this:

  • Allow the user to choose how to hash the script; this means many hashes for one script (seems problematic),
  • or, make the choice part of the AST itself. This preserves one hash per script, but makes the scripts bigger.

The structure of the Flat encoding is thankfully quite simple.
Given an ADT, the encoding is essentially a constructor index, then the fields placed sequentially.
Primitive types are handled like ADTs.
Because everything is variable-length, you cannot know where anything starts and ends without decoding the entire script.
This also makes the trick @micahkendall suggested not useful AFAICT.
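
Purely as an illustration of that "constructor index, then fields in sequence" shape (this is not the real Plutus Core flat codec, which packs tags into a few bits and has its own integer encoding):

```haskell
import Data.Word (Word8)

data Example = Leaf Integer | Node Example Example

-- Naive unsigned varint, standing in for flat's own integer encoding.
encodeInteger :: Integer -> [Word8]
encodeInteger n
  | n < 0x80  = [fromIntegral n]
  | otherwise = fromIntegral (n `mod` 0x80 + 0x80) : encodeInteger (n `div` 0x80)

encode :: Example -> [Word8]
encode (Leaf n)   = 0 : encodeInteger n        -- constructor index 0, then the field
encode (Node l r) = 1 : encode l ++ encode r   -- constructor index 1, then both subterms in order
```

Because every piece is variable-length, the byte offset where `r` starts inside `encode (Node l r)` is unknown until `l` has been fully decoded, which is the problem described above.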

One thing I'm wondering about is, will you ever hash a script without also running it? If you will run it, you have to decode it anyway.

Assuming that there are instances where you don't currently have to run them (and thus decode them),
then I see two routes for improving the hashing performance besides raw engineering:

  • Making the encoding format more amenable to Merkle hashing.
  • Changing the scheme in the CIP to more closely match the Flat encoding.

For the latter, e.g. rather than hashing an integer the way described in the CIP,
it could simply be the hash of the flat encoding. One issue with this is that it complicates checking the hash
on-chain, because you now have to emulate the Flat encoding (which is admittedly simple, so it should still be possible;
checking the hash on-chain is also much less common than checking it off-chain).

Hashing a Term would involve hashing the index the Flat encoding uses, concatenated with the hashes of the arguments to the constructor.
To do this you don't need to "decode" the term, but rather determine how long the index is (to hash the rest) and how many arguments there are.
This however still doesn't seem optimal.

Going back to @micahkendall's idea, perhaps we can make it doable:
In addition to `Term`, have another `MerkleTerm` that has the same structure,
but at each child position has `Either Term MerkleTerm`. (We could use HKDs for this.)
What we then serialise would be `Either Term MerkleTerm`.
`MerkleTerm` wouldn't use the Flat encoding, as that's part of the problem here.
The encoding would be the constant-length (1-byte) constructor index, then the encodings of the fields placed serially.
When the child is a `Term`, it's prefixed with its length.
Then, when hashing a script, we use Merkle hashing until we reach a `Term`. Because we know the length of its encoding, we can pass the slice of the input bytestring to the hashing algorithm trivially.
We don't need to know the length of the encodings of the `MerkleTerm`s, because we need to decode those parts anyway.
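
A rough sketch of that shape (hypothetical names; the real UPLC `Term` has more constructors and carries names and types):

```haskell
-- Plain terms keep their existing flat encoding and are hashed as a single blob.
data Term
  = Var Int
  | LamAbs Term
  | Apply Term Term
  | Constant Integer
  -- remaining constructors elided

-- Mirror of Term where every child position can opt out of Merkle hashing:
-- a Left child is an ordinary Term (length-prefixed in the encoding), a Right
-- child is another MerkleTerm that keeps being Merkle-hashed.
data MerkleTerm
  = MVar Int
  | MLamAbs (Either Term MerkleTerm)
  | MApply (Either Term MerkleTerm) (Either Term MerkleTerm)
  | MConstant Integer
  -- remaining constructors elided

-- What would actually be serialised at the top level.
type Script = Either Term MerkleTerm
```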

This scheme is a bit complicated, but AFAICT it would keep hashing just as fast for all existing scripts.
For scripts that use `MerkleTerm`, the hashing would inherently be slower, but you'd only use the `MerkleTerm`
constructors when you want to be able to prove what arguments you are passing in.

As for hashing the constants: inside a `MerkleTerm` they would probably be hashed as described in the CIP,
and otherwise left as they are today.

This scheme however seems more complicated than what's described in this CIP, and in the end I have to run some benchmarks to see in practice how efficient Merkle hashing can get.

I will work on this when I find time; I suppose the CIP is blocked on that until then.

@kwxm (Contributor) commented Dec 17, 2022

> The structure of the Flat encoding is thankfully quite simple.

You may be aware of this already, but FYI there's a very detailed specification of the flat format as it's used for Plutus Core in Appendix E of the draft of the updated Plutus Core specification here.

@L-as (Contributor Author) commented Dec 27, 2022

I've thought about this for a bit, and come to the conclusion that the scheme described in the previous comment is in fact flawed.
The issue is that it needs to account for the worst case: a user who actively uses as much "Merklisation" as possible.

However, I think the original scheme in the CIP is in fact not problematic.
There are two cases to consider:

When we attach a reference script to a UTXO, the hash verification can never fail, because the hash, rather than being verified, can be computed from the attached script. The node can then cache the computation of the hash of this script and store it as part of the UTXO (if this is not already done).

The other case to consider is when we need to pass in a witness/concrete script that matches a script hash.
If the script comes from a reference script, all you need to do is a 32-byte equality check. This is trivial.
The remaining case is when we pass in the script as part of the transaction itself.
This can fail. What's important to note is that this is only needed in cases where 1) we need to execute the script and 2) the user chooses not to use a reference script.

There are two broad fixes to this:

  1. Disallow not using reference scripts. This would mean that passing in the wrong script would cause one succeeding transaction and one phase-1 failing transaction. You still pay fees for the first transaction.
  2. Delay verification of the hash of the scripts included in the transaction to phase 2. Passing in an incorrect script would then cause you to pay the collateral. You can optionally fuse decoding/hashing/execution to further optimise it, but it's not necessary.

With this, AFAICT, performance is no longer a problem. I am fine with either of the above two solutions. 2) might be less disruptive, but 1) seems cleaner.

Thoughts? @kwxm @michaelpj and others

@michaelpj (Contributor):

I'm not a huge fan of either of those:

  1. Seems extremely disruptive and basically gives up on the original EUTXO model.
  2. Makes script hashing uniquely special and expensive in a way that's likely to be a big pain.

Plus both of these would be quite a bit of work for the ledger. So it doesn't seem worth it to me.

I wonder if there's a more focussed change that would get you what you want. This changes the entire means of hashing scripts in order to find out whether a script is applied to something. We're not really using all that power! An alternative would be to have something like this:

`data PlutusScript = JustScript ActualPlutusScript | AppliedScript ActualPlutusScript ActualPlutusScript`

I don't particularly love this either, but I think it's at least worth thinking about alternatives that aren't so invasive.

@fallen-icarus:

I don't really know much about hashing, so this question is more out of curiosity. Why is merklisation better than allowing parameterized reference scripts (#354)? It seems that there isn't a consensus on whether the merklisation can be done efficiently enough. Does anyone have an idea of what the resource cost would be of allowing a custom parameter in reference scripts?

@@ -0,0 +1,270 @@
---
CIP: ?
Member:

Suggested change:
- CIP: ?
+ CIP: 46

@KtorZ KtorZ changed the title CIP-???? | Merkelised Plutus Scripts CIP-0046? | Merkelised Plutus Scripts Jan 17, 2023
@L-as (Contributor Author) commented Feb 6, 2023

@michaelpj That seems feasible. I don't see any issues with that design, and though it's effectively "1 level" of Merkle hashing, it seems to be powerful enough for almost all use cases? The use case this wouldn't cover is passing in an arbitrary script on-chain and then applying a parameter to it, because that arbitrary script may already be using one level of Merkle hashing.

IMO though, checking the script hash should be phase 2 anyway, and so should anything that doesn't have to be in phase 1 (though this may be due to ignorance on my part).

I am however fine with implementing this proposed solution. It seems simple to implement.

One question that remains is: should the `JustScript` case pass the hash through? This would maintain backward compatibility.
The `AppliedScript` case can AFAICT just be `hash (hash x <> hash y)`, since there is only one constructor with 2 fields.
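
A minimal sketch of that hashing rule (the stand-ins for `blake2b_224` and the serialiser, and the decision to keep `JustScript` identical to today's hashing, are assumptions for illustration):

```haskell
import Data.ByteString (ByteString)

type ActualPlutusScript = ByteString   -- stand-in for the real script/term type

-- Placeholders: today's (flat-based) serialisation and the script-hash function.
serialise :: ActualPlutusScript -> ByteString
serialise = id
blake2b_224 :: ByteString -> ByteString
blake2b_224 = undefined

data PlutusScript
  = JustScript ActualPlutusScript
  | AppliedScript ActualPlutusScript ActualPlutusScript

-- JustScript keeps the existing hash, so current script hashes are unchanged;
-- AppliedScript hashes the concatenation of the two component hashes.
hashPlutusScript :: PlutusScript -> ByteString
hashPlutusScript (JustScript s)      = blake2b_224 (serialise s)
hashPlutusScript (AppliedScript f x) =
  blake2b_224 (blake2b_224 (serialise f) <> blake2b_224 (serialise x))
```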

In that case, do we need to increment the Plutus language version? Seems like a hard fork would be enough.

@michaelpj (Contributor):

> IMO though, checking the script hash should be phase 2 anyway, and so should anything that doesn't have to be in phase 1 (though this may be due to ignorance on my part).

I don't see why. All other hash checking is phase 1. Phase 2 only exists for script evaluation, because it's so expensive.

> I am however fine with implementing this proposed solution. It seems simple to implement.

To be clear, that was a straw proposal. I don't really like it either. I just wanted to encourage the search for more ideas.

@L-as (Contributor Author) commented Feb 20, 2023

I had a long talk with @micahkendall, and I believe that what you proposed is more than sufficient, in addition to being the optimal solution.
Informal proof: given any use of n-deep Merkle hashing for scripts, you can roughly turn it into 1-deep Merkle hashing by putting all added arguments into a list on the RHS of `AppliedScript`. You do lose some power, but not much. If your goal is to add extra logic, the elements in the applied list can be hashes that refer to other scripts. Of course, if you want to add to the list, you need the entire list in the transaction, but that is still much smaller than having the entire script.

One minor change, however, is that it probably makes sense to have it be

`data PlutusScript = JustScript ActualPlutusScript | AppliedScript ActualPlutusScript Data`

Morally, this applied argument is similar to redeemers and datums, and hence should be in the same format.
If redeemers and datums are changed to be of some other format, this should be changed to be of the new format too.

Everything in the motivation can still be done with this scheme. What do you think? @michaelpj

@KtorZ KtorZ added the Category: Plutus Proposals belonging to the 'Plutus' category. label Mar 18, 2023
@KtorZ KtorZ added the Waiting for Author Proposal showing lack of activity or participation from their authors. label May 30, 2023