CIP-0039? | Language annotated address #310

SebastienGllmt
Collaborator

Currently, scripts inside addresses are represented simply by their script hash, with no indication of which language was used to generate the script. This is problematic because users cannot know whether they are sending to a native script (a very common use case that is perfectly fine) or to a Plutus script (in which case they may be accidentally locking their funds forever).

This proposal aims to tackle this issue by introducing a new address format that asserts the type of script held in its hash.

@SebastienGllmt changed the title from "Language annotated address" to "CIP-???? | Language annotated address" on Jul 31, 2022
Header type (`t t t t . . . .`) | script namespace* | Payment Part | script namespace* | Delegation Part
--- | --- | --- | --- |---
(0) `0000....` | ø | `PaymentKeyHash` | ø | `StakeKeyHash`
(1) `0001....` | 0-255 | `ScriptHash` | ø | `StakeKeyHash`

Collaborator Author

Note that the size of the namespace is not defined in the Cardano binary spec, so I just made it one byte. At the current rate (3 namespaces in 2 years since the Shelley release), we have enough for 84 years which seems decent.

If we wanted to optimize for address length, we have 3 optimization targets:

  1. Smallest size on-chain (every bit counts). We should probably avoid weird-length addresses, though.
  2. Smallest hex representation (0xF vs 0xFF, i.e. 0-15 vs 0-255). 16 values seems way too small, though.
  3. Smallest bech32 representation (0-31 vs 0-1023, since bech32 takes 5-bit chunks). Using this also leads to weird-length addresses, though.
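
To make the three targets above easier to compare, here is a rough sketch that counts how many values a namespace field of a given bit width can hold and how many hex and bech32 characters it occupies. The helper names are made up for illustration, and the 57-byte figure assumes a base address made of one header byte plus two 28-byte hashes; the bech32 count covers only the data part, not the human-readable prefix, separator, or checksum.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def field_cost(bits: int) -> dict:
    """Capacity and text cost of a namespace field that is `bits` wide."""
    return {
        "values": 2 ** bits,               # distinct namespaces available
        "hex_chars": ceil_div(bits, 4),    # hex encodes 4 bits per character
        "bech32_chars": ceil_div(bits, 5), # bech32 encodes 5 bits per character
    }


def bech32_data_chars(num_bytes: int) -> int:
    """Characters in the bech32 data part for a payload of `num_bytes` bytes."""
    return ceil_div(num_bytes * 8, 5)


BASE_ADDRESS_BYTES = 1 + 28 + 28  # assumed layout: header + two 28-byte hashes

for width in (4, 5, 8, 10):
    print(width, field_cost(width))

# Cost of adding a one-byte namespace to a base address.
print(bech32_data_chars(BASE_ADDRESS_BYTES), bech32_data_chars(BASE_ADDRESS_BYTES + 1))
```

Under these assumptions, a one-byte namespace adds two hex characters and one bech32 data character to a base address.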


Note that nothing stops somebody from creating a mal-constructed address where the language asserted inside the address does not match the actual language used for the script hash.

This is an unlikely error since addresses are provided to the users by dApps. Additionally, even if due to an error the wrong address is used, funds are not permanently lost because the spending condition is still controlled by the script hash -- meaning they can still be spent and additionally can be considered

Contributor

Are you proposing that we do not validate the script namespace at the time of spending, when the script type becomes known? I worry about adding a field that is not enforced. If folks cannot depend on the field being correct, it is dangerous to write code with that assumption (even if it normally holds true).

Collaborator Author

Well, the alternative is that if they get the namespace wrong, the funds are locked forever. Neither is a great option.

That being said, if we prefer to take neither option and avoid implementing this at the protocol level, having this supported as a user-level annotated address is still useful.

Contributor

I would prefer this to be checked.

Collaborator Author

@SebastienGllmt SebastienGllmt Aug 8, 2022

Some thoughts on how to make these checked: require that any tx that sends to an annotated script address provides the script in any of:

  1. The witness
  2. The auxiliary data
  3. An input (including reference inputs) that contains the inline script in its utxo entry

It makes using this feature a bit more expensive on-chain and a bit more tedious for wallets to implement, but allows having the correctness of the transaction be verifiable using the existing transaction context

Any other alternative to make this checked, such as requiring that "the full script for this hash has to appear somewhere on chain before", requires expanding the transaction validation context, which may not be desirable.
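
As a rough illustration of the rule sketched in this comment, the following models a transaction as three maps from script hash to namespace (witness, auxiliary data, and scripts carried by inputs) and accepts it only if every annotated output is backed by a matching script. The `Tx` type, its field names, and the integer namespaces are all hypothetical; none of this is ledger code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Tx:
    """Hypothetical, simplified view of a transaction for this check."""
    witness_scripts: Dict[bytes, int] = field(default_factory=dict)    # hash -> namespace
    auxiliary_scripts: Dict[bytes, int] = field(default_factory=dict)  # hash -> namespace
    input_scripts: Dict[bytes, int] = field(default_factory=dict)      # inline scripts on (reference) inputs
    annotated_outputs: List[Tuple[bytes, int]] = field(default_factory=list)  # (hash, asserted namespace)


def annotated_outputs_checked(tx: Tx) -> bool:
    """True if every annotated output has its script provided with the asserted namespace."""
    provided = {**tx.witness_scripts, **tx.auxiliary_scripts, **tx.input_scripts}
    return all(provided.get(h) == ns for h, ns in tx.annotated_outputs)
```

The check only uses data already inside the transaction, which is the property the comment is after; the cost is that the script has to travel with the transaction.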

Contributor

To clarify, I think we could have any or all of the following kinds of checking:

  1. Checking upon spending such an output. This should be easy, since the witness is already there.
  2. Checking, locally, before creating such an output i.e. in cardano-api. This would require providing some extra information at transaction-creation time, but wouldn't bloat the transaction.
  3. Checking, on-chain, when creating such an output. This would require including the witness when creating the transaction, which is significantly more expensive and is a much bigger privacy leak since the preimage really would be publicly available on-chain before the output was spent.

I'm in favour of 1 (since it's cheap), optionally 2 (if they have the information, they might as well tell us for cross-checking), and probably not 3 (although I guess we could support it optionally?).
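
Option 1 could look roughly like the sketch below: when an annotated output is spent, the witness script is already present, so its namespace can be compared with the one asserted by the address. The types and the namespace values 0, 1, 2 are assumptions made for illustration, not part of the CIP text.

```python
from dataclasses import dataclass

# Hypothetical namespace values, assumed here for illustration only.
NATIVE, PLUTUS_V1, PLUTUS_V2 = 0, 1, 2


@dataclass(frozen=True)
class AnnotatedOutput:
    namespace: int      # namespace byte asserted by the address
    script_hash: bytes  # 28-byte script hash held in the address


@dataclass(frozen=True)
class WitnessScript:
    namespace: int      # namespace of the script actually supplied
    script_hash: bytes  # hash of the supplied script


def check_on_spend(output: AnnotatedOutput, witness: WitnessScript) -> bool:
    """Spend-time check: right script, and the address told the truth about it."""
    if witness.script_hash != output.script_hash:
        return False  # not the script that locks this output
    return witness.namespace == output.namespace
```

Note that with this rule a mislabelled address makes the output unspendable, which is the trade-off raised earlier in the thread.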

@KtorZ changed the title from "CIP-???? | Language annotated address" to "CIP-0039 | Language annotated address" on Aug 2, 2022
@KtorZ changed the title from "CIP-0039 | Language annotated address" to "CIP-0039? | Language annotated address" on Aug 2, 2022
Contributor

@michaelpj michaelpj left a comment

I'm in favour of doing something like this. I keep forgetting that things don't work this way and being surprised.

I think it's worth calling out that there is potentially a small privacy leak, in that this now makes more information about outputs available before they are spent. It doesn't seem significant to me, though.


Additionally, this is not a problem that can easily be solved with better indexing tools for Cardano. This is because in the proxy native script contract use-case, it's possible that the proxy contract is unique for each individual user (some parameterized Plutus contract is calculated and then added as the requirement of the native script). Therefore, this feature benefits from being supported at the address level.

Lastly, the fact that addresses contain just their script hash without information about which language they use also leads to complexity in ledger rules and code of SDKs for Cardano as they cannot process any script language specific behavior until after the scripts are later provided as part of the witness. Knowing the language ahead of time allows the ledger to protect users from mistakes such as creating a Plutus V1 utxo entry with inline datum (which effectively makes it unspendable)

Contributor

👍

Contributor

I don't agree with this argument, honestly. Disregarding the ledger complexity issue (which might be true), you will always make mistakes if you don't know the exact script in question. In what reasonable situation would you make a UTXO for a script address you don't know and then make the datum inline? I cannot think of any.

What parts of the ledger could be simplified by adding this information? I'm not sure it's worth the decrease in efficiency due to increased transaction size.

Note there are two ways we can support our modification to the address type:

1) Reserve a new address type as part of the header nibble. This is not great because it means we have to reserve many new address types (a new alternative for every case where the old address format contains a script hash)
2) Add a new column to existing address types that contain scripts. This doesn't work for pointer addresses because they are variable-width so it isn't possible to differentiate between the original address format and this annotated format (note that pointer addresses are almost never used on chain, so this is an acceptable limitation). This does, however, limit the kinds of modifications we could make to these address types in the future as we would need to make sure the length continues to uniquely identify the content.

Contributor

Yet more reasons to remove pointer addresses...
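
The length-based approach in option 2 of the quoted text can be sketched as follows. This assumes the header-type numbering of the existing address format (types 0 to 3 are base addresses, 4 and 5 are pointer addresses), 28-byte hashes, and one namespace byte per script hash; the function name and return labels are made up for illustration.

```python
HASH_LEN = 28

# Number of script hashes in each base address type (assumed header numbering).
SCRIPT_HASHES = {0: 0, 1: 1, 2: 1, 3: 2}


def classify(addr: bytes) -> str:
    """Tell legacy and annotated base addresses apart using only their length."""
    if not addr:
        raise ValueError("empty address")
    header_type = addr[0] >> 4  # top nibble of the header byte
    if header_type in (4, 5):
        # Pointer addresses are variable-width, so length cannot disambiguate them.
        return "pointer"
    if header_type not in SCRIPT_HASHES:
        return "other"
    legacy_len = 1 + 2 * HASH_LEN
    annotated_len = legacy_len + SCRIPT_HASHES[header_type]  # one byte per script hash
    if len(addr) == legacy_len:
        return "legacy"
    if SCRIPT_HASHES[header_type] and len(addr) == annotated_len:
        return "annotated"
    return "invalid"
```

This also shows the limitation mentioned in the text: any future change to these address types has to keep the lengths distinct, or the classification stops working.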



# Language vs script namespace

Care has to be taken when using the word `language` in Cardano. Strictly speaking, `language` is defined in the binary spec as any script type that has a cost model associated with it (i.e. PlutusV1 and PlutusV2; it does not include native scripts).

Contributor

Our nomenclature is such a mess here :/

@bezirg
bezirg commented Aug 17, 2022

Taken from Alonzo spec:

TxOut is the type of transaction outputs. These are extended to include an optional hash of a datum. Note that any output can optionally include such a hash, even though only phase-2 scripts can actually use the hash value. Since it is in general impossible for the ledger implementation to know at the time of transaction creation whether or not an output belongs to a phase-2 script, we simply allow any output to contain a hash of a datum.

Would this CIP make it an error to attach datums (inline or not) to non-Plutus addresses?

Co-authored-by: Nikolaos Bezirgiannis <329939+bezirg@users.noreply.github.com>
@SebastienGllmt
Collaborator Author

SebastienGllmt commented Aug 17, 2022

Would this CIP make it an error to attach datums (inline or not) to non-Plutus addresses?

no, this CIP has no relation to deciding what to do about non-plutus addresses

@dcoutts
Contributor

dcoutts commented Sep 14, 2022

This proposal seems to be based on the assumption that it's "safe" to send funds to a script address that uses the simple script language (as opposed to Plutus).

Currently, scripts inside addresses are represented simply by their script hash, with no indication of which language was used to generate the script. This is problematic because users cannot know whether they are sending to a native script (a very common use case that is perfectly fine) or to a Plutus script (in which case they may be accidentally locking their funds forever).

(bold emphasis added)

What is the basis for this assumption? As far as I can see it's plainly false. When you send funds to a script address where you do not know what the script is, you have literally no idea about the conditions under which the funds can be spent. So in what sense is it safe? This is true for simple scripts or Plutus scripts. The funds might be locked forever, or belong to someone you don't expect. You have literally no idea in that situation.

@SebastienGllmt
Collaborator Author

SebastienGllmt commented Sep 14, 2022

What is the basis for this assumption? As far as I can see it's plainly false. When you send funds to a script address where you do not know what the script is, you have literally no idea about the conditions under which the funds can be spent. So in what sense is it safe? This is true for simple scripts or Plutus scripts. The funds might be locked forever, or belong to someone you don't expect. You have literally no idea in that situation.

This is like saying wallets should warn all users whenever they send to a public key address because the user may have forgotten their private key. Sure, I guess that's technically true, but clearly there is a difference between sending to an address where everything is okay as long as the owner of the address didn't screw up, and sending to a Plutus script where there is a 100% chance your funds are lost forever if there is no datum.

I guess you could make some argument that the native script may contain a timelock which makes it no longer spendable, but I think this is more of an edge case as far as multisig usage is concerned (historically, timelocks are only used for NFT minting scripts). However, even then, showing a warning for native addresses and an error for Plutus addresses is an improvement over the existing behavior.

@dcoutts
Contributor

dcoutts commented Sep 14, 2022

So your definition of "safe" is that the funds at least go to someone, don't know who, but at least are not lost/locked?

Even if one ignores timelocks in the simple script language and just thinks about multi-sig, it's still the case that you have no idea who will be able to take the funds you have sent to the address. You have no idea because you do not know the set of pubkey hashes nor the rules on which combinations can spend.

As for "normal" non-script addresses: those at least reveal the one pubkey hash of the recipient, but it's still up to the sender to check that this is the right recipient.

But given that it's up to the sender to check that they're sending to the right address, that applies to script addresses too.

Now all that said, I agree it would be a useful feature to be able to simply send funds to a Plutus script address and have that do something useful, since that would make it easy to send funds with a generic wallet. But I don't see that it makes sense to distinguish the script language in order to be able to provide (or avoid) a meaningless warning.

@JaredCorduan
Contributor

Note that @michaelpj's suggestion in CIP-38 also provides a solution to the problem addressed by this CIP: #309 (comment)

The extended address format could provide a reference to obtain the script, which could then be verified against the hash in the address.

@SebastienGllmt
Collaborator Author

@michaelpj in your proposal, how would you differentiate no datum (native script) vs empty datum (plutus)? If the differentiation is based off the script type, the recommendation becomes a superset of what this CIP is proposing (not a bad thing)

@michaelpj
Contributor

My proposal is just a sketch, but since the idea is to be inspired by URLs, I think you would just do the equivalent of:

  • <addr>?datum=<bytes-of-unit-datum> for "empty datum"
  • <addr> for "no datum"

I'm not sure it directly addresses the problem that this CIP aims at, although Jared's idea is interesting!
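
A minimal sketch of how a client might read the URL-style form above, distinguishing "no datum" from an explicitly supplied datum. The syntax is only the sketch from this comment, not a specified format; the function name and the placeholder address are hypothetical, and `d87980` is assumed to be the CBOR encoding of the Plutus unit datum.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs


def parse_extended_address(text: str) -> Tuple[str, Optional[bytes]]:
    """Split '<addr>?datum=<hex>' into the address and an optional datum."""
    addr, _, query = text.partition("?")
    params = parse_qs(query, keep_blank_values=True)
    if "datum" not in params:
        return addr, None  # "no datum": nothing should be attached to the output
    return addr, bytes.fromhex(params["datum"][0])  # datum bytes supplied explicitly


# "addr1example" is a placeholder, not a real address.
print(parse_extended_address("addr1example"))
print(parse_extended_address("addr1example?datum=d87980"))
```

The key point is that the absence of the `datum` parameter and the presence of a unit datum are different results, so the distinction does not have to be inferred from the script type.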

@JaredCorduan
Contributor

I'm not sure it directly addresses the problem that this CIP aims at

Maybe it doesn't, but here's what I was thinking:

The problem (in this CIP, not CIP-38) as I understand it: One cannot tell if a script hash corresponds to a native script or a plutus script.

The extended address format would solve this problem by providing a reference to the full script. This is actually safe, since the user/client can check it themselves. Note that a language identifier is used in the script hash.
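
The client-side check described here could be sketched as below. It assumes, following the comments in the ledger's binary spec as I understand them, that a script hash is the 28-byte BLAKE2b digest of a one-byte language tag followed by the script bytes, with tags 0, 1, and 2 for native, PlutusV1, and PlutusV2 scripts. The function names are hypothetical.

```python
import hashlib
from typing import Optional

# Assumed language tags prepended to the script bytes before hashing.
LANGUAGE_TAGS = {0: "native", 1: "PlutusV1", 2: "PlutusV2"}


def script_hash(tag: int, script_bytes: bytes) -> bytes:
    """28-byte BLAKE2b hash of the tag byte followed by the script bytes."""
    return hashlib.blake2b(bytes([tag]) + script_bytes, digest_size=28).digest()


def language_of(script_bytes: bytes, expected_hash: bytes) -> Optional[str]:
    """Return the language under which the script matches the hash, if any."""
    for tag, name in LANGUAGE_TAGS.items():
        if script_hash(tag, script_bytes) == expected_hash:
            return name
    return None  # the referenced script does not correspond to this address
```

Because the tag is part of the hashed data, a script fetched through a reference either matches the address hash under exactly one language or does not match at all, which is why the user can verify the claim without trusting the reference.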

@michaelpj
Contributor

I see, so you're suggesting that in addition to information needed to construct the output itself, we could pack other metadata into the extended address format also. I guess that does make some sense - although they could get quite big!

@JaredCorduan
Contributor

they could get quite big!

indeed. that's why I was thinking that it would be a reference to the script, so that the extended address itself didn't have to get too huge. that's the only way I can think of to make this both safe and not require a drastic change to the address format.

@KtorZ added the "Category: Ledger" label (proposals belonging to the 'Ledger' category) on Mar 18, 2023