Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPS-0013? | Better builtin data structures in Plutus #638

Merged
merged 5 commits into from
Feb 20, 2024

Conversation

michaelpj
Copy link
Contributor

@michaelpj michaelpj commented Dec 14, 2023

Based on some discussions with Pi and others.

Some tasks today are difficult because we don't have the data structures to do them efficiently. This CPS outlines a couple of examples and gestures towards some possible solutions (which will require a proper CIP).


Rendered Version

@michaelpj michaelpj changed the title CPS for better builtin datastructures in Plutus CPS-??? |Better builtin datastructures in Plutus Dec 14, 2023
@rphair rphair changed the title CPS-??? |Better builtin datastructures in Plutus CPS-???? | Better builtin datastructures in Plutus Dec 14, 2023
@rphair rphair added the Category: Plutus Proposals belonging to the 'Plutus' category. label Dec 14, 2023
@rphair rphair changed the title CPS-???? | Better builtin datastructures in Plutus CPS-???? | Better builtin data structures in Plutus Dec 14, 2023
Copy link
Collaborator

@rphair rphair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice to see such a proposal sourced in cooperation with the community 🤓

CIP-????/README.md Outdated Show resolved Hide resolved
CIP-????/README.md Outdated Show resolved Hide resolved
CIP-????/README.md Show resolved Hide resolved
@Quantumplation
Copy link
Contributor

Thanks for taking the time to compile this @michaelpj!

Not sure if this is appropriate for a CPS, but in the spirit of inspiring some CIPs to solve this CPS, let me document a few of the rough ideas we kicked around on that twitter thread:

Length-prefixed arrays with constant time lookup

New builtin type array, with a length and type, such as:

(con (array 1 integer) [1])
(con (array 3 string) ["1", "2", "3"])
(con (array 7 data) [I 1, B aabb, I 2, I 3, B ddee, B ffgg, I 7])

New builtin function arrayLength that returns the length of the array

New builtin function newArray that constructs a new array of a given length

(builtin.newArray 5 data)

evalutes to `(con (array 5 data) [bottom, bottom, bottom, bottom, bottom])

(or perhaps you provide the initial value?)

New builtin function index, which retreives the nth item, or faults if its out of bounds

(builtin.index 3 (con (array 7 data) [I 1, B aabb, I 2, I 3, B ddee, B ffaa, I 7]))

evalutes to `I 3`

New builtin function set that returns a new array, with the nth item replaced by a new value, or faults if its out of bounds

(builtin.set 3 (B beef) (con (array 7 data) [I 1, B aabb, I 2, I 3, B ddee, B ffgg, I 7]))

evalutes to (con (array 7 data) [I 1, B aabb, I 2, B beef, B ddee, B ffgg, I 7])

Why is this useful?

There are several good candidates in the script context that are "index heavy" (most notably the inputs and outputs, but also withdrawals) that would benefit from being made into arrays. Even if not, a single linear pass to construct the array, and then constant time lookups after that, this would save quadratic work in many contracts.

GADTs from languages that compile to UPLC could compile structs to "Constructors with one field, which is an array of the logical fields of the struct", and then any field access is a builtin.head followed by a constant time access for the field we need.

Associative Arrays

I have less fidelity here, so either a new builtin or a new Plutus Data type that got accelerated with a hashmap implementation; ideally, one that mapped directly onto CBOR maps for efficient encoding.

Would likely need builtins for looking up by key, and inserting (producing a new map with the entry added).

Why is this useful?

Structs could compile down to maps, for log (with very small constants) time lookup of fields.
Datums / Redeemers could contain maps from address to value, or maps from txInput to action to take on that input, etc.

Preserve the order of inputs from the transaction

This one is kind of an orthogonal / out of the box solution, but is highlighted by the struggles in this CPS; in particular, the best optimization is just to not do the work at all anymore, and a huge source of indexing slowdown is having to re-sort the transaction inputs in the first place.

CIP-????/README.md Show resolved Hide resolved
CIP-????/README.md Outdated Show resolved Hide resolved
#### Operations on `Value`

The `Value` type is a nested map: it is a map from bytestrings (representing policy IDs) to maps from bytestrings (representing token names) to integers (representing quantities).
Since map operations are currently linear, this means that even simple operations like checking whether one value is less than another can have quadratic cost.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since map operations are currently linear

To be precise, operations that should be logarithmic are currently linear (e.g., insertion, deletion, member), whereas operations should be linear (or nlogn) are currently quadratic (e.g., union)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we had a list with O(1) index lookup then valueOf would be constant time in the majority of DApps in practice. This is because every robust DApp already explicitly prevents non-protocol related tokens from being stored in protocol script UTxOs, and in the vast majority of cases the validator will have access to the currency symbol of all the protocol tokens that it must check for. In all such cases you can just predetermine the indices for the relevant protocol tokens and then use elemAt index valuePairList instead of valueOf

CIP-????/README.md Show resolved Hide resolved
@michaelpj
Copy link
Contributor Author

@rphair I think this is ready for review

CIP-????/README.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@rphair rphair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelpj all looks generally in order to me. Have added it to CIP agenda # 81 for Review this time, since at meeting # 79 it only got Triage. I expect we'll be able to assign it a number at the meeting & I'm ready to green-check this soon after that's done... unless any major outcry or new unresolved issues. cc @Quantumplation

To avoid confusion for reviewers, can you rename the containing directory to something beginning with CPS- and link to the latest file in your branch in your original comment up top?

Also we agreed ## References is OK but please see discussion in #638 (comment) about what to do with the link currently there.

Note this was supposed to be combined with #638 (comment) but my Internet connection went boogey.

Copy link
Collaborator

@Crypto2099 Crypto2099 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good and in order for a CPS to me. Thanks @rphair @michaelpj @Quantumplation for getting this together. Definitely worthy of a number at the next meeting at the very least!

Copy link
Collaborator

@Ryun1 Ryun1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be a well written problem statement, with good discussions, happy to see a number applied then merged.

Copy link
Collaborator

@rphair rphair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I"m also very happy with it but will wait until it gets a number (very likely in couple days' time after meeting) in the usual cycle before I ✅ it for merge.

@Ryun1
Copy link
Collaborator

Ryun1 commented Feb 6, 2024

cc @michele-nuzzi

@rphair rphair added the Last Check This proposal has been reviewed and approved, staged for merging. label Feb 6, 2024
@rphair rphair changed the title CPS-???? | Better builtin data structures in Plutus CPS-0013? | Better builtin data structures in Plutus Feb 6, 2024
Copy link
Collaborator

@rphair rphair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelpj this has passed editorial review but zooming ahead too quickly to merge now. Please update the number here & in your branch (and correcting the document link in your OP please)... it's on Last Check now for next meeting agenda, if not merged by popular request & consensus sooner 🚀

I think #638 (comment) still needs to be resolved although I believe that's a document quality & not a "standards" issue & therefore up to @michaelpj @Quantumplation about whether it needs to be fixed; so I won't let the merge be contingent upon it.

  • TL;DR (already documented above) I think our body of CIPs shouldn't have References which are links of unknown significance to social discussions included without context: our CIPs are generally written at a higher calibre than that.

CIP-????/README.md Outdated Show resolved Hide resolved
@rphair rphair merged commit e7db3a4 into cardano-foundation:master Feb 20, 2024
@colll78
Copy link
Contributor

colll78 commented Mar 7, 2024

#767 (comment)

Here are benchmarks that are relevant to the discussion of the benefits of indexed data-structures. They illustrate how incredibly expensive recursive lookups are currently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category: Plutus Proposals belonging to the 'Plutus' category. Last Check This proposal has been reviewed and approved, staged for merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants