Clarity: introduce merge function #2117

lgalabru · 2020-12-02T06:09:24Z

This PR is addressing #1733 by introducing a (set base-tuple tuple).
The initial issue was suggesting merge, it could also be combine, or set (i personally like the symmetry with get) - please chime in on the naming :).

Todo left:

Set cost functions

njordhov · 2020-12-02T06:50:39Z

The verb set is used in the mutating functions map-set and var-set making it a mismatch for a pure function that returns a new merged record.

The term merge is widely used for this functionality: The Clojure lisp dialect has merge; Typed Racket lisp has record-merge2; Ruby has merge; In Javascript, Immutable has merge; Lodash FP has merge.

The name merge works well as an input function for e.g.fold:

(fold merge record-list initial-record)

Alternatively, what about using the declarative name record and allow one or more arguments that are merged into a record?

src/vm/functions/mod.rs

kantai · 2020-12-07T19:27:44Z

src/vm/ast/definition_sorter/mod.rs

@@ -254,6 +254,14 @@ impl<'a> DefinitionSorter {
                                    }
                                    return Ok(());
                                }
+                                NativeFunctions::TupleSet => {


Does this need to be handled with a special case in this match? Or could TupleSet just be handled like any other native function (i.e., fall through to line 270, which just iterates over the expressions)?

See also #2123

Addressed with a1a3082

kantai

This looks good to me -- just a couple small comments. One is that I agree with @njordhov that the function can (and should) be implemented as a normal native function. I also agree that it makes more sense to call this merge -- I think set connotes that it is a mutating method, which it isn't.

src/vm/docs/mod.rs

src/vm/types/mod.rs

jcnelson · 2020-12-08T03:37:01Z

src/vm/types/mod.rs

+                        base.type_signature.clone(),
+                    ))?;
+
+            let new_value = match existing_value {


Just to make sure I understand the implications of this code:

(set { a: { c: 1 }, b: 2 } { a: 1 }) will fail at runtime, because when processing a in base, existing_value would be a Value::Tuple but value would be a Value::Uint(_), triggering the CheckErrors::ExpectedTuple error.

(set { a: 1, b: 2 } { a: { c: 1 } }) would succeed, because when processing a in base, existing_value would be a Value::Uint(_) and would just fall through this case. The resulting tuple would be { a: { c: 1 }, b: 2 }.

(set { a: { c: 1 }, b: 2 } { a: { d: 2 } }) would fail, because the inner call to TupleData::deep_merge() would fail for the inner { c: 1 } and { d: 2 } values for a.

(set { a: { c: 1, d: 2 }, b: 2 } { a: { c: 2 } }) succeed, and the tuple values for a in both base and updates would be merged through this inner call to TupleData::deep_merge(). The resulting tuple would be { a: { c: 1, d: 2 }, b: 2}.

Is my understanding of the above correct? If so, then I think it's kind of surprising. As a programmer, I would have expected the following behaviors:

(set { a: { c: 1 }, b: 2 } { a: 1 }) should succeed, but produce the tuple { a: 1, b: 2 }.

(set { a: 1, b: 2 } {a: { c: 1 } }) should behave as described above, and produce { a: { c: 1 }, b: 2 }.

(set { a: { c: 1 }, b: 2 } { a: { d: 2 } }) should succeed, but produce the tuple { a: { d: 2 }, b: 2 }.

(set { a: { c: 1, d: 2 }, b: 2 } { a: { c : 2 } }) should succeed, but produce the tuple { a: { c: 2 }, b: 2 }.

I figured that if I wanted to merge two tuples, the merger ought to be "shallow" and only affect the tuple's immediate keys (instead of a "deep" set, which merges all contained tuples as well). If I wanted the "deep" behavior, I would have explicitly called set on the inner tuples. Making the merger "deep" by default makes cost-tracking more difficult, because the "deep" merge algorithm introduces recursion -- you'll need to bill the user for processing each sub-tuple, as well as find a way to statically determine that a single (set ..) can't cause a stack overflow in the VM (leading to a node crash).

What do you all think? Also cc @kantai

Hmm -- yes, I think I agree with this. I think it makes more sense for this function to produce a shallow "union" tuple, rather than trying to do a deep merge.

I'm not so worried about the cost-tracking of a deep merge, which could be implemented as a different function (though, I'm not sure how useful it would be -- in our boot contracts and from the contracts people have mostly produced, tuples tend not to nest deeply, and when they do, it's rare that only a field of a child tuple would be updated) -- the cost of a deep merge is still linear in the size of the two inputs, and the depth of the merge is bounded by the type depth limit.

Sure. I designed this feature based on what I've been experiencing while writing contracts. I think in 100% of cases, I would have use this function for updating a tuple coming from storage.
If you look at this use case, then you never want to update the "base" type signature, and be as terse as possible.

Sure. I designed this feature based on what I've been experiencing while writing contracts. I think in 100% of cases, I would have use this function for updating a tuple coming from storage.
If you look at this use case, then you never want to update the "base" type signature, and be as terse as possible.

This use-case doesn't invalidate the argument for creating a "union" tuple via (set ..). A subsequent call to var-set, map-set or map-insert would still need to pass type-check. Assuming we did something like what I suggest above, then this code snippet would not pass analysis, for example:

(define-map foo { a: uint, b : { c: uint } } { d: uint, e: { f: uint } }) (let ( (test (unwrap-panic (map-get { a: u1, b: { c: u2 } }))) ) ( ;; will fail to type-check, since `(set ..)` has type `{ d: uint, e: { f: uint, g: uint } }`, ;; which is incompatible with the value type definition for `foo` map-set { a: u1, b: { c: u2 } } (set test { d: u3, e: { g: u4 } })) ))

So, having the "union" behavior shouldn't prevent you from safely loading, updating, and storing structured data.

I think in 100% of cases, I would have use this function for updating a tuple coming from storage.

@lgalabru A common use case for merge in Clarity is to merge the state in a reducer with updated values. See for example the add-pox-addr-to-ith-reward-cycle reducer function tin the pox contract, which could use a (merge params {i: ...}) expression to replace the explicit passing of properties in line 264-267:

https://github.com/blockstack/stacks-blockchain/blob/1ad60a216e53473d6fb103e2e171cfe1003b9883/src/chainstate/stacks/boot/pox.clar#L254-L284

ok, I'll fold

I'm working on re-implementing this feature, but one problem that we would now be facing, is how do we handle optionals without turning this merge feature into a footgun - as in what should be the type signature of the following cases:

(merge { a: (some { x: 0, y: 1 }), b: 2, c: 3 } { a: none })

(merge { a: 1, b: 2, c: 3 } { a: none })

(merge { b: 2, c: 3 } { a: none })

Happy to get a new round of thoughts on this.

how do we handle optionals

@lgalabru Option properties should be handled like any other type. Consider that it is is just an enum with two variants (Some and None) just like a Response is an enum with two variants (Ok and Err). Despite the Clarity type name, the property is not optional (clarity-lang/reference#22).

In all cases, I think the output type would be the same:

(merge { a: (some { x: 0, y: 1 }), b: 2, c: 3 } { a: none }) -> { a: (optional NoType), b: int, c: int } (merge { a: 1, b: 2, c: 3 } { a: none }) -> { a: (optional NoType), b: int, c: int } (merge { b: 2, c: 3 } { a: none }) -> { a: (optional NoType), b: int, c: int }

This should be the natural output from a type checker that just set the output tuple type to be the "shallow" union of the two input tuples. The NoType here won't present difficulties to the type checker unless someone tried to unwrap! the optional, which is a type error.

Gotcha, I was not familiar with the type admission rules of NoType / UnknownType, and it looks like this works as expected:

(define-map users uint { address: principal, name: (optional (string-ascii 32)) }) (let ((user (unwrap-panic (map-get? users u0)))) (map-set users u0 (merge user { name: none })))

Approach revisited in 8b5c1f0.

jcnelson

I need some clarity (no pun intended) in the intended behavior of set, as outlined in my comments.

# Conflicts: # src/vm/tests/datamaps.rs

jcnelson

LGTM! Thank you for addressing my feedback!

kantai · 2020-12-09T14:25:19Z

src/vm/types/signatures.rs

@@ -64,7 +64,7 @@ impl AssetIdentifier {

 #[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
 pub struct TupleTypeSignature {
-    type_map: BTreeMap<ClarityName, TypeSignature>,
+    pub type_map: BTreeMap<ClarityName, TypeSignature>,


Does this need to be pub?

Yeah when doing the merge operation, I'm draining, in order to avoid cloning. Do you prefer a method returning a mutable ref? or cloning?

Sorry for the slow response -- it looks like the draining/mutation is all happening in shallow_merge, which is defined in this same module, so the pub shouldn't be necessary.

Indeed, got confused by pub behavior, thanks.
Addressed with 65706f3

@lgalabru Perhaps a reference counted pointer could be of use here, which may avoid much of the cloning through structural sharing in persistent data structures.
https://doc.rust-lang.org/book/ch15-04-rc.html

kantai

This LGTM! Only comment is about the visibility of TupleTypeSignature::type_map, I think that field should remain private.

lgalabru added 2 commits December 2, 2020 01:03

feat: implement native function (analysis + execution)

df197a6

feat: add tests, add docs

ae3a47a

lgalabru self-assigned this Dec 2, 2020

chore: cargo fnt

cb7a70c

lgalabru marked this pull request as ready for review December 3, 2020 15:02

njordhov reviewed Dec 4, 2020

View reviewed changes

src/vm/functions/mod.rs Outdated Show resolved Hide resolved

lgalabru requested review from kantai and jcnelson December 7, 2020 17:54

kantai reviewed Dec 7, 2020

View reviewed changes

kantai approved these changes Dec 7, 2020

View reviewed changes

njordhov reviewed Dec 7, 2020

View reviewed changes

src/vm/docs/mod.rs Show resolved Hide resolved

jcnelson reviewed Dec 8, 2020

View reviewed changes

src/vm/types/mod.rs Outdated Show resolved Hide resolved

jcnelson reviewed Dec 8, 2020

View reviewed changes

jcnelson requested changes Dec 8, 2020

View reviewed changes

lgalabru changed the base branch from feat/tuples-volume-1 to next December 8, 2020 19:37

lgalabru added 8 commits December 8, 2020 17:46

Merge branch 'next' into feat/add-merge

cbc9a01

# Conflicts: # src/vm/tests/datamaps.rs

refactor: rename set to merge

6f976eb

feat: shallow merge, instead of deep merge

8b5c1f0

chore: cargo fmt

c8efb7d

fix: name collision

049d97d

fix: update definition sorter

a1a3082

chore: cargo fmt

bc5d32d

feat: add cost tracking

1b076b0

lgalabru requested review from jcnelson and kantai December 9, 2020 02:59

jcnelson approved these changes Dec 9, 2020

View reviewed changes

kantai reviewed Dec 9, 2020

View reviewed changes

kantai approved these changes Dec 9, 2020

View reviewed changes

fix: acl

65706f3

lgalabru merged commit 883f7af into next Dec 9, 2020

lgalabru mentioned this pull request Jan 29, 2021

Clarity function to merge record structures #1733

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarity: introduce merge function #2117

Clarity: introduce merge function #2117

lgalabru commented Dec 2, 2020 •

edited

Loading

njordhov commented Dec 2, 2020 •

edited

Loading

kantai Dec 7, 2020

njordhov Dec 8, 2020 •

edited

Loading

lgalabru Dec 9, 2020

kantai left a comment

jcnelson Dec 8, 2020 •

edited

Loading

kantai Dec 8, 2020

lgalabru Dec 8, 2020

jcnelson Dec 8, 2020 •

edited

Loading

njordhov Dec 8, 2020 •

edited

Loading

lgalabru Dec 8, 2020

lgalabru Dec 8, 2020 •

edited

Loading

njordhov Dec 8, 2020 •

edited

Loading

kantai Dec 8, 2020

lgalabru Dec 8, 2020

jcnelson left a comment

jcnelson left a comment

kantai Dec 9, 2020

lgalabru Dec 9, 2020

kantai Dec 9, 2020

lgalabru Dec 9, 2020

njordhov Dec 9, 2020 •

edited

Loading

kantai left a comment

Clarity: introduce merge function #2117

Clarity: introduce merge function #2117

Conversation

lgalabru commented Dec 2, 2020 • edited Loading

njordhov commented Dec 2, 2020 • edited Loading

Choose a reason for hiding this comment

njordhov Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kantai left a comment

Choose a reason for hiding this comment

jcnelson Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jcnelson Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

njordhov Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lgalabru Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

njordhov Dec 8, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jcnelson left a comment

Choose a reason for hiding this comment

jcnelson left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

njordhov Dec 9, 2020 • edited Loading

Choose a reason for hiding this comment

kantai left a comment

Choose a reason for hiding this comment

lgalabru commented Dec 2, 2020 •

edited

Loading

njordhov commented Dec 2, 2020 •

edited

Loading

njordhov Dec 8, 2020 •

edited

Loading

jcnelson Dec 8, 2020 •

edited

Loading

jcnelson Dec 8, 2020 •

edited

Loading

njordhov Dec 8, 2020 •

edited

Loading

lgalabru Dec 8, 2020 •

edited

Loading

njordhov Dec 8, 2020 •

edited

Loading

njordhov Dec 9, 2020 •

edited

Loading