-
Notifications
You must be signed in to change notification settings - Fork 671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarity: introduce merge function #2117
Conversation
The verb The term The name (fold merge record-list initial-record) Alternatively, what about using the declarative name |
src/vm/ast/definition_sorter/mod.rs
Outdated
@@ -254,6 +254,14 @@ impl<'a> DefinitionSorter { | |||
} | |||
return Ok(()); | |||
} | |||
NativeFunctions::TupleSet => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be handled with a special case in this match? Or could TupleSet just be handled like any other native function (i.e., fall through to line 270, which just iterates over the expressions)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also #2123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed with a1a3082
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me -- just a couple small comments. One is that I agree with @njordhov that the function can (and should) be implemented as a normal native function. I also agree that it makes more sense to call this merge
-- I think set
connotes that it is a mutating method, which it isn't.
src/vm/types/mod.rs
Outdated
base.type_signature.clone(), | ||
))?; | ||
|
||
let new_value = match existing_value { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure I understand the implications of this code:
(set { a: { c: 1 }, b: 2 } { a: 1 })
will fail at runtime, because when processinga
inbase
,existing_value
would be aValue::Tuple
butvalue
would be aValue::Uint(_)
, triggering theCheckErrors::ExpectedTuple
error.(set { a: 1, b: 2 } { a: { c: 1 } })
would succeed, because when processinga
inbase
,existing_value
would be aValue::Uint(_)
and would just fall through this case. The resulting tuple would be{ a: { c: 1 }, b: 2 }
.(set { a: { c: 1 }, b: 2 } { a: { d: 2 } })
would fail, because the inner call toTupleData::deep_merge()
would fail for the inner{ c: 1 }
and{ d: 2 }
values fora
.(set { a: { c: 1, d: 2 }, b: 2 } { a: { c: 2 } })
succeed, and the tuple values fora
in bothbase
andupdates
would be merged through this inner call toTupleData::deep_merge()
. The resulting tuple would be{ a: { c: 1, d: 2 }, b: 2}
.
Is my understanding of the above correct? If so, then I think it's kind of surprising. As a programmer, I would have expected the following behaviors:
(set { a: { c: 1 }, b: 2 } { a: 1 })
should succeed, but produce the tuple{ a: 1, b: 2 }
.(set { a: 1, b: 2 } {a: { c: 1 } })
should behave as described above, and produce{ a: { c: 1 }, b: 2 }
.(set { a: { c: 1 }, b: 2 } { a: { d: 2 } })
should succeed, but produce the tuple{ a: { d: 2 }, b: 2 }
.(set { a: { c: 1, d: 2 }, b: 2 } { a: { c : 2 } })
should succeed, but produce the tuple{ a: { c: 2 }, b: 2 }
.
I figured that if I wanted to merge two tuples, the merger ought to be "shallow" and only affect the tuple's immediate keys (instead of a "deep" set, which merges all contained tuples as well). If I wanted the "deep" behavior, I would have explicitly called set
on the inner tuples. Making the merger "deep" by default makes cost-tracking more difficult, because the "deep" merge algorithm introduces recursion -- you'll need to bill the user for processing each sub-tuple, as well as find a way to statically determine that a single (set ..)
can't cause a stack overflow in the VM (leading to a node crash).
What do you all think? Also cc @kantai
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm -- yes, I think I agree with this. I think it makes more sense for this function to produce a shallow "union" tuple, rather than trying to do a deep merge.
I'm not so worried about the cost-tracking of a deep merge, which could be implemented as a different function (though, I'm not sure how useful it would be -- in our boot contracts and from the contracts people have mostly produced, tuples tend not to nest deeply, and when they do, it's rare that only a field of a child tuple would be updated) -- the cost of a deep merge is still linear in the size of the two inputs, and the depth of the merge is bounded by the type depth limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I designed this feature based on what I've been experiencing while writing contracts. I think in 100% of cases, I would have use this function for updating a tuple coming from storage.
If you look at this use case, then you never want to update the "base" type signature, and be as terse as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I designed this feature based on what I've been experiencing while writing contracts. I think in 100% of cases, I would have use this function for updating a tuple coming from storage.
If you look at this use case, then you never want to update the "base" type signature, and be as terse as possible.
This use-case doesn't invalidate the argument for creating a "union" tuple via (set ..)
. A subsequent call to var-set
, map-set
or map-insert
would still need to pass type-check. Assuming we did something like what I suggest above, then this code snippet would not pass analysis, for example:
(define-map foo { a: uint, b : { c: uint } } { d: uint, e: { f: uint } })
(let (
(test (unwrap-panic (map-get { a: u1, b: { c: u2 } })))
) (
;; will fail to type-check, since `(set ..)` has type `{ d: uint, e: { f: uint, g: uint } }`,
;; which is incompatible with the value type definition for `foo`
map-set { a: u1, b: { c: u2 } } (set test { d: u3, e: { g: u4 } }))
))
So, having the "union" behavior shouldn't prevent you from safely loading, updating, and storing structured data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in 100% of cases, I would have use this function for updating a tuple coming from storage.
@lgalabru A common use case for merge
in Clarity is to merge the state in a reducer with updated values. See for example the add-pox-addr-to-ith-reward-cycle
reducer function tin the pox contract, which could use a (merge params {i: ...})
expression to replace the explicit passing of properties in line 264-267:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'll fold
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working on re-implementing this feature, but one problem that we would now be facing, is how do we handle optionals without turning this merge
feature into a footgun - as in what should be the type signature of the following cases:
(merge { a: (some { x: 0, y: 1 }), b: 2, c: 3 } { a: none })
(merge { a: 1, b: 2, c: 3 } { a: none })
(merge { b: 2, c: 3 } { a: none })
Happy to get a new round of thoughts on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do we handle optionals
@lgalabru Option properties should be handled like any other type. Consider that it is is just an enum with two variants (Some and None) just like a Response is an enum with two variants (Ok and Err). Despite the Clarity type name, the property is not optional (clarity-lang/reference#22).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all cases, I think the output type would be the same:
(merge { a: (some { x: 0, y: 1 }), b: 2, c: 3 } { a: none }) -> { a: (optional NoType), b: int, c: int }
(merge { a: 1, b: 2, c: 3 } { a: none }) -> { a: (optional NoType), b: int, c: int }
(merge { b: 2, c: 3 } { a: none }) -> { a: (optional NoType), b: int, c: int }
This should be the natural output from a type checker that just set the output tuple type to be the "shallow" union of the two input tuples. The NoType
here won't present difficulties to the type checker unless someone tried to unwrap!
the optional
, which is a type error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I was not familiar with the type admission rules of NoType
/ UnknownType
, and it looks like this works as expected:
(define-map users uint { address: principal, name: (optional (string-ascii 32)) })
(let
((user (unwrap-panic (map-get? users u0))))
(map-set users u0 (merge user { name: none })))
Approach revisited in 8b5c1f0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need some clarity (no pun intended) in the intended behavior of set
, as outlined in my comments.
# Conflicts: # src/vm/tests/datamaps.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you for addressing my feedback!
src/vm/types/signatures.rs
Outdated
@@ -64,7 +64,7 @@ impl AssetIdentifier { | |||
|
|||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)] | |||
pub struct TupleTypeSignature { | |||
type_map: BTreeMap<ClarityName, TypeSignature>, | |||
pub type_map: BTreeMap<ClarityName, TypeSignature>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be pub
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah when doing the merge operation, I'm draining, in order to avoid cloning. Do you prefer a method returning a mutable ref? or cloning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the slow response -- it looks like the draining/mutation is all happening in shallow_merge
, which is defined in this same module, so the pub
shouldn't be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, got confused by pub
behavior, thanks.
Addressed with 65706f3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lgalabru Perhaps a reference counted pointer could be of use here, which may avoid much of the cloning through structural sharing in persistent data structures.
https://doc.rust-lang.org/book/ch15-04-rc.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM! Only comment is about the visibility of TupleTypeSignature::type_map
, I think that field should remain private.
This PR is addressing #1733 by introducing a
(set base-tuple tuple)
.The initial issue was suggesting
merge
, it could also becombine
, orset
(i personally like the symmetry withget
) - please chime in on the naming :).Todo left: