-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spec: Refactor description of deserialization #128
Conversation
This is a significant rewirte of the section in the spec describing deserialization. Some points worth noting: * This replaces the elaboration relation with one that takes a value and a type as input, and returns a value. This makes it clearer that there is no “input _type_” of deeper relevance. * One great benefit: No more long prose about how the rules assume these types to be principal, even if they aren’t. This clarifies questions like [this one](#126 (comment)) * Also, the existing rules about “elaborating function values” were kinda bogus, as function values are just accepted as they are. This is much easier now, also good. * Unfortuantely for this application, our textual representation has overloading. Instead of defining yet another abstract value algebra, I did a bit of hand-waving, and defined an “overloading-free fragment” of the textual format for this section. * A fair number of rules becomes simpler. Promising! * I was able to phrase some interesting properties more formally. Also promising! Didn’t actually prove them, though, althogh we should. * Still unclear to me what we mean with “subtyping is complete”, see the section there. * Decoding is still an inductive relation, so not fully algorithmic. This may be an issue with a stright-forward formalization; not all theorem provers like negative occurrences of inductive relations in their rules. * Syntax of the relations up for discussion.
@rossberg @chenyan-dfinity , what do you think of this? As an implementor, I think I’d find this presentation more helpful. And if I were to prove soundess etc. formally, then too. |
If the diff is too confusing to read, then head to https://github.com/dfinity/candid/blob/joachim/rewrite/spec/Candid.md#decoding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff, thanks!
spec/Candid.md
Outdated
``` | ||
(∀ v. v :? T ~> _ ⇒ v :? T' ~> _) ⇒ T <: T' | ||
``` | ||
This does not hold as state, because of counter examples involving the empty type. For example we do not have `opt empty <: null`, or `Empty <: t` where `type Empty = rec { Empty }` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect the following formulation:
(∀ v, v'. v : T /\ v :? T' ~> v') ⇒ T <: T'
Is there anything wrong with that? And should it not be a requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That formulation doesn't make sense to me, the antecent is never true.
Should /\
be →
? Or should v'
be behind some ∃?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, oops, wrong quantifier:
(∃ v, v'. v : T /\ v ~> v' : T') ⇒ T <: T'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the antecent is far too weak, it says “T
is a subtype of T'
if some values of type T
make sense at type T'
). This would, for example, relate all list types (use v = []
and v' = []
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about I remove this section (so this can land if all else is fine), and we discuss this in a separate issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, you're right, I should turn my brain on. Okay, last try, if that's wrong as well, feel free to remove it. :)
(∃ v. v : T /\ ∀ v. v : T ⇒ ∃ v'. v ~> v' : T') ⇒ T <: T'
This looks a bit like monster-barring the uninhabited case, but I think it is a natural definition, saying: if serialised values exist for type T, and all such values can be deserialised at type T', then T ought to be considered a subtype of T'.
Uninhabited types tend to make everything harder...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, sorry still not good.
In the antecent instead of ∀ v. v : T ⇒
I am sure you ∀ v. v ~> _ : T ⇒
(i.e. not just the “canonical” values, but really “all values that can be deserialized at T). But I think that’s what you mean anyways.
But empty types still break it: This definition would still imply vec Empty <: vec t
for all t
.
And even then: Remember that coercion never fails for reference values (because we don't do a subtyping check when decoding and coercion), so any definition for completeness that is based on purely the _ ~> _ : _
relation will probably relate all function types, which is obviously not sound (in the non-local; IDL-Soundess.md sense).
Let’s take this into a separate issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to #132
Co-authored-by: Andreas Rossberg <rossberg@dfinity.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly answers (and questions), will perform the changes after lunch.
spec/Candid.md
Outdated
``` | ||
(∀ v. v :? T ~> _ ⇒ v :? T' ~> _) ⇒ T <: T' | ||
``` | ||
This does not hold as state, because of counter examples involving the empty type. For example we do not have `opt empty <: null`, or `Empty <: t` where `type Empty = rec { Empty }` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That formulation doesn't make sense to me, the antecent is never true.
Should /\
be →
? Or should v'
be behind some ∃?
Btw, nit: various inference bars are too short. |
Co-authored-by: Andreas Rossberg <rossberg@dfinity.org>
Ok, all comments handled one way or another |
I’ll give @chenyan-dfinity a chance to comment before merging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. It's indeed much cleaner!
|
||
* Number literals (`<primval>`) must be immediately enclosed with an `<annval>` that clarifies the precise number type. | ||
* The canonical value of type `reserved` is expressed as `(null : reserved)`. | ||
* No other use of `<annval>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need annval for null : opt _
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I am pretty sure we don't! That’s because
- On the left of
~>
, it doesn't matter whichnull
it is (anynull
behaves the same) - On the right of
~>
, the type is given, so again there is only nonull
value.
So it is fine to have a single null
value in the V
set.
This is not true for overloaded numbers; (5 : nat) ~>
and (5 : int) ~>
behave differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, null
has a principal type, numbers don't. (Which makes me wonder, why do we need a special case for reserved
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the algebra of binary values has a dedicated reserved
value (DIDL\x00\x01\x70
), but our textual representation doesn’t. The (null : reserved)
could just as well be ("reserved" : reserved)
or (record {} : reserved)
; it’s arbitrary and, in a way, always lie.
For me, the conceptual “abstract value data model” has a dedicated reserved
value, and I think it would be good to have it in the textual representation as well, to avoid the above wart in the formalism, but also to not force debug output to come up with such arbitrary values when decoding a message with a reserved
value:
~/dfinity/haskell-candid $ cabal -v0 run hcandid -- --decode --rust "DIDL\x00\x01\x70"
((null : reserved))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More evidence that it might be useful to have it: All implementations that have a “value” type seemed to feel the need to add it to their data type:
Co-authored-by: Yan Chen <48968912+chenyan-dfinity@users.noreply.github.com>
This is a significant rewirte of the section in the spec describing
deserialization. Some points worth noting:
This replaces the elaboration relation with one that takes a value
and a type as input, and returns a value. This makes it clearer that
there is no “input type” of deeper relevance.
One great benefit: No more long prose about how the rules assume
these types to be principal, even if they aren’t.
This clarifies questions like this
one
Also, the existing rules about “elaborating function values” were
kinda bogus, as function values are just accepted as they are. This
is much easier now, also good.
Unfortuantely for this application, our textual representation has
overloading. Instead of defining yet another abstract value algebra,
I did a bit of hand-waving, and defined an “overloading-free
fragment” of the textual format for this section.
A fair number of rules becomes simpler. Promising!
I was able to phrase some interesting properties more formally. Also
promising! Didn’t actually prove them, though, although we should.
Still unclear to me what we mean with “subtyping is complete”, see
the section there.
Decoding is still an inductive relation, so not fully algorithmic.
This may be an issue with a stright-forward formalization; not all
theorem provers like negative occurrences of inductive relations in
their rules.
Syntax of the relations up for discussion.