Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spec: Refactor description of deserialization #128

Merged
merged 12 commits into from
Nov 4, 2020
Merged

Conversation

nomeata
Copy link
Collaborator

@nomeata nomeata commented Nov 2, 2020

This is a significant rewirte of the section in the spec describing
deserialization. Some points worth noting:

  • This replaces the elaboration relation with one that takes a value
    and a type as input, and returns a value. This makes it clearer that
    there is no “input type” of deeper relevance.

  • One great benefit: No more long prose about how the rules assume
    these types to be principal, even if they aren’t.

    This clarifies questions like this
    one

  • Also, the existing rules about “elaborating function values” were
    kinda bogus, as function values are just accepted as they are. This
    is much easier now, also good.

  • Unfortuantely for this application, our textual representation has
    overloading. Instead of defining yet another abstract value algebra,
    I did a bit of hand-waving, and defined an “overloading-free
    fragment” of the textual format for this section.

  • A fair number of rules becomes simpler. Promising!

  • I was able to phrase some interesting properties more formally. Also
    promising! Didn’t actually prove them, though, although we should.

  • Still unclear to me what we mean with “subtyping is complete”, see
    the section there.

  • Decoding is still an inductive relation, so not fully algorithmic.
    This may be an issue with a stright-forward formalization; not all
    theorem provers like negative occurrences of inductive relations in
    their rules.

  • Syntax of the relations up for discussion.

This is a significant rewirte of the section in the spec describing
deserialization. Some points worth noting:

 * This replaces the elaboration relation with one that takes a value
   and a type as input, and returns a value. This makes it clearer that
   there is no “input _type_” of deeper relevance.

 * One great benefit: No more long prose about how the rules assume
   these types to be principal, even if they aren’t.

   This clarifies questions like [this
   one](#126 (comment))

 * Also, the existing rules about “elaborating function values” were
   kinda bogus, as function values are just accepted as they are. This
   is much easier now, also good.

 * Unfortuantely for this application, our textual representation has
   overloading. Instead of defining yet another abstract value algebra,
   I did a bit of hand-waving, and defined an “overloading-free
   fragment” of the textual format for this section.

 * A fair number of rules becomes simpler. Promising!

 * I was able to phrase some interesting properties more formally. Also
   promising! Didn’t actually prove them, though, althogh we should.

 * Still unclear to me what we mean with “subtyping is complete”, see
   the section there.

 * Decoding is still an inductive relation, so not fully algorithmic.
   This may be an issue with a stright-forward formalization; not all
   theorem provers like negative occurrences of inductive relations in
   their rules.

 * Syntax of the relations up for discussion.
@nomeata
Copy link
Collaborator Author

nomeata commented Nov 2, 2020

@rossberg @chenyan-dfinity , what do you think of this?

As an implementor, I think I’d find this presentation more helpful. And if I were to prove soundess etc. formally, then too.

@nomeata
Copy link
Collaborator Author

nomeata commented Nov 2, 2020

If the diff is too confusing to read, then head to https://github.com/dfinity/candid/blob/joachim/rewrite/spec/Candid.md#decoding

@nomeata nomeata requested a review from rossberg November 3, 2020 07:25
Copy link
Contributor

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff, thanks!

spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated
```
(∀ v. v :? T ~> _ ⇒ v :? T' ~> _) ⇒ T <: T'
```
This does not hold as state, because of counter examples involving the empty type. For example we do not have `opt empty <: null`, or `Empty <: t` where `type Empty = rec { Empty }`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the following formulation:

(∀ v, v'. v : T /\ v :? T' ~> v') ⇒ T <: T'

Is there anything wrong with that? And should it not be a requirement?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That formulation doesn't make sense to me, the antecent is never true.

Should /\ be ? Or should v' be behind some ∃?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, oops, wrong quantifier:

(∃ v, v'. v : T /\ v ~> v' : T') ⇒ T <: T'

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the antecent is far too weak, it says “T is a subtype of T' if some values of type T make sense at type T'). This would, for example, relate all list types (use v = [] and v' = []).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about I remove this section (so this can land if all else is fine), and we discuss this in a separate issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, you're right, I should turn my brain on. Okay, last try, if that's wrong as well, feel free to remove it. :)

(∃ v. v : T /\ ∀ v. v : T ⇒ ∃ v'. v ~> v' : T') ⇒ T <: T'

This looks a bit like monster-barring the uninhabited case, but I think it is a natural definition, saying: if serialised values exist for type T, and all such values can be deserialised at type T', then T ought to be considered a subtype of T'.

Uninhabited types tend to make everything harder...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, sorry still not good.

In the antecent instead of ∀ v. v : T ⇒ I am sure you ∀ v. v ~> _ : T ⇒ (i.e. not just the “canonical” values, but really “all values that can be deserialized at T). But I think that’s what you mean anyways.

But empty types still break it: This definition would still imply vec Empty <: vec t for all t.

And even then: Remember that coercion never fails for reference values (because we don't do a subtyping check when decoding and coercion), so any definition for completeness that is based on purely the _ ~> _ : _ relation will probably relate all function types, which is obviously not sound (in the non-local; IDL-Soundess.md sense).

Let’s take this into a separate issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to #132

@nomeata nomeata changed the title Spec: Refactor description of deserializatoin Spec: Refactor description of deserialization Nov 3, 2020
spec/Candid.md Outdated Show resolved Hide resolved
Co-authored-by: Andreas Rossberg <rossberg@dfinity.org>
Copy link
Collaborator Author

@nomeata nomeata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly answers (and questions), will perform the changes after lunch.

spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated
```
(∀ v. v :? T ~> _ ⇒ v :? T' ~> _) ⇒ T <: T'
```
This does not hold as state, because of counter examples involving the empty type. For example we do not have `opt empty <: null`, or `Empty <: t` where `type Empty = rec { Empty }`
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That formulation doesn't make sense to me, the antecent is never true.

Should /\ be ? Or should v' be behind some ∃?

@rossberg
Copy link
Contributor

rossberg commented Nov 3, 2020

Btw, nit: various inference bars are too short.

spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
spec/Candid.md Outdated Show resolved Hide resolved
nomeata and others added 2 commits November 3, 2020 15:44
Co-authored-by: Andreas Rossberg <rossberg@dfinity.org>
@nomeata
Copy link
Collaborator Author

nomeata commented Nov 3, 2020

Ok, all comments handled one way or another

@nomeata
Copy link
Collaborator Author

nomeata commented Nov 3, 2020

I’ll give @chenyan-dfinity a chance to comment before merging

nomeata added a commit that referenced this pull request Nov 3, 2020
Copy link
Contributor

@chenyan-dfinity chenyan-dfinity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It's indeed much cleaner!

spec/Candid.md Outdated Show resolved Hide resolved

* Number literals (`<primval>`) must be immediately enclosed with an `<annval>` that clarifies the precise number type.
* The canonical value of type `reserved` is expressed as `(null : reserved)`.
* No other use of `<annval>`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need annval for null : opt _

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am pretty sure we don't! That’s because

  • On the left of ~>, it doesn't matter which null it is (any null behaves the same)
  • On the right of ~>, the type is given, so again there is only no null value.

So it is fine to have a single null value in the V set.

This is not true for overloaded numbers; (5 : nat) ~> and (5 : int) ~> behave differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, null has a principal type, numbers don't. (Which makes me wonder, why do we need a special case for reserved?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the algebra of binary values has a dedicated reserved value (DIDL\x00\x01\x70), but our textual representation doesn’t. The (null : reserved) could just as well be ("reserved" : reserved) or (record {} : reserved); it’s arbitrary and, in a way, always lie.

For me, the conceptual “abstract value data model” has a dedicated reserved value, and I think it would be good to have it in the textual representation as well, to avoid the above wart in the formalism, but also to not force debug output to come up with such arbitrary values when decoding a message with a reserved value:

~/dfinity/haskell-candid $ cabal -v0 run hcandid -- --decode --rust "DIDL\x00\x01\x70"
((null : reserved))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More evidence that it might be useful to have it: All implementations that have a “value” type seemed to feel the need to add it to their data type:

spec/Candid.md Outdated Show resolved Hide resolved
Co-authored-by: Yan Chen <48968912+chenyan-dfinity@users.noreply.github.com>
@nomeata nomeata merged commit 03773cc into master Nov 4, 2020
@nomeata nomeata deleted the joachim/rewrite branch November 4, 2020 09:19
nomeata added a commit that referenced this pull request Nov 21, 2020
this adds some tests to account for the new behaviour of
#110 and #128 and #134 and #137.

Co-authored-by: chenyan-dfinity <yan.chen@dfinity.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants