Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for with keyword #923

Merged
merged 16 commits into from
Mar 3, 2020
Merged

Add support for with keyword #923

merged 16 commits into from
Mar 3, 2020

Conversation

Gabriella439
Copy link
Contributor

Fixes #114

Fixes #340

Related to #754

This adds a new keyword which is syntactic sugar for using // to
perform a deep override of the record. This comes in handy when
working with very large records (such as those found in
dhall-kubernetes). This similar in spirit to Kustomize, albeit not
as powerful and customizable as Kustomize.

The goal is that the user can write the following code to update an
existing default configuration:

someDeployment with {
, metadata.labels =
      someDeployment.metadata.labels
    # toMap { `scheduler.alpha.kubernetes.io/critical-pod` = "" }
}

You can also see this idiom outside of the Dhall/Kuberentes ecosystem.
For example, many Nix users manually apply deeply nested updates by
hand, like this example from the Nixpkgs manual:

  {
    haskell = super.haskell // {
      packages = super.haskell.packages // {
        ghc784 = super.haskell.packages.ghc784.override {
          overrides = self: super: {
            ghc-events = self.callPackage ./ghc-events-0.4.3.0.nix {};
          };
        };
      };
    };
  };

... where the Dhall equivalent would be:

super with
  { haskell.packages.ghc784 = super.haskell.packages.ghc784.override {
      
    }
  }

There is one additional change that I plan to make that builds on top of
this which is to would like to have the record completion syntax desugar
to use with instead of //, like this:

T::r = (T.default with r) : T.Type

... which would allow for deep overrides for the record completion
operator, too. However, I chose to defer that to a separate change to
keep this change as minimal as possible.

Fixes #114

Fixes #340

Related to #754

This adds a new keyword which is syntactic sugar for using `//` to
perform a deep override of the record.  This comes in handy when
working with very large records (such as those found in
`dhall-kubernetes`).  This similar in spirit to Kustomize, albeit not
as powerful and customizable as Kustomize.

The goal is that the user can write the following code to update an
existing default configuration:

```dhall
someDeployment with {
, metadata.labels =
      someDeployment.metadata.labels
    # toMap { `scheduler.alpha.kubernetes.io/critical-pod` = "" }
}
```

You can also see this idiom outside of the Dhall/Kuberentes ecosystem.
For example, many Nix users manually apply deeply nested updates by
hand, like this example from the Nixpkgs manual:

```nix
  {
    haskell = super.haskell // {
      packages = super.haskell.packages // {
        ghc784 = super.haskell.packages.ghc784.override {
          overrides = self: super: {
            ghc-events = self.callPackage ./ghc-events-0.4.3.0.nix {};
          };
        };
      };
    };
  };
```

... where the Dhall equivalent would be:

```dhall
super with
  { haskell.packages.ghc784 = super.haskell.packages.ghc784.override {
      …
    }
  }
```

There is one additional change that I plan to make that builds on top of
this which is to would like to have the record completion syntax desugar
to use `with` instead of `//`, like this:

```dhall
T::r = (T.default with r) : T.Type
```

... which would allow for deep overrides for the record completion
operator, too.  However, I chose to defer that to a separate change to
keep this change as minimal as possible.
standard/dhall.abnf Outdated Show resolved Hide resolved
tests/parser/success/unit/withA.dhall Outdated Show resolved Hide resolved
@TristanCacqueray
Copy link
Collaborator

IIUC the implementation, this won't work for Optional nested record, as described in #924 . Could this be supported in the future? I guess this would requires a bit of syntax to differentiate optional, for example using: x with { template.spec?.serviceAccount = Some "builder" }

standard/record.md Outdated Show resolved Hide resolved
... to avoid a desugaring algorithm with exponential code complexity

... as suggested by @Nadrieril
... for consistency with other files in the same directory
... which verifies that they behave the same as multiple updates for a
single `with`

... as suggested by @Nadrieril
... to make room for a simpler test for the base case
@Gabriella439
Copy link
Contributor Author

I forgot to mention that the matching change to the Haskell implementation is here: dhall-lang/dhall-haskell#1685

@sjakobi
Copy link
Collaborator

sjakobi commented Feb 26, 2020

I find it rather tricky to wrap my head around the desugaring rules. Ultimately I'm somewhat uncomfortable that the "override-record" looks like a record but isn't one. I expect this to cause quite a bit of confusion.

Say, for example, you want to change { a.b = 1 } to { a = { b = 1, c.d = 2 } }. Naively, I'd write

{ a.b = 1 } with { a.c.d = 2 }

but that fails with

Error: Missing record field: c

What you have to write is

{ a.b = 1 } with { a.c = { d = 2 } }

I think it's not very easy to understand why only the second version works, especially since the "override-records" are equivalent when interpreted as actual records.

So I want to bring up two alternatives which might help:

1. Pick a syntax for the updates that doesn't look like record literals. For example, if the updates look more like list literals, that might emphasize the importance of the ordering:

Instead of

record with { a = x, b = y }

we could write

record with [ a = x, b = y ]

2. How about implementing this via β-normalization instead of desugaring?! Could that help us regain confluence even when we treat the overrides as a proper record? If we add proper type-checking, that should also help with the error messages. My impression is that this feature might be too complex to handle it via desugaring.

@Gabriella439
Copy link
Contributor Author

@sjakobi @Nadrieril: What if I were to retract this proposal and replace it with another proposal that does the following:

  • Change the right argument to with to be a real record (i.e. no longer a superficially record-like update syntax)

  • Treat with as something handled at β-normalization time rather than desugaring time (like @sjakobi and @Nadrieril suggested)

  • The β-normalization rules would be something like:

    -- Translating multiple updates into multiple chained `with` statements
    r with { k = v,  } = r with { k = v } with {  }
    -- Base case for a single update where the update expression is an immediate record literal
    r with { k = v } = let _ = r in _ // { k = _.k with v }
    
    -- Base case for a single update where the update expression is known not to be a record
    -- literal (e.g. an immediate `Bool` literal)
    r with v = v
    
    -- For all other cases (e.g. bound variables, operator expressions, etc.), don't reduce further
    r with x = r with x

Most of the complexity would be in explicitly enumerating which expressions fit into the latter two categories (i.e. immediate non-record literals vs. abstract expressions), but the upside is that the update expression could then be a real record literal and desugaring/resugaring the update record to add/remove dotted labels wouldn't change the behavior.

@Nadrieril
Copy link
Member

What if I were to retract this proposal and replace it with another proposal that does the following:

That sounds appealing, but then that means there is no way to deeply replace a whole record, and that might be something we want.
Actually, there seems to be a deeper problem. Imagine some kind of generic code that uses with:

let update
    : (T : Type)  (U : Type)  U  { contents : T }  { contents : U }
    =   λ(T : Type)
       λ(U : Type)
       λ(thing : { contents : T })
       λ(newContents : U)
       thing with { contents = newContents }
in ...

With the new proposal, this would be equivalent to:

let update
    : (T : Type)  (U : Type)  U  { contents : T }  { contents : U }
    =   λ(T : Type)
       λ(U : Type)
       λ(thing : { contents : T })
       λ(newContents : U)
       { contents = thing.contents with newContents }
in ...

What happens if we instantiate update on a record type however ?

update { _1 : Bool } { _2 : Natural } { _1 = True } { _2 = 0 }
=== { contents = { _1 = True } with { _2 = 0 } }
=== { contents = { _1 = True, _2 = 0 } }

We just broke type soundness: the return value of update should have been { contents : { _2 : Natural } }.

More generally, we would need an equivalent of with at the type level if we wanted that to be sound, and update would have return type { contents : T With U }. However I believe most users would expect to be able to write update like I did, which becomes impossible if we had a type-level with. This, plus @TristanCacqueray's suggestion, makes me prefer the current proposal to the new one.

How about we change the syntax however, and change it to record with x.y.z = 42 ? with would look a bit like let, and subsequent withs would be aligned like lets:

record
	with x.y.z = 42
	with a.b.c = False
	with foo.bar?.baz = Some "string"

Then the confusion with records disappears.

@sjakobi
Copy link
Collaborator

sjakobi commented Feb 27, 2020

@Nadrieril I like the syntax you propose! My example from #923 (comment) would look like this, right?

{ a.b = 1 } with a.c = { d = 2 }
=> { a = { b = 1, c.d = 2 } }

To replace the entire a field instead, I could write

{ a.b = 1 } with a = { c.d = 2 }
=> { a.c.d = 2 }

And this would still be illegal I assume (although it might be nice if we could make it work):

{ a.b = 1 } with a.c.d = 2

Regarding the ?. syntax for Optional fields, I think we should maybe add that separately later on.

@Gabriella439
Copy link
Contributor Author

@Nadrieril: Yeah, I like your idea. I'll amend this proposal to use that syntax instead

... as suggested by @Nadrieril

This requires changing the relative precedence of
`with-expression` and `application-expression`
@Gabriella439
Copy link
Contributor Author

@Nadrieril @sjakobi: Alright, I updated the proposal so that the new syntax is r with ks = v

standard/record.md Outdated Show resolved Hide resolved
... by adding the case-sensitive `with` rule and using that instead of
the case-insensitive "with"
Copy link
Collaborator

@sjakobi sjakobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add another parser test that updates a field named Some?

I'm very happy how this proposal has shaped up, and I'm honestly a bit proud to have been involved in the discussion. :)

Great work!

standard/record.md Outdated Show resolved Hide resolved
@Nadrieril
Copy link
Member

I've thought more about how I would implement this and I have changed my thoughts a bit:

  • What I said about shift is actually incorrect: since we desugar at parse time there is no way around using shift. This makes me reconsider my position, because shift is both inefficient and conceptually ugly: the whole subtree needs to be traversed and entirely reallocated. This also prevents phase fusion, where one would start resolving imports and typechecking before parsing is finished.
  • Desugaring makes it quite harder to provide clear error messages, which are already difficult to implement well.

Considering this, I'd prefer to have with handled like ::, i.e. with a desugaring that happens after parse time. This would give more freedom to implementations to give informative error messages and/or avoid unnecessary shifting. We would just need to move the desugaring after the parse phase, and add a binary encoding for the with operator.

@sjakobi
Copy link
Collaborator

sjakobi commented Mar 1, 2020

FWIW I played a bit with the Haskell implementation and thought that the error messages were mostly okay, e.g.

⊢ { a.b = 1 } with c.d = 2

Error: Missing record field: c

1│ { a.b = 1 } with c.d = 2

This one wasn't so good though, even with --explain:

⊢ { a = 1 } with a.b = 2

Error: You can only combine records

1│ { a = 1 } with a.b = 2
⊢ :set --explain
⊢ { a = 1 } with a.b = 2

it : { a : { b : Natural }, c : Natural }
_ : { a : Natural }

Error: You can only combine records

Explanation: You can combine records using the ❰⫽❱ operator, like this:


    ┌───────────────────────────────────────────┐
    │ { foo = 1, bar = "ABC" } ⫽ { baz = True } │
    └───────────────────────────────────────────┘


    ┌─────────────────────────────────────────────┐
    │ λ(r : { foo : Bool }) → r ⫽ { bar = "ABC" } │
    └─────────────────────────────────────────────┘


... but you cannot combine values that are not records.

For example, the following expressions are not valid:


    ┌──────────────────────────────┐
    │ { foo = 1, bar = "ABC" } ⫽ 1 │
    └──────────────────────────────┘
                                 ⇧
                                 Invalid: Not a record


    ┌───────────────────────────────────────────┐
    │ { foo = 1, bar = "ABC" } ⫽ { baz : Bool } │
    └───────────────────────────────────────────┘
                                 ⇧
                                 Invalid: This is a record type and not a record


    ┌───────────────────────────────────────────┐
    │ { foo = 1, bar = "ABC" } ⫽ < baz : Bool > │
    └───────────────────────────────────────────┘
                                 ⇧
                                 Invalid: This is a union type and not a record


You tried to combine the following value:

↳ _.a

... which is not a record, but is actually a:

↳ { b = 2 }

────────────────────────────────────────────────────────────────────────────────

1│ { a = 1 } with a.b = 2

(stdin):1:1

Maybe there's a simple way to improve this error?

@Nadrieril
Copy link
Member

This is the ideal I had in mind (in the style of dhall-rust and rustc errors):

⊢ { a.b = 1 } with c.d = 2

Error: Missing record field: c

1│ { a.b = 1 } with c.d = 2
                    ^ you tried to modify the `c` field here
   ^^^^^^^^^^^ but this record has no `c` field
note: In a `with` expression, you can't modify subfields if the parent field doesn't already exist
⊢ { a = 1 } with a.b = 2

Error: This is not a record

1│ { a = 1 } with a.b = 2
                  ^^^ you tried to modify a subfield of `a` here
     ^^^^^ but the type of the `a` field is not a record, it is `Natural`
note: In a `with` expression, you can only modify subfields if the parent field is a record
help: Did you want to replace the `a` field altogether ? Then try `... with a = { b = 2 }`.

The --explain approach doesn't satisfy me: I want to show the user the reason for the error, not all possible reasons. It surely can be done after a desugaring, but imho that makes it way harder than it needs to be.

@Gabriella439
Copy link
Contributor Author

@Nadrieril: What if we used the originally simpler desugaring without the let? After all, users can always introduce the let themselves as an optimization and most of the time I expect the left argument to with to already be a bound variable.

As far as error messages go, the Haskell implementation already has a separate constructor for with and then desugars it later on, precisely to reserve the ability to improve error messages (although it doesn't do so, yet). However, I think a simpler solution that captures what you want is to add a generic note that an implementation may choose to delay desugaring in order to improve error messages.

@Nadrieril
Copy link
Member

As far as error messages go, the Haskell implementation already has a separate constructor for with and then desugars it later on, precisely to reserve the ability to improve error messages (although it doesn't do so, yet). However, I think a simpler solution that captures what you want is to add a generic note that an implementation may choose to delay desugaring in order to improve error messages.

Huh, that option hadn't crossed my mind, thanks. I was too attached to having my AST match the one specified by the binary encoding. With this new option open, I'm way less concerned about the particular choice we make for this PR.

@Nadrieril: What if we used the originally simpler desugaring without the let? After all, users can always introduce the let themselves as an optimization and most of the time I expect the left argument to with to already be a bound variable.

Yeah, I'm happy with that. Since I now understand that implementations are free to do something more clever I don't believe this is a major problem anymore.

@Gabriella439 Gabriella439 merged commit 4c180d8 into master Mar 3, 2020
@Gabriella439 Gabriella439 deleted the gabriel/with_2 branch March 3, 2020 03:32
mergify bot added a commit to dhall-lang/dhall-haskell that referenced this pull request Mar 15, 2020
* Implement `with` keyword

... as standardized in dhall-lang/dhall-lang#923

* Implement new desugaring logic

* Update `dhall-lang` reference

* Update to reflect new `with` syntax

* Update to latest draft of proposal

* Switch to simpler desugaring

* Relocate `desugarWith` back to `Dhall.Syntax` module

* Fix downstream build failures

* Update `dhall-lang` to `master`

* Improve formatting of `with` expressions

... as suggested by @sjakobi

* Fix warnings

* Improve error messages

... both for `with` and `//`

... as requested by @sjakobi

* Use a `PreferAnnotation` type

... to record the origin of the `Prefer` constructor

... as suggested by @sjakobi

* Preserve the original `with` expression

... in order to improve the error for `with`

* Fix error message for `MustCombineRecord`

... as caught by @sjakobi

* Fix build failure for benchmark suite

* Fix test build failure

* Fix property test failure

* Fix benchmark build failure

... for real this time 😅

* Add missing documentation for `PreferAnnotation`

* Add tutorial documentation for `with`

... and for nested labels, which was missing

* Fix `dhall-nix` build failure

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Simon Jakobi <simon.jakobi@gmail.com>
@PierreR
Copy link

PierreR commented May 6, 2020

@is the with keyword explained in the documentation already ?

@sjakobi
Copy link
Collaborator

sjakobi commented May 6, 2020

@PierreR
Copy link

PierreR commented May 6, 2020

@sjakobi Thanks. This "with" keyword is a difficult candidate for the search box ;-)

Gabriella439 added a commit that referenced this pull request Aug 2, 2020
The motivation for this change is:

dhall-lang/dhall-haskell#1960

… where a deeply nested and chained `with` expression caused a blow-up
in the size of the desugared syntax tree.

This alternative desugaring was actually an idea we contemplated in the
original change that standardized `with`:

#923

… but at the time we elected to go with the simpler desugaring, reasoning
that users could work around performance issues with explicit `let`
binding:

#923 (comment)

However, for chained `with` expressions that work-around is not
ergonomic, as illustrated in
dhall-lang/dhall-haskell#1960.  Instead,
this adds language support for the `let` work-around.

The change to the `WithMultiple` test demonstrates that this leads to
a more compact desugared expression.  Before this change the
expression `{ a.b = 1, c.d = 2 } with a.b = 3 with c.e = 4` would
desugar to:

```dhall
  { a.b = 1, c.d = 2 }
⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } }
⫽ { c =
        ({ a.b = 1, c.d = 2 } ⫽ { a = { a.b = 1, c.d = 2 }.a ⫽ { b = 3 } }).c
      ⫽ { e = 4 }
  }
```

… and now the same expression desugars to:

```dhall
let _ = let _ = { a.b = 1, c.d = 2 } in _ ⫽ { a = _.a ⫽ { b = 3 } }

in  _ ⫽ { c = _.c ⫽ { e = 4 } }
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Non-Recursive Deep Record Merges Allow deep combining without failure
6 participants