New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Less heavy unions #163

Closed
aleator opened this Issue Oct 18, 2017 · 9 comments

Comments

Projects
None yet
3 participants
@aleator
Contributor

aleator commented Oct 18, 2017

I find handling larger sum types a bit awkward. Consider for example
a simple 5-field sum. Each time a literal value is needed, one needs to enumerate
all the fields, which is, at minimum, this long:

<A=+1 | B=./b | C=./c | D=./d | E=./e>

This really doesn't encourage one to use larger sums. What I would like is a way to give a name for the 'other options' part. For example, I think it could be possible to do something like this

<A=+1 | ./theSumType>

Or even,

<A=+1> : ./theSumType

which would mirror Optional.

Observing Dhall.Core, union literals are coded as

UnionLit Text (Expr s a) (Map Text (Expr s a))

Where the second part is suspiciously similar to

Union (Map Text (Expr s a))

So I would guess that implementing this would require extending the syntax a slight bit and allow mentioning the used sum-field in the list of alternatives.

Alternatively, one could consider:

  • Open sums, where any sum is a subtype of a larger sum including the compatible field
  • A dhall-format like tool that would autogenerate constructors for sums
  • Having a syntactic sugar that would create constructors when necessary
@Gabriel439

This comment has been minimized.

Show comment
Hide comment
@Gabriel439

Gabriel439 Oct 18, 2017

Collaborator

Currently, the recommended approach is to define a constructor function for each alternative as either alet-binding or a file. Assuming we go with files, we would create:

$ cat ./A
λ(x : Integer) → < A = x | B : Bool | C : Text >
$ cat ./B
λ(x : Bool) → < A : Integer | B = x | C : Text >
$ cat ./C
λ(x : Text) → < A : Integer | B : Bool | C = x >

... and then you could use these constructors like this:

[ ./A 1
, ./B True
, ./B False
, ./A 2
, ./C "ABC"
]

If we were do to the same thing with let-bindings, it would be:

    let A = λ(x : Integer)  < A = x | B : Bool | C : Text >
in  let B = λ(x : Bool)  < A : Integer | B = x | C : Text >
in  let C = λ(x : Text)  < A : Integer | B : Bool | C = x >
in  [ A 1
    , B True
    , B False
    , A 2
    , C "ABC"
    ]

So that minimizes the overhead of union alternatives by only requiring you to write out the full union literal once (in the files or let-bindings for the constructors)

Out of the alternatives you proposed, the one I prefer is syntactic sugar for defining constructors. In other words, something like the following hypothetical syntax:

    data < A : Integer | B : Bool | C : Text >
in  ...

... although perhaps the keyword data could be renamed to another keyword. And I think this can be done in such a way that the union type argument to data could itself be imported, so you could do:

    data ./MySumType
in  ...

Also, for completeness I'll also address the strongest competing alternative you mentioned, which is open sums (a.k.a. polymorphic variants), since this question comes up frequently.

I chose not to support open sums because of a broader policy to not support any form of subtyping. This is the same reason why Dhall uses explicit type-abstraction/type-application (since automatic specialization of polymorphic types is a form of subtyping) and this is also the same reason why Dhall does not support row polymorphism.

The main reason why Dhall does not support subtyping is philosophical: Dhall is an experiment to see if a "fully functional" language is viable. "Fully functional" is a term I made up to denote the absence of logic-style programming at the type level (such as unification, resolving type class constraints, or checking refined types). I love Haskell, but one of my biggest criticisms of the Haskell language and ecosystem is that people seem "embarrassed" to use ordinary functional programming to solve problems and instead rely heavily on type-level Prolog to program. Dhall is basically a reaction to that to try to swing the pendulum back in the other direction to focus more on term-level functional programming.

There are practical benefits to being fully functional, but the philosophical motivation is the core reason. The main practical benefits of being fully functional are:

  • it's easier to teach
    • you don't have to teach language users to reason about unification
    • everything is just functions
  • it plays nice with Dhall's import system
    • If you support type inference and Dhall-style expression imports then you run into a problem very similar to the "let generalization" problem Haskell has
  • easy to implement
    • this matters more for bindings to other languages
Collaborator

Gabriel439 commented Oct 18, 2017

Currently, the recommended approach is to define a constructor function for each alternative as either alet-binding or a file. Assuming we go with files, we would create:

$ cat ./A
λ(x : Integer) → < A = x | B : Bool | C : Text >
$ cat ./B
λ(x : Bool) → < A : Integer | B = x | C : Text >
$ cat ./C
λ(x : Text) → < A : Integer | B : Bool | C = x >

... and then you could use these constructors like this:

[ ./A 1
, ./B True
, ./B False
, ./A 2
, ./C "ABC"
]

If we were do to the same thing with let-bindings, it would be:

    let A = λ(x : Integer)  < A = x | B : Bool | C : Text >
in  let B = λ(x : Bool)  < A : Integer | B = x | C : Text >
in  let C = λ(x : Text)  < A : Integer | B : Bool | C = x >
in  [ A 1
    , B True
    , B False
    , A 2
    , C "ABC"
    ]

So that minimizes the overhead of union alternatives by only requiring you to write out the full union literal once (in the files or let-bindings for the constructors)

Out of the alternatives you proposed, the one I prefer is syntactic sugar for defining constructors. In other words, something like the following hypothetical syntax:

    data < A : Integer | B : Bool | C : Text >
in  ...

... although perhaps the keyword data could be renamed to another keyword. And I think this can be done in such a way that the union type argument to data could itself be imported, so you could do:

    data ./MySumType
in  ...

Also, for completeness I'll also address the strongest competing alternative you mentioned, which is open sums (a.k.a. polymorphic variants), since this question comes up frequently.

I chose not to support open sums because of a broader policy to not support any form of subtyping. This is the same reason why Dhall uses explicit type-abstraction/type-application (since automatic specialization of polymorphic types is a form of subtyping) and this is also the same reason why Dhall does not support row polymorphism.

The main reason why Dhall does not support subtyping is philosophical: Dhall is an experiment to see if a "fully functional" language is viable. "Fully functional" is a term I made up to denote the absence of logic-style programming at the type level (such as unification, resolving type class constraints, or checking refined types). I love Haskell, but one of my biggest criticisms of the Haskell language and ecosystem is that people seem "embarrassed" to use ordinary functional programming to solve problems and instead rely heavily on type-level Prolog to program. Dhall is basically a reaction to that to try to swing the pendulum back in the other direction to focus more on term-level functional programming.

There are practical benefits to being fully functional, but the philosophical motivation is the core reason. The main practical benefits of being fully functional are:

  • it's easier to teach
    • you don't have to teach language users to reason about unification
    • everything is just functions
  • it plays nice with Dhall's import system
    • If you support type inference and Dhall-style expression imports then you run into a problem very similar to the "let generalization" problem Haskell has
  • easy to implement
    • this matters more for bindings to other languages
@aleator

This comment has been minimized.

Show comment
Hide comment
@aleator

aleator Oct 19, 2017

Contributor

Thank you for the detailed response!

I already have a bit of Haskell code that exports dhall types for my Haskell records. Also, I've got beginnings of code that exports the constructors for unions types. Doing this with types not defined in Haskell feels bit inelegant, since generated and written dhall files have to be kept in sync manually.

I fully appreciate the design philosophy but I would like to hear your opinions on the following two ideas:

a)

let theSumType = <A:Nat | B:{} | C:{foo:Text,bar:Text}>
in <A=+1 | fields theSumType>

Here, dhall would have syntactic sugar for naming a set of fields. The fields (better name is in order) would expand to A:Nat | B:{} | C:{foo:Text,bar:Text}.

The negatives I can think of are, firstly that fields theSumType is not an expression by itself, and secondly, that this is just syntactic sugar. However, unlike your data proposal, this would not bring new variables to scope. I would consider that a definite plus. It might not be clear what variables are introduced if constructors are derived for huge sums.

b)

let theSumType = <A:Nat | B:{} | C:{foo:Text,bar:Text}>
in <A=+1> : theSumType 

In this idea, there is no syntactic sugar, but the type annotation would be integral part to sum literals. This would mirror the situation with Optional which requires a similar type annotation. On the downside this looks like an open sum but isn't.

Contributor

aleator commented Oct 19, 2017

Thank you for the detailed response!

I already have a bit of Haskell code that exports dhall types for my Haskell records. Also, I've got beginnings of code that exports the constructors for unions types. Doing this with types not defined in Haskell feels bit inelegant, since generated and written dhall files have to be kept in sync manually.

I fully appreciate the design philosophy but I would like to hear your opinions on the following two ideas:

a)

let theSumType = <A:Nat | B:{} | C:{foo:Text,bar:Text}>
in <A=+1 | fields theSumType>

Here, dhall would have syntactic sugar for naming a set of fields. The fields (better name is in order) would expand to A:Nat | B:{} | C:{foo:Text,bar:Text}.

The negatives I can think of are, firstly that fields theSumType is not an expression by itself, and secondly, that this is just syntactic sugar. However, unlike your data proposal, this would not bring new variables to scope. I would consider that a definite plus. It might not be clear what variables are introduced if constructors are derived for huge sums.

b)

let theSumType = <A:Nat | B:{} | C:{foo:Text,bar:Text}>
in <A=+1> : theSumType 

In this idea, there is no syntactic sugar, but the type annotation would be integral part to sum literals. This would mirror the situation with Optional which requires a similar type annotation. On the downside this looks like an open sum but isn't.

@benjamin-travis-summers

This comment has been minimized.

Show comment
Hide comment
@benjamin-travis-summers

benjamin-travis-summers Oct 19, 2017

How about syntactic sugar for creating a record full of constructors?

Something like:

let Sum = SOMEKEYWORD < A : Integer | B : Bool | C : Text >
in  [ Sum.A 1
    , Sum.B True
    , Sum.B False
    , Sum.A 2
    , Sum.C "ABC"
    ]

Something like this would help a lot, but large and fully normalized sums are still absurdly verbose.

benjamin-travis-summers commented Oct 19, 2017

How about syntactic sugar for creating a record full of constructors?

Something like:

let Sum = SOMEKEYWORD < A : Integer | B : Bool | C : Text >
in  [ Sum.A 1
    , Sum.B True
    , Sum.B False
    , Sum.A 2
    , Sum.C "ABC"
    ]

Something like this would help a lot, but large and fully normalized sums are still absurdly verbose.

@Gabriel439

This comment has been minimized.

Show comment
Hide comment
@Gabriel439

Gabriel439 Oct 21, 2017

Collaborator

@aleator: The problem with both proposals is that Dhall does not support Haskell-style type synonyms defined using let-bindings. In other words:

    let theSumType = <A:Nat | B:{} | C:{foo:Text,bar:Text}>
in  ...

... would not work because Dhall's type-checker would not know that theSumType and <A:Nat | B:{} | C:{foo:Text,bar:Text}> are the same type. The fundamental issue here is that type-checking precedes normalization and there is no way for Dhall to substitute a let-defined value like theSumType for its right-hand-side until the normalization phase. If you try to substitute let-defined values before the type-checking phase is complete then you can introduce infinite loops.

That in turns means that the only way you would be able to use this feature would be to import the type annotation from a path, but I would like this sum-type simplification feature to be usable without Dhall's import system. This is the reason I suggested a proposal that only requires specifying the type once because it sidesteps Dhall's limitations around repetitive types

@benjamin-travis-summers: I like your proposal since it avoids cluttering the local namespace and it also doesn't technically require a let-binding at all (although that would still be the common case). You could in theory just write the following standalone expression:

SOMEKEYWORD < A : Integer | B : Bool | C : Text >

... and that would be legal Dhall code representing some record

The other reason I like it is because it's way easier to implement

@aleator: I think @benjamin-travis-summers's proposal would still be compatible with your use case, because you could still emit a single Dhall file containing the Dhall representation of your Haskell type, such as ./MyHaskellType.dhall, and then in the rest of your Dhall code you would do:

    let MyHaskellType = SOMEKEYWORD ./MyHaskellType.dhall
in  [ MyHaskellType.A +0, MyHaskellType.B {=}, ... ]

I think the main thing that's left is deciding what SOMEKEYWORD should be. Some ideas I'll throw out:

  • data (to resemble a Haskell data definition)
  • alternatives (consistent with Dhall's terminology)
  • constructors (consistent with Haskell's terminology)
Collaborator

Gabriel439 commented Oct 21, 2017

@aleator: The problem with both proposals is that Dhall does not support Haskell-style type synonyms defined using let-bindings. In other words:

    let theSumType = <A:Nat | B:{} | C:{foo:Text,bar:Text}>
in  ...

... would not work because Dhall's type-checker would not know that theSumType and <A:Nat | B:{} | C:{foo:Text,bar:Text}> are the same type. The fundamental issue here is that type-checking precedes normalization and there is no way for Dhall to substitute a let-defined value like theSumType for its right-hand-side until the normalization phase. If you try to substitute let-defined values before the type-checking phase is complete then you can introduce infinite loops.

That in turns means that the only way you would be able to use this feature would be to import the type annotation from a path, but I would like this sum-type simplification feature to be usable without Dhall's import system. This is the reason I suggested a proposal that only requires specifying the type once because it sidesteps Dhall's limitations around repetitive types

@benjamin-travis-summers: I like your proposal since it avoids cluttering the local namespace and it also doesn't technically require a let-binding at all (although that would still be the common case). You could in theory just write the following standalone expression:

SOMEKEYWORD < A : Integer | B : Bool | C : Text >

... and that would be legal Dhall code representing some record

The other reason I like it is because it's way easier to implement

@aleator: I think @benjamin-travis-summers's proposal would still be compatible with your use case, because you could still emit a single Dhall file containing the Dhall representation of your Haskell type, such as ./MyHaskellType.dhall, and then in the rest of your Dhall code you would do:

    let MyHaskellType = SOMEKEYWORD ./MyHaskellType.dhall
in  [ MyHaskellType.A +0, MyHaskellType.B {=}, ... ]

I think the main thing that's left is deciding what SOMEKEYWORD should be. Some ideas I'll throw out:

  • data (to resemble a Haskell data definition)
  • alternatives (consistent with Dhall's terminology)
  • constructors (consistent with Haskell's terminology)
@aleator

This comment has been minimized.

Show comment
Hide comment
@aleator

aleator Oct 24, 2017

Contributor

I think @benjamin-travis-summers suggestion would be sufficient for my use case. I still have slight doubt whether there could be something even nicer, but I can't think of anything.

I vote for keyword constructors.

Contributor

aleator commented Oct 24, 2017

I think @benjamin-travis-summers suggestion would be sufficient for my use case. I still have slight doubt whether there could be something even nicer, but I can't think of anything.

I vote for keyword constructors.

@Gabriel439

This comment has been minimized.

Show comment
Hide comment
@Gabriel439

Gabriel439 Nov 4, 2017

Collaborator

So I will probably implement this using constructors as the keyword, but I want to postpone the implementation of this until after I'm doing standardizing the language so that I can test drive the process of adding a new language feature through a standardized process

Collaborator

Gabriel439 commented Nov 4, 2017

So I will probably implement this using constructors as the keyword, but I want to postpone the implementation of this until after I'm doing standardizing the language so that I can test drive the process of adding a new language feature through a standardized process

Gabriel439 added a commit that referenced this issue Dec 29, 2017

Add `constructors` keyword
Fixes #163

This provides a new `constructors` keyword which can be used to create a
record of constructors from a union type literal.  The record contains
one constructor per alternative in the union type.
@Gabriel439

This comment has been minimized.

Show comment
Hide comment
@Gabriel439

Gabriel439 Dec 29, 2017

Collaborator

Alright, I have a pull request out with support for the constructors keyword: #199

Let me know if that satisfies your use case

Collaborator

Gabriel439 commented Dec 29, 2017

Alright, I have a pull request out with support for the constructors keyword: #199

Let me know if that satisfies your use case

@aleator

This comment has been minimized.

Show comment
Hide comment
@aleator

aleator Jan 3, 2018

Contributor

I think it will work fine for me.

Contributor

aleator commented Jan 3, 2018

I think it will work fine for me.

Gabriel439 added a commit that referenced this issue Jan 3, 2018

Add `constructors` keyword (#199)
Fixes #163

This provides a new `constructors` keyword which can be used to create a
record of constructors from a union type literal.  The record contains
one constructor per alternative in the union type.
@Gabriel439

This comment has been minimized.

Show comment
Hide comment
@Gabriel439

Gabriel439 Jan 3, 2018

Collaborator

Great! Also, thank you for requesting this feature :)

Collaborator

Gabriel439 commented Jan 3, 2018

Great! Also, thank you for requesting this feature :)

Gabriel439 added a commit to dhall-lang/dhall-lang that referenced this issue Jan 16, 2018

Type synonym support
This changes the standard semantics to support type synonyms, which is one of
the most common feature requests.  Here are some example requests:

* dhall-lang/dhall-haskell#10
* dhall-lang/dhall-haskell#55
* dhall-lang/dhall-haskell#92
* dhall-lang/dhall-haskell#163
* dhall-lang/dhall-haskell#176
* dhall-lang/dhall-haskell#201
* #10

The change to the semantics is pretty simple: don't require that `let`
expressions type-check as an equivalent anonymous function.  Instead, just
perform substitution and type-check the substituted expression.  The semantics
are still sound and this behaves the way users expect.

For example, the following expression would be rejected before this change
and accepted after this change:

```haskell
let t = Integer in 1 : t
```

Gabriel439 added a commit to dhall-lang/dhall-lang that referenced this issue Jan 18, 2018

Type synonym support (#69)
* Type synonym support

This changes the standard semantics to support type synonyms, which is one of
the most common feature requests.  Here are some example requests:

* dhall-lang/dhall-haskell#10
* dhall-lang/dhall-haskell#55
* dhall-lang/dhall-haskell#92
* dhall-lang/dhall-haskell#163
* dhall-lang/dhall-haskell#176
* dhall-lang/dhall-haskell#201
* #10

The change to the semantics is pretty simple: don't require that `let`
expressions type-check as an equivalent anonymous function.  Instead, just
perform substitution and type-check the substituted expression.  The semantics
are still sound and this behaves the way users expect.

For example, the following expression would be rejected before this change
and accepted after this change:

```haskell
let t = Integer in 1 : t
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment