Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GADTs and recursive types #1356

Closed
hallettj opened this Issue Feb 2, 2016 · 7 comments

Comments

Projects
None yet
4 participants
@hallettj
Copy link

hallettj commented Feb 2, 2016

tl;dr: GADTs can be implemented with existing Flow features, but a problem checking recursive types with varying constraints on type variables, and a problem with accumulated constraints prevent the implementation from type-checking.

There are multiple ways to implement algebraic data types with flow. One option that I think is interesting is to encode values as pattern match functions.

type Tree<A> = <R>(
  Node:  (left: Tree<A>, value: A, right: Tree<A>) => R,
  Empty: () => R
) => R

With this pattern, values of type Tree<A> are functions that are applied to access the data inside. The type R is the result of pattern matching. R can be any type, as long as the result of every pattern returns the same type. For example:

function includes<A>(value: A, tree: Tree<A>): boolean {
  return tree(
    /* Node */ (left, v, right) => {
      return v === value || includes(value, left) || includes(value, right)
    },
    /* Empty */ () => false
  )
}

Constructors for this type look like this:

function Node<A>(left: Tree<A>, value: A, right: Tree<A>): Tree<A> {
  return function<R>(node: (l: *, v: *, r: *) => R, empty): R {
    return node(left, value, right)
  }
}

function Empty<A>(): Tree<A> {
  return function<R>(node, empty: () => R): R {
    return empty()
  }
}

Some benefits of this approach are that we get pattern matching for free, and the compiler will make sure that pattern matching is exhaustive.

Another benefit is that the pattern-match approach can encode generalized algebraic data types (GADTs).

Unfortunately, I hit a wall with GADTs due to two problems:

  • Certain recursive types cannot currently be checked with Flow.
  • A type like <A>(expr: Expr<A>) => A does not pass checking if there are multiple concrete possibilities for the type of A in a function implementation, even if the invariant A == A holds in every case.

For an explanation of why GADTs are useful, see this Wikibooks chapter on the Haskell GADTs language extension. The chapter describes a type for representing an abstract syntax tree with type-level invariants.

type Expr<A> = <R>(
  I:   (x: number)  => R,
  B:   (x: boolean) => R,
  Add: (x: Expr<number>, y: Expr<number>) => R,
  Mul: (x: Expr<number>, y: Expr<number>) => R,
  Eq:  (x: Expr<number>, y: Expr<number>) => R,
) => R

Constructors for this type specify their return type, which means that we can track which operations are legal for an AST subtree by specifying concrete types for a phantom type variable (the A in Expr<A>).
Note the return types for these constructors:

function I(x: number): Expr<number> {
  return function<R>(a: (x: number) => R, b, c, d, e): R {
    return a(x)
  }
}

function Add(x: Expr<number>, y: Expr<number>): Expr<number> {
  return function<R>(a, b, c: (x: *, y: *) => R, d, e): R {
    return c(x, y)
  }
}

function Mul(x: Expr<number>, y: Expr<number>): Expr<number> {
  return function<R>(a, b, c, d: (x: *, y: *) => R, e): R {
    return d(x, y)
  }
}

function Eq(x: Expr<number>, y: Expr<number>): Expr<boolean> {
  return function<R>(a, b, c, d, e: (x: *, y: *) => R): R {
    return e(x, y)
  }
}

The Expr<A> type can be used to implement functions with a high level of type safety.

function evaluate<A>(expr: Expr<A>): A {
  return expr(
    /* I */   x => x,
    /* B */   x => x,
    /* Add */ (x, y) => evaluate(x) + evaluate(y),
    /* Mul */ (x, y) => evaluate(x) * evaluate(y),
    /* Eq  */ (x, y) => evaluate(x) === evaluate(y)
  )
}

This means that the expression Add( I(3), I(4) ) is legal, but Add( I(3), B(true) ) results in a type error. The implementation of evaluate can take advantage of that, so it does not need error handling to check for nonsensical inputs.

Unfortunately this does not work yet. When I type-check the code above, I get these errors:

gadts.js:49
 49:   return expr(
              ^ boolean. This type is incompatible with
 48: function evaluate<A>(expr: Expr<A>): A {
     ^ some incompatible instantiation of `A`

gadts.js:49
 49:   return expr(
              ^ number. This type is incompatible with
 48: function evaluate<A>(expr: Expr<A>): A {
     ^ some incompatible instantiation of `A`

gadts.js:52
 52:     /* Add */ (x, y) => evaluate(x) + evaluate(y),
                             ^^^^^^^^^^^ function call
 48: function evaluate<A>(expr: Expr<A>): A {
     ^ polymorphic type: function. *** Recursion limit exceeded ***
 48: function evaluate<A>(expr: Expr<A>): A {
     ^ function evaluate

The full source of gadts.js is available at https://gist.github.com/hallettj/bbf4e75566123a8ed07a

@avikchaudhuri

This comment has been minimized.

Copy link
Contributor

avikchaudhuri commented Feb 3, 2016

Thanks for doing this background work! Improvements to union types in general will happen in the coming months, and specific use cases like this are great to keep at the back of our minds for adequate coverage. cc @bhosmer

@calebmer

This comment has been minimized.

Copy link
Contributor

calebmer commented Jul 20, 2017

I’m going to close assuming the necessary union work happened. If this issue still needs discussion please reopen 😊

@calebmer calebmer closed this Jul 20, 2017

@asolove

This comment has been minimized.

Copy link
Contributor

asolove commented Aug 16, 2017

@calebmer FYI, trying the example code in Try Flow on v0.52 still fails with the same error messages.

@asolove

This comment has been minimized.

Copy link
Contributor

asolove commented Aug 25, 2017

This is still failing in v0.53.1. Would be super cool to get it to work, or at least have a clearer picture of why it doesn't. It seems like I'm still hitting the problem mentioned above:

A type like (expr: Expr) => A does not pass checking if there are multiple concrete possibilities for the type of A in a function implementation, even if the invariant A == A holds in every case.

@hallettj

This comment has been minimized.

Copy link
Author

hallettj commented Aug 26, 2017

@asolove I'm glad I'm not the only one who is interested in this kind of stuff! I updated the gist demonstrating the Expr<A> type to make the code less verbose and more readable.

https://gist.github.com/hallettj/bbf4e75566123a8ed07a

I wanted to note that there is a workaround for this issue: I put a $FlowFixMe on the implementation of evaluate. Flow is not able to check the implementation of evaluate itself, but it will note the type signature and will correctly typecheck expressions that use evaluate. (An alternative would be to declare the type of evaluate and provide the implementation with no type annotation, in case consumers have configured Flow to use a different "fix me" comment.) So GADTs can be useful in Flow code today with the caveat that functions that pattern match on GADTs cannot be checked, but their types can be declared.

@calebmer

This comment has been minimized.

Copy link
Contributor

calebmer commented Aug 27, 2017

The design of generic functions in Flow is that when you call the function and we create new types for function generic arguments we do not look at the implementation of the function. We only look at the function signature. So in this case:

function fn<A>(): A;

There isn’t really a sensible type for A. So Flow does not allow generic arguments in output positions. (I’m not the most qualified to speak on why this makes sense from a type system design perspective, but I tried my best 😊)

Removing A as the return type and annotating it as the type you want to return gives you the results that I think you want. See my modified example here:

type Expr = <R>(
  I:   (x: number)  => R,
  B:   (x: boolean) => R,
  Add: (x: Expr, y: Expr) => R,
  Mul: (x: Expr, y: Expr) => R,
  Eq:  (x: Expr, y: Expr) => R,
) => R

function evaluate1(expr: Expr): number | boolean {
  return expr(
    /* I */   x => x,
    /* B */   x => x,
    /* Add */ (x, y) => evaluate1(x) + evaluate1(y), // Error: boolean cannot be added
    /* Mul */ (x, y) => evaluate1(x) * evaluate1(y), // Error: boolean cannot be multiplied
    /* Eq  */ (x, y) => evaluate1(x) === evaluate1(y)
  );
}

function evaluate2(expr: Expr): number {
  return expr(
    /* I */   x => x,
    /* B */   x => x ? 0 : 1,
    /* Add */ (x, y) => evaluate2(x) + evaluate2(y),
    /* Mul */ (x, y) => evaluate2(x) * evaluate2(y),
    /* Eq  */ (x, y) => (evaluate2(x) === evaluate2(y)) ? 0 : 1,
  );
}

https://flow.org/try/#0C4TwDgpgBAogHmATlAvFAPAJQHwAoBQUUAkgFxFS5zkB2ArgLYBGEiAlEStlJgDSFQAQuSJVyTAPYSANhACGNDlx78iAQQAmG8mNgJEvKCHLwkS7nwEBZOtJ3U9SQ8cftUF1bACOI3aYNGJvrmKvghmPj4AGZ0NADGwACWEjRQEABuctJ0csAQAIy4EPpBZrSMLMgAPlCSMvKpAN4CiBDAdIipxUgEFFAA9ABUJFCD-RRw7lBwnkRDQqPjRJPKMwJzw5oai5QzRiEZWTl5hXAcANRpmdm5BbggbLMDwzbSO1TOB9fHd2ejV0dboUHk95jAvEQxrtPlNDjcTlQlCg0HCfsC2AI2ABufAAX0iMXiSRSAPhEAATEUSq42OVmKwoM0iK12p00vpehR5sQdsspms+vNBLzpvyoAB+KAABig5HyoM2WneewesO+t0pf0uqI190e62eUFeyphyh1eU1HGG5opeoV3kh4w++ymRXVFsRqGRpJ+lIeHElMrlnmxePwQA

@hallettj

This comment has been minimized.

Copy link
Author

hallettj commented Aug 27, 2017

@calebmer: I think that the use of a generic variable in the return type is not the problem. Note that in the signature for evaluate the variable A appears in both the input and the output. So there is a sensible type for A for a given invocation, which Flow can infer from the type of the input:

function evaluate<A>(expr: Expr<A>): A

(Btw, I'm reasonably confident that Flow does allow type variables that appear in the output type but not in any inputs. The result is an unresolved type that does not yet have any constraints. I often do this instead of using any. For example, function decode<T>(input: string): T { return JSON.parse(input) })

Removing the type variable from Expr<A> and using a union type for the output for evaluate does lead to a function that typechecks. But the result is no longer a generalized algebraic data type, which was the point of my exercise. My goal was an evaluate function that can infer the correct type for the evaluation of an expression based on the type parameter given for that expression. Flow can do this - as I mentioned in my previous comment if I declare the type for evaluate Flow is able to correctly typecheck invocations of evaluate. The problem is just that Flow is not able to typecheck the implementation of evaluate itself. Here are some examples from the gist:

/*
 * This example represents the expression (2 + 1) * 4, and has the type `Expr<number>`
 */
const expr = Mul(
  Add(I(2), I(1)),
  I(4)
)

const a: number = evaluate(expr)
const b: boolean = evaluate(Eq(I(1), I(1)) /* has the type `Expr<boolean>` */)

/*
 * Does not typecheck because `expr` evaluates to a `number` not a `boolean`
 * $FlowFixMe
 */
const c: boolean = evaluate(expr)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.