Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursive types #154

Closed
xsebek opened this issue Oct 3, 2021 · 9 comments · Fixed by #1894
Closed

Recursive types #154

xsebek opened this issue Oct 3, 2021 · 9 comments · Fixed by #1894
Assignees
Labels
C-Project A larger project, more suitable for experienced contributors. L-Language design Issues relating to the overall design of the Swarm language. L-Type system Issues related to the Swarm language type system. S-Moderate The fix or feature would substantially improve user experience. Z-Feature A new feature to be added to the game.

Comments

@xsebek
Copy link
Member

xsebek commented Oct 3, 2021

Add a rec (like Fix) operator which would open the door to any recursive data structure, like list, set or map.

This would also add two reserved function names:

  • roll : f (rec f) -> rec f
  • unroll : rec f -> f (rec f)

Example from IRC (lets assume a + b = L a | R b, and functions getLeft, getRight):

type listf a b = () + a * b end
type list a = rec (listf a) end

def nil  : list a           = roll (L ()) end
def cons : a -> list a      = \x. \xs. roll (R (x,xs)) end
def head : list a -> a      = \xs. fst (getRight (unroll xs)) end
def tail : list a -> list a = \xs. snd (getRight (unroll xs)) end

> let l = cons 1 (cons 2 (cons 3 nil)) in head l, tail l
(1, cons 2 (cons 3 nil)) : int * rec 

Special case for list

As the list is particularly useful it might be provided to players early and internally stored as rec, with special rules for parsing and pretty-printing. Something like [1,2,3] and hardcoding the type synonym list.
In gameplay, this could be introduced as a "stable form" of something more advanced. Having the appropriate device installed would load the basic list functions into the dictionary. Curious players could then type > cons and wonder what it will later be useful for. 🙂

Related:

@xsebek xsebek added the Z-Feature A new feature to be added to the game. label Oct 3, 2021
@xsebek xsebek mentioned this issue Oct 3, 2021
@xsebek xsebek added C-Project A larger project, more suitable for experienced contributors. L-Language design Issues relating to the overall design of the Swarm language. L-Type system Issues related to the Swarm language type system. S-Moderate The fix or feature would substantially improve user experience. labels Oct 3, 2021
@xsebek xsebek mentioned this issue Oct 3, 2021
@byorgey byorgey mentioned this issue Oct 13, 2021
@byorgey
Copy link
Member

byorgey commented Oct 18, 2021

Some additional thoughts:

  • Instead of having rec be a type operator, like Fix in Haskell, it's probably more flexible to have it bind a type variable. For example, instead of
    type listf a b = () + a * b end
    type list a = rec (listf a) end
    
    we could just write
    type list a = rec l. () + a * l end
    
    This way we don't have to go into contortions to always define a "structure functor" with a type variable representing recursive occurrences as the last parameter.
  • It might not be necessary to hardcode the type synonym list. It might be more fun to just have the built-in syntax work directly with rec l. () + a * l, and implement equality checking for types (in this case I think only alpha-equivalence would be needed), so that built-in list things are compatible with the user's type synonym.
  • We might also consider implementing equirecursive types instead of isorecursive types. In brief,
    • With equirecursive types, we could have e.g. list a = () + a * list a, where the left- and right-hand sides are completely interchangeable. In other words, recursive type synonyms (Haskell does not have this!). So no roll and unroll operations are needed. For example, cons : a -> list a -> list a = \x. \xs. inr (x,xs) would be well-typed in such a system.
    • This means that equality testing for types becomes much more interesting. We can no longer tell just by looking whether two types are the same. However, type equality is still decidable via coinduction: keep unrolling the types being compared and matching up their components, failing if we ever discover an incompatible pair of types, and stopping the recursion whenever we see a pair of types we have seen before.
    • The implementation would be more complex but it would be nicer for users in some way since no roll and unroll would be required (although it could perhaps be more confusing for the same reason).
  • With isorecursive types, on the other hand, list a = rec l. () + a * l is not the same type as () + a * list l, but they are related by roll/unroll.
    • This makes the implementation much easier: type equality testing is trivial. But code is noisier due to the need for explicit rec operator in types and roll and unroll in terms.

@byorgey byorgey changed the title Add advanced rec type Recursive types Mar 30, 2022
@xsebek
Copy link
Member Author

xsebek commented Jun 2, 2022

@byorgey the equirecursive types look cool, but the paper about them in F-omega system seems a bit too powerful for Swarm. 😮

I like the rec binding a type variable suggestion, as that seems easier to implement in the current type system. 👍

If I understand correctly, it is not even blocked on type synonyms (#153), though it would be annoying to use without them.

But what would the types of functions change to? 🤔

@byorgey
Copy link
Member

byorgey commented Jun 2, 2022

@byorgey the equirecursive types look cool, but the paper about them in F-omega system seems a bit too powerful for Swarm. 😮

Well, we don't have System F-omega, just System F.

If I understand correctly, it is not even blocked on type synonyms (#153), though it would be annoying to use without them.

Agreed.

But what would the types of functions change to? 🤔

What do you mean? Why would the types of functions change?

@xsebek
Copy link
Member Author

xsebek commented Jun 3, 2022

Well if rec works like type list a = rec l. () + a * l end, would the definitions of roll/unroll need to change?

@byorgey
Copy link
Member

byorgey commented Jun 3, 2022

Well, regardless of whether rec is a type operator or a binding form, I don't think roll and unroll can just be functions with a given type; they have to be special built-in syntax with special typing rules. But in any case it's not difficult. For example if x : rec a. t then unroll x : [a -> rec a. t] t (that is, the type t but with every occurrence of a substituted with rec a. t). Hmm, roll is a bit more annoying, I guess, because you have to do a sort of "reverse substitution" and it's impossible in general to automatically figure out which one you want to do. In practice it would probably require that we already know the desired overall type, then we can check that the application of roll is correct. For example if we have roll uList : rec l. () + a * l then we would have to make sure that the type of uList is the unfolded version of that, i.e. uList : () + a * (rec l. () + a * l).

@byorgey
Copy link
Member

byorgey commented Aug 8, 2022

Hmm. I'm thinking about how to implement type equality checking with equirecursive types, since it's the most interesting to me. It seems tricky though. We need to change the implementation of the =:= operator, which currently just calls =:= from the unification-fd library. But with equirecursive types, type equality can't just be unification anymore. Maybe it would be enough to have our =:= unfold recursive types coinductively, calling U.=:= on each pair of types encountered, but also keeping track of which pairs we've seen so we can stop recursing when seeing a pair we've already called U.=:= on. However, the other wrinkle is that if rec binds a type variable then we need to check equality up to alpha-equivalence; for example, rec l. () + int * l and rec q. () + int * q should be equal. Perhaps this could be solved by using some kind of locally nameless representation for types, so that alpha equality becomes just literal first-order equality.

@kostmo
Copy link
Member

kostmo commented Oct 20, 2022

Not to miss an opportunity for puns, might I suggest harvesting "pecans" to unlock use of "Cons" cells?

@byorgey
Copy link
Member

byorgey commented Apr 29, 2023

Rereading the discussion above, I think equirecursive types would be too complicated. We should just stick with simpler isorecursive types with explicit roll / unroll. I think probably implementing them via a rec that binds a type variable makes the most sense. Such type variables will always be bound, not free, so we can represent them via de Bruijn indices, so that unification will work without having to worry about alpha-equivalence.

@byorgey
Copy link
Member

byorgey commented Nov 22, 2023

I took my comment as a challenge and started working on this; my initial explorations are on the feature/recursive-types branch. Here's what I've learned so far:

  • roll and unroll need to be special syntactic forms, not constants, because we cannot express standalone types for them; they need special typechecking rules. In particular roll can only be done in checking mode, when we know an expected recursive type, and unroll can only be done in inference mode, when we can infer a specific recursive type for its argument.
    • The reason for this asymmetry is that we can easily expand a recursive type one step (rec x. t becomes t [x |-> rec x.t], that is, the body t with the entire recursive type substituted for the bound variable) but it is impossible to "unexpand" a recursive type. So for roll, we take the expected recursive type, expand it, and check that the argument has the expanded type; for unroll, we take the recursive type inferred for its argument, expand it, and infer that the entire application of unroll has the expanded type. Going in the opposite directions would involve "unexpanding".
  • However, it is very difficult to arrange that the expected type for roll, and the inferred type for the argument to unroll, are always known recursive types and never unification variables.
    • For example, roll (inl ()) : rec x. unit + x works fine, but roll (inr (roll (inl ()))) : rec x. unit + x fails, since as soon as we hit the function application inr ... we switch into inference mode, so by the time we switch back into checking mode for the nested roll, the "expected type" is a unification variable. We could in theory add a checking mode case for function application (although there's a good reason we don't have one!) but likely there would then just be some other example that failed; the point is that this is all rather finicky and it's pretty much impossible to guarantee to reach a given subterm in a particular mode without introducing any unification variables.
  • So really, to make this work, unification has to be able to deal with recursive types. However, unification-fd cannot; it does only first-order unification with no way to do intermediate reduction steps (such as expanding recursive types, or substituting definitions of type synonyms Add type synonyms (or just generalize to System Fω) #153). I thought maybe we might be able to do it with a custom Unifiable instance, but that doesn't really seem possible.

mergify bot pushed a commit that referenced this issue Apr 27, 2024
…1802)

Closes #1661; towards #154.

`unification-fd` is very powerful, and extremely fast, but it was written a long time ago and its age shows.  It was not possible to incorporate it into our effects system in a nice way, necessitating the use of concrete monad transformers in the typechecking code.  In addition it is impossible to customize, and we have been contemplating new type system features such as #153 and #154 that turn out to require hooking into the way the unification algorithm works (see #154 (comment) for more details).

This PR thus removes the dependency on `unification-fd` and implements our own version of unification.  It is not quite as fast as `unification-fd` but I consider the slowdown acceptable in order to gain e.g. recursive types.  And of course there is also room to optimize it.

The custom `UTerm` from `unification-fd` is replaced with the standard `Free` (free monad) construction from the `free` package, and the custom `Fix` from `unification-fd` is replaced with the one from `data-fix`.

We also get rid of the `unifyCheck` function, which used to be a quick short-circuiting way to check whether two types definitely did not unify or might unify, allowing us to give better error messages more quickly.  Now, the `=:=` unification operator itself just does this.
@mergify mergify bot closed this as completed in #1894 Jun 6, 2024
mergify bot pushed a commit that referenced this issue Jun 6, 2024
Implements recursive types.  Example:
```
tydef List a = rec l. Unit + a * l end

def nil : List a = inl () end
def cons : a -> List a -> List a = \x. \l. inr (x, l) end

def foldr : (a -> b -> b) -> b -> List a -> b = \f. \z. \xs.
  case xs
    (\_. z)
    (\c. f (fst c) (foldr f z (snd c)))
end

def map : (a -> b) -> List a -> List b = \f.
  foldr (\y. cons (f y)) nil
end
```
As discussed at #154, this uses the syntax `rec x. F(x)` to define the type `x` which is a solution to `x = F(x)`.  However, unlike the discussion there, I ended up going with *equirecursive* types (so a recursive type is *equal to* its unfolding). For example, as you can see above, if we have a value of type `List a` (defined as `rec l. Unit + a * l`), then it actually has the equivalent type `Unit + a * List a`, so we can simply do a `case` on it, without having to `unroll` first.  This actually turned out to be *easier* to implement (and it is cooler).

There are multiple built-in functions that conceptually return a list but currently do something different (like return a fold, or take an index and return a single element, etc.)  We should consider changing them to actually return lists, but that should definitely be in a separate PR.

Closes #154.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-Project A larger project, more suitable for experienced contributors. L-Language design Issues relating to the overall design of the Swarm language. L-Type system Issues related to the Swarm language type system. S-Moderate The fix or feature would substantially improve user experience. Z-Feature A new feature to be added to the game.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants