Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge yaya into recursion-schemes? #76

Open
gelisam opened this issue Feb 2, 2019 · 18 comments
Open

Merge yaya into recursion-schemes? #76

gelisam opened this issue Feb 2, 2019 · 18 comments

Comments

@gelisam
Copy link
Collaborator

gelisam commented Feb 2, 2019

@sellout said he hopes to eventually get yaya merged into recursion-schemes, rather than splitting the user base. I'm more concerned with duplicating our efforts, but whatever: it's a good idea in theory. But does it work in practice? It wouldn't make much sense to just merge everything as-is, as we would end up with two versions of every recursion scheme and it would be confusing for users. Or would it? Perhaps there are legitimate reasons for offering two versions of the same recursion scheme, e.g. some packages offer a strict and a lazy version of the same API. Or perhaps there is a generalization of our APIs which makes sense, and then users can specialize that API to get the features they need in their particular project.

So let's look at the concrete differences (conveniently listed in the yaya documentation) and for each one, decide whether one approach is better, whether it makes sense to provide both, and or whether they can be unified under a more general API. If for almost all the features it ends up making sense to provide both, then I don't think merging the two projects is worthwhile, but we can discuss that too.

@phadej
Copy link
Contributor

phadej commented Feb 2, 2019

  • I think the only practical difference is TypeFamlies vs. FunctionalDependencies

    • the difference is visible when you define your own recursion schemes, not when you use them with a concrete recursive type.
    • if we go for FunctionalDependencies, then the class should have the base functor first, i.e. instance (LisfF a) [a], as then we could conveniently use DeriveAnyClass in deriving clauses.
  • What does

    "safe"

    means? Ever-non-diverging?

    • How much wrong would be to have just single class with project,embed,cata and ana? Given Generics deriving.
    • Or then the what's "safe" or "unsafe" have to be explained in ELI5 way, because defining "unsafe" instances doesn't feel unsafe at all.

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

(we've already discussed some of this on Twitter, but let's re-hash the argument for the benefit of everyone else)

  • uses multi-parameter type classes rather than a type family for the Base functor, because the latter frequently requires constraints in the form of (Recursive t, Base t ~ f), so we prefer Recursive t f.

In (Recursive t, Base t ~ f), is the f a type variable, or does it stand in for a concrete type like ListF a? What would be an example situation in which you've found yourself writing (Recursive t, Base t ~ f)?

One advantage of using a type family is that we can give it a default implementation. In one of my other projects, I am working on deriving associated types using generics, and I think it might make sense one day to use UnRep (NonRecursiveRep (Rep t)), or NonRecursive t for short, as the default for Base t. This would work well with our new generics-based default implementations for embed and project.

I should mention, however, that when implementing recursion schemes Base t is a bit of a mouthful, so it would be helpful if I could write f instead. So maybe I'll start using (Recursive t, Base t ~ f) (with a type variable f, not a concrete type) in the recursion-schemes code :)

  • total operations, with anything potentially partial relegated to the yaya-unsafe package, which has consequences:
    Recursive (Mu f) f and Corecursive (Nu f) f are the only “safe” instances of recursive data types

That's a pretty big difference: on the contrary, one of the breaking changes I am planning (and which I have not yet discussed with the rest of the maintainers) is to remove Mu, Nu and Fix! Let me explain.

  • I think it's confusing for users that we have so many fixpoint datatypes. The fact we are providing several options hints that there are tradeoffs to consider, and so the user is likely to waste a lot of time choosing a fixpoint datatype when in fact the difference barely matters. So I would like to remove Mu and Nu, and to only provide a single obvious choice, Fix (I'll explain later why I want to remove Fix as well).
  • Mu and Nu are relevant in total languages which distinguish between data and codata. Then, Mu produces the least fixpoint (data) and Nu produces the greatest fixpoint (codata). But Haskell is not such a language: we allow general recursion, and we tend to use the same recursive type (e.g. lists) to represent both its least (finite lists) and greatest (possibly infinite streams) fixpoints. In particular, we can easily use Mu with a higher-order function which uses its input function infinitely-many times, so in Haskell, Mu and Nu both produce the greatest fixpoint.
  • One case in which Fix, Mu and Nu differ in their number of non-bottom inhabitants is when the base functor is strict in its recursive positions. In that case, Fix and Mu both construct the least fixed point, while Nu constructs the greatest fixpoint. Otherwise, with non-strict recursive positions (the default in Haskell!), they all construct the greatest fixpoint. This isn't very elegant; Nu ignores the strictness and always constructs the greatest fixpoint, but Mu isn't always constructing the least fixpoint, it depends on the strictness. So I think it's better to embrace the existing Haskell conventions: a single recursive type, which is used both for data and codata, and if you really need to enforce finiteness in the types, write a variant with strict recursive positions. This is exactly what we get if we always use Fix.
  • Too many recursion-schemes tutorials guide users towards a hard-to-read style in which users define their base functor ListF a first, and then pass Fix (ListF a)s around. This is needlessly complicated and incompatible with all the other libraries on hackage which use [a] instead of Fix (ListF a). recursion-schemes already supports a better way, namely defining the recursive type [a] first, and deriving the base functor from it. I recommend this approach, and I think more users would discover that approach if it was the only approach.

I should mention, however, that if you do go through the trouble of defining strict base functors and carefully distinguishing between data and codata, you get the benefit that length [0..] is now a type error instead of an infinite loop. We should definitely link to yaya in the Safer section of the README.

a separate Steppable class with project and embed, increasing the number of operations that can be defined without needing unsafe instances.

how much wrong would be to have just single class with project,embed,cata and ana? Given Generics deriving.

I would definitely be in favour of merging Recursive and Corecursive into a single typeclass! Adding a third typeclass class (Recursive t, Corecursive t) => Steppable t is less exciting to me, but I wouldn't be opposed either.

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

relatively few operations are defined, with the intent being that g{cata|ana} be used directly instead. The more commonly known operations are provided in the Zoo modules, with the intent that they provide a mapping from names you may have heard to how Yaya expects you to accomplish them.

I like the idea, but gcata's API is so much worse than cata, zygo etc., I would insist we put a lot of effort improving that API before forcing it onto our users. The fact that its API is currently hard to use doesn't bother me much because I don't think many people use it.

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

pattern functors tend to be named independently of their fixed-points. E.g., we use Maybe directly instead of some NatF, XNor a b instead of ListF, and a `AndMaybe` b instead of NonEmptyF.

That's another big philosophical difference: I think names like ConsF are very important in making the code readable. In fact, I think we should use more specialized names, e.g. data ParaSubTree { originalSubTree :: t, recursiveResult :: a } instead of (t, a), in order to improve the readability further.

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

no TH yet – this might be a philosophical difference, actually. On the one hand, it useful to be able to migrate existing code by quickly creating a pattern functor, but on the other, this stuff is generally unsafe. So, it should at least be relegated to yaya-unsafe.

recursion-schemes can be used without TH, you just need to define the base functor by hand. I would like to preserve this property. Unfortunately, the UnRep approach I mentioned earlier still relies on a bit of TH, it just minimizes the amount of TH code we have to write. Which doesn't make much of a difference, since (1) the TH code for recursion-schemes is already written, and (2) it's only one line of TH for the user regardless of whether we use UnRep or not.

@ocharles
Copy link

ocharles commented Feb 2, 2019

One of the breaking changes I am planning (and which I have not yet discussed with the rest of the maintainers) is to remove Mu, Nu and Fix! Let me explain.

I'm on board with all of this. When I realised the power of Base, recursion-schemes made a lot more sense. When you see it only through the light of Fix, it seems to exist on this theoretical island and incompatible with everything we do otherwise. But this isn't the case! Once we understand that Recursive and Corecursive are classes that we should derive for our implicitly recursive data types, the library begins look a lot more attractive.

I should mention, however, that if you do go through the trouble of defining strict base functors and carefully distinguishing between data and codata, you get the benefit that length [0..] is now a type error instead of an infinite loop.

I have to admit this is also quite attractive. Perhaps there is another way we can get this without needing Mu, Nu and Fix? Though in a certain light length [0..] can make sense, if you produce lazy Peano naturals. Then you can ask questions like length [0..] > 2, which is perfectly sensible (and can terminate).

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

Yaya has productive metamorphisms (stream, etc.). The naïve composition of cata and ana has no benefits, but it has been difficult to generalize Gibbons’ streaming transformers to data structures other than lists.

I haven't read Gibbons' paper yet, sorry. Are you saying that hylo has no performance benefits? I think it has another benefit: succinctly indicating to the reader that the function they are about to read uses a divide-and-conquer approach.

@phadej
Copy link
Contributor

phadej commented Feb 2, 2019

Fix is a nice helper type which you may need. E.g. unification-fd defines it too.
E.g. when you actually only define a base functor and then use it with Fix or UTerm or Compose (Either Err). Even it doesn't end up in the final version of whatever you are building, it's a nice building block to exist. If you remove it from this library, it will pop somewhere else.

I agree that we should de-emphasise Fix; maybe even moving it into own module (There were proposals to even move the type to base). I'd really push everything to depend on recursion-schemes so the efforts aren't duplicated and/or people move to rather use Recursive class, when more applicable

I'm not sure about Mu and Nu. One might need them, might not. We have Coyoneda, however; and even I don't use Coyoneda directly that often, I do occasionally look how its instances are defined. Mu and Nu have educational value. If we move Fix to own module, Mu and Nu could live there too.

@phadej
Copy link
Contributor

phadej commented Feb 2, 2019

And FWIW, discussing anything on Twitter with a goal to make any decisions is terrible idea. Stuff get simply lost there.

@phadej
Copy link
Contributor

phadej commented Feb 2, 2019

Generic based NonRecursive t derived from Rep t: won't that lose the nice property of readability of ConsF. The Rep of Representable makes sense, as you can use that class without really inspecting the index value. Not the case with recursion-schemes. Am I missing something?

@phadej
Copy link
Contributor

phadej commented Feb 2, 2019

I think we should use more specialized names, ParaSubTree

That's better argued with an example, e.g. how splitAtCommaSpace in #74 would look like with it?

@phadej
Copy link
Contributor

phadej commented Feb 2, 2019

Then you can ask questions like length [0..] > 2, which is perfectly sensible (and can terminate).

Technically list and peano naturals in this example are codata. You can write that in Agda nowadays. So yes, in Haskell data and codata are the same, and when you mix them in wrong way, bad things happen; but trying to artificially separate them doesn't IMHO adds anything (ivory tower smell, maybe).

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

If you remove it from this library, it will pop somewhere else.

Oh, I definitely think Fix should exist! I think it's fundamental enough to be in base, even.

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

Generic based NonRecursive t derived from Rep t: won't that lose the nice property of readability of ConsF. The Rep of Representable makes sense, as you can use that class without really inspecting the index value. Not the case with recursion-schemes. Am I missing something?

I agree that datatypes generated using generics are unreadable; this is precisely the problem which I want to address in my unrepresentable library.

@ocharles
Copy link

ocharles commented Feb 2, 2019

Theres a lot going on in this thread. Perhaps it's worth breaking this apart into smaller PRs and then once you're happy with planned rewrites/improvements @gelisam, we can revisit what yaya still has to offer? I mean things like whether or not we have Mu seems fairly tangential. GitHub doesn't have great tools for complex discussions, but multiple issues is a bit more manageable. I'm not a maintainer though, so it's your call!

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

Sure, good idea.

@gelisam
Copy link
Collaborator Author

gelisam commented Feb 2, 2019

Split into #77, #78, #79, #80, and #81. I did not create separate issues for the TH and metamorphisms topics, feel free to discuss them here or to open a new ticket if you have something to say about either.

@sellout
Copy link

sellout commented Feb 3, 2019

I’ll come back to the rest of this stuff later, but quickly …

Yaya has productive metamorphisms (stream, etc.). The naïve composition of cata and ana has no benefits, but it has been difficult to generalize Gibbons’ streaming transformers to data structures other than lists.

I haven't read Gibbons' paper yet, sorry. Are you saying that hylo has no performance benefits? I think it has another benefit: succinctly indicating to the reader that the function they are about to read uses a divide-and-conquer approach.

the README should probably say “naïve composition of ana and cata”, not “cata and ana”.

hylo φ ψ  cata φ . ana ψ
meta ψ φ = ana ψ . cata φ -- what this refers to

You can see that meta doesn’t have any obvious opportunity for fusion or anything. It just folds, then takes the result and unfolds it to something else. But Gibbons shows how in certain cases you can get streaming behavior from meta in some cases without forcing the entire fold.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants