Add implementation of elgot hylomorphism #69

Merged
merged 8 commits into from
Mar 9, 2017

Conversation

b-studios
Contributor

This PR adds elgot hylomorphisms, as mentioned in this TODO on Paramerge.

The two auxiliary definitions, loop and trans, were introduced for documentation purposes and to assist type inference in the remaining code.

@CLAassistant

CLAassistant commented Feb 26, 2017

CLA assistant check
All committers have signed the CLA.

@b-studios
Contributor Author

@sellout Are tests obligatory? To be honest, I haven't looked very deeply into your testing infrastructure, yet. What would be the best location to add tests to?

@sellout
Contributor

sellout commented Feb 26, 2017

@b-studios Yeah, that would be nice – we should have codecov verifying that coverage doesn’t decrease, but I think I have to fix something to make that work again.

Tests should go under tests/src/test/… – it’s not very well organized at the moment – I think most tests are just in spec.scala, but feel free to put things elsewhere if it makes things clearer for you.

@b-studios
Contributor Author

@sellout I added some tests, though I am not super happy with the use cases. They still look contrived and semantically don't use everything the elgotHylo has to offer.

I also added tests for elgotCata and elgotAna while at it. I noticed that elgot(Cata|Ana) take the distributive laws as the first argument; I would be interested in the design considerations there. Currently, my implementation of elgotHylo first takes phi and psi and then the distributive laws.

@sellout
Contributor

sellout commented Feb 28, 2017

@b-studios Yeah, so the order of parameters I probably just got from @ekmett’s recursion-schemes. I guess we could potentially do something like

  def elgotHylo[M[_]: Monad, W[_]: Comonad, F[_]: Functor, A, B](
     kf: DistributiveLaw[F, W],
     kg: DistributiveLaw[M, F]
   ): (ElgotAlgebra[W, F, B], ElgotCoalgebra[M, F, A]) => A => B

def elgotDyna[…] = elgotHylo(distHisto, distAna)

but that probably doesn’t play well with inference. Also we tend to have the value A as the first parameter – just seems to be the Scala way. My Haskell brain definitely prefers the A => B.

@b-studios
Contributor Author

b-studios commented Feb 28, 2017

@sellout I agree, this won't play well with type inference since A and B are known too late. It probably doesn't make any sense to discuss this in this PR, but I prefer the A => B for the simple reason that I like to see the x-morphisms as building blocks to define functions, rather than as a tool to recurse over something.

Edit: After reading the last line of your example above, I now also understand why you first want to specify the distributive laws.

@b-studios
Contributor Author

b-studios commented Feb 28, 2017

Sidenote: I also don't understand why Kmett puts the algebra argument before the coalgebra one. Maybe I am thinking too imperatively here, but for me it is (1) unfold something, then (2) fold into a result.

But again, maybe that made sense with currying and partial application.

@ekmett

ekmett commented Feb 28, 2017

The order of arguments was picked a decade before I ever appeared on the scene, but the argument order is good for equational reasoning and fits with existing practice with (.).

Look at it like a generalized (.).

f (g a) becomes (f . g) a to keep things in the same order.

The parts are in the same order as you rewrite:

hylo f g a = cata f (ana g a) = (cata f . ana g) a

Internally,

hylo f g = h where h = f . fmap h . g

f & g are placed in the order they appear in the result

You can derive the internal definition of hylo from taking cata and ana's internal definition and fusing them this way, and nothing ever gets interchanged in the argument list.
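
A minimal Haskell sketch of the fusion described above. The `Fix` type, the `ListF` base functor, and the factorial example are assumptions made for illustration; only the `hylo` equations come from the comment itself.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Fixed point of a functor (a small stand-in, not a library type).
newtype Fix f = Fix (f (Fix f))

-- Fold: tear the structure down one layer at a time.
cata :: Functor f => (f b -> b) -> Fix f -> b
cata f (Fix x) = f (fmap (cata f) x)

-- Unfold: build the structure up one layer at a time.
ana :: Functor f => (a -> f a) -> a -> Fix f
ana g = Fix . fmap (ana g) . g

-- Fused form from the comment: algebra first, coalgebra second,
-- in the same order in which they appear in the body.
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo f g = h where h = f . fmap h . g

-- Illustrative base functor for lists of Ints.
data ListF r = Nil | Cons Int r deriving Functor

-- Coalgebra: count down from n to 1 (assumes n >= 0).
countdown :: Int -> ListF Int
countdown 0 = Nil
countdown n = Cons n (n - 1)

-- Algebra: multiply the elements together.
productAlg :: ListF Int -> Int
productAlg Nil        = 1
productAlg (Cons x r) = x * r

-- Both definitions compute the same function; the fused one never
-- materialises the intermediate Fix ListF value.
factorialFused, factorialUnfused :: Int -> Int
factorialFused   = hylo productAlg countdown
factorialUnfused = cata productAlg . ana countdown
```

Here `factorialFused 5` and `factorialUnfused 5` both evaluate to 120, and in both definitions the algebra is written to the left of the coalgebra, matching `(cata f . ana g) a`.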

@b-studios
Contributor Author

@ekmett Thanks for your explanations! Taking (.) as the primary composition mechanism on arrows, this makes complete sense. As I said, maybe I am still thinking too imperatively here, but if we build on reverse arrow composition (;) instead (as I have also seen in some literature on CT), we would at least have:

hylo g f = ana g ; cata f 

which also keeps the parts in order. Admittedly, I can't claim the same for f (g a) and (g ; f) a.

@ekmett

ekmett commented Feb 28, 2017

You'd also run into the problem that every single person that knew what these were would get the argument order wrong.

Functional Programming with Bananas, Lenses and Barbed Wire is a chapter in Erik Meijer's thesis, and pretty much the paper that set the order of these things in stone and popularized the core concepts.

@ekmett

ekmett commented Feb 28, 2017

Just pretend you're writing POSIX-compliant code, as such the destination/output description is the first argument by convention if it makes you feel better. ;)

@djspiewak
Contributor

@ekmett Scala basically universally inverts argument order. Intuitively, this is because the "first" argument is the dispatch receiver in an OO language, which is also by definition usually the most specific argument and thus the one that would be placed last in a language where functions are the primary abstraction.

The unfortunate reality is that the type inferencer and the parser are both biased toward the "most-to-least specific" ordering, and it's almost impossible to create a usable Scala API which bucks that trend. It is backwards from the standpoint of composition (.), but there's not a lot that can be done about it unless you want to give up on inference and, secondarily, put up with extraneous punctuation everywhere.

@b-studios
Contributor Author

@ekmett I have to apologize: I know the Bananas paper and the argument ordering there, and I misattributed the ordering to you in my statement above. It is just that you are an authority in the field of recursion schemes and I was too quick to point fingers.

To conclude the slightly off-topic discussion: I will change the argument ordering to be compliant with the established standards, but still need to find an elegant solution for the distributive laws.

}
g ⋙ loop ⋙ f
}


This is beautiful.

@ekmett

ekmett commented Mar 1, 2017

def elgotHylo[M[_]: Monad, W[_]: Comonad, F[_]: Functor, A, B](
  f: ElgotAlgebra[W, F, B],
  g: ElgotCoalgebra[M, F, A],
  kf: DistributiveLaw[F, W],
  kg: DistributiveLaw[M, F]
): A => B

@djspiewak: argument order doesn't really help these with inference without doing a ton of different argument groups. Basically all the parameters are functions, so the best you could do is take the (A) as an argument in one parenthesis group, then all the functions after, and it'd help you infer a tiny bit about the carrier of the coalgebra argument. Once you do that, though, the whole thing becomes a mess. You could maybe use the determination of F from the coalgebra to infer the type of F in the algebra, and then in a separate argument group you'd be able to fully infer rather than check types for the distributive laws.

def elgotHylo[M[_]: Monad, W[_]: Comonad, F[_]: Functor, A, B]
  (a: A)
  (g: ElgotCoalgebra[M, F, A])(f: ElgotAlgebra[W, F, B])
  (kg: DistributiveLaw[M, F], kf: DistributiveLaw[F, W]): B

But god damn is that a mess. ;)

Actually, using the distributive law arguments first might work out better, because they'll tell you M, F, and W; to get inference to flow for F, though, you need to take them in separate ()'s.

def elgotHylo[M[_]: Monad, W[_]: Comonad, F[_]: Functor, A, B]
  (kg: DistributiveLaw[M, F])(kf: DistributiveLaw[F, W], a: A)
  (g: ElgotCoalgebra[M, F, A], f: ElgotAlgebra[W, F, B]): B

Yuck again. 3-4 argument groups haphazardly sorted? But interestingly in this last case, it doesn't matter which way you take algebra vs. coalgebra arguments as

def elgotHylo[M[_]: Monad, W[_]: Comonad, F[_]: Functor, A, B]
  (kf: DistributiveLaw[F, W])(kg: DistributiveLaw[M, F], a: A)
  (f: ElgotAlgebra[W, F, B], g: ElgotCoalgebra[M, F, A]): B

works just as well. In theory that'd let you infer all the type arguments at the low low price of your soul when you have to read the code later or figure out how to call it. =)

@sellout sellout self-assigned this Mar 1, 2017
@sellout
Contributor

sellout commented Mar 1, 2017

@ekmett Yeah, I remember jumping through some of these hoops when initially writing the generalized stuff.

I think the main reason we ended up with something like def gcata(T)(DistributiveLaw, GeneralizedAlgebra): A was to make it possible for Simulacrum to generate implicit Ops classes (via @typeclass). But that only affects Recursive. Corecursive and hylo-ish ops need to be hand-written anyway, but consistency is useful.

However, we later abandoned @typeclass so we could use other tricks (PartiallyApplied) to make it easier for users to call the Ops methods. There are various issues (scala/scala#5108, typelevel/simulacrum#75, typelevel/simulacrum#76) that would make it possible to eliminate various tricks and/or get back some of the automation.

@ekmett

ekmett commented Mar 1, 2017

To be honest, I'll be shocked if you get anybody ever using anything more complicated than a dynamorphism in practice. Examples become sparse the farther up the tree you climb.

All of these fancy hylo variants can be written in terms of the base hylo. They are just stylized applications of distributive laws. hylo itself is already out in Turing complete territory, and the fusion laws aren't terribly good, so as a normal form you don't really gain a lot in reasoning power.

And indeed within the confines of these sorts of combinators, a better approach overall would be to base work on the adjoint folds stuff that Hinze, Wu, and Gibbons have built up, but that requires a better language than Scala, or frankly, Haskell as it is used today as well. =(

I've long considered my work collecting these as an almost complete waste of time. You learn something from each individual cata or ana variant about a thing you might do in a sort of recursive pattern, but rarely is the composition of such beasts a thing that actually helps you reason about the code compared to a more straightforward variant.

Spotting the basic cata/ana/hylo cases is useful, if only because they let you figure out how to fuse the cata/ana stuff case into a hylo, but beyond that I've yet to see them gainfully employed beyond an occasional para, dyna or elgot algebra.

@sellout
Contributor

sellout commented Mar 1, 2017

Yeah, adjoint folds are on the TODO.

Most of the code in this repo is motivated by use in quasar-analytics/quasar.

Are you saying that ghylo can be written in terms of hylo? It’s not immediately obvious to me how, but I’m interested in digging in if so.

@ekmett

ekmett commented Mar 2, 2017

You need an extract and a return on the outside. Your carriers shift to being (M A) and (W B) instead of A and B. The result is less efficient if you don't use Yoneda[F] instead of F, though, as it will use separate fmaps because you'll have to do an fmap inside of each of the algebra and coalgebra.
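
One possible reading of this remark, sketched in Haskell. Everything here is an assumption made for illustration: the `Comonad` class is a minimal local stand-in (base has none), the name `ghyloViaHylo` is hypothetical, and the example uses `Identity` on both sides, which is the degenerate case where the generalized scheme collapses back to a plain `hylo`.

```haskell
{-# LANGUAGE DeriveFunctor, RankNTypes #-}

import Control.Monad (join)
import Data.Functor.Identity (Identity (..))

-- Minimal local Comonad class (illustrative stand-in).
class Functor w => Comonad w where
  extract   :: w a -> a
  duplicate :: w a -> w (w a)

instance Comonad Identity where
  extract   = runIdentity
  duplicate = Identity

hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo phi psi = h where h = phi . fmap h . psi

-- The carriers become (m a) and (w b); `return` and `extract` sit outside.
ghyloViaHylo
  :: (Comonad w, Monad m, Functor f)
  => (forall x. f (w x) -> w (f x))   -- distributive law for the fold side
  -> (forall x. m (f x) -> f (m x))   -- distributive law for the unfold side
  -> (f (w b) -> b)                   -- generalized algebra
  -> (a -> f (m a))                   -- generalized coalgebra
  -> a -> b
ghyloViaHylo kw km phi psi = extract . hylo alg coalg . return
  where
    alg   = fmap phi . kw . fmap duplicate   -- an fmap inside the algebra
    coalg = fmap join . km . fmap psi        -- and one inside the coalgebra

-- Distributive laws for the degenerate Identity case.
distW :: Functor f => f (Identity x) -> Identity (f x)
distW = Identity . fmap runIdentity

distM :: Functor f => Identity (f x) -> f (Identity x)
distM = fmap Identity . runIdentity

data ListF r = Nil | Cons Int r deriving Functor

-- Factorial again, now through the generalized signature (assumes n >= 0).
factorial :: Int -> Int
factorial = ghyloViaHylo distW distM phi psi
  where
    phi Nil                   = 1
    phi (Cons x (Identity r)) = x * r
    psi 0 = Nil
    psi n = Cons n (Identity (n - 1))
```

Unfolding `hylo alg coalg` gives `fmap phi . kw . fmap duplicate . fmap h . fmap join . km . fmap psi`, so the base functor is traversed by separate fmaps on every step, which is the inefficiency the comment says Yoneda[F] avoids.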

@ekmett

ekmett commented Mar 2, 2017

All the distributive laws do is 'shove crap back into the carriers' so that the recursion scheme can keep pushing it out of sight one more recursion level deeper in.

@ekmett

ekmett commented Mar 2, 2017

ghylo covers everything here except for Fokkinga's postpro/prepro and some adjoint fold cases that aren't expressible through this (co)monad transformer approach.

@sellout
Contributor

sellout commented Mar 2, 2017

Does this also work for the “elgot” variants? In Matryoshka, I’ve generalized the term from the usual “elgot {co}algebra” to type EAlgebra w f a = w (f a) => a (and the obvious dual).

@ekmett

ekmett commented Mar 2, 2017

Sticking to the basic version:

elgot :: Functor f => (f b -> b) -> (a -> Either b (f a)) -> a -> b
elgot phi psi = h where h = (id ||| phi . fmap h) . psi

doesn't fit the pattern of

hylo phi psi = h where h = phi . fmap h . psi

directly, because of that pesky (id |||) on the outside. That is the key to using elgot efficiently though. It means that we can build up layers one by one or choose to cheat and just tell the output its carrier directly.

But you should be able to change base functor from f to Compose (Either b) f and stylistically replace the algebra with one that just passes through the left case as its answer and otherwise does what the original did.

So in the elgot case you're changing the functor, not the carrier to shoehorn it into the hylo mold.
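
The change of base functor can be sketched as follows. The names `elgotViaHylo`, `collatzSteps`, and `collatzSteps'` are hypothetical, `(|||)` is spelled with `either` to stay within base, and the Collatz step counter is only an illustration of a coalgebra that short-circuits with `Left`.

```haskell
import Data.Functor.Compose (Compose (..))
import Data.Functor.Identity (Identity (..))

hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo phi psi = h where h = phi . fmap h . psi

-- Direct definition, as in the comment.
elgot :: Functor f => (f b -> b) -> (a -> Either b (f a)) -> a -> b
elgot phi psi = h where h = either id (phi . fmap h) . psi

-- The same scheme as a plain hylo over Compose (Either b) f.
-- The carrier stays b; only the functor changes.
elgotViaHylo :: Functor f => (f b -> b) -> (a -> Either b (f a)) -> a -> b
elgotViaHylo phi psi = hylo alg (Compose . psi)
  where
    alg (Compose (Left b))   = b       -- pass the shortcut answer through
    alg (Compose (Right fb)) = phi fb  -- otherwise do what phi did

-- Coalgebra: stop with an answer at 1, else take one Collatz step.
collatz :: Int -> Either Int (Identity Int)
collatz n
  | n <= 1    = Left 0
  | even n    = Right (Identity (n `div` 2))
  | otherwise = Right (Identity (3 * n + 1))

-- Algebra: each layer adds one step.
countStep :: Identity Int -> Int
countStep (Identity k) = k + 1

collatzSteps, collatzSteps' :: Int -> Int
collatzSteps  = elgot countStep collatz
collatzSteps' = elgotViaHylo countStep collatz
```

Both versions give 8 for an input of 6 (6, 3, 10, 5, 16, 8, 4, 2, 1). In the `Compose` encoding, `fmap h` leaves a `Left` untouched, which is how the `id` branch of the direct definition is recovered.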

@sellout
Contributor

sellout commented Mar 2, 2017

Thanks, this has all been super-informative.

@sellout
Contributor

sellout commented Mar 8, 2017

@b-studios I still think this PR is useful as-is (with a reordering of the parameters). The implementation might change (a la #72), but the signature should be the same.

Also, could you define elgot and coelgot in terms of elgotHylo? Preemptively get rid of a few more Recursion warnings 😆

@b-studios
Contributor Author

@sellout Should I also implement elgot(Cata|Ana) in terms of elgotHylo or is this counterproductive wrt your recursion elimination strategy?

@sellout
Contributor

sellout commented Mar 8, 2017

@b-studios Yeah, I think that’d be good. Eventually elgot(Cata|Ana) should be implemented in terms of cata or ana, but I’m currently stuck on how to do that. You can see I did the same with anaM, etc. in #72 – implement it in terms of some hylo until I can figure out how to do it in terms of ana.

@b-studios
Contributor Author

b-studios commented Mar 9, 2017

@sellout As promised on Gitter, I implemented elgot(Ana|Cata) and (co)Elgot in terms of ana|cata / hylo, respectively. I was slightly surprised that I did not need to use the elgotHylo -- which again seems like a point for "we don't actually need it".

To finalize this PR, I think we only need to change the argument order of elgotHylo as desired. However, in the discussion above I lost track of which argument order you'd prefer. I suggest the following to be somewhat compatible with the existing style in Matryoshka:

def elgotHylo[M[_]: Monad, W[_]: Comonad, F[_]: Functor, A, B]
      (a: A)
      (kf: DistributiveLaw[F, W], kg: DistributiveLaw[M, F])
      (φ: ElgotAlgebra[W, F, B], ψ: ElgotCoalgebra[M, F, A]): B = ???

@ekmett

ekmett commented Mar 9, 2017

In practice you probably don't want to implement them in terms of ana/cata, as you'll run into problems with performance. Scala isn't the best at inlining and/or dealing with Yoneda'd code. ;) I tend to favor just directly implementing them as a result.

@sellout
Contributor

sellout commented Mar 9, 2017

Yeah, that’s close. A couple small changes – reverse the order of the M and W type parameters, merge the distributive law & algebra parameter lists, and since we use Monads in two ways (as part of a gana as well as in anaM), we use N for the first type and M for the latter.

def elgotHylo[W[_]: Comonad, N[_]: Monad, F[_]: Functor, A, B]
  (a: A)
  (kf: DistributiveLaw[F, W], kg: DistributiveLaw[N, F], φ: ElgotAlgebra[W, F, B], ψ: ElgotCoalgebra[N, F, A])
    : B = ???

This’ll also need a merge from master, and probably some (hopefully minor) conflict resolution in the tests.

@sellout
Contributor

sellout commented Mar 9, 2017

@ekmett The reason for defining the unfolds in terms of ana is so that, e.g., Corecursive[Nu[F]].elgotAna will be lazy like Nu’s ana. #72 makes this change more generally, and I’d be more than happy to get your input on it.

@ekmett

ekmett commented Mar 9, 2017

The thing to make sure is that the stylized distributive law tweaks don't wind up causing you to map 2-3 times per iteration rather than once. Expand out the code for your distributive laws and make sure you don't wind up with, say:

f . fmap duplicate . fmap h . fmap join . g

rather than

f . fmap (duplicate . h . join) . g

Depending on the base functor, this cost can be tremendous. You can fuse them by using Yoneda to force the fmap to happen in one pass, but you'd need to benchmark / inspect the generated code to see how awful it is in practice.
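
A small Haskell sketch of the two shapes and of the Yoneda trick. The `Yoneda` type below is defined locally so the example is self-contained (it is modelled on the usual encoding, not taken from a specific library), and the list functions are placeholders standing in for `duplicate`, `h`, and `join`.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Yoneda f a holds an f-structure plus a pending function to map over it.
newtype Yoneda f a = Yoneda { runYoneda :: forall b. (a -> b) -> f b }

-- fmap only composes functions; it never touches the underlying f.
instance Functor (Yoneda f) where
  fmap g (Yoneda k) = Yoneda (\h -> k (h . g))

liftYoneda :: Functor f => f a -> Yoneda f a
liftYoneda fa = Yoneda (\h -> fmap h fa)

-- The single real traversal of f happens here.
lowerYoneda :: Yoneda f a -> f a
lowerYoneda (Yoneda k) = k id

-- Three separate traversals of the structure.
threePasses :: [Int] -> [Int]
threePasses = fmap (* 2) . fmap (+ 1) . fmap abs

-- One traversal, fused by hand using the functor law.
onePass :: [Int] -> [Int]
onePass = fmap ((* 2) . (+ 1) . abs)

-- Written like threePasses, but the fmaps accumulate inside Yoneda
-- and the list is traversed once, in lowerYoneda.
viaYoneda :: [Int] -> [Int]
viaYoneda = lowerYoneda . fmap (* 2) . fmap (+ 1) . fmap abs . liftYoneda
```

All three functions return `[4,6,8]` for `[-1,2,-3]`; they differ only in how many times the list is walked. Whether the Yoneda version is actually faster depends on the compiler and the base functor, so this shows the shape of the code, not a performance result.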

6 participants