
new Mendler-style API: more readable and more correct #106

Open · wants to merge 8 commits into master

Conversation

@gelisam (Collaborator) commented Sep 19, 2020

This is still a WIP, as it's missing documentation and will probably need to be changed in order to support `hylo`, but since it's such a big change, I would like to get some early feedback as to whether we agree that this is a good direction for the library.


new Mendler-style API: more readable and more correct

This is a huge, yet backwards-compatible change! It introduces a new API
which is both more readable and more correct than the previous API, and
it also changes the name of the recursion-schemes import.

First, the name of the module. "Data.Functor.Foldable" is a strange
choice of module name by modern standards, "RecursionSchemes" is better
because it matches the name of the library instead of trying to fit into
the "Data.<...>" / "Control.<...>" semantic hierarchy. By introducing
the new API and the new module name at the same time, it becomes
possible to preserve backwards compatibility simply by keeping the old
API at the old module name and adding the new API at the new module
name.

Second, the new API. One big advantage of Mendler-style
recursion-schemes such as `mcata` and `mhisto` is that the resulting
code is more readable. At least in my opinion, since readability is
always subjective. Still, compare the following four implementations of
`fib`:

    data Unary = Zero | Succ Unary
    makeBaseFunctor ''Unary

    fibA :: Unary -> Int
    fibA = \case
      Zero
        -> 1
      Succ Zero
        -> 1
      Succ nMinus1@(Succ nMinus2)
        -> fibA nMinus1 + fibA nMinus2

    fibB :: Unary -> Int
    fibB = histo $ \case
      ZeroF
        -> 1
      SuccF (_ :< ZeroF)
        -> 1
      SuccF (fibNMinus1 :< SuccF (fibNMinus2 :< _))
        -> fibNMinus1 + fibNMinus2

    fibC :: Fix (Base Unary) -> Int
    fibC = mhisto $ \recur split -> \case
      ZeroF
        -> 1
      SuccF nMinus1 -> case split nMinus1 of
        ZeroF
          -> 1
        SuccF nMinus2
          -> recur nMinus1 + recur nMinus2

    fibD :: RecFun Unary Int
    fibD = recFun histo $ \case
      ZeroF
        -> 1
      SuccF nMinus1 -> case project nMinus1 of
        ZeroF
          -> 1
        SuccF nMinus2
          -> recur nMinus1 + recur nMinus2

I would say that `fibA` is the most readable, as it directly expresses
that `fib 0 = 1`, `fib 1 = 1`, and `fib n = fib (n-1) + fib (n-2)`. But
that definition uses general recursion, which does not guarantee that
we're only making recursive calls on smaller terms. It is also very
inefficient.

`fibB` improves upon `fibA` by using the `histo` recursion scheme, which
performs the recursion on `fibB`'s behalf; and since `fibB` does not
make any recursive calls itself, it cannot accidentally make recursive
calls on terms which are not smaller. It is also much more efficient,
because the recursion structure is no longer branching in two; instead,
it combines two pre-computed results. However, `fibB`'s pattern-matching
is a lot messier: the structure of the `Unary` and the pre-computed
results are interleaved, and you have to be very familiar with the
details of the `Base Unary (Cofree (Base Unary) Int)` type on which it
is pattern-matching in order to know which position holds what. This
becomes even messier when combining multiple recursion-schemes, e.g.
with a zygo-paramorphism.

Furthermore, while the word "histo" is helpful to those who are familiar
with that name, as it immediately conveys the idea that we are going to
be combining recursively-pre-computed values from multiple depths, it is
deeply unhelpful to those who aren't, as the recursion structure of
`fibB` is no longer explicit; it is now hidden inside the implementation
of `histo`.

Next, `fibC` has the best of both worlds: the word `mhisto` conveys
the same idea to those who are familiar with the name "histo", while the
explicit `recur` calls clarify the recursion structure for those who
aren't. Furthermore, while the pattern-matching is not as readable as in
`fibA`, it is much more readable than in `fibB`, because we don't have
to bring all of the information into scope all at once. Instead, the
helper functions `recur` and `split` extract the bits we need when we
need them. We also get the same efficient implementation: `recur`
extracts the pre-computed value for that recursive position rather than
actually making a recursive call, but `recur` is still a very good name
for it because it returns the same result as if we were making a
recursive call to `fibC`. Moreover, the type of `recur` is restricted in
a way which guarantees that it can only be called on smaller terms.

Finally, `fibD` demonstrates the new API. Like `fibC`, it has the best
of both worlds, but in addition, the `recur` and `split` (renamed to
`project`, because we can reuse `Recursive.project`) functions are now
global instead of local. Also, while `Data.Functor.Foldable.histo` has a
`ghisto` variant which makes it possible to combine recursion-schemes,
`Data.Functor.Foldable.mhisto` doesn't. In contrast, the new API's
design does make it possible to provide a `histoT` variant while
retaining the advantages of the Mendler style.

See #105 (comment) for a similar example with `para` instead of
`histo`.

One last advantage of the new API is that it is more correct than the
old API. See #50 (comment) for an explanation of what is wrong with the
old API, and #51 for an explanation of the Gather/Scatter solution to
that problem. In short, generalized recursion-schemes such as `ghisto`
can sometimes combine in ways that appear type-correct but actually give
subtly-incorrect results, while the new API's transformers, such as
`histoT`, use a much simpler approach, in the hope of making the library
closer to "so simple that there are obviously no deficiencies" than to
"so complicated that there are no obvious deficiencies".

In the new API, every Recursive-based function is a catamorphism applied
to an algebra of type `base pos -> pos`. This `pos` is different for
every recursion scheme, and it contains all the data which the recursion
scheme needs to keep track of during the catamorphism. The user provides
a function of type `base pos -> a`, so `pos` must also have instances
which allow the user to call functions like `recur` on the recursive
positions. Each recursion scheme picks its `pos` so that the user is
only allowed to make the observations which make sense for that
recursion scheme. That's it. Well, sort of.

The user provides a function of type `base pos -> a`, but we need a
function of type `base pos -> pos`. The user's function explains how one
step of the recursion computes the value the user is interested in from
the values provided by the `base pos`. What we need is a function
explaining how one step of the recursion computes all the data we need
in order to continue the fold, including both the data which the user is
interested in and the data which the next incarnation of the user
function will be able to draw from. Each recursion scheme must thus
provide a "gather" function of type `a -> base pos -> pos`, which
extracts all that information and stores it in the `pos` along with the
`a` which the user function returned; after all, the next incarnation of
the user function must be able to call `recur` on that `pos` in order to
get back that `a`.
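
To make that shape concrete, here is a minimal sketch of the mechanism
just described. The names `Gather`, `foldWithGather` and `plainGather`
are hypothetical, not the PR's actual identifiers, and the real design
surely differs in its details:

    import Data.Functor.Foldable (Base, Recursive, cata)

    -- A gather function stores the user's result in pos, along with
    -- whatever else the recursion scheme needs to keep track of.
    type Gather base a pos = a -> base pos -> pos

    -- Run a catamorphism whose algebra is assembled from the user's
    -- function and the scheme's gather function, then read the final
    -- result back out of the resulting pos.
    foldWithGather
      :: Recursive t
      => Gather (Base t) a pos  -- supplied by the recursion scheme
      -> (pos -> a)             -- reads the stored result back out
      -> (Base t pos -> a)      -- supplied by the user
      -> t -> a
    foldWithGather gather recur userF =
      recur . cata (\bp -> gather (userF bp) bp)

    -- For plain cata, pos is simply a: gather keeps only the result.
    plainGather :: Gather base a a
    plainGather a _ = a

With these definitions, `foldWithGather plainGather id` behaves like an
ordinary `cata`.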

We might also want to combine recursion schemes; that's the feature for
which the old API brought in complex things like comonad transformers
and functions expressing distributive laws. In the new API, a recursion
scheme transformer such as `paraT` is associated with a `pos`
transformer, such as `ParaPosT`, and instead of providing a gather
function, it transforms a gather function on `pos` into a gather
function on `ParaPosT pos`.
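
Continuing the hypothetical sketch above (`ParaPos` and `paraGather`
are made-up stand-ins for the PR's `ParaPosT` machinery), a para-style
pos transformer might pair the underlying pos with the original subterm:

    import Data.Functor.Foldable (Base, Corecursive, embed)

    -- The transformed pos additionally carries the original subterm,
    -- so that a para-style observation can be offered on it.
    data ParaPos t pos = ParaPos
      { paraUntouched :: t    -- the original subterm, rebuilt via embed
      , paraInner     :: pos  -- whatever the underlying scheme tracks
      }

    -- Transform a gather on pos into a gather on ParaPos t pos.
    paraGather
      :: Corecursive t
      => Gather (Base t) a pos
      -> Gather (Base t) a (ParaPos t pos)
    paraGather gather a bp =
      ParaPos (embed (fmap paraUntouched bp))
              (gather a (fmap paraInner bp))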

The Corecursive-based functions work analogously, using an anamorphism
on a `pos -> base pos` co-algebra, scatter functions, and scatter
function transformers.
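
Dualizing the gather sketch above in the most naive way suggests
something like the following; this is purely illustrative, and the PR's
actual scatter type may well differ:

    -- A position either holds a seed for the user's co-algebra to
    -- expand, or one layer of already-built structure to keep
    -- unfolding.
    type Scatter base a pos = pos -> Either a (base pos)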

The `Scatter` version, where everything is an anamorphism, will be
implemented in a later commit.

Some details need to be public and stable in order for users to define
their own custom recursion schemes: the implementation details which are
subject to change are in `Internal`, while those which implementers of
custom recursion schemes need to know about, and which must thus be
stable, are in `Custom`.
@phadej (Contributor) left a comment

I don't see a lot of benefit for users who define a `Recursive` instance and just use `cata`, but I guess if I only use `cata`, nothing changes for me? Or does `cata` become `mcata`?

Also, I'm unhappy to see `{-# OVERLAPPABLE #-}` in the implementation (I barely tolerate them in GHC.Generics code).

--

I don't feel confident rewriting the library with such a major change. If @ekmett doesn't object, I won't object either.

@gelisam gelisam mentioned this pull request Sep 19, 2020
@gelisam (Collaborator, Author) commented Sep 19, 2020

> I don't see a lot of benefit for users who define a `Recursive` instance and just use `cata`

Indeed, with `cata` the recursive positions only contain one thing, the recursively-computed result, so having an explicit `recur` does not add much to readability.

> but I guess if I only use `cata`, nothing changes for me? Or does `cata` become `mcata`?

It's not quite as simple as `RecursionSchemes.cata = Data.Functor.Foldable.mcata`, but `RecursionSchemes.cata` is closer to `mcata` than to `Data.Functor.Foldable.cata`, yes:

    -- |
    -- >>> sumA [1..3]
    -- 6
    sumA :: [Int] -> Int
    sumA = Data.Functor.Foldable.cata $ \case
      Nil       -> 0
      Cons x xs -> x + xs

    -- |
    -- >>> sumB $ refix [1..3]
    -- 6
    sumB :: Fix (ListF Int) -> Int
    sumB = Data.Functor.Foldable.mcata $ \recur -> \case
      Nil       -> 0
      Cons x xs -> x + recur xs

    -- |
    -- >>> runRecFun sumC [1..3]
    -- 6
    sumC :: RecFun [Int] Int
    sumC = recFun RecursionSchemes.cata $ \case
      Nil       -> 0
      Cons x xs -> x + recur xs

The extra `recFun` and `runRecFun` noise is needed for the fancier features, but if you're not using the fancier features, I guess the new API is clearly a downgrade for you since it only adds more boilerplate :(

> Also, I'm unhappy to see `{-# OVERLAPPABLE #-}` in the implementation (I barely tolerate them in GHC.Generics code).

Can you explain why? One reason to be wary of `OVERLAPPABLE` is that the behaviour can differ depending on whether or not the more-precise instance is in scope, but I don't think that is likely to happen in this case. Each recursion scheme defines its own typeclass, its own pos transformer, one instance for its own pos transformer, and one `OVERLAPPABLE` instance covering every other pos transformer, thus avoiding the quadratic-instances problem. Since all the instances you will ever need are defined in the module which defines the typeclass, you can never accidentally have fewer instances in scope than intended.
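
A minimal sketch of the instance pattern being described, reusing the hypothetical `ParaPos` from earlier (the names are made up; the PR's actual classes differ):

    {-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}

    -- Every pos transformer can be peeled to reveal the pos it wraps.
    class InnerPos f where
      innerPos :: f pos -> pos

    -- The observation class offered by the para scheme.
    class HasUntouched t pos where
      untouched :: pos -> t

    -- One specific instance, for para's own pos transformer...
    instance HasUntouched t (ParaPos t pos) where
      untouched = paraUntouched

    -- ...and a single OVERLAPPABLE instance which peels any other pos
    -- transformer and keeps looking inward, avoiding a quadratic
    -- number of instances.
    instance {-# OVERLAPPABLE #-} (InnerPos f, HasUntouched t pos)
          => HasUntouched t (f pos) where
      untouched = untouched . innerPos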

> I don't feel confident rewriting the library with such a major change.

It is indeed quite a big change, but I think the readability improvement is worth it, especially for new users. Of course, if that is done at the expense of existing users, that might still not be worthwhile. If you're only using `cata`, then I think we might both be atypical users; perhaps we should poll users to see if they like the new syntax?

> If @ekmett doesn't object, I won't object either.

@ekmett asked me if I wanted to take over the project precisely because my suggested change, #51, strayed too far off the beaten path by abandoning the academically-supported comonadic approach in favour of an unpublished gather/scatter approach. This PR is in the same vein: in addition to implementing that gather/scatter approach, it also moves the API away from the academically-supported approach of representing recursion schemes via higher-order functions. Instead, this PR represents recursion schemes as composable pieces which describe the shape of the recursion and constrain the user function so that it matches that shape; e.g. `para` constrains the user function so that it can only call `recur` or `untouched` on the recursive positions. So I think he would feel similarly: this is fine as long as somebody else supports it :)

@jhgarner

From a boilerplate-reduction point of view, does it make sense to define `RecursionSchemes.cata` and the like as `runRecFun . recFun . RecursionSchemes.cataGather`, where `cataGather` is the current `RecursionSchemes.cata`? I guess a related question is: what's the benefit of keeping a `RecFun` around, from the user's point of view? It seems like the only thing you can do with it is run it.

This all looks really cool though and I'm excited to see what comes of it!

@gelisam (Collaborator, Author) commented Sep 21, 2020

In my first iteration of this design, `recFun` returned a recursive function, as you are recommending, and as in the current API. This hides the fact that a recursive function happens to be implemented by a recursion scheme, which makes sense, because that seems to be an implementation detail.

However, in both APIs, `zygo` can only help improve the performance of your calls to helper functions if those helper functions are themselves exposed as an algebra rather than (or in addition to) a recursive function. This is unfortunate, because if the fact that a helper function happens to be implemented using a recursion scheme is indeed an implementation detail, then the algebra is clearly also an implementation detail, and so it makes more sense to hide it in a where clause than to expose it as a separate function. There is thus tension in the API: one part encourages hiding the implementation details, and another part encourages exposing them.

I realized that I had an opportunity to improve on this by choosing only one side. I chose to expose the fact that a computation is implemented using a recursion scheme, by exposing it as a `RecFun` instead of a recursive function. Having to call `runRecFun` at the call site is definitely a downside, but it has the advantage that if you are calling `runRecFun f (untouched pos)`, you can now have a linter rule teaching you to use `zygo` instead.

@gelisam (Collaborator, Author) commented Sep 21, 2020

That is, in addition to calling `runRecFun`, the other thing which users can do with a `RecFun` is to call `withZygotizedRecFun`.

@ocharles

> However, in both APIs, `zygo` can only help improve the performance of your calls to helper functions if those helper functions are themselves exposed as an algebra rather than (or in addition to) a recursive function. This is unfortunate, because if the fact that a helper function happens to be implemented using a recursion scheme is indeed an implementation detail, then the algebra is clearly also an implementation detail, and so it makes more sense to hide it in a where clause than to expose it as a separate function. There is thus tension in the API: one part encourages hiding the implementation details, and another part encourages exposing them.

I was musing on this today and wonder if you can get the best of both worlds with rewrite rules. You would be at the mercy of the optimizer, though, which is a shame. But basically, what you want is for `zygo` to "see" the `cata` it's been given, which looks like a rewrite rule on `zygo (cata f)`.
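
For readers who haven't used rewrite rules: a RULES pragma asks GHC to replace one expression shape with another during optimization. The classic example is the map/map fusion rule from the GHC users guide; a hypothetical rule targeting `zygo (cata f)` would have the same overall shape:

    {-# RULES
    "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
      #-}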

@ocharles

Maybe another option is for `RecFun` to be a type class on the arrow: if you give it to `zygo` you get a `RecFun`, but there would also be an instance for `->` to just run it? Things are getting very hard to discover now, though :(

@gelisam (Collaborator, Author) commented Sep 22, 2020

> But basically, what you want is for `zygo` to "see" the `cata` it's been given, which looks like a rewrite rule on `zygo (cata f)`.

I really like that idea! It doesn't quite type-check, since `zygo` takes the algebra `f` rather than the recursive function `cata f`. Hmm, rewrite rules are supposed to preserve the types, but the compiler doesn't check that. Would it be too evil to rewrite `zygo (cata f)` to `zygo f`, so that you get a type error at compile time if the rewrite does not happen?

@gelisam (Collaborator, Author) commented Sep 22, 2020

> Maybe another option is for `RecFun` to be a type class on the arrow

This one, however, feels like the kind of hack which would be more at home in a Ruby codebase than a Haskell codebase :) Do you know of other libraries which do this?

It's the black sheep of recursion schemes. It's a faster version of
`ana f >>> cata g`, but `ana f >>> cata g` is a bad idea because it
doesn't guarantee termination: `cata` is guaranteed to terminate given a
finite input, but `ana` is only guaranteed to be productive; it does not
guarantee a finite output.
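
For reference, this commit note is apparently describing `hylo`, whose
standard definition fuses the unfold and the fold so that the
intermediate structure is never actually built:

    hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
    hylo alg coalg = alg . fmap (hylo alg coalg) . coalg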
@treeowl (Collaborator) commented May 13, 2021

I hate overlapping instances with a passion. Can the trick of faking instance chains with type families work here?

@treeowl (Collaborator) commented May 13, 2021

Rewrite rules don't preserve types? Are you sure about that?

@gelisam (Collaborator, Author) commented May 14, 2021

> if you're not using the fancier features, I guess the new API is clearly a downgrade for you since it only adds more boilerplate :(

I am now thinking of renaming `RecursionSchemes.cata` to something like `RecursionSchemes.directRecursion`, so that the name `RecursionSchemes.cata` can be used for a higher-order function of type `Foldable t => (Base t a -> a) -> RecFun t a`. This way, users can choose either `recFun RecursionSchemes.directRecursion` if they want to make the recursive calls explicit, or `RecursionSchemes.cata` if they don't. In both cases, they get the ability to use the resulting `RecFun t a` as a helper function in zygomorphisms.

Same for `ana`, probably, but I am not planning to give the same treatment to `zygo`, `histo` and friends, because their API is just too cryptic; I don't think it's worth it. Still, given the `RecursionSchemes.directRecursion` precedent, I think it would be worth finding other names for the thing which tells `recFun` what kind of recursion we're doing, and letting the traditional recursion-scheme names refer to the higher-order function version, if any.

@gelisam (Collaborator, Author) commented May 14, 2021

> I hate overlapping instances with a passion.

Can you explain why? I've already addressed its main downside in an earlier comment.

> Can the trick of faking instance chains with type families work here?

Do you have an example of that trick? I am using overlapping instances because I thought it was the canonical trick to avoid having to write a quadratic number of instances. Is there a different trick which accomplishes the same goal I should learn about?

@gelisam (Collaborator, Author) commented May 14, 2021

> Rewrite rules don't preserve types? Are you sure about that?

Looks like I misremembered, as the documentation clearly states that both sides of the rewrite rule must have the same type. Thanks for the correction!

@gelisam (Collaborator, Author) commented May 14, 2021

@treeowl, I have found a blog post which explains how to use a type family to avoid OverlappingInstances; is this what you were referring to?

Since there is only one type for which I want to use a different instance, I do think the trick will work. I still think overlapping instances aren't causing any trouble in this particular case, but if they were, it would be easy to avoid them using that trick. Thanks for indirectly teaching me that trick!

@treeowl (Collaborator) commented May 14, 2021

> > Can the trick of faking instance chains with type families work here?
>
> Do you have an example of that trick? I am using overlapping instances because I thought it was the canonical trick to avoid having to write a quadratic number of instances. Is there a different trick which accomplishes the same goal I should learn about?

The idea is that you take much more control of instance resolution. I don't have time to dig into your code this minute, but here's the gist (with overly generic names).

Say you have

    class C a b where
      meth :: ...

and you want to make some overlapping instances for it, all tightly controlled. Then you write a closed type family to select the instance, and an auxiliary class to use it.

    data WhichInstance = Mine | Yours

    type family Choose a b :: WhichInstance where
      ....

    class C' (tag :: WhichInstance) a b where
      -- This can take a passed proxy for the tag, or use Data.Tagged.Tagged,
      -- or in modern GHC you can use type applications.
      meth' :: ...

    -- These instances don't overlap, no matter what a and b look like.
    instance C' 'Mine ....
    instance C' 'Yours ....

    instance (tag ~ Choose a b, C' tag a b) => C a b where
      meth = meth' @tag
      -- or meth' (Proxy :: Proxy tag)
      -- or proxy meth' (Proxy :: Proxy tag) for Tagged

In this example, the `C` class actually becomes a bit redundant (except for documentation purposes); you can just make `meth` a function with the appropriate constraints. But if you only want to use a type family for some instances, you can do that too. Something like

    instance (tag ~ Choose a b, C' tag a b) => C (Maybe a) b where
      meth = meth' @tag
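
To make the gist concrete, here is a toy instantiation of the same trick with one class parameter instead of two (`Describe` and `describe` are made-up names): `Int` gets a special case, everything else gets the default, and no instances overlap:

    {-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleInstances,
                 KindSignatures, MultiParamTypeClasses, ScopedTypeVariables,
                 TypeApplications, TypeFamilies, UndecidableInstances #-}

    data WhichInstance = Mine | Yours

    -- The closed type family takes full control of instance selection:
    -- Int gets the special case, everything else the default.
    type family Choose a :: WhichInstance where
      Choose Int = 'Mine
      Choose a   = 'Yours

    class Describe' (tag :: WhichInstance) a where
      describe' :: a -> String

    -- These instances never overlap, thanks to the tag.
    instance Describe' 'Mine Int where
      describe' n = "the Int " ++ show n

    instance Describe' 'Yours a where
      describe' _ = "something else"

    class Describe a where
      describe :: a -> String

    instance (Choose a ~ tag, Describe' tag a) => Describe a where
      describe = describe' @tag

`describe (3 :: Int)` then yields "the Int 3", while `describe True` falls through to the default, with the choice made entirely by the closed type family.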
