new Mendler-style API: more readable and more correct #106
base: master
Conversation
This is a huge, yet backwards-compatible change! It introduces a new API which is both more readable and more correct than the previous API, and it also changes the name of the recursion-schemes import.

First, the name of the module. "Data.Functor.Foldable" is a strange choice of module name by modern standards; "RecursionSchemes" is better because it matches the name of the library instead of trying to fit into the "Data.<...>" / "Control.<...>" semantic hierarchy. By introducing the new API and the new module name at the same time, it becomes possible to preserve backwards compatibility simply by keeping the old API at the old module name and adding the new API at the new module name.

Second, the new API. One big advantage of Mendler-style recursion schemes such as `mcata` and `mhisto` is that the resulting code is more readable. At least in my opinion, since readability is always subjective. Still, compare the following four implementations of `fib`:

```haskell
data Unary = Zero | Succ Unary
makeBaseFunctor ''Unary

fibA :: Unary -> Int
fibA = \case
  Zero -> 1
  Succ Zero -> 1
  Succ nMinus1@(Succ nMinus2) -> fibA nMinus1 + fibA nMinus2

fibB :: Unary -> Int
fibB = histo $ \case
  ZeroF -> 1
  SuccF (_ :< ZeroF) -> 1
  SuccF (fibNMinus1 :< SuccF (fibNMinus2 :< _)) -> fibNMinus1 + fibNMinus2

fibC :: Fix (Base Unary) -> Int
fibC = mhisto $ \recur split -> \case
  ZeroF -> 1
  SuccF nMinus1 -> case split nMinus1 of
    ZeroF -> 1
    SuccF nMinus2 -> recur nMinus1 + recur nMinus2

fibD :: RecFun Unary Int
fibD = recFun histo $ \case
  ZeroF -> 1
  SuccF nMinus1 -> case project nMinus1 of
    ZeroF -> 1
    SuccF nMinus2 -> recur nMinus1 + recur nMinus2
```

I would say that `fibA` is the most readable, as it directly expresses that `fib 0 = 1`, `fib 1 = 1`, and `fib n = fib (n-1) + fib (n-2)`. But that definition uses general recursion, which does not guarantee that we're only making recursive calls on smaller terms. It is also very inefficient.
`fibB` improves upon `fibA` by using the `histo` recursion scheme, which performs the recursion on `fibB`'s behalf; and since `fibB` does not make any recursive calls itself, it cannot accidentally make recursive calls on terms which are not smaller. It is also much more efficient, because the recursion structure no longer branches in two; instead it combines two pre-computed results. However, `fibB`'s pattern-matching is a lot messier: the structure of the `Unary` and the pre-computed results are interleaved, and you have to be very familiar with the details of the `Base Unary (Cofree (Base Unary) Int)` type on which it is pattern-matching in order to know which position holds what. This becomes even messier when combining multiple recursion schemes, e.g. with a zygo-paramorphism.

Furthermore, while the word "histo" is helpful to those who are familiar with that name, as it immediately conveys the idea that we are going to be combining recursively-pre-computed values from multiple depths, it is deeply unhelpful to those who aren't, as the recursion structure of `fibB` is no longer explicit: it is now hidden inside the implementation of `histo`.

Next, `fibC` has the best of both worlds: the word `mhisto` conveys the same idea to those who are familiar with the name "histo", while the explicit `recur` calls clarify the recursion structure for those who aren't. Furthermore, while the pattern-matching is not as readable as in `fibA`, it is much more readable than in `fibB`, because we don't have to bring all of the information into scope all at once. Instead, the helper functions `recur` and `split` extract the bits we need when we need them. We also get the same efficient implementation; `recur` extracts the pre-computed value for that recursive position, it does not actually make a recursive call, but `recur` is still a very good name for it because it returns the same result as if we were making a recursive call to `fibC`.
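As a point of reference, `Data.Functor.Foldable.mhisto` has roughly the following rank-2 shape; this is a minimal self-contained sketch, and the library's actual implementation may differ (in particular, a real implementation can memoize what this naive version recomputes):

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Fix f = Fix { unFix :: f (Fix f) }

-- The algebra sees each recursive position only at the abstract type
-- `y`; all it can do with a `y` is call `recur` (fold that subterm)
-- or `split` (peel off one more layer of structure).
mhisto :: (forall y. (y -> c) -> (y -> f y) -> f y -> c) -> Fix f -> c
mhisto phi = phi (mhisto phi) unFix . unFix
```

In this naive sketch `recur` simply recurses; the point of the Mendler shape is the universally-quantified `y`, which prevents the algebra from conjuring up positions of its own.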
Moreover, the type of `recur` is restricted in a way which guarantees that it can only be called on smaller terms.

Finally, `fibD` demonstrates the new API. Like `fibC`, it has the best of both worlds, but in addition, the `recur` and `split` (renamed to `project`, because we can reuse `Recursive.project`) functions are now global instead of local. Also, while `Data.Functor.Foldable.histo` has a `ghisto` variant which makes it possible to combine recursion schemes, `Data.Functor.Foldable.mhisto` doesn't. In contrast, the new API's design does make it possible to provide a `histoT` variant while retaining the advantages of the Mendler style. See #105 (comment) for a similar example with `para` instead of `histo`.

One last advantage of the new API is that it is more correct than the old API. See #50 (comment) for an explanation of what is wrong with the old API, and #51 for an explanation of the Gather/Scatter solution to that problem. In short, generalized recursion schemes such as `ghisto` can sometimes combine in ways that appear type-correct but actually give subtly-incorrect results, while the new API's transformers, such as `histoT`, use a much simpler approach, in the hope of making the library closer to "so simple that there are obviously no deficiencies" than to "so complicated that there are no obvious deficiencies".

In the new API, everything is a catamorphism applied to an algebra of type `base pos -> pos`. This `pos` is different for every recursion scheme, and it contains all the data which the recursion scheme needs to keep track of during the catamorphism. The user provides a function of type `base pos -> a`, so `pos` must also have instances which allow the user to call functions like `recur` on the recursive positions. Each recursion scheme picks its `pos` so that the user is only allowed to make the observations which make sense for that recursion scheme. That's it. Well, sort of.
The user provides a function of type `base pos -> a`, but we need a function of type `base pos -> pos`. The user's function explains how one step of the recursion computes the value the user is interested in from the values provided by the `base pos`. What we need is a function explaining how one step of the recursion computes all the data we need in order to continue the fold, including both the data which the user is interested in and the data which the next incarnation of the user function will be able to draw from. Each recursion scheme must thus provide a "gather" function of type `a -> base pos -> pos`, which extracts all that information and stores it in the `pos` along with the `a` which the user function returned; after all, the next incarnation of the user function must be able to call `recur` on that `pos` in order to get back that `a`.

We might also want to combine recursion schemes; that's the feature for which the old API brought in complex things like comonad transformers and functions expressing distributive laws. In the new API, a recursion scheme transformer such as `paraT` is associated with a `pos` transformer, such as `ParaPosT`; instead of providing a gather function, it transforms a gather function on `pos` into a gather function on `ParaPosT pos`. The `Scatter` version, where everything is an anamorphism, will be implemented in a later commit.
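To make the gather idea concrete, here is a minimal reconstruction under my own names (`foldWith`, `paraGatherT`, etc.); the PR's actual definitions surely differ, but the shapes match the description above:

```haskell
import Control.Comonad.Cofree (Cofree (..))  -- from the "free" package

newtype Fix f = Fix { unFix :: f (Fix f) }

cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

-- One step of the fold: run the user's `base pos -> a` function, then
-- gather its result together with the sub-results into the `pos` that
-- the enclosing step will observe.
foldWith :: Functor f => (a -> f pos -> pos) -> (f pos -> a) -> Fix f -> pos
foldWith gather user = cata (\fpos -> gather (user fpos) fpos)

-- histo as an instance of the pattern: its pos is `Cofree f a`, its
-- gather is `(:<)`, and its `recur` is extracting the head.
histo :: Functor f => (f (Cofree f a) -> a) -> Fix f -> a
histo user = (\(a :< _) -> a) . foldWith (:<) user

-- A pos transformer in the same style: the para transformer pairs each
-- pos with the rebuilt original subterm, and turns a gather on `pos`
-- into a gather on the transformed pos.
newtype ParaPosT f pos = ParaPosT { unParaPosT :: (Fix f, pos) }

paraGatherT :: Functor f
            => (a -> f pos -> pos)
            -> (a -> f (ParaPosT f pos) -> ParaPosT f pos)
paraGatherT gather a fpos =
  ParaPosT ( Fix (fmap (fst . unParaPosT) fpos)
           , gather a (fmap (snd . unParaPosT) fpos) )
```

Everything remains a single `cata`; each scheme only chooses its `pos`, its gather, and the observations it exposes on `pos`.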
details need to be public and stable in order for users to define their own custom recursion schemes
the implementation details which are subject to change are in Internal; the implementation details which implementers of custom recursion schemes need to know about, and which must thus be stable, are in Custom.
I don't see a lot of benefit for users who define a `Recursive` instance and just use `cata`, but I guess if I only use `cata` nothing changes for me? Or does `cata` become `mcata`?

Also I'm unhappy to see `{-# OVERLAPPABLE #-}` in the implementation (I barely tolerate them in `GHC.Generics` code).
--
I don't feel confident rewriting the library with such a major change. If @ekmett doesn't object, I won't object either.
Indeed, with
It's not quite as simple as

```haskell
-- |
-- >>> sumA [1..3]
-- 6
sumA :: [Int] -> Int
sumA = Data.Functor.Foldable.cata $ \case
  Nil -> 0
  Cons x xs -> x + xs

-- |
-- >>> sumB $ refix [1..3]
-- 6
sumB :: Fix (ListF Int) -> Int
sumB = Data.Functor.Foldable.mcata $ \recur -> \case
  Nil -> 0
  Cons x xs -> x + recur xs

-- |
-- >>> runRecFun sumC [1..3]
sumC :: RecFun [Int] Int
sumC = recFun RecursionSchemes.cata $ \case
  Nil -> 0
  Cons x xs -> x + recur xs
```

The extra
Can you explain why? One reason to be wary of
It is indeed quite a big change, but I think the readability improvement is worth it, especially for new users. Of course, if that is done at the expense of existing users, that might still not be worthwhile. If you're only using
@ekmett asked me if I wanted to take over the project precisely because my suggested change, #51, strayed too far off the beaten path by replacing the academically-supported comonadic approach with an unpublished gather/scatter approach. This PR is in the same vein: in addition to implementing that gather/scatter approach, this PR also changes the API away from the academically-supported approach of representing recursion schemes via higher-order functions. Instead, this PR represents recursion schemes as composable pieces which describe the shape of the recursion and constrain the user function so that it matches that shape, e.g.
From a reducing-boilerplate point of view, does it make sense to define RecursionSchemes.cata and the like as This all looks really cool though and I'm excited to see what comes of it!
In my first iteration of this design, However, in both APIs, I realized that I had an opportunity to improve on this, by choosing only one side. I chose to expose the fact that a computation is implemented using a recursion scheme, by exposing it as a
That is, in addition to calling
I was musing on this today and wonder if you can get the best of both worlds with rewrite rules. You would be at the mercy of the optimizer though, which is a shame. But basically, what you want is for
Maybe another option is for
I really like that idea! It doesn't quite type-check, since
This one, however, feels like the kind of hack which would be more at home in a Ruby codebase than a Haskell codebase :) Do you know of other libraries which do this?
it's the black sheep of recursion schemes. it's a faster version of `ana f >>> cata g`, but `ana f >>> cata g` is a bad idea because it doesn't guarantee termination: `cata` is guaranteed to terminate given a finite input, but `ana` is only guaranteed to be productive; it does not guarantee to generate a finite output.
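(For reference, the standard definition of `hylo` shows why it is faster: each unfolded layer is consumed immediately, so the intermediate structure is deforested — but the termination caveat above still applies.)

```haskell
-- Standard definition: the fold and the unfold are fused, so no
-- intermediate fixed-point structure is ever built.
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo f g = f . fmap (hylo f g) . g
```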
I hate overlapping instances with a passion. Can the trick of faking instance chains with type families work here?
Rewrite rules don't preserve types? Are you sure about that?
I am now thinking of renaming Same for
Can you explain why? I've already addressed its main downside in an earlier comment.
Do you have an example of that trick? I am using overlapping instances because I thought it was the canonical trick to avoid having to write a quadratic number of instances. Is there a different trick which accomplishes the same goal I should learn about?
Looks like I misremembered, as the documentation clearly states that both sides of the rewrite rule must have the same type. Thanks for the correction!
@treeowl, I have found a blog post which explains how to use a type family to avoid OverlappingInstances, is this what you were referring to? Since there is only one type for which I want to use a different instance, I do think the trick will work. I still think overlapping instances aren't causing any trouble in this particular case, but if they were, it would be easy to avoid them using that trick. Thanks for indirectly teaching me that trick!
The idea is that you take much more control of instance resolution. I don't have time to dig into your code this minute, but here's the gist (with overly generic names). Say you have

```haskell
class C a b where
  meth :: ...
```

and you want to make some overlapping instances for it, all tightly controlled. Then you write a closed type family to select the instance, and an auxiliary class to use it.

```haskell
data WhichInstance = Mine | Yours

type family Choose a b :: WhichInstance where
  ....

class C' (tag :: WhichInstance) a b where
  -- This can take a passed proxy for the tag, or use Data.Tagged.Tagged,
  -- or in modern GHC you can use type applications.
  meth' :: ...

-- These instances don't overlap, no matter what a and b look like.
instance C' 'Mine ....
instance C' 'Yours ....

instance (tag ~ Choose a b, C' tag a b) => C a b where
  meth = meth' @tag
  -- or meth' (Proxy :: Proxy tag)
  -- or proxy meth' (Proxy :: Proxy tag) for Tagged
```

In this example, the

```haskell
instance (tag ~ Choose a b, C' tag a b) => C (Maybe a) b where
  meth = meth' @tag
```
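Here is a complete toy instantiation of the trick, with my own example names (none of this is code from the thread): a `Describe` class which special-cases `String` without any overlap pragmas.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}

data WhichInstance = Special | General

-- The closed type family selects the instance; the first matching
-- equation wins, just like overlap resolution, but explicitly.
type family ChooseDescribe a :: WhichInstance where
  ChooseDescribe String = 'Special
  ChooseDescribe a      = 'General

class Describe' (tag :: WhichInstance) a where
  describe' :: a -> String

instance Describe' 'Special String where
  describe' s = "the string " ++ show s

instance Show a => Describe' 'General a where
  describe' = show

class Describe a where
  describe :: a -> String

-- A single Describe instance dispatches through the family, so GHC
-- never has to compare overlapping instance heads.
instance Describe' (ChooseDescribe a) a => Describe a where
  describe = describe' @(ChooseDescribe a)
```

With this in place, `describe "hi"` resolves through `'Special` and `describe (42 :: Int)` through `'General`; adding more special cases only means adding equations to `ChooseDescribe`.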
This is still a WIP, as it's missing documentation and will probably need to be changed in order to support `hylo`, but since it's such a big change, I would like to get some early feedback as to whether we agree that this is a good direction for the library.