
absorption/lifting typeclass to generalize a few concepts #3472

Open
johnynek opened this issue Jun 13, 2020 · 12 comments

Comments

@johnynek
Contributor

johnynek commented Jun 13, 2020

We have FunctorFilter and TraverseFilter. I have in the past proposed a generalization (FunctorFlatten, later AlternativeFlatten: #1337).

But it seems to me the above are actually doing something like:

trait Absorbs[F[_], G[_]] {
  def absorb[A](f: F[G[A]]): F[A]
}

In the case of FunctorFilter[F] what you are saying is you have Functor[F] and Absorbs[F, Option]. In the case of FunctorFlatten what you are saying is that you have Functor[F] and Absorbs[F, G] for all G[_]: Foldable. You can imagine Absorb[F, Eval] when you have Defer[F] and Functor[F].
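
For instance, here is a minimal sketch of how FunctorFilter-style filtering falls out of Functor[F] plus Absorbs[F, Option]; the mapFilter helper below is hypothetical, just to show the shape:

import cats.Functor
import cats.syntax.functor._

// hypothetical: FunctorFilter's mapFilter recovered from Functor[F] + Absorbs[F, Option]
def mapFilter[F[_]: Functor, A, B](fa: F[A])(fn: A => Option[B])(implicit abs: Absorbs[F, Option]): F[B] =
  abs.absorb(fa.map(fn))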

Now, as stated above, Absorbs is lawless. It is just describing shapes. But there is a way to talk about laws depending on what constraints we put on G[_]. Consider the case of FunctorFilter[F], where we are talking about Absorbs[F, Option].

I think the law here is monadic bind on option:

val f1: A0 => Option[A1] = ???
val f2: A1 => Option[A2] = ???
//...
val fa: F[A0] = ???
fa.map(f1).map(_.flatMap(f2)).absorb == fa.map(f1).absorb.map(f2).absorb

Indeed, all of the examples I'm discussing are about absorbing a Monad: Option, List, Vector. So, let's say the first parameter is F[_]: Functor and the second is a monad, G[_]: Monad; then I think we can cover four cases: FunctorFilter, TraverseFilter, AlternativeFlatten, and (see below) LiftIO.
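
As an illustrative check of the law above, take F = List and G = Option, with flatMap(_.toList) standing in for absorb (these concrete choices are mine):

def absorbOpt[A](loa: List[Option[A]]): List[A] = loa.flatMap(_.toList)

val f1: Int => Option[Int] = x => if (x % 2 == 0) Some(x + 1) else None
val f2: Int => Option[Int] = x => if (x > 2) Some(x * 10) else None
val xs: List[Int] = List(1, 2, 3, 4)

// both sides evaluate to List(30, 50)
assert(absorbOpt(xs.map(f1).map(_.flatMap(f2))) == absorbOpt(absorbOpt(xs.map(f1)).map(f2)))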

So, a bit more fleshed out:

trait Absorbs[F[_], G[_]] {
  def functorF: Functor[F]
  def monadG: Monad[G]

  def absorb[A](f: F[G[A]]): F[A]

  def absorbMap[A, B](f: F[A])(fn: A => G[B]): F[B] =
    absorb(functorF.map(f)(fn))
}

I think these things compose: if you have Absorbs[F, G1] and Absorbs[G1, G2] then you have Absorbs[F, G2]. This is the case when the inner parameter is a monad and the outer a functor: you can go F[G2[A]] => F[G1[G2[A]]] using map and G1.pure, then use map(_.absorb) to get F[G1[A]], then finally the first absorb to get to F[A].
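
A minimal sketch of that composition in terms of the trait above (the name composeAbsorbs is mine):

def composeAbsorbs[F[_], G1[_], G2[_]](implicit
    outer: Absorbs[F, G1],
    inner: Absorbs[G1, G2]): Absorbs[F, G2] =
  new Absorbs[F, G2] {
    def functorF: Functor[F] = outer.functorF
    def monadG: Monad[G2] = inner.monadG
    def absorb[A](fg2a: F[G2[A]]): F[A] =
      // F[G2[A]] => F[G1[G2[A]]] via G1.pure, then G1 absorbs G2, then F absorbs G1
      outer.absorb(functorF.map(fg2a)(g2a => inner.absorb(outer.monadG.pure(g2a))))
  }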

To recall the motivation of #1337, scalding's TypedPipe can absorb foldable things: Absorbs[TypedPipe, List]. Similarly with spark: Absorbs[RDD, List], Absorbs[Dataset, List].

Lastly, I will note that LiftIO[F] is quite similar to Absorb[F, IO] (if you have IO[A] you can use pure to get F[IO[A]], then absorb to get F[A]). So, Applicative[F] and Absorb[F, IO] are sufficient for LiftIO[F].
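
A sketch of that claim (IO here is cats.effect.IO, and liftIO just mirrors LiftIO's method):

import cats.Applicative
import cats.effect.IO

def liftIO[F[_], A](ioa: IO[A])(implicit F: Applicative[F], abs: Absorbs[F, IO]): F[A] =
  abs.absorb(F.pure(ioa))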

Note, all the list-like collections can absorb each other: Absorb[List, Vector], Absorb[List, Chain], etc., and of course they can absorb options: Absorb[List, Option].
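
For example, a sketch of the List/Option instance against the trait above:

import cats.{Functor, Monad}
import cats.implicits._

implicit val listAbsorbsOption: Absorbs[List, Option] =
  new Absorbs[List, Option] {
    def functorF: Functor[List] = Functor[List]
    def monadG: Monad[Option] = Monad[Option]
    def absorb[A](loa: List[Option[A]]): List[A] = loa.flatMap(_.toList)
  }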

For all Monads, Absorb[F, F] can be defined.
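
Sketched against the trait above, that instance is just flatten:

implicit def monadAbsorbsItself[F[_]](implicit F: Monad[F]): Absorbs[F, F] =
  new Absorbs[F, F] {
    def functorF: Functor[F] = F
    def monadG: Monad[F] = F
    def absorb[A](ffa: F[F[A]]): F[A] = F.flatten(ffa)
  }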

This is an idea I've been thinking about a bit here and there. There may be prior literature on it (I may have even seen it and forgotten it; if so, I apologize for forgetting). It seems to me to be a bit more principled than what we have now: a few ad-hoc examples of absorption.

@djspiewak
Member

As a note on related work, take a look at MonadPartialOrder, which was originally introduced by Oleg (I've been told; I have yet to find the reference) and is now one of the underpinnings of cats-mtl.

@johnynek
Contributor Author

Yeah, it seems that MonadPartialOrder is more restrictive:
https://github.com/typelevel/cats-mtl/blob/master/core/src/main/scala/cats/mtl/MonadPartialOrder.scala

It requires the outer F[_] to be a Monad. In the above, I limit it to Functor and I think keep it lawful. Moreover, the cases I was thinking of (scalding, spark, other data-infra systems) are not generally Monads; they are usually Alternative, and always Functors.

@djspiewak
Member

Regarding some more general thoughts…

// I think "Absorb" is a more natural name btw
trait FlatMap[F[_]] extends Apply[F] with Absorbs[F, F] {
  def absorb[A](ffa: F[F[A]]): F[A] = flatten(ffa)
}

implicit def absorbForInjectK[F[_], G[_]](implicit G: FlatMap[G], I: InjectK[F, G]): Absorbs[G, F] =
  new Absorbs[G, F] {
    def absorb[A](gfa: G[F[A]]): G[A] =
      G.flatMap(gfa)(fa => I.inj(fa))
  }

Something I definitely don't like about this: there is no distinction at all between FlatMap[F] and Absorbs[F, F]. They're actually very literally the same thing. That does rather imply that Absorbs generalizes FlatMap, but FlatMap still exists as a nominal thing.

Thinking a little more deeply about this… It is actually quite intuitive that Absorbs generalizes FlatMap: they're both monoidal composition in an endofunctor category. (as a corollary, this probably means that your proposed Functor restriction is not only necessary for laws, but also necessary for… pretty much anything) I say "an" because FlatMap applies to a very specific category of endofunctor structures, while Absorbs applies to a slightly more general set.

It reminds me a bit of how it is possible to define Functor for Set, but you need to generalize Functor somewhat:

trait Functor[F[_], C[_]] {
  def map[A, B](fa: F[A])(f: A => B)(implicit ca: C[F[A]], cb: C[F[B]]): F[B]
}

There was some Haskell article somewhere that I can't find right now which connected this notion back to category theory and noted that this category is slightly larger than what we conventionally think of as endofunctors in Hask. Absorbs strikes me as a similar thing.

@johnynek
Contributor Author

Actually, related to your generalization of Functor above, I wonder if what I'm really talking about is something like a generalization of functor:

trait Func0[F[_], C[_, _]] {
  def category: Category[C]
  def map[A, B](f: F[A])(c: C[A, B]): F[B]
}

and so the law you want is f.map(c1).map(c2) == f.map(c1.andThen(c2)) and f.map(category.id) == f.
Then FunctorFilter is on the category A => Option[B], whereas for the above Absorb the category is A => G[B] for some Monad[G].

Our conventional Functor is on the Category A => B.
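
To make that concrete, here is a sketch of Func0 for List over the Kleisli category of Option (cats provides a Category instance for Kleisli of any Monad); the instance is mine, and its map is exactly FunctorFilter's mapFilter for List:

import cats.arrow.Category
import cats.data.Kleisli
import cats.implicits._

type OptK[A, B] = Kleisli[Option, A, B]

val listFunc0: Func0[List, OptK] =
  new Func0[List, OptK] {
    def category: Category[OptK] = Category[OptK]
    def map[A, B](fa: List[A])(c: OptK[A, B]): List[B] =
      fa.flatMap(a => c.run(a).toList) // mapFilter for List
  }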

Maybe this is making it general to the point of meaninglessness.

@johnynek
Contributor Author

Here's another Haskell typeclass exploring the space of FunctorFilter:

https://hackage.haskell.org/package/compactable

@ctongfei
Contributor

@johnynek I think I actually have a use case for your generalized Functor:

trait GenFunctor[F[_], C[_, _]] {
  def category: Category[C]
  def map[A, B](f: F[A])(c: C[A, B]): F[B]
}

This can be used in autodiff programs (neural networks). C == ~> is the category of differentiable functions, and F[A] is the type of derivative-traced expressions. Then you have

implicit object GenFunctorOfTraced extends GenFunctor[Traced, ~>] {
  def category = ???
  def map[A, B](a: Traced[A])(f: A ~> B): Traced[B] = ???
}

This says that you can only map a traced expression by a differentiable function.

@johnynek
Contributor Author

@ctongfei that's a really interesting example.

@johnynek
Contributor Author

Another example that seems close is in distributed computing. You can map if you have A => B and Serialization[A] => Serialization[B]. That pair, case class SerPair[A, B](fn: A => B, ser: Serialization[A] => Serialization[B]), forms a category too, since it is just a pair of two functions (see the sketch below).
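
A sketch of that category instance, assuming some Serialization[A] typeclass (the empty trait here is just a placeholder):

import cats.arrow.Category

trait Serialization[A] // placeholder for a real serialization typeclass

case class SerPair[A, B](fn: A => B, ser: Serialization[A] => Serialization[B])

implicit val serPairCategory: Category[SerPair] =
  new Category[SerPair] {
    def id[A]: SerPair[A, A] = SerPair(identity, identity)
    def compose[A, B, C](bc: SerPair[B, C], ab: SerPair[A, B]): SerPair[A, C] =
      SerPair(bc.fn.compose(ab.fn), bc.ser.compose(ab.ser))
  }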

That said, I'm not too sure what to do with a functor... if we could build this up to Applicative at least then we would be cooking with something interesting.

@johnynek
Contributor Author

johnynek commented Jul 16, 2020

Okay, here is a sketch that carries the idea of generalizing the composition category in Functor and Applicative through to Traverse.

package cats

import cats.arrow.Category

trait GenFunctor[F[_], C[_, _]] {
  def category: Category[C]
  def map[A, B](f: F[A])(c: C[A, B]): F[B]
}

trait GenApplicative[F[_], C[_, _]] extends GenFunctor[F, C] {
  def unit: F[Unit]
  def map2[A, B, Z](fa: F[A], fb: F[B])(fn: C[(A, B), Z]): F[Z]

  def pure[A](fa: C[Unit, A]): F[A] = map(unit)(fa)
}

trait GenTraverse[F[_], C[_, _]] {
  def traverse[A, B, G[_]](fn: A => G[B])(implicit gf: GenApplicative[G, C]): F[A] => G[F[B]]
}

trait CatK[F[_], G[_], C[_, _]] {
  def apply[A]: C[F[A], G[A]]
}

object GenTraverse {
  type Empty[A] = Unit
  type Cons[A] = (A, List[A])

  def listTrav[C[_, _]](emptyk: CatK[Empty, List, C], consk: CatK[Cons, List, C]): GenTraverse[List, C] =
    new GenTraverse[List, C] {
      def traverse[A, B, F[_]](fn: A => F[B])(implicit gf: GenApplicative[F, C]): List[A] => F[List[B]] = {
        val empty = emptyk[B]
        val cons = consk[B]

        lazy val rec: List[A] => F[List[B]] =
          {
            case Nil =>
              gf.pure(empty)
            case h :: tail =>
              val ftail = rec(tail)
              val fhead = fn(h)
              gf.map2(fhead, ftail)(cons)
          }

        rec
      }
    }
}

Here, the category for traverse is still A => B; it is just the category for composition of Applicatives that is C[_, _]. You could imagine generalizing the category on traverse too, but I don't see the use of that at the moment.
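
As a small usage sketch of the code above, instantiating C with plain Function1 recovers the ordinary traverse for List; the Option instance and the CatK values below are mine, purely for illustration:

type Fn[A, B] = A => B

implicit val optionGenApplicative: GenApplicative[Option, Fn] =
  new GenApplicative[Option, Fn] {
    val category: Category[Fn] = new Category[Fn] {
      def id[A]: A => A = identity
      def compose[A, B, C](g: B => C, f: A => B): A => C = g.compose(f)
    }
    def unit: Option[Unit] = Some(())
    def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)
    def map2[A, B, Z](fa: Option[A], fb: Option[B])(fn: ((A, B)) => Z): Option[Z] =
      fa.flatMap(a => fb.map(b => fn((a, b))))
  }

val emptyK = new CatK[GenTraverse.Empty, List, Fn] {
  def apply[A]: Unit => List[A] = _ => Nil
}
val consK = new CatK[GenTraverse.Cons, List, Fn] {
  def apply[A]: ((A, List[A])) => List[A] = { case (h, t) => h :: t }
}

val travList = GenTraverse.listTrav[Fn](emptyK, consK)
// Some(List(1, 2, 3)); a None from the function would make the whole result None
val res: Option[List[Int]] = travList.traverse[Int, Int, Option](i => Option(i))(optionGenApplicative)(List(1, 2, 3))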

I do see the value of changing the category for, e.g., Functor:

case class OrdCat[A, B](fn: A => B, ord: Ordering[A] => Ordering[B])
object OrdCat {
  implicit val ordCatCategory: Category[OrdCat] =
    new Category[OrdCat] {
      def id[A] = OrdCat[A, A](identity, identity)
      def compose[A, B, C](bc: OrdCat[B, C], ab: OrdCat[A, B]): OrdCat[A, C] =
        OrdCat(bc.fn.compose(ab.fn), bc.ord.compose(ab.ord))
    }
}

import scala.collection.immutable.SortedSet

object SetGenFunc extends GenFunctor[SortedSet, OrdCat] {
  val category = OrdCat.ordCatCategory
  val unit: SortedSet[Unit] = SortedSet(())
  def map[A, B](ss: SortedSet[A])(ordCat: OrdCat[A, B]): SortedSet[B] = {
    implicit val ordB = ordCat.ord(ss.ordering)
    val b = SortedSet.newBuilder[B]
    b ++= ss.iterator.map(ordCat.fn)
    b.result()
  }
}
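
A quick usage sketch of SetGenFunc, mapping Int to String and swapping in the String ordering:

val ints = SortedSet(3, 1, 2)
val strs: SortedSet[String] =
  SetGenFunc.map(ints)(OrdCat[Int, String](_.toString, _ => Ordering.String))
// strs == SortedSet("1", "2", "3")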

Similarly, a distributed data type is usually a pair of some handle representing the computation and a way to serialize the result, so it composes with a case class DistFn[A, B](fn: A => B, ser: Serialization[A] => Serialization[B]). For this, you could definitely make a DList[A] type that was applicative (has maps and zips), or, more interestingly, DKeyed[K, V] being GenApplicative[DKeyed[K, *], ValueFn[K, *, *]] with case class ValueFn[K, A, B](valueFn: (K, A) => B, ser: Serialization[A] => Serialization[B]).

@ctongfei maybe you would have time to work out the GenApplicative example for differentiable functions.

@LukaJCB
Member

LukaJCB commented Jul 16, 2020

@sellout gave a talk on this a couple of years back: https://www.youtube.com/watch?v=QE3zqV4kVEo
I believe I saw someone try this even before that, though I'm not sure where exactly.

@ctongfei
Contributor

ctongfei commented Jul 16, 2020

Actually, in the scenario of differentiable functions, I'm thinking that we can get up to Applicative and Monad, and they have interesting correspondences to current deep learning practice.

trait Differentiable[X, Y] extends Function1[X, Y] {
  def apply(x: X): Y
  def forward(x: X): (Y, Y => X)  // the `Y => X` is the backward closure
}

trait GenApplicative[F[_], ~>[_, _]] {
  def pure[A]: F[A]
  def productWith[A, B, C](fa: F[A], fb: F[B])(f: (A, B) ~> C): F[C]
}

trait GenMonad[F[_], ~>[_, _]] extends GenApplicative[F, ~>] {
  def flatMap[A, B](fa: F[A])(f: A ~> F[B]): F[B]
}

GenApplicative corresponds to a TensorFlow-style static computation graph engine, where you can always construct traced nodes in the computation graph, F[A] (tf.Tensor), but you cannot get the value A out of F[A].

On the other hand, GenMonad supports peeking into F[A] to get the actual A when sequencing differentiable operations: this is PyTorch-style dynamic computation graph construction, where the building of neural networks can depend on intermediate values A.

@johnynek
Contributor Author

johnynek commented Nov 2, 2020

I noticed this, and it seems that SelectiveZero is enough to absorb option:

https://github.com/snowleopard/ideas/blob/7692eba70758a48cf2dc9224627df906fe078a9b/src/SelectiveZero.hs#L41
