Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic effects, take 2, and more? #45

Closed
wants to merge 9 commits into from

Conversation

armanbilge
Copy link
Member

@armanbilge armanbilge commented Nov 9, 2021

This takes another stab at #33, specifically based on #33 (comment). The awkwardness of embedding tracing into the hierarchy in #44 (no offense to @bpholt who did a fine job given the constraints of the current abstraction :) made me consider this again. This time no Dispatcher hacks, and we keep IOLambda as the only runtime.

A lambda is reimagined as a:

type Lambda[F[_], Event, Result] = (Event, Context) => F[Option[Result]]

and IOLambda now defines an abstract:

def run: Resource[IO, Lambda[IO]] // maybe bikeshed that, should be `handler` or `install`?

This Resource encapsulates the previously separate Setup: effectively all resource acquisition goes into building the Lambda once, and that is the setup. That Lambda is then installed as the handler.

From here, we have two axes on which we can do composition:

  1. Middlewares, e.g.:
  • Lambda[F, Event, Result] => Lambda[F, Event, Result]
  • HttpRoutes[F] => Lambda[F, ApiGatewayProxyEventV2, ApiGatewayProxyStructuredResultV2]
  • Client[F] => CFCR[F, In, Out] => Lambda[F, CFCRRequest[In], Unit]
  • (Span[F] => Lambda[F, Event, Result]) => Lambda[F, Event, Result]
  1. Resource-injection, which is simply flatMapping resources.

This is still a bit rough around the edges; I think we'd benefit from additional helpers to compose Resource[F, Lambda[F]] with each other. But I think this time I got it right 😅

Copy link
Member

@bpholt bpholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is promising!

I'm going to make an attempt to add the tracing middleware to see if it raises any interesting issues. (I might not get to it until Tuesday though, as I'll be on vacation tomorrow through then.)


final override protected def setup: Resource[IO, Setup] = run

def run: Resource[IO, Lambda[IO, Event, Result]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spirit of bikeshedding, I like install for this better than run or handle. I think the latter two imply this would run for every invocation of the Lambda, but if I understand the lifecycle correctly, it will really only run during initialization, and subsequent requests will basically go straight to the Lambda[IO, Event, Result].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 that was exactly my thinking as well, if we think of it as a verb/action ("install the lambda"). If we think of it as a thing/noun then it is the "handler".

@armanbilge
Copy link
Member Author

Thanks! I fiddled a bit with tracing as well, but I got a bit confused if we want a generic TracedLambda middleware that is pluggable or a specific XRayedLambda middleware and whether we wanted to inject a Span[F] or a Trace[F].

Have a wonderful vacation! 😁

@bpholt
Copy link
Member

bpholt commented Nov 12, 2021

Thanks! I fiddled a bit with tracing as well, but I got a bit confused if we want a generic TracedLambda middleware that is pluggable or a specific XRayedLambda middleware and whether we wanted to inject a Span[F] or a Trace[F].

So, I think e.g. the ApiGatewayProxyHttp4sLambda should continue a span passed in via HTTP request headers, or fall back to the behavior of a non-HTTP lambda, where it pulls the XRay trace ID from the environment. (If both are missing, or you're not using XRay, it should start its own root span.) That implies you need access to the event to start tracing.

But: it is my experience that the resources used by the handler typically ask for a Trace[F] constraint. I think that means we need to have a Trace[F] available when installing the handler, even though we don't really care about tracing at that stage? I think this also means that we should lean on Kleisli[F, Span[F], *] in order to keep the generic effect type, since ioTrace locks you into IO.

Idk. If you have more thoughts on how a (Span[F] => Lambda[F, Event, Result]) => Lambda[F, Event, Result] middleware would work (specifically, where it fits in and what actually builds the Span[F]), that would be helpful.

@armanbilge
Copy link
Member Author

armanbilge commented Nov 12, 2021

Thanks, those were helpful thoughts. I spent some more time studying the interfaces and I'm still trying to wrap my head around it, but I think the key observations are:

That implies you need access to the event to start tracing.

I think that means we need to have a Trace[F] available when installing the handler, even though we don't really care about tracing at that stage?

It's a catch-22 :) If we have a Trace[F] that implies that for any F, we currently have a Span[F]. But as you point out, to get that Span[F] we need the event first, which we are receiving/processing in F 😅

So actually, I think we are working with two effect types. F is our non-traced effect, which we can use out-of-the-box to acquire resources, install the lambda, and receive the event from which we can derive a Span. Then, once we have that span, we can "upgrade" into some traced effect G. To make this a bit more concrete, I'm imagining something like this:

// This lets us run fancy traced effect `G` from the untraced world of `F`
// To complete the "upgrade" we have to inject a `Span[F]`
def runTraced[F[_], G[_], A](span: Span[F])(ga: G[A]): F[A]

// Here is a concrete implementation
def runTraced[F[_], A](span: Span[F])(ga: Kleisli[F, Span[F], A]): F[A] = ga(span)

// So our middleware looks something like this
object TracedLambda {
  def apply[F[_], G[_], Event, Result](entryPoint: EntryPoint[F])(lambda: Lambda[G, Event, Result]): Lambda[F, Event, Result] = {
    (event, context) => {
      for {
        kernel <- extractKernel(event, context) // This is event-specific magic!
        span <- entryPoint.continueOrElseRoot(context.functionName, kernel)
        result <- runTraced(span)(lambda(event, context))
      } yield result
    }
  }
}

Essentially, this lets you transform a Lambda written in traced effect G into a more vanilla effect F. In doing so, it removes the requirement to have Trace to install the handler, despite the fact that the handler uses Trace internally.

The important abstractions here are:

  • extractKernel which requires an event-specific implementation.
  • runTraced which encodes the relationship between F and G. I wonder if it can find a home in natchez?
  • entryPoint which is where one could plug-in XRay or whatever tracing backend they want

A downside of this approach is that we cannot implement runTraced for the ioTrace implementation, which won't give us a Trace[IO] until we have the Span[IO], unlike the Kleisli, where we can inject the Span after-the-fact. I'll think more about this.

Hope that makes some sense; looking forward to your thoughts :)

Edit/update: and here's how you'd do it for ioTrace. We have to introduce a newtype TracedIO:

sealed abstract class TracedIO[A] {
  def run(span: Span[IO]): IO[A]
}

object TracedIO {
  def apply[A](f: Trace[IO] => IO[A]): TracedIO[A] =
    new TracedIO[A] {
      def run(span: Span[IO]): IO[A] = Trace.ioTrace(span).flatMap(f)
    }
}

def runTraced[A](span: Span[IO])(ga: TracedIO[A]): IO[A] = ga.run(span)

I don't think we need to provide any Monad etc. instances for TracedIO. The idea is, you build everything with IO (or even F) assuming you have an implicit Trace[IO] in scope. Then, at the last minute you wrap it inside of a TracedIO which injects the required Trace[IO]. And then you immediately consume the TracedIO via its runTraced implementation.

@bpholt
Copy link
Member

bpholt commented Nov 16, 2021

@armanbilge I pushed up some tracing implementation work—it needs to be reorganized but I published it locally and it fits together pretty well. I'll add some additional thoughts as review comments.

remainingTime: F[FiniteDuration]
) {
def mapK[G[_]](fk: F ~> G): Context[G] =
this.copy(remainingTime = fk(this.remainingTime))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably isn't the best approach for bincompat but it's enough for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapK? I think that's fine. Anything case classes however is dubious 😆

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I meant the case class copy. We'll probably want to make Context itself not a case class, right?

Copy link
Member Author

@armanbilge armanbilge Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah absolutely, Context we should (eventually) make an ordinary class. But the bigger issue for bincompat is our use of case classes for the events/responses for circe derivation ...

Lambda[Kleisli[F, Span[F], *], Event, Result]]): Resource[F, Lambda[F, Event, Result]] =
installer
.mapK(
Kleisli.applyK(new NoopSpan[F])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little weird, but I'm not sure if it can be eliminated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because we need to continue distinguishing between untraced F for installing and traced G for running.

Copy link
Member Author

@armanbilge armanbilge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing on this! Some thoughts below :)

import natchez.xray.{XRay, XRayEnvironment}

abstract class TracedLambda[F[_]: MonadCancelThrow, G[_], Event] {
def extractKernel(event: Event, context: Context[F]): F[Kernel]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can make this a typeclass in terms of Event. The http one can have something special that extracts the kernel from the headers, and the rest can use a low-priority fallback.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an explicit extractKernel: Kleisli[F, (Event, Context[F]), Kernel] parameter instead of making it a typeclass, because the instances will vary depending on what backend is being used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah I was afraid that it might depend on the backend, too bad. 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I looked into this again and I'm not sure. I think we should toss all the headers from the event into the Kernel and leave it to the entrypoint to sort it out. There's nothing backend-specific about that :)

abstract class TracedLambda[F[_]: MonadCancelThrow, G[_], Event] {
def extractKernel(event: Event, context: Context[F]): F[Kernel]

def runTraced[A](span: Span[F])(ga: G[A]): F[A]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with this one, we're playing with some ideas in typelevel/natchez#448.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I pushed up a version using an inlined copy of LiftTrace for now, while that's discussed over there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the LiftTrace concept with some new ideas, check the PR. I'm about to push a branch trying out that one instead.

import natchez.noop.NoopSpan
import natchez.xray.{XRay, XRayEnvironment}

abstract class TracedLambda[F[_]: MonadCancelThrow, G[_], Event] {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid the inheritance, I feel burned after over-using it in the old Lambda encoding 😆 I really like the middleware concept Lambda[G, Event, Result] => Lambda[F, Event, Result].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. It's now (EntryPoint[F], F ~> G, Kleisli[F, (Event, Context[F]), Kernel]) => Lambda[G, Event, Result] => Lambda[F, Event, Result] given an implicit LiftTrace[F, G].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid. This hints to me that maybe LiftTrace[F, G] should provide the F ~> G.

Copy link
Member Author

@armanbilge armanbilge Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although now I am dubious how it works with "TracedIO", maybe that idea was not so good 🤔

Lambda[Kleisli[F, Span[F], *], Event, Result]]): Resource[F, Lambda[F, Event, Result]] =
installer
.mapK(
Kleisli.applyK(new NoopSpan[F])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because we need to continue distinguishing between untraced F for installing and traced G for running.

Comment on lines 69 to 71
installer: Resource[
Kleisli[F, Span[F], *],
Lambda[Kleisli[F, Span[F], *], Event, Result]]): Resource[F, Lambda[F, Event, Result]] =
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm not crazy about this 😆 I need to think about this more, but I don't think the lambda middlewares should need to know anything about resources. Any resources they need (such as an http client or natchez entrypoint, they should ask for in the form of arguments. The user can inject these as they build their resource stack with the stuff they want.

An alternative encoding, if we want to prescribe a particular resource, could be that the middleware itself is contained inside a resource, like:

Resource[F, Lambda[G, Event, Result] => Lambda[F, Event, Result]]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with only the Lambda[Kleisli[F, Span[F], *], Event, Result] => Resource[F, Lambda[F, Event, Result]] variant, but I found that everywhere I wanted to use it, I was having to write the body of this one. So this is really just an alternate convenience constructor that includes the installer.mapK(Kleisli.applyK(new NoopSpan[F])) that the user would need to include otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that you need trace the resource, I understand why this helps. But I'm still confused why that is. I haven't really played with it yet so I'm probably missing something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example:

My resource needs a Trace[F] constraint because it creates resources that themselves require Trace[F].

def handler[F[_] : Async : Trace]: Resource[F, lambda.Lambda[F, RabbitMQConfig, Unit]] =

With the convenience method, I can just pass that handler[Kleisli[IO, Span[IO], *]] to XRayTracedLambda:

override def run: Resource[IO, lambda.Lambda[IO, RabbitMQConfig, Unit]] =
  XRayTracedLambda(handler[Kleisli[IO, Span[IO], *]])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the example. Makes sense, but I don't like it 😆 I want to think about this more.

Btw, just an aside: for lambdas with Unit result type, I've been wondering if actually we should use Nothing as the type parameter. The only valid instance of Option[Nothing] is None which is exactly what we want :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, seems like an uphill battle. No harm sticking to unit :) I forgot about the encoder thing, I think I remember trying this before and immediately running into that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, maybe the type inference thing is why fs2 has INothing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, INothing infers fine. Nice!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't completely follow the KmsAlg.resource[F].map(_.withTracing) part

The .withTracing enhancement method uses cats-tagless to auto-instrument an algebra using the class and method name as the span name. This lets me take an algebra and wrap its methods in tracing with very little boilerplate, because almost everything is auto-derived at compile time.

When we separate F[_] and G[_] : Trace (given LiftTrace[F, G]), it ends up looking like

KmsAlg.resource[F].map(_.mapK(LT.liftK).withTracing)

which works but I don't love—it feels like separating the effects introduces more boilerplate for no gain when using Kleisli trace. It'd be nice to know for sure whether the separation is required for the IOLocal trace.

Copy link
Member Author

@armanbilge armanbilge Nov 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woah, that is very cool!!

which works but I don't love

Yeah, this F and G stuff is technically sound but a real bummer in practice. I had a chat with @ChristopherDavenport in Discord, and he strongly encourages using Span[F] directly instead of Trace[G] precisely to avoid these complexities.
https://discord.com/channels/632277896739946517/632286375311573032/910352732353863770

It'd be nice to know for sure whether the separation is required for the IOLocal trace.

Are you referring to my half-baked thought from above?

As I see it, the problem is if you build the client in F, and F is traced, that means that the root span has to exist at the time when you create the client. But that means all requests from that client are part of the same span! This certainly doesn't make sense, if the client is re-used for multiple events that themselves start their own spans.

The answer for IO is ... it depends 😆

To be more specific: IO is special, because (unlike Kleisli) it can function as both the untraced and traced effect :). So, if you know your concrete type is IO, then you can build all this stuff without the terrible mapKs and translates because everything is IO and life is good.

However, if you try to build everything in generic F you get into trouble real fast. E.g.

def mkTracedClient[F: Async: Trace]: Resource[F, Client[F]] =
  clientBuilderThing.build[F].map(natchezTracingThing[F])

// Now let's use it with IO :)
// We need a Trace[IO]
Trace.ioTrace(myRootSpan) // Boom! You just installed your root span, game over.
  .flatMap { implicit trace =>
    mkTracedClient[IO].use {
      // etc.
    }
  }

Basically, the problem is you only have one chance to insert a root span, and that's at the time you create the Trace[IO]. So if you do it too early, then you're stuck with whatever root you had at that point.

Apologies if I'm being dumb, but let me know if that makes sense.

remainingTime: F[FiniteDuration]
) {
def mapK[G[_]](fk: F ~> G): Context[G] =
this.copy(remainingTime = fk(this.remainingTime))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapK? I think that's fine. Anything case classes however is dubious 😆

@armanbilge armanbilge mentioned this pull request Nov 17, 2021
Resource
.eval(extractKernel((event, context)))
.flatMap { kernel => entryPoint.continueOrElseRoot(context.functionName, kernel) }
.use(LiftTrace[F, G].run(_)((_: Trace[G]) => lambda(event, context.mapK(LiftTrace[F, G].liftK))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@armanbilge were you expecting the Trace[G] here to just be discarded? I think it works but it's kind of confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm not crazy about it but I'm still trying to figure out how to deal with IOLocal-based tracing. See how I used it in https://github.com/typelevel/feral/blob/7860c4d788161bece6724607ba26e39a033fbfea/lambda-natchez/src/main/scala/feral/lambda/natchez/TracedLambda.scala.

@armanbilge
Copy link
Member Author

armanbilge commented Nov 17, 2021

@bpholt I just added a lowerK method to my LiftTrace concept in typelevel/natchez#448. I'm fairly confident that with this, combined with translate on Client, you should be able to created traced stuff inside of untraced resources e.g.

val lift: LiftTrace[F, G] = ???
EmberClientBuilder
  .default[F]
  .build
  .map(_.translate[G](lift.liftK)(lift.lowerK)) // Resource[F, Client[G]]

without needing a signature like this:

def apply[F[_]: Async, Event, Result](
installer: Resource[
Kleisli[F, Span[F], *],
Lambda[Kleisli[F, Span[F], *], Event, Result]]): Resource[F, Lambda[F, Event, Result]] =

Let me know what you think!

@armanbilge
Copy link
Member Author

Closing in favor of #60 and various other PRs. The only thing here that isn't anywhere else is the X-Ray integration, which depends on Natchez release anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants