Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fiber locals #1393

Merged
merged 24 commits into from Mar 26, 2021
Merged

Fiber locals #1393

merged 24 commits into from Mar 26, 2021

Conversation

RaasAhsan
Copy link

@RaasAhsan RaasAhsan commented Nov 8, 2020

Fiber locals for CE3, the implementation is absurdly simple.

TODO:

  • Documentation (probably will do this in a follow up)
  • Maybe add an IO.clearLocals method to reset all local values, this makes it easier to achieve "isolated locals" semantics
  • Think about a typeclass for locals. Monix expressed interest in this

Closes #1266

@RaasAhsan RaasAhsan added the CE 3 label Nov 8, 2020
@@ -101,6 +102,8 @@ private final class IOFiber[A](
/* true when semantically blocking (ensures that we only unblock *once*) */
private[this] val suspended: AtomicBoolean = new AtomicBoolean(true)

private[this] var localState: scala.collection.immutable.Map[Int, Any] = initLocalState
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two options here for holding local state:

  1. Immutable map/array: Spawning fibers just passes along the current local state reference (constant-time). Manipulating local state means inserting into the map and reassigning the variable (constant-time, worst case linear time IIRC?). This is the current approach
  2. Mutable hashmap/array: Spawning fibers has to perform a shallow copy of the local state (linear time). Manipulating local state is constant-time.

@RaasAhsan RaasAhsan linked an issue Nov 8, 2020 that may be closed by this pull request
@RaasAhsan RaasAhsan mentioned this pull request Nov 16, 2020
5 tasks
@RaasAhsan
Copy link
Author

@djspiewak I'm going to think about the typeclass in a separate PR but I think this is ready

@TimWSpence
Copy link
Member

Just curious what the known use-cases are for this? Is it just when you need mutable state but don't need to share it and hence don't need to pay the price of all the Ref machinery?

@joroKr21
Copy link
Member

Tracing and telemetry are classic use cases.

@RaasAhsan
Copy link
Author

RaasAhsan commented Dec 14, 2020

👍 to those use cases

Conceptually, Local offers a slightly different state scope than Ref. Whereas the state sharing scope for Ref extends to all fibers which hold a reference to the Ref, the sharing scope for Local is precisely the current fiber (plus state is copied on start). These two notions are composable: if you stick a Ref in a Local you can get a scope which encompasses a fiber and its progeny.

Personally I'm not that enthusiastic about introducing this for a few reasons. It's hard to define a typeclass for Local, since the semantics of locals in Monix and ZIO are slightly different, though I've confirmed that the semantics for this PR can be expressed with those implementations

There are also alternative, less invasive ways to achieve this kind of scoping. MTL/Kleisli is one solution though the complaint here is that the type signature permeates throughout a codebase, even though the use cases are typically cross cutting concerns. Another interesting way I've thought about doing this is to basically break apart your application dependency graph and scope components as necessary. I've been wanting to write a blog post about this.

@djspiewak
Copy link
Member

@RaasAhsan Okay I'm pretty convinced we want this :-D

@RaasAhsan
Copy link
Author

Will merge mainline back in today/tomorrow

@joroKr21
Copy link
Member

For modifying state of a Local, cats.mtl.Local / cats.mtl.Stateful seem good to me, and should actually be lawful implementations this time (unlike Stateful for an arbitrary Ref, which would break laws in the presence of concurrency).

Yes that's right, MTL offers the correct abstractions for this. I've used TaskLocal from Monix combined with Ask and Local for tracing in Tagless Final and it worked out great. It also lets you interchange with ReaderT if you want to.

@RaasAhsan
Copy link
Author

RaasAhsan commented Mar 20, 2021

That being said, without such behaviour, and in the current implementation, how does LocalRef differ from a mere Ref, semantically speaking ?

So here's how it's intended to work:

Given a Ref instance, all fibers that hold a reference to that instance will share state. Given a Local instance, each fiber that holds a reference will observe isolated state (which is copied on fork, but not shared). Given a LocalRef instance, all fibers that hold a reference which share lineage will share state. When I say "share state" I mean that writes from one fiber will be observed by another who logically shares the same dynamic scope

I took a look at the implementation again, and I see it doesn't reflect the above description 😓 Going to add a failing test and then refine. I think this may call for an effectful default value

For a better naming situation (not conflicting with mtl Local) I'd call the thing FiberLocal or just FiberRef

what do you think about Local -> FiberLocal and LocalRef -> FiberLocalRef/FiberRef? I think there are several directions we can go here. Local -> FiberRef, LocalRef -> FiberRefShared or something :D

Ref.Make is an existing abstraction we could be using to create locals.

How would this work ? The purpose of Ref.Make is to encode the disjunction over Concurrent and Sync so that we don't need to write specialized constructors in the Ref companion object. In contrast, Local is specialized to IO

@djspiewak
Copy link
Member

@RaasAhsan FiberRef seems good to me. 👍 to everything else you wrote. Also I agree that Ref.Make isn't really appropriate here.

@kubukoz
Copy link
Member

kubukoz commented Mar 21, 2021

what do you think about Local -> FiberLocal and LocalRef -> FiberLocalRef/FiberRef? I think there are several directions we can go here. Local -> FiberRef, LocalRef -> FiberRefShared or something :D

I'm good with either :)

How would this work ? The purpose of Ref.Make is to encode the disjunction over Concurrent and Sync so that we don't need to write specialized constructors in the Ref companion object. In contrast, Local is specialized to IO

Yeah, well, I meant that Local.of has the shape of a Make - it hides the Sync/Concurrent constraint (and here it'd also hide the fact that F=IO), but after some more thought I realized it'd be useless unless passed explicitly (to override the "shared ref" behavior of Ref.Make[IO]). So let's scratch that idea.

Anyway, some abstraction for creating locals would be nice to have - sth like Local.Make[F[_]], which would be possible to derive for monad transformers as well.

@kubukoz
Copy link
Member

kubukoz commented Mar 23, 2021

As I understand, FiberRef differs from Ref in that a fiber can reset the ref, and from that point start with a clean slate that will not be synchronized to its child fibers (and vice versa). What would be a potential use-case for this?

@RaasAhsan
Copy link
Author

@kubukoz So imagine you've got a HTTP service and you want to collect traces for requests. The de facto way to do this today is via some API that involve Kleisli which is used to inject the trace context through your call stack. The main complaint with this it's basically a cross-cutting, nonfunctional concern that permeates through all your interfaces and types.

The problem with the old LocalRef is that it would allocate a single Ref and share that among all fibers. Really, what we want here is a way to share a given Ref among a particular fiber and its progeny. So, translating back to the HTTP service, you want a new trace context for each request. At the beginning of each request (possibly in a middleware), you'd call reset to divorce the current reference and create a new one.

If FiberRef seems a bit odd, we can pull it out into another PR for the time being since it can be expressed in terms of Local

@kubukoz
Copy link
Member

kubukoz commented Mar 23, 2021

At the beginning of each request (possibly in a middleware), you'd call reset to divorce the current reference and create a new one.

I might be wrong... but isn't that possible with just a plain FiberLocal?

// Assuming this is in FiberLocal
def locallyF(f: A => F[A])(fa: F[A]): F[A] = get.bracket(f(_) *> fa)(set)

//...

//could probably work if it's Resource[F, Span[F]] too
def makeNewSpan[F[_]]: Span[F] => F[Span[F]] = ???

// You get a local back, might as well be Trace[F] or something instead
def middleware[F[_]: FiberLocal.Make /* yes, I'm still trying to make Make happen */]:
  F[(FiberLocal[F, Traces], HttpRoutes[F] => HttpRoutes[F])] = FiberLocal.Make[F](emptySpan).map { local =>
    def wrap(routes: HttpRoutes[F]): HttpRoutes[F] =
      HttpRoutes { req => OptionT(local.locallyF(makeNewSpan)(routes.run(req).value)) }

    (local, wrap)
  }

Since every call to the route will run in a different fiber, they won't see each other's changes, so each will start with the emptySpan value and override it for itself separately.

I'm also wondering whether it would be a problem if you forked inside the processing of the HTTP call - then the FiberLocals will no longer share their state with each other, which might or might not be an issue.
If Span[F] is the same instance backed by something like a Ref, the underlying value would still be shared (same ref, same atomicref), so I guess we're good.

...And I guess that's pretty much what FiberRef is doing, wrapping the Ref so Span and others wouldn't need to bother with that on their own, and do reset instead. Is this correct?

@RaasAhsan
Copy link
Author

RaasAhsan commented Mar 23, 2021

I might be wrong... but isn't that possible with just a plain FiberLocal?

It is! FiberRef is merely a convenience wrapper for a FiberLocal plus a Ref. The Traces thing you've got there is probably backed by something that looks like a Ref so they behave identically. I don't really have an objective framework for understanding if people are actually going to use this, but it sounded useful and it was easy to add (which is why I'm happy to throw it away)

It may also be a reasonable idea to replace reset with something that has lexical scoping (so you can revert to the old instance if you want) like locally.

// divorces reference for the duration of this call, reverts after
def reset[A](fa: F[A]): F[A]

something like that probably makes sense for both FiberLocal and FiberRef

@kubukoz
Copy link
Member

kubukoz commented Mar 23, 2021

btw. my point isn't that FiberRef is wrong or useless (both that and FiberLocal are quite nice), I'm just trying to avoid a situation where users will have to pick one of the two, while one could achieve all the things the other can (which is probably the case, since FiberRef builds on FiberLocal).

Also, had an idea just now that FiberRef is more like a Ref that can be divorced on purpose - it behaves like a Ref until you actually use reset. Maybe it should be located closer to Ref, then? (something like Ref.Scoped or ScopedRef comes to mind)

Just saw your comment, maybe we really should make it a separate discussion - sounds like we could do more for FiberRef and it's probably best not to rush it for 3.0.0 (I definitely want FiberLocal there though). For that matter, I'm also curious what @tpolecat would say about the two - which one would look like a better fit for something like Natchez for Trace[IO]?

@kubukoz
Copy link
Member

kubukoz commented Mar 24, 2021

I'd still consider doing FiberRef in another PR but I'm 👍 if y'all are :)

@kubukoz
Copy link
Member

kubukoz commented Mar 26, 2021

@RaasAhsan let's merge this when you're ready, I think we can tackle all the others in follow-ups (even IO.clearLocals should be a BC compatible change I think)

@djspiewak djspiewak merged commit 6093b4a into typelevel:series/3.x Mar 26, 2021
@kubukoz
Copy link
Member

kubukoz commented Mar 26, 2021

aaaaaaaaaaaa I'm so excited 🎉 🎉 🎉 🎉

@kubukoz kubukoz mentioned this pull request Mar 26, 2021
@RaasAhsan RaasAhsan mentioned this pull request Mar 26, 2021
@kubukoz
Copy link
Member

kubukoz commented Mar 26, 2021

back to semantics...

quoting @djspiewak

As a note though, par is an interesting digression in and of itself:

for {
  lc <- Local(0)

  _ <- (lc.update(_ + 1), lc.update(_ + 2)).tupled
  _ <- (lc.update(_ + 1), lc.update(_ + 2)).parTupled

  r <- lc.get
} yield r

...
The answer is 3 (...)

As a note, LocalRef (now FiberRef - @kubukoz) fixes the above quite nicely when it comes to the parTupled semantics, and the results would always be 5.

hold on... why would it be 5? It's 3 after the first round of updates and then we get the state from the RHS of parTupled?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fiber locals
8 participants