New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAJOR: Change internal encoding for Task.Async #647

Merged
merged 19 commits into from May 11, 2018

Conversation

Projects
None yet
4 participants
@alexandru
Member

alexandru commented Apr 13, 2018

Fixes #612 — The initial implementation for TaskLocal, which I contributed after collaborated with @leandrob13 on Local and on TracingSchedule, was a little superficial as it did not take into account executeOn among others.

So upon defining Task.Async like ...

Task.Async { (ctx, cb) =>
  ctx.scheduler.executeAsync(() => cb.onSuccess(1))
}

What the (current) implementation does is to read the current locals context via Local.getContext and after this task gets evaluated restore the context via Local.setContext, which happens upon calling cb.onSuccess. The virtue of this implementation is that it doesn't matter if the ctx.scheduler is a TracingScheduler or not.

Well, that's not good enough for operations like executeOn, operations that in fact wrap another Task:

Async { (ctx: Context, cb: Callback[A]) => 
  val ctx2 = ctx.withScheduler(s)
  if (forceAsync)
   Task.unsafeStartAsync(source, ctx2, cb)
  else
    Task.unsafeStartNow(source, ctx2, cb)
}    

Restoring the previous context in such a case isn't good, because the task that we are executing here could apply changes to the local context that we want to be available in bind chains after the evaluation happens.

To fix this I introduced 2 changes:

  1. the Scheduler passed in Async tasks via Context will now be a TracingScheduler in case localContextPropagation is on
  2. I changed the encoding of Async to allow disabling this restoration of the context for operations that wrap other tasks, like executeOn is doing

The new definition is:

final case class Async[+A](
    register: (Context, Callback[A]) => Unit,
    trampolineBefore: Boolean = false,
    trampolineAfter: Boolean = true,
    restoreLocals: Boolean = true)
    extends Task[A]

If restoreLocals is false, then we are not restoring the Local.Context after evaluation — this will now make executeOn behave like expected.


While working on this I also noticed that RestartCallback admits an optimization for triggering asynchronous boundaries. So asynchronous boundaries in Task.Async are mandatory, except that:

  • sometimes we need them only on the callback, so after evaluation
  • sometimes we need them before evaluating Async's function
  • sometimes we need async boundaries both before and after evaluation

Therefore I introduced two booleans in Async such that:

  1. if trampolineBefore is true, then it triggers an optimized async boundary before evaluation
  2. if trampolineAfter is true, it triggers an async boundary after evaluation, but before calling the callback (on by default)

Before this we had to do ...

def signal(n: Int): Task[Int] = 
  Task.Async { (ctx, cb) =>
    // Obligatory
    cb.asyncOnSuccess(n)(ctx.scheduler)
  }

After this change we can do:

def signal(n: Int): Task[Int] = 
  Task.Async(
    (_, cb) => cb.onSuccess(n),
    trampolineAfter = true)

With this change the memory usage becomes more efficient because in managing these boundaries the RestartCallback implementation can avoid creating extraneous Runnable instances.


As part of this task it became clear that Task.unsafeCreate is really unsafe. This is why we're doing a deprecations:

  1. Task.unsafeCreate is now deprecated, to be removed at a later time
  2. Task.async is also deprecated, because it doesn't reflect cats.effect.Async

In place of unsafeCreate I've expanded the available builders:

  1. Task.async is now the equivalent of cats.effect.Async#async; which is cool to have in place of using create with a Cancelable.empty return because it's more optimal (we don't have to do the internal cancelable bookkeeping)
  2. Task.asyncS is the async builder that also injects a Scheduler (S suffix comes from Scheduler; naming is hard 😑)
  3. Task.cancelable is the equivalent of cats.effect.Cancelable#cancelable
  4. Task.cancelableS is the cancelable builder that also injects a Scheduler (just like async vs asyncS)
  5. Task.create has been changed to use the Partially-Applied Type technique and thus to change behavior depending on the return type of the provided function ... it supports Unit (equivalent with Task.simple / async), it also supports Cancelable, IO[Unit], Coeval[Unit] and Task[Unit] as return types

After changing the Async implementation I had to review and fix the implementation of the following:

  • executeOn
  • executeWithModel
  • executeWithOptions
  • deferAction
  • uncancelable
  • raiseError
  • fromFuture / deferFutureAction
  • fromAsync / fromConcurrent
  • simple / cancelable
  • doOnCancel
  • startAndForget
  • gather
  • gatherUnordered
  • mapBoth
  • memoize
  • memoizeOnSuccess
  • race
  • raceList
  • racePair
  • sleep
  • start

N.B. TaskLocal due to its behavior might still yield surprises. For example Task.gather executes tasks in parallel, so if any of those tasks does a context write, then that write cannot get propagated after gather is finished. Same goes for start, for the same reasons.

Basically when concurrent execution is involved (e.g. mapBoth, race, etc.) then we cannot keep the changes to the local context in the bind continuation.

@Avasil

This comment has been minimized.

Collaborator

Avasil commented Apr 13, 2018

Should we make unsafeCreate private? I kind of think so — we can also add extra create overloads for optimization purposes and that should be fine for most use-cases; and if anybody needs unsafeCreate, he/she should try to push the use-case in Monix proper

Let's make it private - I'm not a fan of public builder that we advise against using at every step. Still, we need to have in mind people who might had to use it for certain use cases and this change might prevent their upgrade from 2.3.x to 3.x.x. If we can't cover all of the use cases now maybe we can deprecate it in 3.0.0 and remove in 3.1.0? So it's not that sudden.

TracingScheduler only takes care of real async boundaries, not trampolined ones — should it also save and restore the context for trampolined execution as well (e.g. TrampolinedRunnable)? At this point I do not think so, but I might be wrong

What are repercussions of not restoring it?

I'm thinking of making an issue / document with all use-cases for TaskLocal we can think of, introducing necessary tests and making sure that this isn't missing anything.

Yes please, it will not only help with assuring that TaskLocal implementation is correct but we could also easily integrate it into documentation / blog post. I think it's not instantly obvious why TaskLocal is awesome so stuff like this will be very useful.

@leandrob13

This comment has been minimized.

Contributor

leandrob13 commented Apr 14, 2018

1.- I agree unsafeCreate should be private

2.- We can keep beta testing to see if there is a use case where the propagation is needed in trampolined execution.

alexandru added some commits May 1, 2018

@codecov

This comment has been minimized.

codecov bot commented May 4, 2018

Codecov Report

Merging #647 into master will increase coverage by 0.05%.
The diff coverage is 91.89%.

@@            Coverage Diff             @@
##           master     #647      +/-   ##
==========================================
+ Coverage   90.89%   90.94%   +0.05%     
==========================================
  Files         377      379       +2     
  Lines       10089    10143      +54     
  Branches     1900     1894       -6     
==========================================
+ Hits         9170     9225      +55     
+ Misses        919      918       -1

alexandru added some commits May 6, 2018

@alexandru alexandru changed the title from WIP: Change TaskLocal behavior to Fix TaskLocal behavior by changing internal encoding for Task.Async May 10, 2018

@alexandru alexandru changed the title from Fix TaskLocal behavior by changing internal encoding for Task.Async to MAJOR: Change internal encoding for Task.Async May 10, 2018

alexandru added some commits May 10, 2018

@oleg-py

Looks good at a glance, just few picks from me

*/
def async[A](register: (Scheduler, Callback[A]) => Cancelable): Task[A] =
create(register)
def simple[A](start: (Scheduler, Callback[A]) => Unit): Task[A] =

This comment has been minimized.

@oleg-py

oleg-py May 10, 2018

Collaborator

Not sure I like simple as a name (I know, naming is hard, but it does look less simple than eval and apply).

Maybe that could be called async, but without Scheduler parameter, so erased type is Function1, and it can have a deprecated overload with Function2. This would be consistent with cats-effect IO.async too, so might help people migrating from IO.

This comment has been minimized.

@alexandru

alexandru May 10, 2018

Member

So one strength of Task is the injected Scheduler and so we need a builder with (start: (Scheduler, Callback[A]) => Unit): Task[A] as a signature. We cannot name it async, because as you've noticed, we already have Task.async taking a Function2 which is now deprecated.

Originally I thought about just changing its signature. Unfortunately due to Scala's Unit and its freaking implicit conversion from Any to Unit, people that have used Task.async will only get a warning at best and this builder has been there from the start. It's an old builder. So given code like ...

Task.async { (s, cb) => Cancelable(???) }

Such code will trigger a warning at best, something about a pure expression in statement position or something, but Scala will happily convert that to unit. I also looked for ways to workaround it, but it's not possible. So people will get tasks that can't be canceled.

When I named it simple, I actually thought about naming it atomic, as in a "single, irreducible unit", as opposed to cancelable tasks.

This comment has been minimized.

@alexandru

alexandru May 10, 2018

Member

Unless we come up with our own Void or something.

This comment has been minimized.

@alexandru

alexandru May 10, 2018

Member

Another possibility would be to have Task.async with a Function1 parameter, as you are suggesting, then async0 or async2 (or some other name that's not async) for the Function2. But then we'll need to do it for cancelable as well, for consistency.

So what kind of overload convention would you prefer?

This comment has been minimized.

@oleg-py

oleg-py May 10, 2018

Collaborator

@alexandru is new Task.create not a such builder? I thought that is a strictly more generic version, just return a Unit if you want it uncancelable, and existing code can migrate to it directly.

This comment has been minimized.

@oleg-py

oleg-py May 10, 2018

Collaborator

Crazy idea: def async(f: CallbackWithScheduler[A] => Unit)

where f.scheduler would be implicitly available, so you don't even have to do

implicit val sc = scheduler

Less crazy idea, but probably less efficient: Scheduler => Callback[A] => Unit, lets you do implicit sc => cb => { ... }

*/
type FrameIndex = Int
val readOptions: Task[Options] =

This comment has been minimized.

@oleg-py

oleg-py May 10, 2018

Collaborator

👍

How about readScheduler too, then? (I know I can do Task.deferAction(Task.pure), but it might not be immediately obvious to others).
Maybe call it currentScheduler / currentOptions too.

register: (Context, Callback[A]) => Unit,
trampolineBefore: Boolean = false,
trampolineAfter: Boolean = true,
restoreLocals: Boolean = true)

This comment has been minimized.

@oleg-py

oleg-py May 10, 2018

Collaborator

Maybe these bools should be made into a byte flag field? I have a feeling we might have to add another flag to solve bracket with autoCancelableRunLoops.

This comment has been minimized.

@alexandru

alexandru May 10, 2018

Member

Originally I did it like that, but you end up doing bitwise logic and I have a hunch that performance suffers — I remember something about branch miss-prediction. I don't have numbers though. But it's definitely more error prone.

This comment has been minimized.

@alexandru

alexandru May 10, 2018

Member

Well, it's an internal thing, it can be changed later if we'll need it.

@@ -3130,10 +3393,10 @@ object Task extends TaskInstancesLevel1 {
* what you're doing. Prefer [[Task.runAsync(cb* Task.runAsync]]
* and [[Task.executeAsync .executeAsync]].

This comment has been minimized.

@oleg-py

oleg-py May 10, 2018

Collaborator

Do we really need this warning on internal API? :)

@alexandru

This comment has been minimized.

Member

alexandru commented May 11, 2018

@oleg-py I renamed simple to async and introduced asyncS and cancelableS (S suffix comes from Scheduler).

I don't have better names and they need to be exposed. In absence of better options, we're going to go with these.

@oleg-py

This comment has been minimized.

Collaborator

oleg-py commented May 11, 2018

Works for me 👍

@alexandru alexandru merged commit 4c1a772 into monix:master May 11, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@leandrob13 leandrob13 referenced this pull request Dec 11, 2018

Open

[WIP] Create a streaming module for Monix over JDBC #1261

2 of 5 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment