New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task gets smarter about forking and async boundaries #670

Merged
merged 17 commits into from May 14, 2018

Conversation

Projects
None yet
1 participant
@alexandru
Member

alexandru commented May 13, 2018

Summary of changes:

  1. Task operations that gather results in parallel or that create race conditions are now forking automatically
  2. Task's executeOn shifts back on the default Scheduler automatically
  3. Task.apply no longer forks
  4. Optimized Task.create, ArrayQueue, TrampolineExecutionContext and TaskRestartCallback in order to improve Task's performance on light async boundaries

1. Task goes parallel

Previously Task was following a model of being explicit about forking in operations capable of gathering results in parallel, so an operation like this would in fact execute the tasks in sequence, because the specified tasks are not async:

// N.B. Task.eval does not fork
val task1 = Task.eval(1 + 1)
val task2 = Task.eval(2 + 2)

// Yields 6; does not process in parallel!
Task.parMap2(task1, task2)(_ + _)

So in order to parallelize synchronous tasks in such a case, one has to ensure async boundaries:

val task1 = Task.shift *> Task.eval(1 + 1)
val task2 = Task.shift *> Task.eval(2 + 2)

Task.parMap2(task1, task2)(_ + _)

This can take rookies by surprise and the chat channel we've got periodic questions about it. It can also be inefficient because people end up applying extraneous async boundaries. This is no longer needed. Operations such as Task.gather will now evaluate tasks in parallel, even if their execution model is synchronous.

So after this PR:

val task1 = Task.eval(1 + 1)
val task2 = Task.eval(2 + 2)

// Evaluated in parallel:
Task.parMap2(task1, task2)(_ + _)

This diverges from cats.effect.IO, who doesn't afford to do this 馃槈

Operations that have been changed to do automatic forking:

  1. gather
  2. gatherUnordered
  3. wander
  4. wanderUnordered
  5. mapBoth, parMap2 ... parMap6

Note that fork, race, racePair and raceMany where already doing automatic forking.

Also note that the implementation is optimal, for example in this case only a single async boundary will happen when evaluating the provided tasks, not more, because the implementation is able to detect when the tasks are forked, for the obvious cases:

val task1 = Task.eval(1 + 1).executeAsync
val task2 = Task.eval(2 + 2).executeAsync

// Detects that the provided tasks are already asynchronous, so no 
// extra forks will happen ;-)
Task.parMap2(task1, task2)(_ + _)

2. Task.executeOn shifts back to the default thread-pool automatically

The executeOn operation is a strength of Task (compared with cats.effect.IO and others), because it allows you to do this:

for {
  r1 <- cpuBoundTask1
  r2 <- ioBoundTask.executeOn(ioScheduler)
  // Unfortunately this shift has been required to get back to the default scheduler
  _ <- Task.shift
  r3 <- cpuBoundTask3
} yield r1 + r2 + r3

This is sort of a gotcha with the current API and even the advanced users get it wrong. But no more, that Task.shift is now no longer needed, because executeOn shifts back to the default thread-pool:

for {
  r1 <- cpuBoundTask1 // executes on default thread-pool
  r2 <- ioBoundTask.executeOn(ioScheduler) // executes on I/O thread-pool
  r3 <- cpuBoundTask3 // executes on default thread-pool
} yield r1 + r2 + r3

3. Task.apply no longer forks

This is a risky breaking change, however Task.apply forking has been a design mistake, copied from Scala's Future and Scalaz 7's Task.

  • Task.apply is now an alias of Task.eval and thus its evaluation is synchronous.
  • There's now a new Task.evalAsync builder with the old behavior

And fortunately all operations that gather tasks in parallel will now fork automatically, so hopefully no code will break because of this.

4. Task Optimizations for Trampolined Async Boundaries

I've managed to push improvements when generating trampolined async boundaries, which are mandatory. Also optimized Task.create and others.

Name 3.0.0-RC1 This PR
createCancelable 1564.664 2702.895
createNonCancelable 2018.738 6641.278
executeWithOptions 3789.643 6952.140
forkedShift 513.493 543.263
lightAsync 5499.940 6807.206
trampolinedShift1 4901.814 6502.987
trampolinedShift2 6115.811 6854.566

Some better numbers could have been obtained, but the new implementation is safer because it takes care of concerns that weren't taken care of in RC1.

Optimizations went into:

  1. RestartCallback, as we are now avoiding the creation of extraneous TrampolinedRunnable instances for before & after boundaries
  2. TrampolinedExecutionContext was optimized to do a single read of its ThreadLocal variable, with the trampoline itself being optimized to use ArrayStack instead of the Scala List we were using before
  3. ArrayStack was optimized to grow and shrink in fixed chucks and thus maintain its O(1) complexity, whereas before (in RC1) the complexity was actually amortized O(1) due to the need to copy the array when growing and shrinking

4.1. Optimized Task.create for Cancelable.empty

This piece of code:

Task.create { (_, cb) => 
  cb.onSuccess(1) 
  Cancelable.empty
}

Is now equivalent with:

Task.async { (_, cb) =>  cb.onSuccess(1) }

Using our builder magic, this is a compile time trick that yields tremendous performance improvement, because keeping track of cancelable tokens is expensive, so if we see Cancelable.empty, we now treat it as Unit 馃槈

You can see it in the benchmarks above, that's the reason for why we're getting such good numbers on the createNonCancelable test.

alexandru added some commits May 13, 2018

@alexandru alexandru changed the title from Task gets smarter about forking and trampolined async boundaries to Task gets smarter about forking and async boundaries May 13, 2018

alexandru added some commits May 13, 2018

@alexandru alexandru added this to the 3.0.0-RC2 milestone May 13, 2018

alexandru added some commits May 14, 2018

@codecov

This comment has been minimized.

codecov bot commented May 14, 2018

Codecov Report

Merging #670 into master will increase coverage by 0.14%.
The diff coverage is 90.66%.

@@            Coverage Diff             @@
##           master     #670      +/-   ##
==========================================
+ Coverage   90.89%   91.03%   +0.14%     
==========================================
  Files         379      383       +4     
  Lines       10143    10318     +175     
  Branches     1894     1923      +29     
==========================================
+ Hits         9219     9393     +174     
- Misses        924      925       +1

alexandru added some commits May 14, 2018

@alexandru alexandru merged commit be7dd82 into monix:master May 14, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

sderosiaux added a commit to sderosiaux/every-single-day-i-tldr that referenced this pull request May 23, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment