Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Asynchronous Stack Traces for Task #1267

Merged
merged 7 commits into from
Oct 28, 2020

Conversation

Avasil
Copy link
Collaborator

@Avasil Avasil commented Sep 13, 2020

Fixes #1015

Latest snapshot version: 3.2.2+55-9adf112c-SNAPSHOT

This is a direct port of cats.effect.IO solution, released in 2.2.0.
Tracing docs

See the following PR's for more context:

So pretty much all credit goes to @RaasAhsan and @djspiewak 👏

TODO:

The feature works exactly the same as in Cats-Effect and is enabled by default.
It can be configured the same way as Cats version, which is documented here

The prefix is monix.eval instead of cats.effect e.g.:

-Dmonix.eval.stackTracingMode=full -Dmonix.eval.traceBufferLogSize=5

Example of the current state:

package test.app

import monix.eval.Task
import monix.execution.Scheduler
import cats.implicits._
import scala.concurrent.duration._

object TestTracingApp extends App {
  implicit val s = Scheduler.global

  def customMethod: Task[Unit] =
    Task.now(()).guarantee(Task.sleep(10.millis))

  val tracingTestApp: Task[Unit] = for {
    _ <- Task.shift
    _ <- Task.unit.attempt
    _ <- (Task(println("Started the program")), Task.unit).parTupled
    _ <- customMethod
    _ <- if (true) Task.raiseError(new Exception("boom")) else Task.unit
  } yield ()

  tracingTestApp.onErrorHandleWith(ex => Task(ex.printStackTrace())).runSyncUnsafe
}

Old stack trace:

java.lang.Exception: boom
        at test.app.TestTracingApp$.$anonfun$tracingTestApp$5(TestTracingApp.scala:36)
        at monix.eval.internal.TaskRunLoop$.startFull(TaskRunLoop.scala:188)
        at monix.eval.internal.TaskRestartCallback.syncOnSuccess(TaskRestartCallback.scala:101)
        at monix.eval.internal.TaskRestartCallback$$anon$1.run(TaskRestartCallback.scala:118)
        at monix.execution.internal.Trampoline.monix$execution$internal$Trampoline$$immediateLoop(Trampoline.scala:66)
        at monix.execution.internal.Trampoline.startLoop(Trampoline.scala:32)
        at monix.execution.schedulers.TrampolineExecutionContext$JVMNormalTrampoline.super$startLoop(TrampolineExecutionContext.scala:142)
        at monix.execution.schedulers.TrampolineExecutionContext$JVMNormalTrampoline.$anonfun$startLoop$1(TrampolineExecutionContext.scala:142)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
        at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
        at monix.execution.schedulers.TrampolineExecutionContext$JVMNormalTrampoline.startLoop(TrampolineExecutionContext.scala:142)
        at monix.execution.internal.Trampoline.execute(Trampoline.scala:40)
        at monix.execution.schedulers.TrampolineExecutionContext.execute(TrampolineExecutionContext.scala:57)
        at monix.execution.schedulers.BatchingScheduler.execute(BatchingScheduler.scala:50)
        at monix.execution.schedulers.BatchingScheduler.execute$(BatchingScheduler.scala:47)
        at monix.execution.schedulers.AsyncScheduler.execute(AsyncScheduler.scala:31)
        at monix.eval.internal.TaskRestartCallback.onSuccess(TaskRestartCallback.scala:72)
        at monix.eval.internal.TaskRunLoop$.startFull(TaskRunLoop.scala:183)
        at monix.eval.internal.TaskRestartCallback.syncOnSuccess(TaskRestartCallback.scala:101)
        at monix.eval.internal.TaskRestartCallback.onSuccess(TaskRestartCallback.scala:74)
        at monix.eval.internal.TaskSleep$SleepRunnable.run(TaskSleep.scala:71)
        at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

Cached Mode:

java.lang.Exception: boom
        at test.app.TestTracingApp$.$anonfun$tracingTestApp$5(TestTracingApp.scala:36)
        at guarantee @ test.app.TestTracingApp$.customMethod(TestTracingApp.scala:29)
        at flatMap @ test.app.TestTracingApp$.$anonfun$tracingTestApp$4(TestTracingApp.scala:35)
        at parTupled @ test.app.TestTracingApp$.$anonfun$tracingTestApp$2(TestTracingApp.scala:34)
        at parTupled @ test.app.TestTracingApp$.$anonfun$tracingTestApp$2(TestTracingApp.scala:34)
        at flatMap @ test.app.TestTracingApp$.$anonfun$tracingTestApp$2(TestTracingApp.scala:34)
        at flatMap @ test.app.TestTracingApp$.$anonfun$tracingTestApp$1(TestTracingApp.scala:33)
        at flatMap @ test.app.TestTracingApp$.delayedEndpoint$test$app$TestTracingApp$1(TestTracingApp.scala:32)

Full Tracing Mode:

java.lang.Exception: boom
        at test.app.TestTracingApp$.$anonfun$tracingTestApp$5(TestTracingApp.scala:36)
        at raiseError @ test.app.TestTracingApp$.$anonfun$tracingTestApp$5(TestTracingApp.scala:36)
        at now @ test.app.TestTracingApp$.customMethod(TestTracingApp.scala:29)
        at guarantee @ test.app.TestTracingApp$.customMethod(TestTracingApp.scala:29)
        at flatMap @ test.app.TestTracingApp$.$anonfun$tracingTestApp$4(TestTracingApp.scala:35)
        at <clinit> @ test.app.TestTracingApp$.delayedEndpoint$test$app$TestTracingApp$1(TestTracingApp.scala:32)
        at apply @ test.app.TestTracingApp$.$anonfun$tracingTestApp$2(TestTracingApp.scala:34)
        at parTupled @ test.app.TestTracingApp$.$anonfun$tracingTestApp$2(TestTracingApp.scala:34)
        at parTupled @ test.app.TestTracingApp$.$anonfun$tracingTestApp$2(TestTracingApp.scala:34)
        at flatMap @ test.app.TestTracingApp$.$anonfun$tracingTestApp$2(TestTracingApp.scala:34)
        at <clinit> @ test.app.TestTracingApp$.delayedEndpoint$test$app$TestTracingApp$1(TestTracingApp.scala:32)
        at flatMap @ test.app.TestTracingApp$.$anonfun$tracingTestApp$1(TestTracingApp.scala:33)
        at flatMap @ test.app.TestTracingApp$.delayedEndpoint$test$app$TestTracingApp$1(TestTracingApp.scala:32)

import java.util.Optional;

/**
* All Credits to https://github.com/typelevel/cats-effect and https://github.com/RaasAhsan
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just added this comment to all files that I copy-pasted with minimal changes

I'm not sure how should I give credits properly (and in a legal way) - I'm more than happy to follow any suggestions @RaasAhsan , @djspiewak

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is pretty much all that is really needed. From an etiquette standpoint, this seems ample. From a legal standpoint, both Monix and CE are under ASLv2, so derivative works like this are entirely expected and normal.

/**
* All Credits to https://github.com/typelevel/cats-effect and https://github.com/RaasAhsan
*/
final case class TaskTrace(events: List[TaskEvent], captured: Int, omitted: Int) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether classes like that should be reused with Cats-Effect, or not.
I suppose it's safer to have it separate and create a type class if there are any libraries that operate on it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea would be to slurp this out into a common library which both can share. You would probably also be interested in the ConcurrentHashMap replacement that we're working on for 2.2.1, which should shrink the performance hit pretty dramatically.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djspiewak that looks interesting, where can I learn more about what replacement you're preparing for ConcurrentHashMap?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvican typelevel/cats-effect#1101

It's not a general replacement. CHM is actually insanely good at what it does. But tracing is a rather weird case. The cache converges to a relatively small size (strictly bounded by the number of classes) and then usually never changes again. Cache misses aren't fatal, they're just poorly optimized, and writes are not mandatory (follows from the miss bit). But within the context of all of that, we want reads to be blisteringly fast once fully converged.

In our synthetic benchmarks, tracing imposes a roughly 16% performance penalty on programs' straight-line performance (i.e. long chains of flatMaps). From what we can tell, nearly all of that is tied up in just reading from ConcurrentHashMap, due to the memory barriers which are necessary to ensure proper consistency, and secondarily due to the fact that the table doesn't aggressively optimize for a small and convergent set of keys. We can relax both of those constraints with a custom implementation and theoretically shave quite a bit off of that performance penalty.

Still a work in progress. :-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea would be to slurp this out into a common library which both can share.

That would be nice. Most of the code seems to be nicely separated and quite generic

Currently, I'm fine with copy-pasting and keeping an eye on updates but I will probably get tired of it pretty quickly, considering I plan to do the same for Coeval and monix-bio in the future

You would probably also be interested in the ConcurrentHashMap replacement that we're working on for 2.2.1, which should shrink the performance hit pretty dramatically.

Sounds awesome!
Could you use any help?

Copy link

@djspiewak djspiewak Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be nice. Most of the code seems to be nicely separated and quite generic

Currently, I'm fine with copy-pasting and keeping an eye on updates but I will probably get tired of it pretty quickly, considering I plan to do the same for Coeval and monix-bio in the future

We should definitely pull it upstream into a new lib then. This is also relevant for CE3, where we want to reimplement tracing, but it would be nice to not have to duplicate things. Ideas on a name? I rather like Sacagawea (tracking, tracing, etc), partially because Native American names and words are sorely underrepresented in the software space, and partially because she's cool and it seems like a distinctive name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on factoring out code for reusability. I would start with something like the UnsafeConvergentCache because it's actually very general-purpose and initialization-safe, thread-unsafe data structures may have applications outside tracing. Pulling out tracing-specific code like IOTrace/TaskTrace is also an awesome idea, but my only caution is that functionality will most certainly evolve in the CE2/CE3/Monix world (e.g. thread tracking will be introduced to CE2 soon), and saying that they won't diverge is probably short-sighted. It would be unfortunate to end up in a situation where the library prescribes too much on implementations or implementations are restricted by the library, but I think we can definitely find a middle ground. Ultimately if things don't work out, we should be able to sever the dependency easily (keeping use of that library internal is probably a reasonable precaution).

Currently, I'm fine with copy-pasting and keeping an eye on updates but I will probably get tired of it pretty quickly, considering I plan to do the same for Coeval and monix-bio in the future

If there are any improvements, it would be awesome to see those back in CE as well! :D But that's a really good reason to share all this code.

I rather like Sacagawea (tracking, tracing, etc), partially because Native American names and words are sorely underrepresented in the software space, and partially because she's cool and it seems like a distinctive name.

Sacagawea is an awesome name :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sacagawea is an awesome name :)

I can create a repo and set up some infra for it a little later this morning and you guys can shovel things in there as you see fit. :-) If it turns out to not be a good idea, it's always easy to un-shovel.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think I'll try to settle this PR as is, and then we could move towards keeping internals in sacagawea. Hopefully, I can find some improvements to pay you back a little. :)

I've counted only three public pieces (apart from configuration):

  • final case class IOTrace(events: List[IOEvent], captured: Int, omitted: Int)
  • IOEvent
  • PrintingOptions

And it should be perfectly fine to keep it per library, mostly cosmetic stuff apart from IOTrace.
In case of IOTrace, we could just use a type class in any future libraries.

Pulling out tracing-specific code like IOTrace/TaskTrace is also an awesome idea, but my only caution is that functionality will most certainly evolve in the CE2/CE3/Monix world (e.g. thread tracking will be introduced to CE2 soon)

Could you elaborate on this feature?
I'm wondering about IOTrace#showFiberTrace for other types of events.
Would it add extra lines about the current Thread?

* Global cache for trace frames. Keys are references to lambda classes.
* Should converge to the working set of traces very quickly for hot code paths.
*/
private[this] val frameCache: ConcurrentHashMap[Class[_], TaskEvent] = new ConcurrentHashMap()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Echoing my comment from typelevel/cats-effect#1101 (comment)

Storing strong references to the lambda classes as the key of the frame cache smells like a potential classloader leak when the lambda class is loaded from a classloader with a shorter lifecycle than cats-eval's classloader.

java.lang.ClassValue is designed to avoid such leaks. It's also designed to be extremely fast for lookups.

@Avasil
Copy link
Collaborator Author

Avasil commented Sep 18, 2020

BTW @jvican - do you have any performance tests in bloop?
If so, could you run them with 3.2.2+55-9adf112c-SNAPSHOT ?
I imagine that bloop will probably turn off stack traces anyway but it would be great to see what's the impact on real applications.

In microbenchmarks, I experience 10-30% slow-down for cached tracing

@jvican
Copy link
Collaborator

jvican commented Sep 21, 2020

Hey @Avasil, I have performance tests in bloop but bloop hasn't yet migrated to 3.x, it's something I'm planning to get around to soon, so unfortunately we won't be able to use our performance checks for a real-world performance evaluation.

In microbenchmarks, I experience 10-30% slow-down for cached tracing

What do you mean by cached tracing? Will this be the baseline when tracing is enabled by default?

@Avasil
Copy link
Collaborator Author

Avasil commented Sep 21, 2020

What do you mean by cached tracing? Will this be the baseline when tracing is enabled by default?

There are multiple configurations of tracing: https://typelevel.org/cats-effect/tracing/#configuration

It is going to be default but it can be disabled if someone notices any performance degradation

@9rb
Copy link

9rb commented Oct 17, 2020

This would be a very helpful feature. If a small release can be done that includes this feature, that would be great

@Avasil
Copy link
Collaborator Author

Avasil commented Oct 17, 2020

This would be a very helpful feature. If a small release can be done that includes this feature, that would be great

@9rb I plan to finish this PR and do an official release within two weeks but there is a snapshot that you could try: 3.2.2+55-9adf112c-SNAPSHOT - feedback would be greatly appreciated :)

@9rb
Copy link

9rb commented Oct 18, 2020

This would be a very helpful feature. If a small release can be done that includes this feature, that would be great

@9rb I plan to finish this PR and do an official release within two weeks but there is a snapshot that you could try: 3.2.2+55-9adf112c-SNAPSHOT - feedback would be greatly appreciated :)
Thanks. Will do.. @nzpr @tgrospic

@Avasil Avasil merged commit e7fa462 into monix:series/3.x Oct 28, 2020
@Avasil Avasil deleted the task-stack-traces branch November 14, 2020 15:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better stack traces for Task/Coeval/IO? Possible solutions?
6 participants