Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting IndexOutOfBoundsException while Kamon is active #71

Closed
adamhonen opened this issue Aug 6, 2014 · 11 comments
Closed

Getting IndexOutOfBoundsException while Kamon is active #71

adamhonen opened this issue Aug 6, 2014 · 11 comments
Assignees
Labels

Comments

@adamhonen
Copy link

While using Kamon to monitor an Akka system I get a IndexOutOfBoundsException.

I'm using Kamon 0.3.3 on JDK 8 update 11 with Scala 2.11.1 and Akka 2.3.4.
I'm not using Spray or Play.

My system typically works at ~12K requests per second in this specific environment (development machine). With Kamon active this is reduced to ~7500K requests per seconds.

A few examples (notice the index always seems to be 7040):
java.lang.IndexOutOfBoundsException: index 7040
at java.util.concurrent.atomic.AtomicLongArray.checkedByteOffset(AtomicLongArray.java:65) ~[?:1.8.0_11]
at java.util.concurrent.atomic.AtomicLongArray.getAndAdd(AtomicLongArray.java:215) ~[?:1.8.0_11]
at java.util.concurrent.atomic.AtomicLongArray.incrementAndGet(AtomicLongArray.java:225) ~[?:1.8.0_11]
at org.HdrHistogram.AtomicHistogram.incrementCountAtIndex(AtomicHistogram.java:34) ~[HdrHistogram-1.0.8.jar:0.3.3]
at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:295) ~[HdrHistogram-1.0.8.jar:0.3.3]
at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:372) ~[HdrHistogram-1.0.8.jar:0.3.3]
at kamon.metric.instrument.HdrHistogram.record(Histogram.scala:105) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.PaddedMinMaxCounter.refreshValues(MinMaxCounter.scala:110) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.MinMaxCounter$$anonfun$1.apply$mcV$sp(MinMaxCounter.scala:47) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.actor.Scheduler$$anon$5.run(Scheduler.scala:79) ~[akka-actor_2.11-2.3.4.jar:?]
at akka.actor.LightArrayRevolverScheduler$$anon$2$$anon$1.run(Scheduler.scala:242) ~[akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) ~[akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [akka-actor_2.11-2.3.4.jar:?]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.11.1.jar:?]

java.lang.IndexOutOfBoundsException: index 7040
at java.util.concurrent.atomic.AtomicLongArray.checkedByteOffset(AtomicLongArray.java:65) ~[?:1.8.0_11]
at java.util.concurrent.atomic.AtomicLongArray.getAndAdd(AtomicLongArray.java:215) ~[?:1.8.0_11]
at java.util.concurrent.atomic.AtomicLongArray.incrementAndGet(AtomicLongArray.java:225) ~[?:1.8.0_11]
at org.HdrHistogram.AtomicHistogram.incrementCountAtIndex(AtomicHistogram.java:34) ~[HdrHistogram-1.0.8.jar:0.3.3]
at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:295) ~[HdrHistogram-1.0.8.jar:0.3.3]
at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:372) ~[HdrHistogram-1.0.8.jar:0.3.3]
at kamon.metric.instrument.HdrHistogram.record(Histogram.scala:105) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.PaddedMinMaxCounter.refreshValues(MinMaxCounter.scala:110) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.PaddedMinMaxCounter.collect(MinMaxCounter.scala:89) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.PaddedMinMaxCounter.collect(MinMaxCounter.scala:65) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.ActorMetrics$ActorMetricsRecorder.collect(ActorMetrics.scala:42) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.ActorMetrics$ActorMetricsRecorder.collect(ActorMetrics.scala:35) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions$$anonfun$collectAll$1.apply(Subscriptions.scala:83) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions$$anonfun$collectAll$1.apply(Subscriptions.scala:82) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at scala.collection.Iterator$class.foreach(Iterator.scala:743) ~[scala-library-2.11.1.jar:?]
at scala.collection.concurrent.TrieMapIterator.foreach(TrieMap.scala:923) ~[scala-library-2.11.1.jar:?]
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) ~[scala-library-2.11.1.jar:?]
at scala.collection.concurrent.TrieMap.foreach(TrieMap.scala:633) ~[scala-library-2.11.1.jar:?]
at kamon.metric.Subscriptions.collectAll(Subscriptions.scala:82) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions.flush(Subscriptions.scala:69) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions$$anonfun$receive$1.applyOrElse(Subscriptions.scala:41) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.11-2.3.4.jar:?]
at kamon.metric.Subscriptions.aroundReceive(Subscriptions.scala:27) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [akka-actor_2.11-2.3.4.jar:?]
at akka.actor.ActorCell.invoke_aroundBody0(ActorCell.scala:487) [akka-actor_2.11-2.3.4.jar:?]
at akka.actor.ActorCell$AjcClosure1.run(ActorCell.scala:1) [akka-actor_2.11-2.3.4.jar:?]
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:149) [aspectjrt-1.7.4.jar:?]
at akka.instrumentation.ActorCellInstrumentation$$anonfun$aroundBehaviourInvoke$1.apply(ActorCellInstrumentation.scala:56) [kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.trace.TraceRecorder$.withTraceContext(TraceRecorder.scala:69) [kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.instrumentation.ActorCellInstrumentation.aroundBehaviourInvoke(ActorCellInstrumentation.scala:55) [kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.actor.ActorCell.invoke(ActorCell.scala:483) [akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) [akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.Mailbox.run(Mailbox.scala:220) [akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [akka-actor_2.11-2.3.4.jar:?]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.11.1.jar:?]

java.lang.IndexOutOfBoundsException: index 7040
at java.util.concurrent.atomic.AtomicLongArray.checkedByteOffset(AtomicLongArray.java:65) ~[?:1.8.0_11]
at java.util.concurrent.atomic.AtomicLongArray.getAndAdd(AtomicLongArray.java:215) ~[?:1.8.0_11]
at java.util.concurrent.atomic.AtomicLongArray.incrementAndGet(AtomicLongArray.java:225) ~[?:1.8.0_11]
at org.HdrHistogram.AtomicHistogram.incrementCountAtIndex(AtomicHistogram.java:34) ~[HdrHistogram-1.0.8.jar:0.3.3]
at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:295) ~[HdrHistogram-1.0.8.jar:0.3.3]
at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:372) ~[HdrHistogram-1.0.8.jar:0.3.3]
at kamon.metric.instrument.HdrHistogram.record(Histogram.scala:105) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.PaddedMinMaxCounter.refreshValues(MinMaxCounter.scala:110) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.PaddedMinMaxCounter.collect(MinMaxCounter.scala:89) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.instrument.PaddedMinMaxCounter.collect(MinMaxCounter.scala:65) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.ActorMetrics$ActorMetricsRecorder.collect(ActorMetrics.scala:42) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.ActorMetrics$ActorMetricsRecorder.collect(ActorMetrics.scala:35) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions$$anonfun$collectAll$1.apply(Subscriptions.scala:83) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions$$anonfun$collectAll$1.apply(Subscriptions.scala:82) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at scala.collection.Iterator$class.foreach(Iterator.scala:743) ~[scala-library-2.11.1.jar:?]
at scala.collection.concurrent.TrieMapIterator.foreach(TrieMap.scala:923) ~[scala-library-2.11.1.jar:?]
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) ~[scala-library-2.11.1.jar:?]
at scala.collection.concurrent.TrieMap.foreach(TrieMap.scala:633) ~[scala-library-2.11.1.jar:?]
at kamon.metric.Subscriptions.collectAll(Subscriptions.scala:82) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions.flush(Subscriptions.scala:69) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.metric.Subscriptions$$anonfun$receive$1.applyOrElse(Subscriptions.scala:41) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.11-2.3.4.jar:?]
at kamon.metric.Subscriptions.aroundReceive(Subscriptions.scala:27) ~[kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [akka-actor_2.11-2.3.4.jar:?]
at akka.actor.ActorCell.invoke_aroundBody0(ActorCell.scala:487) [akka-actor_2.11-2.3.4.jar:?]
at akka.actor.ActorCell$AjcClosure1.run(ActorCell.scala:1) [akka-actor_2.11-2.3.4.jar:?]
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:149) [aspectjrt-1.7.4.jar:?]
at akka.instrumentation.ActorCellInstrumentation$$anonfun$aroundBehaviourInvoke$1.apply(ActorCellInstrumentation.scala:56) [kamon-core_2.11-0.3.3.jar:0.3.3]
at kamon.trace.TraceRecorder$.withTraceContext(TraceRecorder.scala:69) [kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.instrumentation.ActorCellInstrumentation.aroundBehaviourInvoke(ActorCellInstrumentation.scala:55) [kamon-core_2.11-0.3.3.jar:0.3.3]
at akka.actor.ActorCell.invoke(ActorCell.scala:483) [akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) [akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.Mailbox.run(Mailbox.scala:220) [akka-actor_2.11-2.3.4.jar:?]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [akka-actor_2.11-2.3.4.jar:?]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.11.1.jar:?]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.11.1.jar:?]

Let me know in case you need more information.
Thanks.

@m50d
Copy link
Contributor

m50d commented Aug 7, 2014

This looks like the same issue as mine, https://groups.google.com/forum/#!topic/kamon-user/ph568abmR0s

@adamhonen
Copy link
Author

Yeah, same one.
I only noticed that thread after opening this issue.

@ivantopo
Copy link
Contributor

ivantopo commented Aug 7, 2014

@adamhonen what OS are you using?

@adamhonen
Copy link
Author

I'm running on Windows 7 64bit with Service Pack 1.

On Thu, Aug 7, 2014 at 5:02 PM, Ivan Topolnjak notifications@github.com
wrote:

@adamhonen https://github.com/adamhonen what OS are you using?


Reply to this email directly or view it on GitHub
#71 (comment).

@ivantopo
Copy link
Contributor

@adamhonen please read this post on the mailing list for more info on what current snapshots solve this issue. Also, with regards to the first comment on the reduced speed: the metrics collection code in Kamon will introduce some overhead in your app, that is for sure, even more in Windows where calls to System.nanoTime() are much more expensive than in Linux and OSX and Kamon uses it a lot.

The impact in your application's performance will also depend a lot in your design: if your messages are very simple to process, like updating a counter, the overhead of measuring metrics will be obviously bigger than the time taken to do x += 1.. if your actors do some decent processing or wrap a blocking API such as JDBC then the time spent recording metrics will be unnoticeable.. one of the busiest app we have works pretty much the same with or without Kamon.

We work very hard to make Kamon the fastest and most feature complete tool for monitoring Akka, Spray and Play and it would be very nice if you can profile your app and do some more investigation to see if there are some hot spots that we can improve, we are always happy to receive feedback!

@adamhonen
Copy link
Author

Yeah,

I'm quite familiar with how monitoring tools' overhead works, as I've spent close to 8 years developing such tools. I actually only mentioned the overhead as indication for whether or not Kamon works at all. The test I was running is completely naive and in no way near to what a proper performance test should look like - I was running everything on the same machine (which also happens to be my development machine). So while in production I expect this application to be IO bound and process much less operations per second, it's currently CPU bound.

I will, however, definitely be using Kamon to monitor it once I start running proper load tests and I will definitely let you know if I see anything.

@pcting
Copy link

pcting commented Nov 14, 2014

I'm using spray 1.3.2 and I've also been encountering this issue. My environment gets about 7.5k req/min to 15k req/min. The problem seems to manifests itself after 3+ hours of later. I've tried and experienced this on both Kamon 0.3.5-a883a07bba0758df982feb431accb6447118ab69 and latest 0.3.5, but this is a stacktrace on 0.3.5:

16:07:12.327 [bulkhead-akka.actor.default-dispatcher-4] ERROR akka.actor.ActorSystemImpl - Uncaught error from thread [XXX-akka.actor.default-dispatcher-8]
java.lang.IndexOutOfBoundsException: index 4663
        at java.util.concurrent.atomic.AtomicLongArray.checkedByteOffset(AtomicLongArray.java:64) ~[na:1.7.0_72]
        at java.util.concurrent.atomic.AtomicLongArray.addAndGet(AtomicLongArray.java:256) ~[na:1.7.0_72]
        at java.util.concurrent.atomic.AtomicLongArray.incrementAndGet(AtomicLongArray.java:235) ~[na:1.7.0_72]
        at org.HdrHistogram.AtomicHistogram.incrementCountAtIndex(AtomicHistogram.java:36) ~[org.hdrhistogram.HdrHistogram-1.2.1.jar:0.3.5]
        at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:285) ~[org.hdrhistogram.HdrHistogram-1.2.1.jar:0.3.5]
        at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:216) ~[org.hdrhistogram.HdrHistogram-1.2.1.jar:0.3.5]
        at kamon.metric.instrument.HdrHistogram.record(Histogram.scala:109) ~[io.kamon.kamon-core_2.11-0.3.5.jar:0.3.5]
        at akka.instrumentation.ActorCellInstrumentation.aroundBehaviourInvoke(ActorCellInstrumentation.scala:69) ~[io.kamon.kamon-core_2.11-0.3.5.jar:0.3.5]
        at akka.actor.ActorCell.invoke(ActorCell.scala:483) [com.typesafe.akka.akka-actor_2.11-2.3.6.jar:na]
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) [com.typesafe.akka.akka-actor_2.11-2.3.6.jar:na]
        at akka.dispatch.Mailbox.run(Mailbox.scala:220) [com.typesafe.akka.akka-actor_2.11-2.3.6.jar:na]
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [com.typesafe.akka.akka-actor_2.11-2.3.6.jar:na]
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [org.scala-lang.scala-library-2.11.2.jar:na]
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [org.scala-lang.scala-library-2.11.2.jar:na]
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [org.scala-lang.scala-library-2.11.2.jar:na]
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [org.scala-lang.scala-library-2.11.2.jar:na]

@ivantopo
Copy link
Contributor

@pcting judging by the stacktrace and the index with which it is failing, the reason for getting this is that you are trying to record a measurement of about 2 hours long but the default configuration goes up to 1 hour. If it is normal to your app to have something waiting in the mailbox or being processed during such a long period, you can always set the kamon.metrics.precision.actor.time-in-mailbox.highest-trackable-value to a value that makes sense to you, in nanoseconds.

Please note that trying to modify kamon.metrics.precision.default-histogram-precision will not work in this case because substitutions inside reference.conf files are resolved before merging with application.conf files, we already had that issue before :s.. please let us know if this behavior fits with your app and changing the settings solve the problem, bye!

@pcting
Copy link

pcting commented Nov 14, 2014

I'll make that adjustment to see if it helps, but it looks like it may be an upstream bug in the histogram library: HdrHistogram/HdrHistogram#24

@pcting
Copy link

pcting commented Nov 19, 2014

@ivantopo Good news! I realized the root of my problem; I was generating an ton of tracecontexts with unique names! I added a bunch of traceName directives with static names and upgraded to HdrHistogram 2.0.2. I've been testing for several days now and the code seems stable now!

I'm not sure if the HdrHistogram update was the fix, but I'm happy with the results now.

@ivantopo
Copy link
Contributor

@pcting excellent! The HdrHistogram update should not be necessary, please note that we are using a specialised subclass of HdrHistogram for Kamon and even while your upgrade was smooth, it will not necessarily be true for all future versions of HdrHistogram... we will try to upgrade to the latest version for the next release. Thanks for sharing your experience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants