Histogram recorded value cannot be negative. #232

Closed
ShaneDelmore opened this issue Jul 17, 2015 · 7 comments · Fixed by #335

@ShaneDelmore

I noticed the following error shortly after starting up one of my Scala services running in a Docker container on OSX.

I do not know how to reproduce it, but if there are any steps I can take to help troubleshoot the issue, please let me know.

java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be negative.
dispatcher_1 | at org.HdrHistogram.AbstractHistogram.countsArrayIndex(AbstractHistogram.java:1944)
dispatcher_1 | at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:414)
dispatcher_1 | at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:333)
dispatcher_1 | at kamon.metric.instrument.HdrHistogram.record(Histogram.scala:115)
dispatcher_1 | at akka.kamon.instrumentation.ActorCellInstrumentation.aroundBehaviourInvoke(ActorCellInstrumentation.scala:68)
dispatcher_1 | at akka.actor.ActorCell.invoke(ActorCell.scala:483)
dispatcher_1 | at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
dispatcher_1 | at akka.dispatch.Mailbox.run(Mailbox.scala:221)
dispatcher_1 | at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
dispatcher_1 | at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
dispatcher_1 | at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
dispatcher_1 | at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
dispatcher_1 | at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

@ShaneDelmore
Author

Here are the Kamon modules and versions I am using.
val kamonV: String = "0.4.0"
libraryDependencies ++= Seq(
  "io.kamon" %% "kamon-core" % kamonV,
  "io.kamon" %% "kamon-system-metrics" % kamonV,
  "io.kamon" %% "kamon-newrelic" % kamonV,
  "io.kamon" %% "kamon-statsd" % kamonV,
  "io.kamon" %% "kamon-akka" % kamonV,
  "io.kamon" %% "kamon-spray" % kamonV
)

I am running on a Java 8 JVM and using aspectjweaver-1.8.6.jar.

@tabdulradi

I am hitting the same issue. I am sure I am recording positive values.
Note: I am also using Docker on OSX, if it makes a difference!

@joshlemer

I am having the same issue, using Java 8. Here's the stack trace:
[ERROR] [09/25/2015 10:58:56.721] [kamon-akka.actor.default-dispatcher-3] [TaskInvocation] Histogram recorded value cannot be negative.
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be negative.
at org.HdrHistogram.AbstractHistogram.countsArrayIndex(AbstractHistogram.java:1944)
at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:414)
at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:333)
at kamon.metric.instrument.HdrHistogram.record(Histogram.scala:115)
at kamon.metric.instrument.HistogramBackedGauge.refreshValue(Gauge.scala:117)
at kamon.metric.instrument.Gauge$$anonfun$1.apply$mcV$sp(Gauge.scala:41)
at akka.actor.Scheduler$$anon$5.run(Scheduler.scala:79)
at akka.actor.LightArrayRevolverScheduler$$anon$2$$anon$1.run(Scheduler.scala:242)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

@ivantopo
Contributor

I am looking into this issue now, and sadly there doesn't seem to be much we can do in the case of a failing gauge: as you can see in the code, there is no information at all to tell you which gauge is failing. But I see two things we can do to improve the situation:

  • Catch the exception! All in all, Kamon is a monitoring tool, and we shouldn't let a problem with a single measurement in Kamon bubble up and possibly make user-level functionality fail.
  • Provide better context when something fails, and even have some metrics about it. I guess that if we at least log the entity and metric name, we will have a better chance of reproducing the issue.

I will work on this ASAP, regards!
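
The two proposals above could look roughly like the following. This is only a minimal sketch, not Kamon's actual implementation: `StrictHistogram` is a hypothetical stand-in for a histogram that rejects negative values, and `SafeRecorder`, its `entity` and `metric` parameters, and the `log` callback are all names invented for illustration.

```scala
import scala.util.control.NonFatal

object SafeRecordingSketch {

  // Hypothetical stand-in for a histogram that throws on negative values,
  // imitating the exception seen in the stack traces in this thread.
  final class StrictHistogram {
    private var recorded = Vector.empty[Long]

    def record(value: Long): Unit = {
      if (value < 0)
        throw new ArrayIndexOutOfBoundsException("Histogram recorded value cannot be negative.")
      recorded = recorded :+ value
    }

    def values: Vector[Long] = recorded
  }

  // Hypothetical wrapper: catches any non-fatal failure while recording,
  // counts it, and logs the entity and metric name for later diagnosis.
  // Not thread-safe; a real implementation would need atomic counters.
  final class SafeRecorder(
      entity: String,
      metric: String,
      underlying: StrictHistogram,
      log: String => Unit
  ) {
    private var failures = 0L

    def record(value: Long): Unit =
      try underlying.record(value)
      catch {
        case NonFatal(e) =>
          failures += 1
          log(s"Failed to record value [$value] on metric [$metric] of entity [$entity]: ${e.getMessage}")
      }

    def failureCount: Long = failures
  }
}
```

With a wrapper of this shape, a bad measurement is dropped and reported instead of propagating into the instrumented actor, and the failure counter could itself be exposed as a metric.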

@thjaeckle

I am also facing this issue. In my case it has something to do with "kamon-system-metrics":

[ERROR] [11/12/2015 10:39:26.403] [kamon-akka.actor.default-dispatcher-8] [akka://kamon/user/kamon-system-metrics/sigar-metrics-recorder] Histogram recorded value cannot be negative.
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be negative.
    at org.HdrHistogram.AbstractHistogram.countsArrayIndex(AbstractHistogram.java:2068)
    at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:427)
    at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:346)
    at kamon.metric.instrument.HdrHistogram.record(Histogram.scala:115)
    at kamon.system.sigar.ProcessCpuMetrics.update(ProcessCpuMetrics.scala:69)
    at kamon.system.sigar.SigarMetricsUpdater.updateMetrics(SigarMetricsUpdater.scala:49)

I don't know how the CPU metrics could be negative, but that is the case :) (This is happening on Windows 7, by the way.)

@swachter

Looking at the code of ProcessCpuMetrics, it seems to me that negative numbers can in fact occur:

def update(): Unit = {
  val currentProcCpu = sigar.getProcCpu(pid)
  val totalDiff = currentProcCpu.getTotal - lastProcCpu.getTotal
  val userDiff = currentProcCpu.getUser - lastProcCpu.getUser
  val systemDiff = currentProcCpu.getSys - lastProcCpu.getSys
  val timeDiff = currentProcCpu.getLastTime - lastProcCpu.getLastTime

  def percentUsage(delta: Long): Long = Try(100 * delta / timeDiff / totalCores).getOrElse(0L)

  if (totalDiff == 0) {
    if (timeDiff > 2000) currentLoad = 0
    if (currentLoad == 0) lastProcCpu = currentProcCpu
  } else {
    val totalPercent = percentUsage(totalDiff)
    val userPercent = percentUsage(userDiff)
    val systemPercent = percentUsage(systemDiff)

    processUserCpu.record(userPercent)
    processSystemCpu.record(systemPercent)   // <<<==== this is line 69
    processTotalCpu.record(userPercent + systemPercent)

    currentLoad = totalPercent
    lastProcCpu = currentProcCpu
  }
}

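The values of systemDiff and systemPercent can be negative: if the current getSys reading is lower than the previous one, the difference is below zero and so is the percentage derived from it. I assume that simply taking the absolute value would fix the problem. (I am not that deep into the details.)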
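
To make the arithmetic concrete, here is a small self-contained sketch. It copies the shape of percentUsage from the snippet above but takes timeDiff and totalCores as explicit parameters (their types are assumed to be Long). The `nonNegativePercent` helper is hypothetical and clamps to zero rather than taking the absolute value; that is a different choice from the one suggested above, shown only as one possible guard.

```scala
import scala.util.Try

object CpuPercentSketch {

  // Same formula as percentUsage in the quoted code, with explicit inputs.
  // Integer division by zero throws ArithmeticException, which Try turns into 0.
  def percentUsage(delta: Long, timeDiff: Long, totalCores: Long): Long =
    Try(100 * delta / timeDiff / totalCores).getOrElse(0L)

  // Hypothetical guard: never hand a negative value to the histogram.
  def nonNegativePercent(delta: Long, timeDiff: Long, totalCores: Long): Long =
    math.max(0L, percentUsage(delta, timeDiff, totalCores))
}
```

For example, a delta of -50 over a timeDiff of 1000 on 4 cores gives 100 * -50 / 1000 / 4 = -1, which a histogram that rejects negative values would refuse. Clamping reports 0 for such a sample, whereas the absolute value would report 1, a figure that may not correspond to real CPU usage.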

@ghost

ghost commented Mar 30, 2016

Please ensure that any exception triggered within Kamon is caught. We have experienced a couple of issues due to #284, and the result is that the underlying actor no longer receives any messages sent to its mailbox. Under most circumstances, it is possible to rework applications to prevent 'old' messages from triggering the exception, but if you need to stash a message pending user intervention, #252 means that the same exception can be triggered. As a monitoring tool, Kamon should not interfere with the running of the application.

@ivantopo ivantopo self-assigned this Apr 1, 2016
ivantopo added a commit to ivantopo/Kamon that referenced this issue Apr 1, 2016
dpsoft referenced this issue Apr 22, 2016
core: catch any exception being thrown when recording values on histograms
dpsoft pushed a commit that referenced this issue Dec 4, 2016
dpsoft referenced this issue Dec 4, 2016
core: catch any exception being thrown when recording values on histograms