hot_threads stacktraces can come from threads that are no longer hot #5952

@nik9000

I frequently see hot_threads output like this:

   24.6% (122.8ms out of 500ms) cpu usage by thread 'elasticsearch[elastic1012][search][T#13]'
     10/10 snapshots sharing following 10 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:706)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.xfer(LinkedTransferQueue.java:615)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.take(LinkedTransferQueue.java:1109)
       org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:162)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)   

or when the system is under lower load, I see this:

    8.2% (41.1ms out of 500ms) cpu usage by thread 'elasticsearch[elastic1016][refresh][T#5]'
     10/10 snapshots sharing following 9 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:702)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.xfer(LinkedTransferQueue.java:615)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.poll(LinkedTransferQueue.java:1117)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)

It looks like hot_threads first works out which threads are hot and only then takes stack trace snapshots of them. I imagine this puts significantly less load on the system than snapshotting all the threads during the load measurement, but I think it makes the results less useful: by the time the snapshots are taken, a thread that was hot may already be idle again. That's fine for really long-running stuff like merges, but searches tend to be bursty.

It's still super useful, but I find myself using jstack and thread dumps now that I've squashed most of the long-running problems. Maybe a mode that takes snapshots of all the threads while the load is being measured, and then runs hot_threads' stack trace merging over them, would be more useful to me at this stage.
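
To make the suggestion concrete, here is a rough sketch of what I mean, using only the standard `java.lang.management` API. It is not how hot_threads is implemented; `AllThreadSampler`, `Sample`, and the `snapshots`/`pauseMillis` parameters are names I made up for illustration, and it assumes Java 8+ and a JVM that supports per-thread CPU time. The idea is to dump every thread's stack inside the same window in which CPU time is measured, so the stacks reported for a thread were captured while it was actually burning CPU.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not the actual hot_threads code. */
final class AllThreadSampler {

    /** CPU time a thread used during the window, plus the stacks seen in that same window. */
    static final class Sample {
        final long threadId;
        final String name;
        final long cpuNanos;
        final List<StackTraceElement[]> stacks;

        Sample(long threadId, String name, long cpuNanos, List<StackTraceElement[]> stacks) {
            this.threadId = threadId;
            this.name = name;
            this.cpuNanos = cpuNanos;
            this.stacks = stacks;
        }
    }

    /** Returns one Sample per thread that lived through the window, busiest first. */
    static List<Sample> sample(int snapshots, long pauseMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // CPU time of every live thread at the start of the window.
        // getThreadCpuTime returns -1 for dead threads or when measurement is disabled.
        Map<Long, Long> startCpu = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);
            if (cpu >= 0) {
                startCpu.put(id, cpu);
            }
        }

        // Snapshot ALL threads during the window, not just the ones later found to be hot.
        Map<Long, List<StackTraceElement[]>> stacks = new HashMap<>();
        Map<Long, String> names = new HashMap<>();
        for (int i = 0; i < snapshots; i++) {
            for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
                if (info == null) {
                    continue;
                }
                stacks.computeIfAbsent(info.getThreadId(), k -> new ArrayList<>())
                      .add(info.getStackTrace());
                names.put(info.getThreadId(), info.getThreadName());
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the flag and stop sampling early
                break;
            }
        }

        // CPU delta over the window, paired with the stacks captured inside it.
        List<Sample> result = new ArrayList<>();
        for (Map.Entry<Long, Long> e : startCpu.entrySet()) {
            long endCpu = mx.getThreadCpuTime(e.getKey());
            List<StackTraceElement[]> seen = stacks.get(e.getKey());
            if (endCpu < 0 || seen == null) {
                continue; // thread died or was never captured
            }
            result.add(new Sample(e.getKey(), names.get(e.getKey()), endCpu - e.getValue(), seen));
        }
        result.sort(Comparator.comparingLong((Sample s) -> s.cpuNanos).reversed());
        return result;
    }
}
```

Something like `AllThreadSampler.sample(10, 50)` would roughly match the "10 snapshots over 500ms" shape of the output above. The existing merging of common stack frames could then be applied to each hot thread's `stacks`. The obvious cost is that every snapshot dumps all threads, which is the extra load I was guessing the current approach avoids.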
