@gpolaert gpolaert commented Jul 7, 2017

No description provided.

@gpolaert gpolaert closed this Jul 7, 2017
@gpolaert gpolaert deleted the gpolaert/circle-ci-test branch July 7, 2017 10:13
@tylerbenson tylerbenson added this to the Closed milestone Apr 10, 2020
deejgregor added a commit to deejgregor/dd-trace-java that referenced this pull request Oct 14, 2025
In DumpDrain, the collectTraces method replaces the 'data' field
with an empty ArrayList but does not reset the 'index' field. If
another dump is performed later, the get method reaches its
'return null' statement and, as the comment there states, this can
(and does) break the queue.

This change does a few things:
- Resets the index in collectTraces when the data field is replaced
  (and marks the index field as volatile). This should prevent the
  above situation from happening.
- In case the situation still happens, a stand-in CommandElement
  is returned to avoid returning null. A warning message is also logged.
- The existing "testing tracer flare dump with multiple traces"
  test case is expanded to exercise the problem.
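
The failure mode and the two mitigations listed above can be sketched in isolation. This is a minimal, single-threaded illustration, not the actual dd-trace-java code: the class name DumpDrainSketch, the String element type, the STAND_IN constant, and the resetIndexOnCollect flag are all assumptions introduced here so that both the old and the new behavior can be shown side by side.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for DumpDrain; not the real dd-trace-java class. */
final class DumpDrainSketch {
  /** Plays the role of the stand-in CommandElement (assumed name). */
  static final String STAND_IN = "stand-in";

  // false reproduces the situation described above; true applies the reset.
  private final boolean resetIndexOnCollect;
  private List<String> data = new ArrayList<>();
  private volatile int index = 0;

  DumpDrainSketch(boolean resetIndexOnCollect) {
    this.resetIndexOnCollect = resetIndexOnCollect;
  }

  /** Records one element drained from the queue. */
  void accept(String element) {
    data.add(element);
  }

  /** Supplies the next recorded element so it can be put back on the queue. */
  String get() {
    int i = index;
    if (i < data.size()) {
      index = i + 1;
      return data.get(i);
    }
    // This is the branch that used to return null. Return a stand-in and warn instead.
    System.err.println("warning: index " + i + " is past data size " + data.size());
    return STAND_IN;
  }

  /** Hands the recorded elements to the caller and starts a fresh list. */
  List<String> collectTraces() {
    List<String> collected = data;
    data = new ArrayList<>();
    if (resetIndexOnCollect) {
      index = 0; // without this, a stale index outlives the list it referred to
    }
    return collected;
  }
}
```

With resetIndexOnCollect set to false, a second dump that records fewer elements than the first one leaves index pointing past the end of the new list, so get() falls through to the fallback branch. With it set to true, the second dump starts reading from position 0 again.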

Here is an example stack trace when the hang happens:

"dd-trace-monitor" #38 daemon prio=5 os_prio=31 tid=0x0000000110e6e000 nid=0x7617 runnable [0x0000000171032000]
   java.lang.Thread.State: RUNNABLE
    at org.jctools.queues.MpscBlockingConsumerArrayQueue.spinWaitForElement(MpscBlockingConsumerArrayQueue.java:634)
    at org.jctools.queues.MpscBlockingConsumerArrayQueue.parkUntilNext(MpscBlockingConsumerArrayQueue.java:566)
    at org.jctools.queues.MpscBlockingConsumerArrayQueue.take(MpscBlockingConsumerArrayQueue.java:482)
    at datadog.trace.core.PendingTraceBuffer$DelayingPendingTraceBuffer$Worker.run(PendingTraceBuffer.java:317)
    at java.lang.Thread.run(Thread.java:750)
deejgregor added a commit to deejgregor/dd-trace-java that referenced this pull request Oct 16, 2025