
[processor/groupbytrace] Deadlock when eventMachineWorker's events queue is full #33719

Open · vijayaggarwal opened this issue Jun 22, 2024 · 6 comments
Labels: bug (Something isn't working), processor/groupbytrace (Group By Trace processor)

Comments

@vijayaggarwal

Component(s)

processor/groupbytrace

What happened?

Description

I run this processor with num_traces=100_000 (default is 1_000_000) to reduce the worst case memory requirement of this component.

Also, I run this processor with wait_duration=10s (default is 1s) so that the processor can handle longer gaps between receiving the spans of a trace.

Lastly, I use num_workers=1, which is also the default.

If the component gets a burst of traffic and more than num_traces=100_000 traces enter the processor in a short span of time, the ringBuffer fills up and evictions start. It is also quite likely that the eventMachineWorker.events channel's buffer (capacity = 10_000/num_workers) fills up with traceReceived events.

Now, when both the ring buffer and the events buffer are full, processing a traceReceived event triggers an eviction and therefore fires a traceRemoved event. The worker blocks while pushing that traceRemoved event onto its own full events channel, and since the worker is the channel's only consumer, the channel is never drained: the processor deadlocks.
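
To make the failure mode concrete, here is a minimal, self-contained Go sketch that reproduces the same pattern outside the collector. It is not the processor's actual code; the event type, channel capacity, and eviction logic are simplified stand-ins.

```go
package main

import (
	"fmt"
	"time"
)

// event is a simplified stand-in for the processor's internal event type.
type event struct{ typ string }

func main() {
	// Tiny buffer standing in for eventMachineWorker.events
	// (capacity 10_000/num_workers in the real processor).
	events := make(chan event, 2)
	events <- event{typ: "traceReceived"}
	events <- event{typ: "traceReceived"}

	// Incoming spans keep topping the queue up, as during a traffic burst.
	go func() {
		for {
			events <- event{typ: "traceReceived"}
		}
	}()

	done := make(chan struct{})
	go func() { // the single worker: the only consumer of `events`
		defer close(done)
		for e := range events {
			if e.typ == "traceReceived" {
				// The ring buffer is full, so this event triggers an eviction,
				// which fires traceRemoved back into the SAME channel. The send
				// blocks: the channel is full and its only consumer is this goroutine.
				events <- event{typ: "traceRemoved"}
			}
		}
	}()

	select {
	case <-done:
		fmt.Println("worker finished")
	case <-time.After(2 * time.Second):
		fmt.Println("stuck: worker blocked sending traceRemoved into its own full queue")
	}
}
```

The essential property is that the worker is both the sole consumer of the channel and, on eviction, a producer into it; once external producers keep the buffer full, its own send can never complete.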

Expected Result

The processor should not get deadlocked.

Actual Result

The processor gets deadlocked.

Collector version

0.95.0, but the issue affects all recent versions of the collector

Environment information

The issue is environment agnostic

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@vijayaggarwal vijayaggarwal added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Jun 22, 2024
@github-actions github-actions bot added the processor/groupbytrace (Group By Trace processor) label on Jun 22, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@jpkrohling jpkrohling removed the needs triage (New item requiring triage) label on Jul 4, 2024
@jpkrohling
Member

@vijayaggarwal , I can totally see what you mean, but would you please create a test case for this? Perhaps we should split the state machine to deal with deletion events separately?

@vijayaggarwal
Author

@jpkrohling I can't think of how to capture this in a unit test case as there are multiple functions involved and the initial condition is also non-trivial. If you can point me to some other test case which is even partly similar to this one, I will be happy to give it a shot.

That said, here's a simple description:

Given:
eventMachineWorker.events is full, and eventMachineWorker.buffer is also full.

When:
eventMachineWorker calls eventMachine.handleEvent with e.typ == traceReceived

Then:
eventMachineWorker ends up firing a traceRemoved event (in the groupByTraceProcessor.onTraceReceived function). Firing an event is a blocking operation, so the eventMachineWorker stalls: eventMachineWorker.events is full, and its only consumer is now blocked producing into it.
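
If it helps, here is one possible shape for such a test, sketched against a simplified stand-in rather than the real eventMachineWorker API (the channel and the goroutine body below are placeholders for where the real handler would be invoked). The idea is just to bound the assertion with a deadline: with the current behaviour the handler blocks and the test times out; with either of the approaches discussed below it would return promptly.

```go
package groupbytraceprocessor

import (
	"testing"
	"time"
)

// Hypothetical test shape: fill the worker's event queue, drive the
// traceReceived-with-eviction path, and require that the handler returns
// within a deadline instead of blocking on its own queue.
func TestTraceReceivedDoesNotBlockOnFullQueue(t *testing.T) {
	events := make(chan string, 1) // stand-in for a full eventMachineWorker.events
	events <- "traceReceived"

	returned := make(chan struct{})
	go func() {
		defer close(returned)
		// Placeholder for calling onTraceReceived with a full ring buffer; in the
		// current implementation this fires traceRemoved into the same full channel.
		events <- "traceRemoved"
	}()

	select {
	case <-returned:
		// desired: the handler finished without waiting for its own queue to drain
	case <-time.After(500 * time.Millisecond):
		t.Fatal("handler blocked producing into the worker's own events queue")
	}
}
```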

Fundamentally speaking, I think the eventMachineWorker should not produce events into eventMachineWorker.events, as its primary role is to consume, not to produce. To maintain a clear separation of roles, we can either split the state machine as you suggested or execute the onTraceRemoved logic synchronously in onTraceReceived itself. I have a slight preference for the second approach, as I feel it will lead to simpler code.
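
To illustrate what I mean by the second approach, here is a rough sketch with simplified stand-in types (these are not the processor's real types or signatures):

```go
package main

import "fmt"

// Simplified stand-ins (not the real processor types) to contrast the two paths.
type trace struct{ id string }

type worker struct {
	events    chan string      // the worker's own event queue (untouched in this approach)
	buffer    map[string]trace // stand-in for the fixed-capacity ring buffer
	maxTraces int
}

// putWithEviction inserts a trace and returns the trace it had to evict, if any.
func (w *worker) putWithEviction(t trace) (trace, bool) {
	var evicted trace
	var didEvict bool
	if len(w.buffer) >= w.maxTraces {
		for id, old := range w.buffer { // evict an arbitrary entry in this sketch
			evicted, didEvict = old, true
			delete(w.buffer, id)
			break
		}
	}
	w.buffer[t.id] = t
	return evicted, didEvict
}

// onTraceReceivedSync handles an eviction inline instead of firing traceRemoved
// back into w.events, so the worker never produces into the queue it drains.
func (w *worker) onTraceReceivedSync(t trace) {
	if evicted, ok := w.putWithEviction(t); ok {
		onTraceRemoved(evicted) // synchronous call, no self-send
	}
}

func onTraceRemoved(t trace) {
	fmt.Println("released evicted trace", t.id)
}

func main() {
	w := &worker{events: make(chan string, 1), buffer: map[string]trace{}, maxTraces: 1}
	w.onTraceReceivedSync(trace{id: "a"})
	w.onTraceReceivedSync(trace{id: "b"}) // evicts "a" without touching w.events
}
```

The point is only that the eviction is resolved before onTraceReceived returns, so the worker never has to wait on the queue it is responsible for draining.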

PS: There's one bit of detail I skipped above for simplicity: the firing of the event happens within doWithTimeout, so there isn't a permanent deadlock in the strict sense. The worker times out after 1s and proceeds with the next operation. However, processing one event per second is far too slow for the worker to keep up with the practical traffic volumes the processor receives. [This doWithTimeout is placed in handleEventWithObservability; I had overlooked it earlier as I didn't expect non-observability logic there.]
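
For completeness, here is a minimal illustration of why a 1s bound turns the deadlock into roughly one event per second. This only sketches a bounded send with the same observable behaviour; it is not the actual doWithTimeout implementation.

```go
package main

import (
	"fmt"
	"time"
)

// sendWithTimeout gives up on a blocked send after `timeout`. It mimics the
// behaviour described above, not the real doWithTimeout helper.
func sendWithTimeout(events chan string, e string, timeout time.Duration) bool {
	select {
	case events <- e:
		return true
	case <-time.After(timeout):
		return false // the queue stayed full for the whole timeout
	}
}

func main() {
	events := make(chan string, 1)
	events <- "traceReceived" // the queue is already full and nobody is draining it

	start := time.Now()
	ok := sendWithTimeout(events, "traceRemoved", time.Second)
	// Every such send burns the full timeout, so the worker degrades to roughly
	// one event per second instead of deadlocking outright.
	fmt.Printf("sent=%v after %v\n", ok, time.Since(start).Round(time.Millisecond))
}
```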

@jpkrohling
Member

That's helpful, thank you! Are you experiencing this in a particular scenario? As a workaround, can you increase the buffer size?

@vijayaggarwal
Author

@jpkrohling My configuration does make me more susceptible. Particularly,

  1. num_traces=100_000 (default is 1_000_000). Assuming a single trace to be 10 KB, 1_000_000 traces would consume 10 GB memory. So, I reduce to 100_000 to keep max memory consumption at ~1 GB.
  2. wait_duration=10s (default is 1s), as I have cron jobs that take a few seconds to execute.

This significantly increases the chances of the buffer getting full. That said, even with this configuration, I hit the deadlock only infrequently (about once a week). For now, I have configured an alert on the num_events_in_queue metric exposed by this component and simply restart the collector when the alert fires.

@jpkrohling
Member

@vijayaggarwal , I appreciate your feedback, that's very helpful, thank you!

@jpkrohling jpkrohling self-assigned this Jul 11, 2024