Split TraceConsumer into two different disruptors#1161

Merged
tylerbenson merged 5 commits into master from tyler/disruptor-agent
Jan 31, 2020

@tylerbenson
Contributor

First disruptor (TraceProcessingDisruptor) does processing, which is currently limited to serialization, but in the future can do other processing such as TraceInterceptor invocation.
Second disruptor (BatchWritingDisruptor) takes serialized traces and batches them into groups and flushes them periodically based on size and time.
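
As a rough illustration of the two stages described above, here is a minimal single-threaded sketch. It is not the code in this PR: plain `ArrayDeque` queues stand in for the two disruptors, a `String` stands in for a trace, and the class name, the `maxBatchBytes` threshold, and the `flushIntervalNanos` parameter are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for the two-stage pipeline; queues replace the disruptors. */
class TwoStagePipelineSketch {
  // Stage 1 input: raw traces (a String stands in for a list of spans).
  final Queue<String> traces = new ArrayDeque<>();
  // Stage 2 input: serialized traces produced by stage 1.
  final Queue<byte[]> serialized = new ArrayDeque<>();
  // Flushed batches are recorded here instead of being sent to an agent.
  final List<List<byte[]>> flushedBatches = new ArrayList<>();

  private final List<byte[]> batch = new ArrayList<>();
  private final int maxBatchBytes;
  private final long flushIntervalNanos;
  private int batchBytes = 0;
  private long lastFlushNanos = 0;

  TwoStagePipelineSketch(int maxBatchBytes, long flushIntervalNanos) {
    this.maxBatchBytes = maxBatchBytes;
    this.flushIntervalNanos = flushIntervalNanos;
  }

  /** Stage 1: serialize one pending trace, if any, and hand it to stage 2. */
  boolean processOne() {
    String trace = traces.poll();
    if (trace == null) {
      return false;
    }
    serialized.add(trace.getBytes(StandardCharsets.UTF_8));
    return true;
  }

  /** Stage 2: add at most one serialized trace to the batch, then flush on size or time. */
  void batchOne(long nowNanos) {
    byte[] next = serialized.poll();
    if (next != null) {
      batch.add(next);
      batchBytes += next.length;
    }
    boolean sizeReached = batchBytes >= maxBatchBytes;
    boolean timeReached = nowNanos - lastFlushNanos >= flushIntervalNanos;
    if (!batch.isEmpty() && (sizeReached || timeReached)) {
      flushedBatches.add(new ArrayList<>(batch));
      batch.clear();
      batchBytes = 0;
      lastFlushNanos = nowNanos;
    }
  }
}
```

Passing the clock value into `batchOne` keeps the sketch deterministic; a real handler would read the time itself and run each stage on its own thread.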

@tylerbenson tylerbenson requested a review from dougqh January 6, 2020 21:13
@tylerbenson tylerbenson requested a review from a team as a code owner January 6, 2020 21:13
@tylerbenson tylerbenson force-pushed the tyler/disruptor-agent branch from 88baa9e to e3054cb Compare January 6, 2020 22:32
Comment thread dd-trace-ot/dd-trace-ot.gradle Outdated
Comment thread dd-trace-ot/src/main/java/datadog/trace/common/writer/DDAgentWriter.java Outdated
Comment thread dd-trace-ot/src/main/java/datadog/trace/common/writer/DDAgentWriter.java Outdated
/** Old signature (pre-Monitor) used in tests */
private DDAgentWriter(final DDAgentApi api) {
this(api, new Monitor.Noop());
batchWritingDisruptor =
Contributor

I think it would be fine to pass the Spec object down to the disruptors.

Contributor Author

Currently not all of the DDAgentWriter constructors use the Spec.


if (0 < flushFrequencySeconds) {
// This provides a steady stream of events to enable flushing with a low throughput.
final Runnable heartbeat =
Contributor

I'm not crazy about adding an extra executor for this.
Requesting a flush on timeout seems cleaner and lighter weight to me.

Contributor Author

Not sure what you mean by "on time out". If we don't have any events in the queue, the handler will never be called to trigger a flush.

Contributor

If you look at the approach in the experimental branch that I did, it doesn't have a scheduled timer.

Instead the sender thread does something akin to this pseudo-code...
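// Assumes queue is a java.util.concurrent.BlockingQueue<DDApi.Request>.
while (!Thread.currentThread().isInterrupted()) {
  try {
    // poll returns null when the timeout elapses; it does not throw
    DDApi.Request request = queue.poll(flushFrequencySecs, TimeUnit.SECONDS);
    if (request != null) {
      send(request);
    } else {
      // timed out with nothing queued: request flush
      flush();
    }
  } catch (InterruptedException e) {
    // restore the interrupt flag so the loop condition ends the thread
    Thread.currentThread().interrupt();
  }
}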

I prefer this because it is one fewer thread, but also because it is easier to have the sender back off its schedule.

Contributor Author

To clarify, the heartbeat only ensures a minimum level of events to enable timely reporting in case no traces are sent in a given time window. The actual sending frequency can be adjusted in scheduleNextFlush().

public volatile boolean shouldFlush = false;
public volatile T data = null;
public volatile int representativeCount = 0;
public volatile CountDownLatch flushLatch = null;
Contributor

I think having a latch per batch is a big improvement in the flush semantics.

Contributor Author

Yes, I also like this much better than the previous phaser approach.
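
For readers unfamiliar with the latch-per-flush idea, the following is a minimal sketch of how a caller might block until its own flush request has been handled. It is a simplified assumption, not the `DisruptorEvent` from this PR: a `LinkedBlockingQueue` replaces the ring buffer, and the class, method, and field names other than `data` and `flushLatch` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: each flush request carries its own latch. */
class FlushLatchSketch {
  static final class Event {
    final byte[] data; // null for a pure flush request
    final CountDownLatch flushLatch; // null for a pure data event

    Event(byte[] data, CountDownLatch flushLatch) {
      this.data = data;
      this.flushLatch = flushLatch;
    }
  }

  private final BlockingQueue<Event> queue = new LinkedBlockingQueue<>();
  private final List<byte[]> pending = new ArrayList<>();
  int written = 0; // updated by the consumer before it releases a latch

  void publish(byte[] data) {
    queue.add(new Event(data, null));
  }

  /** Enqueue a flush request and wait until the consumer has handled it. */
  boolean flush(long timeoutMillis) {
    CountDownLatch latch = new CountDownLatch(1);
    queue.add(new Event(null, latch));
    try {
      return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Start a daemon consumer that batches data and releases latches on flush. */
  Thread startConsumer() {
    Thread t = new Thread(() -> {
      try {
        while (true) {
          Event e = queue.take();
          if (e.data != null) {
            pending.add(e.data);
          }
          if (e.flushLatch != null) {
            written += pending.size(); // stand-in for sending the batch
            pending.clear();
            e.flushLatch.countDown(); // unblocks only this request's caller
          }
        }
      } catch (InterruptedException ignored) {
        // interrupted: stop consuming
      }
    }, "flush-latch-sketch-consumer");
    t.setDaemon(true);
    t.start();
    return t;
  }
}
```

Because the queue is FIFO, everything published before a flush request is written before that request's latch is released, so a caller never waits on work submitted after its own request.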

Contributor

A CountDownLatch might be overkill. We don't really need to wait for all the flushers to arrive before unblocking the others, but I don't think it is a big deal.

Contributor Author

What would you suggest using instead?

Comment thread dd-trace-ot/src/main/java/datadog/trace/common/writer/ddagent/DisruptorEvent.java Outdated
if (event.data != null) {
try {
final byte[] serializedTrace = api.serializeTrace(event.data);
monitor.onSerialize(writer, event.data, serializedTrace);
Contributor

I think there's a bug here. We shouldn't be calling onSerialize before we know if the publishing was successful.

Contributor Author

I admit, I had a hard time understanding how to translate the monitor calls. That aspect of this change warrants a thorough review.

Contributor

The monitor callbacks follow a couple of rules...
1 - The callback happens after something is complete.
2 - The success and failure cases are split -- to force thinking carefully about failure.

So in general, I'd expect the callback to be at the end of a try block.
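
The two rules above might look like the following in practice. This is only a sketch under assumptions: the `Monitor` interface here is a cut-down stand-in (the `onFailedSerialize` name and the single-argument signatures are hypothetical), and `Serializer`, `Publisher`, and `RecordingMonitor` exist only for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "success callback last in the try, failure in the catch". */
class MonitorCallbackSketch {
  interface Monitor {
    void onSerialize(byte[] serializedTrace); // success case
    void onFailedSerialize(Throwable cause); // failure case (assumed name)
  }

  interface Serializer {
    byte[] serialize(String trace) throws Exception;
  }

  interface Publisher {
    void publish(byte[] serializedTrace);
  }

  /** Records which callback fired, so the ordering can be observed. */
  static final class RecordingMonitor implements Monitor {
    final List<String> calls = new ArrayList<>();

    @Override
    public void onSerialize(byte[] serializedTrace) {
      calls.add("success");
    }

    @Override
    public void onFailedSerialize(Throwable cause) {
      calls.add("failure");
    }
  }

  static void process(String trace, Serializer serializer, Publisher next, Monitor monitor) {
    try {
      byte[] bytes = serializer.serialize(trace);
      next.publish(bytes);
      // Last statement of the try block: reached only if serializing and publishing both completed.
      monitor.onSerialize(bytes);
    } catch (Exception e) {
      monitor.onFailedSerialize(e);
    }
  }
}
```

With this shape, a publish that throws reports only the failure callback, which is the ordering problem raised earlier in this thread.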

Contributor Author

I changed the order. Let me know if I'm missing anything else.

private final Monitor monitor;
private final DDAgentWriter writer;
private final List<byte[]> serializedTraces = new ArrayList<>();
private int representativeCount = 0;
Contributor

@dougqh dougqh Jan 7, 2020

It could track not just traces but also spans. We wanted to include that in health metrics, but that wasn't terribly easy in the prior design.

Contributor Author

Do you mean the number of total spans? What's the benefit there?

import lombok.extern.slf4j.Slf4j;

@Slf4j
public class BatchWritingDisruptor extends AbstractDisruptor<byte[]> {
Contributor

@dougqh dougqh Jan 7, 2020

I have some concerns with this. I'd actually like to see us get away from producing many tiny byte[].

I'd prefer to see us build up one big byte[] instead to reduce the amount of allocation.
DDApi.Request from the experimental branch was built with that in mind. I don't quite see how we do that with this design.

Contributor Author

@tylerbenson tylerbenson Jan 13, 2020

We talked and came up with a good solution: use a byte[] on the event as a reusable buffer that grows to the needed size, then copy its contents into a larger buffer when batching.

This requires moving off Jackson though, so it will be done in a separate PR.
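
A minimal sketch of that buffer-reuse idea follows. It was not part of this PR (the work was deferred), so every name here is hypothetical, and `write(byte[])` is only a placeholder for a serializer that would write directly into the event's buffer.

```java
import java.util.Arrays;

/** Hypothetical sketch: a per-event reusable buffer copied into one large batch buffer. */
class ReusableBufferSketch {
  static final class Event {
    byte[] buffer = new byte[16]; // reused across events, grows on demand
    int length = 0; // number of valid bytes currently in buffer

    /** Placeholder for a serializer writing straight into the reusable buffer. */
    void write(byte[] src) {
      if (buffer.length < src.length) {
        // Grow only when the current buffer is too small.
        buffer = new byte[Math.max(src.length, buffer.length * 2)];
      }
      System.arraycopy(src, 0, buffer, 0, src.length);
      length = src.length;
    }
  }

  static final class Batch {
    byte[] bytes = new byte[64]; // one large buffer for the whole batch
    int size = 0;

    /** Copy the event's valid bytes onto the end of the batch buffer. */
    void append(Event event) {
      if (size + event.length > bytes.length) {
        bytes = Arrays.copyOf(bytes, Math.max(size + event.length, bytes.length * 2));
      }
      System.arraycopy(event.buffer, 0, bytes, size, event.length);
      size += event.length;
    }
  }
}
```

The point of the design is that steady-state traffic allocates nothing per trace: the event buffer and the batch buffer are both reused, and only growth allocates.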

Comment thread dd-trace-ot/src/main/java/datadog/trace/common/writer/DDAgentWriter.java Outdated
Comment thread dd-trace-ot/src/main/java/datadog/trace/common/writer/DDAgentWriter.java Outdated
}
}
};
heartbeatExecutor.scheduleAtFixedRate(heartbeat, 100, 100, TimeUnit.MILLISECONDS);
Contributor

@dougqh dougqh Jan 7, 2020

To have more meaningful back-pressure, we probably also need to be able to back off the rate at which we are sending. How would that work with the heartbeatExecutor?

Contributor Author

The heartbeatExecutor will only add an event to the queue if the queue is empty, and it doesn't influence the frequency of flushing. What it does influence is the maximum delay (beyond the flush frequency) before a flush occurs when the queue is empty (i.e., a flush will be at most 100 ms late relative to the 1/sec rate).
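
The behaviour described above can be sketched roughly as follows. This is an illustration, not the PR's code: a `LinkedBlockingQueue` stands in for the ring buffer, and the class name and the `HEARTBEAT` marker object are assumptions. Only the `scheduleAtFixedRate(heartbeat, 100, 100, TimeUnit.MILLISECONDS)` call mirrors the snippet under review.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a heartbeat that only publishes when the queue is idle. */
class HeartbeatSketch {
  static final Object HEARTBEAT = new Object(); // marker event carrying no data

  final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();

  // Adds a marker only when nothing else is queued, so it never competes with real traces.
  final Runnable heartbeat = () -> {
    if (queue.isEmpty()) {
      queue.offer(HEARTBEAT);
    }
  };

  /** Wake the consumer at least every 100 ms so it can check whether a flush is due. */
  ScheduledExecutorService start() {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
          Thread t = new Thread(r, "heartbeat-sketch");
          t.setDaemon(true);
          return t;
        });
    executor.scheduleAtFixedRate(heartbeat, 100, 100, TimeUnit.MILLISECONDS);
    return executor;
  }
}
```

The heartbeat does not decide when to flush. It only guarantees the handler is invoked often enough to notice that the flush deadline has passed, which bounds the extra delay by the heartbeat period.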

@dougqh
Contributor

dougqh commented Jan 7, 2020

I have a few concerns about how future changes will fit into this...
1 - I'd like to get away from producing many small byte[] -- and basing the second Disruptor on byte[] runs counter to that
2 - I'd prefer to have fewer threads, so I'd like to avoid the separate heartbeat if possible
3 - I'd like the sending rate to be able to back off and back on depending on whether we're successful communicating with the agent. It isn't clear how that would work in this design.

Finally, it would help to have a comment / diagram that describes the overall publishing pipeline. That would make the code easier to follow in the future.
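
To make the third concern concrete, here is one possible shape for a sending schedule that backs off and back on. It is purely an assumption for illustration: neither this PR nor the experimental branch is shown to use it, doubling is just one common choice, and the class and method names are hypothetical.

```java
/** Hypothetical sketch: flush delay doubles on failure and resets on success. */
class BackoffScheduleSketch {
  private final long baseMillis;
  private final long maxMillis;
  private long currentMillis;

  BackoffScheduleSketch(long baseMillis, long maxMillis) {
    this.baseMillis = baseMillis;
    this.maxMillis = maxMillis;
    this.currentMillis = baseMillis;
  }

  /** Returns the delay before the next send attempt, given the last attempt's outcome. */
  long nextDelayMillis(boolean lastSendSucceeded) {
    currentMillis = lastSendSucceeded
        ? baseMillis // back on: return to the normal rate
        : Math.min(maxMillis, currentMillis * 2); // back off: double, up to a cap
    return currentMillis;
  }
}
```

A sender loop could feed the returned value into its next poll timeout or into whatever computes the next flush time.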

First disruptor (TraceProcessingDisruptor) does processing, which is currently limited to serialization, but in the future can do other processing such as TraceInterceptor invocation.
Second disruptor (BatchWritingDisruptor) takes serialized traces and batches them into groups and flushes them periodically based on size and time.
@tylerbenson tylerbenson force-pushed the tyler/disruptor-agent branch from 090c9cb to 66928ae Compare January 17, 2020 00:43
// attempt to have agent scale the metrics properly
((DDSpan) event.data.get(0).getLocalRootSpan())
.context()
.setMetric("_sample_rate", 1d / event.representativeCount);
Contributor Author

@gbbr does this look like a legit way of getting our _sample_rate scaling done by the agent to be accurate?

The agent doesn't do any scaling, it's the backend, so I wouldn't be able to tell. _sample_rate is expected to hold the rate that a local client sampler (the one that doesn't send stuff to the agent at all) is using IIRC. @furmmon is our expert for answering any questions around sampling, maybe you can confirm.
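
For reference, the arithmetic in the snippet under discussion (`1d / event.representativeCount`, with the count taken as `traceCount.getAndSet(0) + 1`) can be sketched as below. Treating the counter as "traces not published since the last successful publish" is an assumption for the example, as are the class and method names; whether the backend interprets `_sample_rate` this way is exactly what the thread leaves open.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of deriving a sample rate from a representative count. */
class RepresentativeCountSketch {
  // Assumed meaning: traces that were not published since the last published one.
  private final AtomicInteger unpublished = new AtomicInteger();

  void onUnpublished() {
    unpublished.incrementAndGet();
  }

  /** The published trace represents itself plus everything counted since the last publish. */
  int takeRepresentativeCount() {
    return unpublished.getAndSet(0) + 1;
  }

  /** A trace standing in for n traces gets a rate of 1/n. */
  static double sampleRate(int representativeCount) {
    return 1d / representativeCount;
  }
}
```

For example, a trace published after three others were not published stands for four traces and would carry a rate of 0.25.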

if (traceProcessingDisruptor.running) {
final int representativeCount = traceCount.getAndSet(0) + 1;
final int representativeCount;
if (trace.isEmpty() || !(trace.get(0).isRootSpan())) {
Contributor Author

This might not work if the last span reported isn't the root span. This might be an issue for async traces and for partial flush traces. Any better ideas?

Also rename the builder class on DDTracer to default name generated by Lombok.
@tylerbenson tylerbenson force-pushed the tyler/disruptor-agent branch from 66928ae to 5cce4cb Compare January 17, 2020 19:10
this.writer = writer;
}

// TODO: reduce byte[] garbage by keeping the byte[] on the event and copy before returning.
Contributor

So reducing byte[] garbage remains to be done; I think that's fine for now. We can revisit that after ripping out Jackson.

Contributor

@randomanderson randomanderson left a comment

I ran the performance tests on my local laptop and got nearly identical results (~5,300 traces/s) on master and on this PR.

@tylerbenson tylerbenson merged commit 406b324 into master Jan 31, 2020
@tylerbenson tylerbenson deleted the tyler/disruptor-agent branch January 31, 2020 20:41
@tylerbenson tylerbenson restored the tyler/disruptor-agent branch January 31, 2020 20:41
@tylerbenson tylerbenson deleted the tyler/disruptor-agent branch January 31, 2020 20:41
@devinsba devinsba added this to the 0.43.0 milestone Jan 31, 2020