
Allow batch span processor to export data as soon as a batch is filled #494

Closed

Conversation

@TommyCpp (Contributor) commented Mar 26, 2021

resolves #468

Problem

When generating spans at a high rate, users often experience span loss because the buffer reaches its maximum size. Currently, the workaround is to encourage users to use a larger buffer size and a shorter delay. This PR proposes a new export mode within the batch span processor that allows it to export spans as soon as a batch is filled.

Proposed solution

We spawn two tasks to handle the export of spans. One is the collect task, which collects spans and determines when it is time to export them. The other is the export task, which exports a batch of spans.

The collect task should be non-blocking unless the message it receives is a shutdown or a force flush. When a span is sent to the collect task, it is pushed into the buffer. The collect task sends a batch to the export task in two cases: when the batch size reaches max_export_size, or when the pre-defined scheduled_delay elapses. The collect task restarts the scheduled_delay timer after each export.

The export task polls batches of spans from the channel between it and the collect task. This channel serves as a temporary buffer for the exporter, and its length is max_queue_size. Note that each element of this channel is a Vec<SpanData>, so the buffer can hold up to max_queue_size * batch_size spans (e.g., with max_queue_size = 16 and batches of 512 spans, up to 8,192 spans).

Note that even in this mode, users can still experience span loss if spans are generated faster than the exporter can handle them. Specifically, the collect task will drop spans if the channel between it and the export task is full.
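
To make the two-task pipeline concrete, here is a minimal sketch assuming a tokio runtime. SpanData is a stand-in type, and the names and constants (MAX_EXPORT_SIZE, MAX_QUEUE_SIZE, SCHEDULED_DELAY) are illustrative, not the PR's actual API:

```rust
use std::time::Duration;
use tokio::sync::mpsc;

/// Stand-in for the SDK's span payload; not the real opentelemetry type.
struct SpanData;

const MAX_EXPORT_SIZE: usize = 512; // spans per batch
const MAX_QUEUE_SIZE: usize = 16; // batches buffered between the two tasks
const SCHEDULED_DELAY: Duration = Duration::from_secs(5);

/// Buffers incoming spans and ships a batch when it is full or the timer fires.
async fn collect_task(
    mut spans: mpsc::Receiver<SpanData>,
    batches: mpsc::Sender<Vec<SpanData>>,
) {
    let mut buffer = Vec::with_capacity(MAX_EXPORT_SIZE);
    let mut delay = Box::pin(tokio::time::sleep(SCHEDULED_DELAY));
    loop {
        tokio::select! {
            maybe_span = spans.recv() => match maybe_span {
                Some(span) => {
                    buffer.push(span);
                    if buffer.len() >= MAX_EXPORT_SIZE {
                        // Export as soon as the batch is filled; try_send drops
                        // the batch if the export queue is already full.
                        let _ = batches.try_send(std::mem::take(&mut buffer));
                        delay = Box::pin(tokio::time::sleep(SCHEDULED_DELAY));
                    }
                }
                None => break, // all producers dropped: shut down
            },
            _ = &mut delay => {
                if !buffer.is_empty() {
                    let _ = batches.try_send(std::mem::take(&mut buffer));
                }
                delay = Box::pin(tokio::time::sleep(SCHEDULED_DELAY));
            }
        }
    }
}

/// Polls batches off the queue and hands them to the exporter one at a time.
async fn export_task(mut batches: mpsc::Receiver<Vec<SpanData>>) {
    while let Some(batch) = batches.recv().await {
        println!("exporting {} spans", batch.len()); // placeholder exporter call
    }
}

#[tokio::main]
async fn main() {
    let (span_tx, span_rx) = mpsc::channel(MAX_EXPORT_SIZE);
    let (batch_tx, batch_rx) = mpsc::channel(MAX_QUEUE_SIZE);
    let exporter = tokio::spawn(export_task(batch_rx));
    tokio::spawn(collect_task(span_rx, batch_tx));
    for _ in 0..MAX_EXPORT_SIZE {
        span_tx.send(SpanData).await.unwrap();
    }
    drop(span_tx); // close the pipeline so both tasks shut down
    exporter.await.unwrap();
}
```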

Diagram
                  ┌─────────────────────────┐
                  │      User tasks         │
                  │                         │
                  └─────────┬───────────────┘
                            │
                            │ Export spans
                            │
┌───────────────────────────┼─────────────────────────────────────────────────────────────────────┐
│                           │                                                                     │
│  Batch span processor     │                                                                     │
│                           │                                                                     │
│                        ┌──┼──────────────────────────────────────────────────────────────┐      │
│                        │  │ Collect task                                                 │      │
│                        │  │                                                              │      │
│                        │ ┌▼──────────────┐                    ┌───────────────┐          │      │
│                        │ │ Buffer        │                    │  Delay timer  │          │      │
│                        │ └─────┬─────────┘                    └──────┬─────▲──┘          │      │
│                        │       │                                     │     │             │      │
│                        │       │                                     │     │             │      │
│                        │       │    Buffer filled or timer hit       │     │ Reset timer │      │
│                        │       └─────────────────┬───────────────────┘     │             │      │
│                        │                         │                         │             │      │
│                        │  ┌──────────────────────┴─────────────────────────┘             │      │
│                        │  │                                                              │      │
│                        └──┼──────────────────────────────────────────────────────────────┘      │
│                           │                                                                     │
│                           │  Push a batch of spans                                              │
│                           │                                                                     │
│                           │       ┌────────────────────────┐                                    │
│                           └───────►  Export queue          ├────┐                               │
│                                   └────────────────────────┘    │                               │
│                                                                 │                               │
│                                           Poll a batch of spans │                               │
│                                                                 │                               │
│                                               ┌─────────────────▼──────┐                        │
│                                               │   Export task          │                        │
│                                               │                        │                        │
│                                               └─────────────────┬──────┘                        │
│                                                                 │                               │
│                                                                 │ Export                        │
│                                                                 │                               │
└─────────────────────────────────────────────────────────────────┼───────────────────────────────┘
                                                                  │
                                                                  ▼

Open questions

  1. How do we handle communication failure between tasks?
  2. How do we decide the length of the channel between collect task and export task?

@TommyCpp force-pushed the tommycpp/batch_processor branch 2 times, most recently from a84d003 to e6dd0d6, on March 27, 2021 02:26
@TommyCpp marked this pull request as ready for review on March 28, 2021 16:51
@TommyCpp requested a review from a team as a code owner on March 28, 2021 16:51
@djc (Contributor) commented Mar 28, 2021

Why is an extra task needed to solve this problem? Could we just kick off an export if the buffer fills up without spawning an extra task to do it?

@TommyCpp (Contributor, Author)

My intention would be to keep filling the next batch while the first one is being exported, so we could keep the exporter exporting without waiting for the buffer to fill.

@jtescher (Member) commented Mar 29, 2021

Interesting idea! Sorry, I haven't had time to fully read over the change; I will try to review tomorrow.

@djc (Contributor) commented Mar 29, 2021

Is that actually a problem we've observed in practice? The design does sound sane to me; I'm just wondering if we have some proof that the extra complexity is actually warranted.

@djc mentioned this pull request on Mar 29, 2021

@LukeMathWalker (Contributor) commented Mar 29, 2021

When generating spans at a high rate, users often experience span loss because the buffer reaches its maximum size. Currently, the workaround is to encourage users to use a larger buffer size and a shorter delay. This PR proposes a new export mode within the batch span processor that allows it to export spans as soon as a batch is filled.

Wouldn't we get a simpler API by doing what is most appropriate on each occasion, without exposing further configuration to the user?
By default, export at regular intervals; when the batch is full, trigger an export immediately.
In other words, do you foresee a situation where export_on_batch_filled should be set to false?

@jtescher (Member)

By default, export at regular intervals; when the batch is full, trigger an export immediately.
In other words, do you foresee a situation where export_on_batch_filled should be set to false?

The batch processor spec only mentions dropping spans once the max queue size is hit, which seems to be the conservative approach (i.e. you always know the maximum number of spans any app can produce over a given period, for capacity planning), but it also doesn't explicitly disallow sending early.

@TommyCpp (Contributor, Author)

In other words, do you foresee a situation where export_on_batch_filled should be set to false?

The overhead of this approach is that it potentially sends more requests. Users who don't care that much about span dropping may find the application generating more network IO. But it's debatable which mode we should use as the default.

@TommyCpp (Contributor, Author) commented Mar 29, 2021

Is that actually a problem we've observed in practice?

The problem is mainly that we have a fixed interval between exports. Thus, if there is a spike in span volume, those spans are likely to be dropped. This should be mitigated once we push when a batch is filled.

The alternative to having two separate tasks is having only one task, but making the size of the channel between the application tasks and the background task max_queue_size * batch_size, since that is the maximum number of spans the two-task solution can hold between the collect task and the export task (the export queue).

My main concern is that we re-create the delay future on each loop iteration. I'm not sure how much overhead that adds to the time of each iteration.

@djc (Contributor) commented Mar 30, 2021

My main concern is that we re-create the delay future on each loop iteration. I'm not sure how much overhead that adds to the time of each iteration.

At least in tokio, delay futures can be reset, IIRC.
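
For reference, a minimal example of resetting a tokio Sleep instead of re-creating it (the durations are illustrative):

```rust
use std::time::Duration;
use tokio::time::{sleep, Instant};

#[tokio::main]
async fn main() {
    let timer = sleep(Duration::from_secs(5));
    tokio::pin!(timer); // Sleep must be pinned before it can be reset

    // After an export, push the deadline out instead of building a new future.
    timer.as_mut().reset(Instant::now() + Duration::from_secs(5));
    timer.await;
}
```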

@LukeMathWalker (Contributor)

In other words, do you foresee a situation where export_on_batch_filled should be set to false?

The overhead of this approach is that it potentially sends more requests. Users who don't care that much about span dropping may find the application generating more network IO. But it's debatable which mode we should use as the default.

At least in our case, it's not up to the application to choose what to drop or what to keep.
We expect applications to ship traces to the collector without losing them, while the collector gets to make smart tail-sampling choices, shed load if necessary, etc.
A countermeasure you could add is a minimum delay, which lets the user know the minimum and maximum number of batches that might be shipped per unit of time.
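
A rough sketch of that countermeasure (the names MIN_DELAY and throttled_export are hypothetical, and u64 stands in for a span type): even when a batch fills early, wait until at least MIN_DELAY has passed since the last export, which caps the export rate at one batch per MIN_DELAY.

```rust
use std::time::Duration;
use tokio::time::{sleep_until, Instant};

const MIN_DELAY: Duration = Duration::from_millis(100);

/// Waits out any remaining portion of MIN_DELAY before exporting, so at most
/// one batch is shipped per MIN_DELAY regardless of how fast batches fill.
async fn throttled_export(last_export: &mut Instant, batch: Vec<u64>) {
    sleep_until(*last_export + MIN_DELAY).await; // resolves immediately if already past
    *last_export = Instant::now();
    println!("exporting {} spans", batch.len()); // placeholder exporter call
}

#[tokio::main]
async fn main() {
    let mut last_export = Instant::now() - MIN_DELAY; // allow an immediate first export
    throttled_export(&mut last_export, vec![1, 2, 3]).await;
    throttled_export(&mut last_export, vec![4, 5]).await; // waits out MIN_DELAY
}
```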

@TommyCpp (Contributor, Author) commented Mar 30, 2021

My main concern is that we re-create the delay future on each loop iteration. I'm not sure how much overhead that adds to the time of each iteration.

At least in tokio, delay futures can be reset, IIRC.

That's true, but the Runtime trait doesn't expose a reset function, so we can't really use it.

I did a simple benchmark to see how big the impact is.

We end a span every 10 ms.

Results, based on the time needed to export a batch:

| Time to export a batch | One task | Two tasks and a channel |
| ---------------------- | -------- | ----------------------- |
| 1s                     | 33.042s  | 20.326s                 |
| 10ms                   | 373.45ms | 206.96ms                |
| 2ms                    | 108.21ms | 60.048ms                |

@TommyCpp (Contributor, Author)

At least in our case, it's not up to the application to choose what to drop or what to keep.
We expect applications to ship traces to the collector without losing them, while the collector gets to make smart tail-sampling choices, shed load if necessary, etc.

I agree that the usual use case is to run a collector as a sidecar or a dedicated service to process the spans before sending them to the backend. So maybe we want to make this PR's mode the default instead.

@jtescher (Member)

I think this is the right direction. There may be some ways to simplify, but overall I think the two-future approach makes sense, to be able to make progress on both tasks independently. I think I'd go with the dropping behavior by default, as it seems to be what the spec wants, but let people opt in to this behavior if they prefer.

* delay vs. interval

Tokio's interval completes its first tick immediately, so in the loop the interval would always complete as soon as we restart the loop (see the sketch after these notes). It's also more intuitive to use a delay, as the batch processor restarts the delay/interval whenever it exports. Removed the Unpin restriction on Runtime::Interval.

* separate the two different exports into two functions
… job.

Most of the errors that can happen come from channels; they usually indicate that the other side of the channel has been dropped.
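
A small illustration of the interval behavior described in the first commit note above; tokio's interval yields its first tick immediately:

```rust
use std::time::Duration;

#[tokio::main]
async fn main() {
    let mut interval = tokio::time::interval(Duration::from_secs(5));
    interval.tick().await; // first tick resolves immediately
    interval.tick().await; // second tick resolves ~5s later
    println!("two ticks observed");
}
```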
@LukeMathWalker (Contributor)

Additional data point to consider: exporting early when a batch is filled is the default behaviour of the batching processor in the OpenTelemetry Collector (see here).

@TommyCpp (Contributor, Author) commented Apr 2, 2021

Should we create an issue in the spec to clarify the behavior of the batch span processor?

@jtescher (Member) commented Apr 2, 2021

@TommyCpp yeah, it could be useful to open an issue. This also brings up the question of batch parallelization, i.e. if we can export between the specified intervals as well, is the number of batches capped, or can we just keep creating more in-flight batches as needed?

@jtescher (Member) commented Apr 3, 2021

Same question for force flush, I suppose: if you both end a span and call force flush 3 times, do you have 3 batches in flight being exported? Or do you have 1 in flight, with subsequent force flushes having no effect until the batch has been exported?

@jtescher (Member) commented Apr 3, 2021

Looks like the Go batch exporter uses mutexes, so there is a single batch in flight at any time, and it exports when full.
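
In Rust terms, a design in the spirit of the Go processor might look like the following hedged sketch (Processor and SpanData are stand-ins, not the actual Go or Rust SDK code): a mutex-guarded queue where whoever fills the batch exports it inline, so at most one batch is in flight at a time.

```rust
use std::sync::Mutex;

struct SpanData; // stand-in for the real span payload

struct Processor {
    queue: Mutex<Vec<SpanData>>,
    batch_size: usize,
}

impl Processor {
    fn on_end(&self, span: SpanData) {
        // Take the full batch out while holding the lock, export after releasing it.
        let batch = {
            let mut queue = self.queue.lock().unwrap();
            queue.push(span);
            if queue.len() >= self.batch_size {
                Some(std::mem::take(&mut *queue))
            } else {
                None
            }
        };
        if let Some(batch) = batch {
            // Export inline: no new batch ships until this one finishes.
            println!("exporting {} spans", batch.len());
        }
    }
}

fn main() {
    let p = Processor { queue: Mutex::new(Vec::new()), batch_size: 2 };
    p.on_end(SpanData);
    p.on_end(SpanData); // fills the batch and exports inline
}
```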

@jtescher (Member) commented Apr 3, 2021

An implementation of @LukeMathWalker's suggestion to match the go/otlp behavior would look like d9a0015, I believe.

That version always exports when the batch limit is hit, enforces the max queue size at the initial channel send rather than in the processor loop, and keeps a single batch in flight at any point (see the sketch below).
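
A tiny sketch of the "enforce at the channel send" part (SpanData and on_end are stand-ins here, not d9a0015's actual code): the processor's span hook tries to enqueue into a bounded channel, and a full channel means the queue limit was hit, so the span is dropped right at the send site.

```rust
use tokio::sync::mpsc;

struct SpanData; // stand-in for the real span payload

/// Called when a span ends; drops the span if the bounded queue is full.
fn on_end(tx: &mpsc::Sender<SpanData>, span: SpanData) {
    if tx.try_send(span).is_err() {
        // Queue full (or receiver gone): drop the span, optionally bump a
        // dropped-span counter here.
    }
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel(2); // max_queue_size = 2 for the demo
    on_end(&tx, SpanData);
    on_end(&tx, SpanData);
    on_end(&tx, SpanData); // queue full: this span is dropped at the send site
    while let Ok(span) = rx.try_recv() {
        drop(span);
    }
}
```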

Some pros with this approach are:

  • simplicity: most of the code became easier to reason about and check for correctness vs. the main-branch implementation
  • matches the behavior of other high-usage otel implementations
  • a single place to enforce the max queue length

Cons:

  • Can't have multiple batches in flight at any given time, potentially limiting throughput.

@TommyCpp / @djc Thoughts about this variant conceptually?

@TommyCpp (Contributor, Author) commented Apr 5, 2021

@jtescher I think this should help users with a high volume of spans. I agree we should keep the implementation aligned with other SDKs; if users find that multiple batches would help, we can always add that later. Feel free to close this PR and open another one based on your implementation if we decide to go that way.

@djc (Contributor) commented Apr 6, 2021

I think the design sounds like an improvement. The code in BatchSpanProcessor::new() still seems like it could stand to be refactored a bit, both before and after your changes, probably starting with moving the async move {} block out into a separate function and adding some early return/continue paths to reduce rightward drift.

@TommyCpp (Contributor, Author)

Closing this one. @jtescher, maybe you could open another PR based on d9a0015?

@TommyCpp closed this on Apr 13, 2021

@jtescher (Member)

@TommyCpp right, will clean that up a bit then get a PR 👍
