
[exporterhelper] Make concurrency limit apply per-batch #9891

Draft
carsonip wants to merge 5 commits into main
Conversation

@carsonip (Contributor) commented Apr 4, 2024

Description:

The current issue with the concurrency limit in the batch sender is that, given unfavorable goroutine scheduling, batches can end up much smaller than MinSizeItems because of the concurrency limit check. This may overwhelm the backend with many small requests, which defeats the purpose of batching.

With this PR, the concurrency limit applies per batch: the active batch is flushed because of the concurrency check only if all of the goroutines counted against the limit are waiting for that same active batch to be flushed. This avoids goroutines pointlessly waiting for new requests to arrive when there are no spare queue consumers left to produce them.

Compared to the existing implementation: say concurrencyLimit = 10, a group of 5 goroutines is waiting for an export to finish, and 4 goroutines are waiting for the active batch to become ready. When the 10th consumer goroutine calls the batch sender's send, the current code flushes the active batch immediately; with this change it does not. Instead, it waits for up to FlushTimeout for the other group to return, potentially producing a bigger batch.

This change will have implications for throughput, but I reckon that exporting many small requests may also be bad for throughput.
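A minimal sketch of the per-batch check described above, in Go. All names here (batchSender, activeBatch, waiters, shouldFlushForConcurrency) are illustrative assumptions and do not mirror the actual exporterhelper implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// activeBatch is a stand-in for the batch currently being accumulated.
type activeBatch struct {
	items   int
	waiters int // goroutines blocked waiting for this particular batch to flush
}

// batchSender is a heavily simplified stand-in for the real batch sender.
type batchSender struct {
	mu          sync.Mutex // guards batch in a real implementation
	concurrency int        // configured concurrency limit
	batch       *activeBatch
}

// shouldFlushForConcurrency reports whether the concurrency limit forces an
// early flush of the active batch.
func (bs *batchSender) shouldFlushForConcurrency(totalActive int) bool {
	// Previous behavior (for comparison): flush as soon as the total number
	// of in-flight consumer goroutines reaches the limit, even if most of
	// them are waiting on a different, already-exporting batch:
	//
	//     return totalActive >= bs.concurrency
	//
	// Per-batch behavior (as described in this PR): flush only when every
	// goroutine counted against the limit is waiting on this same active
	// batch, i.e. no other consumer can make progress.
	return bs.batch.waiters >= bs.concurrency
}

func main() {
	// The scenario from the description: concurrencyLimit = 10, 5 goroutines
	// wait on a previous export, 4 wait on the active batch, and the 10th
	// consumer arrives.
	bs := &batchSender{concurrency: 10, batch: &activeBatch{items: 100, waiters: 4}}
	// Old check: true (flush a small 100-item batch immediately).
	// New check: false (keep waiting, up to FlushTimeout, for a bigger batch).
	fmt.Println("flush now?", bs.shouldFlushForConcurrency(10))
}
```

The trade-off is the one noted above: the per-batch check can delay a flush by up to FlushTimeout, but it avoids emitting many requests far below MinSizeItems.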

Link to tracking Issue: Fixes #9952 ([exporterhelper] Batch sender produces inconsistent and suboptimal batch sizes)

Testing:

The batch sender concurrency limit test is extended to cover both merge and merge-split. The new tests would most likely fail without this change.



codecov bot commented Apr 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.63%. Comparing base (bb4955e) to head (bc9fd5c).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #9891   +/-   ##
=======================================
  Coverage   92.62%   92.63%           
=======================================
  Files         386      386           
  Lines       18236    18238    +2     
=======================================
+ Hits        16892    16894    +2     
  Misses       1000     1000           
  Partials      344      344           


github-actions bot commented May 1, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions bot added Stale and removed Stale labels May 1, 2024
github-actions bot commented May 16, 2024
This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions bot added Stale and removed Stale labels May 16, 2024
github-actions bot commented Jun 1, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions bot added the Stale label Jun 1, 2024
@github-actions bot removed the Stale label Jun 5, 2024
@dmitryax (Member) commented Jun 5, 2024

This approach sounds good to me.

Another option is to find a way to synchronize https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/batch_sender.go#L197 to drop the number of active requests by the number of released goroutines. That would be significantly more complicated, though. If you can try this out, that would be great to compare. Let me know
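A rough sketch of this alternative, assuming hypothetical names (sender, activeRequests, onBatchExported) rather than the actual batch_sender.go structures: when a batch finishes exporting, the active-request counter is decremented by the whole group of goroutines that were blocked on it, rather than each goroutine decrementing it independently as it unwinds.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// batch is a stand-in for an in-flight batch; waiters is the number of
// consumer goroutines that contributed to it and are blocked on its export.
type batch struct {
	waiters int64
}

// sender is a stand-in for the batch sender's concurrency accounting.
type sender struct {
	activeRequests atomic.Int64
}

// onBatchExported releases the whole group at once, dropping the active
// request count by the number of goroutines the finished export frees.
func (s *sender) onBatchExported(b *batch) {
	s.activeRequests.Add(-b.waiters)
}

func main() {
	s := &sender{}
	s.activeRequests.Add(5) // five consumers joined the same batch
	s.onBatchExported(&batch{waiters: 5})
	fmt.Println("active requests after export:", s.activeRequests.Load()) // 0
}
```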

@carsonip (Contributor, Author) commented Jun 5, 2024

> drop the number of active requests by the number of released goroutines. That would be significantly more complicated, though. If you can try this out, that would be great to compare.

Implemented in alternative PR #10337
