
[exporter/cloudwatchlogs] Add support concurrency in the awscloudwatchlogs exporter #26360

Closed
rapphil opened this issue Aug 31, 2023 · 1 comment · Fixed by #26692
Labels
enhancement New feature or request exporter/awscloudwatchlogs awscloudwatchlogs exporter priority:p2 Medium

Comments

@rapphil
Contributor

rapphil commented Aug 31, 2023

Component(s)

exporter/awscloudwatchlogs

Is your feature request related to a problem? Please describe.

While profiling the collector with the filelog receiver + awscloudwatchlogs exporter, I noticed that the filelog receiver can handle a much higher throughput than the awscloudwatchlogs exporter.

I think the limitation lies in the fact that the awscloudwatchlogs exporter sends requests to the backend sequentially, so every request pays a fixed processing cost plus the network round-trip latency.

Describe the solution you'd like

In order to increase the throughput of the awscloudwatchlogs exporter, I would like to add support for concurrency using the facilities provided by the exporterhelper.

The idea is that requests can be made to the backend in parallel, thereby reducing the effect of network latency and backend processing time.

This was probably not straightforward in the past, given the limitation around the usage of a streamToken. However, that limitation was removed recently.

On top of that, the AWS SDK for Go is thread safe and able to handle concurrent requests, so an implementation that leverages the concurrent requests provided by the exporterhelper should be technically feasible.
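For illustration only, here is a minimal sketch (not the exporter's code) of what this relies on: the AWS SDK for Go v1 CloudWatch Logs client can be shared across goroutines, and writes to the same stream no longer need a sequence token. The log group and stream names are placeholders.

```go
// Hypothetical sketch: several goroutines sharing one CloudWatch Logs client
// and writing to the same stream concurrently. Not the exporter's actual code.
package main

import (
	"fmt"
	"sync"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/cloudwatchlogs"
)

func main() {
	sess := session.Must(session.NewSession())
	client := cloudwatchlogs.New(sess) // one client shared by all goroutines

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // four concurrent writers
		wg.Add(1)
		go func(worker int) {
			defer wg.Done()
			_, err := client.PutLogEvents(&cloudwatchlogs.PutLogEventsInput{
				LogGroupName:  aws.String("example-group"),  // placeholder
				LogStreamName: aws.String("example-stream"), // placeholder
				LogEvents: []*cloudwatchlogs.InputLogEvent{{
					Message:   aws.String(fmt.Sprintf("event from worker %d", worker)),
					Timestamp: aws.Int64(time.Now().UnixMilli()),
				}},
			})
			if err != nil {
				fmt.Println("put failed:", err)
			}
		}(i)
	}
	wg.Wait()
}
```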

Describe alternatives you've considered

Send concurrent requests to CloudWatch Logs from inside the component instead of using the facilities provided by the exporterhelper. I don't think this approach makes sense, given that the backend has no constraint such as needing to receive requests in a specific order.

Additional context

No response

@rapphil rapphil added enhancement New feature or request needs triage New item requiring triage labels Aug 31, 2023
@github-actions github-actions bot added the exporter/awscloudwatchlogs awscloudwatchlogs exporter label Aug 31, 2023
@github-actions
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@bryan-aguilar bryan-aguilar added priority:p2 Medium and removed needs triage New item requiring triage labels Aug 31, 2023
@rapphil rapphil changed the title Add support concurrency in the awscloudwatchlogs exporter [exporter/cloudwatchlogs] Add support concurrency in the awscloudwatchlogs exporter Sep 1, 2023
codeboten pushed a commit that referenced this issue Oct 11, 2023
…oudwatchlogs exporter (#26692)

Adds support for parallelism in the
awscloudwatchlogs exporter by leveraging the [exporter
helper](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md).

In this PR, we are adding support for the `num_consumers` configuration
in the `sending_queue`. This will allow users to specify the number of
consumers that will consume from the sending_queue in parallel.
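For illustration, a collector configuration that enables several parallel queue consumers might look like the sketch below, assuming the exporter exposes the standard exporterhelper `sending_queue` settings alongside its `log_group_name`/`log_stream_name` options; the group and stream names are placeholders.

```yaml
exporters:
  awscloudwatchlogs:
    log_group_name: "example-group"      # placeholder
    log_stream_name: "example-stream"    # placeholder
    sending_queue:
      enabled: true
      num_consumers: 10   # consumers draining the queue in parallel
      queue_size: 1000
```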

It is possible and straightforward to use this approach because
CloudWatch logs [no longer requires that you use a token to control
access to the stream that you are writing
to](https://aws.amazon.com/about-aws/whats-new/2023/01/amazon-cloudwatch-logs-log-stream-transaction-quota-sequencetoken-requirement/).
You can write to the same stream in parallel.

To achieve this, the PR does the following:
* Create a Pusher that is able to push to multiple streams at the same time.
* Move the lifecycle of the Pusher into the function that consumes from the
sending queue. This allows sending to multiple streams at the same time
without resource contention, since each call to consume logs does not share
resources with other calls happening in parallel (one exception is the
creation of log streams). A rough sketch of this pattern follows the list.
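The sketch below is hypothetical; the type and function names are invented for illustration and are not the exporter's actual code. It only shows the shape of the pattern: each consume call builds its own pusher, so parallel calls share no mutable state.

```go
// Hypothetical "pusher per consume call" sketch; not the exporter's real types.
package main

import (
	"context"
	"fmt"
	"sync"
)

// pusher batches events for one stream; it is owned by a single consume call.
type pusher struct {
	stream string
	batch  []string
}

func (p *pusher) add(event string) { p.batch = append(p.batch, event) }

func (p *pusher) flush(ctx context.Context) error {
	// The real exporter would call PutLogEvents here; we just print.
	fmt.Printf("flushed %d events to %s\n", len(p.batch), p.stream)
	return nil
}

// consumeLogs models the function each sending_queue consumer runs: the pusher
// is created inside the call, so concurrent calls do not contend on shared state.
func consumeLogs(ctx context.Context, stream string, events []string) error {
	p := &pusher{stream: stream}
	for _, e := range events {
		p.add(e)
	}
	return p.flush(ctx)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ { // three parallel queue consumers
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			_ = consumeLogs(context.Background(), fmt.Sprintf("stream-%d", n), []string{"a", "b"})
		}(i)
	}
	wg.Wait()
}
```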

Besides that, I analyzed the code and removed other limitations:
* Locks that were not necessary.
* A limiter that capped requests at 5 per second per stream. [The TPS quota is
much higher now and applies per
account.](https://aws.amazon.com/about-aws/whats-new/2023/01/amazon-cloudwatch-logs-log-stream-transaction-quota-sequencetoken-requirement/)

**How to review this PR:**

The first 3 commits in this PR were used to refactor the code before
making the real changes. Please use the commits to simplify the review
process.

**Link to tracking Issue:** #26360

**Testing:**

- Unit tests were added.
- Tested locally sending logs to cloudwatch logs.

**Documentation:** Documentation was added describing the new
parameters.

---------

Signed-off-by: Raphael Silva <rapphil@gmail.com>
Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
JaredTan95 pushed a commit to openinsight-proj/opentelemetry-collector-contrib that referenced this issue Oct 18, 2023
…oudwatchlogs exporter (open-telemetry#26692)
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this issue Nov 12, 2023
…oudwatchlogs exporter (open-telemetry#26692)