Add decoupleafterbatch converter to ensure decouple processor follows batch processor #1255

Merged. 22 commits merged into open-telemetry:main on Apr 22, 2024.

Conversation

@nslaughter (Contributor) commented on Apr 16, 2024:

Problem

When using the batch processor in a Lambda Collector pipeline, it's important that the decouple processor always comes after the batch processor. The batch processor delays processing of some spans/metrics/logs in order to batch them with later-arriving items. If that delay isn't offset by the decouple processor, it can cause issues when the Lambda environment is frozen, potentially leading to data loss and higher costs. In short, the current situation requires users to apply knowledge specific to the Lambda distribution in order to get the behavior they most likely expect.

Solution

This PR introduces a new decoupleafterbatch converter that automatically modifies the user's config, adding a decouple processor to processor chains where needed. The converter checks each pipeline for a batch processor; if one is defined and no decouple processor follows it, a decouple processor is appended to the end of the pipeline (see the example after the list below).

The key aspects are:

  1. When a user configures a batch processor without following it with a decouple processor, the converter will automatically add a decouple processor to the end of the pipeline.

  2. If a decouple processor is already present after the last occurrence of the batch processor, the pipeline is left unchanged.

  3. If batch processors are defined in multiple pipelines (logs, traces, etc.), each pipeline is evaluated independently and a decouple processor is appended only where that pipeline needs one.
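For example, here is a minimal sketch of the intended effect on a single traces pipeline (the receiver and exporter names are placeholders, and the fragment omits the rest of the config):

```yaml
# User-supplied config fragment: batch is the last processor,
# with no decouple processor after it.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

After the decoupleafterbatch converter runs, the pipeline would look like this (assuming the converter also registers the decouple processor in the top-level `processors:` section when it isn't already defined):

```yaml
# Rewritten config fragment: decouple now follows batch.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, decouple]
      exporters: [otlp]
```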

Rationale

The decouple processor allows the Lambda function to return while the Collector continues exporting observability data asynchronously. Ensuring it always comes after the batch processor protects users from the potential data loss, delays, and unnecessary cost that can occur when the Lambda environment is frozen in the middle of a batch processor timeout.

Alternatives Considered

  1. Document the requirement and rely on users to always add the decouple processor. This is likely to be overlooked and places the burden on the user to get the expected results.

  2. Disable the batch processor when running in Lambda. This would prevent the performance problems the batch processor can cause in Lambda, but it would also deny users the batch processor's capabilities.

Automating the config change with a converter is the simplest and most robust approach for users. It preserves the benefit of the batch processor, which is recommended in OpenTelemetry best practices, while ensuring that users don't run into unexpected problems caused by the unique behaviors of the Lambda environment.

Testing

Unit tests have been added to verify the converter works as expected: when the user hasn't defined any processors, and when they've defined just a batch processor, just a decouple processor, or both in various valid and invalid orders (one such case is sketched below).
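As one illustration of an invalid order (a hypothetical fragment, not the actual test fixture), consider a logs pipeline where the decouple processor precedes the batch processor:

```yaml
# Hypothetical input: decouple appears before batch,
# so the batch processor's delay is not offset.
service:
  pipelines:
    logs:
      processors: [decouple, batch]
```

Because no decouple processor follows the last batch processor, the expected result is `processors: [decouple, batch, decouple]`.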

The remaining checks run in GitHub Actions.

Please take a look and let me know if you have any feedback or suggestions!

@nslaughter requested a review from a team as a code owner on April 16, 2024 at 16:37.
@tylerbenson (Member) commented:

Thanks for the thorough review @adcharre!

@adcharre (Contributor) left a comment:

Looking good, just one pedantic comment about the readme.

Commit: …er/README.md (Co-authored-by: Adam Charrett <73886859+adcharre@users.noreply.github.com>)
@nslaughter (Contributor, Author) replied:

> Looking good, just one pedantic comment about the readme.

I appreciate the details. Thank you.

@adcharre (Contributor) left a comment:

LGTM

@tylerbenson (Member) left a comment:

Nice work!


## Auto-Configuration

Due to the significant performance improvements with this approach, the OpenTelemetry Lambda Layer automatically configures the decouple processor when the batch processor is used. This ensures the best performance by default.
A reviewer (Member) commented on this addition:

Please also update collector/README.md with similar info.

@nslaughter (Contributor, Author) replied:

Agree. I pushed the changes. Please let me know your thoughts.

@tylerbenson merged commit dff9bc6 into open-telemetry:main on Apr 22, 2024. 12 checks passed.