Performance/Throughput Impact with auto instrumentation in Spring 5 applications #59
Comments
What sampler are you using? Why 100% sampling? That will have a perf impact. I switched to the parent-based ratio sampler.
@stnor
25% is a very high sampling frequency in my experience.
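For readers unfamiliar with the sampler discussed above, here is a minimal, self-contained sketch of how a parent-based ratio sampler decides whether to record a trace. It is an illustration of the idea only, not the OpenTelemetry SDK's code: the class and method names (`ParentBasedRatioSketch`, `ratioDecision`, `shouldSample`) are hypothetical, and the 0.25 ratio is taken from the 25% figure mentioned in this thread.

```java
import java.util.Optional;
import java.util.Random;

// Illustrative sketch only; all names here are hypothetical, not OpenTelemetry APIs.
class ParentBasedRatioSketch {

    // Ratio decision for a root span: clear the sign bit of the trace ID's
    // low 64 bits and compare the result against a threshold derived from the ratio.
    static boolean ratioDecision(long traceIdLowBits, double ratio) {
        if (ratio >= 1.0) return true;   // 100% sampling keeps every trace
        if (ratio <= 0.0) return false;  // 0% sampling drops every trace
        long threshold = (long) (ratio * Long.MAX_VALUE);
        return (traceIdLowBits & Long.MAX_VALUE) < threshold;
    }

    // Parent-based wrapper: if a parent span exists, follow its decision;
    // otherwise fall back to the ratio decision.
    static boolean shouldSample(Optional<Boolean> parentSampled,
                                long traceIdLowBits, double ratio) {
        return parentSampled.orElseGet(() -> ratioDecision(traceIdLowBits, ratio));
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int total = 100_000;
        int sampled = 0;
        for (int i = 0; i < total; i++) {
            // Root spans only (no parent), random trace IDs, assumed 25% ratio.
            if (shouldSample(Optional.empty(), rnd.nextLong(), 0.25)) {
                sampled++;
            }
        }
        System.out.printf("sampled %d of %d root traces%n", sampled, total);
    }
}
```

The point of the parent-based wrapper is that a downstream service never drops a span whose parent was already sampled, so traces stay complete while the ratio limits how many new traces start.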
Is there a standard recommendation available? It would definitely help to publish benchmark results for different samplers and ratios, using a demo application that interacts with a database and a messaging system.
Hi @kurvatch - I agree that the performance impact seems much larger than we'd expect. Lowering the sampling rate is great for reducing load on backends, but we wouldn't expect that much overhead at hundreds of QPS. I have filed open-telemetry/opentelemetry-java-instrumentation#3047, as that repo is where the actual code is and where the performance bottlenecks can be investigated.
This issue is stale because it has been open 90 days with no activity. If you want to keep this issue open, please just leave a comment below and auto-close will be canceled.
This issue was closed because it has been marked as stale for 30 days with no activity.
Describe the bug
We are seeing more than 50% performance degradation after instrumenting with the OTel agent. Our application, instrumented with OTel, runs on an EKS cluster. An OTel Collector running as a DaemonSet in the same EKS cluster collects traces and ingests the data into AWS X-Ray.
Steps to reproduce
This is a Spring 5 project with WebFlux and Spring Cloud Stream support, interacting with SQS, DynamoDB, and AWS MSK.
What did you expect to see?
Without the OTel agent, the application could reach up to 250 requests per second with 2Gi of memory.
What did you see instead?
With the OTel agent, we are seeing ~65 requests per second with the same settings. I was expecting some degradation in throughput, but this is more than 50%.
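To put a number on the drop, the two throughput figures reported above (250 and ~65 requests per second) can be plugged into a simple percentage calculation. This is only a worked arithmetic sketch; the class and method names are hypothetical.

```java
// Worked arithmetic for the reported figures; names are illustrative only.
class ThroughputDrop {

    // Percentage reduction from a baseline throughput to an observed one.
    static double percentDrop(double baseline, double observed) {
        return (baseline - observed) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double withoutAgent = 250.0; // requests per second, as reported
        double withAgent = 65.0;     // approximate requests per second, as reported
        System.out.printf("throughput drop: %.0f%%%n",
                percentDrop(withoutAgent, withAgent));
    }
}
```

With these inputs the reduction works out to roughly 74%, which is consistent with the "more than 50%" described in the report.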
Additional context
We are using aws-opentelemetry-agent-1.1.0 with default settings for the BSP (batch span processor). Sampling is set to 100% and the metrics exporter is set to logging.