
Project Tracking: Performance Benchmarking SIG #1617

Open
cartersocha opened this issue Jul 27, 2023 · 24 comments

Comments

@cartersocha
Contributor

cartersocha commented Jul 27, 2023

Description

As adoption of OpenTelemetry grows and larger enterprises continue to deepen their usage of project components, there are persistent end user questions about OpenTelemetry's performance impact. End user performance varies with the quirks of each environment, but without a project performance standard and a historical data record, no one really knows whether the numbers they're seeing are abnormal or expected. Additionally, there is no comprehensive documentation on tuning project components or on the performance trade-offs available to users, which results in a reliance on vendor support.

Project maintainers need to be able to track the current state of their components and prevent performance regressions when making new releases. Customers need a general sense of the potential OpenTelemetry performance impact and confidence that OpenTelemetry takes performance and customer resources seriously. Performance tracking and quantification is a general need that should be addressed by a project-wide effort and automated tooling that minimizes repo owner effort while providing valuable new data points for all project stakeholders.

Project Board

SIG Charter

charter

Deliverables

  • Evaluate the current performance benchmarking specification, propose an updated benchmarking standard that can apply across project components, and make the requisite specification updates. The benchmarking standard should provide relevant information for maintainers and end users.
  • Develop automated tooling that can be used across project repos to report current performance numbers and track changes as new features / PRs are merged (see the sketch after the scope note below).
  • Write performance tuning documentation for the project website that helps customers make actionable decisions when faced with performance trade-offs or when debugging poor component performance.
  • Provide ongoing maintenance as needed on the automated tooling and own the underlying assets.

Initial implementation scope would be the core Collector components (main repo) and the JavaScript / Java / Python SDKs with their core components. No contrib or instrumentation repositories.
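
To make the tooling deliverable concrete, here is a minimal sketch of one possible regression check. It assumes each repo exports its benchmark results as a flat JSON list of name/score/unit records (a hypothetical format, not an agreed standard) and flags any benchmark whose score drops by more than a threshold relative to a stored baseline:

import json
import sys

# Hypothetical result format: [{"name": ..., "score": ..., "unit": ...}, ...],
# where a higher score means better performance.
THRESHOLD = 0.10  # flag regressions larger than 10%

def load(path):
    with open(path) as f:
        return {entry["name"]: entry for entry in json.load(f)}

def compare(baseline_path, current_path):
    baseline = load(baseline_path)
    current = load(current_path)
    regressed = False
    for name, base in sorted(baseline.items()):
        cur = current.get(name)
        if cur is None:
            print(f"MISSING     {name}")
            continue
        change = (cur["score"] - base["score"]) / base["score"]
        status = "OK"
        if change < -THRESHOLD:
            status = "REGRESSION"
            regressed = True
        print(f"{status:11} {name}: {base['score']:.2f} -> {cur['score']:.2f} {base['unit']} ({change:+.1%})")
    return 1 if regressed else 0

if __name__ == "__main__":
    sys.exit(compare(sys.argv[1], sys.argv[2]))

A CI job on the shared bare metal machines could run a repo's benchmark suite, write the current results file, and run this check against the last published baseline; the same JSON could also feed the published results page.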

Staffing / Help Wanted

Anyone with an opinion on performance standards and testing.

Language maintainers or approvers, as they will be tasked with implementing the changes and following through on the process.

Required staffing

lead - tbd
@jpkrohling domain expert
@cartersocha contributor
@mwear collector sig
@codeboten collector sig implementation
@ocelotl python sig
@martinkuba javascript
@tylerbenson java
@sbaum1994 contributor

@jpkrohling - TC/GC sponsor
@alolita - TC/GC sponsor

Need: more performance domain experts
Need: maintainers or approvers from several language SIGs to participate

Meeting Times

TBD

Timeline

Initial scope is the Collector and three SDKs. Output should be delivered by KubeCon NA (November 6, 2023).

Labels

tbd

Linked Issues and PRs

https://opentelemetry.io/docs/collector/benchmarks/
cncf/cluster#245
cncf/cluster#182
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/performance-benchmark.md
https://opentelemetry.io/docs/specs/otel/performance-benchmark/

@cartersocha
Contributor Author

@puckpuck fyi

@tigrannajaryan
Member

Please delete boilerplate like this from the description to make it easier to read:

A description of what this project is planning to deliver, or is in the process of delivering. This includes all OTEPs and their associated prototypes.

In general, OTEPs are not accepted unless they come with working prototypes available to review in at least two languages. Please discuss these requirements with a TC member before submitting an OTEP.

There is more text like that which seems to be copied from a template and should be deleted or replaced with more specifics.

@tigrannajaryan
Member

  • Evaluate the current performance benchmarking specification

Does this refer to this document?

@jpkrohling
Member

cc @gsoria and @harshita19244, as they worked on performance benchmarks for SDKs at different stages (OpenTracing and OpenTelemetry) and can share their experience in doing so.

@jpkrohling
Member

cc @sh0rez and @frzifus, as they are interested in benchmarking the Collector against other solutions.

@alolita
Member

alolita commented Aug 1, 2023

@cartersocha I'd be happy to be the second GC sponsor supporting this Performance Benchmarking SIG.

I recommend creating a charter doc for this SIG to map out more details about its mission, goals, deliverables, and logistics. Let's also itemize what is out of scope and list the non-goals, since performance benchmarking is a subjective area for an open source project of OpenTelemetry's breadth and depth.

Please share the link on this thread.

@harshita19244

harshita19244 commented Aug 1, 2023

Hi, I worked on a performance benchmarking project comparing the OpenTracing and OpenTelemetry libraries as part of my Outreachy internship. All tests were executed on bare metal machines. The GitHub repo is here: https://github.com/harshita19244/opentelemetry-java-benchmarks
Feel free to reach out to me if you have any questions.

@brettmc

brettmc commented Aug 2, 2023

Over in the PHP SIG, we've implemented (most of) the documented perf tests, but what I think we lack is a way to run them on consistent hardware and a way to publish the results (or compare against a baseline to track regressions/improvements).

@cartersocha
Contributor Author

@brettmc a request for bare metal machines was already made and approved. I'll share the details once we get them: cncf/cluster#245

@frzifus
Member

frzifus commented Aug 2, 2023

Thanks @cartersocha for starting this!

Anyone with an opinion on performance standards and testing.

I would be super interested in participating.

Recently @sh0rez started a project to compare grafana-agent and prometheus-agent performance in collecting metrics. Since it's quite flexible, it wasn't too hard to extend it to include the OpenTelemetry Collector. Maybe it's beneficial for this project; happy to chat about it.

@cartersocha
Contributor Author

Would love to see the data / results or hear about any testing done here @frzifus. Thanks for being willing to share your work 😎

@cartersocha
Contributor Author

Added a charter to the proposal as @alolita suggested.

@ocelotl

ocelotl commented Aug 22, 2023

👍

@vielmetti

Looking forward to seeing this go forward! cc @tobert

@cartersocha
Contributor Author

Hey @frzifus @sh0rez @harshita19244 @gsoria @brettmc, we now have bare metal machines to run tests on. I wasn't sure how to add all of you on Slack, but we're in the CNCF Slack #otel-benchmarking channel.

https://cloud-native.slack.com/archives/C05PEPYQ5L3

@jack-berg
Member

In Java we've taken performance fairly seriously, and we continue to make improvements as we receive feedback. For example, we received an issue about a use case in which millions of distinct metric series may need to be maintained in memory, along with feedback that the SDK at the time produced problematic memory churn. Since receiving it, we have reduced metric memory allocation by 80%, and there is work in progress to reduce it by 99.9% (essentially zero memory allocations after the metric SDK reaches a steady state). We also have performance test suites for many sensitive areas and validate that changes to those areas don't degrade performance.

All this is to say that I believe we have a decent performance story today.

However, where I think we could improve is in performance documentation to point curious users to. Our performance test suites require quite a bit of context to run and to interpret. It would be great if we could extend the spec performance benchmark document to include high-level descriptions of some use cases for each signal, and to provide tooling to run and publish performance results to some central location.

If the above were available, we would have some nice material to point users to when they are evaluating the project. We would still keep the nuanced performance tests for sensitive areas, but it would be good to have something simpler and higher level.

In general, I think performance engineering is going to be very language / implementation dependent. I would caution against too expansive a scope for a cross-language performance group. It would be great to provide documentation of use cases to evaluate in suites, plus tooling for running on bare metal and publishing results, but there will always be nuanced language-specific concerns. I think we should raise those issues with the relevant SIGs and let those maintainers / contributors work out solutions.

jack-berg reopened this Sep 10, 2023
@reyang
Member

reyang commented Sep 14, 2023

I have a similar position to @jack-berg's.

Taking OpenTelemetry .NET as an example, performance has been taken seriously from the beginning.

Thinking about what could potentially benefit OpenTelemetry .NET: having perf numbers published in an official document on opentelemetry.io across all programming languages might increase discoverability.

@cartersocha
Contributor Author

Thanks for the context, all. @jack-berg could you share where the Java tests are published and what compute they run on? @reyang could you share what compute you rely on in .NET, and would you consider migrating the test results to the OTel website like the Collector does?

@jack-berg
Member

The tests are scattered throughout the repo in directories next to the source they evaluate. All the directories contain "jmh". I wrote a quick little script to find them all:

find . -type d | grep "^.*\/jmh$" | grep -v ".*\/build\/.*"

# Results
./context/src/jmh
./exporters/otlp/all/src/jmh
./exporters/otlp/common/src/jmh
./extensions/trace-propagators/src/jmh
./extensions/incubator/src/jmh
./sdk/metrics/src/jmh
./sdk/trace/src/jmh
./sdk/logs/src/jmh
./api/all/src/jmh

They run on each developer's local machine, and only on request. The basic idea is that maintainers / approvers know which areas of the code are sensitive and have JMH test suites. When someone opens a PR that we suspect has performance implications, we ask them to run the performance suite before and after and compare the results (example). It's obviously imperfect, but it has generally been fine.

It would be good if there was an easy way to run a subset of these on stable compute and publish the results to a central place. I think running / publishing all of them might be overwhelming.
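
As one illustration of how that could work, here is a short sketch (same caveats as the one in the issue description above) that flattens JMH's JSON output (produced with -rf json) into simple name/score/unit records so the Java results could be compared or published with shared tooling. The "benchmark", "primaryMetric.score", and "primaryMetric.scoreUnit" fields come from JMH's JSON result format; everything else is hypothetical:

import json
import sys

def flatten_jmh(path):
    # JMH's JSON result format (-rf json) is a list of entries, each with a
    # fully qualified "benchmark" name and a "primaryMetric" object.
    with open(path) as f:
        results = json.load(f)
    return [
        {
            "name": entry["benchmark"],
            "score": entry["primaryMetric"]["score"],
            "unit": entry["primaryMetric"]["scoreUnit"],
        }
        for entry in results
    ]

if __name__ == "__main__":
    json.dump(flatten_jmh(sys.argv[1]), sys.stdout, indent=2)

Each SDK could ship a similar small adapter from its native benchmark output to whatever shared format the SIG settles on.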

@cartersocha
Contributor Author

Makes sense. Thanks for sharing those details. Let me start a thread in the CNCF Slack to coordinate machine access.

@cwegener

cwegener commented Sep 20, 2023

A random find I just stumbled across: a k6 extension for generating OTel signals, created by an ING Bank engineer: https://github.com/thmshmm/xk6-opentelemetry

I'm not sure what the guidelines on usage of 3rd party tooling are for the Performance Benchmarking SIG.

@cartersocha
Contributor Author

Thanks for sharing @cwegener! The guidelines are still to be defined, so we'll see, but the general preference is for community tooling (which can also be donated). We're a decentralized project and each language has its quirks, so whatever guidelines are defined would be more of a baseline. If you think this approach would be generally beneficial we'd love to hear more. Feel free to cross-post in the #otel-benchmarking channel.

@cwegener

If you think this approach would be generally beneficial we’d love to hear more.

I will test drive the k6 extension myself a little bit and report back in Slack.

@tedsuo
Contributor

tedsuo commented Sep 27, 2023

@cartersocha do you mind converting this issue to a PR? We are now placing proposals here: https://github.com/open-telemetry/community/tree/main/projects
