[Benchmarks] Migrate linux throughput tests #6765
Conversation
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples comparing the following branches/commits. Execution-time benchmarks measure the whole time it takes to execute a program and are intended to capture one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

```mermaid
gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6765) - mean (70ms) : 66, 73
. : milestone, 70,
master - mean (69ms) : 66, 72
. : milestone, 69,
section CallTarget+Inlining+NGEN
This PR (6765) - mean (1,007ms) : 987, 1028
. : milestone, 1007,
master - mean (1,024ms) : 958, 1090
. : milestone, 1024,
```

```mermaid
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6765) - mean (102ms) : 100, 105
. : milestone, 102,
master - mean (105ms) : 101, 110
. : milestone, 105,
section CallTarget+Inlining+NGEN
This PR (6765) - mean (685ms) : 665, 706
. : milestone, 685,
master - mean (699ms) : 644, 755
. : milestone, 699,
```

```mermaid
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6765) - mean (89ms) : 86, 91
. : milestone, 89,
master - mean (90ms) : 87, 93
. : milestone, 90,
section CallTarget+Inlining+NGEN
This PR (6765) - mean (641ms) : 626, 656
. : milestone, 641,
master - mean (645ms) : 623, 667
. : milestone, 645,
```

```mermaid
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6765) - mean (190ms) : 186, 195
. : milestone, 190,
master - mean (190ms) : 187, 193
. : milestone, 190,
section CallTarget+Inlining+NGEN
This PR (6765) - mean (1,108ms) : 1080, 1137
. : milestone, 1108,
master - mean (1,113ms) : 1070, 1155
. : milestone, 1113,
```

```mermaid
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6765) - mean (269ms) : 265, 274
. : milestone, 269,
master - mean (269ms) : 266, 273
. : milestone, 269,
section CallTarget+Inlining+NGEN
This PR (6765) - mean (873ms) : 848, 898
. : milestone, 873,
master - mean (877ms) : 843, 911
. : milestone, 877,
```

```mermaid
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6765) - mean (262ms) : 258, 266
. : milestone, 262,
master - mean (261ms) : 257, 266
. : milestone, 261,
section CallTarget+Inlining+NGEN
This PR (6765) - mean (862ms) : 831, 893
. : milestone, 862,
master - mean (861ms) : 827, 895
. : milestone, 861,
```
Datadog Report
Branch report: ✅ 0 Failed, 247051 Passed, 2398 Skipped, 19h 41m 56.82s Total Time
Benchmarks Report for tracer 🐌
Benchmarks for #6765 compared to master:
The following thresholds were used for comparing the benchmark speeds:
Allocation changes below 0.5% are ignored.
Benchmark details
Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Benchmarks.Trace.ElasticsearchBenchmark - Slower ⚠️
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync‑net6.0 | 1.133 | 1,217.99 | 1,379.73 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | CallElasticsearch | net6.0 | 1.15μs | 0.603ns | 2.34ns | 0.0138 | 0 | 0 | 976 B |
master | CallElasticsearch | netcoreapp3.1 | 1.58μs | 0.916ns | 3.3ns | 0.0133 | 0 | 0 | 976 B |
master | CallElasticsearch | net472 | 2.59μs | 1.09ns | 4.09ns | 0.157 | 0 | 0 | 995 B |
master | CallElasticsearchAsync | net6.0 | 1.22μs | 0.568ns | 2.13ns | 0.0135 | 0 | 0 | 952 B |
master | CallElasticsearchAsync | netcoreapp3.1 | 1.63μs | 1.23ns | 4.78ns | 0.014 | 0 | 0 | 1.02 KB |
master | CallElasticsearchAsync | net472 | 2.77μs | 1.45ns | 5.41ns | 0.167 | 0 | 0 | 1.05 KB |
#6765 | CallElasticsearch | net6.0 | 1.22μs | 0.483ns | 1.87ns | 0.0134 | 0 | 0 | 976 B |
#6765 | CallElasticsearch | netcoreapp3.1 | 1.55μs | 0.433ns | 1.68ns | 0.0131 | 0 | 0 | 976 B |
#6765 | CallElasticsearch | net472 | 2.64μs | 1.97ns | 7.38ns | 0.158 | 0 | 0 | 995 B |
#6765 | CallElasticsearchAsync | net6.0 | 1.38μs | 0.894ns | 3.46ns | 0.0131 | 0 | 0 | 952 B |
#6765 | CallElasticsearchAsync | netcoreapp3.1 | 1.64μs | 0.844ns | 3.04ns | 0.014 | 0 | 0 | 1.02 KB |
#6765 | CallElasticsearchAsync | net472 | 2.77μs | 1.15ns | 4.46ns | 0.166 | 0 | 0 | 1.05 KB |
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | ExecuteAsync | net6.0 | 1.29μs | 0.676ns | 2.53ns | 0.0135 | 0 | 0 | 952 B |
master | ExecuteAsync | netcoreapp3.1 | 1.7μs | 0.973ns | 3.64ns | 0.0126 | 0 | 0 | 952 B |
master | ExecuteAsync | net472 | 1.76μs | 0.488ns | 1.89ns | 0.145 | 0 | 0 | 915 B |
#6765 | ExecuteAsync | net6.0 | 1.28μs | 0.988ns | 3.7ns | 0.0134 | 0 | 0 | 952 B |
#6765 | ExecuteAsync | netcoreapp3.1 | 1.63μs | 1.02ns | 3.52ns | 0.0132 | 0 | 0 | 952 B |
#6765 | ExecuteAsync | net472 | 1.78μs | 0.381ns | 1.43ns | 0.145 | 0 | 0 | 915 B |
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | SendAsync | net6.0 | 4.5μs | 1.36ns | 4.91ns | 0.0315 | 0 | 0 | 2.31 KB |
master | SendAsync | netcoreapp3.1 | 5.39μs | 2.4ns | 8.97ns | 0.0377 | 0 | 0 | 2.85 KB |
master | SendAsync | net472 | 7.46μs | 6.43ns | 24.9ns | 0.495 | 0 | 0 | 3.12 KB |
#6765 | SendAsync | net6.0 | 4.37μs | 1.29ns | 4.82ns | 0.0328 | 0 | 0 | 2.31 KB |
#6765 | SendAsync | netcoreapp3.1 | 5.25μs | 2.28ns | 8.84ns | 0.0366 | 0 | 0 | 2.85 KB |
#6765 | SendAsync | net472 | 7.54μs | 1.51ns | 5.65ns | 0.493 | 0 | 0 | 3.12 KB |
Benchmarks.Trace.ILoggerBenchmark - Faster 🎉 Same allocations ✔️
Faster 🎉 in #6765
Benchmark | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.ILoggerBenchmark.EnrichedLog‑net6.0 | 1.125 | 1,653.43 | 1,470.33 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 1.65μs | 0.846ns | 3.28ns | 0.0231 | 0 | 0 | 1.64 KB |
master | EnrichedLog | netcoreapp3.1 | 2.21μs | 6.26ns | 24.3ns | 0.0225 | 0 | 0 | 1.64 KB |
master | EnrichedLog | net472 | 2.52μs | 0.951ns | 3.68ns | 0.25 | 0 | 0 | 1.57 KB |
#6765 | EnrichedLog | net6.0 | 1.47μs | 3.31ns | 12.8ns | 0.0233 | 0 | 0 | 1.64 KB |
#6765 | EnrichedLog | netcoreapp3.1 | 2.28μs | 2.71ns | 10.5ns | 0.0227 | 0 | 0 | 1.64 KB |
#6765 | EnrichedLog | net472 | 2.52μs | 7.98ns | 30.9ns | 0.249 | 0 | 0 | 1.57 KB |
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 112μs | 135ns | 522ns | 0.0556 | 0 | 0 | 4.28 KB |
master | EnrichedLog | netcoreapp3.1 | 116μs | 170ns | 657ns | 0.0578 | 0 | 0 | 4.28 KB |
master | EnrichedLog | net472 | 148μs | 139ns | 539ns | 0.668 | 0.223 | 0 | 4.46 KB |
#6765 | EnrichedLog | net6.0 | 111μs | 120ns | 463ns | 0.0549 | 0 | 0 | 4.28 KB |
#6765 | EnrichedLog | netcoreapp3.1 | 117μs | 154ns | 574ns | 0.0582 | 0 | 0 | 4.28 KB |
#6765 | EnrichedLog | net472 | 151μs | 146ns | 564ns | 0.68 | 0.227 | 0 | 4.46 KB |
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 2.85μs | 1.02ns | 3.81ns | 0.0314 | 0 | 0 | 2.2 KB |
master | EnrichedLog | netcoreapp3.1 | 4.16μs | 3.1ns | 11.6ns | 0.0292 | 0 | 0 | 2.2 KB |
master | EnrichedLog | net472 | 4.8μs | 1.56ns | 6.02ns | 0.319 | 0 | 0 | 2.02 KB |
#6765 | EnrichedLog | net6.0 | 3.03μs | 2.7ns | 10.5ns | 0.0305 | 0 | 0 | 2.2 KB |
#6765 | EnrichedLog | netcoreapp3.1 | 4.05μs | 2.21ns | 8.28ns | 0.0286 | 0 | 0 | 2.2 KB |
#6765 | EnrichedLog | net472 | 4.99μs | 1.23ns | 4.6ns | 0.32 | 0 | 0 | 2.02 KB |
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | SendReceive | net6.0 | 1.33μs | 3.38ns | 13.1ns | 0.0165 | 0 | 0 | 1.14 KB |
master | SendReceive | netcoreapp3.1 | 1.78μs | 1.14ns | 4.25ns | 0.0151 | 0 | 0 | 1.14 KB |
master | SendReceive | net472 | 2.01μs | 0.995ns | 3.85ns | 0.183 | 0 | 0 | 1.16 KB |
#6765 | SendReceive | net6.0 | 1.34μs | 0.85ns | 3.29ns | 0.0159 | 0 | 0 | 1.14 KB |
#6765 | SendReceive | netcoreapp3.1 | 1.76μs | 1.96ns | 7.58ns | 0.0159 | 0 | 0 | 1.14 KB |
#6765 | SendReceive | net472 | 2.12μs | 0.866ns | 3.36ns | 0.183 | 0 | 0 | 1.16 KB |
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | EnrichedLog | net6.0 | 2.83μs | 1.12ns | 4.2ns | 0.0227 | 0 | 0 | 1.6 KB |
master | EnrichedLog | netcoreapp3.1 | 3.91μs | 1.93ns | 7.48ns | 0.0215 | 0 | 0 | 1.65 KB |
master | EnrichedLog | net472 | 4.38μs | 2.81ns | 10.5ns | 0.322 | 0 | 0 | 2.04 KB |
#6765 | EnrichedLog | net6.0 | 2.83μs | 1.47ns | 5.68ns | 0.0226 | 0 | 0 | 1.6 KB |
#6765 | EnrichedLog | netcoreapp3.1 | 4.01μs | 3.74ns | 14.5ns | 0.0219 | 0 | 0 | 1.65 KB |
#6765 | EnrichedLog | net472 | 4.43μs | 3.3ns | 12.4ns | 0.323 | 0 | 0 | 2.04 KB |
Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️
Slower ⚠️ in #6765
Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
---|---|---|---|---|
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 | 1.224 | 396.89 | 485.96 |
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | StartFinishSpan | net6.0 | 397ns | 0.131ns | 0.508ns | 0.00815 | 0 | 0 | 576 B |
master | StartFinishSpan | netcoreapp3.1 | 626ns | 0.27ns | 1.01ns | 0.00782 | 0 | 0 | 576 B |
master | StartFinishSpan | net472 | 675ns | 0.129ns | 0.466ns | 0.0915 | 0 | 0 | 578 B |
master | StartFinishScope | net6.0 | 495ns | 0.172ns | 0.645ns | 0.00984 | 0 | 0 | 696 B |
master | StartFinishScope | netcoreapp3.1 | 684ns | 0.331ns | 1.28ns | 0.00957 | 0 | 0 | 696 B |
master | StartFinishScope | net472 | 846ns | 0.448ns | 1.68ns | 0.104 | 0 | 0 | 658 B |
#6765 | StartFinishSpan | net6.0 | 486ns | 0.465ns | 1.8ns | 0.00807 | 0 | 0 | 576 B |
#6765 | StartFinishSpan | netcoreapp3.1 | 572ns | 0.682ns | 2.64ns | 0.00793 | 0 | 0 | 576 B |
#6765 | StartFinishSpan | net472 | 688ns | 0.14ns | 0.541ns | 0.0916 | 0 | 0 | 578 B |
#6765 | StartFinishScope | net6.0 | 523ns | 1.56ns | 6.03ns | 0.00966 | 0 | 0 | 696 B |
#6765 | StartFinishScope | netcoreapp3.1 | 738ns | 0.394ns | 1.48ns | 0.00924 | 0 | 0 | 696 B |
#6765 | StartFinishScope | net472 | 932ns | 0.369ns | 1.43ns | 0.104 | 0 | 0 | 658 B |
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|
master | RunOnMethodBegin | net6.0 | 590ns | 0.24ns | 0.929ns | 0.00974 | 0 | 0 | 696 B |
master | RunOnMethodBegin | netcoreapp3.1 | 929ns | 0.658ns | 2.55ns | 0.00923 | 0 | 0 | 696 B |
master | RunOnMethodBegin | net472 | 1.16μs | 0.39ns | 1.51ns | 0.104 | 0 | 0 | 658 B |
#6765 | RunOnMethodBegin | net6.0 | 641ns | 0.237ns | 0.919ns | 0.00961 | 0 | 0 | 696 B |
#6765 | RunOnMethodBegin | netcoreapp3.1 | 980ns | 0.265ns | 0.99ns | 0.00934 | 0 | 0 | 696 B |
#6765 | RunOnMethodBegin | net472 | 1.1μs | 0.48ns | 1.86ns | 0.104 | 0 | 0 | 658 B |
Thanks so much!
My main question is that most of the logic for building the .NET images and pulling the artifacts from the consolidated-pipeline appears to be in the benchmarking repo (even though the code lives in our repo)... is that... normal? 😄
Additionally, in the dashboard, I only see `baseline` jobs? 🤔 And what are the jobs called e.g. `baseline-matrix--i3` (for example)?
Thanks for working on this, it can't come soon enough 😄
```yaml
rules:
  - if: '$CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == "master"'
    when: always
  - if: '$CI_PIPELINE_SOURCE == "schedule" && $BENCHMARK_RUN == "true"'
```
Just to clarify
- AFAIK, we don't have a scheduled gitlab run currently (just azure devops)
- Presumably, if we want to add one, we could add it to this file? I don't think it's necessary if we're testing against master, unless we want to also schedule tests against older branches for comparison 🤔
- If we do a manual run, this will always run?
> we don't have a scheduled gitlab run currently (just azure devops)

> Presumably, if we want to add one, we could add it to this file?

I will add the scheduled GitLab pipeline run once the PR is merged; the other GitLab schedules for the dd-trace-dotnet repo live here.
> If we do a manual run, this will always run?
You'll need to manually trigger the child pipeline.
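For illustration, the two rules from this diff could hypothetically be extended with a third entry to expose a manual gate for pipelines started from the GitLab UI (the `web` rule below is an assumption, not part of this PR):

```yaml
rules:
  # Rules from this PR: run on pushes to master, and on schedules
  # where BENCHMARK_RUN is set to "true".
  - if: '$CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == "master"'
    when: always
  - if: '$CI_PIPELINE_SOURCE == "schedule" && $BENCHMARK_RUN == "true"'
  # Hypothetical addition: let pipelines triggered from the UI run this job manually.
  - if: '$CI_PIPELINE_SOURCE == "web"'
    when: manual
```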
```yaml
  - benchmarks

variables:
  MACROBENCHMARKS_CI_IMAGE: 486234852809.dkr.ecr.us-east-1.amazonaws.com/ci/benchmarking-platform:dotnet-throughput-2
```
Where/how is this image created? 🤔 How would we go about changing it if we need to?
It is created from the benchmarking-platform repo.
In order to change it, you just need to change the underlying Dockerfile and rerun the job to build the image!
- btw, this is a great starting point for all things Benchmarking-platform related: Dogfluence doc.
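As a sketch of that workflow: after rebuilding and pushing the image from the benchmarking-platform repo, consuming it here is a one-line variable change (the bumped tag below is hypothetical):

```yaml
variables:
  # Hypothetical tag bump picking up an image rebuilt in benchmarking-platform.
  MACROBENCHMARKS_CI_IMAGE: 486234852809.dkr.ecr.us-east-1.amazonaws.com/ci/benchmarking-platform:dotnet-throughput-3
```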
```yaml
stage: check-azure-pipeline
image: $MACROBENCHMARKS_CI_IMAGE
script:
  - git clone --branch dd-trace-dotnet/macro https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.ddbuild.io/DataDog/benchmarking-platform platform && cd platform
```
will we always be working in this branch? Or will this need to pull `main` in the future?
Each team has a dedicated benchmarking branch in the benchmarking-platform repo; `dd-trace-dotnet/macro` is the branch hosting the 'prod' version of the macrobenchmarks, so these benchmarks will always use that one.
```yaml
image: $MACROBENCHMARKS_CI_IMAGE
script:
  - git clone --branch dd-trace-dotnet/macro https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.ddbuild.io/DataDog/benchmarking-platform platform && cd platform
  - ./wait-for-pipeline.sh
```
Personally I think this script should be in this repository, not in the benchmarking platform repo 🤔 If we change anything about our pipeline (which we 100% plan to do, a lot), then we have to try to synchronize the changes in the benchmarking repo too, which could be difficult
- The general consensus is that benchmarks (and their helper scripts) live in the benchmarking-platform repo (which is the case for all other teams).
- Changes to the helper scripts might result in the need to change the benchmarks themselves, which is why they would need to live in the same place.
- IIRC, long term there is a plan to move everything to GitLab, so this part of the benchmark setup is more of a temporary arrangement.
> If we change anything about our pipeline (which we 100% plan to do, a lot), then we have to try to synchronize the changes in the benchmarking repo too, which could be difficult
👋 Hi Andrew,
Thanks for the feedback! We don't have any hard blockers for the BP repo, and you have write access in case you would like to move things. The current structure is easier for our intended setup, but we of course won't block you if you decide to change things (as this is something you are going to own; plus, the setup permissions that we have for dd-trace-dotnet allow a lot of flexibility, up to moving benchmarks completely from BP to the dd-trace-dotnet repo, which I wouldn't recommend for macrobenchmarks, since we plan to set up a black-box testing environment in the mid-term, where the same test cases can be used across multiple language teams, and having everything in one place would simplify setup for us all).
Maybe we should agree more clearly on ownership of the different parts of the benchmarking CI pipeline 🤔
For example, I don't think it makes sense for you to own the bp-runner parts necessary for running benchmarks, as we can scale those across teams.
If this is how all the other teams have it, then we should definitely stick with that 😄 I think my main feedback is that you could benefit from a little more separation for parts that are dependent on the dd-trace-dotnet CI pipeline.
As a concrete example, there's the libdatadog-build one-pipeline, which handles building the OCI images from binaries. The one-pipeline definition doesn't have concrete knowledge of each CI image; rather, it expects that the CI pipeline will run a stage to ensure the artifacts are placed in a specific location.
I think this separation would be nice - there's still a contract to maintain (the one-pipeline requires the artifacts to be in a known location) - but the contract is all about outputs, not the details of how you obtain those outputs. Currently the BMP design is highly intertwined.
I don't think any of this is a blocker, especially as this is undoubtedly better than what we currently have; it's more a suggestion for improvements in maintainability 🙂
FWIW, I think it would benefit you too in that regard, as it gives a much clearer delineation of responsibilities - the BMP repo is purely responsible for taking some artifacts and running them, whereas currently it's also responsible for obtaining them too 🙂
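As a minimal sketch of that kind of output-based contract (all job names, stages, scripts, and paths below are hypothetical, not the actual pipeline): the tracer pipeline owns producing the artifacts at an agreed location, and the benchmarking side only consumes that location:

```yaml
# Hypothetical sketch of an output-based contract between the two pipelines.
build-throughput-artifacts:
  stage: build
  script:
    # How the artifacts are produced is purely a dd-trace-dotnet concern.
    - ./build-throughput-app.sh --output ./artifacts/throughput
  artifacts:
    paths:
      - artifacts/throughput/   # the agreed contract location

run-macrobenchmarks:
  stage: benchmarks
  needs: [build-throughput-artifacts]
  script:
    # The benchmarking side depends only on the contract location,
    # not on how the artifacts were obtained.
    - ./run-benchmarks.sh ./artifacts/throughput
```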
> I think this separation would be nice - there's still a contract to maintain (the one-pipeline requires the artifacts to be in a known location) - but the contract is all about outputs, not the details of how you obtain those outputs. Currently the BMP design is highly intertwined.
Hey,
Every team has slightly different preferences; e.g. the dd-trace-py folks want to move shell scripts from the BP repo directly into dd-trace-py, like you want. At the same time, dd-trace-java, dd-trace-js, dd-trace-php, and all the other folks are happy with the current structure and don't ask to change it.
So while some teams want things structured differently, we still enforce keeping everything in one place (the benchmarking-platform repo). This is not ideal for some workflows; as you said, convention over configuration might serve us better.
I see benefits in moving most things into the language repos directly, but we need to do it in an organized way. In the meantime, the structure we have for dd-trace-dotnet is the same as for the other repos.
- In dd-trace-dotnet: a thin layer of GitLab jobs that specify different configuration options for the benchmarks.
- In the benchmarking-platform repo: the "meat" of the benchmarking pipeline, the scripts that do the actual work.
The benefits of having things in benchmarking-platform repo:
- We hide all the details of interactions with AWS SSM, as originally suggested by security.
- Changes to the benchmarking scripts are applied across all pipelines, regardless of the state of the main branch in dd-trace-dotnet (so if something breaks, e.g. the Azure Pipelines API changes, we don't need to ask users to rebase dev branches in order to get the latest fix).
Downsides:
- The repos are tightly coupled; some teams would prefer having the scripts directly in their language repository instead of the separate benchmarking-platform repo, mostly because they don't want to edit multiple repos when they need to change something in the CI pipeline.
As for the ownership contract that we currently have with other teams (and the one that we expect with the .NET team):
- The macrobenchmark application code, the microbenchmarks code, and the GitLab configuration options for CI jobs (like whether the tracer is enabled, how much RPS is sent, etc.) are maintained by you after the initial install.
- bp-runner, PR commenter, benchmark-analyzer, runners infrastructure and CI monitoring are owned by us.
- Dashboard and alert setups are owned by you, but we provide tooling to streamline/automate the setup.
- Some CI shell scripts are owned by us and you, meaning if something breaks, either we will reach out to you asap or you should reach out to us and we will help.
As for the artifact-downloading script, its ownership is indeed fuzzy (we only needed something like it for dd-trace-php, due to its involved CircleCI setup, and at some point for dd-trace-java, though not anymore after the Java build jobs migrated to GitLab), and it falls into the territory of shared ownership. I would say you are in a better position to fix it if it breaks, so PRs are welcome, but you can also ping us and we will look into it (as we set it up initially).
Yep, I totally understand, thanks 🙂
I think my main assertion is that the scripts to download the artifacts, and to build the throughput app, should be owned by us. And because of that, it makes sense to have that defined in our repo, not in the benchmarking platform repo.
There are two reasons for that:
- Making changes to these scripts with the current design (e.g. due to changes in our pipeline) is a massive pain to do without breaking everyone 😄 Doing it "properly" requires 3 separate PRs instead of 1
- Remembering that there are external scripts will basically not happen, and as we don't test on PRs, this will break master builds. I practically guarantee it 😉
Anyway, that's all non-blocking stuff that can be updated in the future, just my feedback on the setup 🙂
Yes, the rest of the benchmarking scripts are intertwined with the script that does that, so changing it would probably mean changing the benchmarking scripts themselves! 😄
These are remnants from test runs (which shouldn't happen all the time 👀)
```yaml
variables:
  FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY: "true"

  K6_OPTIONS_WARMUP_RATE: 30
```
Just a comment: `K6_OPTIONS_WARMUP_RATE` seems too low; I would suggest raising it to somewhere between `K6_OPTIONS_NORMAL_OPERATION_RATE` and `K6_OPTIONS_HIGH_LOAD_RATE` (400 to 600).
We will need to tweak the RPS after there is some data in the dashboards, so no action is needed right now.
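As a sketch, the suggested tweak would be a single variable change (the exact value is the reviewer's suggestion, not part of this PR):

```yaml
variables:
  # Suggested: warm up between K6_OPTIONS_NORMAL_OPERATION_RATE and
  # K6_OPTIONS_HIGH_LOAD_RATE, i.e. somewhere in the 400-600 range.
  K6_OPTIONS_WARMUP_RATE: 500
```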
I had some minor maintainability questions, but I don't think any of those are blockers.
Very minor point (unrelated to this PR): the data is all sent to staging, right? We were recently asked to move all our CI data to org2 - should the BMP be doing the same? 🤔
Just to clarify, this PR covers steps 1-4 of the docs, right? Do we need to do 5 + 6, or is that already done?
Thanks again for all your work on this!
Sure, let's discuss them tomorrow at the sync! As Dima said, most considerations here are recommendations, and we allow a high level of flexibility (but we still need to take into account easy setup for future initiatives that require a certain degree of consistency across teams).
Yes, results from the runs are being sent to staging, as you can see from the dashboard! 😉
We might need to tweak some of the parameters of the test depending on the results from the nightly runs, but that is easily done by changing some CI variables!
Thank you 🙇
…ak pipeline if fails (force-pushed from eafd4de to 05304e4)
/merge
View all feedbacks in Devflow UI.
The expected merge time in
fayssal.defaa@datadoghq.com cancelled this merge request build
/merge --cancel
View all feedbacks in Devflow UI.
Summary of changes
This PR enables running the Linux-based throughput tests through the Benchmarking Platform on GitLab.
The tests are set up to run:
- on every push to the `master` branch
- when `BENCHMARK_RUN` is set to `"true"` for the scheduled pipeline.

The results can be checked on the following dashboard.
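Those two trigger conditions correspond to the rules from the diff above; as a sketch (the job name and child-pipeline path below are hypothetical), the gating looks like:

```yaml
# Hypothetical sketch of the gating; job name and include path are illustrative.
benchmarks:
  stage: benchmarks
  trigger:
    include: .gitlab/throughput-benchmarks.yml   # hypothetical child-pipeline path
    strategy: depend
  rules:
    - if: '$CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == "master"'
      when: always
    - if: '$CI_PIPELINE_SOURCE == "schedule" && $BENCHMARK_RUN == "true"'
```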
Reason for change
Implementation details