Add Kafka poll span when DSM is enabled #6969

Merged 5 commits on May 9, 2024


piochelepiotr
Contributor

What Does This Do

Adds a span for Kafka polls when Data Streams Monitoring (DSM) is enabled. This has a few benefits:

  1. It helps debug slow polls, which can sometimes point to an issue with the Kafka brokers.
  2. It gives visibility into Kafka schema registry calls (and schema extraction) when a schema registry is used. These calls can slow down Kafka polls, so this is useful debugging information.
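
As a rough illustration of the idea (not the code in this PR), the sketch below wraps a poll in a span only when a Data Streams flag is on. `SketchSpan`, `PollSpanSketch`, the `dataStreamsEnabled` field, and the span name `kafka.poll` are all hypothetical stand-ins, not dd-trace-java APIs.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a tracer span; not the dd-trace-java API.
final class SketchSpan {
  final String name;
  final long startNanos;
  long durationNanos = -1; // stays -1 until finish() is called

  SketchSpan(String name) {
    this.name = name;
    this.startNanos = System.nanoTime();
  }

  void finish() {
    durationNanos = System.nanoTime() - startNanos;
  }
}

final class PollSpanSketch {
  // Assumed flag; a real tracer would read this from its own configuration.
  static boolean dataStreamsEnabled = true;
  // Most recently finished span, kept only so the sketch can be inspected.
  static SketchSpan lastSpan;

  // Runs the poll; when the flag is on, times it inside a span so slow polls become visible.
  static <T> List<T> tracedPoll(Supplier<List<T>> poll) {
    if (!dataStreamsEnabled) {
      return poll.get(); // no extra span when the feature is off
    }
    SketchSpan span = new SketchSpan("kafka.poll"); // hypothetical span name
    try {
      return poll.get();
    } finally {
      span.finish(); // finish the span even if the poll throws
      lastSpan = span;
    }
  }
}
```

Any work done inside the poll, such as a schema registry lookup during deserialization, would then fall inside the span's duration.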

Motivation

Track schemas on Kafka consume (follow-up to #6865)

Additional Notes

Jira ticket: [PROJ-IDENT]

pr-commenter bot commented Apr 30, 2024

Kafka / producer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master piotr-wolski/instrument-poll
git_commit_date 1715194247 1715194621
git_commit_sha 2b99f74 1252812
See matching parameters
Baseline Candidate
ci_job_date 1715276757 1715276757
ci_job_id 507533076 507533076
ci_pipeline_id 33896875 33896875
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.21 11.0.21
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.21+9-post-Ubuntu-0ubuntu122.04 11.0.21+9-post-Ubuntu-0ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
| scenario | Δ mean throughput |
|---|---|
| scenario:not-instrumented/KafkaProduceBenchmark.benchProduce | same |
| scenario:only-tracing-dsm-disabled-benchmarks/KafkaProduceBenchmark.benchProduce | same |
| scenario:only-tracing-dsm-enabled-benchmarks/KafkaProduceBenchmark.benchProduce | same |

pr-commenter bot commented Apr 30, 2024

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master piotr-wolski/instrument-poll
git_commit_date 1715269097 1715194621
git_commit_sha e1d7174 1252812
release_version 1.35.0-SNAPSHOT~e1d717457f 1.35.0-SNAPSHOT~125281264b
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1715278085 1715278085
ci_job_id 507533072 507533072
ci_pipeline_id 33896875 33896875
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 49 metrics, 14 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.076 s) : 0, 1075935
Total [baseline] (8.544 s) : 0, 8544360
Agent [candidate] (1.077 s) : 0, 1076911
Total [candidate] (8.529 s) : 0, 8528643
section iast
Agent [baseline] (1.201 s) : 0, 1201120
Total [baseline] (8.993 s) : 0, 8993040
Agent [candidate] (1.204 s) : 0, 1203744
Total [candidate] (9.028 s) : 0, 9028443
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.201 s) : 0, 1201025
Total [baseline] (9.004 s) : 0, 9004428
Agent [candidate] (1.216 s) : 0, 1216137
Total [candidate] (9.004 s) : 0, 9004316
section iast_TELEMETRY_OFF
Agent [baseline] (1.195 s) : 0, 1195254
Total [baseline] (8.983 s) : 0, 8982939
Agent [candidate] (1.197 s) : 0, 1197043
Total [candidate] (9.024 s) : 0, 9023893
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.076 s | - |
| Agent | iast | 1.201 s | 125.185 ms (11.6%) |
| Agent | iast_HARDCODED_SECRET_DISABLED | 1.201 s | 125.091 ms (11.6%) |
| Agent | iast_TELEMETRY_OFF | 1.195 s | 119.319 ms (11.1%) |
| Total | tracing | 8.544 s | - |
| Total | iast | 8.993 s | 448.68 ms (5.3%) |
| Total | iast_HARDCODED_SECRET_DISABLED | 9.004 s | 460.068 ms (5.4%) |
| Total | iast_TELEMETRY_OFF | 8.983 s | 438.578 ms (5.1%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.077 s | - |
| Agent | iast | 1.204 s | 126.833 ms (11.8%) |
| Agent | iast_HARDCODED_SECRET_DISABLED | 1.216 s | 139.227 ms (12.9%) |
| Agent | iast_TELEMETRY_OFF | 1.197 s | 120.132 ms (11.2%) |
| Total | tracing | 8.529 s | - |
| Total | iast | 9.028 s | 499.8 ms (5.9%) |
| Total | iast_HARDCODED_SECRET_DISABLED | 9.004 s | 475.673 ms (5.6%) |
| Total | iast_TELEMETRY_OFF | 9.024 s | 495.25 ms (5.8%) |

gantt
    title insecure-bank - break down per module: candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (673.666 ms) : 0, 673666
BytebuddyAgent [candidate] (673.652 ms) : 0, 673652
GlobalTracer [baseline] (310.308 ms) : 0, 310308
GlobalTracer [candidate] (311.018 ms) : 0, 311018
AppSec [baseline] (49.413 ms) : 0, 49413
AppSec [candidate] (49.551 ms) : 0, 49551
Remote Config [baseline] (656.968 µs) : 0, 657
Remote Config [candidate] (664.296 µs) : 0, 664
Telemetry [baseline] (7.576 ms) : 0, 7576
Telemetry [candidate] (7.636 ms) : 0, 7636
section iast
BytebuddyAgent [baseline] (794.234 ms) : 0, 794234
BytebuddyAgent [candidate] (795.553 ms) : 0, 795553
GlobalTracer [baseline] (290.597 ms) : 0, 290597
GlobalTracer [candidate] (291.768 ms) : 0, 291768
AppSec [baseline] (51.276 ms) : 0, 51276
AppSec [candidate] (49.565 ms) : 0, 49565
IAST [baseline] (23.491 ms) : 0, 23491
IAST [candidate] (24.464 ms) : 0, 24464
Remote Config [baseline] (588.827 µs) : 0, 589
Remote Config [candidate] (590.937 µs) : 0, 591
Telemetry [baseline] (6.601 ms) : 0, 6601
Telemetry [candidate] (7.374 ms) : 0, 7374
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (793.293 ms) : 0, 793293
BytebuddyAgent [candidate] (804.533 ms) : 0, 804533
GlobalTracer [baseline] (290.742 ms) : 0, 290742
GlobalTracer [candidate] (294.273 ms) : 0, 294273
AppSec [baseline] (50.914 ms) : 0, 50914
AppSec [candidate] (51.351 ms) : 0, 51351
IAST [baseline] (23.768 ms) : 0, 23768
IAST [candidate] (23.829 ms) : 0, 23829
Remote Config [baseline] (1.313 ms) : 0, 1313
Remote Config [candidate] (601.175 µs) : 0, 601
Telemetry [baseline] (6.704 ms) : 0, 6704
Telemetry [candidate] (6.69 ms) : 0, 6690
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (789.159 ms) : 0, 789159
BytebuddyAgent [candidate] (790.431 ms) : 0, 790431
GlobalTracer [baseline] (289.896 ms) : 0, 289896
GlobalTracer [candidate] (290.277 ms) : 0, 290277
AppSec [baseline] (48.229 ms) : 0, 48229
AppSec [candidate] (51.851 ms) : 0, 51851
IAST [baseline] (26.061 ms) : 0, 26061
IAST [candidate] (23.067 ms) : 0, 23067
Remote Config [baseline] (589.774 µs) : 0, 590
Remote Config [candidate] (609.514 µs) : 0, 610
Telemetry [baseline] (7.167 ms) : 0, 7167
Telemetry [candidate] (6.535 ms) : 0, 6535
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.084 s) : 0, 1083782
Total [baseline] (10.401 s) : 0, 10401076
Agent [candidate] (1.076 s) : 0, 1076412
Total [candidate] (10.371 s) : 0, 10371437
section appsec
Agent [baseline] (1.191 s) : 0, 1191152
Total [baseline] (10.432 s) : 0, 10432267
Agent [candidate] (1.194 s) : 0, 1194393
Total [candidate] (10.453 s) : 0, 10453304
section iast
Agent [baseline] (1.2 s) : 0, 1200492
Total [baseline] (10.735 s) : 0, 10734689
Agent [candidate] (1.21 s) : 0, 1209731
Total [candidate] (10.736 s) : 0, 10735918
section profiling
Agent [baseline] (1.279 s) : 0, 1279395
Total [baseline] (10.689 s) : 0, 10688726
Agent [candidate] (1.27 s) : 0, 1269917
Total [candidate] (10.623 s) : 0, 10622518
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.084 s | - |
| Agent | appsec | 1.191 s | 107.371 ms (9.9%) |
| Agent | iast | 1.2 s | 116.71 ms (10.8%) |
| Agent | profiling | 1.279 s | 195.613 ms (18.0%) |
| Total | tracing | 10.401 s | - |
| Total | appsec | 10.432 s | 31.191 ms (0.3%) |
| Total | iast | 10.735 s | 333.614 ms (3.2%) |
| Total | profiling | 10.689 s | 287.65 ms (2.8%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.076 s | - |
| Agent | appsec | 1.194 s | 117.981 ms (11.0%) |
| Agent | iast | 1.21 s | 133.318 ms (12.4%) |
| Agent | profiling | 1.27 s | 193.505 ms (18.0%) |
| Total | tracing | 10.371 s | - |
| Total | appsec | 10.453 s | 81.867 ms (0.8%) |
| Total | iast | 10.736 s | 364.481 ms (3.5%) |
| Total | profiling | 10.623 s | 251.081 ms (2.4%) |

gantt
    title petclinic - break down per module: candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (678.517 ms) : 0, 678517
BytebuddyAgent [candidate] (674.084 ms) : 0, 674084
GlobalTracer [baseline] (312.83 ms) : 0, 312830
GlobalTracer [candidate] (310.165 ms) : 0, 310165
AppSec [baseline] (49.733 ms) : 0, 49733
AppSec [candidate] (49.479 ms) : 0, 49479
Remote Config [baseline] (662.083 µs) : 0, 662
Remote Config [candidate] (670.757 µs) : 0, 671
Telemetry [baseline] (7.559 ms) : 0, 7559
Telemetry [candidate] (7.623 ms) : 0, 7623
section appsec
BytebuddyAgent [baseline] (695.082 ms) : 0, 695082
BytebuddyAgent [candidate] (698.02 ms) : 0, 698020
GlobalTracer [baseline] (293.587 ms) : 0, 293587
GlobalTracer [candidate] (294.691 ms) : 0, 294691
AppSec [baseline] (149.097 ms) : 0, 149097
AppSec [candidate] (149.056 ms) : 0, 149056
Remote Config [baseline] (616.433 µs) : 0, 616
Remote Config [candidate] (621.569 µs) : 0, 622
Telemetry [baseline] (8.825 ms) : 0, 8825
Telemetry [candidate] (7.85 ms) : 0, 7850
IAST [baseline] (19.287 ms) : 0, 19287
IAST [candidate] (19.342 ms) : 0, 19342
section iast
BytebuddyAgent [baseline] (793.436 ms) : 0, 793436
BytebuddyAgent [candidate] (799.643 ms) : 0, 799643
GlobalTracer [baseline] (291.263 ms) : 0, 291263
GlobalTracer [candidate] (293.004 ms) : 0, 293004
AppSec [baseline] (47.068 ms) : 0, 47068
AppSec [candidate] (50.1 ms) : 0, 50100
Remote Config [baseline] (2.05 ms) : 0, 2050
Remote Config [candidate] (591.479 µs) : 0, 591
Telemetry [baseline] (7.315 ms) : 0, 7315
Telemetry [candidate] (6.644 ms) : 0, 6644
IAST [baseline] (25.167 ms) : 0, 25167
IAST [candidate] (25.191 ms) : 0, 25191
section profiling
ProfilingAgent [baseline] (95.946 ms) : 0, 95946
ProfilingAgent [candidate] (95.805 ms) : 0, 95805
BytebuddyAgent [baseline] (683.45 ms) : 0, 683450
BytebuddyAgent [candidate] (678.169 ms) : 0, 678169
GlobalTracer [baseline] (384.083 ms) : 0, 384083
GlobalTracer [candidate] (381.298 ms) : 0, 381298
AppSec [baseline] (50.848 ms) : 0, 50848
AppSec [candidate] (50.04 ms) : 0, 50040
Remote Config [baseline] (710.82 µs) : 0, 711
Remote Config [candidate] (714.379 µs) : 0, 714
Telemetry [baseline] (7.512 ms) : 0, 7512
Telemetry [candidate] (7.494 ms) : 0, 7494
Profiling [baseline] (95.97 ms) : 0, 95970
Profiling [candidate] (95.831 ms) : 0, 95831

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2024-05-09T17:38:52 2024-05-09T17:45:42
git_branch master piotr-wolski/instrument-poll
git_commit_date 1715269097 1715194621
git_commit_sha e1d7174 1252812
release_version 1.35.0-SNAPSHOT~e1d717457f 1.35.0-SNAPSHOT~125281264b
start_time 2024-05-09T17:38:39 2024-05-09T17:45:28
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1715277088 1715277088
ci_job_id 507533073 507533073
ci_pipeline_id 33896875 33896875
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 16 unstable metrics.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.344 ms) : 1325, 1363
.   : milestone, 1344,
appsec (1.717 ms) : 1693, 1741
.   : milestone, 1717,
appsec_no_iast (1.7 ms) : 1676, 1724
.   : milestone, 1700,
iast (1.486 ms) : 1463, 1509
.   : milestone, 1486,
profiling (1.544 ms) : 1518, 1569
.   : milestone, 1544,
tracing (1.488 ms) : 1464, 1512
.   : milestone, 1488,
section candidate
no_agent (1.353 ms) : 1333, 1373
.   : milestone, 1353,
appsec (1.742 ms) : 1718, 1765
.   : milestone, 1742,
appsec_no_iast (1.725 ms) : 1700, 1749
.   : milestone, 1725,
iast (1.495 ms) : 1472, 1517
.   : milestone, 1495,
profiling (1.497 ms) : 1473, 1522
.   : milestone, 1497,
tracing (1.496 ms) : 1472, 1520
.   : milestone, 1496,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.344 ms [1.325 ms, 1.363 ms] | - |
| appsec | 1.717 ms [1.693 ms, 1.741 ms] | 373.216 µs (27.8%) |
| appsec_no_iast | 1.7 ms [1.676 ms, 1.724 ms] | 356.097 µs (26.5%) |
| iast | 1.486 ms [1.463 ms, 1.509 ms] | 141.978 µs (10.6%) |
| profiling | 1.544 ms [1.518 ms, 1.569 ms] | 199.511 µs (14.8%) |
| tracing | 1.488 ms [1.464 ms, 1.512 ms] | 143.801 µs (10.7%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.353 ms [1.333 ms, 1.373 ms] | - |
| appsec | 1.742 ms [1.718 ms, 1.765 ms] | 388.607 µs (28.7%) |
| appsec_no_iast | 1.725 ms [1.7 ms, 1.749 ms] | 371.786 µs (27.5%) |
| iast | 1.495 ms [1.472 ms, 1.517 ms] | 141.863 µs (10.5%) |
| profiling | 1.497 ms [1.473 ms, 1.522 ms] | 144.507 µs (10.7%) |
| tracing | 1.496 ms [1.472 ms, 1.52 ms] | 142.822 µs (10.6%) |

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f
    dateFormat X
    axisFormat %s
section baseline
no_agent (366.565 µs) : 347, 387
.   : milestone, 367,
iast (470.996 µs) : 450, 492
.   : milestone, 471,
iast_FULL (536.868 µs) : 516, 557
.   : milestone, 537,
iast_GLOBAL (500.776 µs) : 479, 523
.   : milestone, 501,
iast_HARDCODED_SECRET_DISABLED (472.189 µs) : 452, 493
.   : milestone, 472,
iast_INACTIVE (453.358 µs) : 432, 475
.   : milestone, 453,
iast_TELEMETRY_OFF (471.947 µs) : 450, 494
.   : milestone, 472,
tracing (449.423 µs) : 428, 471
.   : milestone, 449,
section candidate
no_agent (367.263 µs) : 348, 387
.   : milestone, 367,
iast (477.423 µs) : 457, 498
.   : milestone, 477,
iast_FULL (547.414 µs) : 527, 568
.   : milestone, 547,
iast_GLOBAL (500.93 µs) : 480, 522
.   : milestone, 501,
iast_HARDCODED_SECRET_DISABLED (475.353 µs) : 454, 496
.   : milestone, 475,
iast_INACTIVE (459.777 µs) : 438, 481
.   : milestone, 460,
iast_TELEMETRY_OFF (476.941 µs) : 455, 499
.   : milestone, 477,
tracing (442.421 µs) : 422, 463
.   : milestone, 442,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 366.565 µs [346.569 µs, 386.561 µs] | - |
| iast | 470.996 µs [450.385 µs, 491.606 µs] | 104.431 µs (28.5%) |
| iast_FULL | 536.868 µs [516.32 µs, 557.415 µs] | 170.303 µs (46.5%) |
| iast_GLOBAL | 500.776 µs [479.032 µs, 522.519 µs] | 134.211 µs (36.6%) |
| iast_HARDCODED_SECRET_DISABLED | 472.189 µs [451.564 µs, 492.813 µs] | 105.624 µs (28.8%) |
| iast_INACTIVE | 453.358 µs [431.533 µs, 475.183 µs] | 86.793 µs (23.7%) |
| iast_TELEMETRY_OFF | 471.947 µs [450.161 µs, 493.732 µs] | 105.382 µs (28.7%) |
| tracing | 449.423 µs [428.17 µs, 470.676 µs] | 82.858 µs (22.6%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 367.263 µs [347.514 µs, 387.013 µs] | - |
| iast | 477.423 µs [457.007 µs, 497.84 µs] | 110.16 µs (30.0%) |
| iast_FULL | 547.414 µs [526.843 µs, 567.986 µs] | 180.151 µs (49.1%) |
| iast_GLOBAL | 500.93 µs [479.656 µs, 522.204 µs] | 133.667 µs (36.4%) |
| iast_HARDCODED_SECRET_DISABLED | 475.353 µs [454.435 µs, 496.272 µs] | 108.09 µs (29.4%) |
| iast_INACTIVE | 459.777 µs [438.264 µs, 481.29 µs] | 92.514 µs (25.2%) |
| iast_TELEMETRY_OFF | 476.941 µs [455.36 µs, 498.522 µs] | 109.678 µs (29.9%) |
| tracing | 442.421 µs [422.076 µs, 462.766 µs] | 75.158 µs (20.5%) |

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master piotr-wolski/instrument-poll
git_commit_date 1715269097 1715194621
git_commit_sha e1d7174 1252812
release_version 1.35.0-SNAPSHOT~e1d717457f 1.35.0-SNAPSHOT~125281264b
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1715277588 1715277588
ci_job_id 507533074 507533074
ci_pipeline_id 33896875 33896875
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant appsec appsec

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.105 s) : 15105000, 15105000
.   : milestone, 15105000,
appsec (15.212 s) : 15212000, 15212000
.   : milestone, 15212000,
iast (19.004 s) : 19004000, 19004000
.   : milestone, 19004000,
iast_GLOBAL (17.647 s) : 17647000, 17647000
.   : milestone, 17647000,
profiling (15.783 s) : 15783000, 15783000
.   : milestone, 15783000,
tracing (14.926 s) : 14926000, 14926000
.   : milestone, 14926000,
section candidate
no_agent (15.071 s) : 15071000, 15071000
.   : milestone, 15071000,
appsec (15.223 s) : 15223000, 15223000
.   : milestone, 15223000,
iast (18.763 s) : 18763000, 18763000
.   : milestone, 18763000,
iast_GLOBAL (18.083 s) : 18083000, 18083000
.   : milestone, 18083000,
profiling (15.556 s) : 15556000, 15556000
.   : milestone, 15556000,
tracing (14.712 s) : 14712000, 14712000
.   : milestone, 14712000,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.105 s [15.105 s, 15.105 s] | - |
| appsec | 15.212 s [15.212 s, 15.212 s] | 107.0 ms (0.7%) |
| iast | 19.004 s [19.004 s, 19.004 s] | 3.899 s (25.8%) |
| iast_GLOBAL | 17.647 s [17.647 s, 17.647 s] | 2.542 s (16.8%) |
| profiling | 15.783 s [15.783 s, 15.783 s] | 678.0 ms (4.5%) |
| tracing | 14.926 s [14.926 s, 14.926 s] | -179.0 ms (-1.2%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.071 s [15.071 s, 15.071 s] | - |
| appsec | 15.223 s [15.223 s, 15.223 s] | 152.0 ms (1.0%) |
| iast | 18.763 s [18.763 s, 18.763 s] | 3.692 s (24.5%) |
| iast_GLOBAL | 18.083 s [18.083 s, 18.083 s] | 3.012 s (20.0%) |
| profiling | 15.556 s [15.556 s, 15.556 s] | 485.0 ms (3.2%) |
| tracing | 14.712 s [14.712 s, 14.712 s] | -359.0 ms (-2.4%) |

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.35.0-SNAPSHOT~125281264b, baseline=1.35.0-SNAPSHOT~e1d717457f
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.463 ms) : 1451, 1474
.   : milestone, 1463,
appsec (2.197 ms) : 2164, 2231
.   : milestone, 2197,
iast (1.884 ms) : 1848, 1920
.   : milestone, 1884,
iast_GLOBAL (1.916 ms) : 1881, 1952
.   : milestone, 1916,
profiling (1.849 ms) : 1815, 1883
.   : milestone, 1849,
tracing (1.831 ms) : 1799, 1864
.   : milestone, 1831,
section candidate
no_agent (1.458 ms) : 1447, 1470
.   : milestone, 1458,
appsec (2.199 ms) : 2165, 2233
.   : milestone, 2199,
iast (1.873 ms) : 1838, 1908
.   : milestone, 1873,
iast_GLOBAL (1.909 ms) : 1874, 1944
.   : milestone, 1909,
profiling (1.85 ms) : 1816, 1884
.   : milestone, 1850,
tracing (1.839 ms) : 1806, 1871
.   : milestone, 1839,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.463 ms [1.451 ms, 1.474 ms] | - |
| appsec | 2.197 ms [2.164 ms, 2.231 ms] | 734.583 µs (50.2%) |
| iast | 1.884 ms [1.848 ms, 1.92 ms] | 421.194 µs (28.8%) |
| iast_GLOBAL | 1.916 ms [1.881 ms, 1.952 ms] | 453.72 µs (31.0%) |
| profiling | 1.849 ms [1.815 ms, 1.883 ms] | 386.38 µs (26.4%) |
| tracing | 1.831 ms [1.799 ms, 1.864 ms] | 368.557 µs (25.2%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.458 ms [1.447 ms, 1.47 ms] | - |
| appsec | 2.199 ms [2.165 ms, 2.233 ms] | 740.602 µs (50.8%) |
| iast | 1.873 ms [1.838 ms, 1.908 ms] | 414.609 µs (28.4%) |
| iast_GLOBAL | 1.909 ms [1.874 ms, 1.944 ms] | 450.713 µs (30.9%) |
| profiling | 1.85 ms [1.816 ms, 1.884 ms] | 391.979 µs (26.9%) |
| tracing | 1.839 ms [1.806 ms, 1.871 ms] | 380.329 µs (26.1%) |

pr-commenter bot commented Apr 30, 2024

Kafka / consumer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master piotr-wolski/instrument-poll
git_commit_date 1715194247 1715194621
git_commit_sha 2b99f74 1252812
See matching parameters
Baseline Candidate
ci_job_date 1715276797 1715276797
ci_job_id 507533077 507533077
ci_pipeline_id 33896875 33896875
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.21 11.0.21
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.21+9-post-Ubuntu-0ubuntu122.04 11.0.21+9-post-Ubuntu-0ubuntu122.04

Summary

Found 0 performance improvements and 1 performance regression! Performance is the same for 2 metrics, 0 unstable metrics.

| scenario | Δ mean throughput |
|---|---|
| scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume | worse: [-155724.384op/s; -148641.858op/s] or [-50.347%; -48.057%] |

See unchanged results

| scenario | Δ mean throughput |
|---|---|
| scenario:not-instrumented/KafkaConsumerBenchmark.benchConsume | unsure: [-9451.022op/s; -1863.661op/s] or [-3.174%; -0.626%] |
| scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume | see regression above |
| scenario:only-tracing-dsm-disabled-benchmarks/KafkaConsumerBenchmark.benchConsume | same |
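
To put the reported regression in absolute terms, the baseline throughput of the DSM-enabled consume scenario can be backed out of the interval above by dividing each absolute delta by its matching relative delta. The class and method names below are hypothetical. Pairing the lower bounds with each other and the upper bounds with each other is an assumption about how the report lines up its two intervals.

```java
final class ThroughputDeltaSketch {
  // Baseline throughput implied by an absolute delta (op/s) and the matching relative delta (percent).
  static double impliedBaseline(double deltaOpsPerSec, double deltaPercent) {
    return deltaOpsPerSec / (deltaPercent / 100.0);
  }

  public static void main(String[] args) {
    // Values copied from the DSM-enabled benchConsume row of the report.
    double fromLowerBound = impliedBaseline(-155724.384, -50.347);
    double fromUpperBound = impliedBaseline(-148641.858, -48.057);
    System.out.printf("implied baseline: %.0f op/s and %.0f op/s%n", fromLowerBound, fromUpperBound);
  }
}
```

Both bounds imply a baseline of roughly 309 thousand op/s, so the candidate consumes at about half the baseline rate in this benchmark when DSM is enabled.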

piochelepiotr force-pushed the piotr-wolski/instrument-poll branch 3 times, most recently from cdee6af to c827943 (April 30, 2024 14:57)
piochelepiotr marked this pull request as ready for review (April 30, 2024 14:57)
piochelepiotr requested a review from a team as a code owner (April 30, 2024 14:57)
piochelepiotr force-pushed the piotr-wolski/instrument-poll branch 4 times, most recently from 313020c to 1b61edf (April 30, 2024 19:31)
@@ -130,10 +136,28 @@ public static void muzzleCheck(ConsumerRecord record) {
   * KafkaConsumer class.
   */
  public static class RecordsAdvice {
    @Advice.OnMethodEnter(suppress = Throwable.class)
    public static AgentScope onEnter() {
      if (Config.get().isDataStreamsEnabled()) {
Contributor

Extra spans have the potential to create noisy traces.
But I guess this is okay, since it is under "data streams enabled".

Collaborator

You can also check whether there is an active span and take the DSM config from it, since Config.get() won't give you access to the dynamic config.

Contributor Author

Sounds good, I updated the code.
Note that this code path will probably not trigger today: the reason I'm adding this span is that there is no active span to attach the schema to.
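
The suggestion in this thread amounts to a lookup with a fallback. Here is a minimal sketch, assuming a hypothetical `ActiveSpan` type that carries the (possibly dynamically updated) DSM setting. It is not the actual dd-trace-java API, and the names are illustrative only.

```java
final class DsmConfigSketch {
  // Hypothetical view of an active span that carries per-trace configuration.
  static final class ActiveSpan {
    final boolean dataStreamsEnabled;

    ActiveSpan(boolean dataStreamsEnabled) {
      this.dataStreamsEnabled = dataStreamsEnabled;
    }
  }

  // Prefer the setting carried by the active span; fall back to the static default when there is none.
  static boolean isDataStreamsEnabled(ActiveSpan activeSpan, boolean staticDefault) {
    return activeSpan != null ? activeSpan.dataStreamsEnabled : staticDefault;
  }
}
```

As noted in the reply above, a poll usually has no active span, so in practice the fallback branch is the one expected to run.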

if (records == null) {
  if (scope != null) {
Contributor

This feels a bit messy to me. On first read, it looked like we were closing the scope and span twice.
I see that one part handles the null-records case and the other handles the case where we have records, but I think this can be written more cleanly.

Contributor Author

I updated the code, please let me know what you think of the new version.
The tricky part is that both records & scope can be null, but in both cases, I still want to take an action.
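
One way to picture the constraint described here, with records and scope each possibly null, is to treat the two concerns independently and close the scope at a single point. This is only a sketch of that shape, not the merged code. `Scope`, `onExit`, and the `log` list are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PollExitSketch {
  // Hypothetical scope; in this sketch, closing it stands for closing the scope and finishing its span.
  static final class Scope implements AutoCloseable {
    boolean closed;

    @Override
    public void close() {
      closed = true;
    }
  }

  // Records which actions ran, so the sketch can be inspected.
  static final List<String> log = new ArrayList<>();

  // records and scope may each be null, independently of the other.
  static void onExit(Scope scope, List<String> records, String consumerInfo) {
    try {
      if (records != null) {
        // Attach consumer info whether or not a span was opened for this poll.
        log.add("attach:" + consumerInfo);
      }
    } finally {
      if (scope != null) {
        scope.close(); // single close point, so nothing is closed twice
      }
    }
  }
}
```

With this shape, all four combinations of null and non-null inputs go through the same two checks.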

}

// TraceID, start times & names changed based on the configuration, so overriding the sort to give consistent test results
private static class SortKafkaTraces implements Comparator<List<DDSpan>> {
Contributor

Are the usual test sorting mechanisms not sufficient?

Contributor Author

Unfortunately, it doesn't look like it 😢
  • Trace IDs are inconsistent because a poll can start before the produce (and finish after it).
  • Names don't work because the Kafka consume span has different names depending on the configuration.
  • Start times don't work either, for the same reason as trace IDs.
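
For illustration only, a comparator that avoids trace IDs, start times, and configuration-dependent names could order traces by span count and then by some attribute that stays stable across runs. The `kind` field below is a hypothetical stand-in for such an attribute. This is not what `SortKafkaTraces` does in the PR, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class TraceSortSketch {
  // Hypothetical span stub; "kind" stands in for any attribute that is stable across configurations.
  static final class SpanStub {
    final long traceId; // differs between runs, so not used for ordering
    final long startNanos; // differs between runs, so not used for ordering
    final String kind; // e.g. "produce", "consume", "poll"

    SpanStub(long traceId, long startNanos, String kind) {
      this.traceId = traceId;
      this.startNanos = startNanos;
      this.kind = kind;
    }
  }

  // Builds a key from the sorted span kinds, independent of span order inside the trace.
  static String stableKey(List<SpanStub> trace) {
    return trace.stream().map(s -> s.kind).sorted().collect(Collectors.joining(","));
  }

  // Orders traces by span count first, then by the stable key.
  static final Comparator<List<SpanStub>> STABLE_ORDER =
      Comparator.<List<SpanStub>>comparingInt(List::size)
          .thenComparing(TraceSortSketch::stableKey);

  // Returns a sorted copy, leaving the input list untouched.
  static List<List<SpanStub>> sorted(List<List<SpanStub>> traces) {
    List<List<SpanStub>> copy = new ArrayList<>(traces);
    copy.sort(STABLE_ORDER);
    return copy;
  }
}
```

Two traces with the same size and the same set of kinds would still compare as equal here, so a real test would need a further tie-breaker.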

if (records == null) {
  if (scope != null) {
    AgentSpan span = scope.span();
    if (span != null) {
Contributor

Is this scope span null check needed at all?

Contributor Author

You're right, if the scope is not null, there must be a span (as far as I understand).

@@ -142,6 +166,15 @@ public static void captureGroup(
        InstrumentationContext.get(ConsumerRecords.class, KafkaConsumerInfo.class)
            .put(records, kafkaConsumerInfo);
      }
      if (scope == null) {
Contributor

Maybe do this scope null check once at the beginning of the method?

Contributor Author

I updated the code a bit. The hard part is that I want to attach the Kafka consumer info to the records even if the scope is null.

piochelepiotr requested a review from ygree (May 8, 2024 20:33)
Contributor

ygree left a comment

LGTM

piochelepiotr merged commit 6df14c1 into master (May 9, 2024)
82 checks passed
piochelepiotr deleted the piotr-wolski/instrument-poll branch (May 9, 2024 18:32)
github-actions bot added this to the 1.35.0 milestone (May 9, 2024)
amarziali added the "inst: kafka" (Kafka instrumentation) and "comp: data streams" (Data Streams Monitoring) labels (Jun 5, 2024)
PerfectSlayer changed the title from "kafka: Add poll span when DSM is enabled" to "Add Kafka poll span when DSM is enabled" (Jun 10, 2024)
Labels
comp: data streams (Data Streams Monitoring), inst: kafka (Kafka instrumentation)
4 participants