Try to locate tracer JAR using class loader resource lookup when bootstrapping the agent #6126


@nikita-tkachenko-datadog nikita-tkachenko-datadog commented Oct 31, 2023

What Does This Do

Updates the agent bootstrap logic to try to locate the tracer JAR by looking up the AgentBootstrap.class resource through the class's class loader.
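
As a rough illustration of the idea (not the code from this PR), locating a JAR through a class resource lookup could look like the sketch below. The class name `JarLocator`, its method names, and the example paths are hypothetical; only the standard `ClassLoader.getResource` / `ClassLoader.getSystemResource` calls are assumed.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, for illustration only; not the tracer's actual implementation.
final class JarLocator {

  /** Returns the JAR (or class directory) the given class was loaded from, or null if unknown. */
  static File locate(Class<?> type) {
    String resourceName = type.getName().replace('.', '/') + ".class";
    ClassLoader loader = type.getClassLoader();
    // A null class loader means the class was loaded by the bootstrap class loader.
    URL url = loader != null
        ? loader.getResource(resourceName)
        : ClassLoader.getSystemResource(resourceName);
    return url != null ? fromResourceUrl(url.toString(), resourceName) : null;
  }

  /** Extracts the container location from a resource URL such as jar:file:/path/agent.jar!/a/B.class. */
  static File fromResourceUrl(String url, String resourceName) {
    try {
      if (url.startsWith("jar:file:")) {
        int separator = url.indexOf("!/");
        if (separator < 0) {
          return null;
        }
        // Drop the "jar:" prefix and everything from "!/" onwards.
        return new File(URI.create(url.substring("jar:".length(), separator)));
      }
      if (url.startsWith("file:") && url.endsWith(resourceName)) {
        // Class loaded from a plain directory: strip the resource name to get its root.
        return new File(URI.create(url.substring(0, url.length() - resourceName.length())));
      }
    } catch (IllegalArgumentException e) {
      // Malformed or non-hierarchical URI: treat the location as unknown.
    }
    return null;
  }
}
```

A caller would typically try this only after the cheaper lookups fail, and treat a `null` result as "location unknown" rather than as an error.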

Motivation

Currently, the lookup is done using ProtectionDomain/CodeSource and, if that approach fails, by parsing the -javaagent argument.
However, the second approach may also fail if multiple agents are specified (which is possible, for example, in CI Visibility when a test JVM is forked with both the tracer and the Jacoco agent).
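
To make the ambiguity concrete, here is a minimal sketch of the two existing strategies as described above. It is an assumption-laden illustration, not the project's code: `JavaAgentArgs` and its methods are hypothetical names, and the agent paths in the usage note are made up.

```java
import java.net.URL;
import java.security.CodeSource;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class JavaAgentArgs {

  /** First strategy: ask the class's protection domain; returns null if the code source is missing. */
  static URL codeSourceLocation(Class<?> type) {
    ProtectionDomain domain = type.getProtectionDomain();
    CodeSource source = domain != null ? domain.getCodeSource() : null;
    return source != null ? source.getLocation() : null;
  }

  /** Second strategy: collect the JAR path of every -javaagent argument. */
  static List<String> agentJars(List<String> jvmArgs) {
    List<String> jars = new ArrayList<>();
    for (String arg : jvmArgs) {
      if (arg.startsWith("-javaagent:")) {
        String spec = arg.substring("-javaagent:".length());
        // Agent options follow the first '='; this simplification assumes the path has no '='.
        int optionsStart = spec.indexOf('=');
        jars.add(optionsStart >= 0 ? spec.substring(0, optionsStart) : spec);
      }
    }
    return jars;
  }
}
```

With arguments such as `-javaagent:/opt/dd-java-agent.jar` and `-javaagent:/opt/jacocoagent.jar=destfile=jacoco.exec`, `agentJars` returns two candidates, and nothing in the arguments alone says which one is the tracer. In a running JVM the input list could come from `ManagementFactory.getRuntimeMXBean().getInputArguments()`.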

Jira ticket: CIVIS-7865

pr-commenter bot commented Oct 31, 2023

Benchmarks

Startup

Parameters

Baseline Candidate
commit 1.24.0-SNAPSHOT~3c1890cf33 1.23.0-SNAPSHOT~70d9234851
config baseline candidate
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 7 performance regressions! Performance is the same for 47 cases.

| scenario | Δ mean execution_time | candidate mean execution_time | baseline mean execution_time |
|---|---|---|---|
| scenario:insecure-bank:iast:Telemetry | worse: [+2.821ms; +9.354ms] or [+32.748%; +108.602%] | 14.701ms | 8.613ms |
| scenario:insecure-bank:tracing:Remote Config | worse: [+15.384µs; +61.217µs] or [+2.299%; +9.149%] | 707.445µs | 669.145µs |
| scenario:insecure-bank:tracing:Telemetry | worse: [+3.913ms; +4.339ms] or [+53.994%; +59.868%] | 11.374ms | 7.248ms |
| scenario:petclinic:iast:Telemetry | worse: [+2.719ms; +8.156ms] or [+38.133%; +114.378%] | 12.568ms | 7.130ms |
| scenario:petclinic:profiling:Telemetry | worse: [+3.781ms; +4.239ms] or [+50.954%; +57.121%] | 11.432ms | 7.421ms |
| scenario:petclinic:tracing:Remote Config | worse: [+19.797µs; +73.233µs] or [+2.983%; +11.035%] | 710.142µs | 663.627µs |
| scenario:petclinic:tracing:Telemetry | worse: [+4.105ms; +4.516ms] or [+57.346%; +63.082%] | 11.470ms | 7.159ms |

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.23.0-SNAPSHOT~70d9234851, baseline=1.24.0-SNAPSHOT~3c1890cf33

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.033 s) : 0, 1032690
Total [baseline] (9.348 s) : 0, 9347854
Agent [candidate] (1.046 s) : 0, 1046431
Total [candidate] (9.338 s) : 0, 9337735
section appsec
Agent [baseline] (1.129 s) : 0, 1128761
Total [baseline] (9.384 s) : 0, 9383707
Agent [candidate] (1.119 s) : 0, 1119486
Total [candidate] (9.43 s) : 0, 9429543
section iast
Agent [baseline] (1.148 s) : 0, 1147917
Total [baseline] (9.55 s) : 0, 9550137
Agent [candidate] (1.154 s) : 0, 1153916
Total [candidate] (9.573 s) : 0, 9573027
section profiling
Agent [baseline] (1.218 s) : 0, 1217737
Total [baseline] (9.505 s) : 0, 9504862
Agent [candidate] (1.229 s) : 0, 1229214
Total [candidate] (9.604 s) : 0, 9603708
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.033 s -
Agent appsec 1.129 s 96.072 ms (9.3%)
Agent iast 1.148 s 115.227 ms (11.2%)
Agent profiling 1.218 s 185.047 ms (17.9%)
Total tracing 9.348 s -
Total appsec 9.384 s 35.852 ms (0.4%)
Total iast 9.55 s 202.283 ms (2.2%)
Total profiling 9.505 s 157.008 ms (1.7%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.046 s -
Agent appsec 1.119 s 73.054 ms (7.0%)
Agent iast 1.154 s 107.485 ms (10.3%)
Agent profiling 1.229 s 182.782 ms (17.5%)
Total tracing 9.338 s -
Total appsec 9.43 s 91.808 ms (1.0%)
Total iast 9.573 s 235.292 ms (2.5%)
Total profiling 9.604 s 265.972 ms (2.8%)
gantt
    title petclinic - break down per module: candidate=1.23.0-SNAPSHOT~70d9234851, baseline=1.24.0-SNAPSHOT~3c1890cf33

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (646.961 ms) : 0, 646961
BytebuddyAgent [candidate] (652.411 ms) : 0, 652411
GlobalTracer [baseline] (294.533 ms) : 0, 294533
GlobalTracer [candidate] (297.704 ms) : 0, 297704
AppSec [baseline] (48.863 ms) : 0, 48863
AppSec [candidate] (49.383 ms) : 0, 49383
Remote Config [baseline] (663.627 µs) : 0, 664
Remote Config [candidate] (710.142 µs) : 0, 710
Telemetry [baseline] (7.159 ms) : 0, 7159
Telemetry [candidate] (11.47 ms) : 0, 11470
section appsec
BytebuddyAgent [baseline] (651.335 ms) : 0, 651335
BytebuddyAgent [candidate] (645.653 ms) : 0, 645653
GlobalTracer [baseline] (296.61 ms) : 0, 296610
GlobalTracer [candidate] (294.042 ms) : 0, 294042
AppSec [baseline] (138.787 ms) : 0, 138787
AppSec [candidate] (137.932 ms) : 0, 137932
Remote Config [baseline] (653.382 µs) : 0, 653
Remote Config [candidate] (644.057 µs) : 0, 644
Telemetry [baseline] (6.789 ms) : 0, 6789
Telemetry [candidate] (6.872 ms) : 0, 6872
section iast
BytebuddyAgent [baseline] (767.862 ms) : 0, 767862
BytebuddyAgent [candidate] (766.764 ms) : 0, 766764
GlobalTracer [baseline] (275.022 ms) : 0, 275022
GlobalTracer [candidate] (274.951 ms) : 0, 274951
AppSec [baseline] (46.17 ms) : 0, 46170
AppSec [candidate] (46.993 ms) : 0, 46993
Remote Config [baseline] (558.69 µs) : 0, 559
Remote Config [candidate] (577.958 µs) : 0, 578
Telemetry [baseline] (7.13 ms) : 0, 7130
Telemetry [candidate] (12.568 ms) : 0, 12568
IAST [baseline] (16.673 ms) : 0, 16673
IAST [candidate] (17.604 ms) : 0, 17604
section profiling
ProfilingAgent [baseline] (87.996 ms) : 0, 87996
ProfilingAgent [candidate] (88.65 ms) : 0, 88650
BytebuddyAgent [baseline] (658.414 ms) : 0, 658414
BytebuddyAgent [candidate] (663.498 ms) : 0, 663498
GlobalTracer [baseline] (359.931 ms) : 0, 359931
GlobalTracer [candidate] (360.87 ms) : 0, 360870
AppSec [baseline] (49.009 ms) : 0, 49009
AppSec [candidate] (49.415 ms) : 0, 49415
Remote Config [baseline] (641.618 µs) : 0, 642
Remote Config [candidate] (653.776 µs) : 0, 654
Telemetry [baseline] (7.421 ms) : 0, 7421
Telemetry [candidate] (11.432 ms) : 0, 11432
Profiling [baseline] (88.019 ms) : 0, 88019
Profiling [candidate] (88.675 ms) : 0, 88675
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.23.0-SNAPSHOT~70d9234851, baseline=1.24.0-SNAPSHOT~3c1890cf33

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.034 s) : 0, 1033533
Total [baseline] (8.786 s) : 0, 8785781
Agent [candidate] (1.045 s) : 0, 1044674
Total [candidate] (8.832 s) : 0, 8832290
section iast
Agent [baseline] (1.157 s) : 0, 1157000
Total [baseline] (9.321 s) : 0, 9321064
Agent [candidate] (1.154 s) : 0, 1153909
Total [candidate] (9.343 s) : 0, 9342696
section iast_TELEMETRY_OFF
Agent [baseline] (1.142 s) : 0, 1141704
Total [baseline] (9.289 s) : 0, 9289239
Agent [candidate] (1.144 s) : 0, 1143781
Total [candidate] (9.313 s) : 0, 9312801
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.034 s -
Agent iast 1.157 s 123.467 ms (11.9%)
Agent iast_TELEMETRY_OFF 1.142 s 108.171 ms (10.5%)
Total tracing 8.786 s -
Total iast 9.321 s 535.283 ms (6.1%)
Total iast_TELEMETRY_OFF 9.289 s 503.458 ms (5.7%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.045 s -
Agent iast 1.154 s 109.235 ms (10.5%)
Agent iast_TELEMETRY_OFF 1.144 s 99.107 ms (9.5%)
Total tracing 8.832 s -
Total iast 9.343 s 510.406 ms (5.8%)
Total iast_TELEMETRY_OFF 9.313 s 480.511 ms (5.4%)
gantt
    title insecure-bank - break down per module: candidate=1.23.0-SNAPSHOT~70d9234851, baseline=1.24.0-SNAPSHOT~3c1890cf33

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (646.922 ms) : 0, 646922
BytebuddyAgent [candidate] (651.483 ms) : 0, 651483
GlobalTracer [baseline] (295.145 ms) : 0, 295145
GlobalTracer [candidate] (297.138 ms) : 0, 297138
AppSec [baseline] (49.181 ms) : 0, 49181
AppSec [candidate] (49.194 ms) : 0, 49194
Remote Config [baseline] (669.145 µs) : 0, 669
Remote Config [candidate] (707.445 µs) : 0, 707
Telemetry [baseline] (7.248 ms) : 0, 7248
Telemetry [candidate] (11.374 ms) : 0, 11374
section iast
BytebuddyAgent [baseline] (773.312 ms) : 0, 773312
BytebuddyAgent [candidate] (766.706 ms) : 0, 766706
GlobalTracer [baseline] (276.987 ms) : 0, 276987
GlobalTracer [candidate] (274.347 ms) : 0, 274347
AppSec [baseline] (47.033 ms) : 0, 47033
AppSec [candidate] (46.988 ms) : 0, 46988
IAST [baseline] (15.753 ms) : 0, 15753
IAST [candidate] (16.113 ms) : 0, 16113
Remote Config [baseline] (575.947 µs) : 0, 576
Remote Config [candidate] (564.659 µs) : 0, 565
Telemetry [baseline] (8.613 ms) : 0, 8613
Telemetry [candidate] (14.701 ms) : 0, 14701
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (760.148 ms) : 0, 760148
BytebuddyAgent [candidate] (758.274 ms) : 0, 758274
GlobalTracer [baseline] (274.919 ms) : 0, 274919
GlobalTracer [candidate] (274.296 ms) : 0, 274296
AppSec [baseline] (46.372 ms) : 0, 46372
AppSec [candidate] (46.391 ms) : 0, 46391
IAST [baseline] (16.744 ms) : 0, 16744
IAST [candidate] (19.528 ms) : 0, 19528
Remote Config [baseline] (582.563 µs) : 0, 583
Remote Config [candidate] (566.155 µs) : 0, 566
Telemetry [baseline] (8.46 ms) : 0, 8460
Telemetry [candidate] (10.364 ms) : 0, 10364

Load

Parameters

Baseline Candidate
commit 1.24.0-SNAPSHOT~3c1890cf33 1.23.0-SNAPSHOT~70d9234851
config baseline candidate
end_time 2023-11-15T13:24:14 2023-11-15T13:40:46
start_time 2023-11-15T13:24:01 2023-11-15T13:40:33
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 22 cases.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.23.0-SNAPSHOT~70d9234851, baseline=1.24.0-SNAPSHOT~3c1890cf33
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.341 ms) : 1321, 1360
.   : milestone, 1341,
appsec (1.736 ms) : 1711, 1760
.   : milestone, 1736,
iast (1.506 ms) : 1482, 1531
.   : milestone, 1506,
profiling (1.496 ms) : 1471, 1522
.   : milestone, 1496,
tracing (1.459 ms) : 1434, 1484
.   : milestone, 1459,
section candidate
no_agent (1.338 ms) : 1319, 1357
.   : milestone, 1338,
appsec (1.728 ms) : 1704, 1753
.   : milestone, 1728,
iast (1.501 ms) : 1477, 1525
.   : milestone, 1501,
profiling (1.469 ms) : 1444, 1494
.   : milestone, 1469,
tracing (1.491 ms) : 1467, 1515
.   : milestone, 1491,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.341 ms [1.321 ms, 1.36 ms] -
appsec 1.736 ms [1.711 ms, 1.76 ms] 394.996 µs (29.5%)
iast 1.506 ms [1.482 ms, 1.531 ms] 165.772 µs (12.4%)
profiling 1.496 ms [1.471 ms, 1.522 ms] 155.774 µs (11.6%)
tracing 1.459 ms [1.434 ms, 1.484 ms] 118.845 µs (8.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.338 ms [1.319 ms, 1.357 ms] -
appsec 1.728 ms [1.704 ms, 1.753 ms] 390.134 µs (29.2%)
iast 1.501 ms [1.477 ms, 1.525 ms] 162.877 µs (12.2%)
profiling 1.469 ms [1.444 ms, 1.494 ms] 130.361 µs (9.7%)
tracing 1.491 ms [1.467 ms, 1.515 ms] 152.631 µs (11.4%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.23.0-SNAPSHOT~70d9234851, baseline=1.24.0-SNAPSHOT~3c1890cf33
    dateFormat X
    axisFormat %s
section baseline
no_agent (360.977 µs) : 341, 381
.   : milestone, 361,
iast (470.374 µs) : 449, 491
.   : milestone, 470,
iast_FULL (537.021 µs) : 516, 558
.   : milestone, 537,
iast_INACTIVE (445.75 µs) : 425, 466
.   : milestone, 446,
iast_TELEMETRY_OFF (470.533 µs) : 449, 492
.   : milestone, 471,
tracing (441.75 µs) : 421, 463
.   : milestone, 442,
section candidate
no_agent (363.396 µs) : 344, 383
.   : milestone, 363,
iast (468.831 µs) : 448, 490
.   : milestone, 469,
iast_FULL (524.532 µs) : 504, 545
.   : milestone, 525,
iast_INACTIVE (444.984 µs) : 423, 467
.   : milestone, 445,
iast_TELEMETRY_OFF (459.707 µs) : 439, 480
.   : milestone, 460,
tracing (439.396 µs) : 418, 460
.   : milestone, 439,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 360.977 µs [340.622 µs, 381.332 µs] -
iast 470.374 µs [449.309 µs, 491.439 µs] 109.397 µs (30.3%)
iast_FULL 537.021 µs [516.334 µs, 557.708 µs] 176.044 µs (48.8%)
iast_INACTIVE 445.75 µs [425.071 µs, 466.429 µs] 84.773 µs (23.5%)
iast_TELEMETRY_OFF 470.533 µs [449.077 µs, 491.989 µs] 109.556 µs (30.3%)
tracing 441.75 µs [420.974 µs, 462.525 µs] 80.772 µs (22.4%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 363.396 µs [343.649 µs, 383.142 µs] -
iast 468.831 µs [447.773 µs, 489.888 µs] 105.435 µs (29.0%)
iast_FULL 524.532 µs [504.034 µs, 545.03 µs] 161.136 µs (44.3%)
iast_INACTIVE 444.984 µs [423.259 µs, 466.709 µs] 81.588 µs (22.5%)
iast_TELEMETRY_OFF 459.707 µs [439.07 µs, 480.343 µs] 96.311 µs (26.5%)
tracing 439.396 µs [418.369 µs, 460.424 µs] 76.001 µs (20.9%)

@bric3 bric3 left a comment

I have the same issue with the IntelliJ classloader, which strips the protection domain. So when two agents are loaded at the same time (e.g. the Kotlin Coroutine debugger agent), dd-trace-java bails out.

Coincidentally, I was trying to fix AgentBootstrap for that reason, using a property as a fallback instead of short-circuiting when there are multiple agents. However, the proposed approach seems to work better, without "user" intervention.

For reference, I have reproduced the failure with IJ in bric3/dd-trace-java-with-ij-classloading-and-another-agent, and with your change applied the issue no longer occurs.

The feature looks good to me, but I am only leaving a comment because I'm not a member of this project.


nikita-tkachenko-datadog commented Nov 14, 2023

> I have the same issue with IntelliJ classloader, which strips the protection domain. So when two agents are loaded at the same time (e.g. the Kotlin Coroutine debugger agent) dd-trace-java bails out.

Yes, this is exactly the issue that I am fixing here :) I tested this with the DD IntelliJ Plugin repo, and the fix works without the need for additional changes in the project.


bric3 commented Nov 14, 2023

Would you be interested in me adding a smoke test for this?

@nikita-tkachenko-datadog

> Would you be interested in me adding a smoke test for this?

Of course, that'd be great!

@mcculls mcculls left a comment

LGTM - same comments as @bric3, but no blockers

@nikita-tkachenko-datadog nikita-tkachenko-datadog merged commit e4541b1 into master Nov 15, 2023
60 of 69 checks passed
@nikita-tkachenko-datadog nikita-tkachenko-datadog deleted the nikita-tkachenko/agent-bootstrap-class-resource-lookup branch November 15, 2023 13:49
@github-actions github-actions bot added this to the 1.24.0 milestone Nov 15, 2023
Labels
comp: core Tracer core