Fix field-injection of ForkJoinTask on Java 25 #10084
… with timeout during premain - otherwise on Java 25 it will load ForkJoinPool which in turn loads ForkJoinTask, which then means we lose the chance to field-inject context into ForkJoinTask instances
cebf1c7 to a73fd0b
Benchmarks
Startup
Summary
Found 18 performance improvements and 6 performance regressions! Performance is the same for 27 metrics, 14 unstable metrics.
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.054 s) : 0, 1054247
Total [baseline] (10.942 s) : 0, 10942466
Agent [candidate] (1.027 s) : 0, 1026978
Total [candidate] (10.926 s) : 0, 10925591
section appsec
Agent [baseline] (1.23 s) : 0, 1230391
Total [baseline] (10.943 s) : 0, 10942681
Agent [candidate] (1.202 s) : 0, 1202372
Total [candidate] (10.917 s) : 0, 10917066
section iast
Agent [baseline] (1.204 s) : 0, 1204422
Total [baseline] (11.249 s) : 0, 11249194
Agent [candidate] (1.167 s) : 0, 1166935
Total [candidate] (11.169 s) : 0, 11169205
section profiling
Agent [baseline] (1.207 s) : 0, 1207196
Total [baseline] (10.961 s) : 0, 10960590
Agent [candidate] (1.166 s) : 0, 1165910
Total [candidate] (10.903 s) : 0, 10903124
gantt
title petclinic - break down per module: candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.483 ms) : 0, 1483
crashtracking [candidate] (1.206 ms) : 0, 1206
BytebuddyAgent [baseline] (692.785 ms) : 0, 692785
BytebuddyAgent [candidate] (647.181 ms) : 0, 647181
GlobalTracer [baseline] (264.875 ms) : 0, 264875
GlobalTracer [candidate] (282.618 ms) : 0, 282618
AppSec [baseline] (31.998 ms) : 0, 31998
AppSec [candidate] (32.108 ms) : 0, 32108
Debugger [baseline] (6.351 ms) : 0, 6351
Debugger [candidate] (6.314 ms) : 0, 6314
Remote Config [baseline] (675.255 µs) : 0, 675
Remote Config [candidate] (668.192 µs) : 0, 668
Telemetry [baseline] (16.591 ms) : 0, 16591
Telemetry [candidate] (9.235 ms) : 0, 9235
Flare Poller [baseline] (4.351 ms) : 0, 4351
Flare Poller [candidate] (11.959 ms) : 0, 11959
section appsec
crashtracking [baseline] (1.473 ms) : 0, 1473
crashtracking [candidate] (1.197 ms) : 0, 1197
BytebuddyAgent [baseline] (732.579 ms) : 0, 732579
BytebuddyAgent [candidate] (686.28 ms) : 0, 686280
GlobalTracer [baseline] (241.165 ms) : 0, 241165
GlobalTracer [candidate] (259.428 ms) : 0, 259428
IAST [baseline] (24.633 ms) : 0, 24633
IAST [candidate] (24.67 ms) : 0, 24670
AppSec [baseline] (174.703 ms) : 0, 174703
AppSec [candidate] (174.011 ms) : 0, 174011
Debugger [baseline] (6.348 ms) : 0, 6348
Debugger [candidate] (6.516 ms) : 0, 6516
Remote Config [baseline] (710.773 µs) : 0, 711
Remote Config [candidate] (727.251 µs) : 0, 727
Telemetry [baseline] (9.725 ms) : 0, 9725
Telemetry [candidate] (9.958 ms) : 0, 9958
Flare Poller [baseline] (3.962 ms) : 0, 3962
Flare Poller [candidate] (4.208 ms) : 0, 4208
section iast
crashtracking [baseline] (1.489 ms) : 0, 1489
crashtracking [candidate] (1.195 ms) : 0, 1195
BytebuddyAgent [baseline] (840.036 ms) : 0, 840036
BytebuddyAgent [candidate] (793.305 ms) : 0, 793305
GlobalTracer [baseline] (239.061 ms) : 0, 239061
GlobalTracer [candidate] (255.385 ms) : 0, 255385
IAST [baseline] (30.478 ms) : 0, 30478
IAST [candidate] (26.702 ms) : 0, 26702
AppSec [baseline] (31.705 ms) : 0, 31705
AppSec [candidate] (35.373 ms) : 0, 35373
Debugger [baseline] (6.109 ms) : 0, 6109
Debugger [candidate] (6.067 ms) : 0, 6067
Remote Config [baseline] (616.451 µs) : 0, 616
Remote Config [candidate] (589.192 µs) : 0, 589
Telemetry [baseline] (8.643 ms) : 0, 8643
Telemetry [candidate] (8.672 ms) : 0, 8672
Flare Poller [baseline] (10.869 ms) : 0, 10869
Flare Poller [candidate] (4.19 ms) : 0, 4190
section profiling
crashtracking [baseline] (1.442 ms) : 0, 1442
crashtracking [candidate] (1.193 ms) : 0, 1193
BytebuddyAgent [baseline] (740.015 ms) : 0, 740015
BytebuddyAgent [candidate] (701.578 ms) : 0, 701578
GlobalTracer [baseline] (223.541 ms) : 0, 223541
GlobalTracer [candidate] (221.275 ms) : 0, 221275
AppSec [baseline] (32.488 ms) : 0, 32488
AppSec [candidate] (31.976 ms) : 0, 31976
Debugger [baseline] (7.468 ms) : 0, 7468
Debugger [candidate] (6.674 ms) : 0, 6674
Remote Config [baseline] (1.45 ms) : 0, 1450
Remote Config [candidate] (679.301 µs) : 0, 679
Telemetry [baseline] (15.222 ms) : 0, 15222
Telemetry [candidate] (9.836 ms) : 0, 9836
Flare Poller [baseline] (4.118 ms) : 0, 4118
Flare Poller [candidate] (10.328 ms) : 0, 10328
ProfilingAgent [baseline] (111.514 ms) : 0, 111514
ProfilingAgent [candidate] (112.549 ms) : 0, 112549
Profiling [baseline] (112.16 ms) : 0, 112160
Profiling [candidate] (113.212 ms) : 0, 113212
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.05 s) : 0, 1049973
Total [baseline] (8.738 s) : 0, 8738025
Agent [candidate] (1.025 s) : 0, 1024791
Total [candidate] (8.714 s) : 0, 8713806
section iast
Agent [baseline] (1.195 s) : 0, 1195101
Total [baseline] (9.351 s) : 0, 9351011
Agent [candidate] (1.169 s) : 0, 1168635
Total [candidate] (9.355 s) : 0, 9354793
gantt
title insecure-bank - break down per module: candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.474 ms) : 0, 1474
crashtracking [candidate] (1.187 ms) : 0, 1187
BytebuddyAgent [baseline] (689.897 ms) : 0, 689897
BytebuddyAgent [candidate] (647.051 ms) : 0, 647051
GlobalTracer [baseline] (263.675 ms) : 0, 263675
GlobalTracer [candidate] (282.586 ms) : 0, 282586
AppSec [baseline] (31.891 ms) : 0, 31891
AppSec [candidate] (32.355 ms) : 0, 32355
Debugger [baseline] (6.317 ms) : 0, 6317
Debugger [candidate] (6.376 ms) : 0, 6376
Remote Config [baseline] (658.288 µs) : 0, 658
Remote Config [candidate] (666.195 µs) : 0, 666
Telemetry [baseline] (15.119 ms) : 0, 15119
Telemetry [candidate] (9.287 ms) : 0, 9287
Flare Poller [baseline] (5.839 ms) : 0, 5839
Flare Poller [candidate] (9.637 ms) : 0, 9637
section iast
crashtracking [baseline] (1.489 ms) : 0, 1489
crashtracking [candidate] (1.186 ms) : 0, 1186
BytebuddyAgent [baseline] (833.338 ms) : 0, 833338
BytebuddyAgent [candidate] (790.269 ms) : 0, 790269
GlobalTracer [baseline] (237.63 ms) : 0, 237630
GlobalTracer [candidate] (258.868 ms) : 0, 258868
IAST [baseline] (34.463 ms) : 0, 34463
IAST [candidate] (28.031 ms) : 0, 28031
AppSec [baseline] (27.026 ms) : 0, 27026
AppSec [candidate] (33.271 ms) : 0, 33271
Debugger [baseline] (6.036 ms) : 0, 6036
Debugger [candidate] (7.953 ms) : 0, 7953
Remote Config [baseline] (628.731 µs) : 0, 629
Remote Config [candidate] (626.766 µs) : 0, 627
Telemetry [baseline] (8.489 ms) : 0, 8489
Telemetry [candidate] (8.902 ms) : 0, 8902
Flare Poller [baseline] (10.817 ms) : 0, 10817
Flare Poller [candidate] (4.127 ms) : 0, 4127
Load
Summary
Found 2 performance improvements and 2 performance regressions! Performance is the same for 16 metrics, 16 unstable metrics.
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section baseline
no_agent (1.195 ms) : 1184, 1207
. : milestone, 1195,
iast (3.286 ms) : 3244, 3328
. : milestone, 3286,
iast_FULL (5.742 ms) : 5685, 5799
. : milestone, 5742,
iast_GLOBAL (3.681 ms) : 3627, 3736
. : milestone, 3681,
profiling (2.015 ms) : 1996, 2034
. : milestone, 2015,
tracing (1.783 ms) : 1769, 1798
. : milestone, 1783,
section candidate
no_agent (1.231 ms) : 1219, 1243
. : milestone, 1231,
iast (3.236 ms) : 3193, 3279
. : milestone, 3236,
iast_FULL (5.618 ms) : 5563, 5674
. : milestone, 5618,
iast_GLOBAL (3.738 ms) : 3683, 3793
. : milestone, 3738,
profiling (2.011 ms) : 1992, 2030
. : milestone, 2011,
tracing (1.845 ms) : 1828, 1861
. : milestone, 1845,
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section baseline
no_agent (18.315 ms) : 18127, 18504
. : milestone, 18315,
appsec (18.518 ms) : 18331, 18705
. : milestone, 18518,
code_origins (17.795 ms) : 17617, 17973
. : milestone, 17795,
iast (17.861 ms) : 17685, 18038
. : milestone, 17861,
profiling (19.399 ms) : 19200, 19599
. : milestone, 19399,
tracing (17.547 ms) : 17372, 17723
. : milestone, 17547,
section candidate
no_agent (18.933 ms) : 18742, 19125
. : milestone, 18933,
appsec (18.684 ms) : 18492, 18875
. : milestone, 18684,
code_origins (17.699 ms) : 17525, 17872
. : milestone, 17699,
iast (17.679 ms) : 17508, 17851
. : milestone, 17679,
profiling (18.52 ms) : 18332, 18709
. : milestone, 18520,
tracing (18.686 ms) : 18500, 18871
. : milestone, 18686,
Dacapo
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section baseline
no_agent (1.477 ms) : 1465, 1489
. : milestone, 1477,
appsec (3.69 ms) : 3473, 3907
. : milestone, 3690,
iast (2.211 ms) : 2146, 2276
. : milestone, 2211,
iast_GLOBAL (2.258 ms) : 2193, 2324
. : milestone, 2258,
profiling (2.055 ms) : 2002, 2107
. : milestone, 2055,
tracing (2.047 ms) : 1996, 2099
. : milestone, 2047,
section candidate
no_agent (1.473 ms) : 1462, 1485
. : milestone, 1473,
appsec (3.694 ms) : 3477, 3910
. : milestone, 3694,
iast (2.211 ms) : 2146, 2276
. : milestone, 2211,
iast_GLOBAL (2.256 ms) : 2190, 2321
. : milestone, 2256,
profiling (2.09 ms) : 2035, 2145
. : milestone, 2090,
tracing (2.042 ms) : 1991, 2093
. : milestone, 2042,
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.57.0-SNAPSHOT~c4beb14f04, baseline=1.57.0-SNAPSHOT~b690a79bd7
dateFormat X
axisFormat %s
section baseline
no_agent (15.608 s) : 15608000, 15608000
. : milestone, 15608000,
appsec (14.927 s) : 14927000, 14927000
. : milestone, 14927000,
iast (18.197 s) : 18197000, 18197000
. : milestone, 18197000,
iast_GLOBAL (18.206 s) : 18206000, 18206000
. : milestone, 18206000,
profiling (14.582 s) : 14582000, 14582000
. : milestone, 14582000,
tracing (14.658 s) : 14658000, 14658000
. : milestone, 14658000,
section candidate
no_agent (15.618 s) : 15618000, 15618000
. : milestone, 15618000,
appsec (14.504 s) : 14504000, 14504000
. : milestone, 14504000,
iast (17.974 s) : 17974000, 17974000
. : milestone, 17974000,
iast_GLOBAL (17.861 s) : 17861000, 17861000
. : milestone, 17861000,
profiling (14.668 s) : 14668000, 14668000
. : milestone, 14668000,
tracing (14.534 s) : 14534000, 14534000
. : milestone, 14534000,
amarziali
left a comment
Thanks for the fix and for adding the test case. I’m wondering whether, in the long run, we should introduce a very lightweight scheduler - something that won’t impact load but would still provide an opportunity to perform field injection when classes are redefined.
That was the idea behind When I joined the tracer used a So we replaced it with a simple mechanism built on top of a It's only recently that Oracle made the internals behind Any solution that uses While we could potentially develop our own So this is a pragmatic short-term fix which soon won't be necessary.
Thanks for all the context and the detailed explanation 🙏
What Does This Do
Turning off crash-tracking in FieldInjectionSmokeTest was hiding a bug on Java 25:
https://gitlab.ddbuild.io/datadog/apm-reliability/dd-trace-java/builds/1270792914
By default crash-tracking executes a task on the dd-task-scheduler thread before any instrumentations are installed. When AgentTaskScheduler is called for the first time without an initial delay we prepare the work queue (an instance of DelayQueue), execute the task, and then block on workQueue.take().
On Java 25 calling DelayQueue.take() when the DelayQueue is empty leads to an await call without a timeout:
https://github.com/openjdk/jdk/blob/jdk-25%2B0/src/java.base/share/classes/java/util/concurrent/DelayQueue.java#L243
This in turn leads to the ForkJoinPool class being initialized:
https://github.com/openjdk/jdk/blob/jdk-25%2B0/src/java.base/share/classes/java/util/concurrent/locks/AbstractQueuedSynchronizer.java#L1751
which in turn leads to the ForkJoinTask class being loaded:
https://github.com/openjdk/jdk/blob/jdk-25%2B0/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L3977
Because this happens before any instrumentations are installed, we lose the chance to field-inject ForkJoinTask. As a result we fall back to the global weak map to track async context across instances of ForkJoinTask, which is not memory efficient.
The simplest solution is to add a no-op, one-shot task far enough in the future when preparing the work queue. The take() call will now call awaitNanos(delay), which doesn't result in either the ForkJoinPool or ForkJoinTask class being initialized.
Adding this no-op, one-shot task doesn't affect startup performance.
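The idea can be illustrated with a minimal, self-contained sketch. This is not the AgentTaskScheduler code from this PR: the class FarFutureSentinelDemo, the DelayedTask type, the prepareWorkQueue() helper and the 365-day delay are all hypothetical names and values chosen for illustration. It only shows the general technique of seeding a DelayQueue with a no-op entry that is due far in the future, so that a single consumer calling take() always finds a head element and waits with a timeout instead of waiting on an empty queue.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class FarFutureSentinelDemo {

  /** Hypothetical delayed task; not the agent's real task type. */
  public static final class DelayedTask implements Delayed {
    public final String name;
    public final Runnable action;
    private final long triggerAtNanos;

    public DelayedTask(String name, long delay, TimeUnit unit, Runnable action) {
      this.name = name;
      this.action = action;
      this.triggerAtNanos = System.nanoTime() + unit.toNanos(delay);
    }

    @Override
    public long getDelay(TimeUnit unit) {
      // Remaining time until the task is due; zero or negative means "ready".
      return unit.convert(triggerAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
      // Order by remaining delay so the earliest task is at the head of the queue.
      long diff = getDelay(TimeUnit.NANOSECONDS) - other.getDelay(TimeUnit.NANOSECONDS);
      return Long.compare(diff, 0L);
    }
  }

  /** Creates the queue and seeds it with a no-op, one-shot task far in the future. */
  public static DelayQueue<DelayedTask> prepareWorkQueue() {
    DelayQueue<DelayedTask> queue = new DelayQueue<>();
    // Assumption: 365 days is "far enough"; the real delay used by the PR is not shown here.
    queue.add(new DelayedTask("sentinel", 365, TimeUnit.DAYS, () -> {}));
    return queue;
  }

  public static void main(String[] args) throws InterruptedException {
    DelayQueue<DelayedTask> queue = prepareWorkQueue();
    queue.add(new DelayedTask("real-work", 0, TimeUnit.MILLISECONDS,
        () -> System.out.println("ran real-work")));

    // The zero-delay task is already due, so take() returns it without waiting.
    DelayedTask next = queue.take();
    next.action.run();

    // The sentinel is still pending, so a later take() would find a non-empty
    // queue and wait for the head's remaining delay rather than indefinitely.
    System.out.println("pending tasks: " + queue.size());
  }
}
```

Because the sentinel never becomes due in practice, it costs one queue entry and no extra work on the scheduler thread, which matches the observation above that the change does not affect startup performance.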
Contributor Checklist
type: and (comp: or inst:) labels in addition to any useful labels
close, fix or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]