Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement extra_services for remote config #6130

Merged
merged 36 commits into from Feb 16, 2024

Conversation

jandro996
Copy link
Member

@jandro996 jandro996 commented Nov 2, 2023

What Does This Do

Add extraService field to RemoteConfigRequest with all the different service names in the spans

New ExtraServiceProvider to manage extra services that satisfies the next conditions:

  • Limited to 64 elements
  • New service name can be added if its different from the global one, is not null and is not in the current list
  • When comparing service names, it should be don in a case-insensitive manner
  • if the list is empty it should return null

It has been decided to attempt adding a new extra service when the span will be finished. This approach is used by other languages like dotNet. Any suggestions from @DataDog/apm-java ?

Motivation

Knowing all service names used by a tracer instances help the UI give better feedback to the user.
The agent version v7.46.0 contains an new field "extra_services" in the remote config client, that allows clients to list any additional services that are used within tracer spans.

Additional Notes

Jira ticket: APPSEC-9907
.NET PR: DataDog/dd-trace-dotnet#4419

@jandro996 jandro996 added the comp: asm iast Application Security Management (IAST) label Nov 2, 2023
@jandro996 jandro996 force-pushed the alejandro.gonzalez/java_remote_config_extra_services branch from 3d60edb to 5948be8 Compare November 2, 2023 10:54
@pr-commenter
Copy link

pr-commenter bot commented Nov 2, 2023

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master alejandro.gonzalez/java_remote_config_extra_services
git_commit_date 1708008561 1708010070
git_commit_sha 142c8ca 8b357ab
release_version 1.31.0-SNAPSHOT~142c8cac77 1.31.0-SNAPSHOT~8b357abce7
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1708012928 1708012928
ci_job_id 435144991 435144991
ci_pipeline_id 28359306 28359306
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 44 metrics, 10 unstable metrics.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.31.0-SNAPSHOT~8b357abce7, baseline=1.31.0-SNAPSHOT~142c8cac77

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.059 s) : 0, 1059423
Total [baseline] (9.367 s) : 0, 9367301
Agent [candidate] (1.065 s) : 0, 1065307
Total [candidate] (9.345 s) : 0, 9345382
section appsec
Agent [baseline] (1.168 s) : 0, 1168037
Total [baseline] (9.495 s) : 0, 9495162
Agent [candidate] (1.159 s) : 0, 1158716
Total [candidate] (9.451 s) : 0, 9450511
section iast
Agent [baseline] (1.194 s) : 0, 1194162
Total [baseline] (9.768 s) : 0, 9767988
Agent [candidate] (1.188 s) : 0, 1187828
Total [candidate] (9.722 s) : 0, 9721950
section profiling
Agent [baseline] (1.288 s) : 0, 1287750
Total [baseline] (9.611 s) : 0, 9610678
Agent [candidate] (1.28 s) : 0, 1280433
Total [candidate] (9.614 s) : 0, 9613623
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.059 s -
Agent appsec 1.168 s 108.615 ms (10.3%)
Agent iast 1.194 s 134.739 ms (12.7%)
Agent profiling 1.288 s 228.327 ms (21.6%)
Total tracing 9.367 s -
Total appsec 9.495 s 127.861 ms (1.4%)
Total iast 9.768 s 400.687 ms (4.3%)
Total profiling 9.611 s 243.378 ms (2.6%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.065 s -
Agent appsec 1.159 s 93.409 ms (8.8%)
Agent iast 1.188 s 122.521 ms (11.5%)
Agent profiling 1.28 s 215.126 ms (20.2%)
Total tracing 9.345 s -
Total appsec 9.451 s 105.129 ms (1.1%)
Total iast 9.722 s 376.568 ms (4.0%)
Total profiling 9.614 s 268.241 ms (2.9%)
gantt
    title petclinic - break down per module: candidate=1.31.0-SNAPSHOT~8b357abce7, baseline=1.31.0-SNAPSHOT~142c8cac77

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (666.934 ms) : 0, 666934
BytebuddyAgent [candidate] (670.695 ms) : 0, 670695
GlobalTracer [baseline] (298.087 ms) : 0, 298087
GlobalTracer [candidate] (300.384 ms) : 0, 300384
AppSec [baseline] (51.918 ms) : 0, 51918
AppSec [candidate] (51.613 ms) : 0, 51613
Remote Config [baseline] (704.905 µs) : 0, 705
Remote Config [candidate] (709.597 µs) : 0, 710
Telemetry [baseline] (7.608 ms) : 0, 7608
Telemetry [candidate] (7.636 ms) : 0, 7636
section appsec
BytebuddyAgent [baseline] (673.171 ms) : 0, 673171
BytebuddyAgent [candidate] (667.76 ms) : 0, 667760
GlobalTracer [baseline] (301.039 ms) : 0, 301039
GlobalTracer [candidate] (298.542 ms) : 0, 298542
AppSec [baseline] (151.649 ms) : 0, 151649
AppSec [candidate] (150.746 ms) : 0, 150746
Remote Config [baseline] (643.434 µs) : 0, 643
Remote Config [candidate] (634.561 µs) : 0, 635
Telemetry [baseline] (6.901 ms) : 0, 6901
Telemetry [candidate] (6.817 ms) : 0, 6817
section iast
BytebuddyAgent [baseline] (785.538 ms) : 0, 785538
BytebuddyAgent [candidate] (780.61 ms) : 0, 780610
GlobalTracer [baseline] (292.124 ms) : 0, 292124
GlobalTracer [candidate] (290.453 ms) : 0, 290453
AppSec [baseline] (53.084 ms) : 0, 53084
AppSec [candidate] (51.932 ms) : 0, 51932
Remote Config [baseline] (672.422 µs) : 0, 672
Remote Config [candidate] (706.352 µs) : 0, 706
Telemetry [baseline] (6.646 ms) : 0, 6646
Telemetry [candidate] (6.602 ms) : 0, 6602
IAST [baseline] (21.442 ms) : 0, 21442
IAST [candidate] (23.176 ms) : 0, 23176
section profiling
ProfilingAgent [baseline] (112.417 ms) : 0, 112417
ProfilingAgent [candidate] (111.998 ms) : 0, 111998
BytebuddyAgent [baseline] (670.222 ms) : 0, 670222
BytebuddyAgent [candidate] (667.153 ms) : 0, 667153
GlobalTracer [baseline] (385.299 ms) : 0, 385299
GlobalTracer [candidate] (383.029 ms) : 0, 383029
AppSec [baseline] (52.508 ms) : 0, 52508
AppSec [candidate] (51.981 ms) : 0, 51981
Remote Config [baseline] (673.388 µs) : 0, 673
Remote Config [candidate] (662.257 µs) : 0, 662
Telemetry [baseline] (11.638 ms) : 0, 11638
Telemetry [candidate] (10.862 ms) : 0, 10862
Profiling [baseline] (112.444 ms) : 0, 112444
Profiling [candidate] (112.022 ms) : 0, 112022

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2024-02-15T15:38:13 2024-02-15T15:57:07
git_branch master alejandro.gonzalez/java_remote_config_extra_services
git_commit_date 1708008561 1708010070
git_commit_sha 142c8ca 8b357ab
release_version 1.31.0-SNAPSHOT~142c8cac77 1.31.0-SNAPSHOT~8b357abce7
start_time 2024-02-15T15:38:00 2024-02-15T15:56:54
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1708012928 1708012928
ci_job_id 435144991 435144991
ci_pipeline_id 28359306 28359306
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant iast iast

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 9 metrics, 16 unstable metrics.

scenario Δ mean http_req_duration Δ mean throughput candidate mean http_req_duration candidate mean throughput baseline mean http_req_duration baseline mean throughput
scenario:load:petclinic:profiling better
[-92.870µs; -35.903µs] or [-5.854%; -2.263%]
unstable
[-587.082op/s; +587.082op/s] or [-19.814%; +19.814%]
1.522ms 2962.963op/s 1.587ms 2962.963op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.31.0-SNAPSHOT~8b357abce7, baseline=1.31.0-SNAPSHOT~142c8cac77
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.34 ms) : 1321, 1359
.   : milestone, 1340,
appsec (1.768 ms) : 1743, 1794
.   : milestone, 1768,
iast (1.525 ms) : 1501, 1549
.   : milestone, 1525,
profiling (1.587 ms) : 1560, 1613
.   : milestone, 1587,
tracing (1.496 ms) : 1471, 1522
.   : milestone, 1496,
section candidate
no_agent (1.357 ms) : 1338, 1376
.   : milestone, 1357,
appsec (1.766 ms) : 1740, 1791
.   : milestone, 1766,
iast (1.509 ms) : 1484, 1533
.   : milestone, 1509,
profiling (1.522 ms) : 1496, 1549
.   : milestone, 1522,
tracing (1.499 ms) : 1474, 1523
.   : milestone, 1499,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.34 ms [1.321 ms, 1.359 ms] -
appsec 1.768 ms [1.743 ms, 1.794 ms] 428.235 µs (32.0%)
iast 1.525 ms [1.501 ms, 1.549 ms] 185.133 µs (13.8%)
profiling 1.587 ms [1.56 ms, 1.613 ms] 246.3 µs (18.4%)
tracing 1.496 ms [1.471 ms, 1.522 ms] 156.223 µs (11.7%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.357 ms [1.338 ms, 1.376 ms] -
appsec 1.766 ms [1.74 ms, 1.791 ms] 408.442 µs (30.1%)
iast 1.509 ms [1.484 ms, 1.533 ms] 151.735 µs (11.2%)
profiling 1.522 ms [1.496 ms, 1.549 ms] 165.072 µs (12.2%)
tracing 1.499 ms [1.474 ms, 1.523 ms] 141.622 µs (10.4%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.31.0-SNAPSHOT~8b357abce7, baseline=1.31.0-SNAPSHOT~142c8cac77
    dateFormat X
    axisFormat %s
section baseline
no_agent (374.392 µs) : 352, 396
.   : milestone, 374,
iast (460.369 µs) : 440, 481
.   : milestone, 460,
iast_FULL (531.682 µs) : 511, 552
.   : milestone, 532,
iast_GLOBAL (484.559 µs) : 463, 506
.   : milestone, 485,
iast_HARDCODED_SECRET_DISABLED (472.252 µs) : 451, 493
.   : milestone, 472,
iast_INACTIVE (436.434 µs) : 416, 457
.   : milestone, 436,
iast_TELEMETRY_OFF (468.934 µs) : 448, 490
.   : milestone, 469,
tracing (433.637 µs) : 413, 455
.   : milestone, 434,
section candidate
no_agent (370.33 µs) : 348, 393
.   : milestone, 370,
iast (471.344 µs) : 450, 493
.   : milestone, 471,
iast_FULL (526.951 µs) : 506, 548
.   : milestone, 527,
iast_GLOBAL (497.862 µs) : 477, 519
.   : milestone, 498,
iast_HARDCODED_SECRET_DISABLED (471.663 µs) : 451, 492
.   : milestone, 472,
iast_INACTIVE (448.813 µs) : 427, 470
.   : milestone, 449,
iast_TELEMETRY_OFF (461.71 µs) : 441, 483
.   : milestone, 462,
tracing (439.076 µs) : 418, 460
.   : milestone, 439,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 374.392 µs [352.352 µs, 396.433 µs] -
iast 460.369 µs [440.178 µs, 480.56 µs] 85.976 µs (23.0%)
iast_FULL 531.682 µs [511.163 µs, 552.201 µs] 157.29 µs (42.0%)
iast_GLOBAL 484.559 µs [463.071 µs, 506.047 µs] 110.167 µs (29.4%)
iast_HARDCODED_SECRET_DISABLED 472.252 µs [451.375 µs, 493.129 µs] 97.859 µs (26.1%)
iast_INACTIVE 436.434 µs [415.874 µs, 456.995 µs] 62.042 µs (16.6%)
iast_TELEMETRY_OFF 468.934 µs [447.806 µs, 490.062 µs] 94.542 µs (25.3%)
tracing 433.637 µs [412.752 µs, 454.523 µs] 59.245 µs (15.8%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 370.33 µs [347.767 µs, 392.893 µs] -
iast 471.344 µs [450.0 µs, 492.689 µs] 101.014 µs (27.3%)
iast_FULL 526.951 µs [506.401 µs, 547.502 µs] 156.621 µs (42.3%)
iast_GLOBAL 497.862 µs [476.65 µs, 519.074 µs] 127.532 µs (34.4%)
iast_HARDCODED_SECRET_DISABLED 471.663 µs [451.111 µs, 492.216 µs] 101.333 µs (27.4%)
iast_INACTIVE 448.813 µs [427.296 µs, 470.33 µs] 78.483 µs (21.2%)
iast_TELEMETRY_OFF 461.71 µs [440.824 µs, 482.596 µs] 91.379 µs (24.7%)
tracing 439.076 µs [418.234 µs, 459.918 µs] 68.746 µs (18.6%)

@jandro996 jandro996 force-pushed the alejandro.gonzalez/java_remote_config_extra_services branch from 631d8e6 to e36bd71 Compare November 3, 2023 12:33
@jandro996 jandro996 marked this pull request as ready for review November 6, 2023 06:46
@jandro996 jandro996 requested a review from a team as a code owner November 6, 2023 06:46
@jandro996 jandro996 force-pushed the alejandro.gonzalez/java_remote_config_extra_services branch from 967f107 to b9096d8 Compare November 10, 2023 10:56
@jandro996 jandro996 requested a review from a team as a code owner November 10, 2023 10:56
@jandro996 jandro996 requested review from jpbempel and removed request for a team November 10, 2023 10:56
@jandro996 jandro996 force-pushed the alejandro.gonzalez/java_remote_config_extra_services branch from b9096d8 to 0912fbf Compare November 10, 2023 11:17
@jandro996 jandro996 requested a review from a team as a code owner November 10, 2023 11:17
@jandro996 jandro996 force-pushed the alejandro.gonzalez/java_remote_config_extra_services branch from 0912fbf to 400c162 Compare November 10, 2023 11:50
@jandro996 jandro996 force-pushed the alejandro.gonzalez/java_remote_config_extra_services branch from 732bc97 to ee64e26 Compare November 11, 2023 07:19
Copy link
Contributor

@mcculls mcculls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this happens on every single call to finish a span this has the potential to noticeably affect performance. Have we done any performance measurements with real-world applications to gauge the impact?

Rather than compare the span's service name against the local service name on finish (which is an extremely hot code path) is there a way we could collect this information as the naming schema(s) are accessed? There are potentially ways to do that so we only add the extra service name to the bucket once, the first time the naming schema is accessed.

@jandro996 jandro996 force-pushed the alejandro.gonzalez/java_remote_config_extra_services branch from 695eebc to 0ce4241 Compare November 15, 2023 10:40
@smola smola added comp: remote config Configuration at Runtime and removed comp: asm iast Application Security Management (IAST) labels Nov 15, 2023
Copy link
Collaborator

@amarziali amarziali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK to me. You also should need to add the same on the TagInterceptor when the service name is inferred from the servlet context (the case of multi-services for webapps) or for the service name tag (will address service name set by manual instrumentation)

@@ -1773,6 +1776,8 @@ PROFILING_DATADOG_PROFILER_ENABLED, isDatadogProfilerSafeInCurrentEnvironment())
remoteConfigTargetsKey =
configProvider.getString(REMOTE_CONFIG_TARGETS_KEY, DEFAULT_REMOTE_CONFIG_TARGETS_KEY);

remoteConfigMaxExtraServices = configProvider.getInteger(REMOTE_CONFIG_MAX_EXTRA_SERVICES, 64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to extract a constant with the default value.

if (extraServices.size() >= MAX_EXTRA_SERVICE) {
if (!limitReachedLogged) {
log.debug(
"extra service limit({}) reached: service {} can't be added",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add the SEND_TELEMETRY marker, so we get notified if this situation is common.

@@ -0,0 +1,61 @@
package datadog.trace.util;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not technically a utility class, perhaps datadog.trace.api.telemetry would be a better package since it already contains similar collectors? Or datadog.trace.api.remoteconfig?

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ExtraServicesProvider {
Copy link
Contributor

@mcculls mcculls Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest renaming this to ServiceCollector or ExtraServiceCollector because that's what it does - it is used to populate the extra services field in RC, but it could also be used elsewhere. (Generally I prefer naming classes based on what they do.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion!

// singleton
}

public void maybeAddExtraService(final String serviceName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shorten to addService or recordService ?

Copy link
Contributor

@mcculls mcculls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor suggestions around naming

@jandro996 jandro996 merged commit 65d7de1 into master Feb 16, 2024
79 checks passed
@jandro996 jandro996 deleted the alejandro.gonzalez/java_remote_config_extra_services branch February 16, 2024 07:17
@github-actions github-actions bot added this to the 1.31.0 milestone Feb 16, 2024
jandro996 added a commit that referenced this pull request Feb 16, 2024
What Does This Do
Add extraService field to RemoteConfigRequest with all the different service names in the spans

New ExtraServiceProvider to manage extra services that satisfies the next conditions:

Limited to 64 elements
New service name can be added if its different from the global one, is not null and is not in the current list
When comparing service names, it should be don in a case-insensitive manner
if the list is empty it should return null
It has been decided to attempt adding a new extra service when naming-schema is  invoked to get a service name

Motivation
Knowing all service names used by a tracer instances help the UI give better feedback to the user.
The agent version v7.46.0 contains an new field "extra_services" in the remote config client, that allows clients to list any additional services that are used within tracer spans.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp: remote config Configuration at Runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants