[datadog] Add datadog.discovery.serviceMap.enabled #2641

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 17 commits into main from bar.fins/discovery-service-map
May 28, 2026

@BarFinsdd BarFinsdd commented May 10, 2026

Wires up the free Discovery Service Map mode. When enabled, system-probe boots a restricted USM monitor that produces HTTP/HTTPS topology only, with no paid USM RED metrics and no billing impact. Designed for non-APM customers who want a service-to-service dependency map without enabling paid USM.

  • New value: datadog.discovery.serviceMap.enabled (default false)
  • Renders discovery.service_map.enabled in system-probe.yaml
  • Treats serviceMap.enabled as a system-probe-feature trigger, so the daemon is deployed even when service map is the only feature enabled
  • should-render-discovery-config returns true when service map is on, so the discovery block is emitted even if discovery.enabled is left at its default

Coexistence with paid USM is handled agent-side: when both datadog.serviceMonitoring.enabled and datadog.discovery.serviceMap.enabled are true, the agent silently disables discovery (USM wins on billing).
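
As a minimal sketch of the opt-in described above, a values override could look like this. The key path is the one this PR adds; the file name is hypothetical.

```yaml
# values-service-map.yaml (hypothetical file name)
datadog:
  discovery:
    serviceMap:
      # Opt in to the free Discovery Service Map mode; the chart default is false.
      enabled: true
```

Assuming the rendering described in the list above, system-probe.yaml would then carry `discovery.service_map.enabled: true`, and system-probe would be deployed even if no other system-probe feature is enabled.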

What this PR does / why we need it:

Which issue this PR fixes

(optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged)

  • fixes #

Special notes for your reviewer:

Checklist

[Place an '[x]' (no spaces) in all applicable fields. Please remove unrelated fields.]

  • All commits are signed and show as "Verified" on GitHub (see: signing commits)
  • Chart Version semver bump label has been added (use <chartName>/minor-version, <chartName>/patch-version, or <chartName>/no-version-bump)
  • For datadog or datadog-operator chart or value changes, update the test baselines (run: make update-test-baselines)
  • For datadog chart changes, received ✅ from a member of your team

GitHub CI takes care of the items below, but they are still required:

  • Documentation has been updated with helm-docs (run: .github/helm-docs.sh)
  • CHANGELOG.md has been updated
  • Variables are documented in the README.md

Wires up the free Discovery Service Map mode introduced in agent
7.78. When enabled, system-probe boots a restricted USM monitor
that produces HTTP/HTTPS topology only — no paid USM RED metrics,
no billing impact. Designed for non-APM customers to see a
service-to-service dependency map without enabling paid USM.

- New value: datadog.discovery.serviceMap.enabled (default false)
- Renders discovery.service_map.enabled in system-probe.yaml
- Treats serviceMap.enabled as a system-probe-feature trigger so
  the daemon is deployed when service map is on standalone
- should-render-discovery-config returns true when service map is
  on, so the discovery block is emitted even if discovery.enabled
  itself is left to its default

Coexistence with paid USM is handled agent-side: when both
datadog.serviceMonitoring.enabled and datadog.discovery.serviceMap.enabled
are true, the agent silently disables discovery (USM wins on billing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@BarFinsdd BarFinsdd self-assigned this May 10, 2026
@BarFinsdd BarFinsdd requested review from a team as code owners May 10, 2026 14:53
@BarFinsdd BarFinsdd requested review from gpalmz and removed request for a team May 10, 2026 14:53
@github-actions github-actions Bot added the chart/datadog This issue or pull request is related to the datadog chart label May 10, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d7f7ec685

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

*/}}
{{- define "system-probe-feature" -}}
- {{- if or .Values.datadog.securityAgent.runtime.enabled .Values.datadog.networkMonitoring.enabled .Values.datadog.systemProbe.enableTCPQueueLength .Values.datadog.systemProbe.enableOOMKill .Values.datadog.serviceMonitoring.enabled .Values.datadog.traceroute.enabled (eq (include "resolved-discovery-enabled" .) "true") (and .Values.datadog.gpuMonitoring.enabled .Values.datadog.gpuMonitoring.privilegedMode) .Values.datadog.dynamicInstrumentationGo.enabled (and .Values.datadog.securityAgent.compliance.enabled .Values.datadog.securityAgent.compliance.runInSystemProbe) (eq (include "should-enable-sbom-enrichment-usage" .) "true") -}}
+ {{- if or .Values.datadog.securityAgent.runtime.enabled .Values.datadog.networkMonitoring.enabled .Values.datadog.systemProbe.enableTCPQueueLength .Values.datadog.systemProbe.enableOOMKill .Values.datadog.serviceMonitoring.enabled .Values.datadog.traceroute.enabled (eq (include "resolved-discovery-enabled" .) "true") .Values.datadog.discovery.serviceMap.enabled (and .Values.datadog.gpuMonitoring.enabled .Values.datadog.gpuMonitoring.privilegedMode) .Values.datadog.dynamicInstrumentationGo.enabled (and .Values.datadog.securityAgent.compliance.enabled .Values.datadog.securityAgent.compliance.runInSystemProbe) (eq (include "should-enable-sbom-enrichment-usage" .) "true") -}}

P2 Badge Gate service map on legacy Autopilot

When datadog.discovery.serviceMap.enabled=true is set on legacy GKE Autopilot (providers.gke.autopilot=true with WorkloadAllowlist unavailable/HELM_FORCE_RENDER=false), this new term makes should-enable-system-probe render the system-probe container even though that Autopilot mode only permits the core agent. The existing discovery trigger goes through resolved-discovery-enabled, which suppresses system-probe for that path; service map needs the same Autopilot guard or installs on those clusters will be rejected.

Useful? React with 👍 / 👎.
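
One way to read this suggestion is a dedicated helper that folds the Autopilot check into the service map trigger. The sketch below is hypothetical: the helper name `resolved-service-map-enabled` does not exist in the chart, and the real legacy-Autopilot condition (WorkloadAllowlist availability, HELM_FORCE_RENDER) is reduced here to `providers.gke.autopilot` alone.

```yaml
{{/*
Hypothetical helper, not part of this PR.
Returns "true" only when service map is requested and the cluster is not
GKE Autopilot. The real guard would also need the WorkloadAllowlist check.
*/}}
{{- define "resolved-service-map-enabled" -}}
{{- if and .Values.datadog.discovery.serviceMap.enabled (not .Values.providers.gke.autopilot) -}}
true
{{- else -}}
false
{{- end -}}
{{- end -}}
```

`system-probe-feature` could then test `(eq (include "resolved-service-map-enabled" .) "true")` instead of the raw value, mirroring how `resolved-discovery-enabled` is consumed.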

Comment thread charts/datadog/values.yaml Outdated
Comment on lines +1146 to +1148
# datadog.discovery.serviceMap.enabled -- (bool) Enable the free Discovery Service Map mode. When true, system-probe boots a restricted USM monitor (HTTP/HTTPS topology only, not billed) so non-APM customers can see a service-to-service dependency map. Mutually exclusive with paid USM (`datadog.serviceMonitoring.enabled`); when both are set, paid USM wins and discovery is silently disabled. Linux only. Requires Agent >= 7.78.0.
serviceMap:
  enabled: false

P2 Badge Add chart version and changelog for this value

This adds a new Datadog chart value and changes rendered templates, but the commit leaves charts/datadog/Chart.yaml at 3.208.2 and does not update charts/datadog/CHANGELOG.md. The datadog chart review guide requires a chart version bump and changelog entry for template/value behavior changes, otherwise the packaged chart release will not advertise the new behavior to users upgrading the chart.

Useful? React with 👍 / 👎.

BarFinsdd and others added 2 commits May 10, 2026 17:58
Minor bump for the new datadog.discovery.serviceMap.enabled value
added in 8d7f7ec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolve chart-version-validation CI error: the badge at the top of
charts/datadog/README.md must match the version in Chart.yaml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@BarFinsdd BarFinsdd marked this pull request as draft May 13, 2026 11:23
BarFinsdd and others added 3 commits May 13, 2026 17:47
Shorten CHANGELOG, README, and values.yaml descriptions for
datadog.discovery.serviceMap.enabled to match the terse phrasing
established by PR #1645.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The system-probe configmap now emits discovery.service_map.enabled,
so every baseline that renders system-probe needs the new line.
Regenerated via `make update-test-baselines-datadog-agent`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s wiring

Ship datadog.discovery.serviceMap.enabled as nil and only render the
service_map sub-block when explicitly set, matching the chart's
discovery.enabled convention so an explicit false is preserved rather
than silently elided.

Wire serviceMap into should-render-discovery-config and
system-probe-feature so setting it alone deploys system-probe and
renders the discovery block. Do not route through
resolved-discovery-enabled: the agent processes service_map.enabled
independently of discovery.enabled, so the chart should not flip
discovery.enabled on the user's behalf.

Extend the GKE Autopilot WorkloadAllowlist fail message in NOTES.txt
to list datadog.discovery.serviceMap.enabled, and shorten the parameter
description ("paid USM" -> "restricted USM").

Includes two merges from main during development; net diff against
origin/main is the serviceMap feature only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
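
The nil-gated rendering described in this commit can be sketched as below. The guard expression and the two `service_map` lines match snippets quoted in this review; the surrounding layout is an assumption.

```yaml
# Fragment of the system-probe configmap template (layout assumed, other keys omitted).
discovery:
  network_stats:
    enabled: {{ $.Values.datadog.discovery.networkStats.enabled }}
  {{- /* Emit service_map only when the user set the value, so an explicit false survives. */}}
  {{- if not (eq $.Values.datadog.discovery.serviceMap.enabled nil) }}
  service_map:
    enabled: {{ $.Values.datadog.discovery.serviceMap.enabled }}
  {{- end }}
```

With the value left unset (nil), the `service_map` block is omitted entirely; with either `true` or `false`, it is rendered verbatim.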
@BarFinsdd BarFinsdd force-pushed the bar.fins/discovery-service-map branch from 2c8d7bf to c72fc84 Compare May 18, 2026 14:46
…rvice-map

# Conflicts:
#	charts/datadog/CHANGELOG.md
#	charts/datadog/Chart.yaml
#	charts/datadog/README.md
@BarFinsdd BarFinsdd marked this pull request as ready for review May 18, 2026 14:59

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ddee031bd0

Comment on lines +87 to +88
service_map:
  enabled: {{ $.Values.datadog.discovery.serviceMap.enabled }}

P2 Badge Ensure service map doesn't render under disabled discovery

When datadog.discovery.serviceMap.enabled=true is combined with a configuration where discovery resolves to false, such as datadog.discovery.enabled=false or an Agent pinned below 7.78, this renders service_map.enabled: true inside a discovery block whose parent enabled field remains false. The chart also starts system-probe for the new value, so the install appears to accept Service Map while the system-probe config disables the discovery subsystem that would consume it; gate the value on supported/resolved discovery or have it explicitly drive discovery.enabled in the supported case.

Useful? React with 👍 / 👎.
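
The configuration this comment describes would render roughly as follows. This is an illustrative fragment only: other discovery keys are omitted, and the indentation is assumed.

```yaml
# system-probe.yaml (fragment, as described in the review comment)
discovery:
  enabled: false        # discovery resolved to false
  service_map:
    enabled: true       # still rendered from datadog.discovery.serviceMap.enabled
```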

Comment thread charts/datadog/README.md Outdated
| datadog.disablePasswdMount | bool | `false` | Set this to true to disable mounting /etc/passwd in all containers |
| datadog.discovery.enabled | bool | `nil` | Enable Service Discovery. If omitted, the chart auto-enables it when the effective node Agent version resolved by the chart is >= 7.78.0, except on GKE Autopilot clusters where system-probe is not supported. If that resolution still yields a non-semver-ish tag, discovery treats it as latest. Explicit true/false always takes precedence. On supported Agent versions, the chart also enables `discovery.use_system_probe_lite` so discovery-only deployments can exec into `system-probe-lite`. |
| datadog.discovery.networkStats.enabled | bool | `true` | Enable Service Discovery Network Stats |
| datadog.discovery.serviceMap.enabled | bool | `nil` | Enable Discovery Service Map (HTTP/HTTPS topology only; restricted USM) |

No need to write HTTP/HTTPS; in the future we might add more protocols.

…rvice-map

# Conflicts:
#	charts/datadog/CHANGELOG.md
#	charts/datadog/Chart.yaml
#	charts/datadog/README.md
…escription

The chart-side description does not need to enumerate the agent-side
protocol scope of Discovery Service Map. Keep only the "restricted USM"
qualifier, which captures the licensing constraint that affects whether
operators can turn it on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Comment thread charts/datadog/values.yaml Outdated

# datadog.discovery.serviceMap.enabled -- (bool) Enable Discovery Service Map (restricted USM)
serviceMap:
  enabled: # false
Member

Why is this empty and not false?

Contributor Author

Followed the same pattern as datadog.discovery.enabled right above it

Member

Unless there is a good intentional reason, I don't see why a bool variable should be anything but false or true.

@BarFinsdd BarFinsdd requested a review from brycekahle May 20, 2026 08:57
@gpalmz gpalmz requested review from tedkahwaji and removed request for gpalmz May 20, 2026 15:11
@BarFinsdd BarFinsdd force-pushed the bar.fins/discovery-service-map branch from 044427e to 703c412 Compare May 24, 2026 10:54
Make the documented chart default explicit instead of nil so the rendered
system-probe configmap shows the value without relying on the nil-gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
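
For reference, the two default styles debated in this thread differ only in whether the key carries a value. A side-by-side sketch, with the parent `datadog.discovery` keys omitted:

```yaml
# Style initially used in this PR: the key is present but empty, so Helm sees nil.
serviceMap:
  enabled: # false
---
# Style requested in review and adopted by this commit: a concrete boolean default.
serviceMap:
  enabled: false
```

The nil style lets a template distinguish "unset" from an explicit `false`; the concrete default removes the need for that distinction.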
BarFinsdd and others added 2 commits May 27, 2026 09:42
Adds the now-rendered `service_map: enabled: false` block to the
system-probe configmap snapshot in 42 baseline manifests. Generated via
`make update-test-baselines-datadog-agent`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rvice-map

# Conflicts:
#	charts/datadog/CHANGELOG.md
@github-actions
Contributor

github-actions Bot commented May 27, 2026

⚠️ GKE Autopilot / GDC Baseline Manifests Changed

This PR modifies GKE Autopilot or GDC baseline manifest snapshots. Before merging, confirm:

  • GKE Autopilot/GKE GDC baseline manifest diffs have been reviewed and confirmed to be supported in GKE Autopilot and the latest Datadog WorkloadAllowlist.

If changes introduce constraints not yet covered by the Datadog WorkloadAllowlist CR, gate them with {{- if not (or .Values.providers.gke.autopilot .Values.providers.gke.gdc) }} until the WorkloadAllowlist is updated.
See gke-constraints-review-guide.md for the full constraint reference.

BarFinsdd and others added 3 commits May 27, 2026 10:07
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
use_system_probe_lite: {{ include "discovery-use-system-probe-lite" . }}
network_stats:
  enabled: {{ $.Values.datadog.discovery.networkStats.enabled }}
{{- if not (eq $.Values.datadog.discovery.serviceMap.enabled nil) }}
Member

I think this can be removed now?

BarFinsdd and others added 2 commits May 28, 2026 10:29
…rvice-map

# Conflicts:
#	charts/datadog/CHANGELOG.md
#	charts/datadog/Chart.yaml
#	charts/datadog/README.md
Bump to 3.215.0 (minor on top of main's 3.214.1) for the
datadog.discovery.serviceMap.enabled feature. Remove the always-true
nil-guard around service_map in system-probe-configmap.yaml since
serviceMap.enabled has a concrete default in values.yaml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
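
After this commit, the relevant part of the system-probe configmap template presumably reduces to an unconditional block. A sketch assembled from the snippets quoted in this review; the indentation and the omitted keys are assumptions.

```yaml
# Fragment of the system-probe configmap template (other discovery keys omitted).
discovery:
  network_stats:
    enabled: {{ $.Values.datadog.discovery.networkStats.enabled }}
  service_map:
    enabled: {{ $.Values.datadog.discovery.serviceMap.enabled }}
```

With the chart default of `false`, this emits `service_map.enabled: false` unless the value is overridden, which is consistent with the baseline updates mentioned in the earlier commit.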
@BarFinsdd BarFinsdd force-pushed the bar.fins/discovery-service-map branch from 3ed5b12 to 6ef1577 Compare May 28, 2026 07:49
Labels

chart/datadog This issue or pull request is related to the datadog chart mergequeue-status: done
