feat(panoramic): add native macOS runtime for integration tests #1735

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 57 commits into main from thieman/macos-integration-tests-baremetal
Jun 1, 2026

Conversation

@thieman
Contributor

@thieman thieman commented May 26, 2026

Human Summary

Adds support for running integration tests (not correctness tests) under Mac native. All but 3 of the existing integration tests are updated to support Mac. In CI, this runs under Tart VMs for arm64 and on native runners for amd64. To support this, we have to do a bunch of gymnastics so that our Agent under test doesn't conflict with the system Agent on the native runners.

Runtime is selected via a new --runtime flag to panoramic which defaults to Mac native tests when running on a Mac.

Summary

Adds the ability to run ADP integration tests as native macOS processes (no Docker, no virtualization), and wires the suite into CI on the existing macos:sonoma-arm64 / macos:sonoma-amd64 runners.

24 of 27 integration tests pass on native_macos in ~3 minutes. Both standalone (just-ADP) and converged (Agent + ADP) tests are supported. Both CI jobs are passing.

This is the bare-metal subset of #1721; a Tart-based local-dev wrapper will land separately in a follow-up.

Why

ADP has no test coverage on macOS today. The entire 27-test integration suite runs inside a Linux Docker container. Adding native macOS support is the first concrete step toward macOS as a supported platform (tracked in #1718): it builds ADP for macOS, exercises real macOS startup behavior, and gives us a place to land further coverage as the platform gaps from #1718 are closed.

Architecture

┌─────────────────────────────────────────────────────────────┐
│ Layer 2: macOS environment (CI bare-metal runner)           │
│   `make test-integration-macos-ci` ← CI entry point         │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: new Rust code (panoramic + airlock)                │
│   airlock::native::NativeProcess                            │
│   panoramic::native_runner::NativeIntegrationRunner         │
│   runtimes: [docker, native_macos] in test configs          │
│   requires_core_agent: true  for converged tests            │
└─────────────────────────────────────────────────────────────┘

The native runner is a parallel implementation of the Docker IntegrationRunner. The genuinely-duplicated assertion-loop logic is factored into a shared run_assertion_steps helper; the spawn / log capture / cleanup architectures stay separate because they're legitimately different (bollard vs tokio::process).
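A minimal sketch of what a shared fail-fast assertion loop could look like. Only the name `run_assertion_steps` comes from this PR; the `AssertionStep` trait, `StepResult`, and the `LogContains` step are hypothetical, and the sketch is synchronous where the real helper is presumably async.

```rust
/// Outcome of one assertion step (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct StepResult {
    pub name: String,
    pub passed: bool,
    pub message: String,
}

/// A step that can be checked against some runtime-specific context `C`.
pub trait AssertionStep<C> {
    fn name(&self) -> &str;
    fn check(&self, ctx: &C) -> Result<(), String>;
}

/// Runs steps in order and stops at the first failure (fail-fast).
pub fn run_assertion_steps<C>(ctx: &C, steps: &[Box<dyn AssertionStep<C>>]) -> Vec<StepResult> {
    let mut results = Vec::with_capacity(steps.len());
    for (index, step) in steps.iter().enumerate() {
        // Step-index debug output; a real helper would use a logging crate.
        eprintln!("running assertion step {index}: {}", step.name());
        match step.check(ctx) {
            Ok(()) => results.push(StepResult {
                name: step.name().to_string(),
                passed: true,
                message: String::new(),
            }),
            Err(message) => {
                results.push(StepResult {
                    name: step.name().to_string(),
                    passed: false,
                    message,
                });
                break; // later steps are not run after a failure
            }
        }
    }
    results
}

/// Example step: passes when any captured log line contains the needle.
pub struct LogContains(pub &'static str);

impl AssertionStep<Vec<String>> for LogContains {
    fn name(&self) -> &str {
        "log_contains"
    }

    fn check(&self, logs: &Vec<String>) -> Result<(), String> {
        if logs.iter().any(|line| line.contains(self.0)) {
            Ok(())
        } else {
            Err(format!("no log line contains {:?}", self.0))
        }
    }
}
```

Because the loop is generic over the context type, a Docker runner and a native runner could each pass their own context while sharing the same ordering, logging, and error-reporting behavior.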

The Makefile intentionally does not expose a "run on your own Mac" convenience target. A Datadog Agent running on a dev host will collide with the test ADP on port 8125 (DSD UDP); the right answer for local-dev isolation is the Tart wrapper in the follow-up PR. Until then, CI is the only supported path.

Key changes

Rust framework

  • airlock::native::NativeProcess (new, ~240 lines) — tokio::process wrapper with kill-on-drop, always places spawned processes in their own process group so cleanup can signal the entire group (parent plus any forked helpers like trace-agent / process-agent). Background watcher observes the child's exit, fires a CancellationToken, and records the exit code in a shared OnceLock cell.
  • panoramic::native_runner::NativeIntegrationRunner (new, ~365 lines) — parallel to the Docker integration runner. Creates a per-test temp config directory, optionally spawns the Datadog Core Agent, waits for the IPC handshake to be ready, then spawns ADP. Both processes feed into the same LogBuffer so existing log-based assertions work unchanged.
  • run_assertion_steps (new helper in assertions/mod.rs) — extracted from the Docker IntegrationRunner's implementation (the more complete one: fail-fast on first failure, step-index debug logging, consistent error reporting). Both runners delegate to it.
  • AssertionContext gains an is_native flag and an Option<ExitCodeCell> so:
    • file_contains reads files via tokio::fs on native (vs. docker exec cat on Docker)
    • adp_exits_with (new assertion) abstracts the runtime difference for "did ADP exit with code N":
      • On docker: greps the captured log buffer for agent-data-plane exited with code N (the s6 supervisor's exit message)
      • On native_macos: reads the exit code from the per-process OnceLock
  • log_contains / log_not_contains fixed to do a final post-exit buffer read instead of bailing immediately on container exit. Required for tests where ADP exits within seconds (e.g., adp-disabled-exit).
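The runtime split behind adp_exits_with can be sketched as follows. This is an illustration under assumptions, not the PR's code: the `Runtime` enum, the `adp_exited_with` function, and its signature are hypothetical, and `ExitCodeCell` is assumed here to be an `Arc<OnceLock<i32>>`. The log message format is the one quoted in the list above.

```rust
use std::sync::{Arc, OnceLock};

/// Shared cell a background watcher fills in once the process exits.
/// (Assumed shape; the PR only says the exit code lives in a shared OnceLock.)
pub type ExitCodeCell = Arc<OnceLock<i32>>;

/// Hypothetical runtime selector for this sketch.
pub enum Runtime {
    Docker,
    NativeMacos,
}

/// Returns true when ADP is known to have exited with `expected`.
pub fn adp_exited_with(
    runtime: &Runtime,
    expected: i32,
    log_lines: &[String],
    exit_code: Option<&ExitCodeCell>,
) -> bool {
    match runtime {
        // Docker: look for the supervisor's exit message in the captured logs.
        Runtime::Docker => {
            let needle = format!("agent-data-plane exited with code {expected}");
            log_lines.iter().any(|line| line.contains(&needle))
        }
        // Native: read the exit code recorded for the process, if any yet.
        Runtime::NativeMacos => exit_code.and_then(|cell| cell.get().copied()) == Some(expected),
    }
}
```

Note that the substring match in the Docker branch would also accept code 10 when asked for code 1; a stricter version would anchor the match at the end of the line.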

Config schema

  • New runtimes: Vec<String> field on IntegrationConfig. Defaults to ["docker"]. Configs that declare multiple runtimes expand into one Test instance per runtime at discovery time, named {base_name}/{runtime}.
  • New requires_core_agent: bool field. When true on the native_macos runtime, the native runner spawns the Core Agent alongside ADP. Informational on docker (s6 always runs both).
  • New panoramic run --runtime <name> filter so make targets can scope to one runtime cleanly.
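A rough sketch of the discovery-time expansion and the runtime filter described above. The `TestInstance` type and function names are hypothetical. Two simplifications are assumptions of this sketch: it suffixes the runtime even for single-runtime configs, and it treats the name filter as an exact match.

```rust
/// One runnable test after expansion (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct TestInstance {
    pub name: String,
    pub runtime: String,
}

/// Default used when a config omits the `runtimes` field.
pub fn default_runtimes() -> Vec<String> {
    vec!["docker".to_string()]
}

/// Expands one config into one instance per runtime, named `{base_name}/{runtime}`.
pub fn expand_runtimes(base_name: &str, runtimes: &[String]) -> Vec<TestInstance> {
    runtimes
        .iter()
        .map(|runtime| TestInstance {
            name: format!("{base_name}/{runtime}"),
            runtime: runtime.clone(),
        })
        .collect()
}

/// Applies the optional runtime and name filters; a test must match both.
pub fn select<'a>(
    tests: &'a [TestInstance],
    runtime: Option<&str>,
    name: Option<&str>,
) -> Vec<&'a TestInstance> {
    tests
        .iter()
        .filter(|t| runtime.map_or(true, |r| t.runtime == r))
        .filter(|t| name.map_or(true, |n| t.name == n))
        .collect()
}
```

With `runtimes: [docker, native_macos]`, a config named basic-startup would yield basic-startup/docker and basic-startup/native_macos, and filtering on native_macos keeps only the second.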

Provisioning + CI

  • make provision-macos-test-env — idempotently installs the Datadog Agent at /opt/datadog-agent (downloads the official DMG, runs installer -pkg, tolerates the postinstall non-zero exit since the binaries land before postinstall runs) and bootstraps the IPC cert + auth_token by running the Agent briefly. Pinned to Agent 7.78.0 via MACOS_TEST_AGENT_VERSION for reproducibility.
  • make test-integration-macos-ci — CI entry point: builds binaries, runs provision-macos-test-env, runs the native suite.
  • make build-adp-native: runs cargo build --release --bin agent-data-plane with the same APP_* env vars that build-adp-base uses, so the binary logs as | DATAPLANE | instead of | UNKNOWN |.
  • .gitlab/e2e.yml — new test-integration-macos-arm64 and test-integration-macos-amd64 jobs (e2e stage) extending the existing bare-metal macOS runner tags. before_script does a defensive sudo pkill to clean up any orphan Agent processes from prior shared-runner jobs.

Tests enabled

| Category | Tests | Status |
| --- | --- | --- |
| Standalone (just ADP) | 14 of 17 | All passing |
| Converged (Agent + ADP) | 10 of 10 | All passing |
| Container-supervisor behavior | 0 of 1 | Deferred |
| Bind-host (PANORAMIC_DYNAMIC shell hooks) | 0 of 2 | Deferred |

3 tests intentionally not enabled (filtered out via the runtimes field):

  • adp-rar-disabled: assertion is "container stable for 10s," which really tests "s6 keeps restarting ADP." No equivalent supervisor on native. Re-enable when the native runner grows restart-on-failure semantics or the test is rewritten to assert on retry behavior directly.
  • dogstatsd-bind-host, dogstatsd-bind-custom-hostname: use PANORAMIC_DYNAMIC_* env shell hooks that run hostname -i and echo ... >> /etc/hosts — Linux-container-isms not portable to a macOS host (hostname -i isn't supported on macOS; writing /etc/hosts needs root and has real host side effects).

Test plan

Verified locally on Apple Silicon (macOS 26.4.1), inside an ephemeral Tart macOS Sequoia VM with a freshly provisioned Datadog Agent (this matches the shape of a clean CI runner):

$ make test-integration-macos-ci
...
PASSED: 24 passed, 0 failed, 24 total (192.50s)

CI passes the same tests on both macos:sonoma-arm64 and macos:sonoma-amd64.

Regression check (Linux Docker path):

  • panoramic list -d test/integration/cases shows the original tests plus the 24 new */native_macos variants.
  • The docker variant of every previously-enabled test still discovers and dispatches to the existing IntegrationRunner.
  • cargo test -p airlock -p panoramic passes.
  • make check-clippy, make check-fmt, make check-docs clean (pre-commit enforced on every commit).

Notable findings while building this

  1. ADP's bootstrap requires a datadog.yaml to exist (defaulting to /opt/datadog-agent/etc/datadog.yaml on macOS). The native runner creates a per-test empty config so it works on hosts with or without an Agent install.
  2. The Agent's authoritative config (via the config stream) overrides ADP's env vars because ConfigurationLoader follows "later sources win." If ADP set DD_AUTH_TOKEN_FILE_PATH=<per-test> but the Agent didn't, the Agent would advertise the platform default and ADP would honor the stream value for post-bootstrap IPC clients, failing TLS with UnknownIssuer. Fix: set DD_AUTH_TOKEN_FILE_PATH on the Agent too.
  3. Core Agent forks trace-agent and process-agent that orphan onto launchd when the parent is SIGKILL'd. Native runner spawns the Agent with process_group(0) and signals the whole group on cleanup.
  4. The DMG installer's postinstall hook fails with exit code 112, but the binaries land in /opt/datadog-agent before the postinstall runs. The provision target tolerates this non-zero exit.
  5. Bootstrap-via-sudo leaves files owned by root mode 600. The provision target chowns to the current user only if the files aren't already readable, so sudo isn't re-prompted on already-set-up systems.
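Finding 2 is easier to see with a toy model of "later sources win" layering. This is only an illustration: `layer_sources` and the key name `auth_token_file_path` are hypothetical, and the per-test path is made up; the platform default path is the one mentioned in the related commit message.

```rust
use std::collections::HashMap;

/// Merges config sources in order; a key set by a later source replaces
/// the value from any earlier source.
pub fn layer_sources(sources: &[HashMap<String, String>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for source in sources {
        for (key, value) in source {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}

/// Builds a single-key source (helper for the example only).
pub fn source(key: &str, value: &str) -> HashMap<String, String> {
    let mut map = HashMap::new();
    map.insert(key.to_string(), value.to_string());
    map
}
```

If ADP's env layer holds the per-test path and the Agent's config stream, applied later, still advertises `/opt/datadog-agent/etc/auth_token`, the merged value is the platform default. Once the Agent is given the same per-test path, both layers agree and the merged value is the per-test one.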

Related: #1718 (macOS platform gap tracking), #1721 (companion PR with optional Tart VM wrapper for local-dev isolation).

@dd-octo-sts dd-octo-sts Bot added area/docs Reference documentation. area/test All things testing: unit/integration, correctness, SMP regression, etc. labels May 26, 2026

@pr-commenter

pr-commenter Bot commented May 26, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 4e5a7e2 · Comparison: 09379a5 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 37.93 MiB (baseline) vs 37.93 MiB (comparison)
Size Change: +0 B (+0.00%)

✅ Binary size difference within threshold

Changes by Module

| Module | File Size | Symbols |
| --- | --- | --- |
| anon.85240eacea40817b540ad191ce7e90d0.1.llvm.3508058534102656255 | +129 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.1.llvm.7567146742271023864 | -129 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.4.llvm.3508058534102656255 | +114 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.4.llvm.7567146742271023864 | -114 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.3.llvm.3508058534102656255 | +108 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.3.llvm.7567146742271023864 | -108 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.0.llvm.3508058534102656255 | +96 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.0.llvm.7567146742271023864 | -96 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.2.llvm.3508058534102656255 | +94 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.2.llvm.7567146742271023864 | -94 B | 1 |

Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW]    +129  [NEW]     +40    anon.85240eacea40817b540ad191ce7e90d0.1.llvm.3508058534102656255
  [NEW]    +114  [NEW]     +25    anon.85240eacea40817b540ad191ce7e90d0.4.llvm.3508058534102656255
  [NEW]    +108  [NEW]     +19    anon.85240eacea40817b540ad191ce7e90d0.3.llvm.3508058534102656255
  [NEW]     +96  [NEW]      +7    anon.85240eacea40817b540ad191ce7e90d0.0.llvm.3508058534102656255
  [NEW]     +94  [NEW]      +5    anon.85240eacea40817b540ad191ce7e90d0.2.llvm.3508058534102656255
  [DEL]     -94  [DEL]      -5    anon.85240eacea40817b540ad191ce7e90d0.2.llvm.7567146742271023864
  [DEL]     -96  [DEL]      -7    anon.85240eacea40817b540ad191ce7e90d0.0.llvm.7567146742271023864
  [DEL]    -108  [DEL]     -19    anon.85240eacea40817b540ad191ce7e90d0.3.llvm.7567146742271023864
  [DEL]    -114  [DEL]     -25    anon.85240eacea40817b540ad191ce7e90d0.4.llvm.7567146742271023864
  [DEL]    -129  [DEL]     -40    anon.85240eacea40817b540ad191ce7e90d0.1.llvm.7567146742271023864
  [ = ]       0  [ = ]       0    TOTAL

@pr-commenter

pr-commenter Bot commented May 26, 2026

Regression Detector (Agent Data Plane)

Run ID: 7a3ccbff-4898-4b38-a9fe-23122920be0e
Baseline: 4e5a7e24 · Comparison: 09379a59 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
| --- | --- | --- | --- |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ +2.49 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +1.83 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ +1.17 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ +1.10 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ +0.36 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.14 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ +0.11 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ -0.08 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.04 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ +0.03 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.03 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ +0.01 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ -0.03 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ -0.04 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ +0.04 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ -0.09 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ -0.10 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ -0.25 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ -0.35 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ -0.38 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ -0.39 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.40 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ -0.41 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ -0.58 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ +0.92 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ -1.17 | metrics profiles logs |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ -1.55 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ -3.43 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ -3.49 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ -4.22 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | 🟢 -7.69 | metrics profiles logs |

Bounds Checks: ✅ Passed (5)

| experiment | check | replicates | observed | links |
| --- | --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 | ✅ 125 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 | ✅ 39.7 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 | ✅ 60.9 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | ✅ 182 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 | ✅ 26.6 MiB ≤ 40 MiB | metrics profiles logs |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
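The rule in the explanation above can be written out as a small function. This is a sketch under assumptions: the `Goal` and `Verdict` types and the `classify` function are hypothetical, `smp_confirms` stands in for SMP's own flag (such as is_regression) for the direction in question, and experiments tagged (ignored) are assumed to be filtered out before calling it.

```rust
/// Optimization goal of an experiment (hypothetical type).
pub enum Goal {
    Cpu,
    Memory,
    Throughput,
}

/// Classification shown in the Δ mean % column.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Improvement,
    Regression,
    Neutral,
}

/// Applies the 5% threshold and the direction rule for the goal.
pub fn classify(goal: Goal, delta_mean_pct: f64, smp_confirms: bool) -> Verdict {
    const THRESHOLD_PCT: f64 = 5.0;
    if !smp_confirms || delta_mean_pct.abs() <= THRESHOLD_PCT {
        return Verdict::Neutral;
    }
    // More CPU or memory is worse; more throughput is better.
    let increase_is_bad = matches!(goal, Goal::Cpu | Goal::Memory);
    let regressed = (delta_mean_pct > 0.0) == increase_is_bad;
    if regressed {
        Verdict::Regression
    } else {
        Verdict::Improvement
    }
}
```

Under these assumptions, the -7.69% CPU change for dsd_uds_1mb_3k_contexts_cpu classifies as an improvement, while the -3.49% CPU change stays neutral because it is under the threshold.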

@dd-octo-sts dd-octo-sts Bot added the area/ci CI/CD, automated testing, etc. label May 26, 2026
thieman added a commit that referenced this pull request May 26, 2026
Replaces 'e.g.' with 'for example' in doc comments (Google.Latin),
swaps an em dash for a semicolon in a runtime-field docstring
(Google.EmDash), and adds 'launchd' to the technical vocabulary.

Also drops docs/superpowers/plans/2026-05-21-macos-native-integration-tests.md;
the plan was a one-time implementation artifact and the PR description
on #1735 supersedes it.
@dd-octo-sts dd-octo-sts Bot removed the area/docs Reference documentation. label May 26, 2026
thieman added a commit that referenced this pull request May 27, 2026
@thieman thieman force-pushed the thieman/macos-integration-tests-baremetal branch 4 times, most recently from fcd4492 to bcdff89 Compare May 27, 2026 20:02
@thieman thieman marked this pull request as ready for review May 27, 2026 20:41
@thieman thieman requested a review from a team as a code owner May 27, 2026 20:41
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6ac79736da

Comment on lines +150 to +151
let mut agent_env = self.test_case.container.env.clone();
agent_env.insert("DD_AUTH_TOKEN_FILE_PATH".to_string(), auth_token_path.clone());

P1 Badge Disable Core Agent DogStatsD for native converged tests

When a mac test has requires_core_agent: true and sets DD_DATA_PLANE_DOGSTATSD_ENABLED=true, this passes the same environment to the Core Agent that ADP receives, but the Docker path explicitly writes DD_USE_DOGSTATSD=0 in docker/cont-init.d/99-agent-data-plane.sh so the Agent does not bind DogStatsD while ADP owns it. Native converged tests such as adp-config-stream now let the Core Agent start first with DogStatsD enabled, so it can claim port 8125 before ADP, causing ADP startup failures or port assertions to observe the wrong process.

Comment on lines +98 to +99
let read_result = if ctx.is_host_process {
read_file_local(&self.path).await

P2 Badge Isolate host file assertions from stale files

For host-process tests this reads the absolute path directly from the shared macOS filesystem, so existing files from earlier runs can satisfy file_contains before the current ADP/Core Agent writes anything. The newly opted-in logging tests assert paths like /tmp/adp-custom.log, /tmp/coreagent-only.log, and /var/log/datadog/agent-data-plane.log; unlike the Docker container filesystem, those paths persist across tests and runner jobs, which can hide logging regressions unless the runner removes or scopes these files per test before polling.

Member

Fair thing to raise... I see we're sandboxing our logs and whatnot in their own directory, but are we cleaning that up before/after test runs?

(Maybe I just haven't gotten far enough through the PR yet.)

Contributor Author

[GPT-5] Addressed in afa587d. The Unix runner now wraps the per-test panoramic-unix-* state directory in an RAII guard and removes only the framework-owned directory it creates. I intentionally did not clean paths from test YAML (for example explicit log-file assertion paths), since those are test-owned rather than runner-owned.
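A minimal sketch of the RAII idea described in this reply, assuming a guard that owns only the directory it creates. `StateDirGuard` and its methods are hypothetical names; the `panoramic-unix-` prefix and the empty `datadog.yaml` come from this PR, while the rest of the directory naming scheme is invented for the example.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Owns a per-test state directory and deletes it when dropped.
pub struct StateDirGuard {
    path: PathBuf,
}

impl StateDirGuard {
    /// Creates `<parent>/panoramic-unix-<test>-<pid>` with an empty datadog.yaml.
    pub fn create(parent: &Path, test_name: &str) -> io::Result<Self> {
        let dir_name = format!("panoramic-unix-{}-{}", test_name, std::process::id());
        let path = parent.join(dir_name);
        fs::create_dir_all(&path)?;
        // Tests pass config through env vars, so the file is intentionally empty.
        fs::write(path.join("datadog.yaml"), "")?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for StateDirGuard {
    fn drop(&mut self) {
        // Best-effort cleanup of the framework-owned directory only; paths
        // named in test YAML are never touched.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Tying cleanup to `Drop` means the directory is removed on early returns and panics that unwind, not just on the happy path.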

@atanzu
Contributor

atanzu commented May 28, 2026

I have a concern about native runners clashing with the local setup. Docker allows us to create an isolated environment that (1) doesn't conflict with the host setup and (2) doesn't allow the code under test to do weird things.

For the latter, did you consider things like launchd limits or Sandbox/Seatbelt?

For the former, technically we have chroot on macOS (with certain limitations, but still).

thieman added 13 commits May 28, 2026 11:06
Introduces NativeProcess and NativeProcessConfig in airlock, mirroring the
relevant surface of the Docker Driver but spawning a local binary instead of a
container. Captured stdout/stderr lines flow through a small LogSink trait so
consumers can bridge to their own log buffer types without coupling airlock to
panoramic-specific code.

This is the foundation for running ADP integration tests natively on macOS,
where ADP runs as a real macOS process rather than inside a Linux container.

Adds a 'runtimes' field to IntegrationConfig (default ['docker']). At test
discovery time, configs that declare multiple runtimes expand into one Test
instance per runtime, each named '{base_name}/{runtime}'. The Test trait impl
for IntegrationConfig dispatches to either the existing Docker IntegrationRunner
or the new NativeIntegrationRunner based on the resolved runtime.

NativeIntegrationRunner spawns the ADP binary directly via tokio::process,
piping stdout/stderr into the same LogBuffer the Docker path uses so existing
assertions work unchanged. The binary path is discovered via the
ADP_BINARY_PATH env var, defaulting to target/release/agent-data-plane.

The Docker code path is untouched; declaring 'runtimes: [docker]' (or omitting
the field) produces the existing behavior.

basic-startup is a single-process standalone ADP test (no Core Agent
required), making it the simplest integration test to validate the new
native_macos runtime end-to-end. The same config now runs in both Docker
(on Linux) and as a native process (on macOS).

Adds two new Makefile targets:

- build-adp-native: builds agent-data-plane in release mode for the current
  host. On macOS this produces a native macOS binary.
- test-integration-macos: builds panoramic and ADP, then runs panoramic
  filtered to a single native_macos integration test (defaulting to
  basic-startup/native_macos, overridable via CASE=...).

This works on any macOS host with cargo + rustc, no Docker required. Verified
locally on Apple Silicon with basic-startup/native_macos passing in ~11s.

ADP's bootstrap loader requires the configuration file to exist, defaulting
to /opt/datadog-agent/etc/datadog.yaml on macOS. On a clean macOS host
without an installed Datadog Agent (which a CI runner would be), ADP fails
immediately with 'No such file or directory'.

Fix by having NativeIntegrationRunner create a per-test temp directory with
an empty datadog.yaml and passing it to ADP via -c. Tests communicate config
through env vars, so the file itself is intentionally empty.

Also splits the Makefile target so 'test-integration-macos-run' can be
invoked against pre-built binaries (useful for CI build-once-run-many) while
'test-integration-macos' remains the build+run convenience wrapper for local
use. Quotes $(CURDIR) so the run target survives paths with spaces.

Adds a new --runtime <name> option to 'panoramic run' that restricts
the test run to tests whose Test::runtime() matches the given value.
Composes with the existing -t name filter (AND semantics): a test
must match BOTH filters to be selected when both are set.

Updates the Makefile's test-integration-macos-run target to default
to '--runtime native_macos -p 1' (run all native_macos tests serially)
instead of hardcoding basic-startup. Tests can still be selected
individually with CASE=<name>/native_macos.

…nable 13 more standalone integration tests on native_macos

Adds 'runtimes: [docker, native_macos]' to 13 of the standalone integration
tests. Combined with basic-startup (enabled earlier on this branch), that
brings the native_macos coverage to 14 of 17 standalone tests.

Also populates the AssertionContext's port_mappings in the native runner
using the test config's 'exposed_ports' as identity mappings (host port ==
'container' port, since native has no port remapping). The Docker path uses
this map to translate container ports to Docker-allocated host ports; on
native we just need every probed port to appear in the map for the existing
port_listening assertion to work unchanged.

Tests enabled in this commit:
- adp-memory-mode-disabled
- adp-memory-mode-permissive-exceeds-limit
- adp-memory-mode-permissive-within-limit
- adp-memory-mode-strict-within-limit
- adp-no-pipelines-exit
- dogstatsd-autoscale-udp
- dogstatsd-default-bind
- dogstatsd-enabled
- dogstatsd-non-local-overrides-bind-host
- otlp-traces-enabled
- privileged-api-endpoints
- telemetry-endpoint
- unprivileged-api-endpoints

Verified all 14 pass via 'make test-integration-macos' (134s total, serial).

Tests intentionally NOT enabled (each needs adaptation):
- adp-memory-mode-strict-exceeds-limit (asserts on s6 supervisor log)
- dogstatsd-bind-host (uses 'hostname -i' which works differently on macOS)
- dogstatsd-bind-custom-hostname (writes to /etc/hosts via PANORAMIC_DYNAMIC)
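
The identity port mapping described in this commit (host port equals 'container' port on native) can be sketched as below. Both function names are hypothetical; the point is only that every probed port must appear in the map so a port_listening style lookup keeps working unchanged.

```rust
use std::collections::HashMap;

/// Builds a container-port to host-port map where each port maps to itself.
pub fn identity_port_mappings(exposed_ports: &[u16]) -> HashMap<u16, u16> {
    exposed_ports.iter().map(|&port| (port, port)).collect()
}

/// Looks up the host port to probe for a given 'container' port.
pub fn resolve_host_port(mappings: &HashMap<u16, u16>, container_port: u16) -> Option<u16> {
    mappings.get(&container_port).copied()
}
```

On Docker the same lookup would return whatever host port Docker allocated; on native it returns the port itself, and a port missing from `exposed_ports` yields `None`.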
Adds the ability for native_macos integration tests to spawn the Datadog
Core Agent alongside ADP, sharing a per-test config directory so they
authenticate over IPC the same way they would in production.

Test configs opt in by setting 'requires_core_agent: true'. When set, the
native runner:

  1. Resolves the Core Agent binary (CORE_AGENT_BINARY_PATH env var,
     defaulting to /opt/datadog-agent/bin/agent/agent).
  2. Spawns the Agent against the per-test config dir, in a new process
     group so its trace-agent and process-agent child processes can be
     reaped together on cleanup.
  3. Waits up to 60s for the Agent to write 'auth_token' and
     'ipc_cert.pem' into the config dir.
  4. Spawns ADP with DD_AUTH_TOKEN_FILE_PATH pointing at the per-test
     auth_token, so ADP's IPC client uses the same per-test credentials
     and ADP's API server uses the matching cert.
  5. On cleanup, SIGTERM then SIGKILL the entire Agent process group
     (parent + trace-agent + process-agent) to prevent orphans holding
     ports between tests.

The 'requires_core_agent' field is informational on the existing docker
runtime, which always runs both processes via s6.
Enables the first three converged tests in the native_macos runtime:

  - adp-rar-registration: ADP successfully registers with the Agent's
    Remote Agent Registry.
  - adp-rar-disabled: ADP handles registration failure gracefully when
    the Agent has the RAR disabled.
  - adp-config-check-warn: ADP warns (but does not exit) on
    medium-severity unsupported config keys.

These all rely on the converged-spawn support added in the previous
commit. Verified inside an ephemeral Tart macOS VM with a freshly
installed Datadog Agent 7.78.0.

Remaining converged tests are intentionally NOT enabled yet:

  - adp-cmd-port: needs investigation; requires a specific cmd_port that
    isn't being honored end-to-end on macOS yet.
  - adp-config-check-exit: asserts on the s6 supervisor's exit log,
    which has no native equivalent.
  - adp-config-stream: ADP waits indefinitely for config when the test
    uses the new config stream endpoint; needs investigation.
  - adp-logging-*: assert on Linux log paths (/var/log/datadog/...);
    macOS uses /opt/datadog-agent/logs/... so the assertions need
    platform-specific paths.

When ADP's bootstrap config flow uses the new config stream endpoint
(DD_DATA_PLANE_USE_NEW_CONFIG_STREAM_ENDPOINT=true), the Agent's
authoritative configuration is layered on top of ADP's env vars and
takes precedence (per ConfigurationLoader's ordering: later sources
win). Previously the native runner only set DD_AUTH_TOKEN_FILE_PATH on
ADP, but the Agent's config stream still advertised the platform default
(/opt/datadog-agent/etc/auth_token). ADP would honor the stream value
for its post-bootstrap IPC clients, load the wrong cert, and fail TLS
validation with 'invalid peer certificate: UnknownIssuer' — even though
the cert and key on disk in the per-test state directory were correct.

Fix: pass DD_AUTH_TOKEN_FILE_PATH to the Agent too so its config stream
advertises the per-test path that both processes are actually using.

Also enables adp-cmd-port and adp-config-stream on native_macos, which
hit exactly this failure (they pass with the fix, observed in a Tart VM
with a freshly provisioned Datadog Agent 7.78.0).

…P_* metadata to ADP native build

Two related fixes needed for the adp-logging-* integration tests on the
native_macos runtime:

1. The file_contains assertion previously always shelled out to
   'docker exec cat <path>' to read files from the container. On native
   there is no container; files referenced by the test live on the host
   filesystem directly. Adds an is_native flag to AssertionContext (set
   by NativeIntegrationRunner) and branches the assertion to read via
   tokio::fs::read_to_string when native.

2. The Makefile's build-adp-native target was running 'cargo build'
   without the APP_FULL_NAME / APP_SHORT_NAME / APP_IDENTIFIER /
   APP_GIT_HASH / APP_VERSION / APP_BUILD_DATE env vars that the
   saluki-metadata build script reads. Without them, ADP logs as
   '| UNKNOWN |' instead of '| DATAPLANE |', and tests that look for
   the 'DATAPLANE' marker fail. Aligns with build-adp-base.

Builds on the converged-spawn support and the DD_AUTH_TOKEN_FILE_PATH
alignment + file_contains native code path landed in previous commits.

Newly enabled:
  - adp-cmd-port: ADP connects to the Agent on a custom cmd_port (7777)
  - adp-config-stream: ADP reaches a healthy topology via the config
    stream from the Agent
  - adp-logging-default-path: ADP writes its log to the platform-default
    path under converged operation
  - adp-logging-ignores-core-agent-log-file: ADP does not honor the
    Core Agent's DD_LOG_FILE setting and keeps using its own log path
  - adp-logging-respects-data-plane-log-file: ADP honors
    DD_DATA_PLANE_LOG_FILE when set explicitly

Total native_macos integration coverage is now 22 of 27 integration
tests (17 standalone + 5 converged). Verified end-to-end inside an
ephemeral Tart macOS VM with a freshly provisioned Datadog Agent 7.78.0
in 3m9s wall clock.

Remaining 5 tests intentionally not enabled:
  - adp-disabled-exit, adp-config-check-exit: assert on the s6
    supervisor log line in the converged Docker image; no s6 on native,
    so the assertion has no equivalent without a runner change.
  - adp-memory-mode-strict-exceeds-limit: same s6 supervisor log
    assertion.
  - dogstatsd-bind-host, dogstatsd-bind-custom-hostname: use
    PANORAMIC_DYNAMIC env shell hooks that run 'hostname -i' and
    'echo ... >> /etc/hosts'; these are valid in a Linux container but not
    portable to a macOS host.
Comment thread .gitlab-ci.yml
# bare-metal runner capacity); non-interruptible jobs continue to completion as before.
# https://docs.gitlab.com/ci/yaml/#workflowauto_cancelon_new_commit
auto_cancel:
  on_new_commit: interruptible
Contributor Author

Made the Mac jobs specifically interruptible and added this definition to the workflow so they get canceled when new commits are pushed. Our Mac capacity is constrained: we only have 3 concurrent amd64 runners across the project.

/// to the flat key `data_plane_api_listen_address` and are silently ignored; we use double
/// underscores at every dot boundary for the deep ADP / OTLP keys below. The top-level Agent
/// env vars (`DD_CMD_PORT` etc.) are explicitly queried by the Agent so they don't need it.
pub fn port_isolation_env() -> HashMap<String, String> {
Contributor Author

@thieman thieman May 28, 2026

This provides a shared set of env vars that supply non-default values for anything, such as port assignments or file paths, that could conflict with another Agent running on the machine. This is necessary because the amd64 runners are non-virtualized and have their own Agents running on them. For consistency, these variables are applied to all test runs, native or otherwise.

Obviously this is super annoying and jank, so I'm open to better ideas here.
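
The double-underscore convention from the doc comment above can be sketched as a small key-mapping helper. This is an illustration only: `nested_env_key` is a hypothetical function, and the dotted key `data_plane.api_listen_address` is inferred from the flat key named in the comment rather than taken from the actual configuration schema.

```rust
/// Builds an env var name from a dotted config key, using `__` at each dot
/// boundary so nested keys are not collapsed into one flat, single-underscore key.
/// Hypothetical helper; the real `port_isolation_env` may build its names differently.
fn nested_env_key(dotted_key: &str) -> String {
    let joined = dotted_key
        .split('.')
        .map(|segment| segment.to_ascii_uppercase())
        .collect::<Vec<_>>()
        .join("__");
    format!("DD_{joined}")
}
```

Under these assumptions, `nested_env_key("data_plane.api_listen_address")` yields `DD_DATA_PLANE__API_LISTEN_ADDRESS`, while a top-level key such as `cmd_port` has no dot boundary and maps to plain `DD_CMD_PORT`.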

@thieman
Contributor Author

thieman commented May 28, 2026

@atanzu this is currently doing the following:

  1. arm64 jobs run on a shared runner pool where each job now goes into a Tart VM that is provisioned and torn down with each job. Should be no risk of contaminating the runner here.

  2. amd64 jobs still run directly on native Mac runners. These are "dedicated" runners only used by Saluki, so if we do happen to screw them up somehow the blast radius is limited to our project.

Additionally, we are going out of our way to install the version of the Agent under test into a non-standard install directory and override any port assignments or filepaths under test so that we do not interfere with an Agent already running on the machine. This is applicable to the amd64 runners since they are running their own Agents for internal observability. That's probably not perfect (I'd imagine we may be polluting some logs?) but it's thorough enough that the tests can pass. I think this is probably the best we can do here without any virtualization support. Something like chroot might help on some axes (filepaths?) but not others (ports?). This was already a bear to get working so I'm inclined to leave it as-is since the blast radius is already limited.

thieman added 6 commits May 28, 2026 16:23
The Datadog Agent at startup searches for a 'datadog' config file in
[<install>/etc /opt/datadog-agent/etc] and aborts ('Config File Not
Found') if neither has one. On the previous bare-metal CI runners the
fallback worked because /opt/datadog-agent was pre-installed with a
real datadog.yaml. On the new arm64 Tart runners (fresh VM, no system
install), neither path is populated and the bootstrap step never
even reaches IPC cert generation.

The pkg payload we extract only includes datadog.yaml.example, not
datadog.yaml proper. Drop a zero-byte datadog.yaml into the sandbox
etc/ before launching the bootstrap Agent. Tests communicate config
via env vars; the file's content doesn't matter, only its existence.
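
A minimal sketch of the placeholder-config step described above, assuming the sandbox layout is `<install_dir>/etc/datadog.yaml`. The function name `ensure_placeholder_config` is hypothetical, and leaving an existing file untouched is a choice made for this sketch, not something the commit states.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Ensures `<install_dir>/etc/datadog.yaml` exists, creating an empty file if needed.
fn ensure_placeholder_config(install_dir: &Path) -> io::Result<PathBuf> {
    let etc_dir = install_dir.join("etc");
    fs::create_dir_all(&etc_dir)?;
    let config_path = etc_dir.join("datadog.yaml");
    // create(true) with truncate(false) creates a zero-byte file when missing
    // and leaves the content of an existing file alone.
    OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(false)
        .open(&config_path)?;
    Ok(config_path)
}
```

Only the file's existence matters to the Agent's startup search, so nothing is written into it.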

The 'simplification' in fe5e9a4 made push_line spawn a tokio task
per log line to do the actual buffer append. On the slower arm64 Tart
VM those spawned tasks queue up faster than the runtime drains them;
by the time the assertion polls or write_logs reads the buffer, only
the first couple of lines have actually landed. amd64 (faster CPU /
more cores) drains enough tasks for most assertions to still pass,
but arm64 captures literally 2 lines per failing test; the rest are
still stuck in the spawn queue.

Switch LogBuffer's lock from tokio::sync::RwLock to std::sync::RwLock
so push_line can take the lock and append synchronously from the
pump's sync trait method. No spawn-per-line, no fire-and-forget, no
ordering race. All log_buffer.read()/.write() call sites updated to
sync.

None of the existing critical sections held the lock across an .await,
so the swap is mechanical. The one snag was adp_exits.rs's read guard
being held in scope past the next sleep().await; tightened that with
a block so the guard drops before the await.
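
The shape of the fix can be sketched as below. This is a simplified stand-in for the real `LogBuffer`, whose fields and method set are not shown in this PR; only `push_line` and the use of `std::sync::RwLock` come from the commit message, and `contains` and `len` are hypothetical helpers.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the log buffer described in the commit.
#[derive(Clone, Default)]
struct LogBuffer {
    lines: Arc<RwLock<Vec<String>>>,
}

impl LogBuffer {
    /// Appends synchronously: the line is visible to readers as soon as this returns,
    /// with no spawned task in between.
    fn push_line(&self, line: impl Into<String>) {
        self.lines
            .write()
            .expect("log buffer lock poisoned")
            .push(line.into());
    }

    /// Checks for a substring inside a tight block so the read guard is released
    /// before the caller reaches any later sleep or `.await`.
    fn contains(&self, needle: &str) -> bool {
        let found = {
            let guard = self.lines.read().expect("log buffer lock poisoned");
            guard.iter().any(|line| line.contains(needle))
        }; // guard dropped here
        found
    }

    /// Number of lines captured so far.
    fn len(&self) -> usize {
        self.lines.read().expect("log buffer lock poisoned").len()
    }
}
```

Because a `std::sync` guard must not be held across an `.await`, scoping it in a block as in `contains` mirrors the adjustment the commit describes for adp_exits.rs.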

When a macOS integration test fails outside panoramic's per-test log
capture (bootstrap-Agent issues, stranded processes from this run,
weird mount or filesystem state), nothing useful lands in the
artifact today: panoramic only knows about its per-test stdout/stderr,
and /tmp/saluki-agent-bootstrap.log lives outside the artifact path.

Add an after_script that runs whether the script passed or failed
(GitLab CI semantics) and writes a small bundle of host state into
integration-logs/host-diag/:

  sw_vers, uname, mount, df, ps, the bootstrap log, sandbox etc/ listing

The existing artifacts.paths declares integration-logs/ with
when: always, so anything dropped under that dir is uploaded
automatically; no artifacts-config change needed.

Cheap to collect, can be the difference between 'I have no idea why
this failed in CI' and 'oh, that process is still running'.

Comment refresh: the helper formerly known as test_port_isolation_env
moved to test_env::port_isolation_env earlier in this branch; the
pkill block still references its old name.

Standalone macOS integration tests run ADP without a Core Agent, so no
Agent process writes the IPC certificate ADP needs for its privileged API.
ADP fell back to the platform default path under /opt/datadog-agent and
waited up to 20 seconds for that file, causing 15-second log assertions to
time out after only the startup/standalone-mode lines.

Seed a per-test auth token and self-signed IPC certificate for standalone
Unix-runner tests, and force ADP to use those paths. Converged tests keep
using the Core Agent-generated credentials in the same per-test state dir.

Verified with:
- cargo test -p panoramic unix_runner::tests::standalone_adp_env_points_ipc_credentials_at_test_state_dir
- cargo check -p panoramic
- PANORAMIC_LOG_DIR=/tmp/panoramic-standalone-fix-all make test-integration-macos-run CASE=adp-memory-mode-disabled,adp-memory-mode-permissive-exceeds-limit,adp-memory-mode-permissive-within-limit,adp-memory-mode-strict-within-limit,dogstatsd-autoscale-udp,dogstatsd-default-bind,dogstatsd-non-local-overrides-bind-host,otlp-traces-enabled,privileged-api-endpoints,telemetry-endpoint,unprivileged-api-endpoints

The docker logging tests assert the Linux container default path,
/var/log/datadog/agent-data-plane.log. Host-process mac tests should not
write to /var or /opt on the runner, and the default mac path is not the
container path anyway.

When a mac host-process test does not explicitly configure
DD_DATA_PLANE_LOG_FILE, set it to the per-test state directory and rewrite
assertions that target the platform/container default ADP log file to the
same path. Tests with an explicit ADP log-file override keep using their
configured path.

Verified with:
- cargo test -p panoramic unix_runner::tests
- cargo check -p panoramic
- PANORAMIC_LOG_DIR=/tmp/panoramic-converged-logfix2 make test-integration-macos-run CASE=adp-logging-default-path,adp-logging-ignores-core-agent-log-file

The Linux default-path logging tests assert container paths such as
/var/log/datadog/agent-data-plane.log. Those paths are correct for the
docker runtime, but they are not appropriate for macOS host-process tests.

Make the existing Linux-path cases docker-only, add an explicit macOS case
for the behavior we still need to cover there, and remove the Unix-runner
path/env rewriting that made the effective test behavior invisible from the
YAML.

The macOS case uses explicit Core Agent and ADP log-file paths under /tmp,
so it still proves ADP ignores the Core Agent's log_file setting without
requiring writes to /var or /opt.

Verified with:
- cargo test -p panoramic unix_runner::tests
- cargo check -p panoramic
- PANORAMIC_LOG_DIR=/tmp/panoramic-mac-logging-split make test-integration-macos-run CASE=adp-logging-mac-ignores-core-agent-log-file,adp-logging-respects-data-plane-log-file
Comment on lines +98 to +99
let read_result = if ctx.is_host_process {
    read_file_local(&self.path).await
Member

Fair thing to raise... I see we're sandboxing our logs and whatnot in their own directory, but are we cleaning that up before/after test runs?

(Maybe I just haven't gotten far enough through the PR yet.)

Comment thread bin/correctness/panoramic/src/config.rs Outdated
/// On the `docker` runtime this field is informational; the converged image always runs
/// both processes via s6.
#[serde(default)]
pub requires_core_agent: bool,
Member

I'd say we should always run the Core Agent. It's that sort of thing that gives me some extra confidence that our tests are reflecting real-world usage.

Comment thread Makefile Outdated
Comment on lines +593 to +601
# Version of the Datadog Agent installed by `provision-macos-test-env`. Pinned for
# reproducibility; bump when the integration tests need newer Agent behavior.
MACOS_TEST_AGENT_VERSION ?= 7.78.0
MACOS_TEST_AGENT_DMG_DIR ?= /tmp/saluki-dda-dmg-cache
MACOS_TEST_AGENT_DMG_URL ?= https://s3.amazonaws.com/dd-agent/datadog-agent-$(MACOS_TEST_AGENT_VERSION)-1.$(shell uname -m).dmg
# Sandbox directory the Agent is installed into. Deliberately *not* /opt/datadog-agent: keeping
# our install isolated from any pre-existing system install (which a CI runner or developer host
# may already have at a different, conflicting version) avoids surprises in both directions.
MACOS_TEST_AGENT_INSTALL_DIR ?= /tmp/saluki-dda/datadog-agent
Member

These should live at the top with other variables so they're discoverable.

Contributor Author

[GPT-5] Addressed in afa587d. Moved the MACOS_TEST_AGENT_* settings up with the other top-level Makefile variables so they are easier to find.

thieman added 3 commits May 29, 2026 14:12
Move the macOS integration-test Agent settings up with the other Makefile
variables so they are discoverable.

Also wrap each Unix-runner per-test state directory in an RAII guard so the
runner removes only the directories it creates itself. This deliberately does
not clean paths that appear in test YAML, such as file_contains log paths;
those are test-owned paths, not framework-owned scratch state.

Verified with:
- cargo test -p panoramic unix_runner::tests
- cargo check -p panoramic
- make build-panoramic
- PANORAMIC_LOG_DIR=/tmp/panoramic-state-cleanup-check2 make test-integration-macos-run CASE=adp-memory-mode-disabled
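
The RAII guard described in the commit above could look roughly like this. It is a sketch under stated assumptions: `StateDirGuard`, `create`, and `path` are hypothetical names, and the real runner may create and name its state directories differently.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Owns a scratch directory created by the runner and removes it on drop.
struct StateDirGuard {
    path: PathBuf,
}

impl StateDirGuard {
    /// Creates `<parent>/<test_name>` and takes ownership of that directory.
    fn create(parent: &Path, test_name: &str) -> io::Result<Self> {
        let path = parent.join(test_name);
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    /// Path of the guarded directory.
    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for StateDirGuard {
    fn drop(&mut self) {
        // Best effort: remove only the directory this guard created.
        // Paths named in test YAML are never registered here, so they are left alone.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Tying cleanup to `Drop` means the directory is removed on early returns and panics that unwind, not only on the happy path.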

The docker integration fixture always runs the Core Agent and ADP together
via s6. Match that shape in the mac host-process runner: start the Core
Agent for every test, then let the test config decide whether ADP behaves
as standalone or uses the Agent/config stream.

Mirror the docker cont-init collision avoidance for listeners ADP owns:
disable Core Agent DogStatsD when ADP DogStatsD is enabled, and redirect
Core Agent OTLP receivers when ADP OTLP is enabled.

With Core Agent always present, the requires_core_agent YAML field is no
longer meaningful, so remove it and the standalone self-signed IPC cert
fallback.

Verified with:
- cargo test -p panoramic unix_runner::tests
- cargo check -p panoramic
- make build-panoramic
- PANORAMIC_LOG_DIR=/tmp/panoramic-always-core-smoke2 make test-integration-macos-run CASE=basic-startup,adp-memory-mode-disabled,dogstatsd-default-bind,otlp-traces-enabled,adp-logging-mac-ignores-core-agent-log-file

After making the mac runner always start the Core Agent, standalone ADP
cases could race the Core Agent for the shifted DogStatsD UDP port. The
Docker fixture avoids this when ADP owns DogStatsD; mirror that behavior in
the mac runner even when tests rely on ADP's default DogStatsD setting and
do not explicitly set DD_DATA_PLANE_DOGSTATSD_ENABLED=true.

Disable Core Agent DogStatsD whenever ADP is enabled unless the test
explicitly disables ADP DogStatsD.

Verified with:
- cargo test -p panoramic unix_runner::tests
- cargo check -p panoramic
- PANORAMIC_LOG_DIR=/tmp/panoramic-dsd-own-fix make test-integration-macos-run CASE=adp-memory-mode-disabled,adp-memory-mode-permissive-exceeds-limit,adp-memory-mode-permissive-within-limit,adp-memory-mode-strict-within-limit,basic-startup,otlp-traces-enabled
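
The ownership rule from the commit above ("disable Core Agent DogStatsD whenever ADP is enabled unless the test explicitly disables ADP DogStatsD") can be written as a small predicate. This is a sketch: `core_agent_dogstatsd_enabled` is a hypothetical name, how the runner decides that ADP is enabled is left to the caller, and treating only the literal value `false` as an explicit disable is an assumption.

```rust
use std::collections::HashMap;

/// Decides whether the Core Agent should keep its own DogStatsD listener.
/// `adp_enabled` is supplied by the caller; `test_env` is the test's env block.
fn core_agent_dogstatsd_enabled(adp_enabled: bool, test_env: &HashMap<String, String>) -> bool {
    // Assumption: only an explicit "false" counts as the test opting ADP out of DogStatsD.
    let adp_dsd_explicitly_disabled = test_env
        .get("DD_DATA_PLANE_DOGSTATSD_ENABLED")
        .map(|value| value.trim().eq_ignore_ascii_case("false"))
        .unwrap_or(false);
    // ADP owns DogStatsD by default, so the Core Agent listener stays off
    // unless ADP is not running or the test opted ADP out.
    !adp_enabled || adp_dsd_explicitly_disabled
}
```

The key point is the default: a test that never mentions `DD_DATA_PLANE_DOGSTATSD_ENABLED` still gets the Core Agent listener disabled, which is what removes the port race.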
thieman added 2 commits June 1, 2026 10:28
The new adp-config-check-no-warn case from main still used the old
container.env shape, so the mac runner did not pass its ADP settings to the
spawned processes. ADP received the dynamic config stream, saw itself as
disabled, and exited.

Move the env block to the top-level integration config shape and mark the
case as portable across docker and mac, matching adp-config-check-warn.
Because this case deliberately disables DogStatsD to prove DogStatsD-affined
config warnings are suppressed, enable OTLP so ADP still has an active data
pipeline and can remain stable for the assertion window.

Verified with:
- PANORAMIC_LOG_DIR=/tmp/panoramic-adp-check-no-warn-3 make test-integration-macos-run CASE=adp-config-check-no-warn
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit b98f78a into main Jun 1, 2026
81 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the thieman/macos-integration-tests-baremetal branch June 1, 2026 15:27
dd-octo-sts Bot pushed a commit that referenced this pull request Jun 1, 2026
## Human Summary

Adds support for running integration tests (*not* correctness tests) under Mac native. All but 3 of the existing integration tests are updated to support Mac. In CI, this runs under Tart VMs for arm64 and on native runners for amd64. To support this, we have to do a bunch of gymnastics so that our Agent under test doesn't conflict with the system Agent on the native runners.

Runtime is selected via a new `--runtime` flag to panoramic which defaults to Mac native tests when running on a Mac.

## Summary

Adds the ability to run ADP integration tests as **native macOS processes** (no Docker, no virtualization), and wires the suite into CI on the existing `macos:sonoma-arm64` / `macos:sonoma-amd64` runners.

**24 of 27 integration tests pass on `native_macos`** in ~3 minutes. Both standalone (just-ADP) and converged (Agent + ADP) tests are supported. Both CI jobs are passing.

This is the bare-metal subset of #1721; a Tart-based local-dev wrapper will land separately in a follow-up.

## Why

ADP has no test coverage on macOS today. The 27-test integration suite all runs inside a Linux Docker container. Adding native macOS support is the first concrete step toward macOS as a supported platform (tracked in #1718): it builds ADP for macOS, exercises real macOS startup behavior, and gives us a place to land further coverage as the platform gaps from #1718 are closed.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│ Layer 2: macOS environment (CI bare-metal runner)           │
│   `make test-integration-macos-ci` ← CI entry point         │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: new Rust code (panoramic + airlock)                │
│   airlock::native::NativeProcess                            │
│   panoramic::native_runner::NativeIntegrationRunner         │
│   runtimes: [docker, native_macos] in test configs          │
│   requires_core_agent: true  for converged tests            │
└─────────────────────────────────────────────────────────────┘
```

The native runner is a parallel implementation of the Docker `IntegrationRunner`. The genuinely-duplicated assertion-loop logic is factored into a shared `run_assertion_steps` helper; the spawn / log capture / cleanup architectures stay separate because they're legitimately different (bollard vs `tokio::process`).

The Makefile intentionally does **not** expose a "run on your own Mac" convenience target. A Datadog Agent running on a dev host will collide with the test ADP on port 8125 (DSD UDP); the right answer for local-dev isolation is the Tart wrapper in the follow-up PR. Until then, CI is the only supported path.

## Key changes

### Rust framework

- **`airlock::native::NativeProcess`** (new, ~240 lines) — `tokio::process` wrapper with kill-on-drop, always places spawned processes in their own process group so cleanup can signal the entire group (parent plus any forked helpers like `trace-agent` / `process-agent`). Background watcher observes the child's exit, fires a `CancellationToken`, and records the exit code in a shared `OnceLock` cell.
- **`panoramic::native_runner::NativeIntegrationRunner`** (new, ~365 lines) — parallel to the Docker integration runner. Creates a per-test temp config directory, optionally spawns the Datadog Core Agent, waits for the IPC handshake to be ready, then spawns ADP. Both processes feed into the same `LogBuffer` so existing log-based assertions work unchanged.
- **`run_assertion_steps`** (new helper in `assertions/mod.rs`) — extracted from the Docker `IntegrationRunner`'s implementation (the more complete one: fail-fast on first failure, step-index debug logging, consistent error reporting). Both runners delegate to it.
- **`AssertionContext`** gains an `is_native` flag and an `Option<ExitCodeCell>` so:
  - `file_contains` reads files via `tokio::fs` on native (vs. `docker exec cat` on Docker)
  - `adp_exits_with` (new assertion) abstracts the runtime difference for "did ADP exit with code N":
    - On `docker`: greps the captured log buffer for `agent-data-plane exited with code N` (the s6 supervisor's exit message)
    - On `native_macos`: reads the exit code from the per-process `OnceLock`
- **`log_contains` / `log_not_contains` fixed** to do a final post-exit buffer read instead of bailing immediately on container exit. Required for tests where ADP exits within seconds (e.g., `adp-disabled-exit`).
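
The runtime split behind `adp_exits_with` can be sketched as follows. The `Runtime` enum, the function signature, and the use of `OnceLock<i32>` as the exit-code cell are assumptions made for illustration; only the log marker text and the log-versus-cell distinction come from the description above.

```rust
use std::sync::OnceLock;

/// Hypothetical runtime selector for this sketch.
enum Runtime {
    Docker,
    NativeMacos,
}

/// Returns true if ADP is known to have exited with `expected`.
fn adp_exited_with(
    runtime: &Runtime,
    log_lines: &[String],
    exit_code: &OnceLock<i32>,
    expected: i32,
) -> bool {
    match runtime {
        Runtime::Docker => {
            // Docker: look for the s6 supervisor's exit message in the captured logs.
            let marker = format!("agent-data-plane exited with code {expected}");
            log_lines.iter().any(|line| {
                line.find(&marker).map_or(false, |idx| {
                    // Reject longer codes such as "code 10" when looking for "code 1".
                    !line[idx + marker.len()..].starts_with(|c: char| c.is_ascii_digit())
                })
            })
        }
        // Native: the watcher records the exit code once the child exits.
        Runtime::NativeMacos => exit_code.get() == Some(&expected),
    }
}
```

An unset cell on the native path means "not exited yet", so the assertion keeps polling rather than failing early.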

### Config schema

- New `runtimes: Vec<String>` field on `IntegrationConfig`. Defaults to `["docker"]`. Configs that declare multiple runtimes expand into one `Test` instance per runtime at discovery time, named `{base_name}/{runtime}`.
- New `requires_core_agent: bool` field. When true on the `native_macos` runtime, the native runner spawns the Core Agent alongside ADP. Informational on `docker` (s6 always runs both).
- New `panoramic run --runtime <name>` filter so `make` targets can scope to one runtime cleanly.
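
The per-runtime expansion can be illustrated with a short sketch. `expand_test_names` is a hypothetical helper, and it applies the `{base_name}/{runtime}` suffix uniformly, including to single-runtime configs; the actual discovery code may name those differently.

```rust
/// Expands one config into per-runtime test names of the form `{base_name}/{runtime}`.
fn expand_test_names(base_name: &str, runtimes: &[String]) -> Vec<String> {
    // An empty list falls back to the documented default of ["docker"].
    let default = vec!["docker".to_string()];
    let runtimes = if runtimes.is_empty() { &default[..] } else { runtimes };
    runtimes
        .iter()
        .map(|runtime| format!("{base_name}/{runtime}"))
        .collect()
}
```

For example, a config named `basic-startup` that declares `[docker, native_macos]` expands to `basic-startup/docker` and `basic-startup/native_macos`, and a runtime filter can then select on the suffix.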

### Provisioning + CI

- **`make provision-macos-test-env`** — idempotently installs the Datadog Agent at `/opt/datadog-agent` (downloads the official DMG, runs `installer -pkg`, tolerates the postinstall non-zero exit since the binaries land before postinstall runs) and bootstraps the IPC cert + auth_token by running the Agent briefly. Pinned to Agent `7.78.0` via `MACOS_TEST_AGENT_VERSION` for reproducibility.
- **`make test-integration-macos-ci`** — CI entry point: builds binaries, runs `provision-macos-test-env`, runs the native suite.
- **`make build-adp-native`** — `cargo build --release --bin agent-data-plane` with the same `APP_*` env vars that `build-adp-base` uses, so the binary logs as `| DATAPLANE |` instead of `| UNKNOWN |`.
- **`.gitlab/e2e.yml`** — new `test-integration-macos-arm64` and `test-integration-macos-amd64` jobs (e2e stage) extending the existing bare-metal macOS runner tags. `before_script` does a defensive `sudo pkill` to clean up any orphan Agent processes from prior shared-runner jobs.

### Tests enabled

| Category | Tests | Status |
|---|---|---|
| Standalone (just ADP) | 14 of 17 | All passing |
| Converged (Agent + ADP) | 10 of 10 | All passing |
| Container-supervisor behavior | 0 of 1 | Deferred |
| Bind-host (PANORAMIC_DYNAMIC shell hooks) | 0 of 2 | Deferred |

**3 tests intentionally not enabled** (filtered out via the `runtimes` field):

- `adp-rar-disabled`: assertion is "container stable for 10s," which really tests "s6 keeps restarting ADP." No equivalent supervisor on native. Re-enable when the native runner grows restart-on-failure semantics or the test is rewritten to assert on retry behavior directly.
- `dogstatsd-bind-host`, `dogstatsd-bind-custom-hostname`: use `PANORAMIC_DYNAMIC_*` env shell hooks that run `hostname -i` and `echo ... >> /etc/hosts` — Linux-container-isms not portable to a macOS host (`hostname -i` isn't supported on macOS; writing `/etc/hosts` needs root and has real host side effects).

## Test plan

Verified locally on Apple Silicon (macOS 26.4.1), inside an ephemeral Tart macOS Sequoia VM with a freshly provisioned Datadog Agent (this matches the shape of a clean CI runner):

```
$ make test-integration-macos-ci
...
PASSED: 24 passed, 0 failed, 24 total (192.50s)
```

CI passes the same tests on both `macos:sonoma-arm64` and `macos:sonoma-amd64`.

**Regression check (Linux Docker path):**

- `panoramic list -d test/integration/cases` shows the original tests plus the 24 new `*/native_macos` variants.
- The `docker` variant of every previously-enabled test still discovers and dispatches to the existing `IntegrationRunner`.
- `cargo test -p airlock -p panoramic` passes.
- `make check-clippy`, `make check-fmt`, `make check-docs` clean (pre-commit enforced on every commit).

## Notable findings while building this

1. **ADP's bootstrap requires a `datadog.yaml`** to exist (defaulting to `/opt/datadog-agent/etc/datadog.yaml` on macOS). The native runner creates a per-test empty config so it works on hosts with or without an Agent install.
2. **The Agent's authoritative config (via the config stream) overrides ADP's env vars** because `ConfigurationLoader` follows "later sources win." If ADP set `DD_AUTH_TOKEN_FILE_PATH=<per-test>` but the Agent didn't, the Agent would advertise the platform default and ADP would honor the stream value for post-bootstrap IPC clients, failing TLS with `UnknownIssuer`. Fix: set `DD_AUTH_TOKEN_FILE_PATH` on the Agent too.
3. **Core Agent forks `trace-agent` and `process-agent`** that orphan onto launchd when the parent is SIGKILL'd. Native runner spawns the Agent with `process_group(0)` and signals the whole group on cleanup.
4. **The DMG installer's postinstall hook fails** with exit code 112, but the binaries land in `/opt/datadog-agent` before the postinstall runs. The provision target tolerates this non-zero exit.
5. **Bootstrap-via-sudo leaves files owned by root** mode 600. The provision target chowns to the current user only if the files aren't already readable, so sudo isn't re-prompted on already-set-up systems.
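
Finding 2 hinges on the "later sources win" rule, which can be sketched with plain maps. This is not the `ConfigurationLoader` API: `resolve` is a hypothetical function, and the key and paths used in the usage note are illustrative.

```rust
use std::collections::HashMap;

/// Merges config sources in order; a key in a later source replaces earlier values.
fn resolve(sources: &[HashMap<String, String>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for source in sources {
        for (key, value) in source {
            // Later sources overwrite whatever an earlier source set.
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}
```

If the env layer sets an auth token path to a per-test location and the config stream layer, applied afterwards, carries the platform default, the merged result holds the platform default. That is why the fix sets `DD_AUTH_TOKEN_FILE_PATH` on the Agent as well, so both layers agree.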

Related: #1718 (macOS platform gap tracking), #1721 (companion PR with optional Tart VM wrapper for local-dev isolation).

Co-authored-by: travis.thieman <travis.thieman@datadoghq.com> b98f78a
Labels

area/ci CI/CD, automated testing, etc. area/test All things testing: unit/integration, correctness, SMP regression, etc. mergequeue-status: done

3 participants