docs(common): improve project orientation in AGENTS.md #1630

Merged
tobz merged 1 commit into main from matt.briggs/agents.md
May 12, 2026

@webern (Contributor) commented May 12, 2026

Summary

  • Trimmed the wording to make the guiding principles crisper and moved them to the top
  • Added a tiny index to orient the agent before it starts spelunking
  • I was consistently having to tell Claude that standalone mode isn't something we need to focus on heavily, so I de-emphasized it in AGENTS.md

Change Type

  • Non-functional (chore, refactoring, docs)

How did you test this PR?

An Analysis Question

Before:

❯ Which tests are dealing with checks?

⏺ Let me read AGENTS.md first, then search for tests related to "checks".

  Read 1 file (ctrl+o to expand)

⏺ Now let me search for tests dealing with "checks" — this likely means Datadog Agent checks.

⏺ Explore(Find tests related to "checks" in the saluki repo)
  ⎿  Done (24 tool uses · 83.1k tokens · 1m 49s)
  (ctrl+o to expand)

⏺ Here's what I found:

  Unit tests:

  - lib/saluki-components/src/sources/checks_ipc/mod.rs (lines 326-725) - 23 tests covering proto-to-event conversion for the checks IPC source: metrics, logs,
  events, service checks, hostname propagation, field mapping, and error/skip cases.
  - lib/saluki-io/src/deser/codec/dogstatsd/service_check.rs (lines 129-319) - 8 tests covering DogStatsD service check packet parsing: basic checks, timestamps,
  tags, messages, and origin detection.
  - lib/saluki-components/src/encoders/datadog/service_checks/mod.rs (lines 221-237) - 1 config smoke test for the DatadogServiceChecksConfiguration.

  Correctness test suite:

  - test/correctness/dsd-service-checks/ - End-to-end correctness tests comparing Datadog Agent vs. Saluki behavior for DogStatsD service checks (docker-based,
  analysis_mode: service_checks).

  The checks IPC source tests are the most substantial, covering conversion from the protobuf wire format into Saluki's internal types.

✻ Worked for 2m 9s

After:

⏺ Here's the breakdown of tests dealing with "checks" across the three test suites:

  Correctness tests (test/correctness/):
  - dsd-service-checks — tests DogStatsD service check handling

  Integration tests (test/integration/cases/):
  - adp-config-check-exit — ADP exits on a bad config check
  - adp-config-check-warn — ADP warns on a config check issue

  SMP regression tests (test/smp/regression/):
  - adp-checks-agent/cases/quality_gates_rss_basic — memory baseline with a running Python check (checks.d/simple.py + conf.d/simple.yaml)
  - adp-checks-agent/cases/quality_gates_rss_idle — memory baseline with checks-agent idle (no active check)

  So there are two distinct meanings of "checks" in use:
  1. DogStatsD service checks — the _sc|...|... metric type, tested in dsd-service-checks
  2. Agent checks (Python-based checks run by the checks agent) — tested in the adp-checks-agent SMP suite, which measures ADP memory behavior when running alongside
  the checks agent

  Which of these are you interested in?

✻ Crunched for 27s

Standalone Mode

Before:

❯ Explain to me the purpose of standalone mode and how important it is to consider when implementing features

⏺ Let me read AGENTS.md first, then research standalone mode in the codebase.

  Read 1 file (ctrl+o to expand)

⏺ Now let me search for standalone mode across the codebase.

  Searched for 10 patterns, read 3 files (ctrl+o to expand)

⏺ Now I have a clear picture. Here's the explanation:

  ---
  Standalone mode in ADP

  What it is: A runtime mode (data_plane.standalone_mode=true) where ADP runs without a co-located Datadog Core Agent. It defaults to false — the normal production
  deployment always has a Core Agent present.

  What it cuts off: When standalone mode is active, three Agent-dependent subsystems are replaced with no-ops at startup (env/mod.rs:50-83):

  1. Host provider - replaced by a FixedHostProvider that reads the hostname from config rather than querying the Agent. You must set hostname explicitly.
  2. Workload provider - disabled entirely (returns None). This means origin detection and entity tag enrichment are unavailable - no container tags, no pod tags,
  nothing.
  3. Autodiscovery provider - disabled entirely. No dynamic service discovery.

  Additionally:
  - Remote agent registration is skipped (run.rs:91) - ADP does not connect to the Core Agent's gRPC control plane and receives no config stream updates from it.
  - Host tags enrichment is skipped in the metrics pipeline (run.rs:480-483) - the host_tags transform is only added when not in standalone mode.
  - The dp_config.enabled() guard is bypassed (run.rs:149) - standalone mode allows ADP to run even if data_plane.enabled=false, which is used in benchmarking/test
  scenarios.

  Why it exists: Testing and benchmarking without needing a real Agent running alongside ADP.

  ---
  How important is it when implementing features?

  Very important for any feature that touches:

  - Origin enrichment / entity tags - If your feature attaches container/pod/entity metadata to metrics, it must handle the case where workload_provider is None.
  Don't assume a workload provider exists.
  - Host information - In standalone mode the hostname comes from a static config value, not a live query. Features that need the hostname will still get one, but it
  will be fixed/static.
  - The metrics enrichment pipeline - The host_tags transform is only present in non-standalone mode. If you add to the enrichment chain, ask whether it should be
  gated the same way.
  - Control plane / Agent communication - Any feature that calls back to the Core Agent over gRPC will not have that connection in standalone mode.

  Lower concern for features that are purely data-path (parsing, aggregation, forwarding of metric values themselves) - standalone mode doesn't affect those.

  The practical rule: if your feature depends on env_provider.workload() or env_provider.autodiscovery(), it must handle None gracefully, because standalone mode
  returns None for both.

✻ Cogitated for 59s

※ recap: We were discussing standalone mode in ADP — how it disables origin detection, workload enrichment, and Agent communication, and what to watch for when
  implementing features. No code was changed; no next action is queued. (disable recaps in /config)

After:

❯ Explain to me the purpose of standalone mode and how important it is to consider when implementing features

⏺ Let me read AGENTS.md and search for standalone mode documentation.

  Read 1 file (ctrl+o to expand)

⏺ The answer is right in AGENTS.md:

  From AGENTS.md:36-37:

  ▎ agent-data-plane standalone mode is a vestige of earlier development and testing cycles. It is not for production use and supporting it need not be a blocker
  ▎ during feature development.

  Standalone mode is low priority. It was an earlier dev/testing convenience, not a production path. When implementing features, you don't need to ensure they work in
   standalone mode - if supporting it would block or complicate a feature, skip it. Focus on the production configuration.

References

N/A

@webern requested a review from a team as a code owner May 12, 2026 14:11
@webern changed the title from "docs(common): better orient ai agents in AGENTS.md" to "docs(common): improve project orientation in AGENTS.md" May 12, 2026

pr-commenter Bot commented May 12, 2026

Binary Size Analysis (Agent Data Plane)

Target: 4aeb9c7 (baseline) vs 8105a68 (comparison) diff
Analysis Type: Stripped binaries (debug symbols excluded)
Baseline Size: 37.61 MiB
Comparison Size: 37.61 MiB
Size Change: +0 B (+0.00%)
Pass/Fail Threshold: +5%
Result: PASSED ✅

Changes by Module

| Module | File Size | Symbols |
|---|---|---|
| anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.16577489434316153245 | +130 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.3228274853015480322 | -129 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.16577489434316153245 | +115 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.3228274853015480322 | -114 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.16577489434316153245 | +109 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.3228274853015480322 | -108 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.16577489434316153245 | +97 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.3228274853015480322 | -96 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.16577489434316153245 | +95 B | 1 |
| anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.3228274853015480322 | -94 B | 1 |
| [Unmapped] | -5 B | 1 |

Detailed Symbol Changes

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW]    +130  [NEW]     +40    anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.16577489434316153245
  [NEW]    +115  [NEW]     +25    anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.16577489434316153245
  [NEW]    +109  [NEW]     +19    anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.16577489434316153245
  [NEW]     +97  [NEW]      +7    anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.16577489434316153245
  [NEW]     +95  [NEW]      +5    anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.16577489434316153245
  -0.1%      -5  [ = ]       0    [Unmapped]
  [DEL]     -94  [DEL]      -5    anon.4f8fd67d74ae1f1600187cfeb0121be9.2.llvm.3228274853015480322
  [DEL]     -96  [DEL]      -7    anon.4f8fd67d74ae1f1600187cfeb0121be9.0.llvm.3228274853015480322
  [DEL]    -108  [DEL]     -19    anon.4f8fd67d74ae1f1600187cfeb0121be9.3.llvm.3228274853015480322
  [DEL]    -114  [DEL]     -25    anon.4f8fd67d74ae1f1600187cfeb0121be9.4.llvm.3228274853015480322
  [DEL]    -129  [DEL]     -40    anon.4f8fd67d74ae1f1600187cfeb0121be9.1.llvm.3228274853015480322
  [ = ]       0  [ = ]       0    TOTAL
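
As a quick sanity check on the report above, the per-symbol deltas can be summed to confirm they net to the reported +0 B. The sketch below is illustrative only: `net_change` and `within_threshold` are hypothetical helper names, and treating the +5% threshold as inclusive is an assumption, not something the report states.

```python
# (file_size_delta, vm_size_delta) in bytes, copied from the
# "Detailed Symbol Changes" table in this report.
symbol_deltas = [
    (+130, +40), (+115, +25), (+109, +19), (+97, +7), (+95, +5),   # [NEW] symbols
    (-5, 0),                                                        # [Unmapped]
    (-94, -5), (-96, -7), (-108, -19), (-114, -25), (-129, -40),    # [DEL] symbols
]


def net_change(deltas):
    """Return (net file-size bytes, net VM-size bytes) across all rows."""
    return (sum(f for f, _ in deltas), sum(v for _, v in deltas))


def within_threshold(baseline_bytes, change_bytes, threshold_pct=5.0):
    """Hypothetical pass/fail check: growth as a percentage of the baseline
    must not exceed the threshold (inclusive comparison is an assumption)."""
    return 100.0 * change_bytes / baseline_bytes <= threshold_pct


net_file, net_vm = net_change(symbol_deltas)
baseline_bytes = 37.61 * 1024 * 1024  # "Baseline Size: 37.61 MiB"

print(net_file, net_vm)                            # 0 0
print(within_threshold(baseline_bytes, net_file))  # True
```

The added and deleted `anon.*` symbols appear to be the same data under different LLVM-generated suffixes, which would explain why they cancel almost exactly for a docs-only change.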


pr-commenter Bot commented May 12, 2026

Regression Detector (Agent Data Plane)

Regression Detector Results

Run ID: 08cf25fd-be50-427d-a344-4c9b85c1076b

Baseline: 4aeb9c7
Comparison: 8105a68
Diff

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| | otlp_ingest_logs_5mb_cpu | % cpu utilization | +1.06 | [-3.65, +5.77] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_logs_5mb_throughput | ingress throughput | +0.02 | [-0.10, +0.14] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_logs_5mb_memory | memory utilization | -0.36 | [-0.74, +0.02] | 1 | (metrics) (profiles) (logs) |

Fine details of change detection per experiment

| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| | otlp_ingest_metrics_5mb_memory | memory utilization | +2.07 | [+1.87, +2.27] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_500mb_3k_contexts_throughput | ingress throughput | +1.68 | [+1.54, +1.81] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_ottl_filtering_5mb_cpu | % cpu utilization | +1.42 | [-0.83, +3.67] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_100mb_3k_contexts_cpu | % cpu utilization | +1.35 | [-4.36, +7.06] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_logs_5mb_cpu | % cpu utilization | +1.06 | [-3.65, +5.77] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_10mb_3k_contexts_cpu | % cpu utilization | +0.52 | [-30.58, +31.61] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_ottl_transform_5mb_memory | memory utilization | +0.24 | [+0.07, +0.40] | 1 | (metrics) (profiles) (logs) |
| | quality_gates_rss_dsd_medium | memory utilization | +0.23 | [+0.06, +0.40] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_5mb_memory | memory utilization | +0.16 | [-0.01, +0.32] | 1 | (metrics) (profiles) (logs) |
| | quality_gates_rss_idle | memory utilization | +0.12 | [+0.08, +0.16] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_512kb_3k_contexts_memory | memory utilization | +0.12 | [-0.03, +0.27] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_1mb_3k_contexts_memory | memory utilization | +0.10 | [-0.04, +0.24] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_5mb_throughput | ingress throughput | +0.09 | [+0.01, +0.17] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_10mb_3k_contexts_memory | memory utilization | +0.05 | [-0.10, +0.20] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_ottl_filtering_5mb_memory | memory utilization | +0.04 | [-0.21, +0.29] | 1 | (metrics) (profiles) (logs) |
| | quality_gates_rss_dsd_low | memory utilization | +0.03 | [-0.13, +0.19] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_logs_5mb_throughput | ingress throughput | +0.02 | [-0.10, +0.14] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_1mb_3k_contexts_throughput | ingress throughput | +0.00 | [-0.06, +0.06] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_metrics_5mb_throughput | ingress throughput | -0.00 | [-0.18, +0.17] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_10mb_3k_contexts_throughput | ingress throughput | -0.00 | [-0.20, +0.19] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_512kb_3k_contexts_throughput | ingress throughput | -0.01 | [-0.06, +0.04] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_100mb_3k_contexts_throughput | ingress throughput | -0.02 | [-0.05, +0.01] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_ottl_filtering_5mb_throughput | ingress throughput | -0.03 | [-0.10, +0.04] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_500mb_3k_contexts_memory | memory utilization | -0.05 | [-0.20, +0.09] | 1 | (metrics) (profiles) (logs) |
| | quality_gates_rss_dsd_heavy | memory utilization | -0.19 | [-0.31, -0.07] | 1 | (metrics) (profiles) (logs) |
| | quality_gates_rss_dsd_ultraheavy | memory utilization | -0.25 | [-0.38, -0.11] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_ottl_transform_5mb_throughput | ingress throughput | -0.32 | [-0.40, -0.24] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_logs_5mb_memory | memory utilization | -0.36 | [-0.74, +0.02] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_100mb_3k_contexts_memory | memory utilization | -0.40 | [-0.55, -0.25] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_500mb_3k_contexts_cpu | % cpu utilization | -0.43 | [-1.77, +0.91] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_metrics_5mb_cpu | % cpu utilization | -0.65 | [-6.42, +5.12] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_5mb_cpu | % cpu utilization | -1.35 | [-3.42, +0.71] | 1 | (metrics) (profiles) (logs) |
| | otlp_ingest_traces_ottl_transform_5mb_cpu | % cpu utilization | -1.74 | [-3.60, +0.12] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_512kb_3k_contexts_cpu | % cpu utilization | -6.35 | [-63.36, +50.67] | 1 | (metrics) (profiles) (logs) |
| | dsd_uds_1mb_3k_contexts_cpu | % cpu utilization | -10.07 | [-61.22, +41.08] | 1 | (metrics) (profiles) (logs) |

Bounds Checks: ✅ Passed

| perf | experiment | bounds_check_name | replicates_passed | observed_value | links |
|---|---|---|---|---|---|
| | quality_gates_rss_dsd_heavy | memory_usage | 10/10 | 122.76MiB ≤ 140MiB | (metrics) (profiles) (logs) |
| | quality_gates_rss_dsd_low | memory_usage | 10/10 | 40.18MiB ≤ 50MiB | (metrics) (profiles) (logs) |
| | quality_gates_rss_dsd_medium | memory_usage | 10/10 | 60.15MiB ≤ 75MiB | (metrics) (profiles) (logs) |
| | quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | 177.44MiB ≤ 200MiB | (metrics) (profiles) (logs) |
| | quality_gates_rss_idle | memory_usage | 10/10 | 27.16MiB ≤ 40MiB | (metrics) (profiles) (logs) |

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".
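
The three criteria above can be read as a single boolean test. The following is a minimal Python sketch of that reading, not the detector's actual implementation: the `Experiment` class and `is_regression` function are hypothetical names. The first two sample rows use values from the tables in this report; the third is invented to show a case that would be flagged.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """One row of the change-detection table (hypothetical structure)."""
    name: str
    delta_mean_pct: float   # "Δ mean %"
    ci_low: float           # lower bound of "Δ mean % CI"
    ci_high: float          # upper bound of "Δ mean % CI"
    erratic: bool = False   # True if the experiment's settings contain erratic: true


def is_regression(exp: Experiment, effect_size_tolerance: float = 5.0) -> bool:
    """Apply the three criteria: effect size, CI excludes zero, not erratic."""
    big_enough = abs(exp.delta_mean_pct) >= effect_size_tolerance
    ci_excludes_zero = exp.ci_low > 0 or exp.ci_high < 0
    return big_enough and ci_excludes_zero and not exp.erratic


rows = [
    # From this report: CI excludes zero, but the effect is below 5%.
    Experiment("otlp_ingest_metrics_5mb_memory", +2.07, +1.87, +2.27),
    # From this report: effect exceeds 5%, but the CI contains zero.
    Experiment("dsd_uds_1mb_3k_contexts_cpu", -10.07, -61.22, +41.08),
    # Invented example that satisfies all three criteria.
    Experiment("hypothetical_experiment", +6.00, +5.00, +7.00),
]

for row in rows:
    print(row.name, is_regression(row))
```

Under this reading, neither of the two real rows qualifies, which is consistent with the "No significant changes detected" result reported above.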
