feat: add experiment information as OTLP telemetry attribute by Copilot · Pull Request #29593 · github/gh-aw

Copilot · 2026-05-01T18:41:22Z

Summary

Adds experiment assignment information as attributes on OTLP (OpenTelemetry) spans, enabling A/B experiment variant correlation in observability backends.

Changes

`actions/setup/js/pick_experiment.cjs`

After selecting variants, writes current-run assignments to /tmp/gh-aw/experiments/assignments.json alongside the existing state file (which tracks counts)
Write is guarded: only occurs when at least one experiment is successfully assigned (skips empty-spec and all-invalid-variant cases)
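The guarded write described above can be pictured roughly as follows. This is a minimal sketch, not the PR's code: `writeAssignments` and the temporary demo directory are hypothetical names chosen for illustration, and the real script may structure the write differently.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

/**
 * Hypothetical helper (not from the PR): persist current-run assignments
 * next to the state file. Returns true when a file was written.
 */
function writeAssignments(stateDir, assignments) {
  // Guard: skip the write when no experiment was successfully assigned.
  if (Object.keys(assignments).length === 0) return false;
  fs.mkdirSync(stateDir, { recursive: true });
  const file = path.join(stateDir, "assignments.json");
  // Overwrite on every run so the file reflects only the current variants.
  fs.writeFileSync(file, JSON.stringify(assignments) + "\n");
  return true;
}

// Usage against a throwaway directory instead of /tmp/gh-aw/experiments.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "gh-aw-demo-"));
const skipped = writeAssignments(demoDir, {}); // nothing assigned, no file
const written = writeAssignments(demoDir, { caveman: "yes" });
```

Keeping the assignments in a separate file from the counts means a reader only needs the current run's variants and never has to interpret the state file.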

`actions/setup/js/send_otlp_span.cjs`

Adds EXPERIMENT_ASSIGNMENTS_PATH constant (/tmp/gh-aw/experiments/assignments.json)
Adds readExperimentAssignments() helper that reads the assignments file (returns null if absent/invalid — non-fatal)
Adds buildExperimentAttributes() helper that converts assignments to OTLP span attributes:
- One gh-aw.experiment.<name> attribute per experiment (e.g. gh-aw.experiment.caveman=yes)
- A combined gh-aw.experiments JSON string attribute (only when at least one valid variant exists)
Both sendJobSetupSpan and sendJobConclusionSpan now include experiment attributes
The constant and both helpers are exported from the module
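The non-fatal read behaviour listed above (null when the file is absent or invalid) might look something like the sketch below. `readAssignmentsSketch` is a hypothetical stand-in for `readExperimentAssignments()`; in particular, rejecting arrays is an assumption, since the PR only says non-object values yield null.

```javascript
const fs = require("node:fs");

/**
 * Hypothetical stand-in for readExperimentAssignments(): returns the parsed
 * name -> variant object, or null when the file is missing or unusable.
 */
function readAssignmentsSketch(filePath) {
  try {
    const parsed = JSON.parse(fs.readFileSync(filePath, "utf8"));
    // Assumption: only a plain JSON object is accepted (arrays rejected too).
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
      return null;
    }
    return parsed;
  } catch {
    // Missing file or invalid JSON: treat as "no experiments", never throw.
    return null;
  }
}
```

Returning null instead of throwing keeps span emission working for runs that have no experiments configured.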

Tests

pick_experiment.test.cjs: 3 new tests covering assignments file creation, successive-run overwrite, and no-write cases
send_otlp_span.test.cjs: 10 new tests covering readExperimentAssignments, buildExperimentAttributes, and experiment attribute inclusion in both setup and conclusion spans

Attribute design

Attribute	Value	Example
`gh-aw.experiment.<name>`	Selected variant string	`gh-aw.experiment.caveman=yes`
`gh-aw.experiments`	JSON of all assignments	`{"caveman":"yes","style":"detailed"}`

The per-experiment attributes allow filtering on individual variants in dashboards. The aggregated JSON attribute enables substring search in backends that don't support per-attribute filtering.
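To illustrate the two query styles, here is a small sketch over in-memory data. The span names and the flat `attributes` object are hypothetical simplifications; a real backend would run the equivalent filter in its own query language.

```javascript
// Hypothetical, simplified span records carrying both attribute forms.
const spans = [
  {
    name: "setup-run-1",
    attributes: {
      "gh-aw.experiment.caveman": "yes",
      "gh-aw.experiments": '{"caveman":"yes","style":"detailed"}',
    },
  },
  {
    name: "setup-run-2",
    attributes: {
      "gh-aw.experiment.caveman": "no",
      "gh-aw.experiments": '{"caveman":"no","style":"detailed"}',
    },
  },
];

// Backend with per-attribute filtering: exact match on one experiment.
const byAttribute = spans.filter(
  (s) => s.attributes["gh-aw.experiment.caveman"] === "yes"
);

// Backend limited to text search: substring match on the aggregated JSON.
const bySubstring = spans.filter((s) =>
  (s.attributes["gh-aw.experiments"] || "").includes('"caveman":"yes"')
);
```

Both filters select the same span here. The substring form depends on the JSON being serialized without extra whitespace.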

- pick_experiment.cjs now writes assignments.json alongside state.json so downstream spans can read the current variant without recomputing
- send_otlp_span.cjs gains readExperimentAssignments() and buildExperimentAttributes() helpers
- sendJobSetupSpan and sendJobConclusionSpan both include gh-aw.experiment.<name> (one per experiment) and gh-aw.experiments (JSON of all assignments) span attributes
- Tests added for pick_experiment.cjs (assignments file writing) and send_otlp_span.cjs (experiment attributes in setup and conclusion spans)

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/f0ab1e69-b2cd-46eb-977e-8c582c5c65a1
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

…gnments.json write

- buildExperimentAttributes only appends gh-aw.experiments when at least one valid variant attribute was produced
- pick_experiment.cjs only writes assignments.json when at least one experiment was successfully assigned
- Added test for all-empty-variant-string case in buildExperimentAttributes
- Added test for <2-variant experiment skipping in pick_experiment main

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/f0ab1e69-b2cd-46eb-977e-8c582c5c65a1
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

Adds A/B experiment assignment metadata to OTLP spans so observability backends can correlate traces with experiment variants.

Changes:

Persist current-run experiment assignments to an assignments.json file in pick_experiment.cjs.
Read assignments in send_otlp_span.cjs and emit them as OTLP span attributes (gh-aw.experiment.<name> and gh-aw.experiments).
Add/extend test coverage for both the assignments file behavior and OTLP attribute emission.

Show a summary per file

File	Description
actions/setup/js/pick_experiment.cjs	Writes current-run experiment assignments to `assignments.json` when at least one assignment exists.
actions/setup/js/pick_experiment.test.cjs	Adds tests validating `assignments.json` creation/overwrite and no-write scenarios.
actions/setup/js/send_otlp_span.cjs	Reads experiment assignments and injects them into setup/conclusion OTLP spans; exports new helpers/constants.
actions/setup/js/send_otlp_span.test.cjs	Adds tests for reading/building experiment attributes and ensuring spans include/omit them appropriately.

Copilot's findings


Files reviewed: 4/4 changed files
Comments generated: 2

+/**
+ * Path to the experiment assignments file written by pick_experiment.cjs.
+ * Contains a JSON object mapping experiment name → selected variant for the
+ * current workflow run.  Example: `{"caveman":"yes","style":"detailed"}`.
+ * @type {string}
+ */
+const EXPERIMENT_ASSIGNMENTS_PATH = "/tmp/gh-aw/experiments/assignments.json";


+function buildExperimentAttributes(assignments) {
+  if (!assignments || typeof assignments !== "object") return [];
+  const names = Object.keys(assignments).sort();
+  if (names.length === 0) return [];
+  const attrs = [];
+  for (const name of names) {
+    const variant = assignments[name];
+    if (typeof variant === "string" && variant) {
+      attrs.push(buildAttr(`gh-aw.experiment.${name}`, variant));
+    }
+  }
+  if (attrs.length > 0) {
+    attrs.push(buildAttr("gh-aw.experiments", JSON.stringify(assignments)));
+  }
+  return attrs;
+}


github-actions · 2026-05-01T18:56:08Z

🧪 Test Quality Sentinel Report

Test Quality Score: 83/100

✅ Excellent

Metric	Value
New/modified tests analyzed	17
✅ Design tests (behavioral contracts)	17 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	13 (76%)
Duplicate test clusters	0
Test inflation detected	⚠️ Yes — `pick_experiment.test.cjs` (58 test lines / 9 production lines = 6.4:1)
🚨 Coding-guideline violations	None

Test Classification Details

View all 17 test classifications

Test	File	Classification	Issues Detected
`writes assignments.json alongside state.json after picking variants`	`pick_experiment.test.cjs`	✅ Design	Verifies file existence and content on disk
`overwrites assignments.json on successive runs reflecting the current variant`	`pick_experiment.test.cjs`	✅ Design	Tests state transition across two invocations
`does not write assignments.json when spec is empty`	`pick_experiment.test.cjs`	✅ Design	Edge case: empty spec
`does not write assignments.json when all experiments have fewer than 2 variants`	`pick_experiment.test.cjs`	✅ Design	Edge case: boundary condition on variant count
`includes gh-aw.experiment.<name> and gh-aw.experiments attributes when assignments file exists` (setup)	`send_otlp_span.test.cjs`	✅ Design	Verifies all three OTLP attribute values in the emitted span
`omits experiment attributes when assignments file is absent` (setup)	`send_otlp_span.test.cjs`	✅ Design	Edge case: absent file → no attributes
`returns null when the assignments file does not exist`	`send_otlp_span.test.cjs`	✅ Design	Edge case: ENOENT → null return
`returns null when the assignments file contains invalid JSON`	`send_otlp_span.test.cjs`	✅ Design	Edge case: corrupt file → null
`returns null when the assignments file contains a non-object value`	`send_otlp_span.test.cjs`	✅ Design	Edge case: wrong type → null
`returns the parsed assignments object when the file is valid`	`send_otlp_span.test.cjs`	✅ Design	Happy-path return value
`returns an empty array for null input`	`send_otlp_span.test.cjs`	✅ Design	Edge case: null input
`returns an empty array for an empty assignments object`	`send_otlp_span.test.cjs`	✅ Design	Edge case: empty object
`builds one attribute per experiment plus the aggregated gh-aw.experiments attribute`	`send_otlp_span.test.cjs`	✅ Design	Verifies attribute structure and aggregated JSON
`skips assignments with non-string or empty-string variants and still adds gh-aw.experiments for valid ones`	`send_otlp_span.test.cjs`	✅ Design	Edge case: mixed valid/invalid variants
`returns empty array and omits gh-aw.experiments when all variants are empty strings`	`send_otlp_span.test.cjs`	✅ Design	Edge case: all variants invalid
`includes gh-aw.experiment.<name> and gh-aw.experiments attributes in conclusion span`	`send_otlp_span.test.cjs`	✅ Design	Verifies attributes in the conclusion span
`omits experiment attributes in conclusion span when assignments file is absent`	`send_otlp_span.test.cjs`	✅ Design	Edge case: absent file → no attributes in conclusion span

Score Breakdown

Component	Score
Behavioral coverage (17/17 design tests)	40/40
Error/edge case coverage (13/17 tests = 76%)	23/30
Low duplication (0 clusters)	20/20
Proportional growth (inflation penalty: pick_experiment 6.4:1 > 2:1)	0/10
Total	83/100

The only deduction is the test inflation flag on pick_experiment.test.cjs (58 lines added vs. 9 production lines). In practice, this is benign — the 4 new test scenarios are substantive and cover meaningful behavioral contracts (file-system side-effects, state transitions, empty/boundary inputs). The inflation ratio is mechanically high because the new production code is concise.

Language Support

Tests analyzed:

🟨 JavaScript (*.test.cjs): 17 tests (vitest)

No Go test files were added or modified in this PR.

Notes on Mocking Strategy

The tests make heavy use of vi.spyOn(fs, "readFileSync") and vi.stubGlobal("fetch", ...). Both are external I/O (filesystem and network), which is acceptable under the project's guidelines. No internal business-logic functions are mocked. All assertions target observable outputs (OTLP span attribute values, file contents, return values of helper functions).

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 17 tests verify behavioral contracts — including a strong set of error and edge-case scenarios.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §25228044513

🧪 Test quality analysis by Test Quality Sentinel · ● 515.3K · ◷

github-actions

✅ Test Quality Sentinel: 83/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 17 new vitest tests verify behavioral contracts with strong edge-case coverage.

pelikhan · 2026-05-01T18:56:20Z

@copilot review all comments, fix https://github.com/github/gh-aw/actions/runs/25228047833/job/73976430444?pr=29593

…tered experiments JSON

- readExperimentAssignments() now derives the file path from GH_AW_EXPERIMENT_STATE_DIR env var (same as pick_experiment.cjs), falling back to EXPERIMENT_ASSIGNMENTS_PATH when not set
- buildExperimentAttributes() builds gh-aw.experiments JSON from the filtered valid assignments only (excludes empty/non-string variants), making it consistent with the per-experiment attributes
- Updated/added tests for both fixes

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/5ccb46be-572d-4706-9cd6-c6f4ee50a256
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-05-01T19:03:17Z

> @copilot review all comments, fix https://github.com/github/gh-aw/actions/runs/25228047833/job/73976430444?pr=29593

Fixed both review comments in ad409cb:

EXPERIMENT_ASSIGNMENTS_PATH / configurable dir: readExperimentAssignments() now checks GH_AW_EXPERIMENT_STATE_DIR and reads from <dir>/assignments.json when set, falling back to the hardcoded default only when the env var is absent — keeping it in sync with pick_experiment.cjs.
gh-aw.experiments consistency: buildExperimentAttributes() now builds the aggregated JSON from the same filtered set used for per-experiment attributes (only valid, non-empty string variants), so both are always consistent.
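The two fixes can be sketched together as follows. This is an illustration under stated assumptions rather than the code in ad409cb: `resolveAssignmentsPath`, `stringAttr`, and `buildExperimentAttributesSketch` are hypothetical names, and `stringAttr` assumes the module's `buildAttr()` emits the OTLP/JSON string key-value shape.

```javascript
const path = require("node:path");

const DEFAULT_ASSIGNMENTS_PATH = "/tmp/gh-aw/experiments/assignments.json";

// Fix 1: honour GH_AW_EXPERIMENT_STATE_DIR, fall back to the default path.
function resolveAssignmentsPath(env = process.env) {
  const dir = env.GH_AW_EXPERIMENT_STATE_DIR;
  return dir ? path.join(dir, "assignments.json") : DEFAULT_ASSIGNMENTS_PATH;
}

// Hypothetical stand-in for buildAttr(): an OTLP/JSON string KeyValue.
const stringAttr = (key, value) => ({ key, value: { stringValue: value } });

// Fix 2: build the aggregated JSON from the same filtered set that feeds
// the per-experiment attributes.
function buildExperimentAttributesSketch(assignments) {
  if (!assignments || typeof assignments !== "object") return [];
  const valid = {};
  for (const name of Object.keys(assignments).sort()) {
    const variant = assignments[name];
    // Keep only non-empty string variants.
    if (typeof variant === "string" && variant) valid[name] = variant;
  }
  const attrs = Object.entries(valid).map(([name, variant]) =>
    stringAttr(`gh-aw.experiment.${name}`, variant)
  );
  if (attrs.length > 0) {
    attrs.push(stringAttr("gh-aw.experiments", JSON.stringify(valid)));
  }
  return attrs;
}

// "broken" has an empty variant, so it appears in neither attribute form.
const attrs = buildExperimentAttributesSketch({
  style: "detailed",
  caveman: "yes",
  broken: "",
});
```

Deriving both attribute forms from one filtered object avoids the earlier mismatch, where the aggregated JSON could list an experiment that had no per-experiment attribute.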

The CI failure in create_issue.test.cjs is a pre-existing flaky test unrelated to these changes.

Copilot AI and others added 2 commits May 1, 2026 18:38

Copilot AI assigned Copilot and pelikhan May 1, 2026

Copilot created this pull request from a session on behalf of pelikhan May 1, 2026 18:41

Copilot finished work on behalf of pelikhan May 1, 2026 18:42

Copilot AI requested a review from pelikhan May 1, 2026 18:42

pelikhan marked this pull request as ready for review May 1, 2026 18:50

Copilot AI review requested due to automatic review settings May 1, 2026 18:50

Merge branch 'main' into copilot/add-experiment-information-attribute

4d0cc20

Copilot started reviewing on behalf of pelikhan May 1, 2026 18:50

Copilot AI reviewed May 1, 2026

github-actions Bot mentioned this pull request May 1, 2026

[aw] No-Op Runs #29134

Open

github-actions Bot approved these changes May 1, 2026

Copilot started work on behalf of pelikhan May 1, 2026 18:56

Copilot finished work on behalf of pelikhan May 1, 2026 19:04

Merge branch 'main' into copilot/add-experiment-information-attribute

4a10f76

pelikhan merged commit aca5a8c into main May 1, 2026
6 of 7 checks passed

pelikhan deleted the copilot/add-experiment-information-attribute branch May 1, 2026 19:09

pelikhan merged 5 commits into main from copilot/add-experiment-information-attribute

3 participants
