Add A/B experiment wiring for smoke-pi sub-agent decomposition by Copilot · Pull Request #34027 · github/gh-aw

Copilot · 2026-05-22T14:25:11Z

smoke-pi currently runs all five Pi smoke checks in a single agent turn. This change introduces an A/B experiment on sub_agent_decomposition so the workflow can compare the current single-agent path against a parallel sub-agent implementation using the built-in experiments runtime.

Experiment metadata
- Adds experiments.sub_agent_decomposition to /.github/workflows/smoke-pi.md
- Configures:
  - variants: single_agent, parallel_sub_agents
  - primary metric: effective_token_count
  - secondary and guardrail metrics
  - sample size, weights, start date, analysis type, and tags
- Leaves the issue field as a placeholder comment until a real tracking issue number is available

Prompt branching
- Replaces the single static test-requirements block with variant-specific Handlebars branches
- parallel_sub_agents instructs the workflow to launch five background task agents, one per smoke check, then aggregate results via read_agent
- else preserves the existing sequential single-turn behavior as the baseline

Compiled workflow regeneration
- Recompiles smoke-pi so /.github/workflows/smoke-pi.lock.yml includes:
  - experiment spec serialization
  - runtime variant selection wiring
  - prompt interpolation inputs for sub_agent_decomposition
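
To make the "runtime variant selection" step concrete, here is a minimal sketch of weighted, deterministic variant assignment. It is not the gh-aw experiments runtime, which may select variants differently (for example by using persisted experiment state). The `pick_variant` helper, the 50/50 weights, and the use of a run key are all assumptions for illustration.

```python
import hashlib

# Hypothetical weights for illustration; the PR does not state the actual values.
VARIANT_WEIGHTS = {"single_agent": 50, "parallel_sub_agents": 50}


def pick_variant(run_key: str, weights: dict[str, int]) -> str:
    """Map a run key to a variant, in proportion to each variant's weight.

    The same run_key always yields the same variant, so a re-run of the
    same workflow run stays in the same arm of the experiment.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hash the key and reduce it to a bucket in [0, total).
    digest = hashlib.sha256(run_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Walk the cumulative weights until the bucket falls inside a variant's range.
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return name
    raise AssertionError("unreachable: bucket is always below the total weight")
```

A caller could pass something like the workflow run ID as `run_key`; hashing keeps the assignment stable without storing any state, at the cost of not being able to rebalance arms mid-experiment.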

Example of the new prompt split:

{{#if experiments.sub_agent_decomposition == "parallel_sub_agents"}}
Launch five parallel `task` agents using mode: "background" to execute each smoke test independently.
{{else}}
Execute the following tests sequentially in a single turn:
{{/if}}
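
The branch above can be read as a plain two-way switch. The sketch below mirrors it in Python to show one consequence of using `{{else}}` as the baseline: any value other than `parallel_sub_agents` (including `single_agent`, an empty string, or an unknown variant) falls through to the sequential path. The `render_test_requirements` function is hypothetical and only models the template's behavior as written; it is not how the workflow compiler renders prompts.

```python
# The two prompt fragments, copied from the template branches in the PR description.
PARALLEL_BLOCK = (
    'Launch five parallel `task` agents using mode: "background" '
    "to execute each smoke test independently."
)
SEQUENTIAL_BLOCK = "Execute the following tests sequentially in a single turn:"


def render_test_requirements(variant: str) -> str:
    """Return the prompt fragment for the selected experiment variant."""
    if variant == "parallel_sub_agents":
        return PARALLEL_BLOCK
    # Everything else, including an unset or misspelled variant, is the baseline.
    return SEQUENTIAL_BLOCK
```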

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

Introduces an A/B experiment (sub_agent_decomposition) for the smoke-pi workflow to compare the existing single-agent smoke run against a parallel background sub-agent approach, with the compiled lock workflow regenerated to include experiments runtime wiring.

Changes:

- Added experiments.sub_agent_decomposition metadata to smoke-pi workflow frontmatter (variants, metrics, weights, analysis).
- Branched the “Test Requirements” prompt to run either sequential tests (baseline) or five parallel background task agents with aggregation via read_agent.
- Regenerated smoke-pi.lock.yml to serialize the experiment spec, select variants at runtime, persist experiment state, and pass variant inputs into prompt interpolation.

Show a summary per file

| File | Description |
| --- | --- |
| .github/workflows/smoke-pi.md | Adds experiment spec and Handlebars branching for single-agent vs parallel sub-agent smoke execution. |
| .github/workflows/smoke-pi.lock.yml | Recompiled workflow to include experiment selection, state restore/push, artifact wiring, and prompt env var injection. |

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 3

+{{#if experiments.sub_agent_decomposition == "parallel_sub_agents"}}
+Launch five parallel `task` agents using mode: "background" to execute each smoke test independently. Use the `task` agent type with `description` field for each:
+
+1. **GitHub MCP Test Agent**: Fetch 2 merged PR titles from ${{ github.repository }}
+2. **Web Fetch Test Agent**: Fetch https://github.com and verify "GitHub" in response using web-fetch MCP
+3. **File I/O Test Agent**: Create `/tmp/gh-aw/agent/smoke-test-pi-${{ github.run_id }}.txt` with timestamp
+4. **Bash Test Agent**: Verify file creation with `cat` command
+5. **Build Test Agent**: Run `GOCACHE=/tmp/go-cache GOMODCACHE=/tmp/go-mod make build`
+
+Wait for all five agents to complete (you'll receive notifications). Read each agent's result using `read_agent`. Aggregate the results into a unified report with ✅/❌ status for each test.
+
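
The fan-out and aggregation pattern described in this hunk (launch five independent checks, wait for all of them, then build one ✅/❌ report) can be sketched with `asyncio`. This is an analogy only: the real workflow uses background `task` agents and `read_agent`, not Python coroutines, and the check functions below are stand-ins that simulate a pass or a failure.

```python
import asyncio
from typing import Awaitable, Callable

Check = Callable[[], Awaitable[None]]


async def run_check(name: str, check: Check) -> tuple[str, bool, str]:
    """Run one check and capture its outcome instead of letting it raise."""
    try:
        await check()
        return name, True, ""
    except Exception as exc:
        return name, False, str(exc)


async def run_all(checks: dict[str, Check]) -> str:
    """Run every check concurrently, then aggregate into a single report."""
    # gather returns results in the order the awaitables were passed in.
    results = await asyncio.gather(*(run_check(n, c) for n, c in checks.items()))
    lines = []
    for name, ok, detail in results:
        status = "✅" if ok else "❌"
        lines.append(f"{status} {name}" + (f": {detail}" if detail else ""))
    return "\n".join(lines)


async def _passes() -> None:
    await asyncio.sleep(0)  # stand-in for real work


async def _fails() -> None:
    raise RuntimeError("simulated failure")


def demo_report() -> str:
    """Build a report from five simulated checks, one of which fails."""
    checks: dict[str, Check] = {
        "GitHub MCP": _passes,
        "Web Fetch": _passes,
        "File I/O": _passes,
        "Bash": _passes,
        "Build": _fails,
    }
    return asyncio.run(run_all(checks))
```

One design point the sketch leaves out: check 4 reads the file that check 3 creates, so a fully independent fan-out may need to order those two checks or have the Bash check create its own file.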


Initial plan

5f71965

Copilot AI assigned Copilot and pelikhan May 22, 2026

Copilot started work on behalf of pelikhan May 22, 2026 14:31

Copilot AI linked an issue May 22, 2026 that may be closed by this pull request

[ab-advisor] Experiment campaign for smoke-pi: A/B test sub_agent_strategy #34021

Closed

9 tasks

Add smoke-pi sub-agent experiment

b3ce8c6

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Add A/B test for sub_agent_strategy in smoke-pi campaign" to "Add A/B experiment wiring for smoke-pi sub-agent decomposition" May 22, 2026

Copilot AI requested a review from pelikhan May 22, 2026 14:44

Copilot finished work on behalf of pelikhan May 22, 2026 14:45

pelikhan marked this pull request as ready for review May 22, 2026 14:46

Copilot AI review requested due to automatic review settings May 22, 2026 14:46

pelikhan merged commit 25aa24e into main May 22, 2026

pelikhan deleted the copilot/ab-advisor-experiment-campaign-smoke-pi branch May 22, 2026 14:46

Copilot started reviewing on behalf of pelikhan May 22, 2026 14:47

Copilot AI reviewed May 22, 2026

pelikhan merged 2 commits into main from copilot/ab-advisor-experiment-campaign-smoke-pi
