[ab-advisor] Add sub_agent_strategy A/B experiment to smoke-temporary-id workflow by Copilot · Pull Request #34020 · github/gh-aw

Copilot · 2026-05-22T14:03:36Z

Implements the sub_agent_strategy experiment campaign on smoke-temporary-id, testing whether decomposing issue creation into parallel sub-agents reduces token consumption vs. the current single-agent approach.

Frontmatter

Added rich experiments.sub_agent_strategy block:

Variants: single_agent (control) / sub_agents (treatment), 50/50 weight
Primary metric: effective_token_count; secondary: run_duration_seconds, issue_creation_success_rate
Guardrails: all_issues_created == 3, temporary_id_resolution_rate >= 0.95
min_samples: 20, analysis_type: t_test, start_date: 2026-05-23
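As a rough illustration of how the metrics above might be evaluated once enough runs exist, here is a minimal sketch. It is not gh-aw's analysis code. It assumes that `analysis_type: t_test` can be approximated by Welch's unequal-variance t-test on `effective_token_count`, and that `min_samples` applies per variant; neither is confirmed by this PR. The function names are hypothetical.

```python
import math
from statistics import mean, variance

MIN_SAMPLES = 20  # min_samples from the frontmatter; per-variant is an assumption


def welch_t(control, treatment):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(control), len(treatment)
    # Squared standard errors of each sample mean.
    se1, se2 = variance(control) / n1, variance(treatment) / n2
    if se1 + se2 == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(treatment) - mean(control)) / math.sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


def guardrails_ok(all_issues_created, temporary_id_resolution_rate):
    """Check the two guardrail thresholds listed in the PR description."""
    return all_issues_created == 3 and temporary_id_resolution_rate >= 0.95


def ready_for_analysis(control, treatment):
    """True once each variant has at least MIN_SAMPLES observations."""
    return len(control) >= MIN_SAMPLES and len(treatment) >= MIN_SAMPLES
```

A negative t statistic would indicate that the sub_agents variant used fewer tokens on average than the single_agent control.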

Workflow body

Wrapped prompt in two {{#if}} branches:

{{#if experiments.sub_agent_strategy == 'single_agent'}}
## Single-Agent Mode
Create all issues in this context.
...3 create_issue JSON blocks...
{{/if}}

{{#if experiments.sub_agent_strategy == 'sub_agents'}}
## Sub-Agent Mode
Launch 3 background `task` agents (one per issue) in parallel, wait for completion...
{{/if}}

## Final Step: Add Summary Comment
...shared add_comment block for both variants...

The add_comment step is intentionally outside both conditional blocks so it runs regardless of variant.
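The gating behavior can be sketched with a toy renderer. This is not the gh-aw compiler; it is a hypothetical regex-based stand-in that handles only non-nested `{{#if experiments.<name> == '<value>'}}` blocks, enough to show that exactly one variant branch survives while the shared final step always remains.

```python
import re

# Matches {{#if experiments.<name> == '<value>'}} ... {{/if}} (non-nested only).
IF_BLOCK = re.compile(
    r"\{\{#if experiments\.(\w+) == '([^']*)'\}\}\n?(.*?)\{\{/if\}\}\n?",
    re.DOTALL,
)


def render(template, experiments):
    """Keep the body of each block whose condition matches the assigned variant."""
    def keep_or_drop(match):
        name, value, body = match.groups()
        return body if experiments.get(name) == value else ""
    return IF_BLOCK.sub(keep_or_drop, template)


TEMPLATE = """{{#if experiments.sub_agent_strategy == 'single_agent'}}
## Single-Agent Mode
{{/if}}
{{#if experiments.sub_agent_strategy == 'sub_agents'}}
## Sub-Agent Mode
{{/if}}
## Final Step: Add Summary Comment
"""

if __name__ == "__main__":
    print(render(TEMPLATE, {"sub_agent_strategy": "sub_agents"}))
```

Because the summary heading sits outside both blocks, it appears in the rendered prompt for either variant.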

Schema adaptations

Dropped issue: "#aw_campaign" — schema requires an integer issue number
Dropped direction: from guardrail metric entries — not a recognized field (name + threshold only)
Used single-quoted Handlebars expressions (== 'value') per compiler requirement

Lock file regenerated via gh aw compile smoke-temporary-id.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

…kflow - Add experiments.sub_agent_strategy block to frontmatter (variants: single_agent, sub_agents) - Wrap workflow body with {{#if}} conditional blocks for each variant - single_agent: original single-context approach (3 create_issue calls in one context) - sub_agents: coordinator launches 3 background task agents (one per issue) in parallel - Final add_comment step is shared outside both variant blocks - Regenerate smoke-temporary-id.lock.yml via gh aw compile Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>


Copilot

Pull request overview

Adds an A/B experiment (experiments.sub_agent_strategy) to the smoke-temporary-id workflow to compare the existing single-agent issue creation flow against a parallelized “sub-agent” strategy, and regenerates the compiled lock artifacts accordingly.

Changes:

Introduces rich experiments.sub_agent_strategy frontmatter (variants, metrics, guardrails, weights, analysis metadata).
Splits the workflow prompt into two experiment-gated branches: single-agent vs. background sub-agent orchestration.
Regenerates workflow lockfiles and updates the actions lock to include docker/metadata-action@v6.

Show a summary per file

| File | Description |
| --- | --- |
| `.github/workflows/smoke-temporary-id.md` | Adds `sub_agent_strategy` experiment metadata and conditional prompt branches for single-agent vs sub-agent execution. |
| `.github/workflows/smoke-temporary-id.lock.yml` | Recompiled lockfile reflecting experiment plumbing and updated pinned runtime/action/container details. |
| `.github/workflows/release.lock.yml` | Updates pinned `docker/metadata-action` reference to the `v6` entry. |
| `.github/aw/actions-lock.json` | Adds a lock entry for `docker/metadata-action@v6` (SHA pinned). |

Copilot's findings

Comments suppressed due to low confidence (3)

.github/workflows/smoke-temporary-id.lock.yml:520

The workflow installs AWF v0.25.49, which is a downgrade relative to other lockfiles in this repository that install v0.25.51. If the intent is only to add the experiment, consider recompiling with the same gh-aw/AWF versions used elsewhere to avoid behavioral drift across workflows.

      - name: Install GitHub Copilot CLI
        run: bash "${RUNNER_TEMP}/gh-aw/actions/install_copilot_cli.sh" 1.0.48
        env:
          GH_HOST: github.com
      - name: Install AWF binary
        run: bash "${RUNNER_TEMP}/gh-aw/actions/install_awf_binary.sh" v0.25.49
      - name: Determine automatic lockdown mode for GitHub MCP Server

.github/workflows/smoke-temporary-id.lock.yml:548

The image is pre-pulled using a digest-pinned reference (gh-aw-mcpg:v0.3.9@sha256:…), but later docker run uses only the tag (gh-aw-mcpg:v0.3.9). Depending on how download_docker_images.sh tags images locally, this can cause an extra pull or run a different digest than the one pinned here; align the docker run reference with the digest-pinned value (or stop using the digest in the pre-pull) to keep execution reproducible.

        run: bash "${RUNNER_TEMP}/gh-aw/actions/restore_inline_sub_agents.sh"
      - name: Download container images
        run: bash "${RUNNER_TEMP}/gh-aw/actions/download_docker_images.sh" ghcr.io/github/gh-aw-firewall/agent:0.25.49 ghcr.io/github/gh-aw-firewall/api-proxy:0.25.49 ghcr.io/github/gh-aw-firewall/squid:0.25.49 ghcr.io/github/gh-aw-mcpg:v0.3.9@sha256:64828b42a4482f58fab16509d7f8f495a6d97c972a98a68aff20543531ac0388 ghcr.io/github/github-mcp-server:v1.0.4 node:lts-alpine@sha256:d1b3b4da11eefd5941e7f0b9cf17783fc99d9c6fc34884a665f40a06dbdfc94f
      - name: Generate Safe Outputs Config

.github/workflows/smoke-temporary-id.lock.yml:804

MCP_GATEWAY_DOCKER_COMMAND runs ghcr.io/github/gh-aw-mcpg:v0.3.9 by tag, but earlier the workflow pre-downloads ghcr.io/github/gh-aw-mcpg:v0.3.9@sha256:…. If the digest-pinned pull doesn’t create/refresh the :v0.3.9 tag locally, docker run may pull a different image than intended. Consider using the same digest-pinned reference in docker run (or pull by tag consistently) so the executed image is deterministic.

          DOCKER_SOCK_GID=$(stat -c '%g' "$DOCKER_SOCK_PATH" 2>/dev/null || echo '0')
          export MCP_GATEWAY_DOCKER_COMMAND='docker run -i --rm --network host --add-host host.docker.internal:127.0.0.1 --user '"${MCP_GATEWAY_UID}"':'"${MCP_GATEWAY_GID}"' --group-add '"${DOCKER_SOCK_GID}"' -v '"${DOCKER_SOCK_PATH}"':/var/run/docker.sock -e MCP_GATEWAY_PORT -e MCP_GATEWAY_DOMAIN -e MCP_GATEWAY_API_KEY -e MCP_GATEWAY_PAYLOAD_DIR -e MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD -e DOCKER_HOST=unix:///var/run/docker.sock -e DEBUG -e MCP_GATEWAY_LOG_DIR -e GH_AW_MCP_LOG_DIR -e GH_AW_SAFE_OUTPUTS -e GH_AW_SAFE_OUTPUTS_CONFIG_PATH -e GH_AW_SAFE_OUTPUTS_TOOLS_PATH -e GH_AW_ASSETS_BRANCH -e GH_AW_ASSETS_MAX_SIZE_KB -e GH_AW_ASSETS_ALLOWED_EXTS -e DEFAULT_BRANCH -e GITHUB_MCP_SERVER_TOKEN -e GITHUB_MCP_GUARD_MIN_INTEGRITY -e GITHUB_MCP_GUARD_REPOS -e GITHUB_REPOSITORY -e GITHUB_SERVER_URL -e GITHUB_SHA -e GITHUB_WORKSPACE -e GITHUB_TOKEN -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e GITHUB_JOB -e GITHUB_ACTION -e GITHUB_EVENT_NAME -e GITHUB_EVENT_PATH -e GITHUB_ACTOR -e GITHUB_ACTOR_ID -e GITHUB_TRIGGERING_ACTOR -e GITHUB_WORKFLOW -e GITHUB_WORKFLOW_REF -e GITHUB_WORKFLOW_SHA -e GITHUB_REF -e GITHUB_REF_NAME -e GITHUB_REF_TYPE -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GH_AW_SAFE_OUTPUTS_PORT -e GH_AW_SAFE_OUTPUTS_API_KEY -e GITHUB_AW_OTEL_TRACE_ID -e GITHUB_AW_OTEL_PARENT_SPAN_ID -e OTEL_EXPORTER_OTLP_HEADERS -v /tmp/gh-aw/mcp-payloads:/tmp/gh-aw/mcp-payloads:rw -v /opt:/opt:ro -v /tmp:/tmp:rw -v '"${GITHUB_WORKSPACE}"':'"${GITHUB_WORKSPACE}"':rw ghcr.io/github/gh-aw-mcpg:v0.3.9'
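The tag-versus-digest mismatch described in the two comments above could be caught by a small consistency check. The following is a minimal sketch, not part of gh-aw; the function names are hypothetical, and it assumes image references use the usual `name:tag@digest` form.

```python
def split_ref(ref):
    """Split an image reference into (name-with-tag, digest-or-None)."""
    name, _, digest = ref.partition("@")
    return name, digest or None


def unpinned_runs(pulled, run_refs):
    """Return run references that drop or change a digest pinned at pre-pull time."""
    pinned = {}
    for ref in pulled:
        name, digest = split_ref(ref)
        if digest:
            pinned[name] = digest
    problems = []
    for ref in run_refs:
        name, digest = split_ref(ref)
        if name in pinned and digest != pinned[name]:
            problems.append(ref)
    return problems


if __name__ == "__main__":
    pulled = [
        "ghcr.io/github/gh-aw-mcpg:v0.3.9@sha256:"
        "64828b42a4482f58fab16509d7f8f495a6d97c972a98a68aff20543531ac0388",
        "ghcr.io/github/github-mcp-server:v1.0.4",
    ]
    # The docker run command in the excerpt above references the tag only.
    print(unpinned_runs(pulled, ["ghcr.io/github/gh-aw-mcpg:v0.3.9"]))
```

Applied to the references quoted in this review, the check flags the gateway image, since it is pulled by digest but run by tag.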

Files reviewed: 4/4 changed files
Comments generated: 2

  "temporary_id": "aw_test03",
  "parent": "aw_test01",
  "title": "Sub-Issue 2: Test Different ID Length",
  "body": "This is sub-issue 2 with an 8-character temporary ID.\n\nParent: #aw_test01\nRelated: #aw_test02\n\nTesting that longer temporary IDs (8 chars) work correctly."
 }


 # Container images used:
-#   - ghcr.io/github/gh-aw-firewall/agent:0.25.51
-#   - ghcr.io/github/gh-aw-firewall/api-proxy:0.25.51
-#   - ghcr.io/github/gh-aw-firewall/squid:0.25.51
-#   - ghcr.io/github/gh-aw-mcpg:v0.3.17
+#   - ghcr.io/github/gh-aw-firewall/agent:0.25.49
+#   - ghcr.io/github/gh-aw-firewall/api-proxy:0.25.49
+#   - ghcr.io/github/gh-aw-firewall/squid:0.25.49
+#   - ghcr.io/github/gh-aw-mcpg:v0.3.9@sha256:64828b42a4482f58fab16509d7f8f495a6d97c972a98a68aff20543531ac0388


Initial plan

631f6e6

Copilot AI assigned Copilot and pelikhan May 22, 2026

Copilot started work on behalf of pelikhan May 22, 2026 14:19 View session

Copilot AI linked an issue May 22, 2026 that may be closed by this pull request

[ab-advisor] Experiment campaign for smoke-temporary-id: A/B test sub_agent_strategy #34010

Closed

Copilot AI and others added 2 commits May 22, 2026 14:25

chore: initial plan for smoke-temporary-id A/B experiment

a238dc8

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add A/B test for sub_agent_strategy in smoke-temporary-id~~ [ab-advisor] Add sub_agent_strategy A/B experiment to smoke-temporary-id workflow May 22, 2026

Copilot finished work on behalf of pelikhan May 22, 2026 14:30

Copilot AI requested a review from pelikhan May 22, 2026 14:30

pelikhan marked this pull request as ready for review May 22, 2026 14:31

Copilot AI review requested due to automatic review settings May 22, 2026 14:31

pelikhan merged commit 737991d into main May 22, 2026

pelikhan deleted the copilot/ab-advisor-experiment-campaign-smoke-temporary-id branch May 22, 2026 14:31

Copilot started reviewing on behalf of pelikhan May 22, 2026 14:32 View session

Copilot AI reviewed May 22, 2026

pelikhan merged 3 commits into main from copilot/ab-advisor-experiment-campaign-smoke-temporary-id

3 participants
