feat: add storage option to experiments (cache | repo, default repo)#29996
feat: add storage option to experiments (cache | repo, default repo)#29996
Conversation
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/69d3d119-87f1-4006-a972-5f756dc06abd Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
@copilot use helper to create signed commits |
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/90d876e4-b93a-455a-b801-0cf0a62d7a57 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done in 06575aa. Added a dedicated Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
There was a problem hiding this comment.
Pull request overview
Adds an experiments.storage frontmatter option to persist A/B experiment state either via GitHub Actions cache (legacy) or a durable git branch in the repo (default).
Changes:
- Extend workflow compilation to parse
experiments.storageand generate either cache-based or repo-branch-based experiment persistence steps. - Add repo-backed persistence implementation: load
state.jsonfrom anexperiments/{sanitizedWorkflowID}branch and push updated state via a newpush_experiments_statejob. - Update schema + docs and add JS/Go tests for the new parsing/step-generation behavior.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/compiler_types.go | Adds ExperimentsStorage to compiled workflow data. |
| pkg/workflow/compiler_orchestrator_workflow.go | Extracts experiments.storage from frontmatter into WorkflowData. |
| pkg/workflow/compiler_jobs.go | Adds wrapper to create push_experiments_state and wires it into conclusion dependencies. |
| pkg/workflow/compiler_experiments_test.go | Adds tests for repo vs cache step generation, branch naming, and reserved storage key behavior. |
| pkg/workflow/compiler_experiments.go | Implements storage selection, repo-branch naming, repo restore steps, artifact upload refactor, and push job builder. |
| pkg/parser/schemas/main_workflow_schema.json | Adds experiments.storage schema (enum/default) and reserves the storage key. |
| docs/src/content/docs/guides/experiments.md | Documents `storage: repo |
| actions/setup/js/load_experiment_state_from_repo.cjs | New helper to load state.json from the experiments branch via GitHub API. |
| actions/setup/js/load_experiment_state_from_repo.test.cjs | Unit tests for repo load helper (success/404/error paths). |
| actions/setup/js/push_experiment_state.cjs | New helper to commit/push experiment state to the experiments branch with signed-commit path + retries. |
| actions/setup/js/push_experiment_state.test.cjs | Unit tests for push helper env validation / empty state dir behavior. |
| .github/workflows/smoke-copilot.lock.yml | Regenerates locked workflow to use repo-based experiment restore + adds push_experiments_state job. |
| .github/workflows/daily-community-attribution.lock.yml | Regenerates locked workflow similarly (repo restore + push job). |
| .github/workflows/daily-astrostylelite-markdown-spellcheck.lock.yml | Regenerates locked workflow similarly (repo restore + push job). |
| .github/aw/experiments.md | Updates internal docs to describe storage configuration and push job. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (1)
pkg/parser/schemas/main_workflow_schema.json:2903
properties.storage.descriptionreferences a branch namedexperiments/{workflowID}, but the implementation uses a sanitized workflow ID for the branch suffix (lowercase, hyphens removed). Please align this description with the actual branch naming scheme so users don’t look for a non-existent branch.
"default": "repo",
"description": "Storage backend for experiment state. 'repo' (default) persists state to a git branch named 'experiments/{workflowID}' for durability across cache evictions. 'cache' uses GitHub Actions cache (legacy behaviour). Repo storage is recommended because experiment data is valuable and more durable than cache."
}
- Files reviewed: 15/15 changed files
- Comments generated: 6
| try { | ||
| const { stdout: lsOut } = await exec.getExecOutput("git", ["ls-remote", "origin", `refs/heads/${branchName}`], { cwd: workspaceDir }); | ||
| const remoteHead = lsOut.trim().split(/\s+/)[0] || ""; | ||
| if (remoteHead && remoteHead !== currentBaseRef) { | ||
| currentBaseRef = remoteHead; | ||
| core.info(`Refreshed baseRef for retry: ${currentBaseRef}`); | ||
| } | ||
| } catch { | ||
| // ls-remote failed; keep existing baseRef |
|
|
||
| | Value | Behaviour | | ||
| |---|---| | ||
| | `repo` (**default**) | Commits state to a git branch named `experiments/{workflowID}`. Durable — survives cache evictions. Requires `contents: write` permission (added automatically by the compiler). | |
| | `repo` (**default**) | Commits `state.json` to a git branch named `experiments/{workflowID}` after each run. State survives cache evictions. | Recommended for all experiments — experiment data is valuable. | | ||
| | `cache` | Uses GitHub Actions cache (legacy behaviour). State may be evicted after 7 days of inactivity. | Only when `contents: write` cannot be granted to the workflow. | | ||
|
|
||
| **Key differences:** | ||
|
|
||
| - **`repo` storage** adds a `push_experiments_state` job that runs after the activation job. This job commits the updated state to a branch like `experiments/myworkflow` using `contents: write` permission. The state is durable and survives long periods without workflow runs. |
| }, | ||
| "experiments": { | ||
| "description": "A/B testing experiments. Each key is an experiment name; the value is either an array of two or more variant strings (bare-array form) or an object with a 'variants' field plus optional metadata fields (description, metric, weight, issue, start_date, end_date, hypothesis, secondary_metrics, guardrail_metrics, min_samples, owner). At runtime the activation job picks a variant using actions/cache to maintain consistent assignment across runs. Use ${{ experiments.<name> }} in the workflow prompt to reference the selected variant. When multiple experiments are declared, assignments are statistically balanced using a least-used counter that round-robins across variants (or weighted when 'weight' is provided); ties are broken randomly so no variant is systematically favoured on the first run.", | ||
| "description": "A/B testing experiments. Each key is an experiment name; the value is either an array of two or more variant strings (bare-array form) or an object with a 'variants' field plus optional metadata fields (description, metric, weight, issue, start_date, end_date, hypothesis, secondary_metrics, guardrail_metrics, min_samples, owner). The reserved 'storage' key controls how experiment state is persisted: 'repo' (default) commits state to a git branch named 'experiments/{workflowID}' for durability; 'cache' uses GitHub Actions cache. At runtime the activation job picks a variant and persists the updated counters. Use ${{ experiments.<name> }} in the workflow prompt to reference the selected variant. When multiple experiments are declared, assignments are statistically balanced using a least-used counter that round-robins across variants (or weighted when 'weight' is provided); ties are broken randomly so no variant is systematically favoured on the first run.", |
| // Build push_experiments_state job when experiment storage is "repo" | ||
| pushExperimentsJobName, err := c.buildPushExperimentsStateJobWrapper(data) | ||
| if err != nil { | ||
| return err | ||
| } |
| const [owner, repo] = (process.env.GITHUB_REPOSITORY || "/").split("/"); | ||
| if (!owner || !repo) { | ||
| core.warning("GITHUB_REPOSITORY is not set – starting with empty experiment state"); | ||
| fs.mkdirSync(stateDir, { recursive: true }); | ||
| return; | ||
| } | ||
|
|
||
| const octokit = getOctokit(process.env.GITHUB_TOKEN || ""); | ||
| const stateFileName = path.basename(stateFile); | ||
|
|
🧪 Test Quality Sentinel ReportTest Quality Score: 96/100✅ Excellent test quality
Test Classification DetailsView all 13 test classifications
Flagged Tests — Requires ReviewNo tests require review. All tests are well-structured behavioral contracts. Language SupportTests analyzed:
Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators. References:
|
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Commit pushed:
|
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (398 new lines in business logic directories, threshold: 100) but does not have a linked Architecture Decision Record (ADR). AI has analyzed the PR diff and generated a draft ADR to help you get started: 📄 Draft ADR: What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
References: §25288924793
|
|
@copilot review all comments |
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/9214d13e-6db8-4f64-910f-2e92f57f9650 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/9214d13e-6db8-4f64-910f-2e92f57f9650 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Addressed all review comments in 5a07c86 and 460e3ac:
Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Summary
Adds a
storageoption to theexperiments:frontmatter section that controls how A/B experiment state is persisted across workflow runs.Why: Experiment data is valuable — accumulated run counts inform A/B decisions that take weeks to accumulate. GitHub Actions cache can be evicted after 7 days of inactivity (e.g. over a holiday), resetting all counts.
repostorage commits state to a git branch, which is permanent.Usage
repo(default)state.jsontoexperiments/{workflowID}git branch after each run. Durable. Adds apush_experiments_statejob (requirescontents: write).cacheChanges
Go
WorkflowData.ExperimentsStoragefieldextractExperimentsStorageFromFrontmatter()— readsstoragekey from experiments mapexperimentsBranchName()— returnsexperiments/{sanitizedID}branch namegenerateExperimentSteps()now dispatches togenerateExperimentCacheSteps()orgenerateExperimentRepoSteps()buildPushExperimentsStateJob()— new job that downloads experiment artifact and commits state to git branch viapush_experiment_state.cjsbuildMemoryManagementJobs()andupdateConclusionJobDependencies()to include push_experiments_stateJavaScript (new files)
actions/setup/js/load_experiment_state_from_repo.cjs— fetchesstate.jsonfrom the experiments git branch via GitHub API; falls back to empty state on first run (404)actions/setup/js/load_experiment_state_from_repo.test.cjs— 5 unit tests covering success, 404, and error pathsactions/setup/js/push_experiment_state.cjs— dedicated helper that pushesstate.jsonandassignments.jsonto the experiments git branch usingpushSignedCommits(GitHub GraphQLcreateCommitOnBranchmutation for verified commits, plaingit pushfallback); includes retry logic with exponential backoffactions/setup/js/push_experiment_state.test.cjs— 3 unit tests covering missing env vars and empty state dirSchema
storageproperty to experiments JSON schema (enum:["cache", "repo"], default:"repo")storagekey is excluded fromextractExperimentConfigsFromFrontmatter(reserved keyword)Documentation
.github/aw/experiments.mdwith Storage Configuration sectiondocs/src/content/docs/guides/experiments.mdwith storage details