[plan] Create daily-experiment-report workflow for statistical analysis #29613

@github-actions

Objective

Implement a daily-experiment-report agentic workflow (markdown + compiled lock file) that aggregates experiment state artifacts across recent runs, computes per-variant statistics, and posts an ASCII comparison table as a discussion comment.

Context

Issue #29604 (Area 2 — Reporting & Dashboards). Without automated reporting, experiment results are invisible unless an engineer manually downloads and parses artifacts.

Approach

1. Create the workflow markdown

Create .github/aw/daily-experiment-report.md with:

---
name: daily-experiment-report
on:
  schedule:
    - cron: "0 8 * * *"   # 08:00 UTC daily
  workflow_dispatch:
engine: copilot
tools:
  github:
    toolsets: [default]
---

2. Prompt the agent to

  1. For each workflow in the repository whose frontmatter declares an experiments: key, list its last 30 runs.
  2. Download the experiment artifact (state.json) from each run.
  3. Parse { experiment, variant, run_id, token_count, duration_ms, conclusion } from each artifact.
  4. Compute per-variant statistics: mean, variance, 95% CI, sample size, success rate.
  5. Test for significance with a Welch t-test for continuous metrics (token_count, duration_ms) or a two-proportion z-test for binary outcomes (success rate).
  6. Render one ASCII table per experiment, using the format given in issue #29604 ([ab-advisor] Improve experiment infrastructure: schema, reporting & audit).
  7. Post the table as a comment on the tracking issue if issue: is set; otherwise write it to the workflow step summary.
  8. Recommend "promote", "extend", or "abandon" based on p-value and effect size.
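
The statistics in steps 4, 5 and 8 can be sketched in standard-library Python as below. This is a minimal illustration, not the workflow's actual implementation: the function names, the alpha and min_effect thresholds, and the promote/extend/abandon rule are all hypothetical, and the p-values use a normal approximation instead of the exact t distribution, which is only reasonable at sample sizes near the 30 runs mentioned above.

```python
import math
from statistics import NormalDist, mean, variance

_STD_NORMAL = NormalDist()  # standard normal, used for CIs and approximate p-values


def summarize(values, confidence=0.95):
    """Per-variant sample size, mean, sample variance and an approximate CI."""
    n = len(values)
    m = mean(values)
    var = variance(values) if n > 1 else 0.0
    z = _STD_NORMAL.inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * math.sqrt(var / n)
    return {"n": n, "mean": m, "variance": var, "ci": (m - half_width, m + half_width)}


def welch_t(a, b):
    """Welch t statistic, Welch-Satterthwaite df, approximate two-sided p-value."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    if va + vb == 0:
        return 0.0, float("nan"), 1.0  # no variation in either sample: no evidence
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    # Assumption: normal approximation to the t distribution (slightly optimistic
    # for small df); an exact p-value needs a t CDF, e.g. from SciPy.
    p = 2 * (1 - _STD_NORMAL.cdf(abs(t)))
    return t, df, p


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z statistic and two-sided p-value."""
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # all successes or all failures in both variants
    z = (success_a / n_a - success_b / n_b) / se
    return z, 2 * (1 - _STD_NORMAL.cdf(abs(z)))


def recommend(p_value, relative_effect, alpha=0.05, min_effect=0.05):
    """Hypothetical rule; relative_effect > 0 means the variant is better."""
    if p_value >= alpha:
        return "extend"    # not yet significant: collect more runs
    if relative_effect >= min_effect:
        return "promote"   # significant and large enough to matter
    return "abandon"       # significant but harmful or negligible


control = [1200, 1150, 1300, 1250, 1220]     # made-up token_count samples
treatment = [1000, 1050, 980, 1100, 1020]
t_stat, df, p = welch_t(control, treatment)
# For a cost metric, a lower treatment mean counts as a positive effect.
effect = (mean(control) - mean(treatment)) / mean(control)
print(summarize(treatment), round(t_stat, 2), round(df, 1), recommend(p, effect))
```

For binary outcomes, the per-variant success counts derived from conclusion would go through two_proportion_z in place of welch_t, with the resulting p-value passed to the same recommendation rule.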

3. Compile and validate

Run ./gh-aw compile .github/aw/daily-experiment-report.md and commit the generated .lock.yml.

Files to Create/Modify

  • .github/aw/daily-experiment-report.md — new workflow
  • .github/aw/daily-experiment-report.lock.yml — compiled workflow (generated)

Acceptance Criteria

Generated by Plan Command for issue #29604 · ● 253.3K · expires on May 3, 2026, 8:12 PM UTC
