[plan] Create daily-experiment-report workflow for statistical analysis

## Objective

Implement a `daily-experiment-report` agentic workflow (markdown + compiled lock file) that aggregates experiment state artifacts across recent runs, computes per-variant statistics, and publishes an ASCII comparison table, either as a comment on the experiment's tracking issue or as a workflow step summary.

## Context

Issue #29604 (Area 2 — Reporting & Dashboards). Without automated reporting, experiment results are invisible unless an engineer manually downloads and parses artifacts.

## Approach

### 1. Create the workflow markdown

Create `.github/aw/daily-experiment-report.md` with:

```yaml
---
name: daily-experiment-report
on:
  schedule:
    - cron: "0 8 * * *"   # 08:00 UTC daily
  workflow_dispatch:
engine: copilot
tools:
  github:
    toolsets: [default]
---
```

### 2. Prompt the agent to

1. For each workflow in the repository that declares `experiments:` in its frontmatter, list the last 30 runs.
2. Download the `experiment` artifact (`state.json`) from each run.
3. Parse `{ experiment, variant, run_id, token_count, duration_ms, conclusion }` from each artifact.
4. Compute per-variant statistics for `token_count` and `duration_ms`: mean, variance, 95% confidence interval, and sample size. Compute the success rate from `conclusion`.
5. Detect significance at p < 0.05 using a Welch t-test for continuous metrics (`token_count`, `duration_ms`) or a two-proportion z-test for binary outcomes (success rate).
6. Render an ASCII table per experiment (see format in issue #29604).
7. Recommend "promote", "extend", or "abandon" based on p-value and effect size.
8. Post the table and recommendation as a comment on the tracking issue if `issue:` is set; otherwise write them to the workflow step summary.
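
The statistics in steps 4 and 5 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the workflow's actual implementation: the function names are hypothetical, the confidence interval uses a normal approximation, and the Welch p-value is approximated with the normal distribution rather than the t distribution (reasonable for large samples; `scipy.stats.ttest_ind(a, b, equal_var=False)` gives the exact value if SciPy is available).

```python
import math
from statistics import NormalDist, mean, variance


def summarize(values, z=1.96):
    """Mean, sample variance, and a normal-approximation 95% CI for one variant."""
    n = len(values)
    m = mean(values)
    var = variance(values) if n > 1 else 0.0
    half_width = z * math.sqrt(var / n)
    return {"n": n, "mean": m, "variance": var, "ci95": (m - half_width, m + half_width)}


def welch_t(a, b):
    """Welch t statistic, Welch-Satterthwaite degrees of freedom, approximate p-value.

    Needs at least two observations per variant.
    """
    va, vb = variance(a) / len(a), variance(b) / len(b)
    if va + vb == 0:
        # Both variants are constant, so the statistic is undefined; report no evidence.
        return 0.0, float("nan"), 1.0
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    # Assumption: normal approximation to the t distribution (underestimates p for small df).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test on success counts; returns (z, two-sided p)."""
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every run succeeded or every run failed: no detectable difference.
        return 0.0, 1.0
    z = (success_a / n_a - success_b / n_b) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


if __name__ == "__main__":
    # Made-up duration_ms samples for two variants, for illustration only.
    control = [1200, 1350, 1280, 1410, 1190]
    treatment = [1010, 990, 1120, 1050, 1080]
    print(summarize(control))
    print(welch_t(control, treatment))
    print(two_proportion_z(45, 50, 30, 50))
```

With at most 30 runs per workflow, per-variant samples are small, so the prompt may want to ask the agent to use the exact t distribution where a statistics library is available.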

### 3. Compile and validate

Run `./gh-aw compile .github/aw/daily-experiment-report.md` and commit the generated `.lock.yml`.

## Files to Create/Modify

- `.github/aw/daily-experiment-report.md` — new workflow
- `.github/aw/daily-experiment-report.lock.yml` — compiled workflow (generated)

## Acceptance Criteria

- [ ] Workflow compiles without errors (`./gh-aw compile`)
- [ ] Prompt clearly instructs the agent to aggregate artifacts and compute statistics
- [ ] ASCII table format matches the example in issue #29604
- [ ] Significance detection threshold (p < 0.05) is documented in the prompt
- [ ] Recommendation logic (promote / extend / abandon) is included in the prompt
- [ ] `make recompile` passes with no diff on the new file
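
One way to make the recommendation criterion concrete is sketched below. Only the p < 0.05 threshold comes from this issue. The function name, the signed effect size (positive meaning the variant improves on the baseline), the `min_effect` cutoff of 0.2, and the `min_samples` cutoff of 20 are all hypothetical placeholders that the prompt would need to define.

```python
def recommend(p_value, effect_size, n_per_variant,
              alpha=0.05, min_effect=0.2, min_samples=20):
    """Map a test result to "promote", "extend", or "abandon".

    alpha matches the p < 0.05 threshold in the acceptance criteria.
    min_effect and min_samples are illustrative assumptions, not project values.
    """
    significant = p_value < alpha
    if significant and effect_size >= min_effect:
        return "promote"   # significant and meaningfully better
    if significant and effect_size <= -min_effect:
        return "abandon"   # significant and meaningfully worse
    if n_per_variant < min_samples:
        return "extend"    # inconclusive so far; collect more runs
    return "abandon"       # enough data and still no meaningful gain


if __name__ == "__main__":
    print(recommend(p_value=0.01, effect_size=0.5, n_per_variant=40))
    print(recommend(p_value=0.30, effect_size=0.1, n_per_variant=10))
```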

Related to #29604




> Generated by [Plan Command](https://github.com/github/gh-aw/actions/runs/25230975372/agentic_workflow) for issue #29604 · ● 253.3K · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fplan%22&type=issues)
> - [x] expires  on May 3, 2026, 8:12 PM UTC
