Skip to content

feat(dashboard): add Experiments tab backed by gh aw experiments list#42213

Merged
pelikhan merged 2 commits into
mainfrom
copilot/add-experiments-tab
Jun 29, 2026
Merged

feat(dashboard): add Experiments tab backed by gh aw experiments list#42213
pelikhan merged 2 commits into
mainfrom
copilot/add-experiments-tab

Conversation

Copilot AI commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Adds an Experiments tab to the agentic-workflows-dashboard canvas, exposing A/B experiment branch data from gh aw experiments list --json.

Extension server (extension.mjs)

  • getExperiments() — runs gh aw experiments list --json, 60s cached
  • /api/experiments HTTP route
  • listExperiments canvas action (paged, same contract as listDefinitions/listRuns)
  • System message updated to document new CLI command and action

Web UI (web/app.js, web/index.html)

  • "Experiments" tab with counter, between "Run details" and "Commands"
  • fetchExperiments() + loadExperimentPage() with loading/error state
  • formatDate() helper (guards Invalid Date)
  • Tab section renders: workflow ID, branch, variant count, total runs, last run — with pagination

TypeScript source (src/models.ts, src/app.ts)

  • ExperimentInfo interface matching Go CLI JSON schema (workflow_id, branch, experiments, total_runs, last_run)
  • Extended DashboardTabId and counter union types; mock data builder for local dev

Copilot AI and others added 2 commits June 29, 2026 10:17
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title feat: add Experiments tab to agentic-workflows-dashboard feat(dashboard): add Experiments tab backed by gh aw experiments list Jun 29, 2026
Copilot AI requested a review from pelikhan June 29, 2026 10:19
@pelikhan pelikhan marked this pull request as ready for review June 29, 2026 10:20
Copilot AI review requested due to automatic review settings June 29, 2026 10:20
@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Test Quality Sentinel completed test quality analysis.

No test files were added or modified in this PR. All 6 changed files are dashboard source files (.mjs, package-lock.json, .ts, .ts, .js, .html). Test Quality Sentinel skipped.

@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #42213 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold: 100).

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an Experiments tab to the Agentic Workflows Dashboard canvas extension, surfacing experiment-branch summaries from gh aw experiments list --json through a new server route and UI pagination.

Changes:

  • Added an /api/experiments endpoint in the extension server and a listExperiments canvas action (paged).
  • Added an Experiments tab in the web UI, including fetch/pagination wiring and a date formatter helper.
  • Extended the TypeScript dashboard model/mock app with an ExperimentInfo type and mock experiment data.
Show a summary per file
File Description
.github/extensions/agentic-workflows-dashboard/web/index.html Adds the Experiments tab UI and renders experiment rows + pagination controls.
.github/extensions/agentic-workflows-dashboard/web/app.js Adds experiments fetch/load state, paging, and a date formatting helper used by the new tab.
.github/extensions/agentic-workflows-dashboard/src/models.ts Introduces the ExperimentInfo TypeScript interface for experiment list items.
.github/extensions/agentic-workflows-dashboard/src/app.ts Extends the TS mock dashboard state/tabs and adds a mock experiment data builder.
.github/extensions/agentic-workflows-dashboard/package-lock.json Updates the lockfile (notably removing @github/copilot-sdk entries).
.github/extensions/agentic-workflows-dashboard/extension.mjs Adds cached getExperiments(), /api/experiments route, and listExperiments action; updates system message docs.

Review details

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files not reviewed (1)
  • .github/extensions/agentic-workflows-dashboard/package-lock.json: Generated file
  • Files reviewed: 5/6 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment on lines +45 to +49
function formatDate(iso) {
if (!iso) return "—";
const d = new Date(iso);
return isNaN(d.getTime()) ? "—" : d.toLocaleString();
}
Comment on lines +179 to +180
<div class="text-small color-fg-muted">Variants</div>
<div class="text-bold" x-text="exp.experiments"></div>
@github-actions github-actions Bot mentioned this pull request Jun 29, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 82.6 AIC · ⌖ 6.98 AIC · ⊞ 4.9K

if (hit) return hit;
const raw = await runGhAw(["experiments", "list", "--json"]);
const data = JSON.parse(raw);
const experiments = Array.isArray(data) ? data : [];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When gh aw experiments list --json exits cleanly (exit code 0) but returns a non-array payload — e.g. an error object like {"error":"unknown subcommand"} — the Array.isArray(data) ? data : [] branch silently caches [] for 60 s and the UI shows "No experiments found" instead of surfacing the actual error.

Contrast with getDefinitions() (line 70), which caches the raw parsed value, and getRuns() (line 91), which does logsData.runs ?? [] (narrower key access).

Consider throwing when the shape is unexpected so the server's top-level catch (line 181) returns a 500 that fetchExperiments() in web/app.js will display in errorExperiments:

if (!Array.isArray(data)) {
  throw new Error(data?.error ?? `Unexpected response: ${JSON.stringify(data).slice(0, 120)}`);
}

@copilot please address this.

{ id: "definitions", label: "Workflows", counter: "definitions" },
{ id: "runs", label: "Runs", counter: "runs" },
{ id: "details", label: "Run details" },
{ id: "experiments", label: "Experiments", counter: "experiments" },

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tab ordering nit: placing "Experiments" between "Run details" (a context panel for a selected run) and "Commands" breaks the data-view grouping. A more natural order would put the two top-level data views next to each other:

Workflows | Runs | Experiments | Run details | Commands

Non-blocking — happy to defer or keep the current order if it was intentional.

@github-actions

Copy link
Copy Markdown
Contributor

Review Summary

Applied /tdd, /grill-with-docs, and /improve-codebase-architecture to PR #42213. Two blocking issues found; four non-blocking suggestions.

📌 Inline comments at a glance
# File Skill Issue
1 extension.mjs:79 /tdd JSON.parse(raw) in getExperiments() has no try/catch — CLI errors crash the route
2 web/app.js:115 /tdd resp.json() called before resp.ok — non-JSON errors produce misleading messages
3 src/models.ts:41 /grill-with-docs experiments: number is tautological; rename to variants to match the UI label
4 src/models.ts:38 /grill-with-docs snake_case fields in TypeScript should be documented or mapped to camelCase
5 web/app.js:55 /tdd formatDate() has two branches but no unit tests — Vitest is already set up
6 web/index.html:176 /improve-codebase-architecture Redundant ternary — formatDate() already handles falsy

The overall architecture is clean and the feature follows established patterns well. Issues 1 and 2 are the priority before merge.

@copilot please address the review comments above.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 93.3 AIC · ⌖ 8.37 AIC · ⊞ 6.6K ·
Comment /matt to run again

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /tdd, /grill-with-docs, and /improve-codebase-architecture — requesting changes on two correctness issues before merge.

📋 Key Themes & Highlights

Blocking Issues

  • Unguarded JSON.parse in getExperiments() — a CLI error or unavailable subcommand will crash the HTTP route rather than returning a graceful empty state.
  • resp.json() before resp.ok in fetchExperiments() — non-JSON error responses produce a confusing JSON parse error instead of the real HTTP error.

Non-blocking Issues

  • ExperimentInfo.experiments: number — tautological name; the UI already calls it "Variants", so the model should too.
  • snake_case fields in a TypeScript interface are unusual in this codebase and should be either documented or mapped to camelCase at the API boundary.
  • formatDate() is a pure multi-branch utility with no test coverage; Vitest is already configured.
  • Redundant ternary in the template — formatDate already handles falsy inputs.

Positive Highlights

  • ✅ Clean end-to-end symmetry: getExperiments()/api/experimentsfetchExperiments() → Alpine state — follows the exact same shape as definitions and runs.
  • ✅ 60-second cache reuse is correct — no new cache key needed per-page.
  • ✅ Loading / error / empty states are all handled in the UI, with a helpful "Run gh aw experiments list" empty-state hint.
  • ✅ Removing @github/copilot-sdk from production dependencies is a meaningful cleanup.
  • formatDate's Invalid Date guard is exactly the right defensive approach.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 93.3 AIC · ⌖ 8.37 AIC · ⊞ 6.6K
Comment /matt to run again

const hit = getCached("experiments");
if (hit) return hit;
const raw = await runGhAw(["experiments", "list", "--json"]);
const data = JSON.parse(raw);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] JSON.parse(raw) has no error handling — if gh aw experiments list --json is unavailable or returns non-JSON output, this throws uncaught and crashes the /api/experiments route.

💡 Suggested fix
async function getExperiments() {
  const hit = getCached("experiments");
  if (hit) return hit;
  try {
    const raw = await runGhAw(["experiments", "list", "--json"]);
    const data = JSON.parse(raw);
    const experiments = Array.isArray(data) ? data : [];
    setCached("experiments", experiments);
    return experiments;
  } catch (err) {
    console.error("getExperiments failed:", err.message);
    return [];
  }
}

The same gap exists in the existing getDefinitions()/getRuns(), but those are pre-existing. A try/catch here (returning [] on failure) keeps the route alive and lets the UI show its error state.

@copilot please address this.

this.errorExperiments = "";
try {
const resp = await fetch("/api/experiments");
const data = await resp.json();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] resp.json() is called before resp.ok is checked — if the server returns a non-JSON error response (e.g. a plain-text 500), the call throws a JSON parse error and the actual HTTP error is swallowed.

💡 Suggested pattern
async fetchExperiments() {
    this.loadingExperiments = true;
    this.errorExperiments = "";
    try {
        const resp = await fetch("/api/experiments");
        if (!resp.ok) {
            const text = await resp.text();
            throw new Error(text || `HTTP ${resp.status}`);
        }
        const data = await resp.json();
        this.experiments = Array.isArray(data) ? data : [];
        this.loadExperimentPage(1);
    } catch (e) {
        this.errorExperiments = `Failed to load experiments: ${e.message}`;
    } finally {
        this.loadingExperiments = false;
    }
},

fetchDefinitions and fetchRuns have the same ordering issue; fixing it here improves the pattern for future consistency.

@copilot please address this.

export interface ExperimentInfo {
workflow_id: string;
branch: string;
experiments: number;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/grill-with-docs] experiments: number is tautologically named — the field is on ExperimentInfo and the word experiments doesn't communicate what it counts. The HTML renders it under a "Variants" heading, suggesting the domain term is already established.

💡 Suggested rename
export interface ExperimentInfo {
  workflow_id: string;
  branch: string;
  variants: number;      // was: experiments (count of A/B variants)
  total_runs: number;
  last_run: string;
}

Update the matching field in web/app.js/web/index.html (exp.experiments → exp.variants) and the buildExperiments mock in src/app.ts. If the Go CLI emits experiments, map it at the API boundary rather than carrying the ambiguous name into the TypeScript model.

@copilot please address this.

hasPreviousPage: boolean;
}

export interface ExperimentInfo {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/grill-with-docs] All fields in ExperimentInfo use snake_case (workflow_id, total_runs, last_run) while TypeScript convention and the existing codebase prefer camelCase. This appears to be a direct mapping from the Go CLI JSON output, but there is no comment explaining the deliberate choice.

💡 Two options

Option A — add a doc comment (minimal change):

/** Direct JSON mapping from `gh aw experiments list --json` — snake_case intentional. */
export interface ExperimentInfo {
  workflow_id: string;
  ...
}

Option B — camelCase in the model, map at the API boundary in extension.mjs:

export interface ExperimentInfo {
  workflowId: string;
  branch: string;
  variants: number;
  totalRuns: number;
  lastRun: string;
}

Option A is lower risk for this PR; Option B aligns with the rest of the codebase conventions.

@copilot please address this.

tabs: dashboardTabs,
activeTab: "definitions",
pageSize: 20,
definitions: [],

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] formatDate() has two distinct branches (falsy guard, Invalid Date guard) but no unit tests. It's a pure function that's easy to test, and incorrect date handling would silently show stale or confusing data to users.

💡 Suggested tests (using existing Vitest setup)
// formatDate.test.js (or add to an existing test file)
import { describe, it, expect } from 'vitest';
import { formatDate } from './app.js'; // or extract and export

describe('formatDate', () => {
  it('returns em-dash for null', () => expect(formatDate(null)).toBe('—'));
  it('returns em-dash for undefined', () => expect(formatDate(undefined)).toBe('—'));
  it('returns em-dash for empty string', () => expect(formatDate('')).toBe('—'));
  it('returns em-dash for invalid date string', () => expect(formatDate('not-a-date')).toBe('—'));
  it('returns a non-empty string for valid ISO date', () => {
    expect(formatDate('2024-01-01T00:00:00Z')).not.toBe('—');
  });
});

@copilot please address this.

<div>
<div class="text-bold" x-text="exp.workflow_id"></div>
<div class="color-fg-muted text-small mt-1">Branch: <code x-text="exp.branch"></code></div>
<div class="color-fg-muted text-small">Last run: <span x-text="exp.last_run ? formatDate(exp.last_run) : '—'"></span></div>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/improve-codebase-architecture] The ternary exp.last_run ? formatDate(exp.last_run) : '—' is redundant — formatDate() already returns '—' for falsy inputs. Keeping this in the template obscures the contract of formatDate for future readers.

Simplify to:

<span x-text="formatDate(exp.last_run)"></span>

@copilot please address this.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found.

The HTTP route handler wraps all async calls in a top-level try/catch that sends {error: ...} JSON at 500, so CLI failures (missing subcommand, non-JSON output) are gracefully surfaced to the client. The cache has a 60 s TTL and a /api/refresh path. Client-side error handling in fetchExperiments() is consistent with fetchDefinitions()/fetchRuns().

Minor nit (non-blocking): The template guard exp.last_run ? formatDate(exp.last_run) : '—' in index.html is redundant — formatDate already returns '—' for falsy input.

🔎 Code quality review by PR Code Quality Reviewer · 99.2 AIC · ⌖ 7.03 AIC · ⊞ 5.2K
Comment /review to run again

@pelikhan pelikhan merged commit 00150df into main Jun 29, 2026
60 checks passed
@pelikhan pelikhan deleted the copilot/add-experiments-tab branch June 29, 2026 10:38
@github-actions

Copy link
Copy Markdown
Contributor

🎉 This pull request is included in a new release.

Release: v0.82.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants