Goal
Ingest GitHub / GitLab PR webhooks + CI run results into the local store so we can link "session X → commit Y → PR Z → CI ran W → 2 reverts later." Foundation for outcome attribution v2 (Spec 22) and the comparative benchmark (Spec 26).
Why now
The Yield tab today relies on git correlation by cwd, which is coarse. To answer "did this session ship code that actually held up?", we need PR + CI + downstream-touch data. That data currently lives only outside the local store.
Schema
v017 — two additive tables:
CREATE TABLE pr_outcomes (
    id INTEGER PRIMARY KEY,
    provider TEXT NOT NULL,       -- 'github' | 'gitlab'
    repo_slug TEXT NOT NULL,      -- 'owner/repo'
    pr_number INTEGER NOT NULL,
    title TEXT,
    state TEXT NOT NULL,          -- 'open' | 'merged' | 'closed'
    merged_at TEXT,
    reverted_at TEXT,
    author TEXT,
    raw_json TEXT NOT NULL,
    UNIQUE (provider, repo_slug, pr_number)
);
CREATE TABLE ci_runs (
    id INTEGER PRIMARY KEY,
    provider TEXT NOT NULL,       -- 'github-actions' | 'gitlab-ci' | 'circleci' | ...
    repo_slug TEXT NOT NULL,
    run_id TEXT NOT NULL,         -- provider-side id
    commit_sha TEXT NOT NULL,
    status TEXT NOT NULL,         -- 'success' | 'failure' | 'cancelled' | 'in_progress'
    workflow_name TEXT,
    started_ts TEXT,
    completed_ts TEXT,
    raw_json TEXT NOT NULL,
    UNIQUE (provider, run_id)
);
CREATE INDEX idx_pr_outcomes_repo ON pr_outcomes(repo_slug, state);
CREATE INDEX idx_ci_runs_commit ON ci_runs(commit_sha);
raw_json keeps the full webhook payload for future analysis.
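Because both tables carry a UNIQUE constraint, a redelivered webhook or a backfill that overlaps live ingestion can be absorbed with an upsert rather than a duplicate row. The following is a minimal sketch under stated assumptions, not the planned implementation: upsert_pr_outcome is a hypothetical helper name, IF NOT EXISTS is added here only so the snippet can be re-run, and ON CONFLICT ... DO UPDATE needs SQLite 3.24 or newer.

```python
import json
import sqlite3

# Copy of the pr_outcomes table from the v017 schema.
# IF NOT EXISTS is an assumption of this sketch, not part of the spec.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pr_outcomes (
    id INTEGER PRIMARY KEY,
    provider TEXT NOT NULL,
    repo_slug TEXT NOT NULL,
    pr_number INTEGER NOT NULL,
    title TEXT,
    state TEXT NOT NULL,
    merged_at TEXT,
    reverted_at TEXT,
    author TEXT,
    raw_json TEXT NOT NULL,
    UNIQUE (provider, repo_slug, pr_number)
);
"""


def upsert_pr_outcome(conn, provider, repo_slug, pr_number, state, payload,
                      title=None, merged_at=None, author=None):
    """Insert a PR row, or refresh it if the same PR was already ingested."""
    conn.execute(
        """
        INSERT INTO pr_outcomes
            (provider, repo_slug, pr_number, title, state,
             merged_at, author, raw_json)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
        ON CONFLICT (provider, repo_slug, pr_number) DO UPDATE SET
            title = excluded.title,
            state = excluded.state,
            merged_at = excluded.merged_at,
            author = excluded.author,
            raw_json = excluded.raw_json
        """,
        (provider, repo_slug, pr_number, title, state, merged_at, author,
         json.dumps(payload)),  # full payload is kept verbatim in raw_json
    )
    conn.commit()
```

Calling the helper twice for the same (provider, repo_slug, pr_number) leaves one row holding the latest state. Note that this sketch never writes reverted_at, so a value set elsewhere survives the update.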
User-visible surface
- CLI: stackunderflow ingest github --repo owner/repo --token $GH_TOKEN [--since 30d] (backfill via REST API).
- CLI: stackunderflow ingest webhook serve --port 8096 (opt-in webhook receiver; validates the HMAC signature).
- API: POST /api/webhooks/github, POST /api/webhooks/gitlab, POST /api/webhooks/ci (receive + validate + insert).
- Meta-agent tool: get_pr_outcomes(repo, state?, since?) and get_ci_runs(commit_sha?, status?).
- UI: extend Yield tab to show PR + CI columns alongside commit data.
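The meta-agent tool get_pr_outcomes(repo, state?, since?) maps naturally onto a parameterized query against pr_outcomes. The sketch below is one possible shape, not the committed design. It assumes the tool receives a sqlite3 connection, that since is an ISO-8601 UTC timestamp compared against merged_at (the spec does not say which column since applies to), and that rows are returned as plain dicts.

```python
import sqlite3


def get_pr_outcomes(conn, repo, state=None, since=None):
    """Return PR rows for one repo, optionally filtered by state and time."""
    sql = (
        "SELECT provider, repo_slug, pr_number, title, state, merged_at "
        "FROM pr_outcomes WHERE repo_slug = ?"
    )
    params = [repo]
    if state is not None:
        sql += " AND state = ?"
        params.append(state)
    if since is not None:
        # Assumption: `since` filters on merged_at. ISO-8601 strings in the
        # same format and timezone sort correctly as plain text.
        sql += " AND merged_at >= ?"
        params.append(since)
    sql += " ORDER BY pr_number"
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

The (repo_slug, state) index from the schema covers the first two filters. Values are always bound as parameters, so tool arguments never reach the SQL text.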
Implementation plan
- v017 migration.
- New service stackunderflow/services/github_ingest.py (REST backfill).
- New stackunderflow/routes/webhooks.py (signature validation + insert).
- CLI commands.
- Meta-agent tool entries.
- Yield-tab UI extension (optional in v1 — JSON API alone unblocks Spec 22).
Tests
- Signature validation for GitHub (HMAC-SHA256) and GitLab (token compare).
- REST-backfill with mocked responses (pagination, rate-limit retry).
- Schema migration idempotency.
- Yield-tab integration test (if UI shipped).
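The REST-backfill test is easiest to write if the pagination and retry loop takes its HTTP call as a parameter, so a mock can stand in for the network. The sketch below illustrates that idea under several assumptions: fetch is a hypothetical callable returning (status, headers, items), headers is a plain dict with exactly the keys shown, and rate limiting is signalled by a 403 or 429 response carrying Retry-After. A real client would likely also need to consult the provider's rate-limit reset headers.

```python
import re
import time

# Matches the rel="next" entry of an RFC 8288 Link header.
_NEXT_LINK = re.compile(r'<([^>]+)>;\s*rel="next"')


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None if absent."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def fetch_all_pages(url, fetch, sleep=time.sleep, max_retries=3):
    """Collect items from every page, retrying pages that are rate limited.

    `fetch(url)` is a hypothetical callable returning (status, headers, items).
    """
    items = []
    while url:
        for attempt in range(max_retries + 1):
            status, headers, page = fetch(url)
            if status in (403, 429) and "Retry-After" in headers:
                if attempt == max_retries:
                    raise RuntimeError(f"still rate limited: {url}")
                sleep(int(headers["Retry-After"]))  # wait, then retry same page
                continue
            if status != 200:
                raise RuntimeError(f"unexpected status {status} for {url}")
            break
        items.extend(page)
        url = next_page_url(headers.get("Link"))
    return items
```

In a test, fetch can be a function that serves canned pages and returns one 429 before succeeding, and sleep can be a list's append method so the test records the waits without actually sleeping.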
Hard parts
- Webhook signature validation is security-critical. Use hmac.compare_digest. Reject missing or mismatched signatures with 403.
- Token storage: don't store the GH token in the database. Read it from env (STACKUNDERFLOW_GITHUB_TOKEN) or the settings file (encryption at rest is deferred to Spec 28).
- Linking session to commit is Spec 22's job. This spec only ingests the data; the link is downstream.
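The signature checks described above can be sketched as follows. This is an illustration under assumptions, not the final webhooks.py: the function names are hypothetical, the GitHub check expects the X-Hub-Signature-256 header value ("sha256=" followed by a hex HMAC-SHA256 of the raw body), and the GitLab check compares the X-Gitlab-Token header value with the shared secret. Both comparisons go through hmac.compare_digest on bytes, so they run in constant time and tolerate non-ASCII header values.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, header) -> bool:
    """Validate a GitHub X-Hub-Signature-256 value against the raw body."""
    if not header or not header.startswith("sha256="):
        return False
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    return hmac.compare_digest(expected.encode(), header.encode())


def verify_gitlab_token(secret: str, header) -> bool:
    """Validate a GitLab X-Gitlab-Token value against the shared secret."""
    if not header:
        return False
    return hmac.compare_digest(secret.encode(), header.encode())


def webhook_status(provider: str, secret: str, body: bytes, signature) -> int:
    """Return the HTTP status to send: 403 on any failed check, else 200."""
    if provider == "github":
        ok = verify_github_signature(secret.encode(), body, signature)
    elif provider == "gitlab":
        ok = verify_gitlab_token(secret, signature)
    else:
        ok = False  # unknown providers are rejected rather than trusted
    return 200 if ok else 403
```

The HMAC must be computed over the raw request bytes before any JSON parsing, since re-serializing the payload can change whitespace or key order and invalidate the digest.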
Out of scope
- Bitbucket / Codeberg / self-hosted Gitea (defer — same pattern, just adapter work).
- Encrypted token storage (Spec 28).
- Auto-linking sessions to PRs (Spec 22).
Dependencies
- Spec 22 (outcome attribution) consumes this.
Estimated effort
Size L — single agent, ~2 hr.
Hard rules
- DO NOT touch versions / CHANGELOG headings.
- Pre-assigned schema slot: v017.
- Branch: feat/pr-ci-webhook-ingest off main.