
feat: add Steering view — analyze the impact of human prompts on code #41

Closed
paulyuk wants to merge 42 commits into jayparikh:main from paulyuk:feature/journal-view


paulyuk commented Apr 1, 2026

Steering View

Problem

When you watch an AI coding session in Replay or Tracks, you see what the agent did — tool calls, file edits, reasoning. But you can't see which human prompts actually mattered. Which ones changed the direction? Which ones led to the most code? Which ones caused bugs that led to better tests?

Solution

The Steering view extracts every human prompt that redirected the agent, pairs each one with the git commits and code changes it produced, and shows the impact: lines changed, tests passing, and what was learned.

It understands not just the primary coding agent (Copilot CLI, Claude Code, VS Code Copilot), but also sub-agents and squads — capturing the reasoning from dispatched agents as they respond to steering.

An agent-assisted analysis runs automatically in the background using the Copilot SDK to improve the quality of What Happened, Level-Up, and Impact columns beyond what static heuristics can produce. A steering intelligence panel shows density scoring, category breakdown, and actionable insights derived from the session's steering patterns.

Screenshot

[Image: Steering View]

Quick Start

git fetch origin pull/41/head:feature/journal-view
git checkout feature/journal-view
npm install && npm run dev

Load a session, click the Steering tab (last tab, after Coach).

What you see

Each row is either a human steering command (italic quotes, bright text) or a git commit (dimmer text, blue hash). For each steering command:

  • What Happened — what the agent did as a result, including squad reasoning
  • Level-Up — what was learned or unlocked
  • Impact — lines changed, tests passing, eval grade

The detail panel shows the full squad response, files changed, commit info, and a Responding To section showing what the agent said that the user was reacting to.

Steering intelligence

Below the timeline, an expandable panel shows:

  • Density score — steering commands per hour (high density = agent not matching taste)
  • Category breakdown — quality, tone, bugs, naming, testing, visual, simplification
  • Insights — actionable suggestions like "Quality redirected 4 times, consider a skill"
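The density score can be sketched as a simple rate. This is a minimal illustration, not the PR's implementation: the function names and the "moderate" label are assumptions, while the 5/hr and 2/hr thresholds are the ones quoted in the intelligence panel commit message further down this page.

```javascript
// Hypothetical helper: steering commands per hour of session time.
function steeringDensity(steeringCount, sessionDurationSeconds) {
  // Guard against empty or malformed sessions instead of dividing by zero.
  if (!(sessionDurationSeconds > 0)) return 0;
  return steeringCount / (sessionDurationSeconds / 3600);
}

// Thresholds from the commit message: > 5/hr means frequent correction,
// < 2/hr means the agent was well-aligned. The middle label is an assumption.
function describeDensity(perHour) {
  if (perHour > 5) return "frequent correction";
  if (perHour < 2) return "well-aligned";
  return "moderate";
}

console.log(describeDensity(steeringDensity(12, 7200)));
```

A two-hour session with 12 steering commands works out to 6 per hour, which falls in the "frequent correction" band.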

How it works

  1. Session extractor — detects steering patterns in user messages (redirections like "instead", "try again", "don't")
  2. Git history — commits classified and paired with the steering that caused them
  3. Background analysis — Copilot SDK auto-analyzes entries for richer summaries (same SDK as Coach, no extra setup)
  4. Intelligence panel — density scoring and category analysis from steering patterns
  5. Persistent log — .agentviz/steering-v1.jsonl (gitignored, auto-generated) with redaction for secrets
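Step 1 above can be pictured as a keyword check over user messages. The sketch below is an assumption-laden illustration rather than the code in src/lib/steeringExtractor.js: only the three phrases "instead", "try again", and "don't" come from this description, and the names `STEERING_PATTERNS` and `looksLikeSteering` are hypothetical.

```javascript
// Illustrative patterns only; the real extractor may use a different list.
const STEERING_PATTERNS = [
  /\binstead\b/i,
  /\btry again\b/i,
  /\bdon['’]t\b/i, // accept straight and curly apostrophes
];

// Returns true when a user message contains a redirection phrase.
function looksLikeSteering(message) {
  if (typeof message !== "string") return false;
  const text = message.trim();
  return STEERING_PATTERNS.some((pattern) => pattern.test(text));
}

console.log(looksLikeSteering("Use theme tokens instead of hex colors"));
```

Keyword matching of this kind is cheap and needs no API key, but it cannot tell a redirection from an incidental use of the same word, which is one reason the PR layers background analysis on top.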

Guardrails verified (per #43)

| Check | Status |
| --- | --- |
| npm run typecheck | pass |
| npm test | 473/474 passing (1 pre-existing failure) |
| npm run build | pass |
| Screenshots (8 required) | all present |
| Hardcoded hex colors | all replaced with theme tokens |
| README | Steering section added |
| Style guide | Section 20 added |
| Test coverage | 3 test files, 48 tests |

Files (18 total: 11 new, 7 modified). New files:

routes/steering.js                  — git analysis, steering log, AI synthesis endpoint
src/components/SteeringView.jsx     — timeline table, filters, detail panel, intelligence panel
src/lib/steeringExtractor.js        — session heuristic extractor
src/lib/steeringAgent.js            — Copilot SDK analysis agent
src/__tests__/steeringExtractor.test.js — 27 tests
src/__tests__/steeringRoute.test.js     — 15 tests
src/__tests__/steeringView.test.jsx     — 6 tests
docs/evals.md                       — quality rubrics
docs/steering-demo.md               — demo walkthrough
docs/steering-insights.md           — research hypotheses for steering analysis
docs/screenshots/steering-view.png

Built with snap-squad — npx snap-squad init

paulyuk and others added 14 commits April 1, 2026 13:01
Adds a new Journal view that tells the story of an AI session by
extracting key narrative moments from event data:

- 🎯 Steering: when the user redirected the agent
- 🆙 Level-Up: error recovery and breakthrough moments
- 🔄 Pivot: rapid consecutive redirections
- ❌ Mistake: errors encountered during the session
- ✅ Milestone: heavy work turns, test/build completions
- 💡 Insight: discoveries from agent reasoning

Heuristic-first approach — no API key required. Entries are clickable
to seek to that moment in Replay. Includes filter toolbar and
resizable split-panel layout matching the existing agentviz aesthetic.

New files:
  src/lib/journalExtractor.js    — extraction heuristics
  src/components/JournalView.jsx — React view component

Modified:
  src/App.jsx                    — render case for journal view
  src/components/app/constants.js — register Journal tab
  src/components/Icon.jsx         — add BookOpen icon

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds backend route GET /api/journal/git that analyzes git log to extract
the repo's narrative arc. JournalView now shows a Scribe-style table:

  | Time | Type | Steering Command | Level-Up 🆙 |

Key features:
- Classifies commits into milestone/levelup/pivot/mistake
- Collapses consecutive refactors into pivot arcs
- Extracts steering commands from conventional commit messages
- Synthesizes level-up moments for each entry type
- Filter bar to focus on specific entry types
- Detail panel with commit context on click
- Repo summary header (contributors, releases, features, fixes)

New file: routes/journal.js — git history extraction + API route
Modified: server.js — wire journal route
Modified: JournalView.jsx — rewritten for git-powered Scribe timeline

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
JournalView now shows BOTH sources in a single chronological table:
- Git entries: repo evolution from commit history
- Session entries: steering moments from the loaded AI session

Each row shows a source badge (git/session) alongside the type badge.
Session entries include a 'Jump to Replay' button in the detail panel.
Summary header shows counts for both sources.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tests:
  journalExtractor.test.js — 23 tests covering steering detection,
    error recovery arcs, milestone thresholds, insight limits, pivot
    detection, deduplication, and chronological sorting

  journalRoute.test.js — 10 tests covering route handling, git history
    extraction, scribe timeline shape, type classification, chronological
    ordering, refactoring arc collapse, and self-documenting continuity

Bugs found and fixed by tests:
  - journalExtractor: milestone threshold unreachable in single-turn
    sessions (the average tool count equals the turn's own count, so
    the turn can never exceed avgToolCount * 2.5)
  - journal route: sorting is now timezone-aware (comparing ISO strings
    breaks across CDT/PDT offsets; now uses Date.parse)

Evals:
  docs/evals.md — quality rubrics for extraction accuracy, git
    classification, narrative quality, source merge, and visual
    spot-checks. Follows SCORECARD.md grading pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
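The timezone sorting bug described in the commit above is easy to reproduce. The following sketch uses made-up commit hashes and dates; `sortByString` and `sortByInstant` are hypothetical names, and only the use of `Date.parse` comes from the commit message.

```javascript
// Two commits authored in different zones (CDT is -05:00, PDT is -07:00).
const commits = [
  { hash: "a1", date: "2026-04-01T13:00:00-05:00" }, // 18:00 UTC
  { hash: "b2", date: "2026-04-01T12:30:00-07:00" }, // 19:30 UTC
];

// Broken: compares the local clock digits and ignores the offset.
function sortByString(entries) {
  return [...entries].sort((x, y) => (x.date < y.date ? -1 : x.date > y.date ? 1 : 0));
}

// Fixed: compares the actual instants in milliseconds since the epoch.
function sortByInstant(entries) {
  return [...entries].sort((x, y) => Date.parse(x.date) - Date.parse(y.date));
}

console.log(sortByString(commits).map((c) => c.hash));  // wrong order: b2 first
console.log(sortByInstant(commits).map((c) => c.hash)); // chronological: a1 first
```

String comparison only works when every timestamp carries the same offset, which git history from multiple contributors does not guarantee.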
Adds author mapping so git display names resolve to consistent
handles across the Journal timeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three references to the old GIT_COLORS variable were missed during
the rename to ENTRY_COLORS, causing a crash on the Journal page.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds 6 JSDOM-based React render tests that mount JournalView with
mocked fetch data. These tests catch variable reference errors like
the GIT_COLORS → ENTRY_COLORS rename miss that crashed the page.

Verified: reintroducing the GIT_COLORS bug causes 5/6 tests to fail
with 'ReferenceError: GIT_COLORS is not defined' — exactly the crash
that was missed before.

Tests:
  - renders without crashing (catches missing variable references)
  - renders repo summary header with correct data
  - renders journal rows from git data
  - renders filter badges for all entry types present
  - renders session entries alongside git entries
  - shows empty state when fetch fails and no session data

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Session steering entries now show the actual user prompt in italic
quotes ("...") visually distinct from plain git commit summaries.

Level-up text is now context-specific instead of generic:
- 'instead'/'switch' → 'Changed direction — chose a different approach'
- 'don't'/'stop' → 'Set a boundary — knowing what NOT to do is taste'
- 'fix'/'wrong' → 'Quality gate — caught an issue and redirected'
- mistakes → 'Hit a wall: <actual error> — learned from it'
- insights → 'Discovered: <actual finding>'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Journal has two sources: git history (always visible, shows what
changed) and session events (shows actual steering prompts, but only
when you load a session with real human redirections).

Updated demo guide to explain this upfront so users aren't
dissatisfied when they don't see steering commands immediately.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Steering moments from loaded sessions are now automatically persisted
to .agentviz/steering-v{N}.jsonl — no manual step required. The next
person who loads the repo sees the full steering history.

Safety controls:
- Redaction: GitHub tokens, API keys, AWS keys, JWTs, passwords,
  emails, and home directory paths are stripped before writing
- Retention: 200 entries per file, then rotates to next version
- Versioning: steering-v1.jsonl, steering-v2.jsonl for partitioning
- Opt-out: set {"steering": false} in .agentviz/config.json

Auto-contribute flow:
1. Session loads → extractor finds steering moments
2. useEffect deduplicates against existing log entries
3. New entries are POSTed → redacted → appended to versioned JSONL
4. Log refreshes → contributed entries appear with 'repo log' badge

3 new redaction tests verify secrets are scrubbed before persisting.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
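A rough sketch of the redaction and rotation controls listed above, under stated assumptions: the regular expressions are illustrative approximations of common secret formats, not the patterns in routes/steering.js, and `redact` and `logFileFor` are hypothetical names. The `[REDACTED]` placeholder and the 200-entry limit are taken from this PR.

```javascript
// Illustrative patterns; a real redactor would need a broader, tested list.
const REDACTION_PATTERNS = [
  /gh[pousr]_[A-Za-z0-9]{20,}/g,                         // GitHub-style tokens
  /AKIA[0-9A-Z]{16}/g,                                   // AWS access key IDs
  /eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,  // JWT-shaped strings
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,     // email addresses
];

// Replace every match with a fixed placeholder before writing to disk.
function redact(text) {
  return REDACTION_PATTERNS.reduce(
    (out, pattern) => out.replace(pattern, "[REDACTED]"),
    text
  );
}

const ENTRIES_PER_FILE = 200; // retention limit quoted in the commit message

// Pick the versioned file for the next entry, given how many were written so far.
function logFileFor(entriesAlreadyWritten) {
  const version = Math.floor(entriesAlreadyWritten / ENTRIES_PER_FILE) + 1;
  return `steering-v${version}.jsonl`;
}

console.log(redact("ping dev@example.com"), logFileFor(200));
```

Redacting on the server side, before the append, means a secret pasted into a prompt never reaches the JSONL file even if the client forgets to scrub it.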
Fixed duplicate steering entries caused by useEffect re-firing when
steeringLog state updated. Now uses a useRef guard to contribute
exactly once per session load.

The first 6 real steering entries from this session are included, among them:
- 'I don't see the key evolutionary moments from the REPO itself'
- 'the steering command is not delivering on the promise'
- 'paulyuk and Paul Yuknewicz the same person'
- 'I want squad or agent instructions that do this as built in'
- 'I should be able to see my own steering in my local test app'

These persist in .agentviz/steering-v1.jsonl and will be visible
to anyone who clones the repo and opens the Journal view.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Shows the full Journal view with all three sources visible:
- REPO LOG entries with italic quoted steering commands
- SESSION entries from the loaded test file
- Git history entries (scrolled above)
- Detail panel showing the selected steering entry

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
paulyuk changed the title from "feat: Journal View — repo evolution + session narrative as unified timeline" to "feat: add Journal view — steering commands and git history" Apr 2, 2026
Renames the view from 'Journal' to 'Steering' to better describe
what the view shows — the human steering commands that shaped the
work, alongside related git history.

Updated: constants.js, App.jsx, JournalView.jsx, all 3 test files,
journal-demo.md, evals.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
paulyuk changed the title from "feat: add Journal view — steering commands and git history" to "feat: add Steering view — surface the human decisions behind the code" Apr 2, 2026
paulyuk and others added 13 commits April 1, 2026 17:35
Each steering entry now shows the git commit it produced in the
Level-Up column: → add Journal view, → git-powered Journal, etc.

Uses temporal matching: for persisted steering, finds the latest
feat/milestone commit before the persist time. Each commit is
claimed once to avoid duplicates.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
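The temporal matching described above might look roughly like the sketch below. It assumes entries carry a `persistedAt` ISO timestamp and commits carry `hash`, `type`, and `date` fields; those field names and `matchSteeringToCommits` are hypothetical, while `resultingCommit` is the field name used later in this PR.

```javascript
// For each steering entry, claim the latest unclaimed feat/milestone commit
// made strictly before the entry was persisted.
function matchSteeringToCommits(steeringEntries, commits) {
  const claimed = new Set();
  return steeringEntries.map((entry) => {
    const persistedMs = Date.parse(entry.persistedAt);
    let best = null;
    for (const commit of commits) {
      if (claimed.has(commit.hash)) continue;
      if (commit.type !== "feat" && commit.type !== "milestone") continue;
      const commitMs = Date.parse(commit.date);
      if (!(commitMs < persistedMs)) continue; // also skips unparseable dates
      if (best === null || commitMs > Date.parse(best.date)) best = commit;
    }
    if (best) claimed.add(best.hash); // each commit is claimed at most once
    return { ...entry, resultingCommit: best ? best.hash : null };
  });
}

const exampleCommits = [
  { hash: "aaa111", type: "feat", date: "2026-04-01T10:00:00Z" },
  { hash: "bbb222", type: "fix", date: "2026-04-01T11:00:00Z" },
  { hash: "ccc333", type: "feat", date: "2026-04-01T12:00:00Z" },
];
const exampleEntries = [
  { prompt: "first", persistedAt: "2026-04-01T12:30:00Z" },
  { prompt: "second", persistedAt: "2026-04-01T12:45:00Z" },
];
console.log(matchSteeringToCommits(exampleEntries, exampleCommits));
```

Because claims are made in input order, the result depends on how the entries are sorted, so a caller would want to fix that order deliberately.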
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Level-ups now match exemplar quality from git-for-pms:
- Emoji-prefixed: ✨ 🔧 📦 🏗️ ⚡ for git entries
- Specific and bold: '✨ **add Graph view.** New capability.'
- Session steering: '🎯 **Course corrected.** Refined the approach.'

Steering command column:
- Truncated to first sentence (max 120 chars) for readability
- Full text still in detail panel's 'What Happened' section
- Removed turn labels from the table (cleaner layout)

Level-up column:
- Smaller font, not italic (was too much visual noise)
- Content-derived instead of generic templates
- Git entries include the actual commit description

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hand-mapped the actual steering commands from this session to the
commits they produced. Each entry has:
- The real human prompt (steeringCommand)
- What the AI did as a result (whatHappened)
- A specific, emoji-prefixed level-up

This is ground truth, not heuristic extraction.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…first

Layout changes:
- Removed Source column (type badge is enough)
- Added 'What Happened' column with short summary + commit hash
- Steering prompts render in intense white, commits in softer gray
- Type badge inline with steering command, not in separate column

Colors:
- Removed red (mistake) and green (levelup) — was hideous
- Uses grey (#94a3b8), blue (#6475e8), purple (#a78bfa) only
- Matches the Tracks view palette

Sort:
- Newest first (reverse chronological) like any log view

Header:
- Simplified stats — no per-source counts
- Clean single line: moments · releases · features · fixes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r Coach

- Collapsed type badges to just Steering and Commit (no more
  Release/Level-Up/Pivot/Fix confusion)
- Commit hashes render in blue (#6475e8) to pop visually
- Steering commands truncated at 110 chars (word boundary)
- Detail panel shows files changed (fetched via git show --stat)
- Steering tab moved after Coach (new, not established)
- Cleaned noise from steering log (removed 'Change to Steering',
  'Multiple redirections', test artifacts)
- Filter bar simplified to two toggles: Steering / Commits
- Fixed tests for new filter labels

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
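Truncating at a word boundary, as mentioned in the list above, can be sketched as follows. The 110-character limit comes from the commit message; the function name `truncateAtWord` and the trailing ellipsis character are assumptions.

```javascript
// Cut text to at most maxChars characters without splitting a word,
// then append an ellipsis to signal that text was dropped.
function truncateAtWord(text, maxChars = 110) {
  if (text.length <= maxChars) return text;
  const slice = text.slice(0, maxChars);
  // If the next character is whitespace, the slice already ends on a boundary.
  const endsOnBoundary = /\s/.test(text[maxChars]);
  const lastSpace = slice.lastIndexOf(" ");
  const cut = endsOnBoundary || lastSpace <= 0 ? slice : slice.slice(0, lastSpace);
  return cut.trimEnd() + "…";
}

console.log(truncateAtWord("hello brave new world", 8));
```

If the first word alone exceeds the limit there is no boundary to fall back on, so this sketch cuts mid-word rather than returning an empty string.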
Impact scoring:
- Each commit includes linesChanged from git shortstat
- Steering entries inherit impact from their resulting commit
- Impact column shows a proportional bar + line count
- Steering bars in blue, commit bars in grey

Squad responses:
- Extractor now captures the assistant's first response after
  each steering command (the squad perspective)
- Detail panel shows 'Squad Response' section with the agent's
  reply, truncated to 500 chars with scroll

Noise cleanup:
- Removed 'Change to Steering', 'Multiple redirections', and
  test artifact entries from the steering log

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ixed

This is the Last Known Good release of the Steering view.

What changed:
- Disabled auto-contribute (produced noise without AI summarization)
- 10 hand-curated steering entries matching exemplar quality:
  each has the human prompt, a narrative whatHappened with commit hash,
  and a level-up formatted as emoji + **bold headline.** + insight
- Removed synthetic pivot entries (noise, not signal)
- Fixed extractor: filters out tool invocations, collects full
  squad reasoning (up to 6000 chars)
- whatHappened no longer repeats the steering command
- Level-up no longer duplicates whatHappened — uses resulting
  commit's synthesized level-up or curated text

Quality lessons learned:
- Keyword-pattern synthesis produces garbage level-ups
- Auto-contribute without AI summarization creates noise
- The exemplar format (emoji + bold + insight) requires
  understanding context, not just truncating commit messages
- Always validate output against the grounding exemplar
  before shipping — look at real rendered data, not code

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The merge step was overwriting hand-curated levelUp and whatHappened
from the steering log with auto-generated commit data. Now checks
if the entry already has both fields and skips enrichment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
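The fix above amounts to a guard in the merge step. The sketch below is an assumption about its shape: `enrichEntry` and the `commitData` argument are hypothetical, and preserving a single curated field when only one is present is a design choice of this example, not something the commit message states. The field names `whatHappened` and `levelUp` are the ones used in this PR.

```javascript
// Skip enrichment when both curated fields are present; otherwise fill
// only the missing ones from auto-generated commit data.
function enrichEntry(entry, commitData) {
  const alreadyCurated = Boolean(entry.whatHappened) && Boolean(entry.levelUp);
  if (alreadyCurated) return entry;
  return {
    ...entry,
    whatHappened: entry.whatHappened || commitData.whatHappened,
    levelUp: entry.levelUp || commitData.levelUp,
  };
}

const curated = { whatHappened: "Rewrote the view", levelUp: "Feature born." };
const generated = { whatHappened: "auto summary", levelUp: "auto level-up" };
console.log(enrichEntry(curated, generated) === curated);
```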
Each entry has:
- Real human prompt (steering command)
- Narrative whatHappened with commit hashes
- Level-up formatted as: emoji **Bold headline.** Insight sentence.

Examples:
  🌱 **Feature born.** Gap between data views and narrative filled.
  📡 **Git history becomes the story.** Repo evolution visible.
  ✅ **Tests caught real bugs.** Milestone threshold fixed.
  🔧 **Production crash → test coverage.** Failure drove testing.
  🎯 **Output is the eval.** Stopped fixing code, started reading.

Disabled auto-contribute — curated quality requires human or AI,
not heuristic keyword matching.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…al grade

Detail panel now shows for each curated steering entry:
- Squad Response: full agent reasoning (Architect, Coder, Tester perspectives)
- Commit info: hash · lines changed · tests passing
- Files Changed: all affected files listed

Impact column shows 3 values per row:
- Lines changed (e.g. 723L)
- Tests passing (e.g. ✓ 425/426)
- Eval grade (A = all columns filled)

Curated steering log enriched with:
- resultingCommit hash for each entry
- assistantResponse with squad role tags
- filesChanged arrays
- test pass counts at time of commit

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
paulyuk and others added 8 commits April 1, 2026 21:44
…kups

Session-extracted steering entries now:
- Match ANY commit type for resulting commit (not just feat/milestone)
- Get whatHappened from assistant response (squad reasoning)
- Get levelUp from the resulting commit's synthesized level-up
- Only skip enrichment if BOTH fields already curated

Removed duplicate findResultForContributed function.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Session-relative times (seconds from start) now convert to proper
wall-clock ISO timestamps using: now - sessionDuration + eventTime.

This means:
- Latest steering prompts appear at the top (newest-first sort)
- Timestamps match real times, not garbage relative offsets
- New events streaming via SSE trigger re-extraction (useMemo
  depends on [events, turns]) — same live behavior as other views

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
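The conversion formula in the commit above (now - sessionDuration + eventTime) can be written out directly. This is a minimal sketch: `toWallClockIso` is a hypothetical name, and the injectable `nowMs` parameter is added here only to make the arithmetic reproducible.

```javascript
// Convert a session-relative offset (seconds from start) into an ISO
// timestamp, assuming the session ended at nowMs.
function toWallClockIso(eventTimeSeconds, sessionDurationSeconds, nowMs = Date.now()) {
  const sessionStartMs = nowMs - sessionDurationSeconds * 1000;
  return new Date(sessionStartMs + eventTimeSeconds * 1000).toISOString();
}

// An event 30 minutes into a one-hour session that ended at 06:00 UTC.
const fixedNow = Date.parse("2026-04-02T06:00:00.000Z");
console.log(toWallClockIso(1800, 3600, fixedNow)); // "2026-04-02T05:30:00.000Z"
```

The formula treats the moment of loading as the end of the session, so for an old session file the resulting timestamps reflect load time rather than when the session actually ran.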
- Removed 'Use token [REDACTED]' and 'iterate' from steering log
- Raised minimum steering message length from 5 to 15 chars
  (filters out short non-steering like 'iterate', 'good', 'ok')
- 4 new tests for live update behavior:
  - New turns produce new steering entries
  - Assistant response captured from turn events
  - Short messages filtered as non-steering
  - impactTurns computed between consecutive steerings

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The POST redaction test was writing to the real .agentviz/steering-v1.jsonl,
leaking 'Use token [REDACTED]' entries. Now mocks process.cwd() to a temp
directory and cleans up after. Verified: zero leaks after test run.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Prior context:
- Detail panel now shows 'Responding To' section — the assistant's
  last statement that the user was reacting to. Gives full context
  for short steering like 'make the separate fix then'

Noise filtering:
- Extractor skips messages containing [REDACTED], ghp_, sk-
- Prevents test artifacts from appearing as steering entries
- Combined with temp dir fix, noise is eliminated at both source
  and persistence layers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
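The noise filters from this commit and the previous one (the 15-character minimum and the marker check) could be combined as in the sketch below. The constant and function names are hypothetical, and treating a message of exactly 15 characters as long enough is an assumption.

```javascript
const MIN_STEERING_LENGTH = 15; // raised from 5 in an earlier commit
const NOISE_MARKERS = ["[REDACTED]", "ghp_", "sk-"];

// Returns true for messages that should never become steering entries.
function isNoise(message) {
  const text = message.trim();
  if (text.length < MIN_STEERING_LENGTH) return true;
  return NOISE_MARKERS.some((marker) => text.includes(marker));
}

console.log(isNoise("iterate"), isNoise("make the separate fix then"));
```

A plain substring check is deliberately blunt: "sk-" also appears in ordinary words such as "task-list", so a stricter version might anchor the marker to a token boundary.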
Addressed all PR review guardrails from jayparikh#43:

✅ Typecheck: clean (npx tsc --noEmit)
✅ Tests: 473/474 (1 pre-existing VS Code path test)
✅ Build: clean (npx vite build)
✅ Screenshots: all 8 originals present + journal-view.png added
✅ README: Steering view section added
✅ Style guide: Section 20 added for Steering view conventions
✅ Hardcoded colors: replaced all hex values with theme tokens
   (theme.accent.primary, theme.track.reasoning, theme.semantic.success)
✅ Test coverage: 3 test files for parser/server changes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…erywhere

Removed from PR (not part of the feature):
- .squad/ (14 files) — squad config
- AGENTS.md, JOURNAL.md — squad meta
- .github/copilot-instructions.md — reverted to main
- CLAUDE.md — reverted to main
- .agentviz/steering-v1.jsonl — gitignored, auto-generated at runtime

Renamed journal→steering in all filenames:
- routes/journal.js → routes/steering.js
- src/components/JournalView.jsx → src/components/SteeringView.jsx
- src/lib/journalExtractor.js → src/lib/steeringExtractor.js
- src/__tests__/journal*.test.* → src/__tests__/steering*.test.*
- docs/journal-demo.md → docs/steering-demo.md
- docs/screenshots/journal-view.png → docs/screenshots/steering-view.png

Renamed journal→steering in all code:
- extractJournal → extractSteering
- JOURNAL_TYPES → STEERING_TYPES
- JournalView → SteeringView
- JournalRow → SteeringRow
- commitToJournalType → commitToSteeringType
- extractGitJournal → extractGitSteering
- computeJournalStats → computeSteeringStats

Final PR: 16 files (9 new, 7 modified). No squad, no agent
instructions, no committed data files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds ✨ Synthesize button that calls Copilot SDK to generate
exemplar-quality whatHappened and levelUp for session steering
entries. Same SDK pattern as Coach — defineTool + session.send.

New files:
  src/lib/steeringAgent.js — Copilot SDK synthesis agent
  routes/steering.js — POST /api/journal/synthesize endpoint

Frontend:
  SteeringView.jsx — Synthesize button, results overlay on entries

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
paulyuk changed the title from "feat: add Steering view — surface the human decisions behind the code" to "feat: add Steering view — analyze the impact of human prompts on code" Apr 2, 2026
paulyuk and others added 3 commits April 1, 2026 22:42
Steering entries now show heuristic data immediately, then the
Copilot SDK synthesis runs automatically in the background once
session data loads. As results arrive, level-up and whatHappened
columns update in place — impact goes from empty to populated.

Subtle '✨ Analyzing...' indicator shows while synthesis runs.
No manual button needed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The button was calling playback.seek() which updated the time but
stayed on the Steering view. Now uses onSeekReplay callback that
does both: seek to the timestamp + setView('replay').

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Analysis is now retry-safe: on failure or empty results, retries
up to 2 times with exponential backoff (3s, 6s). No manual reset
needed — the system recovers automatically.

State machine: idle → analyzing → done (or retry → done).
Per-row pulsing indicator in the Level-Up column shows which
entries are being analyzed. Disappears when results arrive.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
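The retry behaviour described above (up to 2 retries, waiting 3s then 6s) can be sketched as a small loop. The names `backoffDelays` and `analyzeWithRetry` and the injectable `sleep` option are assumptions for illustration; the actual Copilot SDK call is represented by the caller-supplied `analyze` function.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before each retry: base, then double for every further attempt.
function backoffDelays(maxRetries = 2, baseDelayMs = 3000) {
  return Array.from({ length: maxRetries }, (_, i) => baseDelayMs * 2 ** i);
}

// Run analyze(); on a thrown error or an empty result, wait and try again.
async function analyzeWithRetry(analyze, options = {}) {
  const { maxRetries = 2, baseDelayMs = 3000, sleep = defaultSleep } = options;
  const delays = backoffDelays(maxRetries, baseDelayMs);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const results = await analyze();
      if (Array.isArray(results) && results.length > 0) return results;
    } catch (err) {
      // Treat a thrown error like an empty result and fall through to retry.
    }
    if (attempt < maxRetries) await sleep(delays[attempt]);
  }
  return []; // all attempts failed or came back empty
}

console.log(backoffDelays()); // [ 3000, 6000 ]
```

Returning an empty array after the final attempt lets the view keep showing the heuristic data rather than surfacing an error.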
@paulyuk paulyuk force-pushed the feature/journal-view branch from 59812c8 to 43b5e9f Compare April 2, 2026 06:08
paulyuk and others added 3 commits April 1, 2026 23:19
…on insights

Adds an intelligence panel above the table showing:
- Steering density (commands per hour) with visual emphasis when high
- Top 4 actionable insights from steering patterns:
  'Quality redirected N times — consider a checklist skill'
  'Tone corrected N times — a style guide skill would reduce this'
  'Visual corrections N times — validate rendered output, not code'
- Category badges showing correction types and frequency
- Insights doc at docs/steering-insights.md with 8 hypotheses

The insights panel appears automatically when steering data has
enough signal. Density > 5/hr = agent needed frequent correction.
Density < 2/hr = agent was well-aligned.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
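Turning category counts into the "top 4" insights could work along the lines of the sketch below. The three message templates paraphrase the examples in the commit message above; the `topInsights` name, the input shape, and the rule of ranking purely by count are assumptions.

```javascript
// Message templates keyed by steering category (illustrative subset).
const INSIGHT_TEMPLATES = new Map([
  ["quality", (n) => `Quality redirected ${n} times, consider a checklist skill`],
  ["tone", (n) => `Tone corrected ${n} times, a style guide skill would reduce this`],
  ["visual", (n) => `Visual corrections ${n} times, validate rendered output, not code`],
]);

// Rank categories by how often they occurred and render the top few.
function topInsights(categoryCounts, limit = 4) {
  return Object.entries(categoryCounts)
    .filter(([category, count]) => count > 0 && INSIGHT_TEMPLATES.has(category))
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([category, count]) => INSIGHT_TEMPLATES.get(category)(count));
}

console.log(topInsights({ tone: 2, quality: 4, naming: 3 }));
```

Categories without a template, such as `naming` in this example, are skipped rather than given a generic message.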
- Rewrite steering-demo.md to cover background analysis, intelligence panel
- Update evals.md test counts (48 tests across 3 suites)
- Remove outdated sample data tables, add analysis and intelligence sections

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@paulyuk paulyuk closed this Apr 2, 2026

paulyuk commented Apr 2, 2026

Too much evolved and churned here, so I made a clean PR instead (#44) and closed this one.
