feat: add Steering view — analyze the impact of human prompts on code#41
Closed
paulyuk wants to merge 42 commits into
Adds a new Journal view that tells the story of an AI session by extracting key narrative moments from event data:
- 🎯 Steering: when the user redirected the agent
- 🆙 Level-Up: error recovery and breakthrough moments
- 🔄 Pivot: rapid consecutive redirections
- ❌ Mistake: errors encountered during the session
- ✅ Milestone: heavy work turns, test/build completions
- 💡 Insight: discoveries from agent reasoning
Heuristic-first approach: no API key required. Entries are clickable to seek to that moment in Replay. Includes filter toolbar and resizable split-panel layout matching the existing agentviz aesthetic.
New files:
- src/lib/journalExtractor.js: extraction heuristics
- src/components/JournalView.jsx: React view component
Modified:
- src/App.jsx: render case for journal view
- src/components/app/constants.js: register Journal tab
- src/components/Icon.jsx: add BookOpen icon
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds backend route GET /api/journal/git that analyzes git log to extract the repo's narrative arc. JournalView now shows a Scribe-style table:
| Time | Type | Steering Command | Level-Up 🆙 |
Key features:
- Classifies commits into milestone/levelup/pivot/mistake
- Collapses consecutive refactors into pivot arcs
- Extracts steering commands from conventional commit messages
- Synthesizes level-up moments for each entry type
- Filter bar to focus on specific entry types
- Detail panel with commit context on click
- Repo summary header (contributors, releases, features, fixes)
New file: routes/journal.js (git history extraction + API route)
Modified: server.js (wire journal route)
Modified: JournalView.jsx (rewritten for git-powered Scribe timeline)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
JournalView now shows BOTH sources in a single chronological table:
- Git entries: repo evolution from commit history
- Session entries: steering moments from the loaded AI session
Each row shows a source badge (git/session) alongside the type badge. Session entries include a 'Jump to Replay' button in the detail panel. Summary header shows counts for both sources.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tests:
journalExtractor.test.js — 23 tests covering steering detection,
error recovery arcs, milestone thresholds, insight limits, pivot
detection, deduplication, and chronological sorting
journalRoute.test.js — 10 tests covering route handling, git history
extraction, scribe timeline shape, type classification, chronological
ordering, refactoring arc collapse, and self-documenting continuity
Bugs found and fixed by tests:
- journalExtractor: milestone threshold unreachable in single-turn
sessions (avgToolCount * 2.5 = self, always fails)
- journal route: timezone-aware sorting (ISO string compare broken
across CDT/PDT offsets; now uses Date.parse)
Evals:
docs/evals.md — quality rubrics for extraction accuracy, git
classification, narrative quality, source merge, and visual
spot-checks. Follows SCORECARD.md grading pattern.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
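The timezone sorting bug described in this commit can be illustrated with a small sketch. The sample timestamps and the `sortChronological` helper are hypothetical and are not taken from routes/journal.js; the sketch only shows why comparing ISO strings with different UTC offsets can misorder entries, and why comparing `Date.parse` results does not.

```javascript
// Hypothetical entries: ISO timestamps that carry the committer's UTC offset.
const entries = [
  { id: "a", time: "2025-03-10T09:00:00-07:00" }, // PDT, 16:00 UTC
  { id: "b", time: "2025-03-10T10:30:00-05:00" }, // CDT, 15:30 UTC
];

// Naive approach: lexicographic comparison of the local wall-clock strings.
const byString = [...entries].sort((x, y) =>
  x.time < y.time ? -1 : x.time > y.time ? 1 : 0
);

// Timezone-aware approach: compare the absolute instants in milliseconds.
function sortChronological(list) {
  return [...list].sort((x, y) => Date.parse(x.time) - Date.parse(y.time));
}
const byInstant = sortChronological(entries);

console.log(byString.map((e) => e.id).join(", "));  // a, b (wrong order)
console.log(byInstant.map((e) => e.id).join(", ")); // b, a (correct order)
```

Entry `b` happened half an hour before `a` in absolute time, but its local clock reading is later, which is what trips up the string comparison.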
Adds author mapping so git display names resolve to consistent handles across the Journal timeline.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three references to the old GIT_COLORS variable were missed during the rename to ENTRY_COLORS, causing a crash on the Journal page.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds 6 JSDOM-based React render tests that mount JournalView with mocked fetch data. These tests catch variable reference errors like the GIT_COLORS → ENTRY_COLORS rename miss that crashed the page.
Verified: reintroducing the GIT_COLORS bug causes 5/6 tests to fail with 'ReferenceError: GIT_COLORS is not defined', exactly the crash that was missed before.
Tests:
- renders without crashing (catches missing variable references)
- renders repo summary header with correct data
- renders journal rows from git data
- renders filter badges for all entry types present
- renders session entries alongside git entries
- shows empty state when fetch fails and no session data
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Session steering entries now show the actual user prompt in italic
quotes ("...") visually distinct from plain git commit summaries.
Level-up text is now context-specific instead of generic:
- 'instead'/'switch' → 'Changed direction — chose a different approach'
- 'don't'/'stop' → 'Set a boundary — knowing what NOT to do is taste'
- 'fix'/'wrong' → 'Quality gate — caught an issue and redirected'
- mistakes → 'Hit a wall: <actual error> — learned from it'
- insights → 'Discovered: <actual finding>'
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Journal has two sources: git history (always visible, shows what changed) and session events (shows actual steering prompts, but only when you load a session with real human redirections). Updated demo guide to explain this upfront so users aren't dissatisfied when they don't see steering commands immediately.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Steering moments from loaded sessions are now automatically persisted
to .agentviz/steering-v{N}.jsonl — no manual step required. The next
person who loads the repo sees the full steering history.
Safety controls:
- Redaction: GitHub tokens, API keys, AWS keys, JWTs, passwords,
emails, and home directory paths are stripped before writing
- Retention: 200 entries per file, then rotates to next version
- Versioning: steering-v1.jsonl, steering-v2.jsonl for partitioning
- Opt-out: set {"steering": false} in .agentviz/config.json
Auto-contribute flow:
1. Session loads → extractor finds steering moments
2. useEffect deduplicates against existing log entries
3. New entries are POSTed → redacted → appended to versioned JSONL
4. Log refreshes → contributed entries appear with 'repo log' badge
3 new redaction tests verify secrets are scrubbed before persisting.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
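The redact-then-append step in the flow above could look roughly like the following sketch. The regular expressions, the `redact` and `toJsonlLine` names, and the `steeringCommand` field are illustrative assumptions; the PR's actual redaction rules are not shown in this thread and may differ, and this sketch covers only some of the listed secret types.

```javascript
// Illustrative patterns only (assumptions, not the PR's real rule set).
const REDACTIONS = [
  [/gh[pousr]_[A-Za-z0-9]{20,}/g, "[REDACTED]"],  // GitHub-style tokens
  [/AKIA[0-9A-Z]{16}/g, "[REDACTED]"],            // AWS access key IDs
  [/[\w.+-]+@[\w-]+(\.[\w-]+)+/g, "[REDACTED]"],  // email addresses
  [/\/(?:Users|home)\/[^\/\s]+/g, "~"],           // home directory prefixes
];

// Apply every pattern in order and return the scrubbed text.
function redact(text) {
  return REDACTIONS.reduce(
    (out, [pattern, replacement]) => out.replace(pattern, replacement),
    text
  );
}

// Scrub the prompt, then serialize the entry as a single JSONL line.
function toJsonlLine(entry) {
  const safe = { ...entry, steeringCommand: redact(entry.steeringCommand) };
  return JSON.stringify(safe) + "\n";
}

console.log(redact("see /Users/paul/code/app.js")); // see ~/code/app.js
```

Redacting on the server just before the append, rather than in the browser, means nothing unredacted reaches the file even if a client skips the step.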
Fixed duplicate steering entries caused by useEffect re-firing when steeringLog state updated. Now uses a useRef guard to contribute exactly once per session load.
Includes the first 6 real steering entries from this session:
- 'I don't see the key evolutionary moments from the REPO itself'
- 'the steering command is not delivering on the promise'
- 'paulyuk and Paul Yuknewicz the same person'
- 'I want squad or agent instructions that do this as built in'
- 'I should be able to see my own steering in my local test app'
These persist in .agentviz/steering-v1.jsonl and will be visible to anyone who clones the repo and opens the Journal view.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Shows the full Journal view with all three sources visible:
- REPO LOG entries with italic quoted steering commands
- SESSION entries from the loaded test file
- Git history entries (scrolled above)
- Detail panel showing the selected steering entry
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Renames the view from 'Journal' to 'Steering' to better describe what the view shows: the human steering commands that shaped the work, alongside related git history.
Updated: constants.js, App.jsx, JournalView.jsx, all 3 test files, journal-demo.md, evals.md.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each steering entry now shows the git commit it produced in the Level-Up column: → add Journal view, → git-powered Journal, etc.
Uses temporal matching: for persisted steering, finds the latest feat/milestone commit before the persist time. Each commit is claimed once to avoid duplicates.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
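A minimal sketch of the temporal matching described in this commit, under assumed data shapes: steering entries with a `persistedAt` timestamp and commits with `hash`, `type`, and `time`. The function name `matchResultingCommits` and these field names are hypothetical.

```javascript
// Pair each steering entry with the latest unclaimed feat/milestone commit
// made at or before the entry's persist time. Field names are assumptions.
function matchResultingCommits(steeringEntries, commits) {
  const claimed = new Set();

  // filter() returns a new array, so sorting it does not mutate the input.
  const eligible = commits
    .filter((c) => c.type === "feat" || c.type === "milestone")
    .sort((a, b) => Date.parse(b.time) - Date.parse(a.time)); // newest first

  return steeringEntries.map((entry) => {
    const cutoff = Date.parse(entry.persistedAt);
    const match = eligible.find(
      (c) => !claimed.has(c.hash) && Date.parse(c.time) <= cutoff
    );
    if (match) claimed.add(match.hash); // each commit is claimed once
    return { ...entry, resultingCommit: match ? match.hash : null };
  });
}
```

In this sketch entries claim commits in the order they are passed in, so the caller's ordering decides which entry gets a contested commit. That ordering is a design choice the commit message does not specify.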
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Level-ups now match exemplar quality from git-for-pms:
- Emoji-prefixed: ✨ 🔧 📦 🏗️ ⚡ for git entries
- Specific and bold: '✨ **add Graph view.** New capability.'
- Session steering: '🎯 **Course corrected.** Refined the approach.'
Steering command column:
- Truncated to first sentence (max 120 chars) for readability
- Full text still in detail panel's 'What Happened' section
- Removed turn labels from the table (cleaner layout)
Level-up column:
- Smaller font, not italic (was too much visual noise)
- Content-derived instead of generic templates
- Git entries include the actual commit description
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hand-mapped the actual steering commands from this session to the commits they produced. Each entry has:
- The real human prompt (steeringCommand)
- What the AI did as a result (whatHappened)
- A specific, emoji-prefixed level-up
This is ground truth, not heuristic extraction.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…first
Layout changes:
- Removed Source column (type badge is enough)
- Added 'What Happened' column with short summary + commit hash
- Steering prompts render in intense white, commits in softer gray
- Type badge inline with steering command, not in separate column
Colors:
- Removed red (mistake) and green (levelup); was hideous
- Uses grey (#94a3b8), blue (#6475e8), purple (#a78bfa) only
- Matches the Tracks view palette
Sort:
- Newest first (reverse chronological) like any log view
Header:
- Simplified stats: no per-source counts
- Clean single line: moments · releases · features · fixes
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r Coach
- Collapsed type badges to just Steering and Commit (no more Release/Level-Up/Pivot/Fix confusion)
- Commit hashes render in blue (#6475e8) to pop visually
- Steering commands truncated at 110 chars (word boundary)
- Detail panel shows files changed (fetched via git show --stat)
- Steering tab moved after Coach (new, not established)
- Cleaned noise from steering log (removed 'Change to Steering', 'Multiple redirections', test artifacts)
- Filter bar simplified to two toggles: Steering / Commits
- Fixed tests for new filter labels
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
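Truncating at a word boundary, as this commit describes for steering commands, might be sketched as follows. The helper name `truncateAtWord` and the trailing ellipsis are assumptions; the commit only states the 110-character limit and the word-boundary rule.

```javascript
// Cut `text` to at most `max` characters, backing up to the last space so a
// word is never split. The appended ellipsis is an assumption of this sketch.
function truncateAtWord(text, max = 110) {
  if (text.length <= max) return text;
  const slice = text.slice(0, max);
  const lastSpace = slice.lastIndexOf(" ");
  // Fall back to a hard cut when there is no space to break on.
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
  return cut + "…";
}

console.log(truncateAtWord("aaa bbb ccc", 9)); // aaa bbb…
```

The full text stays available elsewhere (the detail panel, per the earlier commit), so the table cell can afford to be lossy.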
Impact scoring:
- Each commit includes linesChanged from git shortstat
- Steering entries inherit impact from their resulting commit
- Impact column shows a proportional bar + line count
- Steering bars in blue, commit bars in grey
Squad responses:
- Extractor now captures the assistant's first response after each steering command (the squad perspective)
- Detail panel shows 'Squad Response' section with the agent's reply, truncated to 500 chars with scroll
Noise cleanup:
- Removed 'Change to Steering', 'Multiple redirections', and test artifact entries from the steering log
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ixed
This is the Last Known Good release of the Steering view.
What changed:
- Disabled auto-contribute (produced noise without AI summarization)
- 10 hand-curated steering entries matching exemplar quality: each has human prompt, narrative whatHappened with commit hash, and emoji **bold headline.** insight level-up
- Removed synthetic pivot entries (noise, not signal)
- Fixed extractor: filters out tool invocations, collects full squad reasoning (up to 6000 chars)
- whatHappened no longer repeats the steering command
- Level-up no longer duplicates whatHappened; it uses the resulting commit's synthesized level-up or curated text
Quality lessons learned:
- Keyword-pattern synthesis produces garbage level-ups
- Auto-contribute without AI summarization creates noise
- The exemplar format (emoji + bold + insight) requires understanding context, not just truncating commit messages
- Always validate output against the grounding exemplar before shipping: look at real rendered data, not code
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The merge step was overwriting hand-curated levelUp and whatHappened from the steering log with auto-generated commit data. Now checks if the entry already has both fields and skips enrichment.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each entry has:
- Real human prompt (steering command)
- Narrative whatHappened with commit hashes
- emoji **Bold headline.** Insight sentence. level-up
Examples:
🌱 **Feature born.** Gap between data views and narrative filled.
📡 **Git history becomes the story.** Repo evolution visible.
✅ **Tests caught real bugs.** Milestone threshold fixed.
🔧 **Production crash → test coverage.** Failure drove testing.
🎯 **Output is the eval.** Stopped fixing code, started reading.
Disabled auto-contribute: curated quality requires human or AI, not heuristic keyword matching.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…al grade
Detail panel now shows for each curated steering entry:
- Squad Response: full agent reasoning (Architect, Coder, Tester perspectives)
- Commit info: hash · lines changed · tests passing
- Files Changed: all affected files listed
Impact column shows 3 values per row:
- Lines changed (e.g. 723L)
- Tests passing (e.g. ✓ 425/426)
- Eval grade (A = all columns filled)
Curated steering log enriched with:
- resultingCommit hash for each entry
- assistantResponse with squad role tags
- filesChanged arrays
- test pass counts at time of commit
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…kups
Session-extracted steering entries now:
- Match ANY commit type for resulting commit (not just feat/milestone)
- Get whatHappened from assistant response (squad reasoning)
- Get levelUp from the resulting commit's synthesized level-up
- Only skip enrichment if BOTH fields already curated
Removed duplicate findResultForContributed function.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Session-relative times (seconds from start) now convert to proper wall-clock ISO timestamps using: now - sessionDuration + eventTime.
This means:
- Latest steering prompts appear at the top (newest-first sort)
- Timestamps match real times, not garbage relative offsets
- New events streaming via SSE trigger re-extraction (useMemo depends on [events, turns]), the same live behavior as other views
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
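The conversion formula in this commit (now - sessionDuration + eventTime) can be written out as a small helper. The name `toWallClockIso` and the choice of seconds for inputs are assumptions for illustration; `nowMs` is injectable so the result is reproducible in tests.

```javascript
// Convert a session-relative offset (seconds from session start) into a
// wall-clock ISO timestamp: now - sessionDuration + eventTime.
function toWallClockIso(eventTimeSec, sessionDurationSec, nowMs = Date.now()) {
  const sessionStartMs = nowMs - sessionDurationSec * 1000;
  return new Date(sessionStartMs + eventTimeSec * 1000).toISOString();
}

// A one-hour session observed at noon UTC; the event is 30 minutes in.
const now = Date.parse("2025-01-01T12:00:00.000Z");
console.log(toWallClockIso(1800, 3600, now)); // 2025-01-01T11:30:00.000Z
```

The formula implicitly treats the session as ending at `now`, so timestamps for an older, already finished session would be shifted forward by however long ago it ended.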
- Removed 'Use token [REDACTED]' and 'iterate' from steering log
- Raised minimum steering message length from 5 to 15 chars (filters out short non-steering like 'iterate', 'good', 'ok')
- 4 new tests for live update behavior:
  - New turns produce new steering entries
  - Assistant response captured from turn events
  - Short messages filtered as non-steering
  - impactTurns computed between consecutive steerings
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The POST redaction test was writing to the real .agentviz/steering-v1.jsonl, leaking 'Use token [REDACTED]' entries. Now mocks process.cwd() to a temp directory and cleans up after. Verified: zero leaks after test run.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Prior context:
- Detail panel now shows 'Responding To' section: the assistant's last statement that the user was reacting to. Gives full context for short steering like 'make the separate fix then'
Noise filtering:
- Extractor skips messages containing [REDACTED], ghp_, sk-
- Prevents test artifacts from appearing as steering entries
- Combined with temp dir fix, noise is eliminated at both source and persistence layers
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
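Combining the 15-character minimum from the previous commit with the marker check from this one, the extractor's filter could be sketched like this. The function name `isSteeringCandidate` and the plain substring test are assumptions; only the threshold and the three markers come from the commit messages.

```javascript
const MIN_STEERING_LENGTH = 15; // raised from 5 in the previous commit
const NOISE_MARKERS = ["[REDACTED]", "ghp_", "sk-"]; // markers named in this commit

// Returns true when a user message is long enough to be a steering command
// and contains none of the noise markers.
function isSteeringCandidate(message) {
  const text = message.trim();
  if (text.length < MIN_STEERING_LENGTH) return false;
  return !NOISE_MARKERS.some((marker) => text.includes(marker));
}

console.log(isSteeringCandidate("iterate"));                    // false
console.log(isSteeringCandidate("make the separate fix then")); // true
```

One caveat of a plain substring test: a legitimate prompt that happens to contain `sk-` (for example the word "task-runner") would also be dropped, so a stricter token pattern may be preferable.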
Addressed all PR review guardrails from jayparikh#43:
✅ Typecheck: clean (npx tsc --noEmit)
✅ Tests: 473/474 (1 pre-existing VS Code path test)
✅ Build: clean (npx vite build)
✅ Screenshots: all 8 originals present + journal-view.png added
✅ README: Steering view section added
✅ Style guide: Section 20 added for Steering view conventions
✅ Hardcoded colors: replaced all hex values with theme tokens (theme.accent.primary, theme.track.reasoning, theme.semantic.success)
✅ Test coverage: 3 test files for parser/server changes
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…erywhere
Removed from PR (not part of the feature):
- .squad/ (14 files): squad config
- AGENTS.md, JOURNAL.md: squad meta
- .github/copilot-instructions.md: reverted to main
- CLAUDE.md: reverted to main
- .agentviz/steering-v1.jsonl: gitignored, auto-generated at runtime
Renamed journal→steering in all filenames:
- routes/journal.js → routes/steering.js
- src/components/JournalView.jsx → src/components/SteeringView.jsx
- src/lib/journalExtractor.js → src/lib/steeringExtractor.js
- src/__tests__/journal*.test.* → src/__tests__/steering*.test.*
- docs/journal-demo.md → docs/steering-demo.md
- docs/screenshots/journal-view.png → docs/screenshots/steering-view.png
Renamed journal→steering in all code:
- extractJournal → extractSteering
- JOURNAL_TYPES → STEERING_TYPES
- JournalView → SteeringView
- JournalRow → SteeringRow
- commitToJournalType → commitToSteeringType
- extractGitJournal → extractGitSteering
- computeJournalStats → computeSteeringStats
Final PR: 16 files (9 new, 7 modified). No squad, no agent instructions, no committed data files.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds ✨ Synthesize button that calls Copilot SDK to generate exemplar-quality whatHappened and levelUp for session steering entries. Same SDK pattern as Coach: defineTool + session.send.
New files:
- src/lib/steeringAgent.js: Copilot SDK synthesis agent
- routes/steering.js: POST /api/journal/synthesize endpoint
Frontend:
- SteeringView.jsx: Synthesize button, results overlay on entries
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Steering entries now show heuristic data immediately, then the Copilot SDK synthesis runs automatically in the background once session data loads. As results arrive, level-up and whatHappened columns update in place; impact goes from empty to populated. Subtle '✨ Analyzing...' indicator shows while synthesis runs. No manual button needed.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The button was calling playback.seek() which updated the time but
stayed on the Steering view. Now uses onSeekReplay callback that
does both: seek to the timestamp + setView('replay').
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Analysis is now retry-safe: on failure or empty results, retries up to 2 times with exponential backoff (3s, 6s). No manual reset needed; the system recovers automatically.
State machine: idle → analyzing → done (or retry → done).
Per-row pulsing indicator in the Level-Up column shows which entries are being analyzed. Disappears when results arrive.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
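The retry policy described in this commit (up to 2 retries, waiting 3s then 6s, with empty results treated like failures) could be sketched as below. The `withRetry` name, its options object, and the injectable `sleep` are assumptions made for illustration and testability, not the PR's actual code.

```javascript
// Resolve after `ms` milliseconds.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Run `task`; on a thrown error or an empty result, retry up to `retries`
// more times, doubling the delay each time (3000 ms, then 6000 ms).
async function withRetry(task, { retries = 2, baseDelayMs = 3000, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const result = await task();
      const isEmpty = Array.isArray(result) ? result.length === 0 : result == null;
      if (!isEmpty) return result;
      lastError = new Error("empty result");
    } catch (err) {
      lastError = err;
    }
    // Back off only if another attempt remains.
    if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
  }
  throw lastError;
}
```

A caller would wrap the synthesis request, for example `withRetry(() => requestSynthesis(entries))`, where `requestSynthesis` is a hypothetical function standing in for the POST to the synthesize endpoint.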
59812c8 to 43b5e9f
…on insights
Adds an intelligence panel above the table showing:
- Steering density (commands per hour) with visual emphasis when high
- Top 4 actionable insights from steering patterns:
  'Quality redirected N times — consider a checklist skill'
  'Tone corrected N times — a style guide skill would reduce this'
  'Visual corrections N times — validate rendered output, not code'
- Category badges showing correction types and frequency
- Insights doc at docs/steering-insights.md with 8 hypotheses
The insights panel appears automatically when steering data has enough signal. Density > 5/hr = agent needed frequent correction. Density < 2/hr = agent was well-aligned.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
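The density thresholds in this commit translate into a short calculation. The function names and the returned labels are hypothetical, and the middle "moderate" band for 2 to 5 commands per hour is an assumption, since the commit only characterizes the two ends.

```javascript
// Steering commands per hour of session time.
function steeringDensity(commandCount, sessionDurationSec) {
  const hours = sessionDurationSec / 3600;
  return hours > 0 ? commandCount / hours : 0;
}

// Thresholds from the commit message: > 5/hr and < 2/hr.
// The "moderate" label for the range in between is an assumption.
function classifyDensity(perHour) {
  if (perHour > 5) return "frequent-correction";
  if (perHour < 2) return "well-aligned";
  return "moderate";
}

// 12 steering commands in a two-hour session is 6 per hour.
console.log(classifyDensity(steeringDensity(12, 7200))); // frequent-correction
```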
- Rewrite steering-demo.md to cover background analysis, intelligence panel
- Update evals.md test counts (48 tests across 3 suites)
- Remove outdated sample data tables, add analysis and intelligence sections
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Author
Too much evolved and churned here, so I opened a clean PR (#44) instead and closed this one.
Steering View
Problem
When you watch an AI coding session in Replay or Tracks, you see what the agent did — tool calls, file edits, reasoning. But you can't see which human prompts actually mattered. Which ones changed the direction? Which ones led to the most code? Which ones caused bugs that led to better tests?
Solution
The Steering view extracts every human prompt that redirected the agent, pairs each one with the git commits and code changes it produced, and shows the impact: lines changed, tests passing, and what was learned.
It understands not just the primary coding agent (Copilot CLI, Claude Code, VS Code Copilot), but also sub-agents and squads — capturing the reasoning from dispatched agents as they respond to steering.
An agent-assisted analysis runs automatically in the background using the Copilot SDK to improve the quality of What Happened, Level-Up, and Impact columns beyond what static heuristics can produce. A steering intelligence panel shows density scoring, category breakdown, and actionable insights derived from the session's steering patterns.
Screenshot
Quick Start
git fetch origin pull/41/head:feature/journal-view
git checkout feature/journal-view
npm install && npm run dev
Load a session, then click the Steering tab (last tab, after Coach).
What you see
Each row is either a human steering command (italic quotes, bright text) or a git commit (dimmer text, blue hash). For each steering command:
The detail panel shows the full squad response, files changed, commit info, and a Responding To section showing what the agent said that the user was reacting to.
Steering intelligence
Below the timeline, an expandable panel shows:
How it works
Guardrails verified (per #43)
Files (18 total: 11 new, 7 modified)
Built with snap-squad — npx snap-squad init