feat: generic wizard + /config hub, TUI overhaul, structural refactor, e2e uplift + docs sync#66
Conversation
…nal review, title)
render/wizard.ts was coupled to setup. Generalize it (its pure model + hardened
alt-screen/raw-mode driver stay):
- new `text` step kind: default/placeholder, secret masking, validate() that
blocks confirm; char/erase in the pure reducer; caret + inline error render
- parameterized `title` (was hardcoded "tsforge setup")
- optional review screen (`review:false` applies on the last step's confirm)
- results now include `text` (+ `textValue` helper); overview shows text answers
- driver takes an options object {title, review, extra, out}; `b`/`q` are
back/cancel except on a text field (where they're literal input)
…ons object
Both callers pass {title, extra} instead of positional args — setup keeps its
'tsforge setup' header; scaffold now gets a correct 'tsforge scaffold' header
(it previously inherited the hardcoded setup title). Behavior-preserving for setup.
b/q and printable chars now decode as text input ({char}); the driver maps b/q to
back/cancel only on non-text steps. Backspace decodes as erase.
… gate
Spawns the wizard in a real pty, picks a single-select, erases the default and
types into a text field, confirms; asserts frames + final {single, text}. Wired
into e2e:pty so it runs on every validate/CI.
There was a problem hiding this comment.
Code Review
This pull request generalizes the wizard primitive to support a new text step kind, parameterized titles, and an optional review screen, refactoring /setup and /scaffold to use it. It also adds a real-PTY end-to-end test harness. The review feedback highlights two key issues in the interactive driver: first, pressing the Spacebar on a text step is incorrectly handled as a toggle action instead of typing a space; second, the text step hints misleadingly suggest using b and q for navigation and cancellation, which are instead captured as literal text inputs.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
… review)
Gemini review:
- HIGH: space on a text step decoded to "toggle" (a no-op) — you couldn't type a
space. On a text step, EVERY printable ASCII char (0x20–0x7e) is now literal
input, incl. space/b/q. Bounded at 0x7e so DEL (backspace) still decodes as erase.
- MEDIUM: text-step hints falsely showed "b back / q cancel" (those are typed).
Hints now read "type to edit · ← back · enter continue · esc cancel"; ← is a real
back key for text steps (left-arrow is otherwise unused while editing).
Guards: reducer test (space types into a field), actionFor DEL boundary, and the
real-pty e2e now types a value WITH a space ("x y") to lock the regression.
…c wizard First consumer of the generic wizard: an in-harness settings menu so users never hand-edit ~/.tsforge/models.json. - cli/config-menu.ts: settings surface built from wizard steps — buildConfigMenu (switch/add), buildModelPickStep, buildAddModelSteps (name/baseUrl/model/apiKey; apiKey masked + optional; required-field validation), draftToEntry + addModel (pure). Persists via saveModelsConfig/setActiveModel; hot-swaps the provider via an injected reconfigure. Mode / feature-toggle groups slot in later. - cli.ts: /config → runConfigCommand, extracted to handleConfig for the complexity cap; suspends the REPL editor's stdin around the wizard via a repl-scoped editorControl (mirrors resizeEditor); applies the result live. - commands.ts: /config in the registry. Tests: config-menu pure builders/validators/addModel; a real-pty e2e (scripts/e2e-config-pty.py, in the gate) drives the add-model flow end to end and asserts models.json persisted + active + provider hot-swapped.
|
Added
Testing: pure builders/validators/
|
…ncel Rebuild /config as a single owned-stdin menu loop (no nested overlays), which fixes the reported bugs and makes it a real settings hub. Bug fixed (reported): a REPL-launched wizard called stdin.pause() on exit — because the editor owns stdin via a `data` listener (no keypress listeners), the wizard wrongly thought it owned raw mode. pause() emptied the event loop and QUIT tsforge on cancel/back/apply. Fix: runWizard gains `manageInput` (default true); REPL callers (/config, and /setup from the editor) pass false so they never seize/pause stdin. Also removes the 'b'-leaks-into-input class (no nesting + clear the editor buffer on resume). /config is now a hub (cli/config-menu.ts) — one keypress session, grouped settings, each with a one-line description + live value, applied immediately: - Model: switch active (cycles models.json) + add a model (inline text fields, masked optional apiKey, validation) -> saveModelsConfig + live reconfigure - Behavior: mode (plan/normal), gate command, editable scope (session) - Tools: web / TDD / script toggles (env, live for subsequent turns) Pure helpers unit-tested; the interactive loop covered by a REAL-REPL pty e2e (open /config via the palette; cancel-doesn't-quit, live mode toggle, add-model persist). Removed the obsolete standalone harness. Docs updated. validate green (1858 pass; all three pty suites pass).
…ive) - The auto-detected gate command is a huge multi-line tsc+eslint+test string; it rendered verbatim and blew the whole /config menu layout out. Menu rows now clamp each value to one line (whitespace-collapsed, ellipsis) via oneLine(). - Web tools now default ON in the interactive REPL (an assistant that can't look things up is silly). Only a default — an explicit TSFORGE_WEB (incl. "0") wins, and one-shot/headless/eval never run repl() so they stay offline+deterministic. Test: oneLine unit test (truncate + collapse newlines); validate green (1859 pass, all pty suites).
/config is now the single home for what a human actually configures, each setting with its own visible one-line description (the config screen IS the docs). Removed nonsensical toggles — nobody disables code navigation or git context — those stay env-only for eval/CI. - config-menu: per-setting descriptions rendered under every row; add Update check toggle; Web tools default-on (interactive) surfaced live. - Delete experimental ENV flags + all consumers/tests: TSFORGE_LEGACY_FEEDBACK, TSFORGE_NO_ASTGREP, TSFORGE_FORCE_TOOLS, TSFORGE_SIMPLICITY, TSFORGE_CONTRACT. - Graduate to always-on (flag deleted): hashline, TTSR, LSP write feedback. - Remove the now-dead yield_status machinery (only the deleted forced-tools experiment advertised it): tool spec, dispatch stub, policy class, session resolveYieldCalls. - Eval sweep dims trimmed to live flags (git/script/web). - Docs: flags.mdx points feature toggles at /config; purge deleted flags across uplift/eval/greenfield/lsp/quality/web docs. Verified: bun run validate green (typecheck+lint+format, 1842 tests, 3 pty suites) + real iTerm2 GUI drive of /config.
Fixes the double-typed text when entering values in /config (e.g. Add a model). The palette launches /config via a fire-and-forget runLine then resume()s the editor in its `finally`, re-activating it UNDERNEATH the overlay so every key was echoed into the editor's pinned input row on top of the overlay's own render. Add an `inert` input-gate to the editor that resume() does NOT clear; the /config overlay sets it, so the stray resume can't re-activate the editor. Regression tests in the real-PTY e2e: typed text renders once, and the editor works again after /config closes. Also trims /config to only genuine human choices: - Remove Script tool + Update check toggles (eval/kill-switch knobs, not settings). - Update check now ALWAYS runs (interactive, non-CI; respects NO_UPDATE_NOTIFIER); TSFORGE_NO_UPDATE_CHECK deleted. TSFORGE_NO_SCRIPT kept as an env kill-switch. - e2e scripts switched from TSFORGE_NO_UPDATE_CHECK to NO_UPDATE_NOTIFIER offline. - Docs updated: /config = model, mode, gate, scope, web, TDD; eval/CI-only knobs (NO_LSP_TOOLS/NO_GIT_TOOL/NO_SCRIPT) documented separately. Verified: bun run validate green (typecheck+lint+format, tests, 3 pty suites) + isolated pty repro (marker renders 1x, was 2x).
Audited all 43 doc pages against the current source. Fixes: - plan-mode / interactive: `--plan` accurately described (forces plan for an interactive session, overriding a repo's autonomous policy.mode; ignored by one-shot/headless) — was overstated/ambiguous. - model-agent: add the `script` tool (programmatic tool calling) to the tool table. - spec-runner / commands: eval sweep examples used the removed `ttsr,hashline` dimensions → live dims (`git`/`script`). - validation: web-build turn cap 180 → 400 (loop.constants.ts webMaxTurns). - rule-packs: `generic-ts` is an always-on pack (core TS safety), not a "detection label only" — moved into the always-on table. - flags: document TSFORGE_BOOT_URL/TIMEOUT defaults (http://localhost:3000/, 15000ms). - roadmap: "shipped through 0.18" → 0.27; Road-to-1.0 sweep example uses live dims. 30+ pages verified clean. Docs build green (46 pages).
Design for making tsforge's capabilities discoverable in-session: /help becomes an actionable capability browser over a self-describing registry; scaffold (boringstack/astro/vite) + recipes brought into the REPL; an anti-drift test that fails if a command or model tool ships without a discovery home.
…e); drop duplicate loop
…menus Replace /config's alt-screen menu with a compact inline dropdown above the input row (matching the @file picker pattern). The new `inline-menu.ts` module provides a reusable FLAT menu driver + formatter: - formatMenuRows(rows, cursor, columns, color) returns a complete overlay block: windowed ≤8 rows with scroll indicators, divider, selected row's description, and footer hint. No alt-screen, no raw-mode toggle. - runInlineMenu(rows, deps) owns keypress and navigates ↑/↓, Enter to select, Esc to close. Resolves to row index or null. - Config-menu migrated to use inline-menu via IConfigMenuView callbacks (render/close), injected by cli.ts handleConfig. Edit sub-view uses the same overlay pattern inline. - All behavior preserved: toggles stay open + re-render (cursor keeps row), text fields inline with validation, editor suspend/resume + inert gate (no double-typed text), model persistence to models.json. Tests: formatMenuRows windowing test added, config-menu 13 pass, e2e 15/15.
Two coordinated changes to the capability browser: 1. Remove model-tools + passive machinery: - Delete toolCapabilities() and TOOL_METADATA entirely - Remove "passive" from CapabilityKind (now "command" | "wizard") - Remove detail field from ICapability - buildCapabilities returns only command + wizard rows (no tools) 2. Migrate /help to inline-menu (same as /config): - Replace owned-menu driver with inline-menu + formatMenuRows - capabilityRows now returns IMenuRowData (label, hint, describe) - Remove showDetail from ICapabilityMenuDeps - handleHelp follows handleConfig pattern: suspend→runCapabilityMenu→resume - Uses statusBar.setOverlay/clearOverlay for rendering 3. Tests updated: - Delete "every model tool has a discovery home" anti-drift test - Keep "every slash command has a discovery home" - capability-menu tests use formatMenuRows instead of owned-menu Note: owned-menu.ts remains (still used by repl-recipe.ts). All tests pass; e2e config script: 15/15 PASS.
…, add title The inline menus (/help, /config, recipes) had three rendering bugs: - STACKING: the overlay could exceed the terminal height, so the status bar's relative-redraw couldn't climb to the (scrolled-off) region top to clear it and each redraw left a copy. Now the visible-row count is bounded to the terminal height, and EVERY overlay line is clipped to the column width (an unclipped describe line wrapped, desyncing the row bookkeeping and compounding it). - STYLING: every row was painted bold (then, worse, all-blue). Now only the SELECTED row is brand+bold; all other rows are plain default text (legible). - LAYOUT: added a bold title at the top; the selected row's description stays at the bottom. Verified in a REAL 14-row terminal (scripts/e2e-help-menu-pty.py, wired into e2e:pty): no stacking, exactly one styled row, title on top. /config e2e 15/15.
…h on cancel The / command palette was the last menu on the alternate screen. It now renders as the same compact inline overlay as /help and the @ picker, reusing formatMenuRows: command names as rows, the selected command's summary at the bottom, and the live query as the overlay title (/co). No alt-screen. Also fixes the reported bug where cancelling the palette (Esc or backspace-past- empty) left the trigger '/' stuck in the editor — the cancel path now clears it. clampIndex moved to inline-menu (the menu core) with a re-export from command-menu so existing importers are untouched (avoids an import cycle). Verified in a real terminal: inline (no ESC[?1049h), filters, Esc closes cleanly, no stray '/', editor live after (7/7). config e2e 15/15, help e2e 6/6, unit green. e2e palette-open markers updated for the inline title.
The manageInput:false path (REPL-launched wizard) was untested — the exact invariant that keeps a wizard from pausing stdin and quitting the process. Extract the inline decision into a pure exported wizardOwnsRawMode() and unit test it: manageInput:false / non-TTY / pre-existing keypress listeners / no setRawMode all yield false; only a standalone TTY owns raw mode. Regression: wizard.test.ts (wizardOwnsRawMode ownership rule).
The greenfield section still described the removed contract feature (contracts/<id>.md basename guard, TSFORGE_CONTRACT, greenfield-contract.test), and the tools section listed the renamed yield_status tool. Bring the manifest back in line with the code it contracts.
Startup redesign: - Replace the anvil emblem with a large ANSI-Shadow TSFORGE wordmark painted with a per-column cyan→indigo→violet gradient (new truecolor() helper). - Clear the screen + scrollback before the banner so it never renders on top of leftover shell output (env dumps, prior command noise). - Drop the cryptic cwd/scope/gate/session block (those live in /config); show a single compact hint bar + styled no-config / plan-mode nudges instead. Input prompt: - The › prompt now persists while typing: it's painted as a hanging gutter in front of the editor block (was only on the pre-typing placeholder row). The editor reserves PROMPT_COLS so wrapping matches the visible width and no row exceeds the terminal. Tests: banner gradient + wordmark; status-bar prompt-in-editor-mode; editor-e2e and render-e2e cursor/wrap math updated for the 2-col gutter.
Redesign the conversation transcript (the user asked for bubbles, not blue text + a left bar): - USER messages render as a full rounded bubble (╭─ you ─╮ / │ … │ / ╰──╯), sized to content and capped at the terminal width (word-wrapped). - AGENT messages render as a left-accent card: a rounded ╭ <model> cap, every body line prefixed with a │ rail, a ╰ cap when the turn ends. Streaming- friendly (any width; code blocks/tables render cleanly inside the rail). Gap fix: the live stream previously stacked a label newline + the stream separator + the model's own leading blanks, leaving a big empty block before each answer. railAgentChunk now swallows leading blank lines until real content, so the answer starts right under the cap. Shared helpers (userBubble, agentCardTop/Bottom, agentBar, agentCardBody, wrapToWidth) power both the settled/replay path (renderMessage) and the live streaming path (cli.ts). Regression: tests/message-render.test.ts.
… rail Two bugs in the bubble/prompt rendering: - Plan-mode flow wrote its hints via process.stdout.write, bypassing the pinned StatusBar region — corrupting the input row and stranding a › in scrollback. Route all four plan-flow writes through echo() (→ statusBar.writeStream). - Streamed agent text wrapped at the terminal edge, so continuation rows escaped the card's │ rail. railAgentChunk now soft-wraps at the card's inner width (columns − rail), ANSI-escape-aware, so text can never spill past the rail.
The soft-wrap used a naive 1-col-per-char count and filled the last terminal column, so wide chars (emoji/CJK) and auto-margin terminals still wrapped the row themselves — dropping the │ rail on the continuation. Wrap now: - counts each char by displayWidth (emoji/CJK = 2 cols), and - leaves the last column empty (columns − rail − 1) so the terminal never wraps. Also guards a missing/zero stdout.columns with an 80-col fallback.
The rail-wrap logic was an untestable inline closure in cli.ts. Extract it to render/agent-rail.ts as makeAgentRail(rail, innerWidth) — a stateful streaming wrapper (state persists across token chunks) that prefixes every visual line with the │ rail, swallows the leading gap, keeps the rail on interior blanks, and soft-wraps at the card's inner width (display-width-accurate; ANSI escapes pass through free). Content budget now leaves rail + 2 spare columns so no terminal wraps a row and drops the rail. Regression: tests/agent-rail.test.ts — rail on every wrapped row + width bound at 80/92/120 cols incl. emoji/CJK, gap-swallow, interior-blank rail, split-chunk.
…breaks) drive()'s finally now calls closeAgentTurn(), so the ╰ bottom cap is written the moment streaming ends. Post-turn hints (plan-mode notice, PLAN review, folding changes) then land BELOW the sealed card instead of inside it — which had left the hint un-railed between the last body line and the cap, visually breaking the │ rail. Idempotent with the existing close in runLine's finally.
The post-turn plan hint was plain full-width text that read like a debug line. Replace it with a compact styled chip matching the startup plan line: brand ◆ plan (or ◆ plan ready), dim helper text, green approve. Two variants driven by whether the agent has proposed a plan yet.
…m 5) Add src/lib/trace.ts — trace(scope, err) gated by TSFORGE_TRACE/TSFORGE_DEBUG (file path or stderr; no-op when unset). Wire the 10 silent degrade catches in turn.ts (6), session.ts, run.ts, detect-gate.ts (2) to trace the swallowed error instead of vanishing — keeps the degrade behaviour, adds observability. Prod stays silent; TSFORGE_TRACE=1 surfaces what quietly failed (e.g. buildTsService). Regression: tests/trace-util.test.ts.
…tem 1) Pure move, no logic change — behaviour-preserving (all existing gate/loop/session tests pass unchanged). detect-gate.ts split into focused src/gate/ modules: types.ts IGate, IFileLintProblem, FileLinter tool-paths.ts resolveToolBin + the bundled BIN/CONFIG/CHECK paths tsconfig.ts strict tsconfig overlays + ensureWebGateTsconfig + tscPart shell.ts shSingleQuote + packEnvPrefix test-discovery.ts discoverTestCommand + webTestProbe linter.ts makeFileLinter + formatFile + prettierWriteCommand core-gate.ts buildGate + buildCoreFix web-gate.ts buildWebGate/TypeGate/TscCheck/Fix + WEB_FRAMEWORKS/WEB_PACKS index.ts public barrel Web scaffolding (scaffoldWeb/installWebDeps/webGuidance/BUILD_PREAMBLE) → src/scaffold/web-scaffold.ts. Path constants moved a directory deeper, so import.meta.dir joins gained a level (verified STRICT_CONFIG/STRICT_WEB_CONFIG/BROWSER_CHECK resolve to the package root). Updated all 18 import sites (4 src, 3 scripts, 11 tests).
The web gate was one opaque `build && tsc && lint && stubs && format && tests && render` chain — a failure buried WHICH stage broke in a wall of mixed output. Add scripts/staged-gate.ts: a bundled runner (mirrors browser-check/stub-check) that takes a base64 JSON stage list, runs each stage sequentially via the shared runShellCommand, prints a `━━ <label> ━━` banner + streams output live, and on the first failure prints `✗ <label> FAILED (exit N)` and stops with that exit code. buildWebGate now emits `bun staged-gate.ts <payload>` with the SAME commands in the SAME order (type-aware lint is its own stage) — identical stop-on-first-failure semantics, legible per-stage feedback. base64 keeps the quoted/&&/env-prefixed stage commands intact through the shell with zero escaping; onChunk forwards both stdout and stderr so the gate parser still sees every error. Regression: tests/staged-gate.test.ts (banners, stop-on-fail, exit-code preserved, stderr forwarded, malformed payload → exit 2). Web-gate tests decode the payload. Verified end-to-end: a real web gate on a depless dir prints the vite-build banner then ✗ vite build FAILED.
The web profile enforced I-prefixed interfaces (IButtonProps) — non-standard vs React/shadcn/TanStack, so the model fought its training data every scaffold and burned turns 'correcting' idiomatic names. Web interfaces now need only be PascalCase: bare 'ButtonProps' AND 'IButtonProps' both pass. Core/library code is unchanged — it still requires the I-prefix. - namingRule (eslint-conventions.ts): web surface emits bare PascalCase (no prefix, no Register filter needed — bare already permits 'Register'). Covers both the gate and the write-time linter. - strict.web.eslint.config.mjs: static fallback + header comment updated (resolves the review's contradiction — the comment claimed 'no I-prefix' while the rule enforced it; now both say bare). - BUILD_PREAMBLE + web-templates guidance: instruct/illustrate bare names. Regression: eslint-conventions.test.ts (web bare, core still I-prefix) + gate-conventions.test.ts (real eslint: web accepts bare 'interface User', core rejects the same file).
…iew item 4) The ~220-line settleGate mixed auto-fix, gate execution, meta-rules, three convergence guards, and feedback injection in one body. Extract each seam: autoFixStep(ctx) → string[] (janitor fixers + what they changed) runGateStep(ctx, turn) (validate + live-stream flush) runMetaRulesStep(ctx) → violations (best-effort, change-scoped) checkStuck(ctx, state, errs, turn) (the 3 guards; shared stuckResult shape) injectFeedback(...) (red-gate feedback + auto-fix notice) settleGate is now a thin orchestrator; signature + IRunResult|null contract are unchanged, so both drivers (run.ts / session.ts) are untouched. Regression: tests/settle-steps.test.ts — checkStuck (converging run never trips, persistent single error stops, unchanged whole set stops) + autoFixStep (no-op ⇒ [], a real task.fix rewrite ⇒ reported). Guard internals stay covered by same-persist-guard.test.ts; existing loop/session suites pass unchanged.
ILoopCtx had grown into an 18-field grab-bag, and toolContextFor spread eight
fields one-by-one (...(x === undefined ? {} : {x})). Reshape:
flat core task, cwd, tsService, report, messages
ctx.tool signal, setupWeb, readOnly, policyMode, policyRules,
interactive, mcpRegistry, touched (ILoopCtxTool)
ctx.gate parse, lintFile, stackProfile, ruleOverrides, onGateChunk
(ILoopCtxGate)
ctx.tool groups exactly the optional fields IToolContext threads through, so
toolContextFor is now { …identity, ...ctx.tool } — one spread, touched still
shared by reference. Sub-objects are always-present and mutable (the Session
flips policyMode/readOnly/signal/setupWeb/lintFile mid-run). Both construction
sites (session.ts, run.ts + policyCtxFields) nest the fields; write-guard and
all loop accessors updated.
Existing loop/session/tool-accounting suites pass unchanged — behaviour-preserving.
…tor) Neither flag was recognized: an unknown --flag fell through as a POSITIONAL, so `tsforge --version` booted a session whose task was the literal string "--version" — while install.sh's post-install message advertises `tsforge --help show flags`. Add --version/-V (prints `tsforge <version>` via the existing currentVersion()) and --help/-h (a new pure cliUsage() kept next to the flag tables in cli/args.ts), dispatched first in main(). Regression: cli.test.ts — both flags parse as flags (never a task); usage text covers the advertised surface. Verified live: `tsforge --version` → tsforge 0.27.1.
Extract the duplicated read_until/stub-server/spawn/reap/alive blocks from the four PTY e2e scripts into scripts/lib/ptyharness.py (~180 LOC of divergent copies -> one module), and replace blind time.sleep() settles with buffer-aware drain()/wait_for() so alive-checks fail fast and no redraw bytes are lost. Scripts keep their scenarios verbatim; assertions unchanged. Flake gate: bun run e2e:pty green 3 consecutive runs; full validate green (1890 pass).
… wrap) New scripts/e2e-editor-pty.py (on ptyharness): the four editor surfaces that previously had only in-process tests now run against the REAL binary in a real pseudo-terminal — typing+backspace+Alt-Enter multiline into a submitted bubble, bracketed paste with embedded CRs landing as ONE message (no per-line submits), @ file-picker open→filter→select→submit, and 200-char wrap with exactly one status bar in the final frame. Wired into e2e:pty (now 5 scripts). 20/20 checks, green 3 consecutive runs; full validate green.
…n scenarios New scripts/lib/itermharness.py (osascript plumbing, model-aware BAR regex, window() context manager) shared by the three iTerm2 scripts: - every window is now closed via try/finally — no stranded GUI windows on a failing scenario - model under test overridable via TSFORGE_E2E_MODEL (BAR regex derives from it) - two NEW real-terminal scenarios in e2e-iterm-tui.py: bracketed paste with an embedded CR (must not submit), and @ picker filter+select landing the path in the input row Verified live: 25/25 TUI checks + 6/6 plan-mode lifecycle against the real model in real iTerm2; PTY suite unaffected (all green).
- staged-gate.test.ts resolved the runner via process.cwd(), so the suite failed when bun test ran from packages/core instead of the repo root — resolve relative to the test file instead (latent bug, found by running the suite from a different cwd) - settle-steps' task.fix test sleeps 1s to move mtime; give it an explicit 30s timeout so a loaded machine can't flake it past bun's 5s default
Pure move + re-import — same symbols, no logic changes:
- cli/repl.ts the interactive REPL (repl, initReplSession, approvals)
- cli/model-setup.ts provider config/factory, /model, context-window probe
- cli/logging.ts spinner, terminal Reporter, --log ledger, log paths
(module state now behind setInteractiveStream())
- cli/banner.ts welcome banner, startup hint, plan chip, resume replay
- cli/gate-setup.ts gate resolution (resumed > --accept > --web > auto);
BROWSER_CHECK now reused from gate/tool-paths
- cli/repl-commands.ts /sessions /map /review /trace + metrics line
- cli/web-setup.ts web scaffold + deps install progress
cli.ts keeps main(), runOnce(), and the one-shot modes (559 lines, was
2938). External import paths preserved via re-exports (providerConfig,
isApproval/isPlanApproval, spinner). Parity test now reads cli/repl.ts.
Full validate green (1890 pass + 5 PTY suites); live REPL boot smoke-tested
on the real binary.
- loop/staged-build.ts: the design→implement phase orchestration behind a narrow IStagedBuildHost seam (gate/tool swap, one send, one gate probe, one raw completion); Session methods become thin delegates. Now unit- tested against a fake host: gate+tools save/restore ordering, the interrupted short-circuit, the green-skip of phase 2, plan-note injection, and the from-disk type-contract re-injection. - loop/model-call.ts: the two pure per-call decisions in askModel — selectThinking (forced > repairing > per-send > config precedence) and offeredToolsFor (plan mode's read-only tool filter + MCP ride-along) — extracted and pinned by direct unit tests (the read-only guarantee had none). - Fixed en route: the staged-build design/implement prompts still told the model to I-prefix web interfaces, contradicting the scaffold guidance (web is bare PascalCase since a1bf032) — prompts now agree. session.ts 2118 → ~1930 lines. Full validate green (1921 tests + 5 PTY suites).
editor/completion.ts owns the anchor/query tracking, dropdown navigation, selection clamping, and accept-replaces-query-keeps-@ behaviour behind createCompletion(); the controller feeds it key names and re-queries it after edits. IEditorCompletionSource moved with it (re-exported from the controller for existing importers). Now directly unit-tested WITHOUT stdin (tests/editor-completion.test.ts, 9 tests over a real EditorBuffer): whitespace closes the mention, cursor- before-anchor closes, clamped selection, empty-list accept is a no-op, and the picked path lands as '@<path> '. The controller keeps the stdin loop, key dispatch (already a data-driven table), history, and lifecycle. Note: the planned key-dispatch rework was based on a stale review claim of a 387-line builder — the table is ~80 lines and already declarative, so it stays. Full validate green (1930 tests + 5 PTY suites incl. the real-PTY @-picker interaction).
…le-ops) Wire the env-gated trace() into the silent catches the review flagged: cli/model-setup (hostOf, detectContextWindow, warnDefaultModelOnRemote), cli/logging (newestLogFile), runNotify, file-ops currentFileView, and the editor's fire-and-forget palette/picker opens. Degrade behaviour unchanged — TSFORGE_TRACE now shows WHAT quietly failed. Regression test: an unreachable model endpoint leaves a scoped [cli.detectContextWindow] line while still returning undefined. Also fixed en route: agent.constants.ts still described the script tool as 'Opt-in (TSFORGE_SCRIPT)' — it has been ON by default with TSFORGE_NO_SCRIPT as the kill-switch since 0.23.0. Dropped from the plan (claims disproven while verifying): the hashline parse swallow doesn't exist, and the snapshotMtimes/changedSince 'duplicate' lives only in turn.ts (already traced) — no lib/file-diff.ts needed. Full validate green (1931 tests + 5 PTY suites).
- reference/flags.mdx: document the missing env vars, each verified against its read site — TSFORGE_TRACE/TSFORGE_DEBUG/TSFORGE_EDITOR_DEBUG (new 'Debug & tracing' section), TSFORGE_A11Y + TSFORGE_SCREENSHOTS (gate oracles), TSFORGE_WEB + TSFORGE_TDD env equivalents, TSFORGE_SCRIPT_MAX_CALLS/TIMEOUT_MS, TSFORGE_STATUS (--notify), TSFORGE_BASIC_INPUT, TSFORGE_PROPTEST_TIMEOUT_MS; clarify NO_UPDATE_NOTIFIER is the cross-tool standard. (TSFORGE_RPC_* stays undocumented: internal to the script tool's RPC bridge.) - cli/interactive.mdx: add the missing /trace and /setup rows; note /quit. - loop/gate-floor.mdx: new 'Staged gate progress & failures' section quoting the real runner strings (━━ banners, ✗ <label> FAILED (exit N), stop-on-first-failure). - scaffold/web.mdx: web interfaces are bare PascalCase now, not I-prefixed. - astro.config.mjs: input-editor.mdx was orphaned — added under Reference. Docs build green (46 pages).
agentCardBody joined rail-prefixed lines with no width awareness, so a long replayed assistant line spilled past the rail on resume (the live stream was already wrap-safe via makeAgentRail). It now feeds the settled text through the SAME ANSI-aware, display-width wrapper the live path uses; renderMessage threads the real terminal columns through. Tests: long replay wraps with every body row railed and within width; CJK counts as 2 columns. Full validate green (1933 tests + 5 PTY suites).
This PR grew well past its original scope (generic wizard +
/config) — it now carries four related workstreams, each validated green (typecheck + lint + ~1,930 unit tests + 5 real-PTY e2e suites) before landing:1. Generic wizard +
/configsettings hub (original scope)/setupand the new/configsettings hub: every setting shows a one-line description + live value — model (switch/add), mode, gate, editable scope, web tools, TDD./helpcapability browser; palette-run commands no longer ghost their names into the input.2. TUI overhaul
makeAgentRail); persistent›input gutter; styled plan-mode chips.3. Structural refactor (external review, all 6 items)
detect-gate.ts(1,049 lines) →src/gate/*modules; staged web gate with per-stage━━ label ━━banners and stop-on-first-failure.I-prefix);settleGatedecomposed into tested steps;ILoopCtxgrouped intoctx.tool/ctx.gate; env-gatedtrace()on silent degrade paths.--version/--helpused to boot a session with the flag as the task — now print-and-exit.4. Cleanup wave: e2e uplift + god-file splits + docs sync
scripts/lib/ptyharness.py,itermharness.py): dedupes ~180 lines across 7 scripts, buffer-aware polling instead of blind sleeps, guaranteed iTerm2 window teardown,TSFORGE_E2E_MODELoverride.e2e-editor-pty.py(typing/backspace/Alt-Enter, bracketed paste must-not-submit,@picker filter→select→submit, long-line wrap) + two new live-iTerm2 scenarios (paste,@picker interaction). iTerm2 suite verified live: 25/25 + 6/6.cli.ts2,938→562 across 7cli/*modules;Sessionshedsloop/staged-build.ts(host-interface seam, unit-tested) +loop/model-call.ts(plan mode's read-only tool filter now has direct tests); editor@-completion state machine extracted + tested without stdin./trace+/setupin the command table, staged-gate section, orphaned page re-linked; Astro build green (46 pages).Testing: every task committed only after full
bun run validate(includes the 5 PTY suites); the editor/REPL refactors additionally verified in real iTerm2 against a live model. Test count 1,890 → 1,931.