Build the perception->extraction->validation->sandbox compute->verification chat experience the spec calls for. Each unit of work renders as a typed card so users see observable work, not hidden reasoning.
Backend (convex/domains/financialOperator/):
- types.ts: 9 step kinds (run_brief, tool_call, extraction, validation, calculation, evidence, artifact, approval_request, result) x 7 statuses.
- sandbox.ts: deterministic JS compute (ETR, after-tax cost of debt, leverage, variance, compliance). Throws on NaN/divide-by-zero.
- validators.ts: schema/unit/range/confidence checks. HONEST_SCORES counts what was actually checked.
- extractors.ts + attFixture.ts: pinned AT&T 10-K fixture; real-PDF-shape interface for swap-in.
- runOps.ts: createRun, appendStep, updateStepStatus, getRun, listSteps. BOUND at 200 steps/run.
- orchestrator.ts: runAttCostOfDebtDemo + recordApprovalDecision actions.
Schema (convex/schema.ts):
- New financialOperatorRuns + financialOperatorSteps tables (additive).
- Fixed pre-existing data drift in productEventWorkspaces by adding activeEventSessionId as optional.
Frontend (src/features/financialOperator/):
- 9 typed card components + StepShell common chrome + StepStatusBadge.
- StepCard switch dispatcher.
- FinancialOperatorTimeline live-streaming parent (Convex useQuery).
- FinancialOperatorDemo standalone view at /finance-demo.
Routing:
- viewRegistry.ts: added financial-operator view at /finance-demo with aliases /financial-operator and /finops.
- QuickCommandChips: added optional `navigate` field on chips for workspace handoff; "AT&T cost of debt" chip routes to /finance-demo.
Tests (19/19 pass):
- sandbox.scenario.test.ts: 13 tests covering happy path, 1000-replay determinism, NaN/divide-by-zero/out-of-range sad paths, compliance gate, signed variance formatting.
- validators.scenario.test.ts: 6 tests covering missing required, wrong unit, out-of-range, low confidence, scale to 100 fields.
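The sandbox behavior described for this commit (deterministic compute that throws on NaN and divide-by-zero) can be pictured with a minimal sketch. The function names, signatures, and the [0, 1] range check below are assumptions for illustration, not the contents of sandbox.ts, and the inputs are made up rather than taken from the AT&T fixture.

```typescript
// Hypothetical sketch only: names and signatures are assumptions,
// not the actual exports of sandbox.ts.

/** Reject NaN/Infinity so bad inputs fail loudly instead of propagating. */
function assertFinite(name: string, value: number): void {
  if (!Number.isFinite(value)) {
    throw new Error(`${name} must be a finite number, got ${value}`);
  }
}

/** Effective tax rate = income tax expense / pre-tax income. */
function computeEffectiveTaxRate(
  incomeTaxExpense: number,
  preTaxIncome: number,
): number {
  assertFinite("incomeTaxExpense", incomeTaxExpense);
  assertFinite("preTaxIncome", preTaxIncome);
  if (preTaxIncome === 0) {
    throw new Error("preTaxIncome is zero; ETR is undefined");
  }
  return incomeTaxExpense / preTaxIncome;
}

/** After-tax cost of debt = pre-tax cost of debt * (1 - ETR). */
function computeAfterTaxCostOfDebt(
  preTaxCostOfDebt: number,
  effectiveTaxRate: number,
): number {
  assertFinite("preTaxCostOfDebt", preTaxCostOfDebt);
  assertFinite("effectiveTaxRate", effectiveTaxRate);
  if (effectiveTaxRate < 0 || effectiveTaxRate > 1) {
    throw new RangeError("effectiveTaxRate must be within [0, 1]");
  }
  return preTaxCostOfDebt * (1 - effectiveTaxRate);
}

// Illustrative inputs only (not the AT&T fixture values).
const etr = computeEffectiveTaxRate(200, 1000);
const afterTax = computeAfterTaxCostOfDebt(0.05, etr);
console.log(`${(etr * 100).toFixed(2)}%`, `${(afterTax * 100).toFixed(2)}%`);
```

Because the helpers are pure functions of their arguments, replaying the same inputs yields the same result every time, which is the property the 1000-replay determinism test relies on.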
Verification:
- npx tsc --noEmit: clean
- npx vitest run: 19/19 pass
- npx vite build: clean
- Live browser test: 10 cards stream end-to-end, approve flow produces Result card with ETR=16.86%, after-tax cost of debt=4.51%
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eploy hardening
Closes the 4 follow-ups from PR #204:
1. Vercel deploy hook race fix (60s wait + Tier-A verify poll)
2. Edge cache stickiness (no-cache headers on HTML, immutable on assets)
3. Inline chat experience (FinancialOperatorOverlay, no FastAgentPanel surgery)
4. Real PDF reader (Claude PDF input + structured extraction)
5. Examples B/C/D (CRM cleanup, covenant compliance, variance analysis)
## Examples B/C/D: full operator-console workflows
- Example B (financial_data_cleanup): inspect → profile spreadsheet → extract entities → dedup → enrich → validate CRM schema → export CSV. Sandbox compute: dedup ratio (387 -> 312, 19.4%).
- Example C (covenant_compliance): locate covenant → extract terms + inputs → validate → sandbox leverage + compliance gate → memo. Sandbox: computeLeverageRatio + checkCompliance (3.55x vs 4.25x cap, compliant).
- Example D (variance_analysis): inspect → align CoA → per-line variance in sandbox → driver search → CFO memo. Sandbox: computeVariance for 6 P&L lines, signed-percent formatting.
All three reuse the same backbone: runOps + sandbox + validators + typed step kinds. Each emits 8-10 cards; a picker on /finance-demo lets the user choose which workflow to run.
## Real PDF reader (production path)
`runRealCostOfDebtFromPdf` action:
- Takes a `_storage` PDF id (any uploader can produce one)
- Sends the PDF directly to Claude as a document input (no separate parse step)
- Constrains output to a strict JSON schema with sourceRef + confidence per field; instructs Claude to return null + add to unresolvedFields rather than fabricate
- Validates extraction with the same `validateExtraction()` used by the fixture path; computes ETR + after-tax cost of debt deterministically
- Bounded reads (MAX_PDF_BYTES = 20MB), HONEST_STATUS error path that surfaces parse failures verbatim, approval gate when required fields are unresolved.
## Inline chat experience (FinancialOperatorOverlay)
Surface-agnostic global drawer.
Listens for `?finRun=<runId>` URL param, mounts `FinancialOperatorTimeline` as a right-side drawer alongside any chat surface. Collapsible to a corner pill. Mounted in App.tsx so it works on /, /?surface=ask, /?surface=workspace, etc.
Why a global overlay vs editing FastAgentPanel directly:
- FastAgentPanel.tsx is 3700+ lines; surgical message-bubble edits have high blast radius
- URL-param-driven means any caller (chip, button, MCP tool) can activate the overlay via `setActiveFinancialRun()` without knowing the chat panel internals
- /finance-demo "View in chat" button deep-links to `/?surface=ask&finRun=<id>`; the overlay mounts beside the chat
## Deploy hardening
vercel-deploy-hook-backup.yml:
- 60s wait before firing the deploy hook on push events. Closes the race that bit PR #204: the GitHub→Vercel git mirror takes a few seconds to catch up after a merge, and deploy hooks pass no commit SHA, so immediate-fire deploys can clone the previous HEAD.
- Tier-A verification poll: after the hook fires, watch the live URL for up to 7 minutes for the bundle hash to rotate. Non-blocking warning if it doesn't (deploy still in progress, or edge cache stuck).
vercel.json headers:
- /assets/* → `public, max-age=31536000, immutable` (content-hashed, safe for permanent edge cache)
- /(everything else) → `no-cache, no-store, must-revalidate` plus CDN-Cache-Control / Vercel-CDN-Cache-Control no-store. Prevents the stale-HTML landmine that took 15 minutes to clear post-deploy on PR #204. The bundle hashes inside index.html change every deploy, so stale HTML points at JS files the new deploy may have evicted.
## Design alignment doc
New `docs/architecture/FINANCIAL_OPERATOR_DESIGN_ALIGNMENT.md` walks through how the cards build on existing UI kit per surface (web, mobile, workspace, CLI/MCP). Same step-kind enum, same status enum, same sandbox guarantee everywhere. Workspace + CLI/MCP exposure described as concrete next-PR plans.
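The Tier-A verification poll described under deploy hardening amounts to "fetch the live HTML, read the bundle hash, stop when it changes or the budget runs out". The sketch below shows that loop in TypeScript under stated assumptions: the workflow itself may well be plain shell, the `/assets/index-<hash>.js` filename pattern is assumed from typical Vite output, and `extractBundleHash`, `waitForBundleRotation`, and the 15-second interval are hypothetical.

```typescript
// Hypothetical sketch: names, the bundle filename pattern, and the polling
// cadence are assumptions, not the contents of the workflow file.

/** Pull the content hash out of a Vite-style entry script reference. */
function extractBundleHash(html: string): string | null {
  const match = html.match(/\/assets\/index-([A-Za-z0-9_-]+)\.js/);
  return match?.[1] ?? null;
}

type FetchHtml = () => Promise<string>;
type Sleep = (ms: number) => Promise<void>;

/**
 * Poll until the live bundle hash differs from `previousHash` or the time
 * budget runs out. Resolves to the new hash, or null on timeout so the
 * caller can emit a non-blocking warning instead of failing the job.
 */
async function waitForBundleRotation(
  previousHash: string,
  fetchHtml: FetchHtml,
  sleep: Sleep,
  budgetMs: number = 7 * 60000,
  intervalMs: number = 15000,
): Promise<string | null> {
  for (let elapsed = 0; elapsed <= budgetMs; elapsed += intervalMs) {
    const hash = extractBundleHash(await fetchHtml());
    if (hash !== null && hash !== previousHash) return hash;
    await sleep(intervalMs);
  }
  return null;
}
```

Injecting `fetchHtml` and `sleep` keeps the loop testable without a network or real timers; returning null on timeout mirrors the non-blocking warning described above.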
## Verification
- npx convex dev --once --typecheck=enable: clean (3.17m typecheck)
- npx tsc --noEmit: 0 errors
- npx vitest run convex/domains/financialOperator/__tests__/: 19/19 pass
- npx vite build: clean (42.66s, 211 entries precached)
- Live browser: 4 demo workflows trigger, each renders 8-10 typed cards; "View in chat" deep-links to /?surface=ask with overlay mounted (8 cards in the drawer next to the chat surface).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
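The sandbox figures quoted for Examples B, C, and D in this commit can be reproduced with a few pure functions. `checkCompliance` and `computeVariance` are named in the commit message, but the signatures, return shapes, and the `computeDedupRatio` / `formatSignedPercent` helpers below are assumptions made for illustration.

```typescript
// Hypothetical sketch: signatures and return shapes are assumptions.

/** Fraction of records removed by deduplication. */
function computeDedupRatio(before: number, after: number): number {
  if (!Number.isFinite(before) || !Number.isFinite(after) || before <= 0) {
    throw new Error("before must be a positive finite count");
  }
  if (after < 0 || after > before) {
    throw new RangeError("after must be within [0, before]");
  }
  return (before - after) / before;
}

/** Compliance gate: the ratio must not exceed the covenant cap. */
function checkCompliance(
  leverageRatio: number,
  cap: number,
): { compliant: boolean; headroom: number } {
  if (!Number.isFinite(leverageRatio) || !Number.isFinite(cap)) {
    throw new Error("leverageRatio and cap must be finite");
  }
  return { compliant: leverageRatio <= cap, headroom: cap - leverageRatio };
}

/** Relative variance of actual vs budget. */
function computeVariance(actual: number, budget: number): number {
  if (!Number.isFinite(actual) || !Number.isFinite(budget) || budget === 0) {
    throw new Error("budget must be finite and non-zero");
  }
  return (actual - budget) / budget;
}

/** Format a fraction as a percent with an explicit sign for increases. */
function formatSignedPercent(fraction: number, digits: number = 1): string {
  const pct = fraction * 100;
  const sign = pct > 0 ? "+" : "";
  return `${sign}${pct.toFixed(digits)}%`;
}

// Figures from the commit message: 387 -> 312 records, 3.55x vs 4.25x cap.
console.log((computeDedupRatio(387, 312) * 100).toFixed(1)); // "19.4"
console.log(checkCompliance(3.55, 4.25).compliant); // true
```

The 19.4% dedup ratio follows from (387 - 312) / 387 = 75 / 387 ≈ 0.1938.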
…apabilityBadge
Two corrections to the prior PR:
## 1. Design-kit alignment (we build on top of the kit, not next to it)
Replaced ad-hoc styling with the kit's canonical utilities (per
docs/architecture/FINANCIAL_OPERATOR_DESIGN_ALIGNMENT.md and the
NodeBench AI Design System reference):
- StepShell: now uses `.nb-panel` (12px radius + hairline border +
panel bg, kit canonical) instead of a hand-rolled `.nb-card`-styled
box. Left accent stripe via `::before` keeps cards distinguishable
without inventing new chrome.
- Type: every kicker is `type-label !tracking-[0.18em]` (kit's
canonical 11px uppercase 0.18em). Titles are `type-card-title`.
Body is `text-[13px] leading-[1.5]`. Mono numerics use `font-mono`.
- Color: every raw `#d97757` literal across 7 card files swapped to
`var(--accent-primary)` (Tailwind arbitrary-value with CSS var).
Status badges now use `.badge-success/-warn/-fail/-accent` tone
families with kit-canonical semantic colors (--success, --warning,
--destructive) — same tones the kit's component-badges.html ships.
- Demo view: page header uses `type-page-title` + `type-label` +
`type-body`. Workflow tiles use `.nb-panel` chrome with the kit's
44px icon container (10px radius, terracotta-12% bg, terracotta
fg, 20px Lucide stroke icon — exactly the kit's component-panel.html
pattern).
- Overlay: drawer chrome uses `--bg-primary`, `--border-color`, and
`--shadow-xl` instead of inline `#151413` / `border-edge`. Header
icon buttons are 16px Lucide (kit pill-icon size), rounded-full to
match the kit's icon-button conventions.
No new design tokens were introduced. Every utility class on these
surfaces already existed in src/index.css before this work shipped.
## 2. ModelCapabilityBadge — surfacing what the active model can do
Pattern lifted from open-source projects that route through unified
LLM providers (OpenRouter, pi-ai, LibreChat, OpenWebUI):
- OpenRouter exposes `architecture.input_modalities` /
`output_modalities` per model
- LibreChat shows per-model capability chips next to the picker
- pi-ai's `getModel().inputModalities` is the same shape
NodeBench surfaces them as a compact icon-only row:
- 8 modalities: text, image, pdf, audio, video, web_search, code_exec, tools
- Each is a 24px round Lucide-icon pill (14px stroke icon — kit's
pill icon size)
- Supported: terracotta accent (border + bg + fg via accent CSS vars)
- Unsupported: 50% opacity + line-through (visible but visually
receded — agent users see what's missing without it competing)
- Native title tooltip + role=listitem aria-label per icon
Hand-curated capability registry (`MODEL_CAPABILITIES`) covers the
models NodeBench routes today: Claude Opus/Sonnet/Haiku, GPT-5/4.1/4o,
o1/o3, Gemini 3 Pro/Flash + 2.5 Flash, Grok 4, Kimi k2.6, DeepSeek
v3.5, GLM 4.6V. Unknown models fall back to text-only with a
`(unverified)` tag — HONEST_SCORES, never claim capabilities the model
can't deliver. Long-term path: a Convex action that hits OpenRouter's
/v1/models and caches the modality matrix daily.
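A minimal sketch of the registry lookup with the text-only fallback described above. Only the `claude-opus-4-7` entry is grounded in this PR's verification notes; the type names, the `resolveCapabilities` helper, and the use of a `Map` are assumptions, not the component's actual code.

```typescript
// Hypothetical sketch: type names and resolveCapabilities are assumptions.

type Modality =
  | "text" | "image" | "pdf" | "audio"
  | "video" | "web_search" | "code_exec" | "tools";

const ALL_MODALITIES: readonly Modality[] = [
  "text", "image", "pdf", "audio", "video", "web_search", "code_exec", "tools",
];

// Hand-curated registry; one illustrative entry taken from the
// verification notes in this PR.
const MODEL_CAPABILITIES = new Map<string, readonly Modality[]>([
  ["claude-opus-4-7", ["text", "image", "pdf", "tools"]],
]);

interface CapabilityRow {
  modality: Modality;
  supported: boolean;
}

interface ResolvedCapabilities {
  model: string;
  /** false means the model was not in the registry: render "(unverified)". */
  verified: boolean;
  rows: CapabilityRow[];
}

/** Unknown models fall back to text-only rather than claiming capabilities. */
function resolveCapabilities(model: string): ResolvedCapabilities {
  const known = MODEL_CAPABILITIES.get(model);
  const supported: readonly Modality[] = known ?? ["text"];
  return {
    model,
    verified: known !== undefined,
    rows: ALL_MODALITIES.map((modality) => ({
      modality,
      supported: supported.includes(modality),
    })),
  };
}
```

Returning all 8 rows, supported or not, matches the badge's design of showing unsupported modalities in a receded state rather than hiding them.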
Surfaced in two places this PR:
- /finance-demo header (active orchestrator model)
- FinancialOperatorOverlay header (visible alongside chat surface)
Future PRs can drop it next to FastAgentPanel's model selector and any
other model-aware surface — it's a self-contained component with one
prop (`model: string`).
## Verification
- npx tsc --noEmit: 0 errors
- npx vite build: clean (7.71s)
- Live browser: 4 demo workflows still render 8-10 typed cards each;
StepShell now uses .nb-panel + type-label + type-card-title;
ModelCapabilityBadge shows 4 supported (text/image/pdf/tools) + 4
unsupported (audio/video/web_search/code_exec) for claude-opus-4-7 with
per-icon tooltips and aria-labels
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…surface
User feedback: "it should actually be built into the existing chat page
or chat agent sidebar, add all the new components to the chat, wired
live and used under a toggle called workspace mode."
## What changed
A new `WorkspaceModeToggle` floats on the chat surface (top-right on
desktop, bottom-right above the mobile bottom nav). Clicking it sets
`?ws=1` in the URL; clicking again clears it.
When `?ws=1` is active, the new `WorkspaceModePane`:
- Mounts inside the chat content area (fixed, z-55, padded around
the bottom nav + agent panel so the chat composer below stays live)
- Renders the 4-workflow picker (AT&T 10-K · CRM cleanup · Covenant
compliance · Variance analysis) when no run is active
- Streams the FinancialOperatorTimeline live when a run is active
- Surfaces ModelCapabilityBadge in its header so the user sees what
the active model can/can't do
- Defers to the existing right-side drawer (`FinancialOperatorOverlay`)
when ws=0 — both modes coexist for users who want a side dock
URL state drives everything (`?ws=1`, `?finRun=<id>`) so deep links work
and the chat composer below stays interactive.
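The URL-driven state described above can be sketched as two pure helpers over the query string. The `ws` and `finRun` parameter names come from this commit; `readWorkspaceState` and `toggleWorkspaceMode` are hypothetical names, and whether toggling should also clear `finRun` is not stated here, so this sketch leaves it untouched.

```typescript
// Hypothetical sketch: helper names are assumptions; only the `ws` and
// `finRun` parameter names come from the commit message.

interface WorkspaceUrlState {
  workspaceMode: boolean;
  runId: string | null;
}

/** Derive workspace state from a query string such as "?ws=1&finRun=abc". */
function readWorkspaceState(search: string): WorkspaceUrlState {
  const params = new URLSearchParams(search);
  return {
    workspaceMode: params.get("ws") === "1",
    runId: params.get("finRun"),
  };
}

/** Flip `ws` on or off while preserving every other parameter. */
function toggleWorkspaceMode(search: string): string {
  const params = new URLSearchParams(search);
  if (params.get("ws") === "1") {
    params.delete("ws");
  } else {
    params.set("ws", "1");
  }
  const next = params.toString();
  return next ? `?${next}` : "";
}

console.log(toggleWorkspaceMode("?surface=ask")); // "?surface=ask&ws=1"
console.log(readWorkspaceState("?ws=1&finRun=abc"));
```

Keeping the state in the URL means a chip, button, or MCP tool only needs to produce a link to drive the pane, which is the decoupling this commit relies on.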
## Why not edit FastAgentPanel.tsx
FastAgentPanel.tsx is 3700+ lines. The toggle + pane sit on top of it
via fixed positioning; no surgery on its render tree. Surface coupling
is via URL params only — the same pattern any future caller (chip,
button, MCP tool) can use to drive workspace mode.
## Visibility rule
Toggle hidden on:
- /finance-demo (the page IS workspace already)
- /cli, /pricing, /changelog, /legal, /about, /api-docs (info pages)
- /share/*, /report/*, /embed/* (public/embedded views)
Toggle shown on the root chat surface and ?surface=ask|home variants.
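The visibility rule reads naturally as a deny-list check on the pathname. The paths below are the ones listed in this commit; the `shouldShowWorkspaceToggle` name and the choice to show the toggle on any path not listed are assumptions.

```typescript
// Hypothetical sketch: the function name and the default-to-shown behavior
// for unlisted paths are assumptions.

const HIDDEN_EXACT_PATHS: readonly string[] = [
  "/finance-demo", // the page is already the workspace
  "/cli", "/pricing", "/changelog", "/legal", "/about", "/api-docs",
];

const HIDDEN_PATH_PREFIXES: readonly string[] = [
  "/share/", "/report/", "/embed/", // public or embedded views
];

function shouldShowWorkspaceToggle(pathname: string): boolean {
  if (HIDDEN_EXACT_PATHS.includes(pathname)) return false;
  if (HIDDEN_PATH_PREFIXES.some((prefix) => pathname.startsWith(prefix))) {
    return false;
  }
  return true;
}

console.log(shouldShowWorkspaceToggle("/")); // true
console.log(shouldShowWorkspaceToggle("/share/abc123")); // false
```

The `?surface=ask|home` variants live in the query string, so they share the root pathname and need no separate entry.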
## Verification
- npx tsc --noEmit: 0 errors
- npx vite build: clean (210 PWA entries)
- Live browser:
- Toggle visible on /?surface=home with aria-label "Enter workspace mode"
- Click → URL gets ?ws=1, pane mounts (role=region "Workspace mode")
- Pane shows 4 demo tiles + model capability badge + Exit button
- Click "Covenant compliance" → 9 typed cards stream inline
(Plan → Tool → Extraction×2 → Validation → Calculation → Evidence
→ Artifact → Result) with the run id in the URL
- "Back to picker" returns to the 4-tile state
- "Close" / "Exit workspace" returns to plain chat
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…et-ed0e9b
# Conflicts:
# convex/_generated/api.d.ts
# convex/domains/financialOperator/index.ts
# src/features/financialOperator/components/ApprovalCard.tsx
# src/features/financialOperator/components/ArtifactCard.tsx
# src/features/financialOperator/components/CalculationCard.tsx
# src/features/financialOperator/components/EvidenceCard.tsx
# src/features/financialOperator/components/ResultCard.tsx
# src/features/financialOperator/components/StepShell.tsx
# src/features/financialOperator/components/StepStatusBadge.tsx
# src/features/financialOperator/components/ToolCallCard.tsx
# src/features/financialOperator/index.ts
# src/features/financialOperator/views/FinancialOperatorDemo.tsx
✅ Dogfood Visual QA Gate: PASSED
… through
User QA caught a broken UI: workspace mode rendered with the home surface visible behind it (greeting, sidebar, watchlist, search input all stacking with the operator-console pane).
## Root cause
Tailwind's `/95` opacity modifier does NOT work on CSS-var arbitrary values without the `color:` prefix. The class `bg-[var(--bg-app)]/95` resolved to `rgba(0,0,0,0)`, which is fully transparent.
A second issue compounded it: `--bg-app` is in the kit reference (colors_and_type.css) but is NOT defined in the live repo's src/index.css. The repo has `--bg-primary` / `--bg-secondary` only. So even the unmodified `var(--bg-app)` would have resolved to nothing.
## Fix
- Use `--bg-primary` (defined: #FFFFFF light, dark variant in dark mode) as the pane base color, set via inline `style` to bypass any Tailwind quirks with CSS-var opacity arbitrary values.
- Bump pane to `z-[80]` (above modals at z-50, toasts at z-60). The toggle bumped to `z-[85]` so users can dismiss mid-run without hunting inside the pane.
- Add `isolate` for a clean stacking context; this prevents any future z-leak from the home surface beneath.
- Inline-comment the var-opacity gotcha so the next developer doesn't re-introduce it.
## Verification (per dogfood_verification.md)
- npx tsc --noEmit: 0 errors
- Live browser screenshot: clean opaque pane, header readable, 4 demo tiles in 2x2 grid, model capability badge with 4 supported + 4 unsupported icons, no home-surface bleed-through
- Run flow: clicked AT&T 10-K → 9 typed cards stream inline (Plan → Tool×2 → Extraction → Validation → Calculation → Evidence → Artifact → Result), all using .nb-panel chrome with proper status badges and source-cited fields
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… scroll / composer)
User QA caught: workspace mode was overlaying the chat surface instead
of building on top of the existing chat layout. The kit's canonical
chat shell (ui_kits/nodebench-web/ChatThread.jsx) is:
header (sticky top, entity icon + title + meta + actions)
↓
scrollable thread (turns / operator console cards)
↓
composer (pinned bottom: pins · field · model + caps · suggested chips)
The model selector + capability indicators belong IN the composer (per
the design board reference + the kit's Composer.jsx), not floating in
the header.
## What changed
WorkspaceModePane now renders as a 3-row CSS grid mirroring the kit:
- Row 1 (header): kit's .nb-chat-header pattern — entity icon
(terracotta squircle with sparkle), kicker, title, meta in mono
font, Picker + Close actions
- Row 2 (scroll): demo picker (no run) OR FinancialOperatorTimeline
(active run) inside a max-w-3xl container
- Row 3 (composer): new WorkspaceComposer component
WorkspaceComposer follows the kit's composer shape exactly:
- Pin row: "EVENT Ship Demo Day ×" + "+ Add context" (matches design
board reference)
- Field row: paperclip + link + mic icons (15px stroke) | textarea
"Ask, capture, paste, upload, or record…" | terracotta send button
- Below field: MODEL claude-opus-4-7 + 8 capability icons (text /
image / pdf / audio / video / web_search / code_exec / tools with
supported vs muted variants and per-icon tooltips) | Memory-first ·
0 paid calls in mono
- Suggested chips: Run AT&T 10-K demo · Run CRM cleanup · Run
covenant compliance · Run variance analysis
The composer is interactive: typing a prompt that matches a known
workflow regex starts that demo (e.g. "AT&T 10-K cost of debt" →
runAttCostOfDebtDemo). Send falls back to dispatching a custom
`nb:workspace:compose` event for any other panel listening (so future
FastAgentPanel integration can hook in without surgery).
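The composer's send behavior (match a known workflow, otherwise hand off via the `nb:workspace:compose` event) can be sketched as a pure routing function. The event name comes from this commit; the workflow ids, the regex patterns, and `routePrompt` are hypothetical, and the caller is assumed to perform the actual action call or event dispatch based on the returned outcome.

```typescript
// Hypothetical sketch: workflow ids, patterns, and routePrompt are
// assumptions; only the event name comes from the commit message.

type WorkflowId =
  | "att_cost_of_debt"
  | "crm_cleanup"
  | "covenant_compliance"
  | "variance_analysis";

// First matching pattern wins, so order the more specific workflows first.
const WORKFLOW_PATTERNS: ReadonlyArray<{ id: WorkflowId; pattern: RegExp }> = [
  { id: "att_cost_of_debt", pattern: /at&t|10-k|cost of debt/i },
  { id: "crm_cleanup", pattern: /crm|dedup|cleanup/i },
  { id: "covenant_compliance", pattern: /covenant|leverage/i },
  { id: "variance_analysis", pattern: /variance/i },
];

type ComposeOutcome =
  | { kind: "workflow"; id: WorkflowId }
  | { kind: "event"; name: "nb:workspace:compose"; prompt: string };

/** Decide what Send should do with the typed prompt. */
function routePrompt(prompt: string): ComposeOutcome {
  const text = prompt.trim();
  for (const { id, pattern } of WORKFLOW_PATTERNS) {
    if (pattern.test(text)) return { kind: "workflow", id };
  }
  // No known workflow: let any listening panel handle the prompt.
  return { kind: "event", name: "nb:workspace:compose", prompt: text };
}

console.log(routePrompt("AT&T 10-K cost of debt"));
console.log(routePrompt("summarize this thread"));
```

Returning a description of the outcome instead of dispatching directly keeps the routing testable outside the browser; in the UI, the `event` branch would presumably map to a `CustomEvent` dispatch on `window`.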
## Verification
- npx tsc --noEmit: 0 errors
- npx vite build: clean (210 PWA entries)
- Live browser screenshot (kit-aligned at mobile width):
- Empty state: header with WORKSPACE MODE / Pick a workflow / "4
canonical workflows · math sandboxed · approval-gated" meta + Close
button; scrollable middle with 4 demo tiles in 2x2; composer pinned
bottom with all canonical pieces (pins, attach, textarea, send,
model badge with capabilities, Memory-first hint, 4 suggested chips)
- Active run: header switches to "Live operator-console run", scroll
area renders the typed-card timeline (RUN header → Plan → 2x Tool →
Extraction → Validation → Calculation → Evidence → Artifact),
composer stays pinned and never overlaps content; capability badge
sits inside the composer where the kit puts it
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
Closes the 4 follow-ups from PR #204 plus 2 corrections from review:
- … `no-cache` and assets to `max-age=31536000, immutable`
- `FinancialOperatorOverlay` global drawer (URL-param-driven, no FastAgentPanel surgery)
- … `runRealCostOfDebtFromPdf`)
- … `.nb-panel` / `.type-card-title` / `.type-label` with `var(--accent-primary)`; no new design tokens introduced
## Verification (per .claude/rules/live_dom_verification.md)
- `npx tsc --noEmit`: 0 errors
- `npx vite build`: clean (7.71s last run)
- `npx vitest run convex/domains/financialOperator/__tests__/`: 19/19
- `npx convex dev --once --typecheck=enable`: clean
- … `vercel-deploy-hook-backup.yml`
## Test plan
- `npx tsc --noEmit` clean
- `npx vitest run convex/domains/financialOperator/__tests__/` clean
- `npx vite build` clean
- … `/finance-demo` after deploy for kit-aligned cards + model badge
## What changed at a glance
New convex domain: `convex/domains/financialOperator/`
- `orchestratorExamples.ts`: runCrmCleanupDemo / runCovenantComplianceDemo / runVarianceAnalysisDemo
- `realExtractors.ts`: runRealCostOfDebtFromPdf (Claude PDF input + structured output)
- `fixtures/{crm,covenant,variance}Fixture.ts`: pinned demo data
New frontend feature: `src/features/financialOperator/`
- `components/ModelCapabilityBadge.tsx`: 8-modality capability grid + curated registry
- `components/FinancialOperatorOverlay.tsx`: global URL-param-driven drawer
- `views/FinancialOperatorDemo.tsx`: 4-workflow picker (kit-aligned)
- … `.nb-panel`, `.type-card-title`, `.type-label`, `var(--accent-primary)`
Infra:
- `.github/workflows/vercel-deploy-hook-backup.yml`: 60s wait + Tier-A poll
- `vercel.json`: HTML no-cache + assets immutable cache-control
Docs:
- `docs/architecture/FINANCIAL_OPERATOR_DESIGN_ALIGNMENT.md`: surface-by-surface kit alignment for web/mobile/workspace/CLI
🤖 Generated with Claude Code