feat(web): operations-first control plane UI revamp with live updates and node logs by santoshkumarradha · Pull Request #330 · Agent-Field/agentfield

santoshkumarradha · 2026-04-04T18:01:25Z

Summary

Refreshes the embedded control plane web UI into an operations-first control surface and now also folds in the full live-update stack plus agent node process logs.

Final integrated branch tip: 5236e69

Included scope

This PR now combines three previously separate lines of work into one final integration PR:

Operations-first control plane UI revamp

shell and navigation refresh
dashboard, runs, agents, reasoners, settings, access, and provenance UX cleanup
tighter table/badge/status patterns

Unified live updates and adaptive polling

shared SSESyncProvider at the app shell
execution, node, and reasoner event-driven query invalidation
adaptive fallback polling when the relevant SSE stream is unavailable
health strip and page-level live status aligned with the actual backing stream
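The adaptive fallback described above can be sketched as a pure function that picks a polling interval from the state of the relevant stream. This is a minimal sketch, not the code on this branch: `StreamState`, `pollIntervalMs`, and the interval values are hypothetical.

```typescript
// Hypothetical connection states for one SSE stream (names are not from this PR).
type StreamState = "open" | "connecting" | "closed";

// Returns a refetch interval in milliseconds, or false to disable polling.
// Assumption: while the stream is open, events drive query invalidation,
// so polling is switched off entirely.
function pollIntervalMs(stream: StreamState, hasActiveWork: boolean): number | false {
  if (stream === "open") return false;
  // Stream unavailable: poll faster while something is running (values are illustrative).
  return hasActiveWork ? 3000 : 15000;
}
```

The `number | false` return shape matches what TanStack Query accepts for `refetchInterval`, so a value like this could be passed straight into a query's options.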

Agent node process logs and execution observability

control-plane proxy for GET /api/ui/v1/nodes/:id/logs
settings for node log proxy tail and timeout limits
NodeProcessLogsPanel on Agents and Node Detail pages
structured execution logs on the execution detail page
advanced raw node log debugging integrated into the execution experience
NDJSON process log capture in Go, Python, and TypeScript SDKs
execution-context stamping and structured execution log ingestion across SDK/runtime/control-plane layers
docs and functional coverage for the observability flow in the shared tests/functional harness
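For readers unfamiliar with NDJSON, the process log format above is one JSON object per line. The sketch below shows one way a client could parse such a body; the `ProcessLogEntry` fields (`ts`, `stream`, `line`) are assumptions for illustration, since the real schema is not shown in this PR text.

```typescript
// Hypothetical shape of one process log record (assumed, not from the PR).
interface ProcessLogEntry {
  ts: string;
  stream: "stdout" | "stderr";
  line: string;
}

// Parses an NDJSON body: one JSON object per line, blank lines skipped.
// Malformed lines are collected instead of thrown, so one bad line
// does not discard the rest of the tail.
function parseNdjson(body: string): { entries: ProcessLogEntry[]; malformed: string[] } {
  const entries: ProcessLogEntry[] = [];
  const malformed: string[] = [];
  for (const raw of body.split("\n")) {
    const text = raw.trim();
    if (text === "") continue;
    try {
      entries.push(JSON.parse(text) as ProcessLogEntry);
    } catch {
      malformed.push(text);
    }
  }
  return { entries, malformed };
}
```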

Product direction

Operations before analytics: health, queues, runs, and recovery actions come first.
Layered depth: shell and health strip stay persistent, while runs, DAG, and step detail handle diagnosis.
Recovery is normal: retry, cancel, cleanup, and next-step affordances stay close to degraded states.
Control-plane mental model: nodes and executions are the moving parts; workflows are the jobs.

Review and integration notes

Earlier stacked PRs for live updates and node logs were folded into this branch and superseded.
The SSE provider-based design won during final integration; older page-local invalidation wiring was removed.
The node logs proxy follows the same control-plane-to-agent trust model already used on execute paths.
Draft PR #342 (docs: execution observability RFC) was merged into this branch as the execution-observability implementation line and should be treated as already integrated here.

Issue linkage

Fixes #324.

Validation status

GitHub CI is the source of truth for this final integrated branch and is running on PR #330 after commit 5236e69.

Local verification on the integrated branch included:

npm exec -- tsc --noEmit
go test ./internal/handlers/... ./internal/storage/... ./internal/server/... -count=1
shared docker functional run for tests/test_go_sdk_cli.py -k observability

Previous review fixes still included

audit verification hardening for HTTP verification path
DID did:web encoded-port handling fix
SSE/CORS hardening already present on this branch
TypeScript multimodal helper fixes and tests already present on this branch

…tputs

Replace all @phosphor-icons/react imports with lucide-react equivalents. Rewrote icon-bridge.tsx to re-export Lucide icons under the same names used throughout the codebase, so no consumer files needed changing. Updated icon.tsx to use Lucide directly. Removed weight= props from badge.tsx, segmented-status-filter.tsx, and ReasonerCard.tsx since Lucide does not support the Phosphor weight API.

…tures being redesigned

- Delete src/components/mcp/ (MCPServerList, MCPServerCard, MCPHealthIndicator, MCPServerControls, MCPToolExplorer, MCPToolTester)
- Delete src/components/authorization/ (AccessRulesTab, AgentTagsTab, ApproveWithContextDialog, PolicyContextPanel, PolicyFormDialog, RevokeDialog)
- Delete src/components/packages/ (AgentPackageCard, AgentPackageList)
- Delete src/components/did/ (DIDIdentityCard, DIDDisplay, DIDStatusBadge, DIDInfoModal, DIDIndicator)
- Delete src/components/vc/ (VCVerificationCard, WorkflowVCChain, SimpleVCTag, SimpleWorkflowVC, VCDetailsModal, VCStatusIndicator, VerifiableCredentialBadge)
- Delete MCP hooks: useMCPHealth, useMCPMetrics, useMCPServers, useMCPTools
- Delete pages: AuthorizationPage, CredentialsPage, DIDExplorerPage, PackagesPage, WorkflowDeckGLTestPage
- Remove MCP Servers, Tools, Performance tabs from NodeDetailPage
- Remove Identity & Trust and Authorization sections from navigation config
- Remove deprecated routes from App.tsx router
- Fix broken imports in WorkflowDetailPage, ReasonerDetailPage
- Trim src/mcp/index.ts barrel to API services + types only (no component re-exports)

API services (vcApi, mcpApi), types, and non-MCP hooks are preserved. TypeScript check passes with zero errors after cleanup.

… foundation system

- Rewrote src/index.css with clean standard shadcn/ui theme (HSL tokens for light/dark mode)
- Deleted src/styles/foundation.css (custom token system)
- Rewrote tailwind.config.js to minimal shadcn-standard config (removed custom spacing, fontSize, lineHeight, transitionDuration overrides)
- Replaced ~130 component files: bg-bg-*, text-text-*, border-border-*, text-nav-*, bg-nav-*, text-heading-*, text-body*, text-caption, text-label, text-display, interactive-hover, card-elevated, focus-ring, glass, gradient-* with standard shadcn equivalents
- Migrated status sub-tokens (status-success-bg, status-success-light, status-success-border etc.) to opacity modifiers on base status tokens
- Updated lib/theme.ts STATUS_TONES to use standard token classes
- Fixed workflow-table.css status dot and node status colors to use hsl(var(--status-*))
- Zero TypeScript errors after migration

…search

…ariants

- Delete 4 JSON viewer duplicates (JsonViewer, EnhancedJsonViewer x2, AdvancedJsonViewer); all callers already use UnifiedJsonViewer
- Delete 3 execution header duplicates (ExecutionHero, ExecutionHeader, EnhancedExecutionHeader); update RedesignedExecutionDetailPage to use CompactExecutionHeader
- Delete 3 status indicator duplicates (ui/StatusIndicator, ui/status-indicator, reasoners/StatusIndicator); consolidate legacy StatusIndicator into UnifiedStatusIndicator module and create ReasonerStatusDot for reasoner-specific dot display
- Delete RedesignedInputDataPanel and RedesignedOutputDataPanel standalone files; InputDataPanel/OutputDataPanel already export backward-compat aliases
- Delete legacy Navigation/Sidebar, NavigationItem, NavigationSection (unused; SidebarNew is active)
- Delete enterprise-card.tsx (no callers; card.tsx already exports cardVariants)
- Delete animated-tabs.tsx; add AnimatedTabs* re-exports to tabs.tsx and update 5 callers

…dark mode default

- Rewrote navigation config with 5 items: Dashboard, Runs, Agents, Playground, Settings
- Built AppSidebar using shadcn Sidebar with icon-rail collapsed by default (collapsible="icon")
- Built HealthStrip sticky bar showing LLM, Agent fleet, and Queue status placeholders
- Built AppLayout using SidebarProvider/SidebarInset/Outlet pattern with breadcrumb header
- Updated App.tsx to use AppLayout as layout route wrapper, removing old SidebarNew/TopNavigation
- Added placeholder routes for /runs, /playground and their detail pages
- Set defaultTheme="dark" for dark-first UI
- All existing pages (Dashboard, Executions, Workflows, Nodes, Reasoners) preserved under new layout

…ents, health

- Install @tanstack/react-query v5
- Create src/lib/query-client.ts with 30s stale time, 5min GC, retry=1
- Wrap App with QueryClientProvider
- Add src/hooks/queries/ with useRuns, useRunDAG, useStepDetail, useAgents, useLLMHealth, useQueueStatus, useCancelExecution, usePauseExecution, useResumeExecution
- Barrel export via src/hooks/queries/index.ts
- Hooks delegate to existing service functions (workflowsApi, executionsApi, api)
- Polling: agents 10s, system health 5s, active run DAGs 3s

…lows

Add RunsPage component at /runs with:

- Filter bar: time range, status, and debounced search
- Table with columns: Run ID, Root Reasoner, Steps, Status, Duration, Started
- Checkbox row selection with bulk action bar (Compare / Cancel Running)
- Paginated data via useRuns hook with Load more support
- Status badge using existing badge variants (destructive/default/secondary)
- Duration formatting (Xs, Xm Ys, —)
- Row click navigates to /runs/:runId

Wire RunsPage into App.tsx replacing the placeholder at /runs.

…iew results

Adds a new /playground and /playground/:reasonerId route with:

- Reasoner selector grouped by agent node
- Split-pane JSON input textarea and result display
- Execute button with loading state (Loader2 spinner)
- View as Execution link on successful run
- Recent Runs table (last 5) with Load Input shortcut
- Route-sync: selecting a reasoner updates the URL path

…t runs

Replaces /dashboard with NewDashboardPage, a focused, operations-first view that answers "Is anything broken? What's happening now?" rather than displaying metrics charts. The legacy enhanced dashboard is preserved at /dashboard/legacy.

Key sections:

- Issues banner (conditional): surfaces unhealthy LLM endpoints and queue-saturated agents via useLLMHealth / useQueueStatus polling
- Recent Runs table: last 10 runs with reasoner, step count, status badge, duration, and relative start time; click navigates to detail
- System Overview: 4 stat cards (Total Runs Today, Success Rate, Agents Online, Avg Run Time) backed by dashboardService + TanStack Query with auto-refresh

…ner list

Replaces the /agents placeholder with a fully functional page showing each registered agent node as a collapsible Card. Each card displays status badge with live dot, last heartbeat, reasoner/skill count, health score, and an inline reasoner list fetched lazily from GET /nodes/:id/details. Supports Restart and Config actions. Auto-refreshes every 10 s via useAgents polling.

…tity, About

Adds NewSettingsPage with four tabs:

- General: placeholder for future system config
- Observability: full webhook config (migrated from ObservabilityWebhookSettingsPage) with live forwarder status and DLQ management
- Identity: DID system status, server DID display, export credentials
- About: version, server URL, storage mode

Updates App.tsx to route /settings to NewSettingsPage and redirect /settings/observability-webhook to /settings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds RunDetailPage (/runs/:runId) as the primary execution inspection screen, replacing the placeholder. Features a split-panel layout with a proportional-bar execution trace tree on the left and collapsible Input/Output/Notes step detail on the right. Single-step runs skip the trace and show step detail directly. Includes smart polling for active runs and a Trace/Graph toggle (graph view placeholder).

New files:

- src/pages/RunDetailPage.tsx: main page, wires useRunDAG + state
- src/components/RunTrace.tsx: recursive trace tree with duration bars
- src/components/StepDetail.tsx: step I/O panel with collapsible sections

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Replace 4 stat cards with a horizontal stats strip above the table
- Fix duration formatter to handle hours and days (e.g. "31d 6h", "5h 23m")
- Compact table rows: TableHead h-8 px-3 text-[11px], TableCell px-3 py-1.5
- Table text reduced to text-xs for all data columns
- Remove double padding: page container is now plain flex col gap-4
- Remove Separator between CardHeader and table
- Tighten CardHeader to py-3 px-4 with text-sm font-medium title
- Limit recent runs to 15 (up from 10)
- Fix "View All" link to navigate to /runs instead of /workflows
- Remove unused StatCard component and Clock/XCircle imports

…clickable headers

- Cards start collapsed by default (was open): prevents 300+ item flood with 15 agents × 20+ reasoners
- Entire card header row is the expand/collapse trigger (was isolated chevron button on far right)
- Reasoner rows reduced to py-1 ~24px (was ~40px with tree characters)
- Removed tree characters (├──), replaced with clean font-mono list
- Play button always visible (was hidden on hover) with icon + label
- Truncate reasoner list at 5, "Show N more" link to expand
- Removed Config button and Restart text label; icon-only restart button
- Removed redundant "15 TOTAL" badge from page header
- Replaced space-y-* with flex flex-col gap-2 for card list
- Removed Card/CardHeader/CardContent/Collapsible/Separator; plain divs for density

- TableHead height reduced from h-10 to h-8, padding px-4 → px-3, text-[11px]
- TableCell padding reduced from p-4 to px-3 py-1.5 across all row cells
- Table base text changed from text-sm to text-xs for dense data display
- Run ID and Started cells use text-[11px], Reasoner cell uses text-xs font-medium
- Steps and Duration cells use tabular-nums for numeric alignment
- formatDuration now handles ms, seconds, minutes, hours, and days correctly
- space-y-4 → space-y-3 and mb-4 → mb-3 for tighter page layout
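A duration formatter covering ms through days could look like the sketch below. It is an illustration only, written to reproduce the example outputs quoted in these commit messages ("31d 6h", "5h 23m"); the name `formatDurationSketch`, the two-unit truncation, and the placeholder for missing values are assumptions, not the branch's actual formatDuration.

```typescript
// Formats a duration in milliseconds using at most two units.
// Assumption: missing or invalid input renders as a dash placeholder.
function formatDurationSketch(ms: number | null | undefined): string {
  if (ms == null || !Number.isFinite(ms) || ms < 0) return "\u2014";
  if (ms < 1000) return `${Math.floor(ms)}ms`;

  const totalSeconds = Math.floor(ms / 1000);
  const days = Math.floor(totalSeconds / 86400);
  const hours = Math.floor((totalSeconds % 86400) / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;

  // Show the largest non-zero unit plus the next smaller one.
  if (days > 0) return `${days}d ${hours}h`;
  if (hours > 0) return `${hours}h ${minutes}m`;
  if (minutes > 0) return `${minutes}m ${seconds}s`;
  return `${seconds}s`;
}
```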

…imestamps

- Rewrite AgentsPage from bordered cards to a borderless divide-y list inside a single Card
- Fix formatRelativeTime to guard against bogus/epoch timestamps (was showing '739709d ago')
- Expanded reasoner rows now render inline (bg-muted/30, pl-8, text-[11px]) instead of in a nested Card
- Remove page <h1> heading from AgentsPage; breadcrumb in AppLayout already identifies the page
- Add delayDuration={300} to HealthStrip TooltipProvider so tooltips don't appear immediately
- navigation.ts already correct (5 items, correct icons); no change needed
- Dashboard already reads runsQuery.data?.workflows and navigates to /runs; no change needed
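The '739709d ago' symptom is what a naive relative-time formatter produces for a zero-value timestamp (roughly year 0001). A guard of the kind described could be sketched as follows; the cutoff date, the placeholder string, and the name `formatRelativeTimeSketch` are assumptions, and the actual fix on this branch may differ.

```typescript
// Hypothetical cutoff: anything earlier is treated as "no real timestamp".
const MIN_PLAUSIBLE_MS = Date.UTC(2000, 0, 1);

// Formats an ISO timestamp relative to nowMs, guarding against
// missing, unparsable, zero-value, and epoch timestamps.
function formatRelativeTimeSketch(iso: string | null | undefined, nowMs: number = Date.now()): string {
  if (!iso) return "\u2014";
  const t = Date.parse(iso);
  if (Number.isNaN(t) || t < MIN_PLAUSIBLE_MS) return "\u2014";

  // Clamp small clock skew (timestamps slightly in the future) to zero.
  const diffSec = Math.max(0, Math.floor((nowMs - t) / 1000));
  if (diffSec < 60) return `${diffSec}s ago`;
  if (diffSec < 3600) return `${Math.floor(diffSec / 60)}m ago`;
  if (diffSec < 86400) return `${Math.floor(diffSec / 3600)}h ago`;
  return `${Math.floor(diffSec / 86400)}d ago`;
}
```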

…ips, theme toggle

- Use useSidebar() state to conditionally render logo text vs icon-only in collapsed mode, eliminating text overflow/clipping behind the icon rail
- Add SidebarRail for drag-to-resize handle on desktop
- Add SidebarSeparator between header and nav content for visual separation
- Implement ModeToggle in SidebarFooter (sun/moon theme toggle, centered when collapsed)
- Replace bg-primary/text-primary-foreground with bg-sidebar-primary/text-sidebar-primary-foreground in logo icon container to use correct semantic sidebar tokens
- Use text-sidebar-foreground and text-sidebar-foreground/60 for logo text
- Add tooltip="AgentField" to logo SidebarMenuButton so collapsed state shows tooltip on hover
- Header bar: use border-sidebar-border and bg-sidebar/30 backdrop-blur instead of border-border

…result linking

- Add cURL dropdown with sync and async variants; clipboard copy with "Copied!" feedback
- Add collapsible schema section showing input_schema and output_schema when a reasoner is selected
- Show status badge and duration in Result card header after execution
- Replace "View as Execution" with "View Run →" linking to /runs/:runId
- Add "Replay" button to re-run with same input

…observability check

- AppLayout: change SidebarProvider defaultOpen from false to true so sidebar shows labels on first load (users can collapse via Cmd+B)
- Settings/General: replace empty placeholder with useful content: API endpoint display with copy button and quick-start env var snippet
- Settings/Identity: fix Server DID display, which was incorrectly showing res.message (a status string) as the DID; now fetches the actual DID from /api/v1/did/agentfield-server and displays it with a copy button; shows "DID system not configured" when unavailable (local mode)
- Settings: default tab remains "general" which is now useful content
- Settings/Observability: tab already has full webhook config, status, DLQ management; no changes needed

…uttons

- Dashboard: routes already correct (/runs/:runId and /runs)
- Playground: View Run link already uses /runs/:runId
- HealthStrip: connected to real data (useLLMHealth, useQueueStatus, useAgents)
- RunsPage: added agent filter Select, functional Compare Selected and Cancel Running buttons
- RunDetailPage: removed broken Trace/Graph toggle (Tabs/ViewMode were declared but unused), added Cancel Run button (useCancelExecution) for running runs and Replay button for failed/timeout runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…opy, VC export

- StepDetail: replace plain <pre> blocks with JsonHighlight component (regex-based coloring for keys, strings, numbers, booleans, null)
- StepDetail: add copy-action row (Copy cURL, Copy Input, Copy Output) with transient check-icon feedback after clipboard write
- RunDetailPage: add Export VC button in header that opens the /api/v1/did/workflow/:id/vc-chain endpoint in a new tab
- RunTrace: extend formatDuration to handle hours (Xh Ym) and days (Xd Yh)
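Regex-based JSON coloring of the kind mentioned for JsonHighlight usually works by matching one token at a time and classifying it. The sketch below shows only that tokenizing step, under the assumption that the input is already pretty-printed valid JSON; `TOKEN_RE`, `classify`, and `tokenizeJson` are hypothetical names, and the real component's regex and rendering are not shown in this PR text.

```typescript
type TokenKind = "key" | "string" | "number" | "boolean" | "null";

interface Token {
  kind: TokenKind;
  text: string;
}

// One alternative per token class: strings (optionally followed by a colon,
// which marks an object key), booleans, null, and numbers.
const TOKEN_RE = /"(?:\\.|[^"\\])*"(?:\s*:)?|\btrue\b|\bfalse\b|\bnull\b|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/g;

function classify(text: string): TokenKind {
  if (text[0] === '"') return text[text.length - 1] === ":" ? "key" : "string";
  if (text === "true" || text === "false") return "boolean";
  if (text === "null") return "null";
  return "number";
}

function tokenizeJson(json: string): Token[] {
  const tokens: Token[] = [];
  // replace() is used only to visit each match; its return value is ignored.
  json.replace(TOKEN_RE, (text) => {
    tokens.push({ kind: classify(text), text });
    return text;
  });
  return tokens;
}
```

A renderer would then wrap each token's text in an element whose class depends on `kind`, leaving punctuation and whitespace between matches uncolored.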

…order, status dots

- Add click-to-sort on Status, Steps, Duration, and Started headers with asc/desc arrow indicators; sort state flows through useRuns to the API
- Reorder columns: Status | Reasoner | Agent | Steps | Duration | Started | Run ID (status first for scannability, run ID de-emphasised at the far right)
- Add Agent column showing agent_id / agent_name per row
- Replace Badge with a compact StatusDot (coloured dot + short label) for denser status display in table rows
- Update search placeholder to "Search runs, reasoners, agents…" to reflect multi-field search capability
- Import cn from @/lib/utils for conditional class merging

Wire up the existing WorkflowDAGViewer component into the Run Detail page as a proper Graph tab alongside the Trace view. Multi-step runs show a Trace/Graph toggle in the header; single-step runs skip the toggle entirely and show step detail directly. Clicking a node in the graph panel selects the step and populates the right-hand detail panel.

…mpty state

- Add copy button next to each Run ID (copies full ID to clipboard)
- Combine Agent + Reasoner columns into a single "Target" column showing agent.reasoner in monospace (agent part muted, reasoner part primary)
- Remove separate Agent column; new order: Status | Target | Steps | Duration | Started | Run ID
- Add HoverCard on reasoner cell that lazily fetches and displays root execution input/output preview (only when root_execution_id is present)
- Replace plain "No runs found" cell with a centered empty state using Play icon and context-aware helper text
- TypeScript: 0 errors

…n, active sidebar

- RunDetailPage: flex column layout with h-[calc(100vh-8rem)] so trace/step panels fill the viewport instead of using fixed 500px heights
- Reorganized header: status badge and DID badge inline with title, subtitle shows workflow name + step count + duration
- Added Replay button (navigates to playground with agent/reasoner target)
- Added Copy ID button for quick clipboard access to the run ID
- Replaced single Export VC button with an Export dropdown containing "Export VC Chain" and "Export Audit Log" (downloads JSON)
- AppSidebar: active nav item now renders a left-edge accent bar (before:w-0.5 bg-sidebar-primary) for clear visual distinction in both light and dark mode, supplementing the existing bg-sidebar-accent fill

…roup separators

- Add sequential step numbers (1-based) on every trace row for disambiguation
- Show relative start times per step ("+0:00", "+1:23") anchored to run start
- Color-code duration bars: green=succeeded, red=failed, amber=timeout, blue/pulse=running
- Replace large status icons with compact inline status dots (size-1.5)
- Add group count badge (×N) on first node of consecutive same-reasoner runs
- Add subtle border separator when reasoner_id changes between siblings
- Reduce row height to py-1 (28px) for better visual density
- Pass runStartedAt prop from RunDetailPage down to RunTrace
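The per-step offsets ("+0:00", "+1:23") are the step start time minus the run start time, rendered as minutes and seconds. A minimal sketch, assuming both times are available as epoch milliseconds; `formatStartOffset` is a hypothetical helper, not the code in RunTrace:

```typescript
// Renders the offset of a step from the run start as "+M:SS".
// Negative offsets (step recorded before the run start) are clamped to zero.
function formatStartOffset(runStartedAtMs: number, stepStartedAtMs: number): string {
  const totalSeconds = Math.max(0, Math.floor((stepStartedAtMs - runStartedAtMs) / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const paddedSeconds = seconds < 10 ? `0${seconds}` : `${seconds}`;
  return `+${minutes}:${paddedSeconds}`;
}
```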

Adds a CommandPalette component using shadcn Command + Dialog, registered globally via AppLayout. Cmd+K / Ctrl+K toggles the palette; items navigate to Dashboard, Runs, Agents, Playground, Settings, and filtered run views. A ⌘K hint badge is shown in the header bar on medium+ screens.

santoshkumarradha · 2026-04-05T08:33:51Z

@AbirAbbas ready to go after your review.

CC: @Sridhar-Vetrivel / @SivasankaranPSIOG

…roduct-research

- Use constant-time comparison (crypto/subtle) for bearer token validation in Go SDK process logs endpoint
- Add MaxBytesReader (10 MiB) to execution logs ingestion handler to prevent memory exhaustion from oversized payloads
- Remove accidentally committed .cursor/ IDE state files
- Add .cursor/ to .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

santoshkumarradha and others added 30 commits April 1, 2026 19:38

feat(sdk/ts): add multimodal helpers for image, audio, file inputs/ou…

004ba87

…tputs

Merge branch 'worktree-agent-aeca2023' into feat/ui-revamp-product-re…

4f7de4d

…search

feat(ui): add URL redirects from old routes to new pages

70f4e7b

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

This was referenced Apr 5, 2026

Hard to troubleshoot, agent stays running for hour #316

Open

test: comprehensive test coverage + behavioral invariant tests #337

Closed

docs: add execution observability RFC

e57c306

santoshkumarradha mentioned this pull request Apr 5, 2026

docs: execution observability RFC #342

Merged

santoshkumarradha added 23 commits April 6, 2026 00:25

agit sync

31420b0

agit sync

e0950b9

agit sync

91b2d2b

agit sync

99885ff

agit sync

f087db6

agit sync

e18414e

agit sync

2f375ab

feat: add execution logging transport

c93d529

feat(py): add structured execution logging

b1741ed

agit sync

db0b50c

feat(web): unify execution observability panel

40039e6

Remove local plandb database from repo

32638df

fix demo execution observability flow

9f46d1d

refactor run detail into execution and logs tabs

6bde0ab

refine execution logs density

f772213

agit sync

e7ee815

polish execution logs panel header

e309984

refine raw node log console

7cf61df

polish process log filter toolbar

a390a5d

standardize observability spacing primitives

4ad363b

agit sync

2fa67be

test execution observability in functional harness

1c2f91e

Merge branch 'feat/execution-observability-rfc' into feat/ui-revamp-p…

5236e69

…roduct-research

santoshkumarradha mentioned this pull request Apr 5, 2026

UI: Live execution log viewer (SSE) #324

Open

santoshkumarradha wants to merge 107 commits into main from feat/ui-revamp-product-research
3 participants