feat: service detail overview tab with AI brief + MCP-powered panels #45
Merged
added 23 commits
March 26, 2026 11:49
…Schema
Creates src/types/service-brief.ts with all 10 types required by the Service Detail Overview tab: BriefDependencyNode/Edge, Deployment, MergeRequest, ConfigChange, ContainerStatus, K8sEvent, SectionStatus, AISummary, and ServiceBrief. Adds gitlabProject: z.string().optional() to ServiceSchema so config.yaml can wire a service to a GitLab project.
- Make SectionStatus.fetchedAt optional (no timestamp for unconfigured/error states)
- Extract ChangesSection and InfrastructureSection as named exported interfaces
- Extract DependencyGraphSource as a named exported type
- Widen Deployment.pipelineStatus to string with doc comment listing known values
- Fix BriefDependencyEdge doc comment (was "DependencyNode concept")
…d AI summary
Implements buildServiceBrief(), the core backend aggregator for the Service Detail Overview tab. Queries changes (GitLab MCP), infrastructure (K8s MCP), and dependencies in parallel, then generates an AI summary correlating recent changes with current health.
Key features:
- Per-section timeouts (3s) via Promise.race
- In-memory cache with per-section TTLs and stale-while-revalidate
- In-flight dedup (concurrent callers share the same Promise)
- Graceful degradation (each section independent)
- Extracts inferDependencyGraph for reuse from routes.ts
Includes 8 unit tests covering happy path, partial failures, timeouts, cache hits, stale-while-revalidate, in-flight dedup, and LLM failure.
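The in-flight dedup listed above can be illustrated with a minimal sketch. It is not the PR's code: the `dedupe` helper and the `inFlight` map are hypothetical names, and the sketch only assumes that concurrent callers for the same key should share one Promise.

```typescript
// Hypothetical sketch of in-flight deduplication; names are illustrative,
// not taken from the PR's source.
const inFlight = new Map<string, Promise<unknown>>();

function dedupe<T>(key: string, work: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  // A build for this key is already running: hand back the same Promise.
  if (existing) return existing as Promise<T>;

  // Remove the entry once the work settles (success or failure) so that
  // later calls start a fresh build instead of reusing a finished one.
  const promise = work().finally(() => {
    inFlight.delete(key);
  });
  inFlight.set(key, promise);
  return promise;
}
```

Two overlapping calls such as `dedupe("checkout", build)` would then run `build` once; a call made after the first one settles would run it again.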
…el fetch, error vs unconfigured, fake timers
- withTimeout() now creates an AbortController per section and aborts it on timeout
- doBuildServiceBrief() uses unconditional Promise.allSettled([fetchChanges, fetchInfra, fetchDeps]); each fetcher owns its own cache check internally (stale-while-revalidate stays intact)
- fetchChanges/fetchInfra let getToolsByRole errors propagate (→ "error" status); empty tools still returns null (→ "unconfigured" status)
- Test 3 uses vi.useFakeTimers() + vi.advanceTimersByTimeAsync() instead of real 5s wait
- All 8 tests pass in <10ms
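The error-versus-unconfigured distinction above can be sketched roughly as follows. This is an assumption-laden illustration: `SectionResult`, `toSection`, and `fetchSections` are hypothetical names, and the only behavior taken from the commit is that a rejected fetcher maps to "error" while a fetcher that resolves to null maps to "unconfigured".

```typescript
// Hypothetical types and helpers; only the status mapping follows the commit text.
type SectionState = "ok" | "error" | "unconfigured";

interface SectionResult<T> {
  status: SectionState;
  data: T | null;
  error?: string;
}

function toSection<T>(settled: PromiseSettledResult<T | null>): SectionResult<T> {
  if (settled.status === "rejected") {
    // The fetcher threw (for example, tool lookup failed): report an error.
    const message =
      settled.reason instanceof Error ? settled.reason.message : String(settled.reason);
    return { status: "error", data: null, error: message };
  }
  // The fetcher resolved with null: no tools are configured for this section.
  if (settled.value === null) return { status: "unconfigured", data: null };
  return { status: "ok", data: settled.value };
}

async function fetchSections<T>(
  fetchers: Array<() => Promise<T | null>>,
): Promise<SectionResult<T>[]> {
  // allSettled never rejects, so one failing section cannot sink the others.
  const settled = await Promise.allSettled(fetchers.map((fetch) => fetch()));
  return settled.map((result) => toSection<T>(result));
}
```

Because every section is settled independently, the caller always receives one result per fetcher, in the same order the fetchers were passed.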
…eout, stale summary refresh
- Fix misleading "unconditional" comment to describe actual conditional fan-out behavior
- Simplify withTimeout to return Promise<T> directly (remove unused AbortController signal)
- Extract inferDependencyGraph to shared src/server/dependency-graph.ts (DRY)
- Add background refresh path for stale AI summaries (fire-and-forget like data sections)
- Add TODO comments for gitlabProject wiring and hardcoded namespace
- Remove unused makeFailingTool helper and unnecessary type casts in tests
- Add stale status assertion to stale-while-revalidate test
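The fire-and-forget refresh for stale entries might look something like the sketch below. It assumes a single TTL and an injectable clock for testing; `SwrCache` and its fields are hypothetical and do not reflect the PR's actual cache, which uses per-section TTLs.

```typescript
// Hypothetical stale-while-revalidate cache; a sketch under stated assumptions.
interface CacheEntry<T> {
  value: T;
  fetchedAt: number;
}

interface CachedRead<T> {
  value: T;
  stale: boolean;
}

class SwrCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(key: string, fetcher: () => Promise<T>): Promise<CachedRead<T>> {
    const entry = this.entries.get(key);
    if (!entry) {
      // Cold cache: the caller has to wait for the first fetch.
      const value = await fetcher();
      this.entries.set(key, { value, fetchedAt: this.now() });
      return { value, stale: false };
    }
    const stale = this.now() - entry.fetchedAt > this.ttlMs;
    if (stale) {
      // Fire-and-forget refresh: serve the old value now, update in the
      // background, and keep the old entry if the refresh fails.
      void fetcher()
        .then((value) => {
          this.entries.set(key, { value, fetchedAt: this.now() });
        })
        .catch(() => {});
    }
    return { value: entry.value, stale };
  }
}
```

In this sketch, several stale reads in quick succession would each start a refresh; pairing it with in-flight dedup would avoid that.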
Delegates to buildServiceBrief() for aggregated service overview data including changes, infrastructure, dependencies, and AI summary.
Full-width card with teal left-border accent, shimmer loading, per-state rendering (ok/stale/error/unconfigured/null), evidence ref badges, relative freshness timestamp with warning color on stale, and AI-generated label. Exports ServiceBriefSkeleton for in-flight placeholder use.
Merges deployments, merge requests, and config changes into a single time-sorted timeline (capped at 10 items) with per-type color dots, metadata lines, freshness indicator, and all five UI states (loading, data, empty, error, unconfigured).
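As a rough illustration of the merge described above, the sketch below combines three lists into one timeline, sorts it, and caps it at 10 items. The `TimelineItem` shape and the newest-first ordering are assumptions; the component's real props are not shown in this PR text.

```typescript
// Hypothetical item shape; the real component's types may differ.
type ChangeKind = "deployment" | "merge_request" | "config_change";

interface TimelineItem {
  kind: ChangeKind;
  title: string;
  timestamp: string; // ISO 8601
}

function buildTimeline(
  deployments: TimelineItem[],
  mergeRequests: TimelineItem[],
  configChanges: TimelineItem[],
  limit = 10,
): TimelineItem[] {
  // Copy into a new array so the inputs are not mutated by sort().
  return [...deployments, ...mergeRequests, ...configChanges]
    .sort((a, b) => Date.parse(b.timestamp) - Date.parse(a.timestamp)) // newest first (assumed)
    .slice(0, limit);
}
```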
Shows workload type + replica status, per-container CPU/memory utilization bars with green/warning/destructive thresholds, restart counts, and Warning K8s events. Handles loading, error, unconfigured, and empty states with freshness indicator.
…le to ServiceDependencyGraph
- New optional props: healthMap (per-service health status) and dependencySource
- Health dot (6px circle) rendered left of node label when healthMap is provided, using success/warning/destructive/muted-foreground CSS vars
- "Estimated topology" disclaimer shown below graph when dependencySource is "inferred" or omitted (10px JetBrains Mono, muted-foreground/50)
- Collapsible "Show as table" section below graph with role=table/row/cell markup, keyboard-navigable rows, and health dots in Status column
- onNodeClick fixed to always pass plain service name string (via _labelText) regardless of whether dot label JSX is used
…pendency sections
Create ServiceOverview.tsx that fetches /api/services/:name/brief and renders all four section components (ServiceBrief, RecentChanges, InfrastructureStatus, ServiceDependencyGraph) with skeleton loaders. Wire Overview as the new default tab in ServiceDetail.tsx via lazy loading, keeping existing Metrics/History/Dependencies tabs unchanged.
The default tab changed from Metrics to Overview as part of the Service Brief feature. Update the test assertion to match.
- ServiceDependencyGraph: remove dead labelElement HTML-string and nodeById map
- ServiceBrief: remove unreachable summary===undefined check; loading is determined by the parent's sectionStatus, not by checking for undefined
- RecentChanges + InfrastructureStatus: extract duplicated STALE_THRESHOLD_MS and freshness formatting into src/web/lib/freshness.ts
- ServiceDependencyGraph: add initialData/initialHealthMap props so ServiceOverview can pass pre-fetched brief data instead of triggering a redundant /api/dependencies fetch
…zation, error state
- Fix withTimeout timer leak: clear setTimeout handle via .finally() when the underlying promise wins the race
- Add NAME_PATTERN validation on :name route params to reject cache-key injection (colon separator) and restrict to K8s-valid names (max 253 chars)
- Add sanitizeForPrompt() to strip control characters and truncate user-derived fields before embedding in LLM prompts
- Track fetch error state in ServiceOverview to show error UI instead of infinite loading skeletons when /brief fails entirely
- Wrap initialDepData in useMemo to prevent unnecessary re-renders of ServiceDependencyGraph
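The first three fixes above can be sketched together. The function names come from the commit text, but the signatures, the regular expression, and the 200-character default are assumptions made for illustration, not the PR's actual values.

```typescript
// Sketch only: bodies and parameters are assumed, not copied from the PR.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Clearing the handle in finally() avoids leaving a live timer behind
  // when the underlying promise wins the race.
  return Promise.race([promise, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Assumed pattern: lowercase alphanumerics, "-" and ".", alphanumeric at both
// ends, at most 253 characters. A colon can never match, so a name cannot
// smuggle in a cache-key separator.
const NAME_PATTERN = /^[a-z0-9]([a-z0-9.-]{0,251}[a-z0-9])?$/;

function isValidServiceName(name: string): boolean {
  return NAME_PATTERN.test(name);
}

function sanitizeForPrompt(value: string, maxLength = 200): string {
  // Replace ASCII control characters (including newlines and tabs) with
  // spaces, then trim and truncate before the text reaches the LLM prompt.
  return value.replace(/[\u0000-\u001f\u007f]/g, " ").trim().slice(0, maxLength);
}
```

A route handler could call `isValidServiceName(req.params.name)` before touching the cache, and the prompt builder could pass commit titles or event messages through `sanitizeForPrompt` first.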
Add coerce helpers (coerceContainer, coerceEvent, coerceDeployment, coerceMergeRequest) that produce typed structs with sensible defaults from unknown MCP tool output, preventing NaN/undefined from reaching callers. Add MAX_CACHE_ENTRIES=200 constant and LRU-style eviction (oldest insertion order) in setCache() to bound memory growth.
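A minimal sketch of both ideas follows, assuming a hypothetical `ContainerStatus` shape (the real fields are not listed in this PR text) and a `setCache` that takes the Map as a parameter so the example stays self-contained.

```typescript
// Hypothetical field names; only the coerce-with-defaults idea and the
// MAX_CACHE_ENTRIES bound come from the commit text.
interface ContainerStatus {
  name: string;
  restarts: number;
  cpuPercent: number;
}

function asFiniteNumber(value: unknown, fallback: number): number {
  return typeof value === "number" && Number.isFinite(value) ? value : fallback;
}

function coerceContainer(raw: unknown): ContainerStatus {
  const obj: Record<string, unknown> =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const name = obj["name"];
  return {
    name: typeof name === "string" ? name : "unknown",
    // Anything that is not a finite number falls back to 0, so NaN and
    // undefined never reach callers.
    restarts: asFiniteNumber(obj["restarts"], 0),
    cpuPercent: asFiniteNumber(obj["cpuPercent"], 0),
  };
}

const MAX_CACHE_ENTRIES = 200;

function setCache<V>(
  cache: Map<string, V>,
  key: string,
  value: V,
  maxEntries = MAX_CACHE_ENTRIES,
): void {
  cache.set(key, value);
  // A Map iterates in insertion order, so the first key is the oldest one.
  while (cache.size > maxEntries) {
    const oldest = cache.keys().next();
    if (oldest.done) break;
    cache.delete(oldest.value);
  }
}
```

Evicting by insertion order is simpler than a true LRU because reads do not need to reorder entries, at the cost of sometimes evicting a key that is still in active use.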
The AI Brief section showed "Configure an LLM provider" because the LLM model wasn't passed through registerRoutes to buildServiceBrief.
When K8s MCP returns a service-not-found response with all zeros (replicas: 0/0, containers: [], events: []), don't feed that to the LLM. The AI was reporting "0/0 replicas" for services that actually had pods running (visible via Prometheus metrics).
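One way to express that guard is sketched below. The `InfraSnapshot` shape and both function names are hypothetical; the sketch only assumes that an all-zero snapshot should be treated as "no data" rather than passed to the prompt.

```typescript
// Hypothetical snapshot shape, for illustration only.
interface InfraSnapshot {
  readyReplicas: number;
  desiredReplicas: number;
  containers: unknown[];
  events: unknown[];
}

function isDefaultZeroInfra(infra: InfraSnapshot): boolean {
  // All-zero data is what a service-not-found response looks like.
  return (
    infra.readyReplicas === 0 &&
    infra.desiredReplicas === 0 &&
    infra.containers.length === 0 &&
    infra.events.length === 0
  );
}

function infraPromptSection(infra: InfraSnapshot | null): string | null {
  // Returning null lets the prompt builder omit the infrastructure section
  // instead of telling the LLM the service has 0/0 replicas.
  if (infra === null || isDefaultZeroInfra(infra)) return null;
  return (
    `Replicas: ${infra.readyReplicas}/${infra.desiredReplicas}; ` +
    `containers: ${infra.containers.length}; warning events: ${infra.events.length}`
  );
}
```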
Bump console header buttons (Clear, New chat) from caption-tier (10px) to label-tier (11px) with proper height (h-7) to reduce visual gap with service detail action buttons. Bump RCA card body text from 10-11px to 11-12px for better readability and DESIGN.md alignment.
- Match console buttons (Clear, New chat) to service detail secondary button style: h-9, px-4, 12px font, rounded-lg, border-border/50
- Brighten RCA card text in console (foreground opacity 60-75% → 85-90%)
- Change Investigate button from pill (rounded-full) to rounded-lg
Sync initialData when it arrives after mount — the parent fetches the brief async, so initialData starts undefined then becomes defined. Also cap fitView maxZoom at 0.8 to prevent single-node graphs from appearing oversized.
Update sidebar tooltip, service detail breadcrumb, and test.
WZ added a commit that referenced this pull request Apr 2, 2026
…45)
* feat(types): add ServiceBrief interfaces and gitlabProject to ServiceSchema
* refactor(types): clean up service-brief.ts per code review
* feat(server): add service-brief aggregator with caching, timeouts, and AI summary
* fix: spec compliance — AbortController timeouts, unconditional parallel fetch, error vs unconfigured, fake timers
* fix: code review cleanup — extract dependency graph, simplify withTimeout, stale summary refresh
* feat: add GET /api/services/:name/brief route handler
* feat: add ServiceBrief.tsx AI summary card for service overview tab
* feat: add RecentChanges.tsx timeline component for service overview
* feat: add InfrastructureStatus.tsx K8s resource cards
* feat: add health badges, estimated topology label, and accessible table to ServiceDependencyGraph
* feat: add ServiceOverview tab composing brief, changes, infra, and dependency sections
* test: update ServiceDetail test for Overview as default tab
* fix: pre-landing review fixes
* fix: adversarial review — timer leak, input validation, prompt sanitization, error state
* fix: MCP output shape validation and cache size bound
* fix: wire LLM model into service brief endpoint
* fix: instruct LLM to output plain text in AI brief, no markdown
* fix: skip default-zero infra data in AI summary prompt
* style: consistent font and button sizes in console panel
* style: polish console & service detail button consistency
* fix: dependency graph loading forever in Overview tab
* style: rename Dashboard nav to Operations Desk
---------
Co-authored-by: Wilson Li <wli02@fortinet.com>
Summary
- /api/services/:name/brief aggregator endpoint with per-section caching, timeouts, and graceful degradation
Architecture
Test Coverage
- service-brief.test.ts: 8 unit tests covering happy path, GitLab fail, K8s timeout, all fail, cache hit, stale-while-revalidate, in-flight dedup, AI summary fail
- ServiceDetail.test.tsx: default tab changed from Metrics to Overview
Pre-Landing Review
5 issues found (0 critical, 5 informational) — all fixed:
Adversarial Review
16 findings from Claude adversarial review (large tier, 2319 lines):
Plan Completion
19/21 DONE, 0 NOT DONE, 2 CHANGED (equivalent approaches):
Test plan
- npx tsc --noEmit: type check passes
- npx vitest run: 60 files, 717 tests, all passing
🤖 Generated with Claude Code