Feat/UI llm health widget 322 #549
Merged
AbirAbbas merged 7 commits into Agent-Field:main on May 7, 2026
Performance
✓ No regressions detected
📊 Coverage gate
✅ Gate passed: no surface regressed past the allowed threshold and the aggregate stayed above the floor.
📐 Patch coverage gate (threshold: 80% on lines this PR touches)
✅ Patch gate passed: every surface whose lines were touched by this PR has patch coverage at or above the threshold.
6390fc1 to 2a9d56e
2a9d56e to 8f200aa
The test pinned the *boot* status of a freshly-minted replay row (=replayed), but ReplayEvent kicks off the dispatcher in a goroutine that immediately marks the row "failed" because the test fixture has no target agent node registered. Under CI load that goroutine wins the race against the test's GetInboundEvent and the assertion flips to "failed", breaking the control-plane coverage job and (cascading) the coverage-summary check.

Pass nil for the dispatcher in this test (many sibling tests in triggers_api_contract_test.go already do this) and gate the three goroutine launch sites in the trigger handler on a nil-check so the contract is honored consistently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AbirAbbas previously approved these changes on May 7, 2026
The "filters active triggers..." test queries for the EventRow toggle button with getByRole synchronously, but the events list is fetched async by TriggerSheet on open (useEffect → refreshEvents). Under CI load the fetch hasn't resolved when the assertion runs, so no EventRow has mounted and the query throws. Locally the fetch is fast enough that the race never surfaces.

Switch to findByRole so the query waits for the events list to render. The earlier getAllByText still passes because it matches the trigger's event_types summary, which renders synchronously from the trigger row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AbirAbbas approved these changes on May 7, 2026
Summary
Implements the dashboard LLM health observability widget requested in #322 by surfacing backend status from GET /api/ui/v1/llm/health in a dedicated card on /ui. The UI now shows overall backend state (healthy/degraded/down/disabled), per-endpoint circuit state (closed/open/half_open), consecutive failure count, last error, last success/check timestamps, and a destructive visual alert when any circuit is open (the key troubleshooting gap called out in #316).
Type of change
Test plan
npm --prefix control-plane/web/client test -- src/test/components/dashboard/LLMHealthWidget.test.tsx src/test/pages/NewDashboardPage.test.tsx src/pages/NewDashboardPage.test.tsx
npm --prefix control-plane/web/client run build
/ui dashboard shows LLM backend health card
Disabled + “LLM health monitoring is disabled for this deployment.”
Circuit breaker open alert, endpoint row with Open, failure count, and last error message
Test coverage
coverage-baseline.json in this PR only if the removal caused a legitimate regression and I called it out in the summary above.
Checklist
Related issues / PRs
UI screenshots
Dashboard LLM health widget (open circuit state)
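For reviewers who want a concrete picture of the open-circuit alert described in the Summary, here is a minimal TypeScript sketch. The state names (healthy/degraded/down/disabled and closed/open/half_open) come from this PR. The field names, the `openEndpoints` and `alertVariant` helpers, and the sample payload are hypothetical, and may not match the actual response of GET /api/ui/v1/llm/health or the widget's real code.

```typescript
// State names are taken from the PR summary.
type BackendStatus = "healthy" | "degraded" | "down" | "disabled";
type CircuitState = "closed" | "open" | "half_open";

// ASSUMPTION: field names below are illustrative, not the real API contract.
interface EndpointHealth {
  name: string;
  circuitState: CircuitState;
  consecutiveFailures: number;
  lastError?: string;
  lastSuccess?: string; // ISO 8601 timestamp
  lastChecked?: string; // ISO 8601 timestamp
}

interface LLMHealth {
  status: BackendStatus;
  endpoints: EndpointHealth[];
}

type AlertVariant = "none" | "destructive";

// Endpoints whose circuit breaker is currently open.
function openEndpoints(health: LLMHealth): EndpointHealth[] {
  return health.endpoints.filter((e) => e.circuitState === "open");
}

// The summary says a destructive alert is shown when any circuit is open.
function alertVariant(health: LLMHealth): AlertVariant {
  return openEndpoints(health).length > 0 ? "destructive" : "none";
}

// Made-up sample payload, for illustration only.
const sample: LLMHealth = {
  status: "degraded",
  endpoints: [
    { name: "primary", circuitState: "open", consecutiveFailures: 5, lastError: "connection refused" },
    { name: "fallback", circuitState: "closed", consecutiveFailures: 0 },
  ],
};

console.log(alertVariant(sample)); // "destructive"
console.log(openEndpoints(sample).map((e) => e.name)); // [ "primary" ]
```

Keeping the alert decision in a pure function like this would let it be unit-tested without rendering the card, though the widget in this PR may structure it differently.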