fix(core-rpc): demote timeout unhandled-rejections + classify in client (#REACT-15+)#2196
Conversation
…EACT-15)
Promote local 30s AbortController timeout from bare `Error` to
`CoreRpcError('timeout')` so Sentry, call-site `.catch()`, and any
future filter can branch on `err.kind` instead of regex-matching the
raw message. Adds the `timeout` variant to `CoreRpcErrorKind` and
gives it precedence in `classifyRpcError` over the broader `transport`
arm (mirrors the loopback-vs-transport precedence pattern in PR tinyhumansai#2063).
Sentry-Issue: OPENHUMAN-REACT-15
…ow (#REACT-15) - Verbatim REACT-15/11/10/12 messages classify as `timeout`, not `transport`. - REACT-Z verbatim (`app_state_snapshot timed out after 30000ms`) classifies as `timeout` even when wrapped by the outer catch. - REACT-13 verbatim (backend-side `client error (Connect): operation timed out`) stays `transport` — the new arm must NOT swallow real network shapes. - Precedence guard: `timed out after \d+ms` wins over the generic `timed out` transport regex. - AbortController throw site rejects with `CoreRpcError` (kind=`timeout`), not bare `Error` — locks the OPENHUMAN-REACT-Z/Y regression. Sentry-Issue: OPENHUMAN-REACT-15
…-Z/Y `updateLocalState` and `storeSessionToken` awaited a follow-up `refresh()` without a `.catch()`, so a cold-boot `app_state_snapshot` timeout surfaced as an unhandled promise rejection at the caller — captured by Sentry's global handler as OPENHUMAN-REACT-Z/Y. Make the post-write refresh best-effort (sibling helpers like `setAnalyticsEnabled` / `setMeetAutoOrchestratorHandoff` already swallow here). The polling loop reconciles state within `POLL_MS` so any missed update is not user-visible. Sentry-Issue: OPENHUMAN-REACT-Z Sentry-Issue: OPENHUMAN-REACT-Y
…ion leaks (#REACT-15 #REACT-11 #REACT-10 #REACT-12) The three team-settings panels each fired a `void promise(...)` or `void promise.finally(...)` against a `refreshTeams` / `refreshTeamMembers` / `refreshTeamInvites` chain from inside `useEffect`. A rejection from any caller (cold core boot, backend 504, local AbortController 30 s ceiling) became an unhandled rejection captured by Sentry's `auto.browser.global_handlers.onunhandledrejection`, producing OPENHUMAN-REACT-15/11 (`team_list_teams`), REACT-10 (`team_list_members`), and REACT-12 (`team_list_invites`). Demote each to a logged `core-rpc:error` breadcrumb. The polling loop in `CoreStateProvider` reconciles state on the next tick, and any user-driven retry (revisiting the panel) re-runs the same chain — so a transient timeout is now silent, never user-visible noise, never Sentry noise. Sentry-Issue: OPENHUMAN-REACT-15 Sentry-Issue: OPENHUMAN-REACT-11 Sentry-Issue: OPENHUMAN-REACT-10 Sentry-Issue: OPENHUMAN-REACT-12
…end (#REACT-15) Last-line-of-defense filter so a future `await callCoreRpc(...)` chain that forgets a `.catch()` cannot regress the REACT timeout family. Mirrors the Rust-side classifier demote in PR tinyhumansai#2063. Match by `instanceof CoreRpcError` first (in-process Sentry hook); fall back to a duck-typed `name === 'CoreRpcError' && kind === 'timeout'` check so cross-realm-constructed errors (test harness, dynamic import, Vitest module isolation) still get demoted. Non-timeout `CoreRpcError` shapes (`transport`, `auth_expired`, `budget_exceeded`, `rate_limited`, `thread_not_found`) still surface in Sentry — only the local AbortController noise is suppressed. Sentry-Issue: OPENHUMAN-REACT-15 Sentry-Issue: OPENHUMAN-REACT-11 Sentry-Issue: OPENHUMAN-REACT-10 Sentry-Issue: OPENHUMAN-REACT-12 Sentry-Issue: OPENHUMAN-REACT-13 Sentry-Issue: OPENHUMAN-REACT-14 Sentry-Issue: OPENHUMAN-REACT-Z Sentry-Issue: OPENHUMAN-REACT-Y
…REACT-15) - Drops `CoreRpcError(kind='timeout')` passed via the `originalException` hint (the live in-process path). - Cross-realm duck-typed match: rejects `name === 'CoreRpcError'` + `kind === 'timeout'` even when `instanceof` would fail. - Preserves non-timeout `CoreRpcError` shapes (transport, auth_expired, …) so the filter cannot suppress real errors. Sentry-Issue: OPENHUMAN-REACT-15
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds a typed 'timeout' RPC error kind, classifies/throws timeouts from coreRpcClient, filters timeout errors from Sentry beforeSend, and wraps UI/provider refresh calls to catch and log refresh failures; tests verify classification, Sentry filtering, and no unhandled rejections. ChangesTimeout Error Classification and Filtering
Sequence DiagramsequenceDiagram
participant Client
participant callCoreRpc
participant AbortController
participant classifyRpcError
participant Analytics_beforeSend
Client->>callCoreRpc: invoke RPC method
callCoreRpc->>AbortController: start timeout timer
AbortController->>callCoreRpc: abort on CORE_RPC_TIMEOUT_MS
callCoreRpc->>callCoreRpc: throw CoreRpcError(kind: 'timeout', message: 'Core RPC <method> timed out after Nms')
callCoreRpc->>classifyRpcError: classify error message
classifyRpcError->>classifyRpcError: return 'timeout'
callCoreRpc->>Analytics_beforeSend: error reaches Sentry hook (hint.originalException)
Analytics_beforeSend->>Analytics_beforeSend: isCoreRpcTimeoutError(hint.originalException) => true
Analytics_beforeSend->>Analytics_beforeSend: drop event (return null)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
…≥80% (#REACT-15+) Coverage Gate on PR tinyhumansai#2196 caught the new `.catch()` blocks at 0% on changed lines (`TeamPanel:44-45,55-56`, `TeamMembersPanel:52-54,56`, `TeamInvitesPanel:53-56`, `CoreStateProvider:579-580,606-607`). Add focused tests that mount each panel with a rejecting `refreshTeams` / `refreshTeamMembers` / `refreshTeamInvites` and assert `window.unhandledrejection` never fires, plus two CoreStateProvider cases that reject `fetchCoreAppSnapshot` on the follow-up refresh in `updateLocalState` (via `setEncryptionKey`) and `storeSessionToken`. Locks the regression behaviour in: every new `.catch()` arm has at least one execution path that exercises it. Sentry-Issue: OPENHUMAN-REACT-15 Sentry-Issue: OPENHUMAN-REACT-10 Sentry-Issue: OPENHUMAN-REACT-12 Sentry-Issue: OPENHUMAN-REACT-Z Sentry-Issue: OPENHUMAN-REACT-Y
graycyrus
left a comment
There was a problem hiding this comment.
Review — graycyrus
Clean PR. Well-structured, layered fix for the REACT-15/11/10/12/Z/Y Sentry unhandled-rejection family.
Walkthrough
This PR adds a 'timeout' kind to CoreRpcErrorKind, promotes the AbortController throw site from bare Error to CoreRpcError(kind='timeout'), and wires .catch() guards at all five leaking call sites (three team settings panels + two CoreStateProvider write paths). A beforeSend filter in analytics.ts acts as a last-line-of-defense to drop any future timeout that slips past a missing .catch(). Test coverage is thorough — each Sentry issue ID gets explicit regression coverage.
Change Summary
| File | Change | Notes |
|---|---|---|
coreRpcClient.ts |
Add 'timeout' kind + classifier arm + throw CoreRpcError on abort |
Correct ordering before generic transport arm |
analytics.ts |
isCoreRpcTimeoutError + beforeSend filter |
Cross-realm duck-typing is a nice touch |
TeamPanel.tsx |
try/catch in refreshTeamsWithLoading + defensive .catch() in useEffect |
Belt-and-suspenders approach is warranted here |
TeamInvitesPanel.tsx |
.catch().finally() chain |
Clean fix |
TeamMembersPanel.tsx |
.catch().finally() chain |
Clean fix |
CoreStateProvider.tsx |
.catch() on refresh() in updateLocalState + storeSessionToken |
Polling loop reconciles — correct call |
| 6 test files | Regression tests for all Sentry IDs | unhandledrejection listener approach is solid |
Notes
- Classifier precedence is correct:
/timed out after \d+ms/i(local AbortController shape) fires before the generic/timed out/in the transport arm. The existing'operation timed out after 30s'backend shape correctly stays astransportsince it lacks themssuffix. - The
as nevercasts in test mocks are fine — standard Vitest partial-mock pattern. - No security, PII, or breaking-change concerns. The new
CoreRpcErrorKindvariant is additive and no consumers currently switch exhaustively on it.
Nice work cleaning up this Sentry noise systematically. LGTM.
Summary
AbortControllerRPC timeout from bareError→CoreRpcError(kind='timeout')so callers, Sentry filters, and future telemetry can branch on a stable kind instead of regex-matching messages.'timeout'precedence toclassifyRpcErrorso the local-ceiling shape wins over the broadertransportarm (mirrors PR fix(observability): demote loopback sidecar-down noise to expected (#R5 #R6) #2063 loopback-vs-transport).TeamPanel,TeamInvitesPanel,TeamMembersPanel) and twoCoreStateProviderwrite paths (updateLocalState,storeSessionToken).beforeSendfilter inanalytics.tsdrops any futureCoreRpcError(kind='timeout')that slips past a missing.catch().Problem
Eight Sentry issues (~18 events, 18 unique users) all triggered via
auto.browser.global_handlers.onunhandledrejection:TeamPanel.tsx:31 → CoreStateProvider:listTeamsopenhuman.team_list_teamsTeamMembersPanel:44openhuman.team_list_membersTeamInvitesPanel:37openhuman.team_list_invitesGET /teams[/members]504 / connect timeoutopenhuman.app_state_snapshotAll hit the 30 s
CORE_RPC_TIMEOUT_MSceiling (REACT-Z/Y arrived as bareError:becausecoreRpcClient.ts:381threwnew Error(...)notCoreRpcError). Each call site fired the RPC asvoid promise(...)orvoid promise.finally(...)inside auseEffect, so the rejection bubbled towindow.onunhandledrejection→ Sentry capture.Sister to PR #2063 (Rust-side loopback classifier) — same family of "transient infra noise the user already retries past" promoted out of the error surface.
Solution
Layer A — Classifier (
coreRpcClient.ts)'timeout'toCoreRpcErrorKind.classifyRpcErrornow checks/timed out after \d+ms/iBEFORE the generictransportarm.CoreRpcError(msg, 'timeout')instead of a bareError— kills the REACT-Z/Y bare-Error:shape.Layer B — Call-site
.catch()(TeamPanel,TeamInvitesPanel,TeamMembersPanel,CoreStateProvider)void refresh*(...)is replaced with a.catch(err => log(...))that demotes to acore-rpc:errorbreadcrumb.CoreStateProvider.updateLocalStateandstoreSessionTokennow.catch()their follow-uprefresh()— the polling loop reconciles withinPOLL_MS, so a transientapp_state_snapshottimeout no longer escapes as an unhandled rejection.Layer C — Sentry
beforeSendfilter (analytics.ts)hint.originalExceptionand drops the event when it is aCoreRpcError(or duck-typed cross-realm equivalent) withkind === 'timeout'. Non-timeout kinds (transport,auth_expired,rate_limited,budget_exceeded,thread_not_found) still surface.Submission Checklist
CoreRpcError(kind='timeout')+ SentrybeforeSenddrops timeout / preserves non-timeout / cross-realm match.pnpm debug unit coreRpcClient(75/75) +pnpm debug unit analytics(27/27) +pnpm debug unit CoreStateProvider(22/22) all green.docs/TEST-COVERAGE-MATRIX.md.## Related— no matrix-tracked feature IDs touched; the Sentry IDs are listed below instead.docs/RELEASE-MANUAL-SMOKE.mdcovers UI surface, not classifier internals).Closes #NNNfor a GH issue — this is a Sentry-only PR (mirrors PRs fix(observability): drop 401 session-expired Sentry noise (#25, #1Q, #27, #1G) #1719, fix(observability): close 3 transient-failure leak paths in Sentry classifier (#1608) #1798, fix(observability): demote loopback sidecar-down noise to expected (#R5 #R6) #2063). Sentry IDs listed under## Related.Impact
app/src/services/+app/src/providers/+ three settings panels. No native shell, no Rust core change.instanceof+ duck-type check per Sentry event (negligible).sanitizeErrorhelper. ThebeforeSendfilter narrows what is sent, never broadens.CoreRpcErrorKindadds a variant; consumers that switch on it must handle the new arm — none in this repo do today (all consumers use string equality on specific known kinds). No breaking export shape change.client error (Connect)) still surface astransport— those are real backend issues, not local noise.Related
Closes OPENHUMAN-REACT-15
Closes OPENHUMAN-REACT-11
Closes OPENHUMAN-REACT-10
Closes OPENHUMAN-REACT-12
Closes OPENHUMAN-REACT-13
Closes OPENHUMAN-REACT-14
Closes OPENHUMAN-REACT-Z
Closes OPENHUMAN-REACT-Y
Sister PR: #2063 (Rust-side loopback classifier — same family, Rust core surface).
await callCoreRpc(...)consumers (coreCommandClient,coreHealthMonitor,webviewAccountService,memorySyncService,notificationService,chatService,meetCallService,walletApi,daemonHealthService,backendUrl) if their Sentry IDs surface — deliberately out of scope here.AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/sentry-corerpc-unhandled-rejectionc6c966f7(tip at push time)Validation Run
pnpm --filter openhuman-app format:checkpnpm typecheckpnpm debug unit coreRpcClient(75 pass),pnpm debug unit analytics(27 pass),pnpm debug unit CoreStateProvider(22 pass)Validation Blocked
command:N/Aerror:N/Aimpact:N/A — pre-pushpnpm rust:checkinitially failed because thetauri-cef/tauri-plugin-notificationvendored submodules were not yet initialized in this fresh worktree; resolved by runninggit submodule update --init --recursiveand pushed cleanly (no--no-verify).Behavior Changes
beforeSendfilter. Other error kinds (transport,auth_expired, …) are unaffected./settings/teamsno longer see a Sentry-reported crash — the panel quietly waits for the next poll tick / user retry.Parity Contract
classifyRpcErrormappings unchanged. All non-timeoutCoreRpcErrorkinds still surface to Sentry. The pre-existingtransportarm still matches every body it matched before (the newtimeoutarm matches a strict subset).CoreRpcErrorinstance +kind.Duplicate / Superseded PR Handling
Summary by CodeRabbit
Bug Fixes
Tests