fix(core-rpc): demote timeout unhandled-rejections + classify in client (#REACT-15+)#2196

Merged
senamakel merged 7 commits into tinyhumansai:main from oxoxDev:fix/sentry-corerpc-unhandled-rejection on May 20, 2026

Conversation

@oxoxDev oxoxDev commented May 19, 2026

Summary

  • Promote the local 30 s AbortController RPC timeout from a bare Error to CoreRpcError(kind='timeout') so callers, Sentry filters, and future telemetry can branch on a stable kind instead of regex-matching messages.
  • Add 'timeout' precedence to classifyRpcError so the local-ceiling shape wins over the broader transport arm (mirrors the loopback-vs-transport precedence in PR #2063, fix(observability): demote loopback sidecar-down noise to expected (#R5 #R6)).
  • Stop the OPENHUMAN-REACT-15/11/10/12/Z/Y unhandled-rejection family at the three settings panels (TeamPanel, TeamInvitesPanel, TeamMembersPanel) and two CoreStateProvider write paths (updateLocalState, storeSessionToken).
  • Last-line-of-defense beforeSend filter in analytics.ts drops any future CoreRpcError(kind='timeout') that slips past a missing .catch().

Problem

Eight Sentry issues (~18 events, 18 unique users) all triggered via auto.browser.global_handlers.onunhandledrejection:

| ID | Surface | Method |
| --- | --- | --- |
| OPENHUMAN-REACT-15 / 11 | TeamPanel.tsx:31 → CoreStateProvider:listTeams | openhuman.team_list_teams |
| OPENHUMAN-REACT-10 | TeamMembersPanel:44 | openhuman.team_list_members |
| OPENHUMAN-REACT-12 | TeamInvitesPanel:37 | openhuman.team_list_invites |
| OPENHUMAN-REACT-13 / 14 | Backend chain | GET /teams[/members] 504 / connect timeout |
| OPENHUMAN-REACT-Z / Y | Post-write refresh | openhuman.app_state_snapshot |

All hit the 30 s CORE_RPC_TIMEOUT_MS ceiling (REACT-Z/Y arrived as bare Error: because coreRpcClient.ts:381 threw new Error(...) not CoreRpcError). Each call site fired the RPC as void promise(...) or void promise.finally(...) inside a useEffect, so the rejection bubbled to window.onunhandledrejection → Sentry capture.

Sister to PR #2063 (Rust-side loopback classifier) — same family of "transient infra noise the user already retries past" promoted out of the error surface.

Solution

Layer A — Classifier (coreRpcClient.ts)

  • Add 'timeout' to CoreRpcErrorKind.
  • classifyRpcError now checks /timed out after \d+ms/i BEFORE the generic transport arm.
  • AbortController throw site (line 381) throws CoreRpcError(msg, 'timeout') instead of a bare Error — kills the REACT-Z/Y bare-Error: shape.
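
The ordering in Layer A can be sketched roughly as follows. This is a minimal illustration, not the contents of coreRpcClient.ts: the kind list is only the set named in this description, `TRANSPORT_RE` is a simplified stand-in for the real transport arm, and returning `undefined` for unmatched messages is an assumption.

```typescript
// Kinds named in this PR description; the real union may contain more.
type CoreRpcErrorKind =
  | 'timeout'
  | 'transport'
  | 'auth_expired'
  | 'rate_limited'
  | 'budget_exceeded'
  | 'thread_not_found';

// Local AbortController ceiling shape, e.g. "... timed out after 30000ms".
const LOCAL_TIMEOUT_RE = /timed out after \d+ms/i;

// Simplified stand-in for the broader transport arm (assumption).
const TRANSPORT_RE = /timed out|network|connect/i;

function classifyRpcError(message: string): CoreRpcErrorKind | undefined {
  // The timeout arm has to run first: every message it matches would also
  // match the generic "timed out" pattern in the transport arm below.
  if (LOCAL_TIMEOUT_RE.test(message)) return 'timeout';
  if (TRANSPORT_RE.test(message)) return 'transport';
  return undefined;
}
```

With this ordering, a message ending in "timed out after 30000ms" classifies as 'timeout', while "client error (Connect): operation timed out" and "operation timed out after 30s" both fall through to 'transport' because neither carries the ms suffix.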

Layer B — Call-site .catch() (TeamPanel, TeamInvitesPanel, TeamMembersPanel, CoreStateProvider)

  • Each panel's mount-effect void refresh*(...) is replaced with a .catch(err => log(...)) that demotes to a core-rpc:error breadcrumb.
  • CoreStateProvider.updateLocalState and storeSessionToken now .catch() their follow-up refresh() — the polling loop reconciles within POLL_MS, so a transient app_state_snapshot timeout no longer escapes as an unhandled rejection.
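
The call-site pattern in Layer B might look like the sketch below. `refreshQuietly`, the `Logger` type, and this `sanitizeError` body are hypothetical stand-ins for illustration; the panels' real helpers may differ. It assumes `refresh` returns a promise rather than throwing synchronously.

```typescript
type Logger = (tag: string, detail: string) => void;

// Hypothetical stand-in for the repo's sanitizeError helper.
function sanitizeError(err: unknown): string {
  return err instanceof Error ? `${err.name}: ${err.message}` : String(err);
}

// Fire-and-forget refresh for a mount effect. The .catch() turns a rejection
// into a logged breadcrumb, so the promise returned here resolves either way
// and `void refreshQuietly(...)` cannot leak an unhandled rejection.
function refreshQuietly(
  refresh: () => Promise<void>,
  log: Logger,
  onSettled: () => void = () => {},
): Promise<void> {
  return refresh()
    .catch((err: unknown) => log('core-rpc:error', sanitizeError(err)))
    .finally(onSettled); // e.g. clear a loading flag
}
```

The order matters: `.finally()` on its own returns a promise that still rejects, which is why a bare `void promise.finally(...)` reached the global handler. Placing `.catch()` before `.finally()` keeps the loading-state cleanup while absorbing the rejection.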

Layer C — Sentry beforeSend filter (analytics.ts)

  • Inspects hint.originalException and drops the event when it is a CoreRpcError (or duck-typed cross-realm equivalent) with kind === 'timeout'. Non-timeout kinds (transport, auth_expired, rate_limited, budget_exceeded, thread_not_found) still surface.
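
A minimal sketch of the Layer C filter follows. `EventLike` and `HintLike` are local stand-ins for the Sentry event and hint types (an assumption, modelling only the field the filter reads), and the `CoreRpcError` class here is a simplified copy rather than the one exported by coreRpcClient.ts.

```typescript
class CoreRpcError extends Error {
  readonly kind: string;
  constructor(message: string, kind: string) {
    super(message);
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = 'CoreRpcError';
    this.kind = kind;
  }
}

// Local stand-ins for the Sentry types (assumption).
interface EventLike { message?: string }
interface HintLike { originalException?: unknown }

function isCoreRpcTimeoutError(err: unknown): boolean {
  if (err instanceof CoreRpcError) return err.kind === 'timeout';
  // Cross-realm fallback: instanceof fails when the class was loaded twice,
  // so match on the name and kind fields instead.
  if (typeof err === 'object' && err !== null) {
    const shape = err as { name?: unknown; kind?: unknown };
    return shape.name === 'CoreRpcError' && shape.kind === 'timeout';
  }
  return false;
}

function beforeSend<E extends EventLike>(event: E, hint?: HintLike): E | null {
  // Returning null from beforeSend tells Sentry to drop the event.
  return isCoreRpcTimeoutError(hint?.originalException) ? null : event;
}
```

Because the check only ever returns `null` for `kind === 'timeout'`, every other kind passes through unchanged, which matches the "narrows, never broadens" intent stated under Impact.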

Submission Checklist

Impact

  • Runtime/platform: Desktop (Tauri) and web — pure frontend change inside app/src/services/ + app/src/providers/ + three settings panels. No native shell, no Rust core change.
  • Performance: One extra regex test per RPC classification (negligible). One extra instanceof + duck-type check per Sentry event (negligible).
  • Security/PII: None. New log lines route through the existing sanitizeError helper. The beforeSend filter narrows what is sent, never broadens.
  • Migration / compat: CoreRpcErrorKind adds a variant; consumers that switch on it must handle the new arm — none in this repo do today (all consumers use string equality on specific known kinds). No breaking export shape change.
  • Telemetry: Expect REACT-15/11/10/12/Z/Y event counts to drop to ~0 within one release cycle; REACT-13/14 (backend-side client error (Connect)) still surface as transport — those are real backend issues, not local noise.
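
For the migration note above, a consumer that does switch on the kind could guard against future variants as in this sketch. `shouldReport` is a hypothetical consumer, not code from this repo, and the union lists only the kinds named in this description.

```typescript
type CoreRpcErrorKind =
  | 'timeout'
  | 'transport'
  | 'auth_expired'
  | 'rate_limited'
  | 'budget_exceeded'
  | 'thread_not_found';

// Hypothetical consumer that decides whether an error kind is worth reporting.
function shouldReport(kind: CoreRpcErrorKind): boolean {
  switch (kind) {
    case 'timeout':
      return false; // local 30 s ceiling, treated as transient noise
    case 'transport':
    case 'auth_expired':
    case 'rate_limited':
    case 'budget_exceeded':
    case 'thread_not_found':
      return true;
    default: {
      // Compile-time exhaustiveness guard: adding a variant to the union
      // without a matching case makes this assignment a type error.
      const unhandled: never = kind;
      throw new Error(`unhandled CoreRpcErrorKind: ${String(unhandled)}`);
    }
  }
}
```

Consumers that compare against specific known kinds with string equality, as this description says the current ones do, are unaffected by the new variant.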

Related

Closes OPENHUMAN-REACT-15
Closes OPENHUMAN-REACT-11
Closes OPENHUMAN-REACT-10
Closes OPENHUMAN-REACT-12
Closes OPENHUMAN-REACT-13
Closes OPENHUMAN-REACT-14
Closes OPENHUMAN-REACT-Z
Closes OPENHUMAN-REACT-Y

Sister PR: #2063 (Rust-side loopback classifier — same family, Rust core surface).

  • Closes: (Sentry IDs above — no linked GitHub issue)
  • Follow-up PR(s)/TODOs: Optional hardening of other await callCoreRpc(...) consumers (coreCommandClient, coreHealthMonitor, webviewAccountService, memorySyncService, notificationService, chatService, meetCallService, walletApi, daemonHealthService, backendUrl) if their Sentry IDs surface — deliberately out of scope here.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A — Sentry-only triage (no Linear ticket)
  • URL: N/A

Commit & Branch

  • Branch: fix/sentry-corerpc-unhandled-rejection
  • Commit SHA: c6c966f7 (tip at push time)

Validation Run

  • pnpm --filter openhuman-app format:check
  • pnpm typecheck
  • Focused tests: pnpm debug unit coreRpcClient (75 pass), pnpm debug unit analytics (27 pass), pnpm debug unit CoreStateProvider (22 pass)
  • N/A: Rust fmt/check (no Rust changes in this PR)
  • N/A: Tauri fmt/check (no Tauri shell changes in this PR)

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A — pre-push pnpm rust:check initially failed because the tauri-cef / tauri-plugin-notification vendored submodules were not yet initialized in this fresh worktree; resolved by running git submodule update --init --recursive and pushed cleanly (no --no-verify).

Behavior Changes

  • Intended behavior change: Local AbortController RPC timeouts are now silent at the user-facing layer (logged breadcrumb only) and dropped by Sentry's beforeSend filter. Other error kinds (transport, auth_expired, …) are unaffected.
  • User-visible effect: None for happy-path users. Users who previously hit a 30 s cold-boot timeout on /settings/teams no longer see a Sentry-reported crash — the panel quietly waits for the next poll tick / user retry.

Parity Contract

  • Legacy behavior preserved: All existing classifyRpcError mappings unchanged. All non-timeout CoreRpcError kinds still surface to Sentry. The pre-existing transport arm still matches every body it matched before (the new timeout arm matches a strict subset).
  • Guard/fallback/dispatch parity checks: Existing "throws CoreRpcError with kind=transport on network error" test still passes; existing "AbortController throws with timeout-shaped message" test extended to also assert the CoreRpcError instance + kind.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): None — FE side of the Sentry-noise family is currently unaddressed.
  • Canonical PR: This PR.
  • Resolution: N/A

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling and logging for failed team, member, and invite refreshes (swallowing and logging refresh failures to avoid unhandled rejections)
    • Enhanced RPC timeout error detection and classification (new timeout kind)
    • Prevented unhandled promise rejections during bootstrap and local-state/session updates
    • Reduced error monitoring noise by filtering timeout errors from Sentry
  • Tests

    • Added regression tests covering RPC timeout classification, Sentry filtering, and unhandled-rejection guards during data loads and provider refreshes

oxoxDev added 6 commits May 19, 2026 14:22
…EACT-15)

Promote local 30s AbortController timeout from bare `Error` to
`CoreRpcError('timeout')` so Sentry, call-site `.catch()`, and any
future filter can branch on `err.kind` instead of regex-matching the
raw message. Adds the `timeout` variant to `CoreRpcErrorKind` and
gives it precedence in `classifyRpcError` over the broader `transport`
arm (mirrors the loopback-vs-transport precedence pattern in PR tinyhumansai#2063).

Sentry-Issue: OPENHUMAN-REACT-15
…ow (#REACT-15)

- Verbatim REACT-15/11/10/12 messages classify as `timeout`, not `transport`.
- REACT-Z verbatim (`app_state_snapshot timed out after 30000ms`) classifies
  as `timeout` even when wrapped by the outer catch.
- REACT-13 verbatim (backend-side `client error (Connect): operation timed
  out`) stays `transport` — the new arm must NOT swallow real network shapes.
- Precedence guard: `timed out after \d+ms` wins over the generic `timed out`
  transport regex.
- AbortController throw site rejects with `CoreRpcError` (kind=`timeout`),
  not bare `Error` — locks the OPENHUMAN-REACT-Z/Y regression.

Sentry-Issue: OPENHUMAN-REACT-15
…-Z/Y

`updateLocalState` and `storeSessionToken` awaited a follow-up `refresh()`
without a `.catch()`, so a cold-boot `app_state_snapshot` timeout surfaced
as an unhandled promise rejection at the caller — captured by Sentry's
global handler as OPENHUMAN-REACT-Z/Y.

Make the post-write refresh best-effort (sibling helpers like
`setAnalyticsEnabled` / `setMeetAutoOrchestratorHandoff` already swallow
here). The polling loop reconciles state within `POLL_MS` so any missed
update is not user-visible.

Sentry-Issue: OPENHUMAN-REACT-Z
Sentry-Issue: OPENHUMAN-REACT-Y
…ion leaks (#REACT-15 #REACT-11 #REACT-10 #REACT-12)

The three team-settings panels each fired a `void promise(...)` or
`void promise.finally(...)` against a `refreshTeams` / `refreshTeamMembers`
/ `refreshTeamInvites` chain from inside `useEffect`. A rejection from any
caller (cold core boot, backend 504, local AbortController 30 s ceiling)
became an unhandled rejection captured by Sentry's
`auto.browser.global_handlers.onunhandledrejection`, producing
OPENHUMAN-REACT-15/11 (`team_list_teams`), REACT-10 (`team_list_members`),
and REACT-12 (`team_list_invites`).

Demote each to a logged `core-rpc:error` breadcrumb. The polling loop in
`CoreStateProvider` reconciles state on the next tick, and any user-driven
retry (revisiting the panel) re-runs the same chain — so a transient
timeout is now silent, never user-visible noise, never Sentry noise.

Sentry-Issue: OPENHUMAN-REACT-15
Sentry-Issue: OPENHUMAN-REACT-11
Sentry-Issue: OPENHUMAN-REACT-10
Sentry-Issue: OPENHUMAN-REACT-12
…end (#REACT-15)

Last-line-of-defense filter so a future `await callCoreRpc(...)` chain
that forgets a `.catch()` cannot regress the REACT timeout family.
Mirrors the Rust-side classifier demote in PR tinyhumansai#2063.

Match by `instanceof CoreRpcError` first (in-process Sentry hook); fall
back to a duck-typed `name === 'CoreRpcError' && kind === 'timeout'`
check so cross-realm-constructed errors (test harness, dynamic import,
Vitest module isolation) still get demoted.

Non-timeout `CoreRpcError` shapes (`transport`, `auth_expired`,
`budget_exceeded`, `rate_limited`, `thread_not_found`) still surface
in Sentry — only the local AbortController noise is suppressed.

Sentry-Issue: OPENHUMAN-REACT-15
Sentry-Issue: OPENHUMAN-REACT-11
Sentry-Issue: OPENHUMAN-REACT-10
Sentry-Issue: OPENHUMAN-REACT-12
Sentry-Issue: OPENHUMAN-REACT-13
Sentry-Issue: OPENHUMAN-REACT-14
Sentry-Issue: OPENHUMAN-REACT-Z
Sentry-Issue: OPENHUMAN-REACT-Y
…REACT-15)

- Drops `CoreRpcError(kind='timeout')` passed via the `originalException`
  hint (the live in-process path).
- Cross-realm duck-typed match: rejects `name === 'CoreRpcError'` +
  `kind === 'timeout'` even when `instanceof` would fail.
- Preserves non-timeout `CoreRpcError` shapes (transport, auth_expired,
  …) so the filter cannot suppress real errors.

Sentry-Issue: OPENHUMAN-REACT-15
@oxoxDev oxoxDev requested a review from a team May 19, 2026 10:08
coderabbitai Bot commented May 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3efe24b6-9723-4287-b47f-8d8c31fb2c40

📥 Commits

Reviewing files that changed from the base of the PR and between c6c966f and d70abbc.

📒 Files selected for processing (4)
  • app/src/components/settings/panels/__tests__/TeamInvitesPanel.test.tsx
  • app/src/components/settings/panels/__tests__/TeamMembersPanel.test.tsx
  • app/src/components/settings/panels/__tests__/TeamPanel.test.tsx
  • app/src/providers/__tests__/CoreStateProvider.test.tsx

📝 Walkthrough


Adds a typed 'timeout' RPC error kind, classifies/throws timeouts from coreRpcClient, filters timeout errors from Sentry beforeSend, and wraps UI/provider refresh calls to catch and log refresh failures; tests verify classification, Sentry filtering, and no unhandled rejections.

Changes

Timeout Error Classification and Filtering

| Layer / File(s) | Summary |
| --- | --- |
| Core RPC timeout classification (app/src/services/coreRpcClient.ts) | CoreRpcErrorKind adds 'timeout'. classifyRpcError matches "timed out after Nms" and returns 'timeout' before other matches. callCoreRpc throws CoreRpcError(kind: 'timeout') when the AbortController timeout fires. |
| Core RPC timeout test coverage (app/src/services/__tests__/coreRpcClient.test.ts) | Tests capture rejected CoreRpcError with kind: 'timeout' and assert timeout message formatting; table-driven classifyRpcError tests include timed-out messages and precedence cases. |
| Sentry timeout error filtering (app/src/services/analytics.ts) | Imports CoreRpcError; adds isCoreRpcTimeoutError(err) (instanceof + duck-typed {name,kind}); updates Sentry beforeSend(event, hint) to drop events when the originalException is a Core RPC timeout. |
| Sentry timeout filtering test coverage (app/src/services/__tests__/analytics.test.ts) | Mock config adds CORE_RPC_URL and CORE_RPC_TIMEOUT_MS; captureBeforeSend updated to accept hint.originalException; tests ensure timeout errors (including cross-realm shapes) are dropped and non-timeout CoreRpcError shapes are allowed. |
| Settings panel refresh error handling (app/src/components/settings/panels/TeamInvitesPanel.tsx, TeamMembersPanel.tsx, TeamPanel.tsx) | Panels add debug logger and sanitizeError; refresh promise chains now include .catch(...) that logs sanitized errors and retain .finally(...) to clear loading state. TeamPanel logs CoreRpcError kind when available. |
| Provider refresh error handling (app/src/providers/CoreStateProvider.tsx) | After updateLocalState and after storing session token, provider invokes refresh().catch(...) to log sanitized errors and avoid propagating refresh failures as unhandled rejections. |
| UI unhandled-rejection regression tests (app/src/components/settings/panels/__tests__/*, app/src/providers/__tests__/CoreStateProvider.test.tsx) | New Vitest tests install window.unhandledrejection listeners and assert that panel and provider flows swallow refresh rejections (timeouts/transport) rather than surfacing unhandled promise rejections. |

Sequence Diagram

sequenceDiagram
  participant Client
  participant callCoreRpc
  participant AbortController
  participant classifyRpcError
  participant Analytics_beforeSend

  Client->>callCoreRpc: invoke RPC method
  callCoreRpc->>AbortController: start timeout timer
  AbortController->>callCoreRpc: abort on CORE_RPC_TIMEOUT_MS
  callCoreRpc->>callCoreRpc: throw CoreRpcError(kind: 'timeout', message: 'Core RPC <method> timed out after Nms')
  callCoreRpc->>classifyRpcError: classify error message
  classifyRpcError->>classifyRpcError: return 'timeout'
  callCoreRpc->>Analytics_beforeSend: error reaches Sentry hook (hint.originalException)
  Analytics_beforeSend->>Analytics_beforeSend: isCoreRpcTimeoutError(hint.originalException) => true
  Analytics_beforeSend->>Analytics_beforeSend: drop event (return null)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2167: Modifies CoreStateProvider bootstrap/polling refresh error handling and debug logging at similar control-flow points.

Suggested reviewers

  • senamakel
  • graycyrus

Poem

A rabbit logs a timeout's sigh,
It classifies and gently ties.
Panels catch what used to flee,
Sentry sleeps—no noisy plea.
Hops of code, safe and spry. 🐇

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically describes the main changes: fixing timeout unhandled-rejections and adding a timeout classification in the RPC client, matching the PR's core objective of reducing Sentry noise from 30s RPC timeouts. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 19, 2026
coderabbitai Bot previously approved these changes May 19, 2026
…≥80% (#REACT-15+)

Coverage Gate on PR tinyhumansai#2196 caught the new `.catch()` blocks at 0% on
changed lines (`TeamPanel:44-45,55-56`, `TeamMembersPanel:52-54,56`,
`TeamInvitesPanel:53-56`, `CoreStateProvider:579-580,606-607`). Add
focused tests that mount each panel with a rejecting `refreshTeams` /
`refreshTeamMembers` / `refreshTeamInvites` and assert
`window.unhandledrejection` never fires, plus two CoreStateProvider
cases that reject `fetchCoreAppSnapshot` on the follow-up refresh in
`updateLocalState` (via `setEncryptionKey`) and `storeSessionToken`.

Locks the regression behaviour in: every new `.catch()` arm has at
least one execution path that exercises it.

Sentry-Issue: OPENHUMAN-REACT-15
Sentry-Issue: OPENHUMAN-REACT-10
Sentry-Issue: OPENHUMAN-REACT-12
Sentry-Issue: OPENHUMAN-REACT-Z
Sentry-Issue: OPENHUMAN-REACT-Y
@graycyrus graycyrus left a comment

Review — graycyrus

Clean PR. Well-structured, layered fix for the REACT-15/11/10/12/Z/Y Sentry unhandled-rejection family.

Walkthrough

This PR adds a 'timeout' kind to CoreRpcErrorKind, promotes the AbortController throw site from bare Error to CoreRpcError(kind='timeout'), and wires .catch() guards at all five leaking call sites (three team settings panels + two CoreStateProvider write paths). A beforeSend filter in analytics.ts acts as a last-line-of-defense to drop any future timeout that slips past a missing .catch(). Test coverage is thorough — each Sentry issue ID gets explicit regression coverage.

Change Summary

| File | Change | Notes |
| --- | --- | --- |
| coreRpcClient.ts | Add 'timeout' kind + classifier arm + throw CoreRpcError on abort | Correct ordering before generic transport arm |
| analytics.ts | isCoreRpcTimeoutError + beforeSend filter | Cross-realm duck-typing is a nice touch |
| TeamPanel.tsx | try/catch in refreshTeamsWithLoading + defensive .catch() in useEffect | Belt-and-suspenders approach is warranted here |
| TeamInvitesPanel.tsx | .catch().finally() chain | Clean fix |
| TeamMembersPanel.tsx | .catch().finally() chain | Clean fix |
| CoreStateProvider.tsx | .catch() on refresh() in updateLocalState + storeSessionToken | Polling loop reconciles; correct call |
| 6 test files | Regression tests for all Sentry IDs | unhandledrejection listener approach is solid |

Notes

  • Classifier precedence is correct: /timed out after \d+ms/i (local AbortController shape) fires before the generic /timed out/ in the transport arm. The existing 'operation timed out after 30s' backend shape correctly stays as transport since it lacks the ms suffix.
  • The as never casts in test mocks are fine — standard Vitest partial-mock pattern.
  • No security, PII, or breaking-change concerns. The new CoreRpcErrorKind variant is additive and no consumers currently switch exhaustively on it.

Nice work cleaning up this Sentry noise systematically. LGTM.

@senamakel senamakel merged commit 1ae31ba into tinyhumansai:main May 20, 2026
31 of 48 checks passed
mtkik pushed a commit to mtkik/openhuman-meet that referenced this pull request May 21, 2026
CodeGhost21 pushed a commit to CodeGhost21/openhuman that referenced this pull request May 22, 2026
Labels

working A PR that is being worked on by the team.

3 participants