fix + diagnostic: native DB download, correct startup ordering, JS error handler#1529
Conversation
## Why Builds 15/16/18 crash ~15s after launch with a signature that shows the terminate path (RCTFatal → RCTExceptionsManager.reportException → void TurboModule NSException → abort) but NOT the originating JS error. Sentry is installed but has captured zero events from production iOS across all crashing builds — the 'smoke test' event visible in the Sentry dashboard was a manual DSN-verification from outside the app, not real telemetry. PR #1525's startup-probe approach failed (reverted in #1528): it gated the render tree and produced a worse symptom (black screen, no crash log) than the crash it was trying to diagnose. ## Approach Hook React Native's ErrorUtils.setGlobalHandler and write a tiny JSON blob to disk on any JS error, upstream of ExceptionsManager. On next launch, display via native Alert.alert. ErrorUtils is installed by React Native's JS bootstrap before any module loads — see Libraries/vendor/core/ErrorUtils.js comment on line 41: 'this ErrorUtils must be defined (and the handler set) globally before requiring anything.' So installing from the first import in index.ts captures errors during module resolution too. ## Why this won't repeat the probe mistake - Does NOT wrap App or any other component - Does NOT gate the render tree - Does NOT do any async work before first paint - Display via native Alert.alert (renders on top of any screen) - fire-and-forget from existing useEffect, no new render paths ## Chain design When a JS error fires: 1. Our handler runs first (captures error to disk, sync + small) 2. We call the prior handler (RN default → ExceptionsManager) 3. ExceptionsManager does whatever it does (likely what causes the NSException crash in the first place) 4. Process dies — but our file is already on disk On next launch: App.tsx init useEffect fires displayLastCrashIfAny() in parallel with init, which pops an Alert if a file exists. ## What this build will tell us Three possible outcomes: - App crashes, next launch shows Alert with JS error → we know the cause, fix it, done - App crashes, next launch shows no Alert → error is happening before our handler installs (i.e. during crashHandler's own require graph resolution). Rare but possible. - App doesn't crash → something incidentally fixed it, or the bug is heisen-style ## Files - app/src/utils/crashHandler.ts (new, ~150 LOC) - app/index.ts (+2 lines, 1 new import block) - app/App.tsx (+1 import, +5 lines in existing init useEffect) ## Verified - typecheck: clean - lint: 0 errors, 7 pre-existing warnings + 1 intentional import ordering warning (crashHandler must load first) - tests: 18 suites / 326 tests pass (DB, stores, ContentUpdater) ## Remove when done This is a diagnostic module. Once we know the JS error and fix it, revert this commit.
Test Results✅ All tests passed
Coverage
⏱️ Duration: 84.1s |
## Bug
App.tsx's onComplete callback for DbDownloadScreen previously:
try { await initDatabase(); }
catch (e) { console.error('Post-download init error:', e); }
setDbStatus('ready');
If initDatabase() threw, the catch logged it and then STILL transitioned
the app to 'ready'. A downstream DB read against the uninitialised DB
would then throw — and on iOS 26 that's exactly the kind of JS error
that manifests as the ExceptionsManager / NSException crash we've been
chasing on builds 15/16/18.
The return value was also ignored: even if initDatabase() returned
'needs_download' (indicating a corrupt/invalid downloaded file), the app
would still transition to 'ready'.
## Fix
- App.tsx onComplete: only transition to 'ready' when initDatabase()
explicitly returns 'ready'. If it throws or returns 'needs_download',
re-throw so DbDownloadScreen can display the error and offer Retry
via its existing error UI.
- DbDownloadScreen:
- Widen onComplete type to () => void | Promise<void>
- Await onComplete in startDownload and catch any thrown error, routing
it through the existing setStatus('error') + setError(...) flow.
## Relationship to the startup crash
This is a latent bug independent of the main build-18 crash, but it
could contribute to or mask that crash in some scenarios. Fixing it
eliminates one class of 'crash happens right after onComplete' as a
candidate explanation, so the diagnostic crashHandler added in this
same PR sees a cleaner picture when it fires.
## Credit
The bug was identified and the fix pattern proposed in code review.
Implementation uses the reviewer's cleaner unified try/catch structure
preserving the original error instance via .
## Verified
- typecheck clean
- 14 test suites / 246 tests pass (DB + ContentUpdater)
- No new dedicated DbDownloadScreen tests added in this PR — would
require mocking ContentUpdater in a new helper; deferred as scope
creep on a diagnostic branch. Typecheck enforces the interface.
Added commit 71e1a8a — onComplete post-download init fixIn code review, a separate bug was spotted in Fix now on this branch:
Relationship to the crash investigation: Latent bug independent of the main crash, but could contribute to or mask it. Fixing eliminates one candidate so the diagnostic Tests: 14 suites / 246 tests pass (DB + ContentUpdater). No new DbDownloadScreen-specific tests — that would require mocking Credit: Bug identified + cleaner fix pattern proposed in review. Commit uses the reviewer's unified try/catch structure. PR now contains 2 commits:
Ready to merge + build. |
f697b94 to
955be7f
Compare
…S5 teardown crash)
## The crash
TestFlight 1.0.7(20) segfaulted 3.5s after launch — EXC_BAD_ACCESS
on Thread 7, triggered by:
exsqlite3_finalize
sqlite3Fts5IndexClose
fts5DisconnectMethod
sqlite3VtabUnlock
disconnectAllVtab
sqlite3Close
SQLiteModule.closeDatabase (expo-sqlite iOS, SQLiteModule.swift:489)
The call chain: SDK 54 expo-sqlite's closeAsync → sqlite3Close →
disconnectAllVtab → iterates every FTS5 vtab → fts5DisconnectMethod
→ fts5FreeVtab → sqlite3Fts5IndexClose → exsqlite3_finalize on a
stale/freed statement pointer → crash.
This is a teardown-path bug in FTS5 virtual-table cleanup that
manifests specifically when:
1. The DB has FTS5 tables (scripture.db has them for verse/people search)
2. PRAGMA integrity_check was run on the connection (which loads
and validates every FTS5 index, leaving them in a state that
doesn't clean up correctly)
3. closeAsync is called on that connection
Our full-DB download verification path added by #1529 (cherry-pick
from #1531 commit 1) did exactly that: open temp DB, run content_hash
check, run integrity_check, close. The integrity_check was the
trigger.
## The fix
Remove the PRAGMA integrity_check from the full-DB post-download
verification path. Keep the content_hash check — it's sufficient
validation: the hash was computed at build time over the full file
contents, baked into db_meta at build time, and uploaded with the
DB to R2. A corrupt/truncated download is overwhelmingly likely to
fail at SQLite open (exception thrown before our read) or to miss
the db_meta row entirely.
This is less rigorous than integrity_check but avoids the teardown
crash. TODO: re-add proper integrity verification once expo-sqlite's
FTS5 teardown bug is resolved upstream, or migrate to streaming
SHA-256 during the chunked write.
## Scope
INTENTIONALLY LIMITED. The delta-apply path in applyDelta() has
the same latent PRAGMA integrity_check → close pattern (~line 234).
Left in place because:
- Delta updates don't fire for first-launch users (the crash we're
chasing is build-20 first-launch)
- Delta integrity_check runs AFTER transactional SQL mutation, so
it's more valuable as a corruption check than the post-download
one was
- Touching it expands scope beyond the confirmed crash site
- If it ever DOES crash for a delta user, same fix applies; we'll
know when we see the crash log
## Tests
- Removed 'returns failed when integrity_check fails after download'
(asserted behavior that no longer exists)
- Removed 5 dead integrity_check mocks from full-DB tests where they
were set up but would never be consumed now
- Delta-path tests untouched (still exercise integrity_check)
- 35 ContentUpdater tests pass (was 36, minus 1 removed)
- tsc clean, lint 0 errors
## Build budget
Build 21 will be the 2nd build on master since #1529 shipped. With
no other changes, failure mode cascade from the cascade in #1529:
- If Alert shows on relaunch → a JS error got past the fixes.
Handler captured it. We see it next launch.
- If app reaches home screen → all three crash causes fixed.
We revert the diagnostic and ship.
- If app still crashes silently → exceptionally rare; a different
native-layer issue we haven't hit yet.
## References
- Crash incident: 8C94684A-D8BE-45F5-B9A7-412450196AC4
- iOS Analytics log: CompanionStudy-2026-04-19-..-1.0.7(20)
- Supersedes PR #1533 (which kept the integrity_check call in place)
Combined build — three crash hypothesis fixes + diagnostic safety net
This PR now bundles four logical changes that together test three strong crash hypotheses in a single build, with a JS-error diagnostic as a safety net if none of the fixes resolve the crash.
The four commits
1. Global JS error handler diagnostic (
058c01c7)Hooks
ErrorUtils.setGlobalHandlerto capture JS errors upstream ofExceptionsManager, writes toDocuments/last-js-error.jsonsynchronously, chains to the prior handler so RN default + Sentry still run. On next launch,App.tsxfiresdisplayLastCrashIfAny()which pops a nativeAlert.alertwith the captured error + stack preview. Does NOT gate render tree, does NOT do async work before first paint — structurally incapable of producing the build-19 black-screen regression.2. Post-download init error surfacing (
71e1a8ae)App.tsx's post-downloadonCompletenow throws wheninitDatabase()returns non-'ready'or throws.DbDownloadScreenawaitsonCompleteand surfaces errors through its existing retry UI. Previously, a failed post-download init silently transitioned to the main app tree with a broken DB, guaranteeing downstream crashes.3. Native download + PRAGMA integrity_check for large DBs (cherry-picked from #1531)
Likely the biggest contributor to the crash. Two memory eliminations rolled into one:
responseType='arraybuffer'kept the entire ~100MB payload in JS heap even after the chunked write finished. NowFile.downloadFileAsyncstreams directly to disk for payloads ≥ 25MB. XHR path retained only for small payloads where fine-grained progress is useful.verifyChecksum()was loading the downloaded file viabase64→atob→Uint8Array— ~400MB peak JS heap for a 100MB DB. Replaced withdb_meta.content_hash+PRAGMA integrity_check— kilobytes, not hundreds of megabytes.Build 18 crashed ~15s after launch with a signature matching OOM-during-XHR-response processing. This fix addresses both memory sources.
4. Startup hydration ordering (cherry-picked from #1531)
App.tsxpreviously raninitUserDatabase(), Sentry binding, and all three store hydrations PLUSpruneEvents()even when scripture.db wasn't yet present (needs_downloadstatus). That guarantees downstream errors (pruneEvents runs SQL against the missing scripture.db). The catch-alltry/catchswallowed those errors silently but left stores in partial-hydration state — which could manifest as a crash later.Now: on
needs_download, app short-circuits to DbDownloadScreen immediately. Full hydration runs ONCE — either during normal init (when DB was already ready) OR inside the post-downloadonCompletecallback (after first-launch download). Extracted tohydrateAppState()at module scope so both paths share the same sequence.What's NOT included from #1531
Three commits from PR #1531 were held back to keep this PR's scope reviewable:
c367db92— reuse live DB handle ingetInstalledVersion()(legit optimisation but doesn't address first-launch crash)5b0c070d— close live DB before OTA swapsbf17eaee— reopen DB after swap succeeds/failsThe last two together are correct but introduce DB lifecycle complexity. Every caller of
getDb()becomes a potential stale-reference bug if anything caches the reference across an update. Worth doing — in a follow-up PR with time for proper review. They address the OTA update path, not the first-launch crash we're chasing.The hypothesis test
If build 20 launches cleanly → one or more of commits 2-4 fixed it. Diagnostic never fires.
If build 20 still crashes → the diagnostic captures the actual JS error and the next launch shows a native Alert. We finally see what's throwing. At that point we know whether to chase the OTA path, native-layer issues, or something else entirely.
Either way: one build, definitive answer.
Files changed
app/src/utils/crashHandler.ts(new, ~150 LOC)app/src/services/ContentUpdater.ts(native download path, integrity_check verification)app/src/screens/DbDownloadScreen.tsx(async-aware onComplete, surfaces errors)app/App.tsx(crash-handler call, hoisted hydrateAppState, correct status gating, full hydration in post-download)app/index.ts(crash handler install)app/__tests__/services/ContentUpdater.test.ts(PRAGMA integrity_check mock chain added to 6 tests)Verification
Supersedes
After crash is fixed
Revert
crashHandler.tsand its two call sites. The other three changes (native download, hydration ordering, onComplete hardening) stay — they're improvements regardless.