Defer OTA when live DB open, use native download for large DBs, and reorder app init/hydration#1533
Closed
CraigBuckmaster wants to merge 1 commit into
Closed
Conversation
CraigBuckmaster
pushed a commit
that referenced
this pull request
Apr 20, 2026
…S5 teardown crash)
## The crash
TestFlight 1.0.7(20) segfaulted 3.5s after launch — EXC_BAD_ACCESS
on Thread 7, triggered by:
exsqlite3_finalize
sqlite3Fts5IndexClose
fts5DisconnectMethod
sqlite3VtabUnlock
disconnectAllVtab
sqlite3Close
SQLiteModule.closeDatabase (expo-sqlite iOS, SQLiteModule.swift:489)
The call chain: SDK 54 expo-sqlite's closeAsync → sqlite3Close →
disconnectAllVtab → iterates every FTS5 vtab → fts5DisconnectMethod
→ fts5FreeVtab → sqlite3Fts5IndexClose → exsqlite3_finalize on a
stale/freed statement pointer → crash.
This is a teardown-path bug in FTS5 virtual-table cleanup that
manifests specifically when:
1. The DB has FTS5 tables (scripture.db has them for verse/people search)
2. PRAGMA integrity_check was run on the connection (which loads
and validates every FTS5 index, leaving them in a state that
doesn't clean up correctly)
3. closeAsync is called on that connection
Our full-DB download verification path added by #1529 (cherry-pick
from #1531 commit 1) did exactly that: open temp DB, run content_hash
check, run integrity_check, close. The integrity_check was the
trigger.
## The fix
Remove the PRAGMA integrity_check from the full-DB post-download
verification path. Keep the content_hash check — it's sufficient
validation: the hash was computed at build time over the full file
contents, baked into db_meta at build time, and uploaded with the
DB to R2. A corrupt/truncated download is overwhelmingly likely to
fail at SQLite open (exception thrown before our read) or to miss
the db_meta row entirely.
This is less rigorous than integrity_check but avoids the teardown
crash. TODO: re-add proper integrity verification once expo-sqlite's
FTS5 teardown bug is resolved upstream, or migrate to streaming
SHA-256 during the chunked write.
## Scope
INTENTIONALLY LIMITED. The delta-apply path in applyDelta() has
the same latent PRAGMA integrity_check → close pattern (~line 234).
Left in place because:
- Delta updates don't fire for first-launch users (the crash we're
chasing is build-20 first-launch)
- Delta integrity_check runs AFTER transactional SQL mutation, so
it's more valuable as a corruption check than the post-download
one was
- Touching it expands scope beyond the confirmed crash site
- If it ever DOES crash for a delta user, same fix applies; we'll
know when we see the crash log
## Tests
- Removed 'returns failed when integrity_check fails after download'
(asserted behavior that no longer exists)
- Removed 5 dead integrity_check mocks from full-DB tests where they
were set up but would never be consumed now
- Delta-path tests untouched (still exercise integrity_check)
- 35 ContentUpdater tests pass (was 36, minus 1 removed)
- tsc clean, lint 0 errors
## Build budget
Build 21 will be the 2nd build on master since #1529 shipped. With
no other changes, failure mode cascade from the cascade in #1529:
- If Alert shows on relaunch → a JS error got past the fixes.
Handler captured it. We see it next launch.
- If app reaches home screen → all three crash causes fixed.
We revert the diagnostic and ship.
- If app still crashes silently → exceptionally rare; a different
native-layer issue we haven't hit yet.
## References
- Crash incident: 8C94684A-D8BE-45F5-B9A7-412450196AC4
- iOS Analytics log: CompanionStudy-2026-04-19-..-1.0.7(20)
- Supersedes PR #1533 (which kept the integrity_check call in place)
Owner
Author
|
Closing — the underlying crash was in the The 'defer OTA apply while live DB is open' idea from this PR is still valuable for the OTA-during-runtime path (subsequent launches, not first-launch). That idea is being folded into #1532 (the DB lifecycle hardening PR) as a follow-up once build 21 confirms #1534 resolves the first-launch crash. |
CraigBuckmaster
pushed a commit
that referenced
this pull request
Apr 20, 2026
…tion ## Safety valve Previously, closeDatabaseConnection() swallowed errors from closeAsync and always set db = null, reporting success. If the native SQLite close actually failed (e.g., a statement still in flight), the subsequent file-swap would then move a temp file over a DB that SQLite still held open — leading to lock errors or, on iOS 26 with FTS5 tables, the teardown crash that PR #1534 just fixed for a different code path. Now closeDatabaseConnection() returns Promise<boolean>: - true : close succeeded OR db was already null (safe to swap) - false : closeAsync threw (NOT safe to swap) ContentUpdater.applyDelta and .downloadFullDb both check the return value. On false, they: 1. Log a warning 2. Cleanup any temp files (full-DB path) 3. Return { status: 'up_to_date' } The next OTA check will try again. Failing an update is strictly better than corrupting the DB file. ## Option B TODO Added a 30-line documentation block above ContentUpdaterService explaining the long-term architectural direction: defer-to-cold-start instead of close-and-reopen. Rationale: - The ONLY fully-safe time to swap the DB file is when no SQLite connection exists at all. - That condition naturally occurs during app cold-start, before initDatabase() runs. - Current close-and-reopen creates that condition artificially mid- session, which is fragile (stale getDb() refs, in-flight query errors). - Defer-to-cold-start trades 'updates land immediately' for 'updates land on next launch', which is acceptable for scholarly content updates. This PR ships phase 1 (close-and-reopen with safety valve). Phase 2 (full cold-start-swap architecture) is a separate project tracked as a TODO comment. PR #1533's original 'defer OTA when live DB open' idea informed this direction but cannot be implemented as-is (the DB is always open after startup, so defer-on-live-db would mean updates never fire). ## Tests - Added 2 focused tests covering the safety valve: - aborts delta apply when live DB close fails - aborts full DB swap when live DB close fails Both assert status=up_to_date AND that the swap/backup was NOT attempted (mockFileOps.move / copy). - Updated mock default return values for closeDatabaseConnection from undefined to true (matches new signature). - 41 ContentUpdater tests pass (was 39 + 2 new). - Broader sweep: 30 suites / 454 tests pass. - Typecheck clean, lint 0 errors. ## Not in scope Cold-start-swap implementation is out of scope for this PR — that's a separate project. This PR keeps #1532's original close-and-reopen strategy and hardens it; it does not replace it.
This was referenced Apr 20, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
Description
getDbIfInitialized()andcloseDatabaseConnection()insrc/db/database.tsto expose the live DB handle and allow safe close-before-swap.services/ContentUpdater.tsadded a guard to defer OTA apply when a live DB handle exists, introducedLARGE_DB_NATIVE_DOWNLOAD_THRESHOLD_BYTES, and selected a nativeFile.downloadFileAsyncpath for large full-DB downloads while keeping XHR-based progress for small downloads.db_metacontent_hashandPRAGMA integrity_checkbefore swapping the file.App.tsxto introducehydrateAppState()and to early-return to the download screen wheninitDatabase()reports'needs_download', and to run the full hydration sequence after a successful post-download init.DbDownloadScreen.tsxsoonCompletemay be async and isawaited in the download flow.__tests__/services/ContentUpdater.test.tsto mock the live DB helper, add tests for reusing live DB handles, deferring OTA when DB is open, native-download behavior for large payloads, and integrity-check failure scenarios, and to adjust progress/error expectations accordingly.Testing
jest(theContentUpdater servicesuite and related mocks); all tests in the updated__tests__/services/ContentUpdater.test.tspassed.getInstalledVersion,checkForUpdates,downloadFullDb, andapplyDeltabehaviors; tests succeeded.Codex Task