Add database integrity check to detect and recover from corruption by findolor · Pull Request #2387 · rainlanguage/raindex

findolor · 2026-01-05T11:25:49Z

Motivation

See issues:

Local DB sync fails with 'database disk image is malformed' error #2385

When a browser tab is closed during a sync operation, the SQLite database can become corrupted with "database disk image is malformed" errors. Previously, this caused sync to fail repeatedly with no recovery mechanism, forcing users to manually clear their browser data.

Solution

Add a database integrity check at startup using SQLite's PRAGMA quick_check command:

New query module (integrity_check.rs): Defines the PRAGMA quick_check statement and IntegrityCheckRow struct to parse the JSON response
New check_integrity() method: Added to BootstrapPipeline trait to run the integrity check and return whether the database is healthy
Updated runner_run(): Now checks integrity at startup before any other operations
- If quick_check returns non-"ok" → database is corrupted → reset and resync from dump
- If quick_check query fails entirely (severe corruption) → treat as corrupted → reset and resync from dump

This ensures automatic recovery from database corruption without user intervention.

Checks

By submitting this for review, I'm confirming I've done the following:

made this PR as small as possible
unit-tested any new functionality
linked any relevant issues or PRs
included screenshots (if this involves a front-end change)

fix #2385

Summary by CodeRabbit

New Features
- Added automatic database integrity verification that detects and handles database health issues by resetting corrupted databases before they impact operations.
Tests
- Expanded test coverage to validate database integrity verification and automatic recovery mechanisms, including comprehensive scenarios for corruption detection, error handling, and reset procedures.


- Add PRAGMA quick_check query to detect database corruption at startup
- Add check_integrity() method to BootstrapPipeline trait
- Reset database automatically when corruption is detected
- Handle both cases: non-"ok" response and query errors (severe corruption)
- Add IntegrityCheckRow struct to parse JSON response from quick_check

coderabbitai · 2026-01-05T11:26:02Z

Walkthrough

This PR adds database integrity checking functionality to detect and recover from corrupted databases. A new check_integrity method is introduced to the BootstrapPipeline trait that executes SQLite's PRAGMA quick_check. The client-side runner is updated to invoke this check at startup and automatically reset the database if corruption is detected.

Changes

Cohort / File(s)	Summary
New Integrity Check Module `crates/common/src/local_db/query/integrity_check.rs`	Introduces public constant `INTEGRITY_CHECK_SQL` ("PRAGMA quick_check"), public struct `IntegrityCheckRow` for parsing results, and public function `integrity_check_stmt()` to create the SQL statement. Includes unit tests for statement validity and JSON serialization/deserialization.
Module Exports `crates/common/src/local_db/query/mod.rs`	Adds new public module `integrity_check` to the query module hierarchy.
Bootstrap Pipeline Trait `crates/common/src/local_db/pipeline/adapters/bootstrap.rs`	Adds asynchronous `check_integrity<DB>(&self, db: &DB) -> Result<bool, LocalDbError>` method to `BootstrapPipeline` trait with implementation that returns true for "ok" responses (case-insensitive) and false otherwise. Includes unit tests for various integrity check responses and error handling.
Client-Side Bootstrap Runner `crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs`	Integrates integrity check into runner logic: executes `check_integrity` upfront and resets the database via `reset_db` if the check fails or returns an error. Adds comprehensive test mocks (`CorruptedDb`, `IntegrityCheckFailsDb`) and new test scenarios to verify reset behavior on database corruption and integrity check failures.
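
For orientation, the query module summarized in the first table row might look roughly like the sketch below. Only `INTEGRITY_CHECK_SQL`, `IntegrityCheckRow`, its `quick_check` field, and `integrity_check_stmt()` are named in this PR; the `SqlStatement` type here is a hypothetical, simplified stand-in for the crate's own statement type, and the serde derives mentioned in the summary are left out so the sketch depends only on the standard library.

```rust
/// SQL text named in the change summary.
pub const INTEGRITY_CHECK_SQL: &str = "PRAGMA quick_check";

/// Hypothetical, simplified stand-in for the crate's SQL statement type.
#[derive(Debug, Clone, PartialEq)]
pub struct SqlStatement {
    sql: String,
    params: Vec<String>,
}

impl SqlStatement {
    /// Builds a statement with no bound parameters.
    pub fn new(sql: impl Into<String>) -> Self {
        Self {
            sql: sql.into(),
            params: Vec::new(),
        }
    }

    /// Returns the SQL text.
    pub fn sql(&self) -> &str {
        &self.sql
    }

    /// Returns the bound parameters (always empty for the pragma).
    pub fn params(&self) -> &[String] {
        &self.params
    }
}

/// One result row; the field mirrors the `quick_check` column.
/// The real struct also derives serde traits, omitted in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct IntegrityCheckRow {
    pub quick_check: String,
}

/// Creates the statement that runs the integrity check.
pub fn integrity_check_stmt() -> SqlStatement {
    SqlStatement::new(INTEGRITY_CHECK_SQL)
}
```

The pragma takes no parameters, so the statement is effectively a constant wrapped in the statement type.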

Sequence Diagram(s)

sequenceDiagram
    participant Runner as Runner
    participant BootstrapPipeline as BootstrapPipeline
    participant DB as Database
    participant Reset as Reset Handler

    rect rgba(100, 200, 100, 0.2)
    Note over Runner,DB: Integrity Check Phase
    Runner->>BootstrapPipeline: check_integrity(db)
    BootstrapPipeline->>DB: Execute PRAGMA quick_check
    DB-->>BootstrapPipeline: IntegrityCheckRow {quick_check}
    end

    alt Integrity OK (result == "ok")
        rect rgba(150, 200, 150, 0.1)
        BootstrapPipeline-->>Runner: Ok(true)
        Runner->>Runner: Continue inspection & schema checks
        end
    else Integrity Failed or Error
        rect rgba(200, 100, 100, 0.2)
        BootstrapPipeline-->>Runner: Ok(false) or Err
        Runner->>Reset: reset_db()
        Reset->>DB: Clear/recreate metadata & views
        Reset-->>Runner: Complete
        Runner->>Runner: Exit early, skip further checks
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Add bootstrap pipeline with envrionment specific logic #2277 — Extends the BootstrapPipeline trait and introduces integrity-check query usage within the client/producer bootstrap flow, modifying the same files and trait interface as this PR.

Suggested reviewers

0xgleb
hardyjosh

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 51.43% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately and concisely describes the main change: adding a database integrity check mechanism to detect and automatically recover from SQLite corruption.
Linked Issues check	✅ Passed	The PR implements all core requirements from #2385: automatic integrity check at startup, database reset on corruption detection, and recovery path to prevent manual intervention.
Out of Scope Changes check	✅ Passed	All code changes are directly scoped to implementing the integrity check feature and its integration into the bootstrap pipeline; no unrelated modifications detected.

✨ Finishing touches

📝 Generate docstrings

📜 Recent review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e93521 and 84162d3.

📒 Files selected for processing (4)

crates/common/src/local_db/pipeline/adapters/bootstrap.rs
crates/common/src/local_db/query/integrity_check.rs
crates/common/src/local_db/query/mod.rs
crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

🧰 Additional context used

📓 Path-based instructions (3)

crates/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

crates/**/*.rs: For Rust crates in crates/*, run lints using nix develop -c cargo clippy --workspace --all-targets --all-features -D warnings
For Rust crates in crates/*, run tests using nix develop -c cargo test --workspace or --package <crate>

Files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs
crates/common/src/local_db/query/integrity_check.rs
crates/common/src/local_db/query/mod.rs
crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

**/crates/**

📄 CodeRabbit inference engine (AGENTS.md)

Rust workspace organized as crates/* with subdirectories: cli, common, bindings, js_api, quote, subgraph, settings, math, integration_tests

Files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs
crates/common/src/local_db/query/integrity_check.rs
crates/common/src/local_db/query/mod.rs
crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Rust: format code with nix develop -c cargo fmt --all
Rust: lint with nix develop -c rainix-rs-static (preconfigured flags included)
Rust: crates and modules use snake_case; types use PascalCase

Files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs
crates/common/src/local_db/query/integrity_check.rs
crates/common/src/local_db/query/mod.rs
crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

🧠 Learnings (11)

📓 Common learnings

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2145
File: crates/common/src/raindex_client/local_db/query/create_tables/query.sql:71-72
Timestamp: 2025-10-06T11:44:07.888Z
Learning: The local DB feature in the rain.orderbook codebase is not live yet (as of PR #2145), so schema migrations for existing databases are not required when modifying table structures in `crates/common/src/raindex_client/local_db/query/create_tables/query.sql`.

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2202
File: crates/common/src/raindex_client/local_db/sync.rs:33-34
Timestamp: 2025-10-14T07:51:55.148Z
Learning: In `crates/common/src/raindex_client/local_db/sync.rs`, the hard-coded `DEFAULT_SYNC_CHAIN_ID` constant (set to `SUPPORTED_LOCAL_DB_CHAINS[0]`) will be replaced with proper chain ID handling in downstream PRs as part of the multi-network/orderbook implementation.

📚 Learning: 2025-10-06T14:41:41.909Z

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2159
File: crates/cli/src/commands/local_db/sync/runner/mod.rs:52-113
Timestamp: 2025-10-06T14:41:41.909Z
Learning: The local DB sync CLI command (crates/cli/src/commands/local_db/sync/) is designed for CI-only usage, and simple println! statements are preferred over structured logging for status messages.

Applied to files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs
crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

📚 Learning: 2025-10-28T14:11:56.648Z

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2277
File: crates/common/src/local_db/query/create_tables/query.sql:11-19
Timestamp: 2025-10-28T14:11:56.648Z
Learning: In the target_watermarks table (crates/common/src/local_db/query/create_tables/query.sql), additional indexes beyond the composite primary key (chain_id, orderbook_address) are not needed because the table will have a small number of rows.

Applied to files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs

📚 Learning: 2025-10-18T10:38:41.273Z

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2237
File: crates/common/src/raindex_client/local_db/sync.rs:79-89
Timestamp: 2025-10-18T10:38:41.273Z
Learning: In `crates/common/src/raindex_client/local_db/sync.rs`, the sync_database method currently only supports indexing a single orderbook per chain ID, which is why `.first()` is used to select the orderbook configuration. Multi-orderbook support per chain ID is planned for future PRs.

Applied to files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs

📚 Learning: 2025-11-25T16:50:31.752Z

Learnt from: CR
Repo: rainlanguage/rain.orderbook PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-25T16:50:31.752Z
Learning: Applies to crates/integration_tests/**/*.rs : Rust: write tests using `cargo test`; integration tests live in `crates/integration_tests`. Prefer `insta` snapshots and `proptest` where helpful

Applied to files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs
crates/common/src/local_db/query/integrity_check.rs

📚 Learning: 2025-05-13T20:06:22.602Z

Learnt from: 0xgleb
Repo: rainlanguage/rain.orderbook PR: 1713
File: crates/settings/src/remote/chains/mod.rs:43-226
Timestamp: 2025-05-13T20:06:22.602Z
Learning: When writing tests for collections of complex objects in Rust, prefer item-by-item comparison over direct vector comparison to get more specific error messages that pinpoint exactly which item and field has a mismatch.

Applied to files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs

📚 Learning: 2025-05-20T10:20:08.206Z

Learnt from: 0xgleb
Repo: rainlanguage/rain.orderbook PR: 1859
File: crates/quote/src/quote_debug.rs:472-492
Timestamp: 2025-05-20T10:20:08.206Z
Learning: In the Rain Orderbook codebase, the `#[tokio::test(flavor = "multi_thread")]` annotation is specifically needed for tests that use `LocalEvm`, not just for consistency across all async tests.

Applied to files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs

📚 Learning: 2025-05-16T17:26:09.529Z

Learnt from: 0xgleb
Repo: rainlanguage/rain.orderbook PR: 1844
File: tauri-app/src-tauri/src/commands/wallet.rs:29-33
Timestamp: 2025-05-16T17:26:09.529Z
Learning: When testing error cases that might produce different types of errors depending on external conditions (such as hardware presence), using `unwrap_err()` without further assertions can be preferred over `assert!(result.is_err())` with specific error messages to avoid misleading readers about expected error details.

Applied to files:

crates/common/src/local_db/pipeline/adapters/bootstrap.rs

📚 Learning: 2025-12-03T10:40:25.429Z

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2344
File: crates/common/src/local_db/pipeline/runner/mod.rs:18-31
Timestamp: 2025-12-03T10:40:25.429Z
Learning: In `crates/common/src/local_db/pipeline/runner/mod.rs`, the `TargetSuccess` struct does not need separate `ob_id` or `orderbook_key` fields because the contained `SyncOutcome` already includes orderbook identification information such as chain_id and orderbook_address. This avoids redundant data duplication.

Applied to files:

crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

📚 Learning: 2025-10-06T11:44:07.888Z

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2145
File: crates/common/src/raindex_client/local_db/query/create_tables/query.sql:71-72
Timestamp: 2025-10-06T11:44:07.888Z
Learning: The local DB feature in the rain.orderbook codebase is not live yet (as of PR #2145), so schema migrations for existing databases are not required when modifying table structures in `crates/common/src/raindex_client/local_db/query/create_tables/query.sql`.

Applied to files:

crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

📚 Learning: 2025-10-06T11:13:29.956Z

Learnt from: findolor
Repo: rainlanguage/rain.orderbook PR: 2123
File: crates/common/src/raindex_client/local_db/mod.rs:23-29
Timestamp: 2025-10-06T11:13:29.956Z
Learning: In `crates/common/src/raindex_client/local_db/mod.rs`, the `Default` implementation for `LocalDb` that creates an RPC client pointing to `http://localhost:4444` is acceptable because the RPC client must be explicitly configured before actual usage in production scenarios.

Applied to files:

crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs

🧬 Code graph analysis (1)

crates/common/src/local_db/query/integrity_check.rs (1)

crates/common/src/local_db/query/sql_statement.rs (2)

sql (99-101)

params (103-105)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)

GitHub Check: test
GitHub Check: standard-tests (ubuntu-latest, test-js-bindings)
GitHub Check: standard-tests (ubuntu-latest, rainix-wasm-browser-test)
GitHub Check: standard-tests (ubuntu-latest, rainix-sol-legal)
GitHub Check: standard-tests (ubuntu-latest, rainix-rs-artifacts, true)
GitHub Check: standard-tests (ubuntu-latest, rainix-wasm-test)
GitHub Check: standard-tests (ubuntu-latest, rainix-sol-static)
GitHub Check: standard-tests (ubuntu-latest, rainix-sol-artifacts)
GitHub Check: standard-tests (ubuntu-latest, rainix-wasm-artifacts)
GitHub Check: standard-tests (ubuntu-latest, rainix-sol-test)
GitHub Check: standard-tests (ubuntu-latest, rainix-rs-static)
GitHub Check: standard-tests (ubuntu-latest, ob-rs-test, true)
GitHub Check: git-clean
GitHub Check: test
GitHub Check: test
GitHub Check: Deploy-Docs-Preview
GitHub Check: build-tauri (ubuntu-22.04, true)
GitHub Check: Deploy-Preview-Push

🔇 Additional comments (11)

crates/common/src/local_db/query/integrity_check.rs (2)

1-14: LGTM! Clean and well-structured integrity check module.

The implementation correctly uses SQLite's PRAGMA quick_check for integrity verification. The IntegrityCheckRow struct appropriately models the JSON response with necessary derives for serialization.

15-45: Tests cover essential scenarios.

The unit tests verify static SQL generation and serde round-trips for both healthy and corrupted responses. Good coverage for this data module.

crates/common/src/local_db/query/mod.rs (1)

18-18: Module export correctly added.

The integrity_check module is properly declared and alphabetically ordered.

crates/common/src/local_db/pipeline/adapters/bootstrap.rs (3)

11-11: Import correctly added for integrity check functionality.

121-131: Well-implemented integrity check method.

The implementation correctly:

Uses case-insensitive comparison for the "ok" response.

Treats empty responses as unhealthy (safe default).

Propagates query errors through the ? operator.

This defensive approach ensures that any unexpected response triggers the recovery path.
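
The three behaviours listed above can be illustrated with a minimal sketch. It is not the code under review: the real method is async and returns `LocalDbError`, whereas `IntegrityDb`, `DbError`, and `CannedDb` below are hypothetical synchronous stand-ins, and inspecting only the first row is an assumption.

```rust
/// Hypothetical row type mirroring the `IntegrityCheckRow {quick_check}` shape.
#[derive(Debug, Clone)]
pub struct IntegrityCheckRow {
    pub quick_check: String,
}

/// Hypothetical error type standing in for `LocalDbError`.
#[derive(Debug, PartialEq)]
pub struct DbError(pub String);

/// Hypothetical synchronous database abstraction (the real trait method is async).
pub trait IntegrityDb {
    fn run_quick_check(&self) -> Result<Vec<IntegrityCheckRow>, DbError>;
}

/// Returns Ok(true) only when the first row reads "ok", ignoring ASCII case.
pub fn check_integrity<D: IntegrityDb>(db: &D) -> Result<bool, DbError> {
    // `?` propagates a failed query to the caller unchanged.
    let rows = db.run_quick_check()?;
    // An empty result is treated as unhealthy, which is the safe default.
    Ok(rows
        .first()
        .is_some_and(|row| row.quick_check.eq_ignore_ascii_case("ok")))
}

/// Test double that replays a canned query outcome.
pub struct CannedDb(pub Result<Vec<String>, String>);

impl IntegrityDb for CannedDb {
    fn run_quick_check(&self) -> Result<Vec<IntegrityCheckRow>, DbError> {
        match &self.0 {
            Ok(values) => Ok(values
                .iter()
                .map(|v| IntegrityCheckRow {
                    quick_check: v.clone(),
                })
                .collect()),
            Err(msg) => Err(DbError(msg.clone())),
        }
    }
}
```

With this shape, "ok" and "OK" both count as healthy, while a corruption message or an empty result yields `Ok(false)` and a query failure surfaces as `Err`.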

768-849: Comprehensive test coverage for check_integrity.

The tests cover all important scenarios:

Healthy responses (case-insensitive)

Corrupted database messages

Generic non-ok responses

Empty responses

Error propagation

Good use of the existing MockDb pattern for consistency.

crates/common/src/raindex_client/local_db/pipeline/bootstrap.rs (5)

90-94: Integrity check correctly gates the bootstrap flow.

The implementation:

Uses unwrap_or(false) to safely treat query errors as corruption (matching PR objectives).

Resets the database and returns early on corruption detection.

The early return after reset_db is appropriate since the database is freshly initialized with the correct schema and metadata, making subsequent inspect_state and ensure_schema checks redundant.
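
As a rough illustration of the gating described above, the following sketch folds a failed check into "unhealthy" with `unwrap_or(false)` and returns early after the reset. The `Pipeline` trait, the `String` error type, `runner_run_sketch`, and `RecordingPipeline` are hypothetical and synchronous; the method names `check_integrity`, `reset_db`, `inspect_state`, and `ensure_schema` follow the review text, and any further steps of the real `runner_run` are omitted.

```rust
/// Hypothetical synchronous stand-in for the bootstrap pipeline (the real trait is async).
pub trait Pipeline {
    fn check_integrity(&mut self) -> Result<bool, String>;
    fn reset_db(&mut self) -> Result<(), String>;
    fn inspect_state(&mut self) -> Result<(), String>;
    fn ensure_schema(&mut self) -> Result<(), String>;
}

/// Sketch of the startup flow: verify integrity first, reset and stop on failure.
pub fn runner_run_sketch<P: Pipeline>(pipeline: &mut P) -> Result<(), String> {
    // A query error is folded into "not healthy" rather than propagated.
    let healthy = pipeline.check_integrity().unwrap_or(false);
    if !healthy {
        pipeline.reset_db()?;
        // The database was just recreated, so the later checks are skipped.
        return Ok(());
    }
    pipeline.inspect_state()?;
    pipeline.ensure_schema()?;
    Ok(())
}

/// Test double that records which steps ran, in order.
pub struct RecordingPipeline {
    pub integrity: Result<bool, String>,
    pub calls: Vec<&'static str>,
}

impl RecordingPipeline {
    pub fn new(integrity: Result<bool, String>) -> Self {
        Self {
            integrity,
            calls: Vec::new(),
        }
    }
}

impl Pipeline for RecordingPipeline {
    fn check_integrity(&mut self) -> Result<bool, String> {
        self.calls.push("check_integrity");
        self.integrity.clone()
    }
    fn reset_db(&mut self) -> Result<(), String> {
        self.calls.push("reset_db");
        Ok(())
    }
    fn inspect_state(&mut self) -> Result<(), String> {
        self.calls.push("inspect_state");
        Ok(())
    }
    fn ensure_schema(&mut self) -> Result<(), String> {
        self.calls.push("ensure_schema");
        Ok(())
    }
}
```

Both a corrupted response (`Ok(false)`) and a failed query (`Err`) take the same reset path in this sketch, which is the behaviour the two new mocks in this PR are described as exercising.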

177-186: Clean test helper for healthy integrity simulation.

The with_healthy_integrity builder method follows the existing pattern and correctly configures the mock for healthy database scenarios.

265-302: Existing tests correctly updated for integrity check.

The with_healthy_integrity() helper is properly added to all runner_run tests, ensuring they test the intended scenarios (tables missing, schema mismatch, etc.) rather than triggering the corruption recovery path.

612-718: CorruptedDb mock and test verify the corruption recovery path.

The test correctly validates that when integrity check returns a corruption message:

reset_db is invoked (clear tables, create tables, insert metadata)

Views are recreated

This matches the PR objective of automatic recovery from corruption.

720-797: IntegrityCheckFailsDb mock covers severe corruption scenario.

This test validates the case where the database is so corrupted that even the PRAGMA quick_check query fails to execute. The unwrap_or(false) in runner_run correctly treats this as corruption and triggers the reset path.


github-actions · 2026-01-13T09:42:26Z

@coderabbitai assess this PR size classification for the totality of the PR with the following criteria and report it in your comment:

S/M/L PR Classification Guidelines:

This guide helps classify merged pull requests by effort and complexity rather than just line count. The goal is to assess the difficulty and scope of changes after they have been completed.

Small (S)

Characteristics:

Simple bug fixes, typos, or minor refactoring
Single-purpose changes affecting 1-2 files
Documentation updates
Configuration tweaks
Changes that require minimal context to review

Review Effort: Would have taken 5-10 minutes

Examples:

Fix typo in variable name
Update README with new instructions
Adjust configuration values
Simple one-line bug fixes
Import statement cleanup

Medium (M)

Characteristics:

Feature additions or enhancements
Refactoring that touches multiple files but maintains existing behavior
Breaking changes with backward compatibility
Changes requiring some domain knowledge to review

Review Effort: Would have taken 15-30 minutes

Examples:

Add new feature or component
Refactor common utility functions
Update dependencies with minor breaking changes
Add new component with tests
Performance optimizations
More complex bug fixes

Large (L)

Characteristics:

Major feature implementations
Breaking changes or API redesigns
Complex refactoring across multiple modules
New architectural patterns or significant design changes
Changes requiring deep context and multiple review rounds

Review Effort: Would have taken 45+ minutes

Examples:

Complete new feature with frontend/backend changes
Protocol upgrades or breaking changes
Major architectural refactoring
Framework or technology upgrades

Additional Factors to Consider

When deciding between sizes, also consider:

Test coverage impact: More comprehensive test changes lean toward larger classification
Risk level: Changes to critical systems bump up a size category
Team familiarity: Novel patterns or technologies increase complexity

Notes:

the assessment must be for the totality of the PR, that means comparing the base branch to the last commit of the PR
the assessment output must be exactly one of: S, M or L (single-line comment) in format of: SIZE={S/M/L}
do not include any additional text, only the size classification
your assessment comment must not include tips or additional sections
do NOT tag me or anyone else on your comment

coderabbitai · 2026-01-13T09:42:34Z

The author of this PR is on the CodeRabbit Free Plan. In order to use the Chat feature, please upgrade the PR author to CodeRabbit Pro by visiting https://app.coderabbit.ai/login.

findolor self-assigned this Jan 5, 2026

findolor requested review from 0xgleb and hardyjosh January 5, 2026 11:28

findolor added this to the v5 final prs milestone Jan 6, 2026

0xgleb approved these changes Jan 6, 2026

hardyjosh approved these changes Jan 13, 2026

findolor merged commit 7850c7f into main Jan 13, 2026
30 of 31 checks passed

findolor mentioned this pull request Jan 27, 2026

Fix database corruption by adding wipe_and_recreate support #2418

Merged

4 tasks

findolor merged 1 commit into main from 2025-01-05-local-db-malformed-fix
