fix(server): open SQLite with WAL journaling and a busy timeout #3015

Merged
vpetersson merged 2 commits into master from fix/sqlite-wal-busy-timeout
Jun 7, 2026


@vpetersson


Issues Fixed

Sentry: ANTHIAS-C (database is locked in the viewer's generate_asset_list), ANTHIAS-E (reconcile_stuck_processing), ANTHIAS-G (cleanup).

Description

uvicorn, the celery worker, and the viewer all open the same ~/.anthias/anthias.db from separate containers. With the stock rollback journal and SQLite's default busy timeout of 0, any write transaction makes concurrent readers/writers fail instantly with OperationalError: database is locked — seen fleet-wide on 2026.6.2.
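
The failure mode described above can be reproduced with the standard library alone. The following is a minimal sketch, not code from this PR: the function name, the table `t`, and the temporary database path are hypothetical. It opens two connections with a busy timeout of 0 against a rollback-journal database and lets one hold a write lock while the other tries to write.

```python
import os
import sqlite3
import tempfile


def reproduce_lock_error() -> str:
    """Return the error a second writer sees when the busy timeout is 0."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: Python issues no implicit BEGIN, so the
    # transaction boundaries below are exactly the ones written out.
    writer = sqlite3.connect(path, timeout=0, isolation_level=None)
    other = sqlite3.connect(path, timeout=0, isolation_level=None)
    try:
        writer.execute("CREATE TABLE t (x INTEGER)")
        writer.execute("BEGIN IMMEDIATE")          # take the write lock
        writer.execute("INSERT INTO t VALUES (1)")
        try:
            other.execute("INSERT INTO t VALUES (2)")  # no waiting allowed
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            writer.execute("ROLLBACK")
        return "no error"
    finally:
        writer.close()
        other.close()


print(reproduce_lock_error())
```

With a non-zero `timeout`, the second connection would instead retry until the first one commits or the timeout expires.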

  • timeout=20 — wait for the lock instead of failing on the spot
  • journal_mode=WAL — readers no longer block the writer and vice versa; re-asserted on every connect so restored backups/legacy DBs get upgraded too
  • synchronous=NORMAL — the recommended WAL pairing (WAL commits are crash-safe at NORMAL)
  • transaction_mode=IMMEDIATE — concurrent writers queue on the busy handler instead of deadlocking on the read→write lock upgrade (those fail instantly, ignoring the busy timeout)
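
Taken together, the four options could be expressed roughly as the settings fragment below. This is a sketch under assumptions rather than the merged code: the exact `init_command` string, the `SQLITE_OPTIONS` name, and the way the database path is computed are guesses, and `transaction_mode` and `init_command` are assumed to be available as SQLite `OPTIONS` keys (they were added in Django 5.1).

```python
from pathlib import Path

# Hypothetical reconstruction of the connection options listed above.
SQLITE_OPTIONS = {
    # Passed through to sqlite3.connect(): wait up to 20 s for a lock.
    "timeout": 20,
    # Start every transaction with BEGIN IMMEDIATE instead of a deferred BEGIN.
    "transaction_mode": "IMMEDIATE",
    # Run on every new connection, so older databases are switched to WAL too.
    "init_command": "PRAGMA journal_mode=WAL; PRAGMA synchronous=NORMAL;",
}

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        # Path taken from the description; the real settings may build it differently.
        "NAME": str(Path.home() / ".anthias" / "anthias.db"),
        "OPTIONS": SQLITE_OPTIONS,
    }
}
```

Because `journal_mode=WAL` is persistent in the database file while `synchronous` is a per-connection setting, putting both in the per-connection init command keeps every process consistent without a separate migration step.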

Unit test runs (pytest-django per-worker :memory: DBs) accept the same pragmas harmlessly.
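
The in-memory behaviour is easy to check in isolation. In this small sketch (the helper name `journal_mode_after_wal_request` is made up for illustration), requesting WAL on a `:memory:` database raises no error; SQLite ignores the request and reports that the journal mode is still `memory`.

```python
import sqlite3


def journal_mode_after_wal_request(database: str) -> str:
    """Apply the WAL and synchronous pragmas and return the resulting journal mode."""
    conn = sqlite3.connect(database, timeout=20)
    try:
        # PRAGMA journal_mode returns one row holding the mode now in effect.
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        conn.execute("PRAGMA synchronous=NORMAL")
        return mode
    finally:
        conn.close()


print(journal_mode_after_wal_request(":memory:"))  # prints "memory"
```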

Checklist

  • I have performed a self-review of my own code.
  • New and existing unit tests pass locally and on CI with my changes.
  • I have done an end-to-end test for Raspberry Pi devices.
  • I have tested my changes for x86 devices.
  • I have added documentation for the changes I have made (when necessary).

🤖 Generated with Claude Code

- uvicorn, the celery worker, and the viewer share one SQLite file
  across containers; the stock rollback journal plus a 0s busy
  timeout raised OperationalError: database is locked fleet-wide
  (Sentry ANTHIAS-C, ANTHIAS-E, ANTHIAS-G)
- timeout=20 waits for the lock instead of failing on the spot
- journal_mode=WAL lets readers and the writer coexist;
  synchronous=NORMAL is the recommended WAL pairing
- transaction_mode=IMMEDIATE queues concurrent writers on the busy
  handler instead of deadlocking on the read-to-write lock upgrade
- Add regression tests covering the connection options

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vpetersson vpetersson requested a review from a team as a code owner June 7, 2026 11:05
@vpetersson vpetersson self-assigned this Jun 7, 2026
@vpetersson vpetersson requested a review from Copilot June 7, 2026 11:05

Copilot AI left a comment


Pull request overview

This PR updates Anthias’s Django SQLite connection configuration to reduce OperationalError: database is locked failures when multiple services (uvicorn, celery worker, viewer) concurrently access the same bind-mounted SQLite database file across containers.

Changes:

  • Configure SQLite connections with a 20s busy timeout and WAL journaling (plus synchronous=NORMAL) via connection init pragmas.
  • Force SQLite transactions to start in IMMEDIATE mode to avoid read→write upgrade lock behavior.
  • Add regression tests asserting the shared SQLite settings and validating the init SQL is executable.
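
The read-to-write upgrade problem mentioned in the second bullet can be illustrated with a small standalone sketch. It is not taken from the PR, and it shows only one variant of the problem (the stale-snapshot case in WAL mode); the function and table names are hypothetical. Connection `a` starts a deferred transaction and reads, connection `b` commits a write, and `a` then tries to write. Even with a 5 second busy timeout, the upgrade is rejected without waiting.

```python
import os
import sqlite3
import tempfile
import time


def deferred_upgrade_fails_fast():
    """Return (error message, seconds waited) for a stale read-to-write upgrade."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: no implicit BEGIN, transactions are explicit below.
    a = sqlite3.connect(path, timeout=5, isolation_level=None)
    b = sqlite3.connect(path, timeout=5, isolation_level=None)
    try:
        a.execute("PRAGMA journal_mode=WAL")
        a.execute("CREATE TABLE t (x INTEGER)")
        a.execute("BEGIN DEFERRED")                     # no lock taken yet
        a.execute("SELECT count(*) FROM t").fetchall()  # pins a read snapshot
        b.execute("INSERT INTO t VALUES (1)")           # commits behind a's back
        start = time.monotonic()
        try:
            a.execute("INSERT INTO t VALUES (2)")       # upgrade on a stale snapshot
        except sqlite3.OperationalError as exc:
            return str(exc), time.monotonic() - start
        finally:
            a.execute("ROLLBACK")
        return "no error", time.monotonic() - start
    finally:
        a.close()
        b.close()


message, waited = deferred_upgrade_fails_fast()
print(message, f"(after {waited:.3f}s)")
```

Starting the transaction with `BEGIN IMMEDIATE` takes the write lock up front, so contention shows up at `BEGIN`, where the busy timeout does apply.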

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/anthias_server/django_project/settings.py | Adds SQLite OPTIONS (timeout, WAL/NORMAL pragmas, IMMEDIATE transaction mode) and documents the rationale for multi-container concurrency. |
| tests/test_django_db_settings.py | Adds tests to assert the configured SQLite options and sanity-check the init pragma script against a scratch DB. |
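
A scratch-database sanity check of the kind described for the test file might look like the sketch below. The actual tests are not shown on this page, so `INIT_SQL`, `check_init_sql`, and the test name are assumptions. It runs the init script against a fresh on-disk database and confirms that WAL is in effect afterwards.

```python
import os
import sqlite3
import tempfile

# Assumed init script; the real value lives in the project's settings module.
INIT_SQL = "PRAGMA journal_mode=WAL; PRAGMA synchronous=NORMAL;"


def check_init_sql(init_sql: str) -> str:
    """Run the init script on a scratch file DB and return its journal mode."""
    path = os.path.join(tempfile.mkdtemp(), "scratch.db")
    conn = sqlite3.connect(path)
    try:
        conn.executescript(init_sql)  # raises if any statement is invalid
        return conn.execute("PRAGMA journal_mode").fetchone()[0]
    finally:
        conn.close()


def test_init_sql_enables_wal():
    assert check_init_sql(INIT_SQL) == "wal"
```

A file-backed scratch database is used here because an in-memory one cannot switch to WAL, so it would not exercise the journal mode change.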


Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sonarqubecloud

sonarqubecloud Bot commented Jun 7, 2026


Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@vpetersson vpetersson merged commit daa1d4b into master Jun 7, 2026
10 checks passed
vpetersson added a commit that referenced this pull request Jun 9, 2026
- CalVer (YYYY.0M.MICRO); still June 2026, micro 2 -> 3
- Gives Sentry a real release boundary: every build since 2026.6.2
  reported the same base version (only the +git-hash differed), so
  resolved-in-next-release never stuck and fixed issues kept
  reopening on the next event. A version bump lets the deployed
  fixes actually clear from the board.
- Ships the crash/noise fixes merged since 2026.6.2: SQLite WAL +
  busy timeout (#3015), celery migration-gate (#3016) and
  asset-probe soft limits (#3017), transient-redis/CancelledError
  Sentry filtering + redis healthcheck (#3018/#3028), GitHub
  update-check log level (#3019), webview respawn on D-Bus death at
  setup and mid-play (#3020/#3031), resilient static-file scan
  (#3026), Wayland-socket wait (#3030), and Sentry release/board
  triage tags (#3021/#3025)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>