ovsqlite: Add direct reader mode for nnrpd with WAL #338
ovsqlite: Add direct reader mode for nnrpd with WAL #338kev009 wants to merge 7 commits intoInterNetNews:mainfrom
Conversation
When WAL mode is enabled, nnrpd processes now open the overview database directly with SQLITE_OPEN_READONLY instead of routing reads through ovsqlite-server. This eliminates the IPC round-trip and the single-threaded server bottleneck for read operations, allowing read performance to scale with the number of concurrent readers. Direct reader mode activates automatically when walmode is true in ovsqlite.conf AND the database is actually in WAL mode (verified via pragma journal_mode). Without WAL, behavior is unchanged -- readers connect to ovsqlite-server as before to avoid EXCLUSIVE lock contention. Key changes: - ovsqlite.c: direct_open() opens the database read-only with schema version validation, journal mode verification, decompression support, and goto-based cleanup. Read functions (groupstats, getartinfo, search) branch on direct_reader flag. Write operations return errors in direct reader mode. Falls back to server path gracefully. - ovsqlite-server.c: WAL checkpoint at shutdown uses PASSIVE first (non-blocking, safe with readers connected) then TRUNCATE only if no readers hold pins. Prevents server shutdown from hanging. - ovsqlite.c: Fix pre-existing buffer leak in ovsqlite_close(): request and response buffers from server_connect() were never freed. - sql-read.sql: read-only prepared statements with busy_timeout=10000 and query_only=1. - New readercachesize config parameter (default 8 MB per reader). Includes unit tests (WAL gating, write rejection) and integration tests that start a real ovsqlite-server, write compressed data through it, kill the server, and verify the direct reader can read it back correctly including a server restart cycle.
|
Potential memory leak: you should call sqlite3_close_v2 when |
|
I see a real risk of checkpoint starvation here. It might be necessary to add server code to checkpoint and truncate when the WAL size reaches some threshold, accepting that this will block readers for a while. |
|
Thanks for your review, Bo; it is greatly appreciated. Kevin, it is quite a great improvement. Many thanks for the extensive patch, including documentation and a test suite! I note extra indentation in lines 728-761 of ovsqlite.c starting with |
|
Let me work through the reviews, I'll post follow up commits that can be squashed into one when it is ready for merging. |
When WAL mode is enabled with direct readers, the WAL file can grow without bound if checkpointing never occurs between commits. Add a configurable walcheckpointthreshold (default 1000 pages) that triggers a TRUNCATE checkpoint during the server's idle period. The checkpoint temporarily lowers the busy_timeout to 10 seconds (from ~31 years) so the server waits for readers to finish their current queries but doesn't stall indefinitely. Readers use autocommit with sub-millisecond snapshots, so the checkpoint typically completes in milliseconds. Also fix notice() messages being lost after daemon() by registering message_handlers_notice, remove the unused checkpoint_wal prepared statement from sql-main.sql, and update close_db() comment.
|
@blgl I think checkpoints are addressed now but this is certainly tricky machinery. My understanding is the readers effectively never block with WAL, and use very short auto transactions, and this followup attempts to add bounded waits to the server and make forward progress. |
ovsqlite WAL, mmap, direct readers: InterNetNews/inn#337 InterNetNews/inn#338 bloom filter InterNetNews/inn#339
Move pragma journal_mode out of the sql-main.sql preamble and into open_db() where it is set conditionally: WAL when walmode is true, PERSIST otherwise. Previously the preamble unconditionally set PERSIST mode before open_db() re-set WAL, causing a pointless WAL->PERSIST->WAL transition on every startup (or silently failing if direct readers held the WAL open).
This is the next step after #336, and brings ovsqlite similar to how bdb worked. The WAL PR without this change is a modest benefit so that PR still makes sense standalone but the commit is included here so this is self contained.