fix: Apply SQLite optimizations to the custom connection_string in SqlStorageClient#1837
Conversation
There was a problem hiding this comment.
Pull request overview
Applies SQLite concurrency/performance pragmas (notably journal_mode=WAL) even when SqlStorageClient is configured via a custom SQLite connection_string, rather than only for the default database path.
Changes:
- Replace the “default DB only” gate with a SQLite-specific optimization gate based on the provided
connection_string. - Ensure SQLite pragmas are executed during initialization for SQLite connection strings when
engineis not explicitly provided. - Update
uv.lockmetadata/options for dependency resolution reproducibility.
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
uv.lock |
Adds uv lock options (exclude-newer, span) affecting how the lock was produced/resolved. |
src/crawlee/storage_clients/_sql/_storage_client.py |
Expands SQLite pragma application to custom SQLite connection_string configurations (when engine is not provided). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Flag needed to apply optimizations only for default database | ||
| self._default_flag = self._engine is None and self._connection_string is None | ||
| # Flag needed to apply optimizations only for SQLite database | ||
| self._optimization_flag = self._engine is None and ( |
There was a problem hiding this comment.
Could we instead set up an _on_connect callback here that would receive the new DB connection? That way, the logic (if sqlite, set up some pragmas) would be in one place.
|
|
||
| def _on_connect(self, dbapi_conn: DBAPIConnection, _connection_record: ConnectionPoolEntry) -> None: | ||
| """Event listener for new database connections to set pragmas.""" | ||
| if self._dialect_name == 'sqlite': |
There was a problem hiding this comment.
Can't this be derived from the dbapi_conn? 🙂
There was a problem hiding this comment.
This is a driver-specific connector. While it is possible to check the dialect via isinstance(dbapi_conn, sqlite3.Connection), this approach isn't convenient.
Description
crawleecan have highly concurrent access to the database, SQLite optimizations must be applied, the most critical beingjournal_mode=WAL. If a user wants fine-grained control over connections, they should pass a customengineinstead of a 'connection_string'.Issues