fix: SQLite 'database is locked' under concurrent writes#6474
fix: SQLite 'database is locked' under concurrent writes#6474Soulter merged 1 commit intoAstrBotDevs:masterfrom
Conversation
The async engine is created without a busy timeout, so concurrent
writes (agent responses, metrics, session updates) fail instantly
with 'database is locked' instead of waiting for the lock.
Add connect_args={'timeout': 30} for SQLite engines so the driver
waits up to 30 seconds for the write lock. Combined with the existing
WAL journal mode, this handles the typical concurrent write bursts
from agent + metrics + session operations.
Fixes AstrBotDevs#6443
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves frequent 'database is locked' errors encountered in AstrBot when multiple asynchronous operations attempt to write to SQLite concurrently. The fix introduces a 30-second busy timeout for SQLite database connections, allowing the driver to wait for write locks to be released. This significantly improves the stability and resilience of concurrent database interactions without altering the existing WAL journal mode, ensuring proper backpressure for serialized writes. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- Using
"sqlite" in self.DATABASE_URLto detect SQLite is a bit fragile; consider checking the URL scheme/driver name (e.g., via SQLAlchemy URL parsing or startswith"sqlite:"/"sqlite+") to avoid accidental matches. - The
timeout=30value is currently hardcoded; it might be useful to make this configurable (e.g., via settings/env) so deployments with different concurrency patterns can tune it without code changes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Using `"sqlite" in self.DATABASE_URL` to detect SQLite is a bit fragile; consider checking the URL scheme/driver name (e.g., via SQLAlchemy URL parsing or startswith `"sqlite:"` / `"sqlite+"`) to avoid accidental matches.
- The `timeout=30` value is currently hardcoded; it might be useful to make this configurable (e.g., via settings/env) so deployments with different concurrency patterns can tune it without code changes.
## Individual Comments
### Comment 1
<location path="astrbot/core/db/__init__.py" line_range="41" />
<code_context>
+ # second write is attempted. Setting timeout=30 tells SQLite to
+ # wait up to 30 s for the lock, which is enough to ride out brief
+ # write bursts from concurrent agent/metrics/session operations.
+ is_sqlite = "sqlite" in self.DATABASE_URL
+ connect_args = {"timeout": 30} if is_sqlite else {}
self.engine = create_async_engine(
</code_context>
<issue_to_address>
**suggestion:** Use a stricter check for SQLite URLs instead of a substring search.
`"sqlite" in self.DATABASE_URL` can match non-SQLite URLs (e.g., if `sqlite` appears in credentials or query params). Consider parsing the URL and checking the scheme (e.g. via `sqlalchemy.engine.make_url(...).drivername`) or at least using `self.DATABASE_URL.startswith("sqlite")` so the timeout is only applied for real SQLite connections.
Suggested implementation:
```python
from sqlalchemy.engine import make_url
from sqlalchemy.ext.asyncio import create_async_engine
```
```python
# SQLite only supports a single writer at a time. Without a busy
# timeout the driver raises "database is locked" instantly when a
# second write is attempted. Setting timeout=30 tells SQLite to
# wait up to 30 s for the lock, which is enough to ride out brief
# write bursts from concurrent agent/metrics/session operations.
url = make_url(self.DATABASE_URL)
is_sqlite = url.drivername.startswith("sqlite")
connect_args = {"timeout": 30} if is_sqlite else {}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| # second write is attempted. Setting timeout=30 tells SQLite to | ||
| # wait up to 30 s for the lock, which is enough to ride out brief | ||
| # write bursts from concurrent agent/metrics/session operations. | ||
| is_sqlite = "sqlite" in self.DATABASE_URL |
There was a problem hiding this comment.
suggestion: Use a stricter check for SQLite URLs instead of a substring search.
"sqlite" in self.DATABASE_URL can match non-SQLite URLs (e.g., if sqlite appears in credentials or query params). Consider parsing the URL and checking the scheme (e.g. via sqlalchemy.engine.make_url(...).drivername) or at least using self.DATABASE_URL.startswith("sqlite") so the timeout is only applied for real SQLite connections.
Suggested implementation:
from sqlalchemy.engine import make_url
from sqlalchemy.ext.asyncio import create_async_engine # SQLite only supports a single writer at a time. Without a busy
# timeout the driver raises "database is locked" instantly when a
# second write is attempted. Setting timeout=30 tells SQLite to
# wait up to 30 s for the lock, which is enough to ride out brief
# write bursts from concurrent agent/metrics/session operations.
url = make_url(self.DATABASE_URL)
is_sqlite = url.drivername.startswith("sqlite")
connect_args = {"timeout": 30} if is_sqlite else {}There was a problem hiding this comment.
Code Review
This pull request correctly addresses the database is locked error with SQLite by introducing a busy timeout. The implementation is clear and directly solves the problem of concurrent writes. I've provided one suggestion to make the logic for detecting an SQLite connection more robust, which will prevent potential issues if other database URLs happen to contain 'sqlite' in their string.
| is_sqlite = "sqlite" in self.DATABASE_URL | ||
| connect_args = {"timeout": 30} if is_sqlite else {} |
There was a problem hiding this comment.
For improved robustness and conciseness, these lines can be combined. Using startswith("sqlite") is more reliable than checking for containment with in. This change prevents misidentifying a non-SQLite database that might have "sqlite" in its name or path, which could lead to errors if that database driver does not support the timeout connection argument.
| is_sqlite = "sqlite" in self.DATABASE_URL | |
| connect_args = {"timeout": 30} if is_sqlite else {} | |
| connect_args = {"timeout": 30} if self.DATABASE_URL.startswith("sqlite") else {} |
…6474) The async engine is created without a busy timeout, so concurrent writes (agent responses, metrics, session updates) fail instantly with 'database is locked' instead of waiting for the lock. Add connect_args={'timeout': 30} for SQLite engines so the driver waits up to 30 seconds for the write lock. Combined with the existing WAL journal mode, this handles the typical concurrent write bursts from agent + metrics + session operations. Fixes AstrBotDevs#6443
Problem
AstrBot frequently hits
(sqlite3.OperationalError) database is lockedwhen multiple async operations (agent responses, metrics inserts, session updates) write to SQLite concurrently. This affects both Windows and Linux deployments and is especially noticeable after importing personas or during rapid conversations.Root Cause
create_async_engine()inBaseDatabase.__init__doesn't set a busy timeout:SQLite only supports one writer at a time. Without a timeout, the second concurrent write fails immediately instead of waiting for the lock to be released.
Fix
Add
connect_args={"timeout": 30}for SQLite engines. This tells the driver to wait up to 30 seconds for the write lock before raising an error — more than enough to handle the typical concurrent write bursts.Combined with the existing WAL journal mode (set in
SQLiteDatabase.initialize()), this provides concurrent reads + serialized writes with proper backpressure.Fixes #6443
Summary by Sourcery
Bug Fixes:
sqlite3.OperationalError: database is lockederrors during concurrent SQLite writes by adding a busy timeout to async engine creation.