Database Layer Overhaul including Thread Safety, Connection Management & Configuration #32
Merged
MatrixEditor merged 4 commits into MatrixEditor:master on Mar 21, 2026
Thread safety:
- Use scoped_session as a thread-local registry (property, not shared instance) so each handler thread gets its own Session and connection
- Add _release() via _scoped_session.remove() in try/finally on every public method to return connections to the pool immediately
- Atomic duplicate check + INSERT inside one lock scope to prevent TOCTOU race condition
- Rollback on every error path to prevent PendingRollbackError cascade
- Catch PoolTimeoutError gracefully (warning + skip, hash in file stream)
- Display logging gated on successful DB write

Code quality:
- Extract _check_duplicate() and _log_credential() from 170-line add_auth()
- Add _handle_db_error() shared helper for duplicated try/except
- Fix add_host_extra() lock: contextlib.nullcontext replaces manual acquire/release with bool flag
- Fix add_host_extra() connection leak on standalone calls (missing _release)
- Fix add_host() unconditional commit (only when values change)
- Fix missing commit() in add_host_extra() update branch
- Fix db_path for :memory: (was str(None))
- expire_on_commit=False so ORM objects survive _release()

Engine configuration (per SQLAlchemy 2.0 docs):
- :memory: -> StaticPool, SQLite file -> QueuePool, MySQL -> QueuePool
- skip_autocommit_rollback + pool_reset_on_return=None
- pool_use_lifo for natural idle-connection expiry
- Drop deprecated future=True, dead init_dementor_db(), dead config fields
- Fix :memory: URL (was missing third slash)
- Rename db_raw_path -> db_url

Constants:
- _CLEARTEXT/_NO_USER/_HOST_INFO -> CLEARTEXT/NO_USER/HOST_INFO with backward-compatible aliases
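The per-backend pool choice listed under "Engine configuration" could look roughly like the sketch below. It is not the code from this PR: engine_options() is a hypothetical helper, and the 5-second pool_timeout is an assumed stand-in for the "fast-fail timeout". skip_autocommit_rollback and pool_reset_on_return are left out to keep the example small.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.engine import make_url
from sqlalchemy.pool import QueuePool, StaticPool


def engine_options(db_url: str) -> dict:
    """Return create_engine() keyword arguments for a URL (hypothetical helper)."""
    url = make_url(db_url)
    if url.get_backend_name() == "sqlite":
        if url.database in (None, "", ":memory:"):
            # An in-memory database exists only inside one connection, so all
            # threads must share that single connection.
            return {
                "poolclass": StaticPool,
                "connect_args": {"check_same_thread": False},
            }
        # File-based SQLite: an ordinary queue pool is enough.
        return {"poolclass": QueuePool}
    # Server backends such as MySQL. The timeout value is an assumption.
    return {
        "poolclass": QueuePool,
        "pool_use_lifo": True,   # reuse the most recent connection, let idle ones expire
        "pool_pre_ping": True,   # test a connection before handing it out
        "pool_timeout": 5,       # give up quickly instead of blocking a handler
    }


engine = create_engine("sqlite:///:memory:", **engine_options("sqlite:///:memory:"))
with engine.connect() as conn:
    print(conn.execute(text("SELECT 1")).scalar_one())
```

Returning keyword arguments instead of an engine keeps the decision testable without installing a MySQL driver, since make_url() only parses the URL.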
test_db.py (121 tests across 20 classes):
- __init__.py: constants, aliases, __all__ exports, normalize_client_address
- connector.py: DatabaseConfig defaults/loading, init_engine for all backends, create_db success and failure paths
- model.py: DementorDB init/lifecycle/close, session thread-locality, _release isolation, add_host, add_host_extra, add_auth, protocol resolution, duplicate detection (10 variations), extras handling, logging output, _check_duplicate, error handling, connection release, thread safety

test_db_concurrency.py (2 stress tests):
- 20-thread concurrent add_auth on SQLite memory and file

All tests use SQLite -- no external database required.
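As a rough illustration of what a 20-thread stress test exercises, the sketch below uses only the standard library. CredentialStore and run_concurrently() are hypothetical stand-ins, not DementorDB or the actual tests. It shows the two ideas from the first commit that such a test targets: the duplicate check and the insert share one lock scope, and contextlib.nullcontext() replaces a manual acquire()/release() pair when the caller asks for no locking.

```python
import threading
from contextlib import nullcontext


class CredentialStore:
    """Toy stand-in for the database layer (hypothetical, not DementorDB)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = set()

    def add_auth(self, user, digest, *, no_lock=False):
        """Store a credential once; return True only if a row was inserted."""
        # nullcontext() gives callers that already hold the lock the same code
        # path without a manual acquire()/release() pair.
        guard = nullcontext() if no_lock else self._lock
        with guard:
            # Check and insert happen under one lock, so no other thread can
            # slip in between them.
            if (user, digest) in self._rows:
                return False
            self._rows.add((user, digest))
            return True

    def count(self):
        with self._lock:
            return len(self._rows)


def run_concurrently(store, users):
    """Call add_auth() once per user, releasing all threads at the same moment."""
    barrier = threading.Barrier(len(users))
    inserted = []

    def worker(user):
        barrier.wait()  # maximise contention on the check-then-insert step
        inserted.append(store.add_auth(user, "aad3b435"))

    threads = [threading.Thread(target=worker, args=(u,)) for u in users]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return inserted


store = CredentialStore()
outcomes = run_concurrently(store, ["alice"] * 20)
print(outcomes.count(True), store.count())
```

Twenty threads submit the same credential, and exactly one insert succeeds. Without the shared lock scope, two threads could both pass the duplicate check before either one inserts.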
Three options, each with its own section header and examples:
- Url: advanced, for MySQL/MariaDB/PostgreSQL (makes Path ignored)
- Path: default SQLite backend (relative, absolute, or :memory:)
- DuplicateCreds: store all hashes or deduplicate

Removed stale commented-out Dialect/Driver fields.
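The precedence between Url and Path can be sketched as below. This is a minimal illustration, assuming the [DB] table has already been parsed into a dict. resolve_db_url() and the fallback file name are hypothetical, and DuplicateCreds is not modelled here.

```python
def resolve_db_url(db_section: dict) -> str:
    """Turn a parsed [DB] table into a SQLAlchemy URL (hypothetical helper)."""
    url = db_section.get("Url")
    if url:
        # Url takes precedence: when it is set, Path is ignored.
        return url
    # The fallback file name is an assumption made for this sketch.
    path = db_section.get("Path", "Dementor.db")
    # SQLite URLs need three slashes before a relative path or :memory:,
    # which yields four slashes in total for an absolute path.
    return f"sqlite:///{path}"


print(resolve_db_url({"Path": ":memory:"}))
print(resolve_db_url({"Path": "/tmp/custom.db"}))
```

The three-slash prefix is the detail the :memory: URL fix in the first commit is about.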
- Add "Choosing a backend" section with comparison table
- Document all 3 active options (Url, Path, DuplicateCreds) with py:attribute directives, examples, bullet points, tips, and notes
- Mark Dialect and Driver as removed with versionremoved directives
- Sphinx HTML build passes with -W --keep-going (zero warnings)
MatrixEditor (Owner) approved these changes on Mar 21, 2026 and left a comment:
LGTM! I will be reviewing the DB tests more thoroughly, but the overall concept is a good starting point.
Hey! While working on the SMB server and testing against a bunch of Windows clients at once, I ran into a database crash that turned out to be a threading issue in the DB layer. Fixing that led me down a rabbit hole of related improvements, so I ended up doing a thorough pass through all three files in dementor/db/. This PR brings the DB code in line with SQLAlchemy 2.0 best practices, adds a full test suite, and cleans up the configuration docs for users.

Nothing here changes the external API that protocol handlers use: add_auth(), add_host(), and the session.db.session property all work exactly as before. The changes are all internal.

The bug that started this
When multiple protocol handlers (SMB, HTTP, LDAP, etc.) were capturing hashes at the same time, the database would crash with pymysql.err.InternalError: Packet sequence number wrong, followed by an endless cascade of PendingRollbackError on every thread. Once it started, no more hashes could be stored for the rest of the session.

The root cause was in how scoped_session was being used. The code called scoped_session(factory)() at startup; that trailing () immediately created one Session on the main thread and stored it as a shared attribute. Every handler thread then used that same Session (and the same underlying database connection). With one client at a time that works fine, but under concurrent load the threads corrupt each other's connection state.

What's in this PR
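Before the per-commit breakdown, the root cause described above can be reproduced in a few lines. This is a sketch, not the project's code: sessions_seen_by_threads() is a hypothetical helper, and no engine is bound because creating Session objects does not require a database.

```python
import threading

from sqlalchemy.orm import scoped_session, sessionmaker

# No engine is bound: Session objects can be created without a database,
# which is enough to show which threads share which object.
factory = sessionmaker(expire_on_commit=False)


def sessions_seen_by_threads(get_session, threads=4):
    """Collect the Session object that each worker thread receives."""
    seen = []

    def worker():
        seen.append(get_session())

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return seen


# Old pattern: the trailing () builds one Session up front, and every thread
# ends up with that same object.
shared = scoped_session(factory)()
old = sessions_seen_by_threads(lambda: shared)

# New pattern: keep the registry and ask it for a Session inside each thread.
registry = scoped_session(factory)
new = sessions_seen_by_threads(registry)

print(len({id(s) for s in old}), len({id(s) for s in new}))
```

The first count is 1 (all threads share one Session) and the second is 4 (one Session per thread). In a long-running handler, the registry call would be paired with registry.remove() in a finally block so the connection goes back to the pool.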
Commit 1: Code (dementor/db/)

Thread safety:
- scoped_session is now stored as a registry, not a single instance. A session property returns each thread's own Session on demand, with no shared state between threads.
- Every public method wraps its work in try/finally and calls _scoped_session.remove() when done. This returns connections to the pool immediately instead of leaking them for the thread's lifetime.

Code quality:
- Split the 170-line add_auth() into _check_duplicate(), _log_credential(), and a focused add_auth() that orchestrates the two.
- Replaced the manual lock.acquire()/lock.release() with no_lock boolean in add_host_extra() with a contextlib.nullcontext() pattern.
- Fixed add_host() committing on every call even when nothing changed.
- Fixed a missing commit() in add_host_extra() when updating an existing key.
- Fixed the :memory: URL construction: it was missing the third slash, so SQLAlchemy parsed memory: as a hostname.

Engine configuration (per SQLAlchemy 2.0 docs):
- :memory: SQLite uses StaticPool (DB only exists inside one connection).
- File-based SQLite uses QueuePool (the new 2.0 default).
- MySQL uses QueuePool with pool_use_lifo, pool_pre_ping, and a fast-fail timeout.
- Added skip_autocommit_rollback and pool_reset_on_return=None to eliminate useless rollbacks under AUTOCOMMIT.
- Dropped the deprecated future=True parameter.
- Removed the dead init_dementor_db() function and dead db_dialect/db_driver config fields.
- Renamed db_raw_path to db_url to match the TOML key.

Constants:
- Renamed _CLEARTEXT/_NO_USER/_HOST_INFO to CLEARTEXT/NO_USER/HOST_INFO with backward-compatible aliases so nothing breaks.

Commit 2: Tests
124 tests total, all using SQLite, so no external database is needed for CI. The tests live in three files:
- test_db.py
- test_db_concurrency.py
- test_version.py

Commit 3: Config (Dementor.toml)

Rewrote the [DB] section so each option (Url, Path, DuplicateCreds) stands on its own with clear documentation.

Commit 4: Docs (database.rst)

Rewrote the Sphinx documentation to match the current code:
- Added a "Choosing a backend" section with a comparison table
- All 3 active options are documented with py:attribute, type/default, examples, and tips
- Dialect and Driver are marked as removed with versionremoved directives
- The Sphinx HTML build passes with -W --keep-going (zero warnings)

Live testing
Tested all database backends with multiple simultaneous Windows clients, using these configurations:
- Path = "/tmp/custom.db"
- Path = ":memory:"
- Url = "mysql+pymysql://..."
- Url = "mysql+pymysql://..."

Happy to adjust anything. Let me know what you think!