fix(memory): roll back pooled MySQL connections on release (follow-up to #296) by fede-kamel · Pull Request #303 · oracle-samples/locus

fede-kamel · 2026-06-03T16:46:23Z

Summary

Follow-up to #296. Fixes a connection-pool transaction-management bug in the MySQL checkpointer that causes an idle-in-transaction → metadata-lock deadlock (and latent stale reads), and fixes the cold-start adapter integration test that was hiding it.

The bug

_MySQLConnectionPool returned connections to the pool without ending their transaction. With the default autocommit=False, every read (load, exists, list_threads, count, get_metadata, query_by_metadata, search_data) — and the SELECT 1 health check added in #296 — opens a transaction that holds a SHARED_READ metadata lock on the tables it touches. A connection pooled in that state sits idle in transaction, which means:

A later DROP/ALTER blocks indefinitely behind the pending EXCLUSIVE MDL.
The next borrower reuses the pinned REPEATABLE READ snapshot and can read stale data.

How it was found

The new test_mysql_adapter_concurrent_same_thread_cold_start from #296 never actually ran the pool: it mutated a frozen AgentState (state.agent_id = ...), raising ValidationError before touching MySQL — so CI (unit-only) stayed green and the integration suite errored out early. Once that test is corrected to use the immutable model_copy(update=...) pattern, it reaches the pool and hangs.

Reproduced against MySQL 9.6:

performance_schema.metadata_locks
+----------------+-------------+-------------+
| OBJECT_NAME    | LOCK_TYPE   | LOCK_STATUS |
+----------------+-------------+-------------+
| t_xxxxxxxx     | SHARED_READ | GRANTED     |   <- idle pooled conns (Sleep, trx RUNNING)
| t_xxxxxxxx     | SHARED_READ | GRANTED     |
| ...            | ...         | ...         |
| t_xxxxxxxx     | EXCLUSIVE   | PENDING     |   <- DROP TABLE, blocked ~forever
+----------------+-------------+-------------+

The fix

Roll back any open transaction when a connection is returned to the pool (_release), so every pooled connection is handed out clean; discard it if the rollback fails. This is standard pool hygiene and fixes both the MDL deadlock and the stale-read window regardless of the autocommit setting.
Factor the connector error tuple into a shared _connection_errors() helper used by both _is_healthy and _release (a dead-connection rollback() raises mysql.connector.Error, which the previous narrow except would have let escape the finally).
Fix the cold-start adapter test to use model_copy(update=...) instead of mutating a frozen model.

Validation

MySQL integration suite (LOCUS_MYSQL_INTEGRATION=1) against MySQL 9.6: 11 passed, stable across repeated runs (previously: 1 hang + 1 error).
Standalone stress repro of the previously-hanging scenario: 5/5 runs complete.
Full unit suite: pass, total coverage 93.2% (≥90% gate); added 2 regression tests for _release (rollback-on-return, discard-on-rollback-failure).
ruff, ruff format --check, and hatch run typecheck all clean.

The MySQL connection pool returned connections to the pool without ending their transaction. Under the default autocommit=False, every read (load/exists/list_threads/count/get_metadata/query_by_metadata/ search_data) and the SELECT 1 health check opens a transaction that holds a SHARED_READ metadata lock on the tables it touched. A pooled connection left in that state sits idle-in-transaction: - a later DROP/ALTER blocks indefinitely on the pending EXCLUSIVE MDL (proved against MySQL 9.6: DROP TABLE hung behind GRANTED SHARED_READ locks held by sleeping locus connections), and - the next borrower reuses the pinned REPEATABLE READ snapshot and can read stale data. Roll back on return so every pooled connection is handed out clean, and discard it if the rollback fails. Factor the connector error tuple into a shared _connection_errors() helper used by both _is_healthy and the new _release. Also fix the cold-start adapter integration test, which mutated a frozen AgentState (state.agent_id = ...) and raised before exercising the pool; use the immutable model_copy(update=...) pattern. This is the test that surfaced the hang once it could actually run. Follow-up to #296. Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>

luigisaetta · 2026-06-04T03:46:02Z

Nice fix. Rolling back on pool release is the right boundary: it protects reads, health checks, and any future code path that borrows a connection without having to remember transaction cleanup locally.

luigisaetta

ok approved, great catch!

oracle-contributor-agreement Bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Jun 3, 2026

fede-kamel force-pushed the fix/mysql-pool-idle-in-transaction branch from 59c6f4c to 5578b88 Compare June 3, 2026 16:47

fede-kamel mentioned this pull request Jun 3, 2026

Feat/mysql checkpointer #296

Merged

fede-kamel requested a review from luigisaetta June 3, 2026 16:53

This was referenced Jun 4, 2026

fix(memory): atomic cross-process MySQL checkpoint index (resolves #301) #304

Open

cross-process: checkpoint-index updates aren't atomic across processes (follow-up to #300) #301

Open

luigisaetta approved these changes Jun 4, 2026

View reviewed changes

luigisaetta merged commit 770b6eb into main Jun 4, 2026
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(memory): roll back pooled MySQL connections on release (follow-up to #296)#303

fix(memory): roll back pooled MySQL connections on release (follow-up to #296)#303
luigisaetta merged 1 commit into
mainfrom
fix/mysql-pool-idle-in-transaction

fede-kamel commented Jun 3, 2026

Uh oh!

luigisaetta commented Jun 4, 2026

Uh oh!

luigisaetta left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

fede-kamel commented Jun 3, 2026

Summary

The bug

How it was found

The fix

Validation

Uh oh!

luigisaetta commented Jun 4, 2026

Uh oh!

luigisaetta left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants