
Fix Notifications.close() deadlock #13

Merged
merlimat merged 3 commits into oxia-db:main from merlimat:fix/bug-6-close-deadlock
Apr 16, 2026


@merlimat
Contributor

close() joined worker threads while holding self._lock. Workers acquire the same lock to update _last_notification after each batch, so if close() catches a worker mid-update, close() waits for the worker to exit while the worker waits for the lock, and neither can proceed.

Fix: cancel streams under the lock, then release before joining.

Includes regression test with a high-throughput fake stream that reproduces the deadlock.
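The ordering described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `NotificationsSketch`, the `threading.Event` standing in for stream cancellation, and the `join_timeout` parameter are all assumptions made for the example. Only `self._lock`, `_last_notification`, and `close()` come from the description.

```python
import threading
import time


class NotificationsSketch:
    """Hypothetical stand-in; not the real oxia Notifications class."""

    def __init__(self, stream_count=2):
        self._lock = threading.Lock()
        self._last_notification = 0
        # Assumption: an Event stands in for cancelling the real streams.
        self._cancelled = threading.Event()
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(stream_count)
        ]
        for t in self._threads:
            t.start()

    def _worker(self):
        # Each simulated batch takes the lock to record progress.
        while not self._cancelled.is_set():
            with self._lock:
                self._last_notification += 1
            time.sleep(0.001)  # simulated wait for the next batch

    def close(self, join_timeout=5.0):
        # Step 1: under the lock, cancel and take a snapshot of the threads.
        with self._lock:
            self._cancelled.set()
            threads = list(self._threads)
            self._threads.clear()
        # Step 2: join with the lock released, so a worker that is waiting
        # on self._lock can finish its update and exit.
        for t in threads:
            t.join(timeout=join_timeout)
        return all(not t.is_alive() for t in threads)


n = NotificationsSketch()
closed_cleanly = n.close()
```

Moving the `join()` calls back inside the `with self._lock:` block would restore the problematic ordering: a worker blocked on the lock could never exit, and `close()` would never return.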

close() was joining worker threads while holding self._lock. Workers
acquire the same lock to update _last_notification after each batch,
so if close() catches a worker mid-update both sides wait on each
other indefinitely.

Fix: cancel streams under the lock (so no new batches arrive), then
release the lock before joining.

Includes a regression test that reproduces the deadlock by having the
worker cycle through the lock hundreds of times during close().
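One way such a regression test could be structured is sketched below. It is an illustration under assumptions, not the test added in this PR: `FakeStream`, `NotificationsUnderTest`, the `join_under_lock` switch, and `close_finishes` are hypothetical names. The fake stream keeps delivering batches after `cancel()` so the worker is guaranteed to need the lock while `close()` is running, and `close()` runs in a helper thread so a hang shows up as a timeout rather than a stuck test run.

```python
import threading


class FakeStream:
    """Hypothetical stream that still yields buffered batches after cancel()."""

    def __init__(self, drain_batches=500):
        self._cancelled = threading.Event()
        self._drain_batches = drain_batches

    def cancel(self):
        self._cancelled.set()

    def __iter__(self):
        batch = 0
        # Roughly one batch per millisecond until cancelled.
        while not self._cancelled.wait(timeout=0.001):
            batch += 1
            yield batch
        # After cancel, deliver the batches that were already in flight.
        for _ in range(self._drain_batches):
            batch += 1
            yield batch


class NotificationsUnderTest:
    """Hypothetical model of the class; join_under_lock selects the ordering."""

    def __init__(self, stream, join_under_lock):
        self._lock = threading.Lock()
        self._last_notification = 0
        self._stream = stream
        self._join_under_lock = join_under_lock
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        for batch in self._stream:
            with self._lock:
                self._last_notification = batch

    def close(self):
        if self._join_under_lock:
            # Old ordering: join while still holding the lock.
            with self._lock:
                self._stream.cancel()
                self._thread.join()
        else:
            # Fixed ordering: cancel under the lock, join after releasing it.
            with self._lock:
                self._stream.cancel()
            self._thread.join()


def close_finishes(join_under_lock, timeout=1.0):
    """Run close() in a helper thread and report whether it returned in time."""
    n = NotificationsUnderTest(FakeStream(), join_under_lock)
    closer = threading.Thread(target=n.close, daemon=True)
    closer.start()
    closer.join(timeout)
    return not closer.is_alive()


fixed_ok = close_finishes(join_under_lock=False)
buggy_ok = close_finishes(join_under_lock=True)
```

All threads are daemon threads so that the deliberately deadlocked variant does not keep the interpreter alive once the check has timed out.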

Signed-off-by: Matteo Merli <mmerli@apache.org>
Signed-off-by: Matteo Merli <mmerli@apache.org>

# Conflicts:
#	src/oxia/internal/notifications.py
The reconnect test belongs in PR oxia-db#14 (bug oxia-db#1) which rewrites the
Notifications class. This PR only fixes the close() deadlock.

Signed-off-by: Matteo Merli <mmerli@apache.org>
@merlimat merlimat merged commit f9e12f3 into oxia-db:main Apr 16, 2026
6 checks passed
@merlimat merlimat deleted the fix/bug-6-close-deadlock branch April 16, 2026 02:28