Add keepalives and safe recreation to session management #7
Merged
merlimat merged 2 commits into oxia-db:main on Apr 15, 2026
Send periodic KeepAlive RPCs from a background thread so that ephemeral state tied to a client session does not get garbage-collected by the server, and teach SessionManager to transparently recreate sessions that have been closed (locally expired or explicitly torn down).

The heartbeat cadence defaults to roughly one tenth of the session timeout, floored at 2s and capped strictly below the session timeout so that short-timeout configurations still get at least one heartbeat before the local expiry check fires. The cadence is also injectable for tests.

SessionManager.on_session_closed now verifies the tracked session is still the one that closed before evicting, so that a late callback from an already-replaced session cannot drop a live replacement and leak its heartbeat thread.

A new tests/sessions_test.py drives the heartbeat loop deterministically via fake stubs and a short injected interval, so the whole suite runs in well under a second.

Signed-off-by: Matteo Merli <mmerli@apache.org>
00b7ac0 to 99c88c6
The previous `session_timeout_ms < 2` guard rejected only the degenerate case. The real constraint is that the default heartbeat cadence has a 2000ms floor, so a timeout below that cannot fit a single default heartbeat before the local expiry check fires.

Now `_resolve_heartbeat_interval_ms` enforces `session_timeout_ms >= _DEFAULT_HEARTBEAT_FLOOR_MS` only when the caller did not supply an explicit `heartbeat_interval_ms`. Callers that pass a custom interval (including the test suite) can still configure tight timing deterministically.

Signed-off-by: Matteo Merli <mmerli@apache.org>
merlimat added a commit to merlimat/oxia-client-python that referenced this pull request Apr 16, 2026

…db#7, oxia-db#9)

Bug oxia-db#7: put() used `type(value) is str` (rejects str subclasses) and `bytes(str(value), ...)` (redundant str()). Non-str/non-bytes values were forwarded to protobuf, erroring deep in serialization. Extracted _coerce_value(): isinstance checks, clear TypeError for invalid types, str.encode() instead of redundant bytes(str(...)).

Bug oxia-db#9: `raise oxia.ex.KeyNotFound` (class) was inconsistent with the rest of the file which uses `raise KeyNotFound()` (instance). Both are legal Python but the instance form is conventional.

Bug oxia-db#10 (non-atomic assignment swap) was a false positive: the existing _parse_assignments already holds self._lock during both the clear and repopulate. Dropped from the fix list.

Signed-off-by: Matteo Merli <mmerli@apache.org>
merlimat added a commit that referenced this pull request Apr 16, 2026

… (#19)

Bug #7: put() used `type(value) is str` (rejects str subclasses) and `bytes(str(value), ...)` (redundant str()). Non-str/non-bytes values were forwarded to protobuf, erroring deep in serialization. Extracted _coerce_value(): isinstance checks, clear TypeError for invalid types, str.encode() instead of redundant bytes(str(...)).

Bug #9: `raise oxia.ex.KeyNotFound` (class) was inconsistent with the rest of the file which uses `raise KeyNotFound()` (instance). Both are legal Python but the instance form is conventional.

Bug #10 (non-atomic assignment swap) was a false positive: the existing _parse_assignments already holds self._lock during both the clear and repopulate. Dropped from the fix list.

Signed-off-by: Matteo Merli <mmerli@apache.org>
Summary
- Send periodic `KeepAlive` RPCs from a background thread so ephemeral state tied to a client session stays alive on the server.
- Teach `SessionManager` to transparently recreate sessions that have been closed, while making `on_session_closed` race-safe so a late callback from an already-replaced session cannot drop a live replacement.
- Add `tests/sessions_test.py` with deterministic fake stubs; the whole new suite runs in well under a second.

This is an alternative to #6 that addresses the review feedback from both Copilot and Codex.
Heartbeat cadence
`_resolve_heartbeat_interval_ms()` picks a cadence of roughly `session_timeout_ms / 10`, floored at 2s, and, crucially, capped strictly below `session_timeout_ms`. The cap is the fix for the small-timeout foot-gun flagged on #6: with `max(session_timeout_ms // 10, 2_000)` alone, any `session_timeout_ms < 2000` makes the loop sleep past the timeout, so the local-expiry check fires before a single heartbeat is ever sent. The cadence is also injectable for tests, which is how `sessions_test.py` stays fast and deterministic without sleeping for wall-clock seconds.

Race fix in `on_session_closed`
`Session.close()` now notifies the manager before running the potentially slow `close_session` RPC, so replacement sessions created during the RPC don't race with the callback. And `on_session_closed` only evicts when the tracked entry is still the same object.

Without this, a late callback from an old session could evict a newer replacement that `get_session()` had installed during the close-RPC window, leaving a live session with a running heartbeat thread completely unmanaged.

Test plan
- `pytest tests/sessions_test.py`: 12 passed in ~0.35s, run 5x with no flakes
- `pytest tests/compare_test.py tests/sessions_test.py`: 13 passed
- `python -c "import oxia"`: no import regressions
- … (`client_test.py`) against a real Oxia server: requires Docker locally, please run on CI
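For reference, the identity check described under "Race fix in `on_session_closed`" can be sketched as below. The names `SessionManager`, `Session`, `get_session` and `on_session_closed` come from the description; the per-shard dictionary, the `shard_id` key, the `_lock` and `_sessions` attributes, and the stripped-down `Session` class are hypothetical stand-ins, and no RPCs or heartbeat threads are modeled.

```python
import threading


class Session:
    """Stand-in for the real session object (hypothetical shape)."""

    def __init__(self, shard_id: int) -> None:
        self.shard_id = shard_id


class SessionManager:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._sessions: dict[int, Session] = {}  # assumed keying by shard

    def get_session(self, shard_id: int) -> Session:
        # Return the tracked session, creating a replacement if the
        # previous one was evicted.
        with self._lock:
            session = self._sessions.get(shard_id)
            if session is None:
                session = Session(shard_id)
                self._sessions[shard_id] = session
            return session

    def on_session_closed(self, session: Session) -> None:
        # Evict only if the tracked entry is still this exact object.
        # A late callback from an already-replaced session is a no-op.
        with self._lock:
            if self._sessions.get(session.shard_id) is session:
                del self._sessions[session.shard_id]
```

In this sketch, closing an old session a second time after `get_session()` has installed a replacement leaves the replacement tracked, which is the property the race fix is meant to guarantee.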