Add SSE watchdog and improve connection error handling #20
Summary
This PR addresses issues where customers were reporting stale configuration data. Investigation revealed several failure modes where the SSE streaming connection could silently fail or never start, leaving clients stuck with outdated configs.
Changes
SSE Watchdog: New monitoring thread that detects stuck SSE connections by tracking keepalive activity. If no data is received for 120 seconds (4 missed 30s keepalives), it triggers recovery by polling for fresh data and forcing SSE reconnection.
Fixed 401/403 handling: The previous code caught UnauthorizedException, which is never raised by raise_for_status(). Now properly catches HTTPError and inspects response.status_code for 401/403.
Fixed silent loop exits: Changed except Exception to except BaseException and added finally block logging to detect when the streaming loop exits unexpectedly.
Fixed streaming startup on checkpoint failure: If checkpoint loading fails (CDN down, unexpected exception), streaming now starts as a fallback so SSE can potentially load configs. Preserves timeout behavior for get() calls.
Dev runner script: Added dev_runner.py for observing SDK behavior during development.
Files Changed
sdk_reforge/_sse_watchdog.py
sdk_reforge/_sse_connection_manager.py
sdk_reforge/config_sdk.py
tests/test_sse_watchdog.py
tests/test_sse_connection_manager.py
tests/test_config_sdk.py
dev_runner.py
Test plan
handle_unauthorized_response is called
dev_runner.py to observe SSE connection and watchdog behavior
🤖 Generated with Claude Code
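Illustrative sketch of the watchdog idea described under Changes. This is not the code in sdk_reforge/_sse_watchdog.py; the class name SSEWatchdog, the on_stall callback, the touch() and check_once() methods, and the poll_interval and clock parameters are all hypothetical names chosen for this example. It assumes the streaming loop reports each received event or keepalive, and that recovery (poll for fresh data, force a reconnect) can be expressed as a single callback.

```python
import logging
import threading
import time
from typing import Callable, Optional

logger = logging.getLogger(__name__)


class SSEWatchdog:
    """Hypothetical watchdog: runs on_stall when no SSE activity is seen for `timeout` seconds."""

    def __init__(
        self,
        on_stall: Callable[[], None],
        timeout: float = 120.0,        # 4 missed 30s keepalives, as in the PR description
        poll_interval: float = 5.0,    # assumed check frequency, not taken from the PR
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._on_stall = on_stall
        self._timeout = timeout
        self._poll_interval = poll_interval
        self._clock = clock
        self._lock = threading.Lock()
        self._last_activity = clock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def touch(self) -> None:
        """Record activity; the streaming loop would call this for every event or keepalive."""
        with self._lock:
            self._last_activity = self._clock()

    def check_once(self) -> bool:
        """Run a single check. Returns True if a stall was detected and on_stall was invoked."""
        with self._lock:
            stalled = (self._clock() - self._last_activity) >= self._timeout
            if stalled:
                # Reset the timer so recovery is not re-triggered on every poll.
                self._last_activity = self._clock()
        if stalled:
            self._on_stall()
        return stalled

    def start(self) -> None:
        self._thread = threading.Thread(target=self._run, name="sse-watchdog", daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() has been called.
        while not self._stop.wait(self._poll_interval):
            try:
                self.check_once()
            except Exception:
                # Keep the watchdog alive if the recovery callback fails, but log it.
                logger.exception("SSE watchdog recovery callback failed")
```

Injecting the clock keeps the timeout logic testable without sleeping: a test can advance a fake clock past 120 seconds and call check_once() directly, while production use would call start() and let the background thread poll.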