Fix connection lifecycle: disconnect propagation, connect dedup, client cleanup#74
Merged
Merged
Conversation
…pe connect, close clients Three connection-ownership issues: - Disconnect/close hung in-flight commands (ISSUE-3). When the WebSocket dropped or close() was called, the receive loop just ended; any callCommand parked on its response deferred waited forever. The loop now fails all pending requests (new ConnectionClosedException) on both normal completion of incoming() and on error, and close() does the same, so callers observe a failure instead of hanging. The receive-loop error is now logged (logger.error) rather than printStackTrace'd. - connect() was unsynchronized (ISSUE-4). Concurrent first commands could each pass the isActive check and open a duplicate session, leaking the extra socket and listener. connect() is now guarded by a double-checked connectMutex, and the prepareExpert/prepareHeadless block by a separate prepareMutex (distinct mutex avoids re-entrancy: prepare* issue ONE_SHOT commands that skip that block). - HttpClients were never closed (ISSUE-9). KtorWebSocketTransport.close() now closes its engine-backed client, and DefaultBrowser.stop() closes the HTTP-API client, so create/stop cycles no longer leak threads/connection pools. Verified red->green with ConnectionLifecycleTest (fake transport): callCommand fails with ConnectionClosedException on disconnect and on close, and concurrent first commands open the transport exactly once. All three fail against the unfixed code and pass after. ISSUE-9 is resource cleanup, verified by inspection plus the real-browser create/stop suite. Full :core:jvmTest (real Chrome) + :opentelemetry:jvmTest pass; macOS native tests pass; Linux/MinGW compile.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three connection-ownership issues, sharing the lifecycle root cause. Builds on the
WebSocketTransportseam +pendingRequestsregistry from #73.ISSUE-3 — disconnect/close hung in-flight commands
When the WebSocket dropped or
close()was called, the receive loop just ended; anycallCommandparked on its response deferred waited forever (the loop onlyprintStackTrace'd). NowfailPendingRequests(...)drains the registry and completes every waiter exceptionally with a newConnectionClosedException:incoming()(socket closed),logger.error, notprintStackTrace),close().Callers observe a failure instead of hanging.
ISSUE-4 — unsynchronized
connect()opened duplicate sessionsConcurrent first commands could each pass the
isActivecheck and open a second session, leaking the extra socket + listener.connect()is now guarded by a double-checkedconnectMutex; theprepareExpert/prepareHeadlessblock by a separateprepareMutex(distinct mutex avoids re-entrancy —prepare*issueONE_SHOTcommands that skip that block).ISSUE-9 — HttpClients never closed
KtorWebSocketTransport.close()now closes its engine-backedHttpClient;DefaultBrowser.stop()closes the HTTP-API client.createBrowser/stopcycles no longer leak threads/connection pools.Verification (red → green)
New
ConnectionLifecycleTest(fake transport,UnconfinedTestDispatcher) — all three fail against the unfixed code and pass after:callCommandfails withConnectionClosedExceptionon disconnect (RED: hang → timeout).callCommandfails withConnectionClosedExceptionon close (RED: hang → timeout).ISSUE-9 is resource cleanup — not deterministically unit-testable without injecting clients, so verified by inspection plus the real-browser
createBrowser/stopsuite.Full
:core:jvmTest(real Chrome) +:opentelemetry:jvmTestpass; macOS native tests pass; Linux/MinGW compile.Test plan
./gradlew :core:jvmTest --tests "*.ConnectionLifecycleTest"— red on unfixed code, green after./gradlew :core:jvmTest(real Chrome) — all pass./gradlew :opentelemetry:jvmTest— all pass./gradlew :core:macosArm64Test+ Linux/MinGW compile — pass