fix: Unhandled "WebSocket connection closed" when CDP connection is unstable #29830

Merged
cacieprins merged 22 commits into develop from cacie/fix/ws-disconnected-trampoline
Jul 16, 2024

Conversation

cacieprins
Contributor

Additional details

When CDP throws certain errors on a command, CriClient enqueues the original command and attempts to reconnect to CDP. When CDP reconnects, CriClient tries to send this command again, after restoring subscriptions and enablements. This bug appears to occur when CDP disconnects immediately after reconnecting while a previously enqueued command is pending.

The fix is to re-enqueue failed commands upon reconnect, and attempt to reconnect again.
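
As a rough illustration of the re-enqueue behavior described above, here is a minimal, synchronous sketch. It is not the CriClient implementation: `Command`, `Connection`, `ReconnectingClient`, and `maxReconnects` are hypothetical names, and the real client is asynchronous. The point is only that a command stays in the queue until it is actually sent, so a disconnect right after a reconnect leads to another reconnect rather than an unhandled error.

```typescript
// Hypothetical names: Command, Connection and ReconnectingClient are
// illustrative only and are not the actual CriClient API.
type Command = { method: string, params?: object }

interface Connection {
  // Throws when the underlying socket has closed.
  send (command: Command): void
}

class ReconnectingClient {
  private queue: Command[] = []
  private connect: () => Connection
  private maxReconnects: number

  constructor (connect: () => Connection, maxReconnects = 5) {
    this.connect = connect
    this.maxReconnects = maxReconnects
  }

  enqueue (command: Command): void {
    this.queue.push(command)
  }

  // Sends every queued command in order. A command is removed from the queue
  // only after it is sent, so a disconnect mid-flush leaves it enqueued and
  // triggers another reconnect instead of surfacing an unhandled error.
  flush (): Command[] {
    const sent: Command[] = []
    let reconnects = 0
    let connection = this.connect()

    while (this.queue.length > 0) {
      const command = this.queue[0]

      try {
        connection.send(command)
        this.queue.shift()
        sent.push(command)
      } catch (err) {
        reconnects += 1
        // Give up once the (assumed) reconnect budget is exhausted.
        if (reconnects > this.maxReconnects) throw err
        connection = this.connect()
      }
    }

    return sent
  }
}
```

Bounding the number of reconnects is an assumption of this sketch; it keeps a permanently dead connection from looping forever.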

This PR includes that fix, as well as some quality-of-life refactors to CriClient:

  • extracts most command queue logic to its own class
  • divides state restoration (subscriptions and enablements) from command retries (when a command fails due to being pending when CDP disconnects)
  • removes bespoke retry logic to use a shared async-retry util
  • refactors some unit tests for grokability
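
To make the "shared async-retry util" idea concrete, here is a minimal sketch of what such a helper can look like. The name `asyncRetry` and the options `maxAttempts`, `retryDelay`, and `shouldRetry` are assumptions made for illustration; the signature of the util used in this PR may differ.

```typescript
// Hypothetical options shape, not taken from the PR.
type RetryOptions = {
  maxAttempts: number
  // Milliseconds to wait before the next attempt (optional).
  retryDelay?: (attempt: number) => number
  // Return false to stop retrying on a given error (optional).
  shouldRetry?: (err: unknown) => boolean
}

// Wraps an async function so that rejected calls are retried up to
// `maxAttempts` times, then rethrows the last error.
function asyncRetry<TArgs extends unknown[], TResult> (
  fn: (...args: TArgs) => Promise<TResult>,
  options: RetryOptions,
): (...args: TArgs) => Promise<TResult> {
  return async (...args: TArgs): Promise<TResult> => {
    let lastError: unknown

    for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
      try {
        return await fn(...args)
      } catch (err) {
        lastError = err
        if (options.shouldRetry && !options.shouldRetry(err)) break

        const delayMs = options.retryDelay ? options.retryDelay(attempt) : 0

        if (attempt < options.maxAttempts && delayMs > 0) {
          await new Promise((resolve) => setTimeout(resolve, delayMs))
        }
      }
    }

    throw lastError
  }
}
```

A caller could then wrap a connect function once, for example `const connectWithRetry = asyncRetry(connect, { maxAttempts: 20 })`, where `connect` is whatever async function opens the connection, instead of hand-rolling a retry loop at each call site.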

Steps to test

The original issue is very difficult to reproduce. Confidence in the fix comes from new tests that reproduce the issue. Typical CI & manual testing is sufficient.

How has the user experience changed?

PR Tasks

@cacieprins cacieprins changed the title from 'Cacie/fix/ws disconnected trampoline' to 'Fix: Unhandled "WebSocket connection closed" when CDP connection is unstable' Jul 10, 2024
@cacieprins cacieprins marked this pull request as ready for review July 10, 2024 18:17
@cacieprins cacieprins changed the title from 'Fix: Unhandled "WebSocket connection closed" when CDP connection is unstable' to 'fix: Unhandled "WebSocket connection closed" when CDP connection is unstable' Jul 10, 2024

cypress bot commented Jul 10, 2024

3 flaky tests on run #56153 ↗︎

0 4677 951 0 Flakiness 3

Details:

Merge branch 'develop' into cacie/fix/ws-disconnected-trampoline
Project: cypress Commit: 6a4031b7bd
Status: Passed Duration: 16:26 💡
Started: Jul 15, 2024 7:37 PM Ended: Jul 15, 2024 7:54 PM
Flakiness  cypress/e2e/commands/net_stubbing.cy.ts • 3 flaky tests • 5x-driver-webkit

  • network stubbing > intercepting request > can delay and throttle a StaticResponse
  • ... > with `resourceType` > can match a proxied image request by resourceType
    https://cloud.cypress.io/projects/ypt4pf/runs/56153/overview/c1b89eb6-b805-42fb-9657-00502c5eba47?reviewViewBy=FLAKY&utm_source=github&utm_medium=flaky&utm_campaign=view%20test
  • ... > stops waiting when an xhr request is canceled
    https://cloud.cypress.io/projects/ypt4pf/runs/56153/overview/5ae404ce-0ca6-45cc-9278-4a6b25295354?reviewViewBy=FLAKY&utm_source=github&utm_medium=flaky&utm_campaign=view%20test

Review all test suite changes for PR #29830 ↗︎

cli/CHANGELOG.md (outdated, resolved)
@cacieprins cacieprins requested review from mschile and removed request for AtofStryker July 15, 2024 15:38
import type { CdpCommand } from './cdp_automation'
import Debug from 'debug'

const debug = Debug('cypress:server:browsers:cdp-command-queue')
Collaborator

How chatty will these logs be? Do we want to make some of them verbose and some of them non-verbose?

Contributor Author

Not very chatty - commands are only pushed into the command queue when they come in while the websocket connection is unexpectedly disconnected. params might be a larger payload, though, and could benefit from verbose logging (or removal, as it's not pertinent to the command queue operations).

Contributor Author

Addressed in fc52033

Collaborator

@ryanthemanuel ryanthemanuel left a comment

This looks good. I like that we're iteratively refactoring a lot of this stuff. Nice work!

@cacieprins cacieprins merged commit 8a48ee7 into develop Jul 16, 2024
82 of 83 checks passed
@cacieprins cacieprins deleted the cacie/fix/ws-disconnected-trampoline branch July 16, 2024 13:46
@cypress-bot
Contributor

cypress-bot bot commented Jul 16, 2024

Released in 13.13.1.

This comment thread has been locked. If you are still experiencing this issue after upgrading to
Cypress v13.13.1, please open a new issue.

@cypress-bot cypress-bot bot locked as resolved and limited conversation to collaborators Jul 16, 2024
Development

Successfully merging this pull request may close these issues.

Error: WebSocket connection closed (in cypress 13.10)
6 participants