Firefox randomly failing to start in CI #7159

Closed
Seb-C opened this issue Apr 28, 2020 · 9 comments · Fixed by #7372

Comments

@Seb-C

Seb-C commented Apr 28, 2020

Current behavior:

When running tests with Firefox in CI (GitHub Actions), it randomly fails to start with this error:

Cypress could not connect to Firefox.

An unexpected error was received from Marionette connection:

Error: cannot open socket

To avoid this error, ensure that there are no other instances of Firefox launched by Cypress running.

It may be related to #6504, but because the environment seems to be different, I think a new issue is more appropriate.

Desired behavior:

Cypress runs reliably every time.

Test code to reproduce

Repository: https://github.com/Seb-C/test-cypress-action
CI script: https://github.com/Seb-C/test-cypress-action/blob/master/.github/workflows/tests.yml
(all tests are the default ones generated when first running cypress open in a new project)

Runs history (showing the randomness): https://github.com/Seb-C/test-cypress-action/actions
Logs of a failed run: https://github.com/Seb-C/test-cypress-action/runs/625344325?check_suite_focus=true

Versions

  • GitHub actions
  • Ubuntu 18.04.4 LTS
  • Firefox 74.0.1
  • Cypress 4.4.1
@jennifer-shehane
Member

Duplicate of #6392

@flotwig
Contributor

flotwig commented May 12, 2020

@jennifer-shehane This can be reopened. It seems like #6392 is specifically about a problem on Windows when launching the second spec in Firefox, whereas this issue is about Firefox intermittently failing to launch, which can happen on any system.

@flotwig flotwig reopened this May 12, 2020
@flotwig flotwig removed the type: duplicate label May 12, 2020
@Seb-C
Author

Seb-C commented May 13, 2020

I ran some additional tests to rule out a few possibilities:

  • Explicitly running cypress install every time: does not change anything
  • Checking for existing processes before and after: there is really no firefox running (I checked this in case the VM/container was somehow shared with other people)
  • pkill -9 firefox before running cypress: does not change anything either
  • I also tried to run Cypress as root, but then it somehow cannot find the Firefox binary

@flotwig flotwig self-assigned this May 13, 2020
@flotwig
Contributor

flotwig commented May 13, 2020

> * Checking for the existing processes before and after: there is really no firefox running (I checked this in case the VM/container were somehow shared with other people)
> * `pkill -9 firefox` before running cypress: it does not change anything as well

Interesting, did you check for firefox-bin in the process list too?

That leads me to think it's one of two things:

  1. There is an error while launching Firefox that leads to the process immediately exiting - this would be clearly visible with DEBUG=cypress:* enabled to show more logs
  2. Or, firefox is spawning too slowly, and there is some bug/timeout in the connection code that is throwing a confusing error. Connection code (we use a 3rd-party library for connection here, which may be hiding the true reason this error is thrown):
            reject = (err) => {
              throw err
            }
          }
          return (err) => {
            debug('error in marionette %o', { from, err })
            reject(errors.get('FIREFOX_MARIONETTE_FAILURE', from, err))
          }
        }
        await driver.connect()
          .catch(onError('connection'))
        await new Bluebird((resolve, reject) => {

@flotwig
Contributor

flotwig commented May 13, 2020

It's starting to look like (2) - GitHub actions might be underpowered and we're hitting a timeout while loading Firefox. I have a GitHub actions run here with debug logs that demonstrates this: https://github.com/flotwig/test-cypress-action/runs/671731574

It's very likely that the timeout for the Marionette connection is way lower compared to the timeout for the Firefox foxdriver connection/the Chrome CDP connection, because for those, we use custom retry logic in Cypress.

Adjusting setupMarionette in firefox-util to use the inbuilt retry mechanisms with a longer retry period would most likely fix this issue.
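The general idea of retrying a connection until a longer deadline expires can be sketched as follows. This is only an illustration under assumptions: `connectWithRetry`, `RetryOptions`, and the option names are hypothetical, and the code does not reflect the actual `setupMarionette` function or the retry options built into the Marionette client library.

```typescript
// Hypothetical sketch: a generic retry-until-deadline wrapper.
// None of these names come from Cypress or its dependencies.

interface RetryOptions {
  timeoutMs: number  // total time budget across all attempts
  intervalMs: number // pause between two attempts
}

async function connectWithRetry<T>(
  connect: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const deadline = Date.now() + opts.timeoutMs
  let attempts = 0
  let lastError: unknown

  while (true) {
    attempts += 1
    try {
      // Success on any attempt ends the loop immediately.
      return await connect()
    } catch (err) {
      // Remember the most recent failure so it can be reported.
      lastError = err
    }

    // Stop when waiting for another attempt would exceed the budget.
    if (Date.now() + opts.intervalMs >= deadline) {
      break
    }

    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs))
  }

  const reason = lastError instanceof Error ? lastError.message : String(lastError)

  throw new Error(
    `could not connect after ${attempts} attempt(s) within ${opts.timeoutMs}ms: ${reason}`,
  )
}
```

A caller would pass its own connection function, for example `connectWithRetry(() => driver.connect(), { timeoutMs: 50000, intervalMs: 100 })`, where the numbers are placeholders. Including the last underlying error in the final message keeps a slow start distinguishable from a browser that never launched.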

@Seb-C
Author

Seb-C commented May 14, 2020

> Interesting, did you check for firefox-bin in the process list too?

Yes, nothing called firefox at all.

actions-process-log.txt

> It's starting to look like (2) - GitHub actions might be underpowered and we're hitting a timeout while loading Firefox.

Interesting find! It seems to me that my project (which always uses 1~2 GB of memory) succeeds less often than this test project, which could be due to Cypress running more on swap:

              total        used        free      shared  buff/cache   available
Mem:        7093500     2418484      388408       41808     4286608     4326052
Swap:       4194300         268     4194032

@flotwig
Copy link
Contributor

flotwig commented May 15, 2020

Yeah, so the initial connection timeout was only 2.5 seconds, which seems way too short. 20 seconds still occasionally timed out, but 50 seconds seems to be a good sweet spot from my testing (failures are unrelated to this issue): https://github.com/flotwig/test-cypress-action/commits/master

Opened a PR: #7372

@cypress-bot cypress-bot bot added stage: work in progress, stage: needs review and removed stage: needs review, stage: work in progress labels May 15, 2020
@cypress-bot
Contributor

cypress-bot bot commented May 15, 2020

The code for this is done in cypress-io/cypress#7372, but has yet to be released.
We'll update this issue and reference the changelog when it's released.

@cypress-bot
Copy link
Contributor

cypress-bot bot commented May 20, 2020

Released in 4.6.0.

This comment thread has been locked. If you are still experiencing this issue after upgrading to
Cypress v4.6.0, please open a new issue.

@cypress-bot cypress-bot bot locked as resolved and limited conversation to collaborators May 20, 2020