An attempt at fixing intermittent test failure #1816

Merged (1 commit) on Jul 29, 2015

@floehopper
Contributor

commented Jul 15, 2015

Capybara::Poltergeist::JavascriptError

This error has been occurring intermittently, but annoyingly regularly,
in the Jenkins CI build. The reported underlying error is:

TypeError: 'null' is not an object (evaluating 'data['title']')

See #1624 for more details.

I've seen the error happen on my local machine and it got to a point yesterday
where it seemed to be happening reasonably regularly, so I decided to
investigate further.

I found I could reproduce the problem by running only the "with JavaScript"
tests in ReportAProblemTest, but the error only seemed to happen at all
reliably if I ran multiple tests together, not just one on its own.

When the error occurred, I noticed it seemed to always coincide with a JSON
request to SmartAnswersController#show logging something like:

Completed 500 Internal Server Error in 13m

However, there was never any mention of an exception or a stack trace. So I
then edited ActionController::LogSubscriber#process_action to log the details
of the exception:

WebMock::NetConnectNotAllowedError
Real HTTP connections are disabled.
Unregistered request: GET http://contentapi.dev.gov.uk/bridge-of-death.json
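
As an alternative to editing the log subscriber, here is a sketch that is an assumption and not what was actually done. Rails puts the exception class and message in the payload of the `process_action.action_controller` notification, so a subscriber added to the test helper can print them. The placement and the output format are hypothetical.

```ruby
# Test-helper fragment (hypothetical placement): print the exception that
# caused a 500 response, which the default request log line omits.
require "active_support"
require "active_support/notifications"

ActiveSupport::Notifications.subscribe("process_action.action_controller") do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)

  # payload[:exception] is [class_name, message] when the action raised,
  # and is absent otherwise.
  exception_class, message = event.payload[:exception]
  next unless exception_class

  warn "#{event.payload[:controller]}##{event.payload[:action]} " \
       "raised #{exception_class}: #{message}"
end
```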

The exception made it look as if the WebMock stubbing of GdsApi::ContentApi
set up by the call to
GdsApi::TestHelpers::ContentApi#stub_content_api_default_artefact in
EngineIntegrationTest#setup was not working correctly. What was more
confusing was that this stubbing was working most of the time, but just not
all the time, even though the same URLs were being requested.

In reading the WebMock documentation to come up with a plausible hypothesis
for why this might be happening, I came across the net_http_connect_on_start
option. More in hope than expectation, I tried enabling this option and the
problem appeared to go away. More encouragingly, when I disabled it again, the
problem came back.

Although I can't say for sure that this fixes the problem, or exactly how it
might be fixing it, I'm about 95% sure that it does. So I think it's worth
trying it out to see whether the problem recurs in the Jenkins CI build.
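
For reference, a minimal sketch of what enabling the option looks like. The commit's diff is not shown on this page, so the placement in the test helper is an assumption, and any other options the project passes to WebMock are omitted.

```ruby
# Test-helper fragment (placement is an assumption).
require "webmock"

# Keep real HTTP connections disabled, but have WebMock perform the connect
# step when Net::HTTP.start is called instead of delaying it until the first
# request is made.
WebMock.disable_net_connect!(net_http_connect_on_start: true)
```

According to the WebMock README, the same option can also be passed to `WebMock.allow_net_connect!`.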

@chrisroos

Contributor

commented Jul 16, 2015

> I found I could reproduce the problem by running only the "with JavaScript"
> tests in ReportAProblemTest.

How were you running this? I've just tried running it a number of times using `ruby test/integration/engine/report_a_problem_test.rb` and haven't seen the error.

@floehopper

Contributor Author

commented Jul 16, 2015

@chrisroos: That's how I was running it too. After a while I disabled the non-JS versions of the tests by commenting out the relevant lines of the with_and_without_javascript method. And a bit later I used the following command to run the tests over and over until they failed:

    while ruby test/integration/engine/report_a_problem_test.rb; do :; done
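
The same run-until-failure loop can be sketched in Ruby, which makes it easy to cap the number of runs and report which run failed. The helper name `run_until_failure` and the `max_runs` parameter are hypothetical and not part of the project.

```ruby
# Calls the given block repeatedly until it returns a falsy value.
# Returns the 1-based number of the first failing attempt, or nil if
# every attempt up to max_runs succeeded.
def run_until_failure(max_runs: 100)
  (1..max_runs).each do |attempt|
    return attempt unless yield(attempt)
  end
  nil
end

# Usage (system returns true only when the command exits successfully):
#   failed_on = run_until_failure do
#     system("ruby", "test/integration/engine/report_a_problem_test.rb")
#   end
#   puts failed_on ? "Failed on run #{failed_on}" : "No failure seen"
```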

My hypothesis is that the problem is some kind of timing/race condition, so I'm not surprised that you didn't see the problem. I suspect I was just lucky (?) that the problem was occurring reasonably regularly on my local machine yesterday.

My main reason for submitting this PR was to see (a) whether anyone understood more about WebMock's net_http_connect_on_start option; and (b) whether people thought it was worth trying out this fix on CI.

@floehopper

Contributor Author

commented Jul 29, 2015

The gds-api-adapters gem uses the rest-client gem, which in turn uses Net::HTTP. More significantly, it uses the Net::HTTP.start method. I think this makes it more likely that we do need to enable the net_http_connect_on_start option on WebMock.
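
To illustrate why Net::HTTP.start is relevant, here is a stdlib-only sketch with no WebMock involved. Net::HTTP.new does not open a connection, whereas Net::HTTP.start opens one before any request is issued. The helper name `connection_states` is hypothetical, and the throwaway local TCP server exists only so there is something to connect to.

```ruby
require "net/http"
require "socket"

# Reports whether a connection is open (a) right after Net::HTTP.new and
# (b) inside a Net::HTTP.start block, before any request has been made.
def connection_states
  server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
  port = server.addr[1]

  # Passing nil as the proxy address bypasses any proxy set in the environment.
  lazy = Net::HTTP.new("127.0.0.1", port, nil)
  after_new = lazy.started?

  inside_start = Net::HTTP.start("127.0.0.1", port, nil) { |http| http.started? }

  { new: after_new, start: inside_start }
ensure
  server&.close
end
```

The WebMock README describes its default behaviour as delaying the connection until a request is made, which differs from this; the net_http_connect_on_start option is its documented way to restore the connect-on-start behaviour.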

What do people think about merging this to see whether it stops the intermittent build failures on CI?

@tadast

Contributor

commented Jul 29, 2015

It doesn't have any effect on production, so I only see benefits in merging this 👍

@floehopper floehopper force-pushed the fix-capybara-poltergeist-javascript-error branch from 0b64ec1 to ac4ac83 Jul 29, 2015

@chrisroos

Contributor

commented Jul 29, 2015

Let's get it merged and see whether it helps with the intermittent failing tests.

An attempt at fixing intermittent test failure
    Capybara::Poltergeist::JavascriptError

This error has been occurring intermittently, but annoyingly regularly,
in the Jenkins CI build. The reported underlying error is:

    TypeError: 'null' is not an object (evaluating 'data['title']')

See [1] for more details.

I've seen the error happen on my local machine and it got to a point yesterday
where it seemed to be happening reasonably regularly, so I decided to
investigate further.

I found I could reproduce the problem by running only the "with JavaScript"
tests in `ReportAProblemTest`, but the error only seemed to happen at all
reliably if I ran multiple tests together, not just one on its own.

When the error occurred, I noticed it seemed to always coincide with a JSON
request to `SmartAnswersController#show` logging something like:

    Completed 500 Internal Server Error in 13m

However, there was never any mention of an exception or a stack trace. So I
then edited `ActionController::LogSubscriber#process_action` to log the details
of the exception:

    WebMock::NetConnectNotAllowedError
    Real HTTP connections are disabled.
    Unregistered request: GET http://contentapi.dev.gov.uk/bridge-of-death.json

The exception made it look as if the `WebMock` stubbing of `GdsApi::ContentApi`
set up by the call to
`GdsApi::TestHelpers::ContentApi#stub_content_api_default_artefact` in
`EngineIntegrationTest#setup` was not working correctly. What was more
confusing was that this stubbing was working most of the time, but just not
all the time, even though the same URLs were being requested.

In reading the `WebMock` documentation to come up with a plausible hypothesis
for why this might be happening, I came across the `net_http_connect_on_start`
option [2]. More in hope than expectation, I tried enabling this option and the
problem appeared to go away. More encouragingly, when I disabled it again, the
problem came back.

Although I can't say for sure that this fixes the problem, or exactly how it
might be fixing it, I'm about 95% sure that it does. So I think it's worth
trying it out to see whether the problem recurs in the Jenkins CI build.

[1]: #1624
[2]: https://github.com/bblimke/webmock#connecting-on-nethttpstart

@floehopper floehopper force-pushed the fix-capybara-poltergeist-javascript-error branch from ac4ac83 to 1d78712 Jul 29, 2015

@floehopper floehopper merged commit 1d78712 into master Jul 29, 2015

1 check passed

default "Build #2636 succeeded on Jenkins"

@chrisroos chrisroos deleted the fix-capybara-poltergeist-javascript-error branch May 27, 2016
