Flaky failure mode when running tests #1874

Closed · maherbeg opened this issue Oct 5, 2016 · 45 comments

@maherbeg commented Oct 5, 2016

Do you want to request a feature or report a bug?
Bug

What is the current behavior?
Very rarely (as in we've only seen this happen once so far), we see a failure when running a test, like so:

FAIL __tests/test-file.js
   ● Test suite failed to run

     b.js:94
                         /* istanbul ignore next */_react2.def
                                                           ^^^
     SyntaxError: missing ) after argument list

       at transformAndBuildScript (node_modules/jest-runtime/build/transform.js:284:10)
       at Object.<anonymous> (a.js:5:47)

The referenced line in a.js is an ES6 import of b.js. We have no istanbul ignore comments in our code base.

What is the expected behavior?
I don't expect this strange type of failure. Because it's only happened once so far, I unfortunately have nothing useful for reproduction. I presume there's a race condition between the transform and test execution, but I'm not sure where.

Happy to provide any more info!

@maherbeg (Author) commented Oct 7, 2016

Here's another sporadic, random test failure in the transform, in a file that usually works:

     /vendor/jquery-2.1.0.min.js:4
     }});
     ^
     SyntaxError: Unexpected token }

       at transformAndBuildScript (node_modules/jest-runtime/build/transform.js:284:10)
       at Object.<anonymous> (src/compliance/modules/subjects.js:281:739)
@cpojer (Collaborator) commented Oct 7, 2016

Seems like @Daniel15 ran into this at FB too.

@Daniel15 (Member) commented Oct 7, 2016

Yeah, I encountered the same thing, but very sporadically (maybe 1 in 30 or 40 test runs). It's failing at the exact same location (transform.js:284:10) every time:

    /..../html/shared/react/renderers/dom/client/eventPlugins/SimpleEventPlugin.js:274
    }});
      ^
    SyntaxError: Unexpected token )

      at transformAndBuildScript (scripts/jest-config/node_modules/jest-runtime/build/transform.js:284:10)
      at Object.inject (html/shared/react/renderers/dom/shared/ReactDefaultInjection-core.js:61:19)
      at Object.<anonymous> (html/shared/react/renderers/dom/ReactDOM-core.js:28:34)
@aaronabramov (Member) commented Oct 13, 2016

The only race condition I can think of is two different child processes writing the same transformed file to the cache at the same time:

fs.writeFileSync(cacheFilename, transformedSource)

Can that be the case?
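To make that hypothesis concrete, here's a minimal demo sketch (entirely hypothetical: the path, the env flag, and the file size are made up, and this is not Jest's code). fs.writeFileSync truncates the destination before writing, so a reader racing two unsynchronized writers can observe an empty or partially written file:

```js
// Hypothetical demo: two forked writers rewrite the same "cache" file
// while the parent reads it. Runs until interrupted (Ctrl+C).
const fs = require('fs');
const { fork } = require('child_process');

const CACHE_FILE = '/tmp/jest-cache-race-demo.js'; // made-up path
// Large enough that a write is unlikely to complete in a single syscall.
const SOURCE = 'module.exports = ' + JSON.stringify('x'.repeat(1 << 20)) + ';';

if (process.env.IS_WRITER) {
  // Writer: stands in for a child process persisting a transformed module.
  for (;;) fs.writeFileSync(CACHE_FILE, SOURCE);
} else {
  fs.writeFileSync(CACHE_FILE, SOURCE);
  fork(__filename, [], { env: { ...process.env, IS_WRITER: '1' } });
  fork(__filename, [], { env: { ...process.env, IS_WRITER: '1' } });
  setInterval(() => {
    // Reader: stands in for a test process requiring the cached module.
    const len = fs.readFileSync(CACHE_FILE, 'utf8').length;
    if (len !== SOURCE.length) console.log('corrupted read, length:', len);
  }, 1);
}
```

A truncated read here is the moral equivalent of the SyntaxError above: the test process evaluates half a file.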

@gregberge (Contributor) commented Dec 24, 2016

I have the issue on my repository; it happens very often:

https://travis-ci.org/neoziro/recompact/builds/186529118

Not locally, but on Travis, so likely a race condition. I will try to fix it in Jest; hints are welcome.

@gregberge (Contributor) commented Dec 27, 2016

I found a workaround: force execution in a single process using the -i or --runInBand option.

https://facebook.github.io/jest/docs/troubleshooting.html#tests-are-extremely-slow-on-docker-and-or-continuous-integration-server
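For CI setups, a minimal sketch of wiring that workaround into package.json (the script names are just an example):

```json
{
  "scripts": {
    "test": "jest",
    "test:ci": "jest --runInBand"
  }
}
```

This trades parallelism for determinism, so expect slower CI runs.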

sapegin added a commit to styleguidist/react-styleguidist that referenced this issue Jan 6, 2017
Because of a bug in Jest: facebook/jest#1874
@michaelAndersonCampingWorld commented

I get this same error on CI. I believe it's exacerbated by slow CPUs: I only see the problem locally when my laptop is being thermally throttled, but I see it quite a bit on Codeship, where run times are orders of magnitude longer than they are locally.

Test suite failed to run

/home/rof/src/github.com/GoodSamEnterprises/RVCare/node_modules/asn1/lib/ber/writer.js:210
encodeOctet(bytes,parseIn
^^^^^^^
SyntaxError: missing ) after argument list

at transformAndBuildScript (node_modules/jest-runtime/build/transform.js:320:12)
at Object.<anonymous> (node_modules/asn1/lib/ber/index.js:7:12)
at Object.<anonymous> (node_modules/asn1/lib/index.js:6:9)
@jakubzitny commented Apr 7, 2017

Any update on this, anyone? We're still getting this on CI; it's quite annoying 😭

@aaronabramov (Member) commented Apr 7, 2017

I tried to debug it and stress-test with multiple concurrent writes, but wasn't able to reproduce. What I noticed is that it almost never happens on a hot cache. What happens on our CI servers is that we get this failure on the first day after we check in a new test, and then it goes away after a few days.

@jakubzitny commented Apr 8, 2017

Thanks for the info, @dmitriiabramov. How do you cache on CI, though? We have a fresh Docker image for each test job on (GitLab) CI. I am not sure how to cache there; do you have any suggestions?

We deal with this by manually retrying the test that fails on this. It happens roughly 1 in 10 CI pipelines. Not sure if it's related to CI runner load.

I saw that you added a logTransformErrors config in #2159. Did it reveal anything interesting? I am open to helping with further debugging if you point me in the right direction.

targos added a commit to image-js/image-js that referenced this issue Apr 26, 2017
@Vanuan (Contributor) commented May 8, 2017

I have the following flaky error:

SyntaxError: Unexpected token ILLEGAL

at transformAndBuildScript (node_modules/jest-runtime/build/transform.js:320:12)
at Object.<anonymous> (node_modules/lodash-es/lodash.js:156:62)

Might it be related to using the --coverage flag?

@thymikee (Collaborator) commented May 8, 2017

This is unrelated. It seems lodash-es is not compiled to ES5, so you need to either import regular lodash or whitelist lodash-es so that Jest can compile it with Babel (or another compiler of your choosing). See http://facebook.github.io/jest/docs/en/tutorial-react-native.html#transformignorepatterns-customization for details.
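For illustration, a minimal sketch of that whitelist in the Jest config (the package name here is just the one under discussion; the reporter's actual config appears in the next comment):

```json
{
  "jest": {
    "transformIgnorePatterns": ["node_modules/(?!(lodash-es)/)"]
  }
}
```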

@Vanuan (Contributor) commented May 8, 2017

@thymikee This doesn't explain why it's flaky.

I have the following Babel transform:

var babel = require('babel-jest').createTransformer({
  babelrc: false,
  presets: [
    'react',
    ['es2015', {
      // import statements are supported since node 7
      // TODO switch to false after upgrade
      'modules': 'commonjs'
    }],
    'stage-1'
  ],
  plugins: ['transform-decorators-legacy']
});
module.exports = babel;

And it is added to the exceptions:

"transformIgnorePatterns": ["/node_modules\/(?!(react-tree-menu|lodash-es)).*/"],

And it is not specific to lodash; I see this in the ./src folder too.

@thymikee (Collaborator) commented May 8, 2017

Oh, sorry then!

@Vanuan (Contributor) commented May 8, 2017

BTW the line numbers are quite strange:

node_modules/lodash-es/lodash.js:156:62

Does it do the transform in memory?

@Vanuan (Contributor) commented May 11, 2017

@dmitriiabramov

The only race condition I can think of is two different child processes writing the same transformed file to the cache at the same time:

fs.writeFileSync(cacheFilename, transformedSource)

Can that be the case?

Oh, so it writes to the file from multiple processes without any mutexes? That will definitely cause corruption!

@Vanuan (Contributor) commented May 11, 2017

Maybe --no-cache will help?

Retry with --no-cache. Jest caches transformed module files to speed up test execution. If you are using your own custom transformer, consider adding a getCacheKey function to it: getCacheKey in Relay.

I'm using require('babel-jest').createTransformer. Maybe that one has the bug.
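For reference, here's a hedged sketch of what an explicit getCacheKey on a custom transformer could look like, assuming the Jest-20-era transformer API of process(src, filename, config, options) plus an optional getCacheKey; the hashing scheme is illustrative, not Jest's own:

```js
// Sketch: wrap babel-jest's transformer and supply our own cache key.
const crypto = require('crypto');
const babelJest = require('babel-jest');

const babel = babelJest.createTransformer({
  // ...the presets/plugins from the earlier comment would go here
});

module.exports = {
  process: babel.process,
  getCacheKey(fileData, filename, configString) {
    // Any change to the source, its path, or the Jest config must
    // change the key, or a stale cache entry would be reused.
    return crypto
      .createHash('md5')
      .update(fileData)
      .update('\0')
      .update(filename)
      .update('\0')
      .update(configString)
      .digest('hex');
  },
};
```

Note that a correct cache key doesn't fix the write race by itself; it only controls when the cache is consulted.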

@Vanuan (Contributor) commented May 11, 2017

Looks like the default cache directory is /tmp:
https://facebook.github.io/jest/docs/configuration.html#cachedirectory-string

That would explain why it is reproducible much more frequently on CI. So to reproduce it, you should run two Jest processes with exactly the same settings, and don't forget to clean the cache first.
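A sketch of that reproduction recipe as a Node script (the cache path and the npx invocation are assumptions about a typical setup, not a verified repro):

```js
// Hypothetical repro: start from a cold shared cache, then launch two
// identical Jest runs concurrently so both try to populate it at once.
const { execSync, spawn } = require('child_process');

// Assumption: the shared cache lives under /tmp/jest*; adjust as needed.
execSync('rm -rf /tmp/jest*');

for (let i = 0; i < 2; i++) {
  const run = spawn('npx', ['jest'], { stdio: 'inherit' });
  run.on('exit', (code) => {
    // A flaky SyntaxError from the transform would surface as a
    // nonzero exit in one of the two racing runs.
    if (code !== 0) console.error(`run ${i} exited with code ${code}`);
  });
}
```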

@jakubzitny commented May 11, 2017

@Vanuan We have Docker on CI: everything runs in its own container, so there aren't multiple processes writing to the same file. And we do see this on CI frequently.

@Vanuan (Contributor) commented May 11, 2017

@jakubzitny Each Jest run creates multiple processes unless you specify --runInBand.

I'm just pointing out why you can't reproduce this locally: the default cache directory is in /tmp, so unless you clean it up before each run, you won't be able to reproduce it.

I'm commenting on this:

What happens on our CI servers is that we get this failure on the first day after we check in a new test, and then it goes away after a few days.

It looks like they use a global cache directory on Facebook's CI servers, i.e. /tmp (or whatever cache directory is used) is never cleaned. That would speed things up but hide race-condition bugs like this one.

I also use Docker, but I mount the container's /tmp to the host's /tmp. Maybe you do this as well?

In either case, there's a race condition when the cache doesn't exist yet.

@Vanuan (Contributor) commented May 12, 2017

Unfortunately, there's no file locking in Node, so third-party libraries would be required. The cross-platform alternative is TCP port locking, but that would be more troublesome for the user.
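One lock-free alternative worth noting (my sketch, not necessarily what Jest's eventual fix did): write each cache entry to a process-unique temporary file and then rename it into place, since rename is atomic on POSIX filesystems:

```js
// Sketch: atomic cache write via temp-file-then-rename. Readers see
// either the old complete file or the new complete file, never a
// truncated one. Function and variable names are illustrative.
const fs = require('fs');

function writeCacheFileAtomically(cacheFilename, transformedSource) {
  // Unique per process, so concurrent writers never share a temp path.
  const tmpFilename = `${cacheFilename}.${process.pid}.tmp`;
  fs.writeFileSync(tmpFilename, transformedSource);
  // rename(2) atomically replaces the destination within a filesystem.
  fs.renameSync(tmpFilename, cacheFilename);
}
```

The last rename wins, which is fine here because both writers produce the same transformed output.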

@Vanuan (Contributor) commented May 12, 2017

Probably the best thing we can do now is --no-cache. But there should be a comparison of whether --runInBand is slower.

@Villar74 commented Jul 25, 2017

@Vanuan Thanks, I just need some help with setting up Jest.

@cpojer (Collaborator) commented Jul 26, 2017

Please feel free to upgrade to jest@20.1.0-delta.3, which should contain a fix for this issue, thanks to @jeanlauliac.

@cpojer closed this Jul 26, 2017
@jeanlauliac (Contributor) commented Jul 26, 2017

I think we could release a new revision of jest-runtime@20.0 so that we get the fix everywhere. I'm a little worried this could cause a regression, but at worst we can always revert.

@Daniel15 (Member) commented Jul 27, 2017

@jeanlauliac - Thank you so much for fixing this! It's been broken for a while and I was still sporadically hitting the same problem, so I really appreciate it 😃

@aaronabramov (Member) commented Aug 8, 2017

I'm still getting a lot of these errors at FB after the latest update. Seems like it's still not fixed :(

@onetrickwolf commented Aug 29, 2017

Any update on this? I am getting intermittent failures on CircleCI (it seems to happen during times of load, and always when max workers are set to 1). I tried updating to 20.1.0-echo.1, but it does not seem to help.

I have to rebuild several times to get tests to pass, even though they pass every time on my laptop. Any help would be appreciated.

@aaronabramov reopened this Aug 29, 2017
@MattCopenhaver commented Aug 30, 2017

I am also getting intermittent test failures. My team first noticed this at roughly the same time we upgraded react, react-dom, and react-addons-test-utils from 0.14.3 to 15.4.0 and enzyme from 2.4.1 to 2.9.1 (maybe a few days after). I'm not sure if this is related or not.

We have a test suite of about 2800 tests (not all of them test React components, though). At least one test fails very rarely on local workstations, but fairly often on our CI server. It does seem to be related to load on the workstation or server.


Here is an example of an error message:

Method “simulate” is only meant to be run on a single node. 0 found instead.

This is an enzyme error thrown when you try to run .simulate('click') on something that doesn't exist.

In this particular example, we run:
wrapper.find('li').simulate('click');
where wrapper is an enzyme.mount object, and get that error.

However, when I run console.log(mountedDOM.debug());, the li is clearly there.

Most, if not all, of the tests that fail intermittently are trying to interact with an enzyme.mount object.


Hopefully this provides some more data that will help diagnose this. We aren't really sure if it's React, Jest, enzyme, something in our project config, or something environment-related that is causing this problem.

@Vanuan (Contributor) commented Aug 31, 2017

That doesn't look like the original issue; it's something different, related to jsdom. The original issue was due to a filesystem write race condition. I suggest creating another issue and renaming this one to "Corrupted compile cache".

@SimenB (Collaborator) commented Mar 19, 2019

No activity here for a long time. Is this still happening with Jest 24?

@jeysal (Collaborator) commented Mar 19, 2019

@mcous, you mentioned this in Opentrons/opentrons#2815; perhaps you could tell us whether there are still problems with the latest version, since this is apparently hard to reproduce.

@mcous commented Mar 19, 2019

@jeysal we're still running Jest v23, and we continue to use --runInBand on CI. If I get a free minute, I can try out v24 and see if we still get failures in CI when running in parallel.

@mcous commented Jul 22, 2019

@jeysal sorry for the delay here. We (finally) upgraded to v24 and removed --runInBand in CI. After running for a few weeks, we have not seen the flaky failures we used to get.

@jeysal (Collaborator) commented Jul 23, 2019

@mcous thanks for the update :)
I'll close this, then; hope nobody is running into this kind of problem anymore.

@jeysal closed this Jul 23, 2019
@devangnaghera312 commented Feb 14, 2020

We are still experiencing these failures on circleci with jest@24.9.0

@github-actions bot commented May 11, 2021

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using Stack Overflow or our Discord channel for questions.

@github-actions bot locked as resolved and limited conversation to collaborators May 11, 2021