Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Erratic test failure in master #7249

Closed
mramato opened this issue Nov 8, 2018 · 27 comments · Fixed by #10707
Closed

Erratic test failure in master #7249

mramato opened this issue Nov 8, 2018 · 27 comments · Fixed by #10707

Comments

@mramato
Copy link
Contributor

mramato commented Nov 8, 2018

The last master build failed with the below error. If I had to guss, this test is using an external server. We should probably bring whatever data its using in-house to make sure CI doesn't randomly fail.

CC @likangning93 @OmarShehata

  Core/GoogleEarthEnterpriseTerrainData
    upsample
      ✗ works for all four children of a simple quad
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5096)
	<Jasmine>
5.002 secs: Core/GoogleEarthEnterpriseTerrainData upsample works for all four children of a simple quad
3.236 secs: Core/GoogleEarthEnterpriseTerrainData createMesh creates specified vertices plus skirt vertices
1.667 secs: Core/GoogleEarthEnterpriseMetadata resolves readyPromise
1.187 secs: Core/VideoSynchronizer Syncs time when not looping
1.151 secs: Core/ApproximateTerrainHeights initializes
1.132 secs: Scene/GroundPrimitive does not pick when allowPicking is false
1.109 secs: Scene/GroundPrimitive larger scene renders batched instances
1.087 secs: Scene/Vector3DTileContent renders polygons
1.069 secs: DataSources/GeometryVisualizer Creates and removes static color open geometry
1.053 secs: Scene/GroundPrimitive picking in 2D
Firefox 63.0.0 (Linux 0.0.0): Executed 10347 of 10355 (1 FAILED) (skipped 8) (2 mins 22.893 secs / 2 mins 13.272 secs)
TOTAL: 1 FAILED, 10346 SUCCESS
@likangning93
Copy link
Contributor

likangning93 commented Nov 8, 2018

I think this one might just be slow, it passes in airplane mode on my Macbook Pro but times out with-or-without a network connection on my N2920-based netbook. [EDIT] After a cursory glance at the spec, it also doesn't look like it's requesting anything.

@OmarShehata
Copy link
Contributor

OmarShehata commented Jan 2, 2019

I just had a build fail on Travis I believe for a similar reason as above:

    Scene/PointCloudEyeDomeLighting
    ✗ adds a clear command and a post-processing draw call
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ does not change commands for pick calls
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
  Scene/Tileset3DTileContent
    ✗ resolves readyPromise
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ destroys
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ gets properties
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
  Scene/Batched3DModel3DTileContentClassification
    ✗ renders with batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with binary batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
  Scene/Batched3DModel3DTileContent
    ✗ recognizes the legacy 20-byte header
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ recognizes the legacy 24-byte header
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ logs deprecation warning for use of BATCHID without prefixed underscore
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ resolves readyPromise
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with batch table binary
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders without batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with all features translucent
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with a mix of opaque and translucent features
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with textures
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with a tile transform and box bounding volume
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with a tile transform and sphere bounding volume
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with a tile transform and region bounding volume
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ picks with batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ picks without batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ can get features and properties
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ throws when calling getFeature with invalid index
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ gets memory usage
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ Links model to tileset clipping planes based on bounding volume clipping
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ Links model to tileset clipping planes if tileset clipping planes are reassigned
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ rebuilds Model shaders when clipping planes change
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ transforms model positions by RTC_CENTER property in the features table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ destroys
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
  Scene/Cesium3DTileBatchTable
    ✗ renders tileset with batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders tileset without batch table
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders when vertex texture fetch is not supported
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with featuresLength greater than maximumTextureSize
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders with featuresLength of zero
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders translucent style
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders translucent style when vertex texture fetch is not supported
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders tileset with batch table hierarchy extension
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders tileset with batch table hierarchy using binary properties
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders tileset with batch table hierarchy with multiple parent classes
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders tileset with batch table hierarchy with no parents
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ renders tileset with legacy batch table hierarchy (pre-version 1.0)
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ warns about deprecated batch hierarchy (pre-version 1.0)
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>
    ✗ destroys
	Error: Timeout - Async callback was not invoked within 5000ms (set by jasmine.DEFAULT_TIMEOUT_INTERVAL) in node_modules/jasmine-core/lib/jasmine-core/jasmine.js (line 5451)
	<Jasmine>

Restarting the build fixed it.

@mramato
Copy link
Contributor Author

mramato commented Jan 3, 2019

Master just failed again https://travis-ci.org/AnalyticalGraphicsInc/cesium/builds/474883290?utm_source=github_status&utm_medium=notification We need to nip this in the bud.

@OmarShehata since you've been doing some test clean-up/optimization, do you think you can take a look at this? We worked hard to get CI consistently passing and now all of a sudden it's failing every dozen builds.

@OmarShehata
Copy link
Contributor

@mramato yeah I'll take a look at this today or tomorrow.

@OmarShehata
Copy link
Contributor

I wonder/hope there's some underlying issue here. It seems a lot less table than it was a week ago. This just failed on Travis:

Scene/raises the cameraMoveEvent event when the camera stops moving

But it wasn't a timeout (expected 0 to 1).

@mramato
Copy link
Contributor Author

mramato commented Jan 8, 2019

@OmarShehata any update here?

@OmarShehata
Copy link
Contributor

Sorry, this got pushed down my to do list with all the recent shuffles. I'm running some Travis builds now to see how consistent the failing tests are. If there are any recent failed builds like that whose logs still exist please post them here.

@OmarShehata
Copy link
Contributor

I'm able to replicate a subset of these failures by running them on a slower machine (and they pass fine when re-running the tests). I don't see an easy way around this other than refactoring each one of those tests to make it faster. Although it's still unclear why:

  • This is happening more frequently recently,
  • These tests vary wildly in timing. Core/VideoSynchronizer Syncs time when looping takes anywhere between 1 second, to 5.1 more frequently, up to 9 seconds.

Cranking the default timeout up to 20 is a band-aid solution that will make these timing spikes not stop the build until we refactor them. I can open a PR here and re-run the build a bunch of times to test that new timeout. Running it locally I haven't seen anything above ~10 seconds.

@mramato
Copy link
Contributor Author

mramato commented Jan 8, 2019

Well, for VideoSynchronizer, network latency may be at play (those tests should probably use a local, in-repository video instead of an externally hosted one). Please write up an issue.

That doesn't explain the other tests if they are not using external resources.

Do the other tests ever vary wildly when run in isolation or only when all ran together?

@OmarShehata
Copy link
Contributor

I'm running more tests to answer that, but just documenting a peculiar finding: GroundPrimitive/does not render whens how if false, just failed at the first expect:

expect(frameState.afterRender.length).toEqual(1);

The value was 2 instead of 1. This test does not seem to be relying on any external resources, and it was not a timeout, so just increasing the timeout wouldn't fix this intermittent failure. It failed when running all the tests together, so far hasn't failed when running in isolation.

@OmarShehata
Copy link
Contributor

The most common failure seems to be this set of tests #7249 (comment), and what's common there is that they all use Cesium3DTilesTester.loadTileset.

My best guess now is that something goes wrong with the scene object they all use, causing the tileset to never load and timeouts to occur. They can all be traced to:

Cesium3DTilesTester.waitForTilesLoaded = function(scene, tileset) {
        return pollToPromise(function() {
            scene.renderForSpecs();
            return tileset.tilesLoaded;
        }).then(function() {
            scene.renderForSpecs();
            return tileset;
        });
    };

If tileset.tilesLoaded is never true, all those failures will occur.

@OmarShehata
Copy link
Contributor

It took 2 re-builds for this to pass on Travis #7476

The first time, this failed #7249 (comment) and the second time, this did #7249 (comment).

Both of those are non-timeout tests. The only thing in common amongst all these failures is they rely on something happening in scene.render(). If this failing for any reason, it will cause those failures as well as the timeouts. I'm wondering if this has to do with our recent switch to FirefoxHeadless.

Each Spec file creates a scene, which creates a new canvas, but it does get cleaned up, and it should be running even on failure.

@mramato
Copy link
Contributor Author

mramato commented Jan 10, 2019

I'm wondering if this has to do with our recent switch to FirefoxHeadless.

We were already using FirefoxHeadless for built tests before this happened, so I doubt that's it.

@mramato
Copy link
Contributor Author

mramato commented Jan 10, 2019

@OmarShehata I think you are on to something and it would make sense that the delay is really the test waiting for something that never happens. Any idea why this is happening? CC @lilleyse

@OmarShehata
Copy link
Contributor

I've been seeing this one fail a lot:

SceneSpec/raises the cameraMoveEvent event when the camera stops moving

Which is very interesting because the two camera events tests, right before and right after it, are very stable. @lilleyse suggested refactoring this test (for example, it calls render twice, perhaps because the event doesn't always get triggered with a single render, which seems quite unreliable). That's something I'll try next that'll hopefully get a good chunk of the failures to disappear.

@mramato
Copy link
Contributor Author

mramato commented Jan 16, 2019

Pretty sure #7482 fixed most of the stability issues except for the cameraMoveEvent that @OmarShehata mentions above. @OmarShehata have you seen crashes since #7482 was merged? (I forced master a few times and didn't). I have seen the cameraMoveEvent one though, so it might good to fix or ignore that one ASAP. (if we ignore it, write up a new clean issue so we don't forget about it)

@OmarShehata
Copy link
Contributor

@mramato I haven't seen any other test failures yet since. I'll refactor the camera test this week and that'll hopefully curb these failures. The only thing that will remain is tests using external resources (described here #7244) but those don't seem to be a critical issue on Travis currently.

@OmarShehata
Copy link
Contributor

Just saw a failure today with this same sequence: #7249 (comment)

@hpinkos
Copy link
Contributor

hpinkos commented Mar 28, 2019

This seems to be getting worse. I'm seeing travis fail with random tests more often than I see it pass.

@mramato
Copy link
Contributor Author

mramato commented Mar 28, 2019

I think we need to experiment with bumping the test timeout to a large value (like 30 seconds) when running on travis and see if that helps improve things. travis' infrastructure seems to have gotten work and their servers are getting slower and slower.

@OmarShehata
Copy link
Contributor

OmarShehata commented Apr 4, 2019

There's a new erratic failure discovered in #7713

Core/Resource fetchImage without ImageBitmap can load an image fails erratically when the loaded image has a width/height of 16 instead of 1. The image request isn't failing, it's almost as if a request from the other tests is interfering with it? (Many of which are 16x16). I haven't been able to recreate it locally or in after various Travis rebuilds.

The exact failures are:

Core/Resource
    ✗ can load an image preferring blob
	Expected 16 to equal 1.
    ✗ can load an image from a data URI
	Expected 16 to equal 1.
    fetchImage without ImageBitmap
      ✗ can load an image
	Expected 16 to equal 1.

@OmarShehata
Copy link
Contributor

I'm starting to get the failures above locally (but only when running all the tests) which is good news, I might be able to look into and get a fix for this now.

@lilleyse
Copy link
Contributor

lilleyse commented Oct 2, 2019

I've been seeing a new error lately: changing the camera frustum does not cause continuous rendering in requestRenderMode

Once in #8242 and again in #8205. Restarting the build usually makes CI pass again.

Build details: https://travis-ci.org/AnalyticalGraphicsInc/cesium/builds/592619357?utm_source=github_status&utm_medium=notification

@mramato
Copy link
Contributor Author

mramato commented Oct 25, 2019

After some CI fixes and the ES6 migration, the only consistent CI failure I'm seeing is with Vector3DTileContent These tests look overly complicated and have a lot of setup, so I'm leaning towards the specs being the problem: https://travis-ci.org/AnalyticalGraphicsInc/cesium/builds/602749111

mramato added a commit that referenced this issue Oct 31, 2019
@mramato
Copy link
Contributor Author

mramato commented Oct 31, 2019

Ignored Vector3DTileContent specs for now in #8317 until we have time to look at them.

@sanjeetsuhag
Copy link
Contributor

sanjeetsuhag commented Aug 18, 2022

As of fd66f69, there's a bunch of test failures in the Vector3DTileContent suite that I can reproduce locally:

image

CI seems to have passed though.

Moreover, it seems to me that each spec is trying to do too much. We're testing for render as well as pick everytime, and there's a bunch of nested verify style functions that are invoked in each spec. I'm not opposed to the robustness of this is striving for, but I'm leaning towards re-writing the entire suite with smaller, more focused tests.

@ggetz
Copy link
Contributor

ggetz commented Aug 18, 2022

CI seems to have passed though.

Do you see the same when running with --webgl-stub locally?

I'm not opposed to the robustness of this is striving for, but I'm leaning towards re-writing the entire suite with smaller, more focused tests.

This sounds like the best approach to me. Specs should only test one thing, and re-writing the specs seems to be the consensus in thread anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: Issue/PR closed
Development

Successfully merging a pull request may close this issue.

7 participants