Tracing test is flakey #7538
Comments
As #7538 describes, the tracing test is especially flakey. This change skips running this test unless the `RUN_FLAKEY_TEST` environment variable is set.
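A minimal sketch of what such an opt-in guard could look like in a Go test. The helper names `flakyEnabled` and `skipUnlessFlakyEnabled` are hypothetical and not taken from the linkerd2 repository; the sketch also assumes that an empty value counts as "not set".

```go
package main

import (
	"os"
	"testing"
)

// flakyEnvVar is the opt-in variable named in the change description.
const flakyEnvVar = "RUN_FLAKEY_TEST"

// flakyEnabled reports whether the opt-in variable has a non-empty value.
// The lookup function is injected so the check can be exercised without
// modifying the real process environment.
// Assumption: an empty value is treated the same as an unset variable.
func flakyEnabled(lookup func(string) string) bool {
	return lookup(flakyEnvVar) != ""
}

// skipUnlessFlakyEnabled is a hypothetical helper that skips the calling
// test unless the opt-in variable is present in the environment.
func skipUnlessFlakyEnabled(t testing.TB) {
	t.Helper()
	if !flakyEnabled(os.Getenv) {
		t.Skipf("skipping flaky test; set %s to run it", flakyEnvVar)
	}
}
```

A test would call `skipUnlessFlakyEnabled(t)` as its first statement, so a default CI run reports the test as skipped rather than failed.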
I dug further into this test and have the following findings:
@mateiidavid WDYT?
@Pothulapati do we know how long it usually takes for emojivoto to deploy? I wonder why it would take so long to start up; we use it in other tests and I don't think they take as long as
@Pothulapati I looked through the tracing test and there are a couple of things I think we can do:
We could move
One solution from my side would be to get rid of emojivoto and just rely on slow-cooker and nginx (the latter is already used in the test).
Edit: raised #7587 to investigate the possibility of caching the images for emojivoto; maybe that would help?
I had a similar question too and I think the reason here is that the
Yeah, as each test seems to have its own inject configuration, this does not seem like a great way to fix this. We should probably start with AOT image loading, and then see if it's still slow?
I did some testing and realized the bottleneck is the loading of the nginx-ingress image. Unpacked, it weighs 555MB. Upgrading to 0.33.0 brings that down to 327MB, but that would require updating other things in the test. OTOH, preloading the image didn't make much of a difference for me in terms of total time, but maybe it can help avoid the flakiness. Replacing that ingress with another one might be another avenue to improve times...
@alpeb That's a good idea! I will start working on this, i.e., performing the same test without using nginx-ingress.
Part of #7538 This PR removes the `nginx-ingress` from the tracing integration test, and also replaces the emojivoto manifest with the official manifest from the getting started guide. Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Part of #7538

This PR removes the `nginx-ingress` from the tracing integration test, and also replaces the emojivoto manifest with the official manifest from the getting started guide. This should increase the speed of this test, as initializing `nginx-ingress` [seems to take a while](#7538 (comment)).

```bash
# Before
Test script: [tracing] Params: []
ok github.com/linkerd/linkerd2/test/integration/tracing 222.845s

# After
Test script: [tracing] Params: []
ok github.com/linkerd/linkerd2/test/integration/tracing 170.030s
```

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Notably, even the merge of #7601 (f2ad5a9 log) failed on this test:
I've seen other branches fail on main as well, like #7623 (0704f1b log):
Skip tracing test for now until we can make it less flakey. See #7538 Signed-off-by: Alex Leong <alex@buoyant.io>
Per #7538, the tracing test is especially flakey. This change skips the test when certain failures are encountered Signed-off-by: Oliver Gould <ver@buoyant.io>
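One way to express "skip when certain failures are encountered" in Go is sketched below. This is an illustration, not the code from the actual change: the helper names `isKnownFlakyMessage` and `failOrSkip` are hypothetical, and the entries in `knownFlakyFailures` are placeholders rather than the real error strings seen in the tracing test.

```go
package main

import (
	"strings"
	"testing"
)

// knownFlakyFailures lists substrings of error messages that are treated
// as spurious. These entries are placeholders for illustration only; they
// are not the real failures reported for the tracing test.
var knownFlakyFailures = []string{
	"placeholder: timed out waiting for condition",
	"placeholder: expected span not found",
}

// isKnownFlakyMessage reports whether msg contains any known spurious failure.
func isKnownFlakyMessage(msg string) bool {
	for _, s := range knownFlakyFailures {
		if strings.Contains(msg, s) {
			return true
		}
	}
	return false
}

// failOrSkip is a hypothetical helper: it skips the test when err matches a
// known spurious failure and fails the test for any other non-nil error.
func failOrSkip(t testing.TB, err error) {
	t.Helper()
	if err == nil {
		return
	}
	if isKnownFlakyMessage(err.Error()) {
		// Skipf stops the test here, so Fatalf below is not reached.
		t.Skipf("skipping after known flaky failure: %v", err)
	}
	t.Fatalf("unexpected failure: %v", err)
}
```

Matching on an explicit list keeps unexpected failures visible: only errors that have already been identified as spurious turn into skips.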
The tracing test frequently fails with spurious errors like:
That is, after running other tests, we wait 3 minutes for the test to fail, invalidating the whole CI run.
I'm going to disable this test until we can get at the root of this flakiness.
See also #7403