
Gardening: mark flaky tests flaky. #82754

Merged: 5 commits, May 18, 2021

Conversation

gspencergoog
Contributor

@gspencergoog gspencergoog commented May 17, 2021

This marks the following tests as flaky, per the gardener rotation:

@flutter-dashboard flutter-dashboard bot added the team Infra upgrades, team productivity, code health, technical debt. See also team: labels. label May 17, 2021
@google-cla google-cla bot added the cla: yes label May 17, 2021
@zanderso
Member

Please cross-reference the individual issues for the flaky tests in the PR description so that GitHub will add links, and link to the individual issues for the flaky tests in prod_builders.json.

@keyonghan
Contributor

For the tests exceeding 2%, the proposal is to remove them from the dashboard. They need to go through the staging environment for validation before being re-enabled: https://github.com/flutter/flutter/wiki/Reducing-Test-Flakiness#fixing-flaky-tests
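The 2% threshold mentioned above can be sketched as a simple rate check. This is purely illustrative; the status strings and data shape are assumptions, not the dashboard's actual schema:

```python
# Hypothetical sketch of the 2% flakiness threshold; statuses and data
# shape are illustrative, not the real dashboard schema.
THRESHOLD = 0.02  # 2%, per the Reducing-Test-Flakiness wiki page

def flake_rate(statuses):
    """statuses: list of strings like "succeeded" or "flaked"."""
    if not statuses:
        return 0.0
    return statuses.count("flaked") / len(statuses)

recent = ["succeeded"] * 97 + ["flaked"] * 3
rate = flake_rate(recent)
print(f"{rate:.2%}", "exceeds threshold" if rate > THRESHOLD else "ok")
```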

@gaaclarke
Member

Can I get a link to a failure for linux_platform_channels_benchmarks? I want to verify that the issue was intrinsic to the actual test. FWIW, that test has only been running for a week, so you might not have an accurate flake percentage for it. A single failure from before the test was marked not flaky might be skewing the reading past that threshold.

@zanderso
Member

@keyonghan If "remove them out of the dashboard" means that they won't run at all, then I don't think that's the right thing to do. Especially since these are benchmarks, removing them entirely will mean that we will miss regressions.

@keyonghan
Contributor

> @keyonghan If "remove them out of the dashboard" means that they won't run at all, then I don't think that's the right thing to do. Especially since these are benchmarks, removing them entirely will mean that we will miss regressions.

"Remove them out of the dashboard" means they will not show up in the Flutter build dashboard, but they will still be triggered and run. The Milo dashboard will still show the build info.

@gspencergoog
Contributor Author

I've removed the entries (and so didn't add the bug cross-references, since there's nothing left to connect them to), but I'll wait to commit this until you LGTM this updated PR again, @zanderso.

@gspencergoog gspencergoog changed the title Mark flaky tests flaky Gardening: remove flaky tests. May 17, 2021
@zanderso
Member

I think we need to leave these on the dashboard so that we have a visual reminder of what is marked flaky.

@keyonghan
Contributor

> I think we need to leave these on the dashboard so that we have a visual reminder of what is marked flaky.

Our intention was to unblock the tree and make sure a test passes 50 successful runs in the staging pool before being re-enabled.
Your suggestion here works as well; we would just enforce the validation in the prod pool instead of the staging pool.

We can start this way, and adjust the process if needed. I will update the doc accordingly.

@gspencergoog
Contributor Author

OK, done. Since this is pure JSON (no comments), I added an extra field called "issue_url" to the flaky tests that will hopefully be ignored. If not, well, I'll revert it.
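A typical JSON consumer only reads the keys it knows about, so the extra field should indeed be ignored. A minimal sketch (the entry mirrors this PR's diff; the consumer logic is illustrative, not Cocoon's actual parsing code):

```python
import json

# Entry mirrors the change in this PR; the consumer below is illustrative,
# not Cocoon's actual code.
entry = json.loads("""
{
  "name": "Linux complex_layout_scroll_perf__devtools_memory",
  "repo": "flutter",
  "task_name": "linux_complex_layout_scroll_perf__devtools_memory",
  "flaky": true,
  "issue_url": "https://github.com/flutter/flutter/issues/82741"
}
""")

# A consumer that asks only for the known keys never notices "issue_url".
print(entry["task_name"], entry.get("flaky", False))
```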

@gspencergoog gspencergoog changed the title Gardening: remove flaky tests. Gardening: mark flaky tests flaky. May 17, 2021
@@ -82,7 +82,8 @@
     "name": "Linux complex_layout_scroll_perf__devtools_memory",
     "repo": "flutter",
     "task_name": "linux_complex_layout_scroll_perf__devtools_memory",
-    "flaky": false
+    "flaky": true,
+    "issue_url": "https://github.com/flutter/flutter/issues/82741"
Contributor


I haven't announced it yet, but I'm aiming to deprecate this JSON config Wednesday night in favor of the .ci.yaml. Just commenting here as I'll need to move these to comments in the migration PR (@CaseyHillers @christopherfujino )

Contributor Author


I think you just announced it. :-)

Contributor


I suppose new properties could be naturally supported in .ci.yaml?

Contributor


Yes. PRs welcome if you want to add automation around this.

Member


> I suppose new properties could be naturally supported in .ci.yaml?

You would have to update the protobuf in flutter/cocoon first. I actually see this as a feature: if you add a new target and misspell a field, CI will fail on the unknown field rather than silently ignoring it.
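The strict behavior described above can be sketched as follows. The field set and validator are hypothetical stand-ins, not the actual cocoon protobuf schema:

```python
# Hypothetical sketch of strict, protobuf-style validation, as opposed to
# plain JSON's silent acceptance of unknown keys. The field set is
# illustrative, not the real cocoon schema.
KNOWN_FIELDS = {"name", "repo", "task_name", "flaky", "issue_url"}

def validate_target(target):
    unknown = sorted(set(target) - KNOWN_FIELDS)
    if unknown:
        # Strict: fail loudly on a typo such as "flakey".
        raise ValueError(f"unknown field(s): {unknown}")

validate_target({"name": "t", "flaky": True})  # passes silently
try:
    validate_target({"name": "t", "flakey": True})
except ValueError as err:
    print(err)  # unknown field(s): ['flakey']
```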

@gspencergoog
Contributor Author

Sorry, @zanderso, I can't add extra fields or comments to the JSON, so I had to remove the cross-linking for the issues.

@CaseyHillers might be able to add a field for that into .ci.yaml for this (seems useful in any case).

@gspencergoog
Contributor Author

When re-enabling these tests, they should still probably go through the "ran 50 times and didn't flake" soak time, right?

@keyonghan
Contributor

> When re-enabling these tests, they should still probably go through the "ran 50 times and didn't flake" soak time, right?

Yes (https://github.com/flutter/flutter/wiki/Reducing-Test-Flakiness#fixing-flaky-tests).
Whenever a test is marked as flaky, it has to be validated before being re-enabled.

@gaaclarke
Member

Will it show up in go/flutter_15day_flakiness_dashboard to verify that it isn't flaky if it has been marked flaky, though? We might want to add instructions on how to validate 50 successful runs.

@keyonghan
Contributor

> Will it show up in go/flutter_15day_flakiness_dashboard to verify that it isn't flaky if it has been marked flaky, though? We might want to add instructions on how to validate 50 successful runs.

It will still show up in the 15-day dashboard. Created #82767 to enable easier validation. Until then, the Flutter build dashboard will be the source of truth (no task box with an exclamation point for the top 50 consecutive commits).
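The manual check described here can be sketched as a scan over the newest commits. The statuses and window size are illustrative assumptions, not real dashboard data:

```python
# Illustrative check for the "50 consecutive green commits" validation;
# statuses are hypothetical results for one task, newest commit first.
def passes_soak(statuses, window=50):
    recent = statuses[:window]
    return len(recent) == window and all(s == "succeeded" for s in recent)

history = ["succeeded"] * 50 + ["flaked"] + ["succeeded"] * 10
print(passes_soak(history))  # True: the old flake falls outside the window
```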
