Flaky test turned CI bot red #44352

Closed
alexmarkov opened this issue Nov 30, 2020 · 3 comments
Labels: area-infrastructure, gardening

Comments

@alexmarkov
Contributor

The vm-kernel-nnbd-linux-release-x64 bot turned red at https://ci.chromium.org/p/dart/builders/ci.sandbox/vm-kernel-nnbd-linux-release-x64/3109 due to:

There are new unapproved failures on this build
There are unapproved failures
(dartk-weak-asserts-linux-release-x64):
    service/async_star_step_out_test/service_1   (Pass -> Timeout, expected Pass) at a0ed69

Deflaking log says the test passed 5 times:

Test configuration:
    dartk-weak-asserts-linux-release-x64(architecture: x64, compiler: dartk, mode: release, runtime: vm, system: linux, nnbd: weak, builder-tag: vm_nnbd, enable-asserts)
Suites tested: benchmark_smoke, corelib, ffi, language, lib, samples, service, standalone, utils, vm
Total: 1 tests
 * 0 tests will be skipped (0 skipped by design)
 * 0 tests are expected to be flaky but not crash
 * 0 tests are expected to flaky crash
 * 1 tests are expected to pass
 * 0 tests are expected to fail that we won't fix
 * 0 tests are expected to fail that we should fix
 * 0 tests are expected to crash that we should fix
 * 0 tests are allowed to timeout
 * 0 could not be categorized or are in multiple categories


--- Total time: 01:10 ---
0:00:16.345720 - vm - dartk-vm release_x64/service/async_star_step_out_test/service_1
0:00:13.630006 - vm - dartk-vm release_x64/service/async_star_step_out_test/service_1
0:00:13.583862 - vm - dartk-vm release_x64/service/async_star_step_out_test/service_1
0:00:13.578357 - vm - dartk-vm release_x64/service/async_star_step_out_test/service_1
0:00:13.112040 - vm - dartk-vm release_x64/service/async_star_step_out_test/service_1

=== All 5 tests passed ===
INFO: Core dump archiving is activated
INFO: No unexpected crashes recorded

@whesse @athomas Why did that particular one-time failure turn the bot red? Are we no longer filtering out flaky test failures?

alexmarkov added the area-infrastructure and gardening labels Nov 30, 2020
@alexmarkov
Contributor Author

Another case happened on the https://ci.chromium.org/p/dart/builders/ci.sandbox/app-kernel-linux-release-x64/10130 build, where the bot turned red due to the failure of standalone_2/io/regress_flutter_57125_test (Pass -> Timeout, expected Pass), even though the test passed 5 times during deflaking.

@alexmarkov
Contributor Author

It happened once again: app-kernel-linux-release-x64 turned red at https://ci.chromium.org/p/dart/builders/ci.sandbox/app-kernel-linux-release-x64/10133

Failures:

There are new unapproved failures on this build
There are unapproved failures
(app_jitk-linux-release-x64):
    service_2/debugger_location_test/service   (Pass -> Timeout, expected Pass) at b200c9..295a9a
    service_2/debugger_location_test/dds   (Pass -> Timeout, expected Pass) at b200c9..295a9a

They passed during deflaking:

{"name":"service_2/get_client_name_rpc_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"get_client_name_rpc_test/dds","time_ms":20083,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/mirror_references_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"mirror_references_test/dds","time_ms":20413,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/service","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/service","time_ms":27340,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/dds","time_ms":32674,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/get_client_name_rpc_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"get_client_name_rpc_test/dds","time_ms":19552,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/mirror_references_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"mirror_references_test/dds","time_ms":19751,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/service","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/service","time_ms":28327,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/get_client_name_rpc_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"get_client_name_rpc_test/dds","time_ms":19672,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/mirror_references_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"mirror_references_test/dds","time_ms":19472,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/dds","time_ms":32393,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/get_client_name_rpc_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"get_client_name_rpc_test/dds","time_ms":19952,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/mirror_references_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"mirror_references_test/dds","time_ms":19953,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/service","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/service","time_ms":27054,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/dds","time_ms":31902,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/get_client_name_rpc_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"get_client_name_rpc_test/dds","time_ms":19474,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/mirror_references_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"mirror_references_test/dds","time_ms":19424,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/service","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/service","time_ms":23579,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/dds","time_ms":22981,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/service","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/service","time_ms":19773,"result":"Pass","expected":"Pass","matches":true}
{"name":"service_2/debugger_location_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/dds","time_ms":21927,"result":"Pass","expected":"Pass","matches":true}
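
For anyone who wants to double-check a deflaking log like the one above, here is a minimal sketch that groups the newline-delimited JSON records by test name and counts how many runs matched the expectation. It relies only on the `name`, `matches`, and `time_ms` fields visible in the log; the function name `summarize_deflaking` and the shape of the summary are my own assumptions, not part of the CI tooling.

```python
import json
from collections import defaultdict


def summarize_deflaking(lines):
    """Group newline-delimited JSON results by test name.

    Hypothetical helper: counts runs, matching runs, and the slowest run
    for each test, using only fields that appear in the log above.
    """
    summary = defaultdict(lambda: {"runs": 0, "matching": 0, "max_time_ms": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        entry = summary[record["name"]]
        entry["runs"] += 1
        if record["matches"]:
            entry["matching"] += 1
        entry["max_time_ms"] = max(entry["max_time_ms"], record["time_ms"])
    return dict(summary)


# Three records copied from the deflaking log above.
sample = [
    '{"name":"service_2/debugger_location_test/service","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/service","time_ms":27340,"result":"Pass","expected":"Pass","matches":true}',
    '{"name":"service_2/debugger_location_test/dds","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/dds","time_ms":32674,"result":"Pass","expected":"Pass","matches":true}',
    '{"name":"service_2/debugger_location_test/service","configuration":"app_jitk-linux-release-x64","suite":"service_2","test_name":"debugger_location_test/service","time_ms":28327,"result":"Pass","expected":"Pass","matches":true}',
]

summary = summarize_deflaking(sample)
for name, stats in sorted(summary.items()):
    print(name, stats)
```

A test whose `matching` count equals its `runs` count passed every deflaking run, which is the situation reported here for both failing tests.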

whesse self-assigned this Dec 1, 2020
@whesse
Contributor

whesse commented Dec 1, 2020

I suspect this is related to my rewrite of the logic that marks a flaky record as active: false when it is forgiven and sets it back to active: true when flakiness is seen again. The test with the problem does have an inactive flaky record, but the record is not being set to active again, and the timeout seen in the original run (not the deflaking runs) is not being stored in the flaky record.
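
To make the intended behaviour concrete, here is a rough sketch of the forgive/reactivate cycle described above. It is only an illustration under stated assumptions: `FlakyRecord`, `forgive`, and `record_run` are hypothetical names, the record layout is invented, and none of this is the actual Dart CI implementation. The key point is that the result of the original run is combined with the deflaking results before deciding whether the test is flaky.

```python
from dataclasses import dataclass, field


@dataclass
class FlakyRecord:
    """Hypothetical flaky record; the field names are assumptions."""
    name: str
    active: bool = True
    seen_results: set = field(default_factory=set)


def forgive(record: FlakyRecord) -> None:
    """Mark the record inactive once the flakiness has been forgiven."""
    record.active = False


def record_run(record: FlakyRecord, original_result: str,
               deflaking_results: list) -> bool:
    """Update the record from one build and return whether it is active.

    The original run's result is stored alongside the deflaking results,
    so a single Timeout followed by repeated Passes still counts as flaky.
    """
    all_results = [original_result, *deflaking_results]
    record.seen_results.update(all_results)
    if len(set(all_results)) > 1:
        # Differing outcomes for the same test: flakiness was seen again.
        record.active = True
    return record.active


rec = FlakyRecord("service_2/debugger_location_test/dds")
forgive(rec)
is_active = record_run(rec, "Timeout", ["Pass"] * 5)
print(rec.name, "active:", is_active, "seen:", sorted(rec.seen_results))
```

In this sketch, a forgiven record becomes active again as soon as the original run and the deflaking runs disagree, which is the step that appears to be missing in the builds reported above.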
