feat(ci): re-introduce dynamic test scheduler / balancer with addressed false green issue #12286

samugi · 2024-01-03T13:06:15Z

Note for reviewers:

This PR is easier to review commit by commit.

Summary

The test scheduler was reverted because of a "false-green" issue: CI jobs reported green results even though some Busted tests were failing. This PR:

Reintroduces the test scheduler, upgraded to V2
Configures the test scheduler to use the new static-scheduling mode (introduced in V2) when the schedule job is rerun
Applies a fix to the "false-green" bug (more on this below)
Fixes some test failures that emerged after fixing the false-green bug. These tests fail when the scheduler is involved because it runs them in a different order than a regular run does

False-green bug

This was caused by two issues; fixing either one would resolve the problem. The first affected the test scheduler itself and is fixed in V2. The second is described below:

We use busted.subscribe to override the Busted output handlers with a callback. To implement the mediator pattern, Busted uses mediator_lua. The second value returned by a subscription callback decides whether the remaining subscribers are executed. Because our callback returned only nil, Busted's original handler never ran, so test failures did not set the exit status and failing tests exited with 0. The scheduler therefore interpreted failing results as passing. The fix is a one-line change: the callback now returns nil, true.
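The effect of the second return value can be illustrated with a minimal, self-contained sketch. It is a simplified model of a mediator channel, not mediator_lua's or Busted's actual code. The names `publish`, `run_failing_test`, `buggy_handler` and `fixed_handler` are hypothetical, and the assumption that the CI callback runs before the original handler is made only for illustration.

```lua
-- Simplified model of a mediator channel (NOT mediator_lua's actual code).
-- Each subscriber returns (value, continue); publishing stops at the first
-- subscriber whose second return value is falsy.
local function publish(subscribers, ...)
  for _, fn in ipairs(subscribers) do
    local _, continue = fn(...)
    if not continue then
      return
    end
  end
end

-- Simulates one failing test and returns the resulting exit status.
local function run_failing_test(ci_handler)
  local failures = 0

  -- Hypothetical stand-in for Busted's own failure handler, which records
  -- failures so that the process can exit with a non-zero status.
  local function original_handler()
    failures = failures + 1
    return nil, true
  end

  -- Assumption: the CI helper's callback is called before the original one.
  publish({ ci_handler, original_handler }, "failure")

  return failures > 0 and 1 or 0
end

local function buggy_handler() return nil end        -- stops the chain
local function fixed_handler() return nil, true end  -- lets the chain continue

local buggy_status = run_failing_test(buggy_handler)
local fixed_status = run_failing_test(fixed_handler)
print(buggy_status, fixed_status)
```

In this model `buggy_status` is 0 (the false green) and `fixed_status` is 1, because only the fixed callback lets the original handler see the failure.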

Reproduced failure: https://github.com/Kong/kong/actions/runs/7398042284/job/20126353464?pr=12286#step:17:3090
Fixed: https://github.com/Kong/kong/actions/runs/7398209062/job/20127873115?pr=12286#step:17:5300

Checklist

The Pull Request has tests
(no) A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
(no) There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

KAG-3465
KAG-3533

spec/busted-ci-helper.lua

spec/02-integration/02-cmd/03-reload_spec.lua

kikito · 2024-01-18T03:23:10Z

@samugi let's not merge this until after 3.6 Feature Freeze, please.

This reverts commit e804fd4, effectively reapplying 543004c. Original commit message:

This commit adds an automatic scheduler for running busted tests. It replaces the static, shell script based scheduler by a mechanism that distributes the load onto a number of runners. Each runner gets to work on a portion of the tests that need to be run. The scheduler uses historic run time information to distribute the work evenly across runners, with the goal of making them all run for the same amount of time. With the 7 runners configured in the PR, the overall time it takes to run tests is reduced from around 30 minutes to around 11 minutes.

Previously, the scheduling for tests was defined by what the run_tests.sh shell script did. This has now changed so that the new JSON file `test_suites.json` is instead used to define the tests that need to run. Like before, each of the test suites can have its own set of environment variables and test exclusions.

The test runner has been rewritten in Javascript in order to make it easier to interface with the declarative configuration file and to facilitate reporting and interfacing with busted. It resides in the https://github.com/Kong/gateway-test-scheduler repository and provides its functionality through custom GitHub Actions.

A couple of tests had to be changed to isolate them from other tests better. As the tests are no longer run in identical order every time, it has become more important that each test performs any required cleanup before it runs.
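The runtime-based balancing described in the commit message can be sketched as follows. The commit message does not say which algorithm gateway-test-scheduler uses, so this is only one plausible approach (greedy "longest processing time first"), written in Lua for illustration. The function `balance`, the file names and the durations are all hypothetical.

```lua
-- Greedy "longest processing time first" balancing: an assumed strategy,
-- not necessarily what gateway-test-scheduler implements.
local function balance(durations, runner_count)
  -- Collect file names and sort them by historic duration, longest first.
  local files = {}
  for file in pairs(durations) do
    files[#files + 1] = file
  end
  table.sort(files, function(a, b)
    if durations[a] ~= durations[b] then
      return durations[a] > durations[b]
    end
    return a < b  -- tie-break on name for deterministic output
  end)

  local runners = {}
  for i = 1, runner_count do
    runners[i] = { total = 0, files = {} }
  end

  -- Assign each file to the runner with the smallest accumulated time.
  for _, file in ipairs(files) do
    local target = runners[1]
    for i = 2, runner_count do
      if runners[i].total < target.total then
        target = runners[i]
      end
    end
    target.files[#target.files + 1] = file
    target.total = target.total + durations[file]
  end

  return runners
end

-- Hypothetical historic run times in seconds (file names are made up).
local history = {
  ["a_spec.lua"] = 300,
  ["b_spec.lua"] = 200,
  ["c_spec.lua"] = 200,
  ["d_spec.lua"] = 100,
  ["e_spec.lua"] = 100,
}

local plan = balance(history, 2)
for i, runner in ipairs(plan) do
  print(i, runner.total, table.concat(runner.files, " "))
end
```

Because a file's position in such a plan depends on recorded durations, the execution order can differ between runs, which is why the isolation and cleanup changes mentioned above matter.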

* bump test scheduler to v3
* apply changes required by v3: pass `xml-output-file` and `setup-venv-path` params to runner
* update busted ci helper to be consistent with EE
* reintroduce debug steps in build and test workflow

After fixing the test scheduler helper, new failures emerged. This commit fixes them.

fix(test-scheduler): pass github token to gh-storage

We use `busted.subscribe` to override the output handlers with a callback. To implement the mediator pattern, Busted uses [mediator_lua](https://github.com/Olivine-Labs/mediator_lua). The second value returned by the subscription callback is used to decide whether to continue execution of other subscribers. Since we only return `nil`, the test failure was not handled to exit with the right status and failing tests were exiting with `0`. This commit changes the return value of the callback to: `nil, true` so that the original callback is executed to handle the test result and return the correct exit status.

spec/02-integration/03-db/01-db_spec.lua

pull-request-size bot added the size/XL label Jan 3, 2024

samugi marked this pull request as draft January 3, 2024 13:06

github-actions bot assigned samugi Jan 3, 2024

github-actions bot added the chore label ("Not part of the core functionality of kong, but still needed") Jan 3, 2024

samugi force-pushed the fix/dynamic-test-scheduler-callback branch 4 times, most recently from b7274f0 to 84ba934 Compare January 3, 2024 14:46

samugi commented Jan 4, 2024

spec/busted-ci-helper.lua (outdated, resolved)

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from c939312 to b1cfe7c Compare January 4, 2024 10:25

samugi mentioned this pull request Jan 4, 2024

tests(workflows): continue execution of subscribers #12291

Closed

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from b1cfe7c to 5e33052 Compare January 4, 2024 16:14

samugi changed the title ~~fix(test-scheduler): return correct exit status from busted~~ fix(test-scheduler): re-introduce and address false green issues Jan 4, 2024

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from 0eb3071 to b1d31dc Compare January 4, 2024 16:45

samugi changed the title ~~fix(test-scheduler): re-introduce and address false green issues~~ fix(test-scheduler): re-introduce scheduler and address false green issues Jan 4, 2024

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from c7e484c to a720be1 Compare January 5, 2024 10:36

github-actions bot added the plugins/request-transformer label Jan 5, 2024

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from a720be1 to 1efb245 Compare January 5, 2024 10:38

samugi commented Jan 9, 2024

spec/busted-ci-helper.lua (outdated, resolved)

samugi force-pushed the fix/dynamic-test-scheduler-callback branch 11 times, most recently from 84a758c to 183483e Compare January 15, 2024 12:20

StarlightIbuki reviewed Jan 17, 2024

spec/02-integration/02-cmd/03-reload_spec.lua (outdated, resolved)

samugi force-pushed the fix/dynamic-test-scheduler-callback branch 2 times, most recently from 3c62cbc to 13be664 Compare January 17, 2024 11:05

samugi requested review from StarlightIbuki and hanshuebner January 17, 2024 11:20

StarlightIbuki approved these changes Jan 18, 2024

samugi marked this pull request as draft January 18, 2024 07:50

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from 3bfc648 to b57e096 Compare January 19, 2024 14:38

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from b57e096 to e59980c Compare January 30, 2024 13:33

samugi changed the title ~~fix(ci): re-introduce scheduler and address false green issue~~ feat(ci): re-introduce scheduler and address false green issue Jan 30, 2024

samugi force-pushed the fix/dynamic-test-scheduler-callback branch 2 times, most recently from 0b27d90 to 33b4419 Compare January 30, 2024 15:46

samugi marked this pull request as ready for review January 30, 2024 16:00

samugi changed the title ~~feat(ci): re-introduce scheduler and address false green issue~~ feat(ci): re-introduce dynamic test scheduler / balancer with addressed false green issue Jan 31, 2024

samugi force-pushed the fix/dynamic-test-scheduler-callback branch 3 times, most recently from f9be10c to 5acba23 Compare February 8, 2024 08:50

samugi added 4 commits February 12, 2024 13:05

chore(ci): bump scheduler + consistency with EE

5a852f7

* bump test scheduler to v3
* apply changes required by v3: pass `xml-output-file` and `setup-venv-path` params to runner
* update busted ci helper to be consistent with EE
* reintroduce debug steps in build and test workflow

fix(tests): failures emerged running the scheduler

d5f356f

After fixing the test scheduler helper, new failures emerged. This commit fixes them.

fix(test-scheduler): pass github token to gh-storage

samugi force-pushed the fix/dynamic-test-scheduler-callback branch from 5acba23 to 55b9162 Compare February 12, 2024 12:05

kikito reviewed Feb 14, 2024

spec/02-integration/03-db/01-db_spec.lua (resolved)

kikito approved these changes Feb 14, 2024

samugi merged commit 246fd30 into master Feb 14, 2024
25 checks passed

samugi deleted the fix/dynamic-test-scheduler-callback branch February 14, 2024 16:08

ADD-SP mentioned this pull request Feb 19, 2024

tests(integration): fix flakiness #12580

Merged

team-gateway-bot mentioned this pull request Feb 19, 2024

[backport -> release/3.6.x] tests: fix flakiness #12581

Merged

locao added the skip-changelog label Mar 18, 2024
