Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ci): re-introduce dynamic test scheduler / balancer with addressed false green issue #12286

Merged
merged 4 commits into from
Feb 14, 2024

Conversation

samugi
Copy link
Member

@samugi samugi commented Jan 3, 2024

Note for reviewers:

Easier if reviewed commit by commit

Summary

The test scheduler was reverted because of an issue of "false-green" where the CI jobs were returning green results despite some failing Busted tests. This PR:

  • Reintroduces the test scheduler, upgraded to V2
  • Configures the test scheduler to enable it to use the new static-scheduling mode (introduced in V2), when the schedule job is rerun - happening here
  • Applies a fix to the "false-green" bug (more on this below)
  • Fixes some test failures that emerged after fixing the false-green bug above, these are failing when the scheduler is involved due to the different order of execution that the scheduler applies, compared to how they are regularly run

False-green bug

This was caused by two issues. Fixing either would resolve the problem. One of them affected the test scheduler, fixed in V2, the second one is described below:

We use busted.subscribe to override the Busted output handlers with a callback. To implement the mediator pattern, Busted uses mediator_lua. The second value returned by the subscription callback is used to decide whether to continue execution of other subscribers. Since we only return nil, the test failure was not handled to exit with the right status and failing tests were exiting with 0, so the scheduler was interpreting failing results as passing tests. The fix is simply this line.

Reproduced failure: https://github.com/Kong/kong/actions/runs/7398042284/job/20126353464?pr=12286#step:17:3090
Fixed: https://github.com/Kong/kong/actions/runs/7398209062/job/20127873115?pr=12286#step:17:5300

Checklist

  • The Pull Request has tests
  • (no) A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • (no) There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

KAG-3465
KAG-3533

@samugi samugi marked this pull request as draft January 3, 2024 13:06
@github-actions github-actions bot added the chore Not part of the core functionality of kong, but still needed label Jan 3, 2024
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch 4 times, most recently from b7274f0 to 84ba934 Compare January 3, 2024 14:46
spec/busted-ci-helper.lua Outdated Show resolved Hide resolved
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from c939312 to b1cfe7c Compare January 4, 2024 10:25
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from b1cfe7c to 5e33052 Compare January 4, 2024 16:14
@samugi samugi changed the title fix(test-scheduler): return correct exit status from busted fix(test-scheduler): re-introduce and address false green issues Jan 4, 2024
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from 0eb3071 to b1d31dc Compare January 4, 2024 16:45
@samugi samugi changed the title fix(test-scheduler): re-introduce and address false green issues fix(test-scheduler): re-introduce scheduler and address false green issues Jan 4, 2024
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from c7e484c to a720be1 Compare January 5, 2024 10:36
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from a720be1 to 1efb245 Compare January 5, 2024 10:38
spec/busted-ci-helper.lua Outdated Show resolved Hide resolved
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch 11 times, most recently from 84a758c to 183483e Compare January 15, 2024 12:20
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch 2 times, most recently from 3c62cbc to 13be664 Compare January 17, 2024 11:05
@kikito
Copy link
Member

kikito commented Jan 18, 2024

@samugi let's not merge this until after 3.6 Feature Freeze, please.

@samugi samugi marked this pull request as draft January 18, 2024 07:50
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from 3bfc648 to b57e096 Compare January 19, 2024 14:38
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from b57e096 to e59980c Compare January 30, 2024 13:33
@samugi samugi changed the title fix(ci): re-introduce scheduler and address false green issue feat(ci): re-introduce scheduler and address false green issue Jan 30, 2024
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch 2 times, most recently from 0b27d90 to 33b4419 Compare January 30, 2024 15:46
@samugi samugi marked this pull request as ready for review January 30, 2024 16:00
@samugi samugi changed the title feat(ci): re-introduce scheduler and address false green issue feat(ci): re-introduce dynamic test scheduler / balancer with addressed false green issue Jan 31, 2024
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch 3 times, most recently from f9be10c to 5acba23 Compare February 8, 2024 08:50
This reverts commit e804fd4 effectively
reapplying 543004c.

Original commit message:

This commit adds an automatic scheduler for running busted tests.  It
replaces the static, shell script based scheduler by a mechanism that
distributes the load onto a number of runners.  Each runner gets to
work on a portion of the tests that need to be run.  The scheduler
uses historic run time information to distribute the work evenly
across runners, with the goal of making them all run for the same
amount of time.  With the 7 runners configured in the PR, the overall
time it takes to run tests is reduced from around 30 minutes to around
11 minutes.

Previously, the scheduling for tests was defined by what the
run_tests.sh shell script did.  This has now changed so that the new
JSON file `test_suites.json` is instead used to define the tests that
need to run.  Like before, each of the test suites can have its own
set of environment variables and test exclusions.

The test runner has been rewritten in Javascript in order to make it
easier to interface with the declarative configuration file and to
facilitate reporting and interfacing with busted.  It resides in the
https://github.com/Kong/gateway-test-scheduler repository and
provides its functionality through custom GitHub Actions.

A couple of tests had to be changed to isolate them from other tests
better.  As the tests are no longer run in identical order every time,
it has become more important that each test performs any required
cleanup before it runs.
* bump test scheduler to v3
* apply changes required by v3: pass `xml-output-file` and
  `setup-venv-path` params to runner
* update busted ci helper to be consistent with EE
* reintroduce debug steps in build and test workflow
after fixing the test scheduler helper, new failures emerged. This
commit fixes them.

fix(test-scheduler): pass github token to gh-storage
We use `busted.subscribe` to override the output handlers with a
callback. To implement the mediator pattern, Busted uses
[mediator_lua](https://github.com/Olivine-Labs/mediator_lua).

The second value returned by the subscription callback is used to decide
whether to continue execution of other subscribers. Since we only return
`nil`, the test failure was not handled to exit with the right status
and failing tests were exiting with `0`.

This commit changes the return value of the callback to: `nil, true` so
that the original callback is executed to handle the test result and
return the correct exit status.
@samugi samugi force-pushed the fix/dynamic-test-scheduler-callback branch from 5acba23 to 55b9162 Compare February 12, 2024 12:05
@samugi samugi merged commit 246fd30 into master Feb 14, 2024
25 checks passed
@samugi samugi deleted the fix/dynamic-test-scheduler-callback branch February 14, 2024 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chore Not part of the core functionality of kong, but still needed plugins/request-transformer size/XL skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants