Skip to content

test: fail proxysql-tester.py TAP runs when zero binaries are discovered (safety net)#5602

Merged
renecannao merged 1 commit intov3.0from
ci/proxysql-tester-zero-test-safety-net
Apr 11, 2026
Merged

test: fail proxysql-tester.py TAP runs when zero binaries are discovered (safety net)#5602
renecannao merged 1 commit intov3.0from
ci/proxysql-tester-zero-test-safety-net

Conversation

@renecannao
Copy link
Copy Markdown
Contributor

@renecannao renecannao commented Apr 11, 2026

Summary

Adds a defence-in-depth safety net to `test/scripts/bin/proxysql-tester.py` so that a TAP run with zero discovered test binaries fails loudly instead of silently reporting green.

Depends conceptually on #5601 (GH-Actions), which fixes the underlying build regression. This PR is the backstop that will catch any future regression of the same class.

Why

Up until now, if CI-builds produced no `test/tap/tests/**/-t` binaries on the host (for any reason -- wrong make target, broken docker-compose mount, typo in a rewrite, removal of a build target), `proxysql-tester.py` would walk the three TAP workdirs (`tests`, `tests_with_deps/deprecate_eof_support`, `tests/unit`), `glob(-t)` would return zero in each, and the tester would print:

```
SUMMARY: 'tests' PASS 0/0 : FAIL 0/0 : SKIP 0/0
SUMMARY: 'deprecate_eof_support' PASS 0/0 : FAIL 0/0 : SKIP 0/0
SUMMARY: 'unit' PASS 0/0 : FAIL 0/0 : SKIP 0/0
SUMMARY: ret_rc = [0]
```

Zero failures out of zero tests is "success" by the old logic, so the workflow turns green.

That is exactly what happened during the multi-week silent regression on the `GH-Actions` branch after commit `6e7d93229` landed the wrong `make` target in `ci-builds.yml`. The entire TAP test suite -- CI-legacy-g*, CI-mysql84-g*, CI-basictests, CI-taptests-pgsql-cluster, CI-legacy-clickhouse-g1, CI-legacy-g2-genai -- was reporting green every commit while running zero tests. The build-target fix is in #5601. This PR is the guard that means the next regression of this kind surfaces immediately instead of hiding.

What the safety net does

At the end of `Psqlt.run_tap_tests`, after the workdir loop completes, it checks `len(ret_summary)`.

The `summary` tuple `(cmd, rc)` is populated for every discovered binary, including ones that end up skipped by version filter, group membership, or `TEST_PY_TAP_INCL/EXCL` regexes -- those get `rc=None` rather than being absent from the summary. So `len(ret_summary) == 0` is a strict "`glob(*-t)` found nothing in any workdir" signal.

When that happens, the safety net looks up `TAP_GROUP` in `test/tap/groups/groups.json` and counts how many tests that group is expected to contain. If the expected count is > 0 (or `TAP_GROUP` is unset, meaning we're running the full TAP suite), the safety net logs a `SAFETY NET FAIL` message and bumps `ret_rc` to at least 1, making the workflow fail.

What does NOT trip the safety net:

  • Version-filter skips: version-filtered tests still appear in the summary (`rc=None`), so `len(ret_summary) > 0` and the guard is silent.
  • `INCL/EXCL` regex skips: same as above.
  • A group legitimately declared with zero tests in `groups.json`: `expected_tests == 0`, guard is silent.

Only the "nothing was even on disk" case trips it. That's the case that used to silently report green and must stop doing so.

Test plan

  • Merge into `v3.0`.
  • After ci: fix TAP build target so test binaries are actually produced (critical) #5601 merges into `GH-Actions`, push an empty commit to a PR branch and confirm the downstream TAP workflows actually run tests (expected to take 10+ minutes each, not 2 seconds).
  • Optional negative test: temporarily break the build target on a scratch branch, push, and confirm the safety net fails the run with `SAFETY NET FAIL`.

Trade-offs

  • False positives: a group that genuinely should run nothing because all its tests are gated on a newer ProxySQL version would still populate `ret_summary` via the version-filter path, so it's not affected. The only false positive would be a TAP_GROUP that's in `groups.json` pointing at tests with names that don't exist on disk yet -- but that's exactly the condition worth failing on.
  • Log noise: when legitimately running the full suite with no specific group, and someone's working on a dev branch that deleted all tap tests, this will fail. I consider that desired behavior.
  • Edge case: if `groups.json` is unreadable or malformed, the safety net logs a warning and treats `expected_tests` as 0. If `TAP_GROUP` is set in that case, the guard is silent. I can tighten this if you'd like it louder.

Summary by CodeRabbit

  • Tests
    • Enhanced test validation to prevent false positive outcomes when no tests are discovered, ensuring the pipeline properly reports failures in such cases.

Defence-in-depth against the silent false-green failure mode we just
tracked down. Previously, if CI-builds produced no TAP test binaries
on the host (for any reason -- wrong make target, broken volume mount,
typo in a docker-compose override, misconfigured groups.json), the
tester would walk three workdirs, find zero `*-t` binaries in each,
report:

    SUMMARY: 'tests' PASS 0/0 : FAIL 0/0 : SKIP 0/0
    SUMMARY: 'deprecate_eof_support' PASS 0/0 : FAIL 0/0 : SKIP 0/0
    SUMMARY: 'unit' PASS 0/0 : FAIL 0/0 : SKIP 0/0
    SUMMARY: ret_rc = [0]

...and exit 0. Zero failures out of zero tests is technically "success"
by the old logic, so the workflow would turn green. Combined with a
build-target regression on the GH-Actions branch, this masked the fact
that the entire TAP test suite (CI-legacy-g*, CI-mysql84-g*, CI-basictests,
CI-taptests-pgsql-cluster, CI-legacy-clickhouse-g1, CI-legacy-g2-genai)
had been running zero tests per commit for several weeks while still
reporting green.

Safety net
----------

At the end of `run_tap_tests`, check whether `glob(*-t)` found ANY
binaries across ALL workdirs. The summary tuple (cmd, rc) is populated
for every discovered binary, including ones that end up skipped by
version filter or INCL/EXCL regexes -- those get `rc=None` rather
than being absent. So `len(ret_summary) == 0` is a strict "nothing
was even on disk" signal.

When that happens, check groups.json for how many tests TAP_GROUP is
expected to run. If N > 0, log a SAFETY NET FAIL line and bump ret_rc
to 1. Version-filter skips and INCL/EXCL skips are not affected -- they
still produce summary entries, so ret_summary is non-empty in those
cases. Only the actual "test binaries were never built" case trips
the guard.

If TAP_GROUP is unset (running the full TAP suite), any zero-discovery
result is a failure regardless -- there's no legitimate configuration
where proxysql-tester should run zero tests.

This complements the ci-builds.yml fix on the GH-Actions branch that
restores the TAP build target. With both in place, a future regression
in either the build or the test discovery chain surfaces loudly
instead of silently.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 73bc4fc4-f9d6-4119-a85f-9218dff410fa

📥 Commits

Reviewing files that changed from the base of the PR and between e0fded8 and 8c22751.

📒 Files selected for processing (1)
  • test/scripts/bin/proxysql-tester.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: claude-review
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/tests/unit/**/*.cpp : Unit tests in `test/tap/tests/unit/` must link against `libproxysql.a` and use `test_globals.h` and `test_init.h`
Learnt from: wazir-ahmed
Repo: sysown/proxysql PR: 5557
File: test/tap/tests/unit/gtid_set_unit-t.cpp:14-17
Timestamp: 2026-04-01T21:27:03.216Z
Learning: In ProxySQL's unit test directory (test/tap/tests/unit/), test_globals.h and test_init.h are only required for tests that depend on the ProxySQL runtime globals/initialization (i.e., tests that exercise components linked against libproxysql.a). Pure data-structure or utility tests (e.g., ezoption_parser_unit-t.cpp, gtid_set_unit-t.cpp, gtid_trxid_interval_unit-t.cpp) only need tap.h and the relevant project header — omitting test_globals.h and test_init.h is correct and intentional in these cases.
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/**/*.cpp : Use TAP (Test Anything Protocol) for tests with Docker-based backend infrastructure
📚 Learning: 2026-04-01T21:27:03.216Z
Learnt from: wazir-ahmed
Repo: sysown/proxysql PR: 5557
File: test/tap/tests/unit/gtid_set_unit-t.cpp:14-17
Timestamp: 2026-04-01T21:27:03.216Z
Learning: In ProxySQL's unit test directory (test/tap/tests/unit/), test_globals.h and test_init.h are only required for tests that depend on the ProxySQL runtime globals/initialization (i.e., tests that exercise components linked against libproxysql.a). Pure data-structure or utility tests (e.g., ezoption_parser_unit-t.cpp, gtid_set_unit-t.cpp, gtid_trxid_interval_unit-t.cpp) only need tap.h and the relevant project header — omitting test_globals.h and test_init.h is correct and intentional in these cases.

Applied to files:

  • test/scripts/bin/proxysql-tester.py
📚 Learning: 2026-04-11T05:43:20.598Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/tests/unit/**/*.cpp : Unit tests in `test/tap/tests/unit/` must link against `libproxysql.a` and use `test_globals.h` and `test_init.h`

Applied to files:

  • test/scripts/bin/proxysql-tester.py
📚 Learning: 2026-04-11T05:43:20.598Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-11T05:43:20.598Z
Learning: Applies to test/tap/**/*.cpp : Use TAP (Test Anything Protocol) for tests with Docker-based backend infrastructure

Applied to files:

  • test/scripts/bin/proxysql-tester.py
📚 Learning: 2026-01-20T09:34:27.165Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.

Applied to files:

  • test/scripts/bin/proxysql-tester.py
📚 Learning: 2026-01-20T07:40:34.938Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:24-28
Timestamp: 2026-01-20T07:40:34.938Z
Learning: In ProxySQL test files, calling `mysql_error(NULL)` after `mysql_init()` failure is safe because the MariaDB client library implementation returns an empty string for NULL handles (not undefined behavior).

Applied to files:

  • test/scripts/bin/proxysql-tester.py
🪛 Ruff (0.15.9)
test/scripts/bin/proxysql-tester.py

[warning] 1121-1121: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (1)
test/scripts/bin/proxysql-tester.py (1)

1108-1135: Safety-net logic is solid and matches the failure mode this PR targets.

Good defensive check: it only fails when nothing was discovered, preserves valid skip-only runs, and forces non-zero exit via ret_rc = max(ret_rc, 1) without masking earlier failures.


📝 Walkthrough

Walkthrough

The pull request adds a safety net check to the run_tap_tests() method in proxysql-tester.py that detects when no TAP test binaries were discovered and forces pipeline failure rather than allowing a false-positive 0/0 "all good" outcome. The check reads test/tap/groups/groups.json when TAP_GROUP is set to validate expected test counts.

Changes

Cohort / File(s) Summary
Safety Net Validation
test/scripts/bin/proxysql-tester.py
Added fallback logic after TAP workdir processing to detect missing TAP binaries by reading groups.json, logging critical errors when no tests are found, and forcing non-zero return code to prevent false-positive test outcomes.

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly Related PRs

Poem

🐰 A safety net woven with care,
Catches the tests that vanish in air,
No more false cheers when nothing ran true,
Groups.json whispers what tests are due!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly summarizes the main change: adding a safety net to fail TAP runs when zero test binaries are discovered, which matches the core objective of the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ci/proxysql-tester-zero-test-safety-net

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a safety net to the TAP test runner to prevent silent CI successes when no test binaries are discovered. The review feedback highlights that the logic for counting expected tests should exclude metadata tags (keys starting with '@') and that the safety net should also trigger if a specified group is not found in the configuration file, as the system defaults to running all tests in that scenario.

Comment on lines +1117 to +1120
expected_tests = sum(
1 for _, test_groups in groups.items()
if tap_group in test_groups
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current logic counts all entries in groups.json where the group matches, including metadata entries (keys starting with @). According to the logic in test/tap/groups/check_groups.py (line 71), keys starting with @ are metadata tags and not actual test names. These should be excluded from the expected_tests count to ensure the safety net accurately reflects the number of runnable tests.

Suggested change
expected_tests = sum(
1 for _, test_groups in groups.items()
if tap_group in test_groups
)
expected_tests = sum(
1 for test_name, test_groups in groups.items()
if not test_name.startswith("@") and tap_group in test_groups
)

)
except Exception as e:
log.warning(f"Safety net: could not parse groups.json: {e}")
if not tap_group or expected_tests > 0:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If TAP_GROUP is set but not found in groups.json (or if groups.json is missing/unreadable), the tester defaults to running all discovered TAP tests (see lines 650 and 653). In this scenario, if zero binaries are discovered, the safety net will remain silent because expected_tests will be 0. To ensure the safety net catches these cases, it should also trigger if a group is specified but cannot be validated against the configuration file, as we still expect the full suite to be present.

@renecannao renecannao merged commit ecd8b72 into v3.0 Apr 11, 2026
3 checks passed
renecannao added a commit that referenced this pull request Apr 11, 2026
Validates that #5601 (GH-Actions ci-builds -tap target fix) and #5602
(v3.0 proxysql-tester.py zero-test safety net) combine to produce an
actually-running TAP test pipeline.

Expected signals in CI-builds log:
- make ubuntu22-tap (not ubuntu22-dbg)
- >>>tap-matrix.txt<<< section with 3+ workdirs and hundreds of -t entries
- _test cache size >> 2 MB

Expected signals in CI-legacy-g* / CI-mysql84-g* runs:
- Run <group> tests step takes minutes, not seconds
- proxysql-tester.py reports PASS N/N with N > 0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant