Skip to content

ci: run plugin Python test suites (catches the PR #125 regression)#147

Merged
dguido merged 3 commits intomainfrom
add-python-tests-ci
Apr 16, 2026
Merged

ci: run plugin Python test suites (catches the PR #125 regression)#147
dguido merged 3 commits intomainfrom
add-python-tests-ci

Conversation

@tob-joe
Copy link
Copy Markdown
Contributor

@tob-joe tob-joe commented Apr 14, 2026

Summary

Adds a python-tests job to lint.yml, parallel to the existing bats job. Discovers test_*.py / *_test.py under plugins/ and runs each as a script. Currently exercises:

  • plugins/constant-time-analysis/ct_analyzer/tests/test_analyzer.py (unittest, 60 tests)
  • plugins/let-fate-decide/skills/let-fate-decide/scripts/test_draw_cards.py (stdlib self-runner, 27 tests)

Both test files already existed; no CI job invoked them.

Motivation

PR #125 ("replace os.urandom with secrets.randbelow") removed import os but left os.path calls inside draw(). Ruff F821 catches this today, but the check went green at merge time (the PR's head commit was pushed 12 days before merge and wasn't re-run against drifted main). The existing test_draw_* / test_cli_* tests in test_draw_cards.py would also have caught it — draw() raises NameError on every call — but no CI job ran them. PR #144 fixes the one-line import.

A Python-tests CI job would have caught the regression independently of ruff.

Expected CI behavior on this PR

The python-tests job will FAIL on this PR until #144 merges. That failure is the point — it demonstrates the regression. Once #144 lands on main, rebase this PR and the job goes green. Suggested merge order:

  1. Merge let-fate-decide: add missing import #144 (one-line import os fix)
  2. Rebase & merge this PR

Or merge both together; either way the end state is green.

Design notes

  • Executes tests as scripts (python3 path/to/test.py) rather than invoking pytest. Matches how the existing tests are authored (each has an if __name__ == '__main__': main block or uses unittest's auto-discovery), avoids pytest rootdir/conftest surprises, and keeps the job dependency-free.
  • find discovery means new plugin test files are picked up automatically — no per-plugin config.
  • Job uses ::group:: folding so each test file's output is collapsed in the Actions UI.

Test plan

@tob-joe tob-joe requested a review from dguido as a code owner April 14, 2026 23:55
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

tob-joe and others added 2 commits April 14, 2026 20:30
Discovers test_*.py / *_test.py files under plugins/ and executes
each one as a script. Matches the style of the existing bats job.

Works today for:
- plugins/constant-time-analysis/ct_analyzer/tests/test_analyzer.py
- plugins/let-fate-decide/.../scripts/test_draw_cards.py

Both test files already exist in the repo but no CI job invoked them.
As a result, PR #125 (which broke let-fate-decide by removing
`import os` while leaving `os.path` calls in draw()) merged with
green CI even though the existing `test_draw_*` / `test_cli_*`
tests would have caught it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nalysis

TestCrossArchitecture.test_cross_compile_arm64 invokes clang with
--target=aarch64-unknown-linux-gnu, which needs the aarch64 libc
headers. Without them clang fails with:
  fatal error: 'bits/libc-header-start.h' file not found

Install gcc-aarch64-linux-gnu + libc6-dev-arm64-cross so clang can
find the cross headers. Also install clang explicitly since the
runner may not have it preinstalled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tob-joe tob-joe force-pushed the add-python-tests-ci branch from 59a4b5e to f37d54e Compare April 15, 2026 00:30
Resolves code review findings on PR #147 (comment accuracy only —
no behavior change):

- Rewrite the toolchain dependencies comment. The previous wording
  ("aarch64 cross gcc pulls in libc6-dev-arm64-cross") implied a
  transitive dependency, but `--no-install-recommends` suppresses
  Recommends, so libc6-dev-arm64-cross is installed only because it
  is listed explicitly. New comment names what each package supplies.
- Document why `set -uo pipefail` deliberately omits -e (the loop
  collects per-file failures and exits with a combined code).

Reviewers (codex + gemini + pr-review-toolkit agents) flagged 13
findings total; 11 were dismissed (false positives, design choices
matching the bats job, or speculative). Quality pipeline (actionlint,
zizmor, shellcheck, pre-commit, plugin validators, both Python test
suites) all pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dguido
Copy link
Copy Markdown
Member

dguido commented Apr 16, 2026

Review Summary

Three review passes ran in parallel against the head commit (f37d54e): the pr-review-toolkit skill (code-reviewer, silent-failure-hunter, comment-analyzer), Codex (gpt-5.3-codex at xhigh reasoning), and Gemini (default model after gemini-3-pro-preview quota was exhausted). 13 findings total — 2 fixed, 11 dismissed.

Findings

# Severity Source(s) Finding Resolution
1 P2 comment-analyzer Toolchain comment misstates package dep direction. With --no-install-recommends, gcc does NOT "pull in" libc6-dev-arm64-cross; libc is installed only because it is listed explicitly. Fixed — rewrote the comment to accurately describe what each package supplies (libc6-dev-arm64-cross → headers/crt; gcc-aarch64-linux-gnu → cross linker via Depends on binutils).
2 P2 silent-failure-hunter, comment-analyzer set -uo pipefail (no -e) is intentional but undocumented; future contributor may "fix" it to -euo pipefail and break the per-test failure aggregation. Fixed — added a one-line comment above set -uo pipefail explaining why -e is omitted.
3 P3 gemini, code-reviewer Use uv run instead of python3 so PEP 723 inline metadata is honored and tests with deps work seamlessly. Dismissed — author explicitly designed for stdlib-only ("keeps the job dependency-free", per PR description). Both current test files import only stdlib + local modules. Switching to uv run would override an explicit design choice for a forward-looking concern that doesn't apply today.
4 P3 silent-failure-hunter mapfile -d '' files < <(find ...) silently swallows find exit code; missing plugins/ dir would yield empty array. Dismissedplugins/ is guaranteed to exist after a clean actions/checkout. The defensive wrap (find ... || exit 1) is over-engineering for a near-zero risk.
5 P3 silent-failure-hunter, comment-analyzer "No test files found" treated as success could hide accidental test removal. Dismissed — the sibling bats job uses xargs --no-run-if-empty (same silent-success pattern). Changing only python-tests would create asymmetry. If we want fail-on-empty, both jobs should change together (out of scope).
6 P3 silent-failure-hunter python3 test_draw_cards.py allegedly runs zero assertions because the file lacks a unittest.main() block. Dismissed (false positive) — verified locally: test_draw_cards.py has a custom __main__ block (lines 226-241) that auto-discovers and runs all test_* functions. Local run reports 27 passed, 0 failed. CI run (25s) confirms tests execute.
7 P4 silent-failure-hunter Per-file failure detail is collapsed into a single exit bit. Dismissed::group:: folding in the GHA UI auto-expands the failing group, so the failing file is already obvious in the logs.
8 P4 silent-failure-hunter, gemini apt-get update partial-failure modes; apt-get install could install stale cached version silently. Dismissed — speculative; GHA runner package mirrors are stable and this risk is theoretical.
9 P4 silent-failure-hunter No timeout-minutes on the test step. Dismissed — the existing bats job has the same omission, so this is not a regression introduced by this PR. Worth fixing both together in a follow-up.
10 P4 code-reviewer find may descend into __pycache__ / .venv / node_modules if any contributor commits or generates them. Dismissed — these directories are gitignored and won't appear on a fresh CI checkout (the agent acknowledged this).
11 P4 code-reviewer aarch64 cross toolchain (~200MB) installed on every run; could be conditional via paths: filter or workflow split. Dismissed — adds workflow complexity for a marginal time saving. The 25s CI runtime is acceptable.
12 P4 code-reviewer exit "$failed" could be a literal exit 1 / exit 0. Dismissed — pure stylistic; current code is correct.
13 P4 gemini If test count grows, switch to pytest. Dismissed — premature for two test files; the PR description explicitly justifies the script-based approach.

Codex returned 0 findings.

Verification

  • actionlint: clean
  • zizmor: clean (no findings)
  • shellcheck (on the inline script): clean
  • pre-commit (on the modified file): all hooks pass (check-yaml, end-of-file-fixer, trim trailing whitespace)
  • Plugin validators (validate_codex_skills.py, validate_plugin_metadata.py): clean (72 plugin skills against 73 Codex entries; 38 plugins in sync)
  • Python tests locally:
    • plugins/let-fate-decide/skills/let-fate-decide/scripts/test_draw_cards.py: 27 passed, 0 failed
    • plugins/constant-time-analysis/ct_analyzer/tests/test_analyzer.py: 60 passed, 1 skipped
  • YAML syntax: valid

Commit

90e8b21 — ci: clarify python-tests workflow comments

@dguido dguido merged commit e8cc5ba into main Apr 16, 2026
7 checks passed
@dguido dguido deleted the add-python-tests-ci branch April 16, 2026 22:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants