mturac/pluginpool-flaky-detector
flaky-detector

Run your test command N times. Find out which tests don't always agree with themselves.

License: MIT · Python 3.8+ · Claude Code Plugin · Tests: 11 passing

TL;DR: /flaky-detector --cmd "pytest -v" --runs 10 → per-test flakiness %, sorted worst-first, ready to triage.

Why this exists

A test that fails 1-in-20 wastes more team time than a test that never fails. CI flakes erode trust; everyone learns to "just re-run it". This tool runs your suite N times, parses pass/fail per test, and tells you exactly which tests are flaky and at what rate — so you can decide: rerun, isolate, mark @pytest.mark.flaky, or fix the underlying race.

Install (Claude Code)

git clone https://github.com/mturac/pluginpool-flaky-detector ~/.claude/plugins/flaky-detector

Restart Claude Code; the slash command /flaky-detector appears.

Quick start

python3 scripts/flaky.py --cmd "pytest -v" --runs 10 --format md
python3 scripts/flaky.py --cmd "go test ./..." --runs 20 --parallel 4 --out report.json
python3 scripts/flaky.py --cmd "jest --ci" --parser jest --runs 5

Tip: prefer pytest -v over pytest -q so every result lands on its own line. If you use -q, flaky-detector still picks up the tail FAILED path::test summary lines and exits non-zero with a warning rather than reporting a false green.
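
As a rough illustration of what that parsing involves (a sketch with assumed pattern names and statuses, not the plugin's actual source), per-test verbose lines and the `-q` tail summary can both be matched with regexes:

```python
import re

# Hypothetical, simplified patterns for pytest output.
VERBOSE_LINE = re.compile(r"^(?P<test>\S+::\S+)\s+(?P<status>PASSED|FAILED|ERROR|SKIPPED)")
# Fallback for `pytest -q`, where only the tail summary names failures.
QUIET_SUMMARY = re.compile(r"^FAILED\s+(?P<test>\S+::\S+)")

def parse_line(line: str):
    """Return (test_id, status), or None for non-result lines."""
    m = VERBOSE_LINE.match(line)
    if m:
        return m.group("test"), m.group("status")
    m = QUIET_SUMMARY.match(line)
    if m:
        return m.group("test"), "FAILED"
    return None
```

Note the asymmetry: the quiet fallback can only ever see failures, which is why the tool warns instead of claiming a clean pass.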

Flags

| Flag | Default | Description |
|---|---|---|
| `--cmd` | required | The test command (single line, no shell wrapping) |
| `--runs` | `10` | How many times to invoke the command |
| `--parallel` | `1` | Concurrent runs (only safe for parallel-clean suites) |
| `--parser` | `auto` | `pytest`, `jest`, `gotest`, `tap`, or `auto` |
| `--out` | none | Write JSON report to this path |
| `--format` | `json` | `json` or `md` |

Supported parsers

| Parser | Matches |
|---|---|
| pytest | `tests/foo.py::test_bar PASSED` / `FAILED` (and other pytest statuses) |
| jest / vitest | `✓ name`, `✗ name`, `PASS file`, `FAIL file` |
| gotest | `--- PASS:`, `--- FAIL:`, `--- SKIP:` |
| tap | `ok N - name`, `not ok N - name` |
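
For the Go and TAP formats, the matching can be sketched like this (illustrative, simplified patterns; the shipped parsers may differ):

```python
import re

# Assumed patterns for two of the formats in the table above.
PATTERNS = {
    "gotest": re.compile(r"^--- (?P<status>PASS|FAIL|SKIP): (?P<test>\S+)"),
    "tap": re.compile(r"^(?P<status>not ok|ok) \d+ - (?P<test>.+)$"),
}

def classify(parser: str, line: str):
    """Return (test_name, 'PASS'|'FAIL'|'SKIP') or None."""
    m = PATTERNS[parser].match(line)
    if not m:
        return None
    raw = m.group("status")
    status = {"FAIL": "FAIL", "not ok": "FAIL", "SKIP": "SKIP"}.get(raw, "PASS")
    return m.group("test"), status
```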

Exit codes

| Code | Meaning |
|---|---|
| 0 | No flaky or always-failing tests |
| 1 | At least one test is flaky (0 < flakiness_pct < 100) |
| 2 | At least one test is always-failing |
| 3 | Zero tests parsed but the runner reported activity; re-run with `-v` |
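
The table above can be expressed as a small decision function (a hypothetical sketch of the rules, not the script's actual code; always-failing outranks flaky when both occur):

```python
def exit_code(results: dict, runner_ran: bool) -> int:
    """results maps test id -> (pass_count, fail_count)."""
    if not results:
        # Nothing parsed: code 3 if the runner produced activity anyway.
        return 3 if runner_ran else 0
    counts = results.values()
    if any(p == 0 and f > 0 for p, f in counts):
        return 2  # at least one always-failing test
    if any(p > 0 and f > 0 for p, f in counts):
        return 1  # at least one flaky test
    return 0
```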

Example output (markdown)

# Flaky-detector report (10 runs)

- flaky: **2**  |  always-failing: **0**  |  always-passing: 47

| test | pass | fail | flakiness % |
|---|---|---|---|
| tests/test_payment.py::test_idempotency | 6 | 4 | 40.0 |
| tests/test_search.py::test_index_warmup | 8 | 2 | 20.0 |
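
The flakiness % column is simply failures over total runs; a minimal sketch of the aggregation (assumed shape, mirroring the example numbers above):

```python
def flakiness_table(counts: dict) -> list:
    """counts: test id -> (passes, fails). Rows sorted worst-first."""
    rows = []
    for test, (p, f) in counts.items():
        total = p + f
        pct = 100.0 * f / total if total else 0.0
        if 0.0 < pct < 100.0:  # flaky: sometimes passes, sometimes fails
            rows.append((test, p, f, round(pct, 1)))
    return sorted(rows, key=lambda r: r[3], reverse=True)
```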

Limitations

  • --parallel > 1 only works for parallel-safe suites; otherwise concurrent runs share state and lie.
  • Streaming stdout from very long suites is buffered — be patient on the first run.
  • The parser is tuned for default reporters. Custom plugins (pytest-rich, etc.) may need a tweak.
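
To make the first bullet concrete, `--parallel` can be pictured as a thread pool fanning out independent subprocess runs (an assumed sketch, not the plugin's implementation); any state shared between invocations corrupts the pass/fail counts:

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_many(cmd: str, runs: int, parallel: int):
    """Run `cmd` `runs` times, at most `parallel` at once."""
    argv = shlex.split(cmd)  # single line, no shell wrapping

    def one(_):
        proc = subprocess.run(argv, capture_output=True, text=True)
        return proc.returncode, proc.stdout

    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(one, range(runs)))
```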

Examples

Step-by-step walkthroughs with real input fixtures and the helper's actual output live in examples/. Three or four scenarios per plugin, from the happy path to the edge cases the test suite guards against.

Part of the pluginpool family

Ten focused Claude Code plugins for everyday productivity: commit-narrator · pr-storyteller · test-gap · deps-doctor · env-lint · secret-guard · standup-gen · todo-harvest · flaky-detector · changelog-forge

License

MIT — see LICENSE. Contributions welcome.
