agentic-test-forge

Python quality enforcement for AI-generated and legacy codebases. Implements Uncle Bob Martin's workflow: CRAP analysis, mutation testing, and Gherkin scenario mutation, optimized for agentic development and CI gates.

Status

v1.1 complete — PyPI, dogfood CI, external pilot, pre-commit hook, and docs polish shipped (#58).

| Command | Status |
| --- | --- |
| `forge crap` | Available |
| `forge mutate` | Available (Linux/WSL; mutmut does not run natively on Windows) |
| `forge mutate-gherkin` | Available |
| `forge check` | Available (includes optional advisory DRY scan) |

Install

pip install agentic-test-forge

Pin a version:

pip install agentic-test-forge==1.0.0

For local development of this repo:

pip install -e ".[dev]"

Alternative (VCS install):

pip install "agentic-test-forge @ git+https://github.com/cheezd/agentic-test-forge.git"

Usage

forge --help
forge crap --path src/ --threshold 30
forge mutate --path src/ --base main --threshold 80
forge mutate-gherkin --path features/ --base main --threshold 80
forge check --path src/ --features-path features/

Run tests with coverage, then the full quality gate:

pytest --cov=src
forge check --path src/ --json report.json

Differential mutation uses git diff against --base (default main) and skips unchanged files tracked in .forge/mutation-manifest.json. Use --full to ignore the manifest.
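
The selection step described above can be sketched roughly as follows. This is only an illustration, not forge's implementation: the manifest is assumed to map file paths to SHA-256 content digests, and the names `changed_python_files`, `digest`, and `select_targets` are hypothetical. The real layout of `.forge/mutation-manifest.json` and the exact `git diff` invocation may differ.

```python
import hashlib
import subprocess


def changed_python_files(base: str = "main", path: str = "src/") -> list[str]:
    """List .py files under `path` that differ from `base` (plain `git diff`)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base, "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line.endswith(".py")]


def digest(content: bytes) -> str:
    """Content fingerprint used as the (assumed) manifest value."""
    return hashlib.sha256(content).hexdigest()


def select_targets(
    changed: dict[str, bytes],   # path -> current file content
    manifest: dict[str, str],    # path -> digest recorded on the last run (assumed format)
    full: bool = False,
) -> list[str]:
    """Return the files to mutate; `full=True` ignores the manifest."""
    if full:
        return sorted(changed)
    return sorted(
        path for path, content in changed.items()
        if manifest.get(path) != digest(content)
    )
```

With this sketch, a file that appears in the diff but whose content still matches its manifest entry is skipped, while passing `full=True` selects every changed file again.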

Gherkin mutation mutates Examples table cells in changed .feature scenarios, runs the configured acceptance test command, and tracks results in .forge/gherkin-manifest.json.
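
To make the idea of mutating Examples table cells concrete, here is a minimal sketch that produces one mutant row per cell. The mutation operator (incrementing integers, appending a marker to other values) and the function names are assumptions for illustration; forge's actual operators are not documented here, and escaped pipes inside cells are not handled.

```python
def split_row(line: str) -> list[str]:
    """Split a Gherkin table row like '| 2 | 3 |' into stripped cells."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def mutate_cell(value: str) -> str:
    """Hypothetical operator: bump integers, otherwise append a marker."""
    try:
        return str(int(value) + 1)
    except ValueError:
        return value + "_mutated"


def mutants_for_row(line: str) -> list[str]:
    """Return one mutated row per cell, changing exactly one cell each time."""
    cells = split_row(line)
    rows = []
    for i, cell in enumerate(cells):
        mutated = cells.copy()
        mutated[i] = mutate_cell(cell)
        rows.append("| " + " | ".join(mutated) + " |")
    return rows


for row in mutants_for_row("| 2 | 3 | 5 |"):
    print(row)
```

Each mutant row would replace the original row in a copy of the .feature file before the acceptance test command runs. A mutant counts as killed when that run fails.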

Thresholds are gate cutoffs, not comparable scales: crap_threshold is a maximum CRAP score per function; mutation_threshold and gherkin_threshold are minimum mutation kill rates (0–100%). See score interpretation for what the numbers mean.
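
As a rough illustration of why the two kinds of threshold point in opposite directions, the sketch below uses the commonly cited CRAP formula, comp² × (1 − cov)³ + comp, and a plain kill-rate percentage. The function names, the inclusive comparisons, and the handling of zero mutants are assumptions, not forge's API, and the `simplified` formula variant is not shown.

```python
def crap_score(complexity: int, coverage_pct: float) -> float:
    """Standard CRAP formula: comp^2 * (1 - cov)^3 + comp."""
    uncovered = 1.0 - coverage_pct / 100.0
    return complexity ** 2 * uncovered ** 3 + complexity


def kill_rate(killed: int, total: int) -> float:
    """Percentage of mutants killed (0-100). Assumption: no mutants counts as 100."""
    return 100.0 * killed / total if total else 100.0


def passes_crap_gate(score: float, crap_threshold: float = 30) -> bool:
    """CRAP is a maximum: lower scores are better."""
    return score <= crap_threshold


def passes_mutation_gate(rate: float, mutation_threshold: float = 80) -> bool:
    """Kill rate is a minimum: higher rates are better."""
    return rate >= mutation_threshold


# A function with complexity 10 and no coverage scores 110 and fails the gate;
# the same function at full coverage scores 10 and passes.
print(crap_score(10, 0), passes_crap_gate(crap_score(10, 0)))
print(crap_score(10, 100), passes_crap_gate(crap_score(10, 100)))
```

Under these assumptions, a function of complexity 5 with 50% coverage scores 8.125, well under a `crap_threshold` of 30, whereas 7 of 10 mutants killed (70%) would fall short of a `mutation_threshold` of 80.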

Configure per-project thresholds in pyproject.toml:

[tool.forge]
paths = ["src"]
crap_threshold = 30
crap_formula = "standard"  # standard | simplified
manifest_dir = ".forge"
mutation_threshold = 80
mutation_base_ref = "main"
mutation_test_cmd = "pytest"
gherkin_threshold = 80
gherkin_base_ref = "main"
gherkin_test_cmd = "behave"
gherkin_runner = "behave"  # behave | pytest
gherkin_paths = ["features"]

[tool.forge.gates]
crap = true
mutation = false
gherkin = false
dry = true         # advisory — does not fail forge check

Optional local override: forge.toml is not required for normal use; [tool.forge] in pyproject.toml is enough, including in consumer repos. If a forge.toml exists in the current working directory, it is merged on top of pyproject.toml, which is useful for uncommitted experiments such as stricter thresholds on your machine. Unlike pyproject.toml, forge.toml is not searched for in parent directories: run forge from the directory that contains it, or rely on pyproject.toml only.

Consumer CI integration: see docs/consumer-ci.md (GitHub Actions, version pinning, Django / monorepo, Windows console notes).

Pre-commit (optional)

Run forge check locally before commit (respects [tool.forge.gates]):

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/cheezd/agentic-test-forge
    rev: v1.0.0
    hooks:
      - id: forge-check

pip install pre-commit agentic-test-forge==1.0.0
pre-commit install
pytest --cov=src   # CRAP gate needs .coverage
pre-commit run forge-check --all-files

See the pre-commit section of docs/consumer-ci.md for coverage prerequisites, Windows/mutation notes, and troubleshooting.

Development

pip install -e ".[dev]"
pytest
ruff check .
mypy src

Domain language

See docs/domain/CONTEXT.md.

Architecture decisions

Index and when to write ADRs: docs/adr/README.md. Package layout, dependency direction, and refactor conventions: docs/adr/0001-package-boundaries-and-refactor-conventions.md.

License

Licensed under the GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later).

What this means in practice:

Modifications to forge must be shared under LGPL when you distribute them.
Using forge to check your code — via CLI in CI, locally, or on build servers — does not require your application or SaaS product to become open source.
Importing forge as a library in a proprietary product is generally permitted under LGPL (unlike GPL), subject to LGPL’s requirements (e.g. allowing replacement of the library).

We deliberately use LGPL, not AGPL, so network/SaaS deployment of your product does not trigger additional copyleft beyond the library itself. This is not legal advice; consult counsel for your specific deployment.

Roadmap

v1.0 (shipped)

Foundation & CLI shell — done
CRAP analyzer (radon + coverage.py) — done
Differential code mutation (mutmut) — done
Gherkin mutation — done
Quality gate orchestrator (forge check) — done
DRY flagging (advisory) — done

v1.1 (shipped — #58)

PyPI publish & GitHub Release — done (#64)
Dogfood CI (forge check + report artifact) — done (#70)
External consumer pilot (compliance-llm) — done (#71)
Pre-commit hook — done (#74)
Docs polish & ADR bootstrap — done (#78–#81, #128)

Beyond v1.1

Semantic DRY (#122) — deferred to v1.2+
