-
Notifications
You must be signed in to change notification settings - Fork 0
Sprint plan
Migrated from paxman repositorys docs/sprints/README.md as part of the Sprint 11 repo springclean.
Status: Planning deliverable (no code). Branch:
sprint-planning-v1Generated: 2026-06-22 Target release: Paxman 1.0.0 on PyPI Total duration: 10 sprints (~5 months at 2-week sprints + 3-week Sprint 10 for release)
This directory contains the complete sprint planning for taking Paxman from its current state (zero source code, design-only) to a production-ready 1.0.0 release. Each sprint has scope, deliverables, prerequisites, exit criteria, risk register, and per-sprint tooling/API-key/prerequisite checklist.
| Reader | Read first |
|---|---|
| Project lead / engineering manager |
README.md (this file) → sprint-00-design-closure.md → sprint-10-release.md
|
| Engineer starting Sprint N |
sprint-NN-*.md only — each is self-contained |
| Stakeholders / reviewers |
README.md → CHANGES_LOG.md → each sprint's "Exit Criteria" |
| Operations / DevOps |
sprint-01-foundation.md §Prerequisites + sprint-08-docs-ci-hardening.md
|
| Security reviewer |
sprint-09-production-hardening.md + sprint-10-release.md §Security |
| # | Name | Focus | Duration | Sum of deliverables (id-ed) | Exit artifact |
|---|---|---|---|---|---|
| 0 | Design closure | 3 spec gaps + license decision | 1 week | 3 |
docs/specs/ (per Sprint 0 doc) containing Dict DSL spec, Input Profile spec, CostHint values per capability, license ADR |
| 1 | Foundation | pyproject.toml, Makefile, CI, src layout, all 9 cross-cutting modules, py.typed, smoke test | 2 weeks | 22.6 |
pip install -e . works, make ci is green, public paxman/__init__.py empty re-export shell |
| 2 | Contract subsystem |
contract/canonical.py, contract/_types.py, contract/validator.py, Pydantic + JSON Schema + Dict DSL adapters, fixture contracts |
2 weeks | 30.7 |
paxman.contract.adapt(InvoiceModel) → CanonicalContract round-trips; first passing adapter tests |
| 3 | Planner + 3 capabilities |
planner/* (7 modules), 3 capabilities (text_extraction, regex_extraction, validation), planner determinism tests |
2 weeks | 35.2 |
planner.plan(...) produces a deterministic ExecutionPlan (no execution yet); first capability unit tests pass |
| 4 | Executor + 2 capabilities + OpenAPI |
executor/* (7 modules), lookup + inference capabilities with stub provider, OpenAPI adapter (best-effort) |
2 weeks | 27.5 |
executor.run(plan, contract, registry, input) → CandidateResult[] works end-to-end; OpenAPI 3.0 petstore smoke test passes |
| 5 | Reconciler |
reconciler/* (9 modules) including money.py, fetch_test_data.py fully implemented |
2 weeks | 25.8 | Reconciler is the sole confidence assigner; MONEY arithmetic property-tested; adversarial corpus vendored |
| 6 | Artifact + API |
artifact/* (8 modules), api/* (7 modules), paxman.normalize() end-to-end, replay rehydration, CapabilityNotFoundError (per Oracle C1) |
2 weeks | 31.5 |
paxman.normalize() and paxman.replay() work end-to-end; first golden artifact produced |
| 7 | Integration + property tests | E2E fixtures, golden artifacts (≥5), Hypothesis strategies, replay tests, public API snapshot | 2 weeks | 25.3 | Full test pyramid green: 90% coverage on contract/, planner/, executor/, reconciler/; 95% on artifact/; replay equality proven |
| 8 | Documentation + community + CI hardening |
docs/concepts/ (5 docs), docs/howto/ (4 docs), README.md quickstart, CHANGELOG.md, CONTRIBUTING.md, CODE_OF_CONDUCT.md, GitHub Actions CI, import-linter, interrogate, bandit, pip-audit |
2 weeks | 17.2 | All CI checks pass on PR; documentation site builds; new contributor can ship a PR in <30 min |
| 9 | Production hardening |
pytest-benchmark performance harness, security audit, p50/p99 measurements, import-time profiling, OIDC trusted publishing setup, issue/PR templates, branch protection, 5-platform wheel note (pure-Python → universal py3-none-any)
|
2 weeks | 11.8 |
make ci green + performance targets measured; security lint clean; pip-audit clean |
| 10 | Release | TestPyPI publish, v0.5.0 → v1.0.0 RC cycle, smoke install on 3 platforms (linux/amd64, osx/arm64, win/amd64), ≥3 external users validation, GitHub release, post-release retrospective | 3 weeks | 10.6 | Paxman 1.0.0 on PyPI with passing acceptance criteria, ≥3 external users, golden artifact reproducibility across two independent runs |
Total: ~241 ideal-engineering-days of work across 10 sprints. With 4 engineers working in parallel within each sprint, the calendar duration is 22 weeks (5.5 months) — see the parallelization table below. The "sum of deliverables" column is the raw id-ed if the work were done sequentially; the sprint-length column is the actual wall-clock.
Reconciliation note (added per Oracle review C2): An earlier draft of this README quoted "~146 id-ed" and "~190 id-ed"; both were undercounts. The 241 id-ed number is the sum of every per-sprint deliverable's effort estimate. The realistic 4-person-team calendar estimate of 22 weeks (5.5 months) is defensible because each sprint assumes 2-4 engineers working in parallel, and the sum of parallel work-hours per sprint fits within 2 weeks. Recommended buffer: add 2 weeks of slack before Sprint 10 (6 months total) to absorb determinism regressions discovered in Sprint 7 property tests.
Sprint 0 Design closure ─┐
▼
Sprint 1 Foundation (cross-cutting) ─┐
▼
Sprint 2 contract/ + Pydantic + Dict DSL adapters ─┐
▼
Sprint 3 planner/ + 3 capabilities (text, regex, validation) ─┐
▼
Sprint 4 executor/ + 2 capabilities (lookup, inference) ─┐
▼
Sprint 5 reconciler/ + MONEY + fetch_test_data.py ─┐
▼
Sprint 6 artifact/ + api/ (normalize + replay) ─┐
▼
Sprint 7 Integration + E2E + property tests + goldens ─┐
▼
Sprint 8 Docs + CI hardening + community ─┐
▼
Sprint 9 Production hardening + OIDC ─┐
▼
Sprint 10 Release (v1.0.0) ─── DONE
These are decisions the project owner must make; they are documented in sprint-00-design-closure.md:
-
License decision: MIT or Apache-2.0? (Currently TBD per
README.md.) - Dict DSL syntax: Spec the internal Dict DSL format. Required as the "test source of truth" per ADR-0007.
-
Input Profile spec: What fields does
planner/input_profile.pyproduce? How isInputProfileconstructed? -
CostHintvalues per capability: Document the cost model (tokens, ms, usd) for all 5 V1 capabilities.
The following can run in parallel within a sprint (4-person team):
| Sprint | Parallel tracks |
|---|---|
| 1 | (a) cross-cutting modules, (b) pyproject.toml + Makefile + .pre-commit-config.yaml, (c) GitHub Actions CI workflow, (d) .gitignore + LICENSE + repo boilerplate |
| 2 | (a) Pydantic adapter, (b) Dict DSL adapter, (c) Validator + canonical model, (d) fixture Pydantic contracts + Dict DSL contracts |
| 3 | (a) planner/heuristics.py + planner/scoring.py, (b) planner/field_plan.py + planner/input_profile.py, (c) capabilities/v1/regex_extraction.py + capabilities/v1/validation.py, (d) capabilities/v1/text_extraction.py + tests |
| 4 | (a) executor/field_runner.py + executor/budget_tracker.py, (b) executor/early_stop.py + executor/evidence.py, (c) capabilities/v1/lookup.py, (d) capabilities/v1/inference.py with stub provider |
| 5 | (a) reconciler/merge.py + reconciler/conflict.py, (b) reconciler/money.py (MONEY arithmetic), (c) reconciler/confidence.py + reconciler/unresolved.py, (d) fetch_test_data.py implementation + license gating |
| 6 | (a) artifact/artifact.py + artifact/evidence.py, (b) artifact/_hash.py + artifact/serializer.py, (c) artifact/replay.py + api/replay.py, (d) api/normalize.py orchestration |
| 7 | (a) E2E fixtures + golden artifacts, (b) Hypothesis strategies, (c) replay equality tests, (d) public API snapshot test |
| 8 | (a) docs/concepts/* (5 docs), (b) docs/howto/* (4 docs), (c) CHANGELOG.md + CONTRIBUTING.md + CODE_OF_CONDUCT.md, (d) GitHub Actions CI refinement + import-linter config |
| 9 | (a) Performance benchmarks, (b) security audit + bandit/pip-audit, (c) issue/PR templates, (d) OIDC trusted publishing setup |
| 10 | (a) TestPyPI publish + smoke tests, (b) external user validation, (c) GitHub release + changelog excerpt, (d) post-release retrospective |
A sprint is "done" when all of these are true:
- All scope items in that sprint's "Deliverables" section are complete.
- All exit-criteria tests pass.
- The CI matrix (py311/3.12/3.13) is green.
-
mypy --strict src/paxmanis clean. -
ruff checkandruff format --checkare clean. -
interrogatereports 100% docstring coverage on the public surface. - For sprints 2+:
import-linteris clean (DAG contract). - For sprints 6+: at least one end-to-end smoke test runs and produces an artifact.
Paxman 1.0.0 ships when all V1 acceptance criteria in V1_ACCEPTANCE_CRITERIA.md are checked. The Sprint 10 doc lists the specific checklist.
-
CHANGES_LOG.md— documentation changes made during this planning exercise -
../PRD.md— product requirements and success metrics -
../V1_ACCEPTANCE_CRITERIA.md— definition of done for 1.0 -
../PACKAGE_STRUCTURE.md— module layout (the source of truth for code structure) -
../docs/adr/— 9 ADRs that constrain the design (7 from planning + 2 from Sprint 0: license, Dict DSL)