V1 acceptance criteria
Migrated from the paxman repository's `V1_ACCEPTANCE_CRITERIA.md` as part of the Sprint 11 repo spring clean.
Status: Draft v1. Audience: Paxman team, contributors, and reviewers. Related docs: PRD.md §9 Success Metrics, §10 V1 Acceptance Criteria, ARCHITECTURE.md §17 V1 Scope, PACKAGE_STRUCTURE.md
This document is the definition of done for Paxman V1. Every item is testable. When all unchecked items become checked, V1 is ready to ship as 1.0.0.
V1 is the first production-ready release: a stable public API, replayable artifacts, and the full V1 capability set. It is the version at which Paxman stops shipping breaking changes between MINOR releases (the semver pre-1.0 behavior).
- Pydantic Adapter: adapts Pydantic v2 model classes to `CanonicalContract` and back. Covers required, optional, default, validator, and constraint metadata.
- JSON Schema Adapter: adapts JSON Schema (draft 2020-12 and earlier) to `CanonicalContract` and back. Covers `type`, `properties`, `required`, `enum`, `format`, `pattern`, `minLength`/`maxLength`, `minimum`/`maximum`, and `items` for arrays.
- Dict DSL Adapter: adapts Paxman's internal Dict DSL to `CanonicalContract`. This is the escape hatch and the source of truth for tests that don't need Pydantic.
- OpenAPI Adapter (best-effort): adapts OpenAPI 3.x schemas (request/response bodies, components) to `CanonicalContract`. Covers a useful subset, not full 3.1.
- `text_extraction`: supports at minimum `text/plain` and `text/html` inputs.
- `regex_extraction`: supports ECMAScript-flavored regex with named groups.
- `lookup`: supports a deterministic in-memory backend.
- `inference`: supports at minimum one reference provider (V1: local stub or in-process). Provider SPI is in place.
- `validation`: supports type, range, regex, enum, and reference constraints.
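To illustrate what "ECMAScript-flavored regex with named groups" yields, here is a minimal sketch in plain Python. It is not Paxman's implementation: the helper names `ecma_named_groups_to_python` and `extract_fields` are hypothetical, and translating to Python's `re` syntax is an assumption made only so the example runs with the standard library.

```python
import re


def ecma_named_groups_to_python(pattern: str) -> str:
    """Rewrite ECMAScript named groups (?<name>...) as Python's (?P<name>...).

    Hypothetical helper. Lookbehinds such as (?<=...) and (?<!...) are left
    alone because a group name must start with a letter or underscore.
    Named backreferences (\\k<name>) and escaped parentheses are not handled.
    """
    return re.sub(r"\(\?<([A-Za-z_][A-Za-z0-9_]*)>", r"(?P<\1>", pattern)


def extract_fields(pattern: str, text: str) -> dict:
    """Return the named groups of the first match, or an empty dict."""
    match = re.search(ecma_named_groups_to_python(pattern), text)
    return match.groupdict() if match else {}


# Sample input invented for illustration.
fields = extract_fields(
    r"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})",
    "Invoice date: 2024-03-15",
)
print(fields)
```

Each named group maps naturally onto one contract field, which is presumably why named groups are called out in the criterion.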
- `contract/`: adapt + validate for all four V1 adapters.
- `planner/`: rule-based field-centric planning, deterministic, supports the full V1 heuristic chain.
- `capabilities/`: registry, dispatch, `CapabilitySpec` metadata, all five V1 capabilities.
- `executor/`: sequential execution, early stop, budget tracking.
- `reconciler/`: merge, conflict detection, confidence assignment, `MONEY` arithmetic, `CurrencyPolicy`.
- `artifact/`: build, serialize, `replay_hash`, rehydration, tamper detection.
- `api/`: `paxman.normalize`, `paxman.replay`, public types, public errors, public SPIs.
- `paxman.normalize(input_data, contract, budget=None, policy=None) -> ExecutionArtifact`
- `paxman.replay(artifact, contract) -> ExecutionArtifact`
- `paxman.register_adapter(adapter: ContractAdapter) -> None`
- `paxman.register_capability(capability: Capability) -> None`
- Public types re-exported: `CanonicalContract`, `CanonicalField`, `FieldType`, `Status`, `ConfidenceBand`, `ResolutionPolicy`, `Budget`, `Policy`, `ExecutionArtifact`, `CurrencyPolicy`.
- Public errors re-exported: `PaxmanError`, `InvalidContractError`, `ExecutionError`, `CapabilityError`, `InferenceProviderError`, `BudgetExceededError`, `ReconciliationError`, `ReplayError`, `VersionMismatchError`, `HashMismatchError`, `ConfigurationError`.
- Public SPIs: `ContractAdapter`, `Capability`.
- `paxman.replay(artifact, contract)` returns an `ExecutionArtifact` byte-equal to the input on a successful replay.
- `paxman.replay(artifact, contract)` raises `HashMismatchError` on a tampered artifact.
- `paxman.replay(artifact, contract)` raises `VersionMismatchError` on an unsupported Paxman version.
- `paxman.replay(artifact, contract)` raises `CapabilityNotFoundError` when a pinned capability is no longer registered.
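The tamper-detection criterion can be pictured with a small standalone sketch. It assumes, hypothetically, that `replay_hash` is a SHA-256 digest over a canonical JSON encoding of the artifact body; this document does not specify the real algorithm. `compute_replay_hash`, `verify_artifact`, the artifact layout, and the local `HashMismatchError` class are stand-ins for illustration, not Paxman's API.

```python
import hashlib
import json


class HashMismatchError(Exception):
    """Local stand-in for the public error of the same name."""


def compute_replay_hash(body: dict) -> str:
    # Assumption: canonical JSON (sorted keys, fixed separators) hashed with
    # SHA-256, so key order in the input dict does not affect the digest.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_artifact(artifact: dict) -> None:
    # Recompute the digest over the body and compare it with the stored one.
    expected = artifact["replay_hash"]
    actual = compute_replay_hash(artifact["body"])
    if actual != expected:
        raise HashMismatchError(f"expected {expected}, got {actual}")


# Sample artifact invented for illustration.
body = {"fields": {"total": "19.99", "currency": "EUR"}}
artifact = {"body": body, "replay_hash": compute_replay_hash(body)}
verify_artifact(artifact)  # an untouched artifact verifies silently

# Changing any value in the body without updating the hash is detected.
tampered = {
    "body": {"fields": {"total": "1.00", "currency": "EUR"}},
    "replay_hash": artifact["replay_hash"],
}
try:
    verify_artifact(tampered)
except HashMismatchError:
    print("tampered artifact rejected")
```

Hashing a canonical encoding rather than the raw serialized bytes is one common way to keep the digest stable across runs; whether Paxman does this is not stated here.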
- `MONEY` is a first-class field type.
- The Reconciler enforces currency matching by default.
- `CurrencyPolicy.STRICT_MATCH` (default) rejects cross-currency candidates.
- `CurrencyPolicy.ALLOW_FX` requires an explicit `fx_rate` field.
- Decimal precision is preserved; no float rounding errors in `MONEY` arithmetic.
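A minimal sketch of the currency rules above, using only the standard library. The `Money` dataclass, the `add_money` helper, the direction of `fx_rate`, and the use of `ValueError` are all assumptions for illustration; the real Reconciler types and error classes may differ, and the checklist describes `fx_rate` as a field rather than a function argument.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum
from typing import Optional


class CurrencyPolicy(Enum):
    STRICT_MATCH = "strict_match"
    ALLOW_FX = "allow_fx"


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str


def add_money(
    a: Money,
    b: Money,
    policy: CurrencyPolicy = CurrencyPolicy.STRICT_MATCH,
    fx_rate: Optional[Decimal] = None,
) -> Money:
    """Add two amounts, expressing the result in a's currency."""
    if a.currency == b.currency:
        return Money(a.amount + b.amount, a.currency)
    if policy is CurrencyPolicy.STRICT_MATCH:
        raise ValueError(f"currency mismatch: {a.currency} vs {b.currency}")
    if fx_rate is None:
        raise ValueError("ALLOW_FX requires an explicit fx_rate")
    # Assumption: fx_rate converts one unit of b's currency into a's currency.
    return Money(a.amount + b.amount * fx_rate, a.currency)


# Decimal keeps exact cents, whereas binary floats do not.
total = add_money(Money(Decimal("0.10"), "EUR"), Money(Decimal("0.20"), "EUR"))
print(total.amount == Decimal("0.30"))  # True
print(0.1 + 0.2 == 0.3)                 # False
```

Constructing `Decimal` from strings rather than floats is what preserves precision; `Decimal(0.1)` would carry the float's rounding error into the result.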
- `mypy --strict` passes on the public surface (`paxman/__init__.py`, `paxman/api/**`).
- `mypy --strict` passes on internal modules with `from __future__ import annotations` allowed.
- `pyright` passes on the same surface.
- `py.typed` marker is shipped.
- No `as any`, no `# type: ignore`, no `# pyright: ignore` in `src/paxman/`. (Sprint 10 fix: replaced `# type: ignore[return-value]` at `src/paxman/api/replay.py:104` with `typing.cast` since the two `Capability` Protocol classes are structurally compatible.)
- Test coverage on `contract/`, `planner/`, `executor/`, `reconciler/` ≥ 90% lines.
- Test coverage on `artifact/` ≥ 95% lines (replay is critical).
- Test coverage on `errors.py` and `versioning.py` is 100%.
- Property tests with Hypothesis verify determinism of `planner/` and `executor/`.
- Replay equality tests verify byte-equal rehydration.
- End-to-end fixtures (at least 5) cover the full pipeline.
- `ruff check` passes with no warnings.
- `ruff format --check` passes.
- `import-linter` passes the module DAG contract.
- No `# noqa` or `# ruff: noqa` in `src/paxman/` (only test code may use them).
- Every public symbol has a docstring (Google style).
- Every public module has a module docstring.
- `interrogate` reports 100% on the public surface.
- `README.md` has a quickstart that runs end-to-end in <5 minutes.
- `docs/concepts/` covers: contracts, capabilities, planning, reconciliation, replay.
- `docs/howto/` covers: adding an adapter, adding a capability, adding an inference provider, replaying an artifact.
- `paxman.normalize()` for a 20-field contract on 100 KB input (no remote inference) p50 ≤ 200 ms, p99 ≤ 2 s. Benchmarked: mean=17.4ms (well under threshold).
- `paxman.replay()` for a 100 KB artifact p50 ≤ 50 ms, p99 ≤ 500 ms. Benchmarked: mean=158μs (well under threshold).
- Cold import + capability registration ≤ 100 ms p50. Measured: 117ms, slightly above the 100ms threshold. Aspirational/non-gating per header.
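The thresholds above are stated as p50/p99 while the recorded results are means, so a percentile report would let the two be compared directly. The following is a minimal sketch of such a measurement with the standard library; `measure_latency_ms`, `percentiles`, and the placeholder `workload` are hypothetical names, and the workload stands in for a real `paxman.normalize()` call.

```python
import statistics
import time


def measure_latency_ms(fn, runs: int = 200) -> list:
    """Call fn repeatedly and return per-call wall-clock latency in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentiles(samples: list) -> dict:
    # quantiles(n=100) returns 99 cut points: index 49 is p50, index 98 is p99.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p99": cuts[98], "mean": statistics.fmean(samples)}


def workload() -> None:
    # Placeholder for the call under test.
    sum(i * i for i in range(10_000))


stats = percentiles(measure_latency_ms(workload))
# Compare against the normalize() budget from the checklist: 200 ms / 2 s.
meets_budget = stats["p50"] <= 200.0 and stats["p99"] <= 2000.0
print(stats, meets_budget)
```

A mean well under the p50 threshold is encouraging, but it does not by itself bound the p99, which is sensitive to occasional slow runs.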
- Source layout: `src/paxman/`.
- Build backend: `hatchling`.
- `pyproject.toml` declares the metadata, dependencies, optional dependencies, and tooling config in PEP 621 format.
- `py.typed` is included in the wheel.
- Trusted publishing is configured for PyPI.
- Wheels are built for CPython 3.11, 3.12, 3.13 on `linux/amd64`, `linux/arm64`, `osx/amd64`, `osx/arm64`, `win/amd64`. Pure Python package: `hatchling build` produces a universal `py3-none-any.whl` covering all listed platforms and Python versions. No cibuildwheel (not needed for pure Python).
- `CHANGELOG.md` follows Keep a Changelog format.
- Versioning follows semver (post-1.0).
- GitHub Actions CI runs lint, type-check, tests, and build on every PR.
- A release workflow publishes to PyPI on tag.
- A `SECURITY.md` is in place with a vulnerability disclosure process.
- `LICENSE` is present.
- `CONTRIBUTING.md` describes the contribution process.
- `CODE_OF_CONDUCT.md` is in place.
- Issue templates (bug, feature) and PR template are in place.
- Branch protection on `main` requires CI to pass. Status: cannot verify from a repository clone; this is a GitHub repo settings property (requires UI/API check by project owner).
- PRD.md — present, with success metrics, V1 acceptance criteria, personas, risks, glossary, open questions.
- ARCHITECTURE.md — present, with subsystem spec, sequence diagram, error model, versioning, ADRs, security, observability.
- PACKAGE_STRUCTURE.md — present, with module DAG, per-module testing strategy, public/private split, dependency policy, pyproject.toml layout.
- GLOSSARY.md — present, single source of truth for vocabulary.
- REPLAY_AND_DETERMINISM.md — present, deep dive.
- SECURITY.md — present, threat model.
- TESTING_STRATEGY.md — present, test seams and determinism.
- DEVELOPMENT.md — present, local dev setup.
- EXTENDING.md — present, SPI usage guides.
- DEPENDENCIES.md — present, core vs optional.
- docs/adr/ — at least 7 ADRs, covering all major decisions. (9 ADRs as of Sprint 0; criterion remains open until V1 ships.) Verified: 12 files present, of which 10 are numbered ADRs (0001–0010) and the other two are README.md and AGENTS.md.
The library may ship v0.x (pre-1.0) releases without meeting all V1 criteria. The library MAY be tagged 1.0.0 only when all of the following are true:
- All items in §1, §2, §3, §4 above are checked.
- The success metrics in PRD §9 are met or explicitly waived.
- At least one end-to-end fixture from a real-world use case (invoice, quotation, procurement) reproduces the same `replay_hash` across two independent runs.
- At least three external users (from the target personas in PRD §6) have used Paxman for a real workload and reported no blocking issues.
This checklist is the source of truth for "is V1 done?" When all items are checked, the team tags 1.0.0. Until then, the highest available version is 0.x.y and the library is pre-1.0.
Status snapshot:
- All §1, §2, §3, §4 items checked → ship `1.0.0`.
- At least 80% of §1 items checked → ship `0.5.0` (feature-complete beta).
- At least 50% of §1 items checked → ship `0.3.0` (alpha).
- At least the planner + one adapter + one capability work end-to-end → ship `0.1.0` (initial preview).
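The status ladder above can be expressed as a small decision function. This is an illustrative sketch only: `suggested_version` and its parameters are hypothetical, and it encodes just the snapshot rules, not the additional 1.0.0 gates listed earlier (PRD success metrics, a reproducible real-world fixture, and external users).

```python
from typing import Optional


def suggested_version(
    s1_fraction: float,
    other_sections_done: bool,
    minimal_pipeline_works: bool,
) -> Optional[str]:
    """Map checklist progress to the highest release the snapshot allows.

    s1_fraction: share of §1 items checked, between 0.0 and 1.0.
    other_sections_done: True when every item in §2, §3, and §4 is checked.
    minimal_pipeline_works: planner + one adapter + one capability run end-to-end.
    """
    if s1_fraction >= 1.0 and other_sections_done:
        return "1.0.0"
    if s1_fraction >= 0.8:
        return "0.5.0"  # feature-complete beta
    if s1_fraction >= 0.5:
        return "0.3.0"  # alpha
    if minimal_pipeline_works:
        return "0.1.0"  # initial preview
    return None  # nothing shippable yet


print(suggested_version(0.85, False, True))
```

Checking the rungs from highest to lowest means the function always returns the most advanced version whose condition holds.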