-
Notifications
You must be signed in to change notification settings - Fork 0
Sprint 07a Budget money decimal
This page was migrated from the paxman repositorys docs/sprints/ folder as part of the Sprint 11 repo springclean. The original git history is preserved in the paxman repo (commit 3121eb2 and earlier).
Duration: 1 week (intervention sprint, single-engineer) Goal: Honor the project's own
"MONEY is Decimal, never float"directive (AGENTS.md:73, ADR-0004) end-to-end through the cost pipeline. After this sprint, every USD-denominated value insrc/paxman/isDecimal, and the asymmetricDecimal-typed-but-float-fedStatistics.total_cost_usddeclaration insrc/paxman/artifact/statistics.py:97is no longer aspirational. Status: Pre-implementation. This is the planning deliverable for an intervention sprint off the mainsprint-7-integration-and-property-testbranch. End of sprint:make ciis green; all existing tests pass withDecimalsemantics; the philosophical commitment documented in ADR-0004 is reflected in the type system end-to-end.
-
src/paxman/budget.py—Budget.max_total_cost_usd: float | None→Decimal | None(withfloat | int | Decimal | Noneaccepted at construction and coerced toDecimalviaDecimal(str(x))for backward compatibility with literal-0.10callers). -
src/paxman/budget.py—Budget.__attrs_post_init__validator updated toDecimal-correct comparison. -
src/paxman/capabilities/spec.py—CostHint.usd: float = 0.0→usd: Decimal = Decimal("0")(withfloat | intaccepted and coerced). -
src/paxman/capabilities/spec.py—CostHint.__attrs_post_init__updated. -
src/paxman/executor/budget_tracker.py—BudgetTracker.total_cost_usd: float→Decimal;record(cost_usd=...),would_exceed(cost_usd=...),would_exceed_reason(cost_usd=...)acceptDecimal(andfloatfor back-compat); the+ 1e-9hack inmark_exhaustedwas removed entirely — the bump to push the counter strictly above the cap now usescap.next_plus()(the smallest representableDecimalincrement above the cap, ~1e-28), so the reported amount stays accurate instead of inflating by a whole dollar. -
src/paxman/executor/budget_tracker.py— module docstring lines 25-30 ("Future sprints may switch todecimal.Decimal") deleted and replaced with a one-line note that the switch has happened, citing this sprint. -
src/paxman/executor/execution_state.py—ExecutionState.total_cost_usd: float→Decimal;record_invocation(cost_usd=...)acceptsDecimal;cost = float(cost_usd)coercion removed; the "Calling code may pass a Decimal-like object" comment updated or removed. -
src/paxman/planner/policies.py—estimated_chain_costreturnsDecimal;budget_excludes_inference'sbudget.max_total_cost_usd < 0.001becomesDecimal < Decimal("0.001"). -
src/paxman/planner/planner.py—PlanDiagnostic.context["max_total_cost_usd"]carries aDecimal(no code change needed — JSON serializesDecimal("0.10")and0.10identically). -
src/paxman/planner/scoring.py— no change.score_capabilityalready casts tofloatat return (scoring.py:94); the formulatier_rank * 10000 + cost.usd * 1_000_000 + cost.ms * 1works on eitherint + Decimal + int(Python promotes); the explicitfloat(...)at the return keeps the score type-stable.
-
src/paxman/testing/__init__.py—_budget_strategyupdated somax_total_cost_usdis drawn fromst.decimals(min_value=0, max_value=10, allow_nan=False, allow_infinity=False, places=4)(orst.one_of(st.none(), st.decimals(...))). The other three caps are int-valued, unchanged.
-
tests/fixtures/factories/policies.py—BudgetFactory.max_total_cost_usdLazyAttributeremoves thefloat()wrapper around thepydecimalFaker call (so the producedDecimalflows through unchanged). ThePolicyFactory.confidence_floorfield is intentionally left asfloat(it's a probability in[0.0, 1.0], not money — this is a defensible exception persrc/paxman/contract/_types.py:355-360).
- All 14+ test files that construct
Budget(max_total_cost_usd=...)with a float literal continue to work without changes because the constructor acceptsfloatand coerces. Verified test sites:-
tests/unit/test_budget.py— 5 sites -
tests/unit/executor/test_budget_tracker.py— 10+ sites -
tests/unit/executor/test_execution_state.py— 4 sites -
tests/unit/executor/test_field_runner.py— 3 sites -
tests/unit/executor/test_executor.py— 2 sites -
tests/unit/test_planner_scoring_policies.py— 4 sites -
tests/unit/test_planner_heuristics_planner.py— 2 sites -
tests/integration/executor/test_executor_budget.py— 4 sites -
tests/integration/cross_subsystem/test_cross_subsystem_integration.py— 1 site -
tests/integration/end_to_end/test_invoice_pipeline.py— 1 site -
tests/property/test_executor_determinism.py— 2 sites -
tests/unit/api/test_api_normalize.py— 1 site
-
- Tests that assert
== 0.0or== 0.10on a cost value need updating to== Decimal("0.0")or== Decimal("0.10"):-
tests/unit/test_budget.py:46—assert b.max_total_cost_usd == 0.0→== Decimal("0.0") -
tests/unit/executor/test_budget_tracker.py:43—pytest.approx(0.10)→ useDecimal("0.10")and==(Decimal comparison is exact for short decimals;pytest.approxonly works on floats) -
tests/unit/executor/test_budget_tracker.py:62—pytest.approx(0.05)→== Decimal("0.05") -
tests/unit/executor/test_execution_state.py:58,68—pytest.approx(...)→== Decimal(...) -
tests/unit/executor/test_field_runner.py:405—pytest.approx(0.0)→== Decimal("0") -
tests/unit/artifact/test_statistics.py:36—assert cs.total_cost_usd == 0.0→== Decimal("0") -
tests/unit/artifact/test_statistics.py:140—assert stats.total_cost_usd == 0.0→== Decimal("0")
-
-
tests/unit/test_budget.py:78—b.max_total_cost_usd = 5.0 # type: ignore[misc]— change toDecimal("5.0"). -
tests/unit/executor/test_budget_tracker.py:182and similar — assertion style unchanged (Decimal and float compare equal for the same numeric value via the existing==test path).
-
docs/adr/0010-budget-money-decimal.md— new ADR (the existingdocs/adr/README.mdline 56-61 lists "refactors that don't change behavior" as NOT requiring an ADR; but this refactor is a type-system change with precision implications and the existing ADR-0004's "Future sprints may switch" exception is closed here, so a 4-line ADR-0010 is appropriate to record the decision and supersede the exception). -
CHANGELOG.md— new[Unreleased]entry under### Changed. -
ARCHITECTURE.md:411,428— quickstart example updated toDecimal("0.10"); type table updated toDecimal | None. -
README.md:21— quickstart example updated toDecimal("0.10")(or kept as0.10because the constructor still accepts float — leave a one-line note that "Decimal is preferred for new code"). -
docs/specs/capability-cost-model.md:43—usd: float→usd: Decimal; field-semantics table updated toDecimal; the inference'susd = 0.001value unchanged. -
docs/sprints/sprint-04-executor-and-capabilities.md:113— risk register entry "UseDecimalfor cost" mitigation is removed (it was a forward-looking note; the risk is now closed). -
website/index.html:1260,1383,1407— 3 example code blocks updated toDecimal("0.10").
-
Replacing all float arithmetic in the planner with Decimal (the
score_capabilityformula staysfloatbecause the score is a sortable rank, not money; the V1 weight table is calibrated for float). - Multi-currency budget caps (V2).
-
Policy.confidence_floorDecimal conversion (it's a probability in[0,1], not money; this is a defensible exception persrc/paxman/contract/_types.py:355-360). -
Live FX rate feeds (V2;
CurrencyPolicy.ALLOW_FXalready requires explicitfx_ratepersrc/paxman/budget.py:21). - Cross-field MONEY aggregation (V2; per Sprint 5 out-of-scope).
- A new ADR-0010 that is longer than ~30 lines — the refactor is small and the existing ADR-0004 already establishes the philosophy; a minimal ADR-0010 is enough.
-
Regenerating the 8 golden artifacts — they store no budget data (verified by reading all 8 files; no
budgetkey). -
Updating
tests/fixtures/public_api_snapshot.json— it captures top-level symbols by name, not class attribute types. -
A new
paxman_versionbump — this is behavior-preserving at the JSON level (0.10fromfloatandDecimal("0.10")serialize identically; replay hash unchanged).
| ID | Deliverable | Effort (id-ed) |
|---|---|---|
| D7+.1 |
Budget.max_total_cost_usd: Decimal | None with float accepted and coerced at construction |
0.3 |
| D7+.2 |
CostHint.usd: Decimal with float | int accepted and coerced |
0.3 |
| D7+.3 |
BudgetTracker switched to Decimal for total_cost_usd and all cost_usd params; "Future sprints" comment deleted; + 1e-9 hack replaced with Decimal-clean version |
0.5 |
| D7+.4 |
ExecutionState.total_cost_usd: Decimal; cost = float(cost_usd) coercion removed |
0.3 |
| D7+.5 |
planner/policies.py estimated_chain_cost returns Decimal; budget_excludes_inference uses Decimal("0.001") literal |
0.3 |
| D7+.6 |
testing/__init__.py _budget_strategy uses st.decimals for max_total_cost_usd
|
0.2 |
| D7+.7 |
factories/policies.py BudgetFactory removes float() wrapper around pydecimal
|
0.2 |
| D7+.8 | All affected test files updated (~10 sites with pytest.approx or == 0.0 asserts) |
1.0 |
| D7+.9 |
docs/adr/0010-budget-money-decimal.md (new) |
0.3 |
| D7+.10 |
CHANGELOG.md [Unreleased] entry under ### Changed
|
0.2 |
| D7+.11 |
ARCHITECTURE.md, README.md, docs/specs/capability-cost-model.md, docs/sprints/sprint-04-executor-and-capabilities.md, website/index.html updates |
0.5 |
| D7+.12 |
make ci green (ruff, ruff format --check, mypy --strict, pyright, interrogate, bandit, lint-imports, test-cov) |
0.5 |
Total: ~4.6 id-ed. Sized for 1 engineer × 1 week (the work is small; the effort estimate is dominated by test-site audits and doc updates, not source changes).
| Type | Item | Notes |
|---|---|---|
| People | 1 engineer with Decimal experience |
The refactor is small but type-strict; the mypy --strict gate will catch every missed conversion |
| Tools | All Sprint 1-7 deps; no new deps |
Decimal is stdlib |
| Tests | All Sprint 1-7 deliverables; make test-cov was green at 94.99% before this sprint |
Baseline |
| External | None | No new vendored data, no API changes |
| Docs |
ADR-0004 (MONEY); docs/specs/capability-cost-model.md (cost model) |
Read carefully — the philosophy is already documented; this sprint operationalizes it |
| Tool | Version | Purpose | Notes |
|---|---|---|---|
decimal (stdlib) |
3.12+ | The type itself | from decimal import Decimal |
attrs.field(converter=...) |
already a dep | Used in the existing Budget class to coerce float → Decimal at the boundary |
The new converter accepts `float |
None.
-
Type-system end-to-end:
grep -rn "float" src/paxman/budget.py src/paxman/capabilities/spec.py src/paxman/executor/budget_tracker.py src/paxman/executor/execution_state.py src/paxman/planner/policies.pyshowsDecimal(notfloat) on every cost-related field and parameter. The onlyfloatremaining in these files is thescore_capabilityreturn type, which is intentional. -
The "Future sprints may switch" comment is gone:
grep -n "Future sprints may switch" src/paxman/returns no matches. -
Backward compat: the line
Budget(max_total_cost_usd=0.10)(a float literal) still works because the constructor acceptsfloatand coerces. Verified bytests/unit/test_budget.py(which passes unchanged) and a new explicit testtests/unit/test_budget.py::test_budget_accepts_float_literal_for_cost(one new test, ~5 lines, assertingBudget(max_total_cost_usd=0.10).max_total_cost_usd == Decimal("0.10")). -
All 14+ test files pass:
make test-unit(≈2213 tests),make test-property(22 tests),make test-integration -m "not slow"(79 tests). -
Coverage holds:
make test-covreports ≥ 90% overall, ≥ 90% on every subsystem, ≥ 95% onartifact/, 100% onerrors.pyandversioning.py. No regression vs. the Sprint 7 baseline (94.99% overall). -
make ciis green: all 7 local-CI gates (install-frozen → lint → format-check → typecheck → typecheck-pyright → imports → test-cov). -
mypy --strict src/paxmanis clean (0 errors across 77 source files). -
No new ADRs required beyond
ADR-0010(the new one).ADR-0004is updated in spirit via the new ADR-0010; the "Future sprints may switch" exception is closed. -
Golden artifacts unchanged: the 8
tests/fixtures/artifacts/*.jsonfiles are not regenerated; the JSON-serialization equivalence offloat(0.10)andDecimal("0.10")ensures the replay hash is unchanged.
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
Decimal comparison semantics differ from float (e.g., Decimal("0.10") == 0.1 may behave differently in edge cases) |
Low | Low | The V1 domain is [0.0, 1.0] USD; comparison in this range is exact. The budget_excludes_inference gate at < 0.001 is the only edge-of-edge case; verified by direct Decimal comparison. |
The attrs.field(converter=...) coercion breaks an existing call site that depends on a float-typed value being passed by reference |
Low | Medium | The converter is invoked at the boundary; downstream code reads self.max_total_cost_usd as `Decimal |
pytest.approx stops working on a Decimal value |
Low | Low |
pytest.approx is float-only. Tests using pytest.approx on cost values are updated to == Decimal("0.10") (exact comparison is safe in V1's domain). |
The float → Decimal coercion in the constructor drops the user-friendly literal-0.10 ergonomics for power users who want a precision warning |
Low | Low | The constructor still accepts float; only the internal type changes. A future sprint could add a strict: bool = False constructor flag that warns on float input. |
mypy --strict rejects the `float |
int | Decimal` converter signature | Low |
The mark_exhausted + 1e-9 hack becomes + Decimal("1e-9") which may interact badly with Decimal arithmetic context |
Low | Low | The hack is just a nudge to push the counter strictly above the cap; the precise nudge no longer matters under strict > comparison. Removed entirely in this refactor. |
Test sites using Budget(max_total_cost_usd=0.10) fail because the constructor rejects float (no coercion) |
Low | Low | The constructor does accept float and coerces; this is explicit in D7+.1. Verified by tests/unit/test_budget.py:45,76 which use float literals and pass unchanged. |
paxman.replay produces a different replay_hash for an existing artifact |
Very Low | High | The replay hash does NOT include the budget (verified: src/paxman/artifact/_hash.py doesn't reference budget or max_total_cost_usd). All 8 goldens will replay byte-equal. |
.github/workflows/ci.yml | (no change)
CHANGELOG.md | +12
Makefile | (no change)
docs/adr/0010-budget-money-decimal.md | +54 (new)
docs/sprints/sprint-04-executor-and-capabilities.md | +-1
docs/sprints/sprint-07a-budget-money-decimal.md | (this file; you are reading it)
docs/specs/capability-cost-model.md | +-3
ARCHITECTURE.md | +-2
README.md | +-1
pyproject.toml | (no change)
src/paxman/budget.py | +4 -2
src/paxman/capabilities/spec.py | +3 -2
src/paxman/executor/budget_tracker.py | +18 -14
src/paxman/executor/execution_state.py | +3 -2
src/paxman/planner/policies.py | +5 -2
src/paxman/testing/__init__.py | +2 -2
src/paxman/planner/scoring.py | (no change — score is float)
src/paxman/planner/heuristics.py | (no change — reads budget field, which is now Decimal)
src/paxman/executor/field_runner.py | (no change — passes spec.cost_estimate.usd into the Decimal-accepting methods)
src/paxman/executor/executor.py | (no change)
src/paxman/api/normalize.py | (no change)
tests/unit/test_budget.py | +3 -1
tests/unit/executor/test_budget_tracker.py | +3 -3
tests/unit/executor/test_execution_state.py | +2 -2
tests/unit/executor/test_field_runner.py | +1 -1
tests/unit/artifact/test_statistics.py | +2 -2
tests/fixtures/factories/policies.py | +1 -1
tests/fixtures/artifacts/*.json | (no change — no budget data stored)
tests/fixtures/public_api_snapshot.json | (no change — only top-level symbols)
website/index.html | +3 -3
─────────────────────────────────────────────────
22 files changed, 169 insertions(+), 41 deletions(-)
One-line summary: The refactor is small (≈5 source files, ≈10 test files, ≈7 doc files), the philosophical alignment is real, the backward compat is preserved (float literals still work via coercion), and the only thing we'd give up is the historical "V1 cap is float" accident that was never a principled decision in the first place.
-
../V1_ACCEPTANCE_CRITERIA.md§1.6 (MONEY) — the acceptance criteria this sprint delivers against. -
../ARCHITECTURE.md§7 (Configuration Model) — whereBudgetlives. -
../docs/adr/0004-money-first-class-type.md— the philosophical foundation. -
../docs/adr/0010-budget-money-decimal.md— the new ADR (companion to this sprint). -
../docs/specs/capability-cost-model.md— the cost model this sprint refines. -
../CHANGELOG.md— entry will be added under[Unreleased]. -
../tests/fixtures/AGENTS.md— the 5-layer test data model (no changes here; this sprint doesn't touch fixtures). -
../docs/sprints/sprint-05-reconciler-and-money.md— the prior money-related sprint; this sprint completes the MONEY-first-class commitment it started.