Add results.json schema validation and source tracking by PavelMakarchuk · Pull Request #237 · PolicyEngine/policyengine.py

PavelMakarchuk · 2026-02-24T00:46:19Z

Summary

Adds a policyengine.results module with two small pieces that support the blog post content pipeline:

- Schema validation (schema.py): Pydantic models that validate results.json at generation time
- Source tracking (tracking.py): Helper that auto-captures line numbers for traceability

What it does

Schema validation

from policyengine.results import ResultsJson, ResultsMetadata, ValueEntry

results = ResultsJson(
    metadata=ResultsMetadata(title="SALT Cap Repeal", repo="PolicyEngine/analyses"),
    values={
        "budget_impact": ValueEntry(
            value=-15.2e9,
            display="$15.2 billion",
            source_line=47,
            source_url="https://github.com/.../analysis.py#L47",
        ),
    },
)
results.write("results.json")

Catches errors at generation time:

- Missing source_line or source_url on any value
- Table rows with wrong column count
- Chart alt text that's too short (< 20 chars)
- Missing required metadata fields
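The checks above can be sketched without any dependencies. This is not the module's actual code: the PR uses Pydantic models, while the sketch below swaps in plain dataclasses, and the `headers`, `rows`, `url`, and `alt` field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, List

MIN_ALT_TEXT_CHARS = 20  # threshold quoted in the list above


@dataclass
class ValueEntry:
    value: float
    display: str
    source_line: int  # required: omitting it raises TypeError
    source_url: str   # required: omitting it raises TypeError

    def __post_init__(self) -> None:
        if not isinstance(self.source_line, int) or self.source_line < 1:
            raise ValueError("source_line must be a positive integer")
        if not self.source_url:
            raise ValueError("source_url must not be empty")


@dataclass
class TableEntry:
    headers: List[str]     # hypothetical field name
    rows: List[List[Any]]  # hypothetical field name

    def __post_init__(self) -> None:
        expected = len(self.headers)
        for index, row in enumerate(self.rows):
            if len(row) != expected:
                raise ValueError(
                    f"row {index} has {len(row)} cells, expected {expected}"
                )


@dataclass
class ChartEntry:
    url: str  # hypothetical field name
    alt: str  # hypothetical field name

    def __post_init__(self) -> None:
        if len(self.alt) < MIN_ALT_TEXT_CHARS:
            raise ValueError(
                f"alt text must be at least {MIN_ALT_TEXT_CHARS} characters"
            )
```

Validation runs in `__post_init__`, so a bad entry fails when the object is constructed, which is the "generation time" behaviour described above.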

Source tracking

from policyengine.results import tracked_value

budget = reform_revenue - baseline_revenue
results["values"]["budget_impact"] = tracked_value(
    value=budget,
    display=f"${abs(budget)/1e9:.1f} billion",
    repo="PolicyEngine/analyses",
)
# Automatically captures source_line and builds source_url
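For readers unfamiliar with the technique, here is a minimal sketch of how a helper like this could capture the caller's line number with `inspect`. It is an illustration under stated assumptions, not the code in `tracking.py`: the `path` and `branch` parameters, their defaults, and the returned dict shape are hypothetical, and the URL simply follows GitHub's usual `blob/<branch>/<path>#L<line>` form.

```python
import inspect
from typing import Any, Dict


def tracked_value(
    value: Any,
    display: str,
    repo: str,
    path: str = "analysis.py",  # hypothetical: file path inside the repo
    branch: str = "main",       # hypothetical: branch used in the link
) -> Dict[str, Any]:
    """Return a value entry annotated with the caller's line number."""
    frame = inspect.currentframe()
    caller = frame.f_back if frame is not None else None
    if caller is None:
        raise RuntimeError("caller frame is unavailable on this interpreter")
    line = caller.f_lineno  # line in the calling script, not in this helper
    return {
        "value": value,
        "display": display,
        "source_line": line,
        "source_url": f"https://github.com/{repo}/blob/{branch}/{path}#L{line}",
    }
```

When the call spans several lines, which physical line is reported can vary by Python version, so a link built this way points at the call site rather than at a guaranteed exact line.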

Why

Every number in a PolicyEngine blog post should link to the exact line of code that produced it. This module ensures the results.json contract is valid before it reaches the resolve-posts build step.

Test plan

- 11 unit tests passing (schema validation, source tracking, edge cases)
- CI passes

🤖 Generated with Claude Code

New `policyengine.results` module with two pieces:
- `schema.py`: Pydantic models (ResultsJson, ValueEntry, TableEntry, ChartEntry) that validate results.json at generation time. Catches missing source_line/source_url, row/column mismatches in tables, and vague alt text on charts before they reach the blog build step.
- `tracking.py`: `tracked_value()` helper that captures the caller's line number via `inspect` and builds the source_url automatically. Eliminates repetitive inspect.currentframe() boilerplate in analysis scripts.

These support the blog post content pipeline where every number in a published post links back to the exact line of analysis code that produced it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Use model_dump(mode="json") instead of json.loads(model_dump_json()) to avoid unnecessary serialize→parse→serialize round-trip
- Create parent directories automatically so callers don't need to mkdir first
- Add trailing newline to output file
- Add test for nested directory creation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
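The write behaviour described in this commit (parent directories created on demand, trailing newline) can be sketched with the standard library alone. The function name `write_results` and the two-space indent are assumptions; the real method lives on `ResultsJson` and serialises via Pydantic.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def write_results(payload: Dict[str, Any], path: Union[str, Path]) -> Path:
    """Write a JSON payload, creating missing parent directories."""
    target = Path(path)
    # Callers no longer need to mkdir first.
    target.parent.mkdir(parents=True, exist_ok=True)
    # One serialisation pass, terminated by a single trailing newline.
    target.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return target
```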

Builds on PolicyEngine#274's bundle-level TRO and closes the gaps that would surface at an AEA replication review:
- schema:creator is now a schema.org Organization, not a version string
- model wheel is hashed as a fourth composition artifact (read from the manifest when present, fetched from the PyPI JSON API otherwise and degrades silently when unreachable)
- every trov:path resolves over HTTPS (Hugging Face resolve URLs, PyPI download URL) so a reviewer can dereference the TRO without custom clients
- certification metadata moves from prose in schema:description to structured pe:* fields on TrustedResearchPerformance (pe:certifiedForModelVersion, pe:compatibilityBasis, pe:builtWithModelVersion, pe:dataBuildFingerprint, pe:dataBuildId)
- GitHub Actions runs add pe:ciRunUrl / pe:ciGitSha attestation
- JSON Schema ships at data/schemas/trace_tro.schema.json and every generated TRO is validated against it in tests

Adds the per-simulation layer that the bundle-level TRO doesn't cover:
- build_simulation_trace_tro chains a bundle TRO to a reform + results
- policyengine.results.build_results_trace_tro / write_results_with_trace_tro emit a TRO alongside a ResultsJson payload

Wiring:
- policyengine trace-tro CLI (plus release-manifest subcommand)
- TaxBenefitModelVersion.trace_tro property and the build_trace_tro_from_release_bundle / compute_trace_composition_fingerprint / serialize_trace_tro / extract_bundle_tro_reference / build_simulation_trace_tro re-exports from policyengine.core that were dropped when PolicyEngine#276 merged
- scripts/generate_trace_tros.py regenerates bundled TROs before a policyengine.py release
- jsonschema added to dev dependencies

Restores the TRACE TRO tests that PolicyEngine#276 removed as part of the test_release_manifests.py rewrite, now isolated in tests/test_trace_tro.py with coverage for determinism, schema conformance, CI attestation, and per-simulation chaining.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
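As a rough illustration of the CI attestation mentioned above, the sketch below derives pe:ciRunUrl and pe:ciGitSha from the environment variables GitHub Actions sets by default (GITHUB_ACTIONS, GITHUB_SHA, GITHUB_SERVER_URL, GITHUB_REPOSITORY, GITHUB_RUN_ID). The function name `ci_attestation` and its return shape are assumptions; the module's own helper may differ.

```python
import os
from typing import Dict, Mapping


def ci_attestation(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return pe:ci* fields when running under GitHub Actions, else {}."""
    if env.get("GITHUB_ACTIONS") != "true":
        return {}
    fields: Dict[str, str] = {}
    sha = env.get("GITHUB_SHA")
    if sha:
        fields["pe:ciGitSha"] = sha
    server = env.get("GITHUB_SERVER_URL", "https://github.com")
    repo = env.get("GITHUB_REPOSITORY")
    run_id = env.get("GITHUB_RUN_ID")
    if repo and run_id:
        fields["pe:ciRunUrl"] = f"{server}/{repo}/actions/runs/{run_id}"
    return fields
```

Taking the environment as a parameter keeps the function easy to test with a plain dict instead of patching `os.environ`.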

Switches PEP 604 `X | None` unions in `ResultsMetadata` and `ResultsJson.write` to `Optional[X]` / `Union[X, Y]`, matching the project-wide pattern enforced for the 3.9 floor (ruff `UP007` is disabled for the same reason in `pyproject.toml`). Without this fix the `content-pipeline-results` branch fails `ResultsMetadata` class construction on Python 3.9 with `TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
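A small sketch of the 3.9-compatible spelling, using a plain dataclass rather than the Pydantic model; the `commit` field and the `resolve_path` helper are hypothetical and only there to show both `Optional[X]` and `Union[X, Y]`.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union


@dataclass
class Metadata:
    title: str
    repo: str
    # On Python 3.9, writing `str | None` here is evaluated when the class
    # body runs and raises TypeError; Optional[str] works on 3.9 and later.
    commit: Optional[str] = None  # hypothetical field


def resolve_path(path: Union[str, Path]) -> Path:
    """Accept either a string or a Path, spelled without PEP 604 syntax."""
    return Path(path)
```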

Collapses string concatenations that the ruff 0.15.11 formatter in CI wants unified onto single lines. No behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Round two of reviewer fixes. The published TRACE/TROv reference demos use a different vocabulary than the draft this module was originally written against; reviewers caught that our emission would not validate against real TROv SHACL shapes.

TROv vocabulary conformance:
- Switch to the public namespace https://w3id.org/trace/2023/05/trov#
- Flatten the locally-invented trov:hash / trov:hashAlgorithm / trov:hashValue wrapper to the vocabulary-native trov:sha256 property
- Rename trov:path -> trov:hasLocation on ArtifactLocation
- Rename the inverse pointer to trov:hasArtifact (was trov:artifact)
- Correct TrustedResearchSystem -> TransparentResearchSystem
- Correct TrustedResearchPerformance -> TransparentResearchPerformance
- Drop the locally-invented ArrangementBinding chain; use the vocabulary-native trov:accessedArrangement on the TRP instead
- Emit @type as a single string (not a 2-element array), matching the published trov-demos reference shape

Hardening from reproducibility + code-simplifier reviewers:
- pe:emittedIn is always present ("local" or "github-actions") so a verifier can tell a CI-emitted TRO from a laptop rebuild without inferring from absent fields
- Per-simulation TRO records pe:bundleTroUrl on the performance node; a verifier can fetch that URL, re-hash it, and confirm it matches the bundle_tro artifact hash - so swapping the caller's bundle_tro dict is detectable
- Composition fingerprint joins hashes with \n to prevent hex-length concatenation collisions (sha256("ab" + "cdef") vs "abcd" + "ef")
- CertifiedDataArtifact.sha256 is now authoritative when present; us.json ships the real dataset sha256, so bundle TRO emission no longer requires the data release manifest to carry it
- JSON Schema rejects non-HTTPS trov:hasLocation values and requires canonical 64-hex sha256 strings
- Inline the real policyengine-us 1.647.0 / policyengine-uk 2.88.0 wheel sha256 + URL on us.json/uk.json

Extracted shared helpers to collapse the ~120-line duplication between build_trace_tro_from_release_bundle and build_simulation_trace_tro (_assemble_composition_and_arrangement, _assemble_tro_node, _policyengine_trs, _build_bundle_performance).

Removed dead code flagged by simplifier:
- DataReleaseArtifact.https_uri (zero callers, zero tests)
- _data_release_manifest_url (replaced by https_release_manifest_uri)
- Prose certification_description_parts (metadata is now purely in pe:* structured fields, as the commit message for PolicyEngine#274 originally claimed)

CLI + release workflow:
- Dropped the broken --offline flag (never had a working code path)
- Added policyengine trace-tro-validate <path> subcommand that validates a TRO against the shipped JSON Schema
- Versioning CI job now runs scripts/generate_trace_tros.py and commits the generated bundled TROs alongside the changelog, so every released wheel ships with its matching TRACE TRO
- generate_trace_tros.py skips (with warning) countries whose data release manifest is unreachable instead of hard-failing

Tests (34 total in tests/test_trace_tro.py, replacing the prior 20):
- Real determinism: build TRO from two fresh manifest instances, assert bytes equal (previously tested only that json.dumps is deterministic)
- Forgery detection: swap bundle_tro, assert hash in sim TRO changes
- Schema rejects file:// locations
- Schema rejects missing pe:emittedIn
- Hex-length ambiguity test for the fingerprint separator
- All 4 TROv property renames have explicit assertions so a future regression to the wrong names fails loudly
- trace-tro-validate CLI accepts valid TROs and rejects invalid ones

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
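The fingerprint separator point is easy to demonstrate. The sketch below is illustrative only: the function names are made up, and whether the real compute_trace_composition_fingerprint sorts or otherwise normalises its inputs is not stated in this PR.

```python
import hashlib
from typing import Sequence


def fingerprint_concat(parts: Sequence[str]) -> str:
    """Ambiguous: part boundaries are lost before hashing."""
    return hashlib.sha256("".join(parts).encode("utf-8")).hexdigest()


def fingerprint_joined(parts: Sequence[str]) -> str:
    """Unambiguous for newline-free parts: '\n' marks each boundary."""
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# Different splits of the same characters collide without a separator...
assert fingerprint_concat(["ab", "cdef"]) == fingerprint_concat(["abcd", "ef"])
# ...and stay distinct once the boundary is part of the hashed input.
assert fingerprint_joined(["ab", "cdef"]) != fingerprint_joined(["abcd", "ef"])
```

Hex digests never contain a newline, so the separator cannot be confused with hash content.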


Reviewers came back accept / clean / minor revisions; this commit picks up the remaining suggestions.

Forgery resistance:
- Bundle TRO takes an optional self_url and records it as pe:selfUrl so a verifier with only the bundle bytes can discover the canonical location it was published at.
- write_results_with_trace_tro now requires bundle_tro_url, not merely accepts it. A published simulation TRO that omitted the URL would leave reviewers without a pinned fetch target; raising when the caller forgets matches the "adversarial reviewer" expectation.
- docs/release-bundles.md shows the three-step verifier workflow a replication reviewer should run: fetch pe:bundleTroUrl, recompute its sha256, compare to the sim TRO's bundle_tro artifact hash, confirm pe:bundleFingerprint matches the bundle's own CompositionFingerprint. A sim TRO with a swapped bundle_tro dict but a truthful URL fails step 2; both-swapped fails step 3.

CI regression guard:
- scripts/generate_trace_tros.py now exits non-zero if a country that previously shipped a .trace.tro.jsonld fails to regenerate (e.g. HUGGING_FACE_TOKEN expired). The Versioning CI job will block a release rather than silently ship a stale TRO.

Schema tightening:
- trov:hasLocation regex now anchors end-of-string on every legal local path and restricts data/ to data/release_manifests/<country>. data/../../etc/passwd and bundle.trace.tro.jsonld.evil no longer pass. HTTPS locations must contain no whitespace.
- Added a test covering the multi-node @graph path after filter fix.

extract_bundle_tro_reference filter:
- Locates the trov:TransparentResearchObject node by @type rather than trusting @graph[0]. Future TROs that embed TRS/TSA nodes no longer break reference extraction.

Dead-kwarg cleanup (simplifier):
- Dropped emission_context kwarg from both public builders; tests use monkeypatch on GITHUB_ACTIONS/GITHUB_SHA instead, which is closer to what CI does anyway.
- Dropped tro_id / composition_id / arrangement_id default kwargs from the helpers; hardcoded as module constants.
- Dropped the bundle_tro_path branch from write_results_with_trace_tro: no caller, no test, no actual use case.

Tests (38 total in test_trace_tro.py):
- test__given_fixed_ci_env__then_tro_bytes_match_across_builds locks down determinism under CI with pinned run_id/git_sha
- test__given_self_url__then_tro_records_it covers pe:selfUrl
- test__given_graph_with_multiple_nodes__then_extract_finds_tro exercises the @type filter
- test__given_write_helper_without_url__then_raises locks the required-kwarg contract

Docstring caveat on build_trace_tro_from_release_bundle now states explicitly that pe:compatibilityBasis covers the model and data layers only; Python version, OS, and transitive lockfile are not yet pinned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
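To make the @type filter concrete, here is a minimal sketch of locating the TRO node in a JSON-LD @graph by type instead of by position. The helper name `find_tro_node` is hypothetical, and the real extract_bundle_tro_reference does more than this; the list-valued @type branch is a defensive assumption, since the commits state that @type is emitted as a single string.

```python
from typing import Any, Dict, Optional

TRO_TYPE = "trov:TransparentResearchObject"


def find_tro_node(document: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the first @graph node typed as a TRO, or None if absent."""
    for node in document.get("@graph", []):
        node_type = node.get("@type")
        # Accept both a single string and a list of type strings.
        types = node_type if isinstance(node_type, list) else [node_type]
        if TRO_TYPE in types:
            return node
    return None
```

With this approach, a document whose first @graph entry is a trov:TransparentResearchSystem node still resolves to the TRO node that follows it.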

Last two simplifier nits from round 3:
- _build_bundle_performance no longer takes emission_context as a kwarg; like the sim builder, it calls _emission_context() inline at the end of performance construction. One fewer parameter, same ordering behaviour, matches the sim-side pattern.
- write_results_with_trace_tro no longer passes the URL to both bundle_tro_location and bundle_tro_url; the build_simulation_trace_tro fallback (bundle_tro_location or bundle_tro_url or <default>) picks the URL up on its own.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PavelMakarchuk and others added 9 commits February 23, 2026 19:46

Merge remote-tracking branch 'origin/main' into content-pipeline-results

47b3d63

Apply ruff format to results module

2331883


MaxGhenis marked this pull request as ready for review April 18, 2026 15:13

MaxGhenis merged commit 26c372c into PolicyEngine:main Apr 18, 2026
12 checks passed

MaxGhenis merged 9 commits into PolicyEngine:main from PavelMakarchuk:content-pipeline-results
