Roadmap
Auto-generated mirror. This page mirrors docs/ROADMAP.md, the canonical source of truth. Do not edit this file directly; edit docs/ROADMAP.md and re-run scripts/wiki/sync_mirrors.py.
Last updated: v0.10.9 (June 2026).
This roadmap synthesizes community feedback with the architecture plan at the project root. Versions v0.3.0 through v0.7.16 + v0.8.0-v0.8.7 + v0.9.0-v0.9.9 + v0.10.0-v0.10.8 have shipped; v0.10.9 is the current
dev cycle (a debt + robustness patch closing the v0.10.8 ship findings
and hardening the release machinery that cycle built, before the v0.11
federal-compliance theme). v0.9.0 opened the
v0.9.x "federal compliance" line with POA&M + CONMON read-only
library; v0.9.1 landed the Polycentric Labs org migration; v0.9.2
added the CONMON REST router + federal corpus + LLM rater + federal
walk-through scenarios; v0.9.3 was the largest minor of the line —
CONMON daemon (Theme A) + AI governance (Theme B); v0.9.4 was the
consolidation pass closing deferred review items + the federal-SI
walk-through opener; v0.9.5 landed walk-through refinement +
collaboration-primitives groundwork; v0.9.6 brought the federal
expansion (WORM evidence store + CLI RBAC + CONMON MCP first-mover);
v0.9.7 was the comprehensive v0.9.x close-out + v1.0 prep
(api-stability NORMATIVE + multi-tenant RBAC + CIMD-signature
groundwork); v0.9.8 wired those primitives into live CLI, REST,
MCP-dispatch, and storage surfaces; v0.9.9 was a supply-chain
hygiene + gate-fidelity patch (paramiko CVE closure + an
osv-scanner --sbom pre-push gate + a full Dependabot-queue clear).

v0.10.0 opens the v0.10.x research-driven integration line: the OCSF-aligned findings schema (the keystone identified by the 2026-05-21 competitive/integration research pass), a bidirectional OCSF Compliance Finding mapping layer (evidentia_core.ocsf, behind the new [ocsf] extra), SARIF 2.1.0 output for evidentia gap (runs gap analysis as a CI gate, surfaced in GitHub code scanning + GitLab security dashboards), and 3 pilot collectors (AWS, GitHub, Postgres) populating the new fields.

v0.10.1 consolidates the integration line on the same calendar day as v0.10.0: it closes both v0.10.0 pre-release-review findings (F-V100-L1 trust-boundary on the OCSF unmapped["evidentia"] block via a new trust_unmapped=False parameter; F-V100-M1 bump_version.py over-bumping third-party pins via a [tool.uv.sources] workspace allowlist), ships the deferred third-party OCSF ingestion collector with a Detection Finding path for Prowler / AWS Security Hub, extends compliance_status population to the remaining 11 collectors, and introduces the Finding class-name alias plus evidentia collect convert --format ocsf.

v0.10.2 brings the integration line into AI clients: 4 new MCP tools (gap_analyze_sarif, collect_ocsf, tprm_vendor_list, poam_list) expand the §MCP tool contract from 8 → 12; a GRC Engineering Club marketplace plugin is staged in-repo (marketplace/grc-engineering-suite/plugins/evidentia/; generalist OSS scope per the first concrete v0.10.x plugin scope decision, with persona-tied workflows kept out of scope for the OSS plugin); and F-V101-L1 (the v0.10.1 SSRF surface on collect ocsf URL mode) is closed via a new default-on --block-private-ips flag.

Per the v1.0 master-plan resequencing (2026-05-21), the v0.9.x and v0.10.x lines iterate as many times as needed toward a solid product: the operator self-test and demo/pitch recording precede the walk-throughs and multi-reviewer peer review, which complete before v1.0.0.
See v1.0-transition.md for the v1.0 narrative and acceptance gates.
How Evidentia is shown to evaluators, decided after a structured multi-model + primary-source review of how comparable open-source GRC / security tools handle demos.
Principle: Evidentia is a stateful, credentialed compliance platform, not a stateless widget. Its differentiators (OSCAL emit + verify, Sigstore signing, the collector suite, the MCP server) live in the CLI / library, which is also air-gap-native. So the showcase leads with the CLI.
- CLI-first assets (now): a tight README quickstart (pip install -> init -> gap analyze -> OSCAL emit, with the bundled sample inventory), an asciinema cast, and a short walkthrough video. Air-gap-consistent; shows real output and the real differentiators.
- Clickable demo (planned): an in-browser terminal (Killercoda / Instruqt-style) running the real evidentia CLI in an ephemeral sandbox. The user drives the actual tool; Evidentia hosts no state. Best fit for a CLI / library-shaped tool.
- Local writable GUI demo (deferred until CLI<->GUI parity): a one-command docker compose -f docker-compose.demo.yml up with a seeded store + mock collectors. Deferred until the web console reaches CLI parity (it currently surfaces a subset of the CLI); requires a store-seeder.
- Not planned: a public, hosted, stateful backend demo. A credentialed GRC backend exposed publicly is a real security surface (SSRF / secret exfiltration / prompt-injection via collected evidence, the class the --block-private-ips hardening addresses). A durable hosted experience is reserved for a future managed / commercial edition.
Validated via a structured research pass (multi-model fleet + a primary-source survey of comparable tools + a 3-way adversarial validation).
- evidentia gap diff: compare two gap snapshots and classify every gap as opened / closed / severity-changed / unchanged. Supports console / json / markdown / github output formats. --fail-on-regression blocks PRs that make compliance posture worse.
- evidentia explain <control_id>: LLM-generated plain-English control translation, cached on disk.
- Documentation: docs/github-action/README.md + example workflow YAML so anyone can drop a .github/workflows/evidentia.yml into their repo and get PR-level compliance checking without waiting for the reusable-action wrapper.
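The four-way classification behind gap diff can be sketched in a few lines. This is a minimal illustration, not the shipped implementation: the snapshot shape (control ID mapped to a severity string), the severity scale, and the names classify_gaps / has_regression are all assumptions, and "regression" is taken here to mean a newly opened gap or a gap whose severity increased.

```python
from enum import Enum

# Hypothetical severity ordering; the real scale is not given in this roadmap.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


class Change(str, Enum):
    OPENED = "opened"
    CLOSED = "closed"
    SEVERITY_CHANGED = "severity-changed"
    UNCHANGED = "unchanged"


def classify_gaps(base: dict[str, str], head: dict[str, str]) -> dict[str, Change]:
    """Classify every gap seen in either snapshot (control_id -> severity)."""
    result: dict[str, Change] = {}
    for control_id in sorted(base.keys() | head.keys()):
        if control_id not in base:
            result[control_id] = Change.OPENED
        elif control_id not in head:
            result[control_id] = Change.CLOSED
        elif base[control_id] != head[control_id]:
            result[control_id] = Change.SEVERITY_CHANGED
        else:
            result[control_id] = Change.UNCHANGED
    return result


def has_regression(base: dict[str, str], head: dict[str, str]) -> bool:
    """Assumed rule: posture is worse if a gap opened or became more severe."""
    for control_id, change in classify_gaps(base, head).items():
        if change is Change.OPENED:
            return True
        if (
            change is Change.SEVERITY_CHANGED
            and SEVERITY_RANK[head[control_id]] > SEVERITY_RANK[base[control_id]]
        ):
            return True
    return False


base = {"AC-2": "high", "AU-3": "low", "SC-28": "medium"}
head = {"AC-2": "critical", "SC-28": "medium", "IA-2": "high"}
diff = classify_gaps(base, head)
```

A CI gate in the spirit of --fail-on-regression would exit non-zero whenever has_regression returns True.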
- Three realistic end-to-end scenarios in examples/ (Meridian fintech v2, Acme Healthtech, Northstar DoD contractor).
- Dogfooded GitHub Action workflow (.github/workflows/evidentia.yml).
- Fixed _is_open bug on the in-memory gap-diff path.
- 392 passing tests.
The audience shift from security engineers (CLI) to compliance officers and auditors (web UI). Three coordinated deliverables:
FastAPI backend + React/Vite/shadcn/ui frontend, served together from
127.0.0.1:8000. Non-technical users install via
uv tool install "evidentia[gui]" or
pip install "evidentia[gui]", then run evidentia serve and
get a polished localhost-only dashboard.
Shipped:
- evidentia serve CLI command
- New workspace package evidentia-api with 18 REST endpoints under /api/*
- New workspace directory evidentia-ui (Vite + React + shadcn/ui)
- Every user-facing page:
  - Home with three-path onboarding wizard (sample data / upload / wizard)
  - Dashboard: saved-report listing with top-line metrics
  - Frameworks (list + detail): 82-catalog browser with tier / category / search filters
  - Gap Analyze: interactive form → TanStack Table results
  - Gap Diff: two-report picker → summary + per-entry table
  - Risk Generate: SSE-streamed per-gap progress
  - Settings: editable evidentia.yaml + LLM provider / air-gap posture
- Hatchling build hook that bundles the SPA into the Python wheel
- 36 FastAPI TestClient + 6 Vitest tests
- Playwright E2E smoke test against evidentia serve
- "Commit to disk" button on the wizard preview (auto-write the three YAMLs to the CWD after confirmation)
- Deeper component test coverage (AppLayout, PathChooser, GapTable)
- Auto-generated TypeScript types from FastAPI's OpenAPI schema
Stack: React 18 + TypeScript strict + Vite 5 + shadcn/ui (Radix primitives -> WCAG 2.1 AA) + TanStack Query / Table / Virtual + React Router 6 + Zustand + React Hook Form + Zod + Recharts.
Global CLI flag plus evidentia doctor --check-air-gap validator.
Every LLM / network call consults the evidentia_core.network_guard
module; non-loopback / non-RFC-1918 targets raise
OfflineViolationError before any network IO fires.
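A minimal sketch of the kind of host classification such a guard performs, assuming the check runs on the literal target host before any socket is opened. The function names are hypothetical and OfflineViolationError here is a local stand-in, not the class from evidentia_core.network_guard. Refusing unresolved hostnames is an assumption of this sketch (resolving them would itself be network IO); the real module may treat them differently.

```python
import ipaddress


class OfflineViolationError(RuntimeError):
    """Local stand-in for the guard's exception type."""


# The three RFC 1918 private IPv4 ranges.
_RFC1918 = tuple(
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)


def is_allowed_host(host: str) -> bool:
    """True for loopback or RFC 1918 targets, False for everything else."""
    if host.lower() == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch refuses rather than resolving via DNS.
        return False
    if addr.is_loopback:
        return True
    return addr.version == 4 and any(addr in net for net in _RFC1918)


def check_target(host: str, offline: bool) -> None:
    """Raise before any network IO when offline mode forbids the target."""
    if offline and not is_allowed_host(host):
        raise OfflineViolationError(f"offline mode: refusing to contact {host!r}")
```

Calling check_target at the top of every outbound code path keeps the decision in one place, which is what makes a doctor-style validator feasible.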
Positioning: "The only open-source GRC tool that runs entirely on your infrastructure. Use with Ollama for fully air-gapped FedRAMP, CMMC, and healthcare deployments."
Shipped: flag, guard module, doctor validator, LLM client integration, 43 unit tests covering the host classifier and guard functions. The UI Settings page surfaces the posture live. GUI-triggered offline-toggle is planned for v0.4.2.
allenfbyrd/evidentia-action is live at v1.0.0 + floating v1
pointer. Consumers replace the 80-line drop-in workflow template with:

    - uses: allenfbyrd/evidentia-action@v1
      with:
        inventory: my-controls.yaml
        frameworks: nist-800-53-rev5-moderate,soc2-tsc
        fail-on-regression: true

Submission to the GitHub Actions Marketplace is a manual UI step in the repo settings; the listing is pending final screenshots before publication.
First three real integrations. These shipped as empty shells all the way back to v0.1.0; v0.5.0 wires them up. What landed:
Push gaps as Jira issues + bidirectional status sync. When a Jira
issue transitions to Done, the linked gap's status becomes REMEDIATED
on the next sync. Full workflow-name mapping (To Do, In Progress,
Done, Won't Do, + common customizations). Credentials via env vars
only; no secrets ever flow through Evidentia REST responses.
CLI: evidentia integrations jira {test,push,sync,status-map}.
REST: /api/integrations/jira/{status,push/{key},sync/{key},status-map}.
Auto-evidence from AWS Config + Security Hub. Covers NIST 800-53
AC / IA / SC / AU / CM / CP / SI families for cloud-native
deployments. Curated mapping of 25+ Config rules + FSBP / CIS
standards controls; unknown sources fall back to empty control_ids
rather than speculative attribution.
Credentials via standard boto3 chain. Unit tests use MagicMock paginators (Config) + controlled responses (Security Hub); integration-test-level moto coverage lands in v0.5.1.
CLI: evidentia collect aws [--region] [--profile].
REST: POST /api/collectors/aws/collect.
Branch protection + CODEOWNERS + repo visibility findings mapped to SA-11 (developer security testing), CM-2/CM-3 (baseline + change control), AC-3/AC-6 (access enforcement), SI-2 (flaw remediation). Zero extra deps — uses httpx directly rather than pulling in PyGithub.
CLI: evidentia collect github --repo owner/repo.
REST: POST /api/collectors/github/collect.
The six old PyPI names (controlbridge, controlbridge-core,
controlbridge-ai, controlbridge-api, controlbridge-collectors,
controlbridge-integrations) released at v0.5.1 as transitional shims
that emit a DeprecationWarning on import and forward every attribute
and submodule to their evidentia-* replacements via sys.modules
aliasing. Scheduled for PyPI yank at v0.7.0 (~October 2026).
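The sys.modules aliasing technique can be illustrated with throwaway in-memory modules. This is a sketch under stated assumptions: install_shim and the newpkg_demo / oldpkg_demo names are hypothetical, and the loop only aliases submodules that are already imported, whereas a production shim would need to import or lazily forward the rest.

```python
import importlib
import sys
import types
import warnings


def install_shim(old_name: str, new_name: str) -> None:
    """Warn, then point old_name (and its submodules) at the new package."""
    warnings.warn(
        f"'{old_name}' is deprecated; import '{new_name}' instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    prefix = new_name + "."
    for name, module in list(sys.modules.items()):
        if name == new_name or name.startswith(prefix):
            # Same module object under both names, so attributes stay in sync.
            sys.modules[old_name + name[len(new_name):]] = module


# Demo setup: a fake "new" package with one submodule (hypothetical names).
new_pkg = types.ModuleType("newpkg_demo")
new_pkg.VERSION = "1.0"
new_models = types.ModuleType("newpkg_demo.models")
new_pkg.models = new_models
sys.modules["newpkg_demo"] = new_pkg
sys.modules["newpkg_demo.models"] = new_models

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    install_shim("oldpkg_demo", "newpkg_demo")

# Old import paths now resolve to the very same module objects.
old_models = importlib.import_module("oldpkg_demo.models")
```

Because both names map to one module object, state set through the old name is visible through the new one, which is the property that lets callers migrate incrementally.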
The v0.5.0 name collided with controlbridge.ai — a live commercial
SOX 302/404 compliance platform. v0.6.0 renamed the project end-to-end:
PyPI packages (6 names), GitHub repo, CLI entry point, config file
(controlbridge.yaml → evidentia.yaml), frontend npm scope, and
all docs. No functional changes. See docs/archive/RENAMED.md for
the full rationale, CHANGELOG.md § 0.6.0 for the mechanical details,
and the standing_rule_github_repo_names.md memory note for the absolute
rule protecting the GitHub URL redirect.
The "enterprise-grade" release. Closes all 10 BLOCKER items in
docs/enterprise-grade.md and ships the
end-to-end supply-chain hardening narrative:
- Evidence integrity: SHA-256 digests on every embedded resource in OSCAL Assessment Results back-matter; optional GPG signing (air-gap path) or Sigstore/Rekor signing (online path, OIDC-keyless via Fulcio).
- Verification: evidentia oscal verify checks digests + GPG .asc + Sigstore .sigstore.json bundles end-to-end. --require-signature is satisfied by either GPG or Sigstore. --expected-identity / --expected-issuer enforce signer identity for production audit pipelines.
- CycloneDX SBOM: generated from uv.lock on every release, attached to the GitHub Release alongside the wheels.
- PyPI Trusted Publisher (OIDC): long-lived PYPI_API_TOKEN removed; release publishes are signed via the workflow's ambient OIDC identity. Auto-enables PEP 740 attestations on every wheel + sdist (Sigstore-signed, Rekor-logged).
- OSCAL schema conformance: compliance-trestle>=4.0 round-trip in CI catches unknown-field bugs that NIST's JSON Schema misses.
- AWS IAM Access Analyzer + GitHub Dependabot collectors with explicit BLIND_SPOTS disclosure lists threaded into the AR back-matter for auditor transparency.
- ECS-8.11 / NIST AU-3 / OpenTelemetry structured logs via --json-logs. Drop-in for Splunk / Elastic / Datadog / Sentinel.
- Secret scrubber covers AWS / GitHub / Slack / Stripe / Google / npm tokens + JWTs + generic password= patterns.
- Consolidated GitHub Action at .github/actions/gap-analysis/ (replaces the archived allenfbyrd/evidentia-action repo).
- 6 controlbridge-* deprecation shims removed from the workspace per the public migration contract from v0.6.0.
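The digest half of the integrity story reduces to recomputing SHA-256 over each embedded resource and comparing it with the recorded value. The sketch below assumes a deliberately simplified resource shape (uuid, content_b64, sha256); it does not reproduce the actual OSCAL back-matter layout or the evidentia oscal verify code path, and it does not cover signature checks.

```python
import base64
import hashlib
import hmac


def verify_resources(resources: list[dict[str, str]]) -> list[str]:
    """Return the uuids of resources whose content does not match its digest."""
    failures = []
    for resource in resources:
        content = base64.b64decode(resource["content_b64"])
        actual = hashlib.sha256(content).hexdigest()
        # compare_digest avoids leaking how many leading characters matched.
        if not hmac.compare_digest(actual, resource["sha256"].lower()):
            failures.append(resource["uuid"])
    return failures


def make_resource(uuid: str, content: bytes) -> dict[str, str]:
    """Build a resource record in the simplified shape assumed above."""
    return {
        "uuid": uuid,
        "content_b64": base64.b64encode(content).decode("ascii"),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


good = make_resource("res-1", b"evidence: branch protection enabled")
tampered = make_resource("res-2", b"evidence: MFA enforced")
# Simulate tampering: swap the content after the digest was recorded.
tampered["content_b64"] = base64.b64encode(b"evidence: MFA disabled").decode("ascii")

failed = verify_resources([good, tampered])
```

A digest alone only detects accidental or unsophisticated modification; the GPG or Sigstore signature is what binds the digests to a signer.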
The release was preceded by a 6-step comprehensive pre-tag review
(see docs/positioning-and-value.md,
docs/capability-matrix.md,
docs/v0.7.1-plan.md).
857 tests passing; mypy strict clean; ruff lint clean; all 10
BLOCKER items in docs/enterprise-grade.md closed.
The "AI features hardening" release. Brings evidentia-ai
(risk_statements/ + explain/) up to the v0.7.0 collector-pattern
enterprise grade — closing the v0.7.0 BLOCKER B3 carry-over for both
AI subsystems:
- GenerationContext Pydantic model in evidentia_core.audit.provenance, sibling of CollectionContext. Captures per-output AI provenance: model, temperature, prompt_hash (SHA-256), run_id (ULID), generated_at, attempts, instructor_max_retries, credential_identity (best-effort operator label per NIST AU-3), evidentia_version. Optional field on RiskStatement and PlainEnglishExplanation (default None for v0.7.x backward compat; will tighten to required in v0.8 with deprecation cycle).
- 9 new EventAction entries under the evidentia.ai.* namespace (AI_RISK_* + AI_EXPLAIN_* covering generated/failed/retry/cache_hit/batch_completed).
- Typed exception hierarchy in evidentia_ai.exceptions (EvidentiaAIError, LLMUnavailableError, LLMValidationError, RiskStatementError, RiskGenerationFailed, ExplainError, ExplainGenerationFailed); closes BLOCKER B3 for both AI subsystems.
- Bounded retry against shared LLM_TRANSIENT_EXCEPTIONS via the new with_retry_async decorator + build_retrying / build_async_retrying factory functions in evidentia_core.audit.retry. AI generators pass AI_RISK_RETRY / AI_EXPLAIN_RETRY so SIEM operators can filter retry storms by namespace.
- Audit-trail correlation: every AI_* event carries run_id (and inherited trace.id from the run_id scope), so SIEM queries on evidentia.run_id surface failures + successes + retry storms attributable to the same batch.
- Best-effort operator identity via evidentia_ai.client.get_operator_identity() (returns $EVIDENTIA_AI_OPERATOR if set, else user@hostname). Closes the NIST AU-3 "Identity" gap for AI-derived artifacts.
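Two of the provenance fields are simple enough to sketch with the standard library: the prompt hash and the best-effort operator identity. The dataclass below is a reduced stand-in for the Pydantic GenerationContext (only four of its fields), build_context is a hypothetical helper, and the "unknown" fallback user is an assumption of this sketch rather than documented behavior.

```python
import getpass
import hashlib
import os
import socket
from dataclasses import dataclass


def get_operator_identity() -> str:
    """$EVIDENTIA_AI_OPERATOR if set, else a best-effort user@hostname."""
    explicit = os.environ.get("EVIDENTIA_AI_OPERATOR")
    if explicit:
        return explicit
    try:
        user = getpass.getuser()
    except Exception:
        # Containers without a passwd entry can fail here; assumed fallback.
        user = "unknown"
    return f"{user}@{socket.gethostname()}"


def prompt_hash(prompt: str) -> str:
    """SHA-256 of the prompt text, so the prompt itself need not be stored."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class GenerationContextSketch:
    """Reduced stand-in; the real model carries more fields."""
    model: str
    temperature: float
    prompt_hash: str
    credential_identity: str


def build_context(model: str, temperature: float, prompt: str) -> GenerationContextSketch:
    return GenerationContextSketch(
        model=model,
        temperature=temperature,
        prompt_hash=prompt_hash(prompt),
        credential_identity=get_operator_identity(),
    )
```

Hashing the prompt rather than embedding it lets an auditor confirm that two outputs came from the same prompt without the artifact exposing prompt contents.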
Shipped as P0-only by deliberate scope-narrowing decision at ship
time. P1 (supply-chain polish — SHA-pin composite action, action E2E
smoke test, SLSA L3 build provenance, OpenSSF Scorecard) and P2/P3
(documentation polish + community-driven items) moved to
docs/v0.7.2-plan.md so v0.7.1 could land
focused on the BLOCKER B3 closure without scope creep.
973 tests collected (965 passed + 8 environmental skips on local Windows; 8 skips are GnuPG entropy + Sigstore CI-OIDC-only and pass on Linux CI per the v0.7.0 baseline); mypy strict clean (98 source files); ruff lint clean.
The "supply-chain polish + documentation refresh" release. What landed:
- OpenSSF Scorecard weekly workflow: .github/workflows/scorecard.yml publishes to securityscorecards.dev on Mondays + push-to-main. Surfaces ~20 supply-chain checks (Pinned-Dependencies, Branch-Protection, Code-Review, SBOM, Signed-Releases, etc.). v0.7.0 work covers most baseline checks; v0.7.3 S1 SHA-pinning will improve Pinned-Dependencies.
- IDE setup for testing/validation: version-controlled .vscode/{settings,launch,tasks,extensions}.json + .cursorrules + .editorconfig + docs/ide-setup.md walkthrough. Both Cursor and VS Code share the same config; pytest discovery / mypy strict / ruff format-on-save / coverage gutters / 7 debug launch configs / 16 pre-canned tasks. Pre-commit hooks + dev container queued for v0.7.3 (DOC6 + DOC7).
- Catalog-drift false positive fix: closes daily-noise issues #1, #2, #3, #4 opened by catalog-refresh.yml between 2026-04-23 and 2026-04-26. Pinned yaml.safe_dump(width=200) for byte-stable manifest emit + --ignore-all-space belt-and-suspenders workflow guard.
- Pre-release-review refinements: 4 MEDIUM doc/config polish fixes from the v0.7.2 comprehensive pre-tag review (DORA past-tense, doc stamp date, Windows venv path removal, regen stderr warning).
- Scratch-directory convention: .gitignore adds .local/ for per-developer working notes and drafts not ready to share.
Shipped without the originally-scoped P0 supply-chain items
(SHA-pinning, action E2E smoke test, SLSA L3) — those moved to
docs/v0.7.3-plan.md along with the originally-scoped
docs polish (sigstore-quickstart, v0.8.0-plan, etc.). See the v0.7.2
plan's "Deferred to v0.7.3" section for the full carry-forward
inventory.
965 tests passing + 8 environmental skips on local Windows (GnuPG entropy + Sigstore CI-OIDC; full pass on Linux CI per v0.7.1 baseline); mypy strict clean (98 source files); ruff lint clean.
See docs/v0.7.3-plan.md for the full plan. Theme:
finishes the v0.7.1-plan-originated supply-chain items that didn't
make v0.7.2. P0 SHIPPED: SHA-pin every third-party action across the
composite action + every workflow file (28 pinned refs), composite
action E2E smoke test workflow against the Meridian fixture, SLSA L3
build provenance via actions/attest-build-provenance@v2.4.0. P1
SHIPPED: release-checklist verifier-note refresh, docs/v0.8.0-plan.md
forward release plan, docs/sigstore-quickstart.md end-to-end
walkthrough, architecture-plan "Updates since v0.7.0" callout block,
.pre-commit-config.yaml + companion .yamllint + .markdownlint.yaml,
.devcontainer/devcontainer.json. DOC5 quarterly positioning re-sync
deferred to v0.7.4+ (Q3 cadence). Audit-cleanup items A6 README
truncation + A10 CITATION.cff + B4 release-checklist refresh + A3
frontend dev-stack CVE bumps (vite + vitest + plugin-react) +
B2 lightweight container image (Dockerfile + non-publishing CI smoke
test) all landed. P2 community items (Okta, ServiceNow, Vanta/Drata,
OSCAL Plugfest, multi-industry sample data) carry forward to v0.7.4+.
Same-day patch correcting three wrong CLI invocations shipped in
v0.7.3's container-image work + an additional pre-existing latent
same-pattern bug in the composite action's install step (latent
since v0.7.0; never surfaced because the composite action was
never externally consumed in CI before v0.7.3). The Evidentia CLI
registers version as a SUBCOMMAND (alongside init, doctor,
serve, gap, catalog, risk, etc.) — not as a --version
flag. Similarly the framework-catalog subcommand is evidentia catalog (not evidentia frameworks). Adds a "local Docker build"
line to docs/release-checklist.md Step 5 so future
Dockerfile-touching releases catch this class of bug pre-tag.
All v0.7.3 PyPI artifacts (wheels, SBOM, attestations) carry
forward unchanged. See CHANGELOG.md [0.7.4] block.
See docs/v0.7.5-plan.md. Renumbered from
v0.7.4-plan at v0.7.4 hot-fix ship time; augmented 2026-04-29
post-v0.7.4 with three new buckets: P0.5 critical-security
batch (S1-S6 closing 14 HIGH py/path-injection + 1 HIGH
py/polynomial-redos + 3 MEDIUM stack-trace exposure + 4 MEDIUM
missing-workflow-permissions + 5 MEDIUM Pinned-Dependencies +
2 HIGH URL-substring-sanitization review = ~20 of the 37 open
code-scanning alerts), P0.6 Dependabot batch merge (5 currently
open PRs), P0.7 quick-win polish (OpenSSF Best Practices Badge
filing, /api/health hardening, docs/troubleshooting.md).
Original P0 (container publish + cosign + SLSA) and P1 (R1
quarterly resync, R2 oscal verify UX) carry forward unchanged.
~5-7 week ship target.
See docs/v0.7.6-plan.md. Closes the alpha.2
UI completion gap that's been outstanding since v0.4.0 (Gap Analyze
form, Gap Diff picker, Risk Generate streaming page, README
screenshots), runs the deferred quarterly research-resync if Q3
cadence has arrived, lands the performance benchmark design + first
measurement run (docs/benchmarks.md v1), publishes
docs/quickstart.md (90-second flow), and runs a /security-review
deep-pass threat-model walk. ~4-5 week ship target.
See docs/v0.7.7-plan.md. First substantive
new-collector release since v0.5.0. Adds 5 SQL-family adapters as
evidentia-collectors[sql-{postgres,mysql,sqlite,mssql,oracle}]
extras — read-only collectors mapping DB-resident compliance
evidence (user privileges, audit-log status, encryption posture,
schema change history) to NIST 800-53 controls AC-2 / AC-3 / AC-6
/ AU-2 / AU-3 / SC-12 / SC-28. Plus the carried-forward Okta
collector + ServiceNow integration + a benchmark re-run. ~6-8 week
ship target.
See docs/v0.7.8-plan.md for the full plan.
Extended the v0.7.7 relational-DB evidence layer into modern cloud
data warehouses (Databricks, Snowflake) and added the first BI
output integrations (Tableau, Power BI). Each cloud-DW adapter
maps to the same NIST 800-53 control families as the SQL adapters
plus AC-2(11), AC-6(7), AC-7, IA-2(1)/(2), IR-4 for Snowflake.
The Tableau + Power BI integrations push three datasets (gap
inventory, risk register with AI-provenance, collection-run audit
trail) to enterprise BI surfaces, positioning Evidentia as the
OSS evidence-feed beneath dashboards risk officers + audit
committees + boards already consume.
CSV-based Tableau publish (no .hyper native binary needed) + Power BI Push Datasets via Azure AD service-principal OAuth. CLI + REST + status-endpoint wiring for all four. Comprehensive walkthrough docs (docs/cloud-dw-collectors.md, docs/bi-integrations.md) + Meridian-with-BI demo scenario (examples/meridian-fintech-v2-with-bi/).

Step 5.A pre-tag batch landed 8 fixes (F-V08-1 unbacked azure/gcp extras removal; F-V08-2 DFAH/DSE arXiv expansion corrections; F-V08-DAST-1 frameworks 500→404 + regression test; F-V08-DAST-3 17 manual HTTPException(422) sites converted to 400 to match OpenAPI schema; F-V08-CR-H1 Snowflake LOGIN_HISTORY LIMIT; F-V08-CR-H2 Snowflake cursor-reuse refactor; F-V08-CR-H3 Power BI clear_table 404 swallow; F-V08-CR-MEDIUM Databricks workspace_url rename + O(N) coverage + dead-code removal).

1259 tests passing (+159 new); mypy strict clean across 138 source files. Some evidence sources DEFERRED to v0.7.9+ (Databricks audit logs + lineage need SQL Warehouse plumbing; Snowflake ACCESS_HISTORY needs pagination design; Databricks network policies need Account API auth path); all are surfaced as explicit BLIND_SPOTS.
See docs/v0.7.9-plan.md + the v0.7.9 SHIPPED
memory pointer. Tag v0.7.9 at commit b643caf (2026-05-04).
Brings Evidentia into the regulated financial-services compliance
domain via the new evidentia tprm top-level capability module —
vendor inventory CRUD, due-diligence questionnaire generation +
ingestion (5 formats incl. SIG BYO + caiq-full), concentration-risk
reporting (6 dimensions), OSCAL TPRM emit (vendor inventory
in metadata.parties[] + back-matter.resources[] with SHA-256
integrity hashes), and 4 vendor-risk SaaS collectors (Vanta +
Drata + BitSight + SecurityScorecard). Plus the v0.7.8 Step 5.A
carry-over batch (4 MEDIUM closed) + --security-headers
middleware + PR #18 actions-bump fix. Per the comprehensive plan
§19.1 final-scope-narrowing decision, the model-risk module + 7
new catalogs + governance primitives + audit chain-of-custody
work split out across v0.7.10 + v0.7.11 follow-ons (rather than
the original 8-10 week mega-release scope). 1540 tests / mypy
strict 0/0 across 160 source files / ruff clean. Image digest
sha256:a378f24efef3ea33062592a767abc82d5c4df9accea61e409a404faec34ac344.
See docs/v0.7.10-plan.md. The v0.7.9
follow-on. Shipped: top-level evidentia model-risk module per
SR 11-7 / SR 26-02 / OCC Bulletin 2011-12 / OCC 2026-13a (model
inventory CRUD + SR-aligned doc generator + validation report
generator + RiskStatement.model_inventory_ref AI-feature linkage),
evidentia governance module (G1 Three Lines of Defense
lines-report + G2 Effective Challenge log), 7 new bundled Tier-A
catalogs (FFIEC IT Handbook 5 booklets + FFIEC CAT + OCC 2026-13a /
FRB SR 26-02; total 82 → 89), Codecov + 81.87% statement coverage
closing the last OpenSSF Silver MUST (test_statement_coverage80),
and 4 of the 17 v0.7.9-deferred findings (M-1 / M-2 / L-3 / L-7).
Pre-tag review: 0 HIGH / 1 MEDIUM (F-V10-S1 inline-fixed) / 1 LOW
(F-V10-S2 deferred); 0 unfixed at ship.
See docs/v0.7.11-plan.md. Shipped: P0 audit
chain-of-custody (RetentionMetadata + lifecycle state machine +
WORMBackend ABC + LocalFilesystemWORM reference impl), P1.5
governance trio (G3 KRI/KPI/KGI metrics + G4 Open FAIR risk
quantification + G5 process-as-code workflows), P3 first-batch
deferral closures (F-V10-S2 + M-1 + M-2 + M-5 + M-6 + L-1 + L-3 +
L-6 + L-7), validate_within harmonization across 6 stores, +
P4 docs (audit-chain-of-custody.md + governance-metrics.md +
risk-quantification.md). Concrete S3/Azure/GCS WORM backends +
FAIR Monte Carlo simulation deferred to v0.7.12. Pre-tag review
0 HIGH / 0 MEDIUM / 0 LOW — first PROCEED-CLEAN of the v0.7.x
cycle.
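The roadmap names validate_within without describing it. Assuming it is a path-containment check (resolve a candidate path and reject anything that escapes a store's root directory), a minimal sketch could look like the following. The signature and error type are assumptions, not the shipped API.

```python
from pathlib import Path


def validate_within(candidate: str, root: str) -> Path:
    """Resolve candidate under root; raise ValueError if it escapes root."""
    root_resolved = Path(root).resolve()
    # Joining an absolute candidate replaces the root entirely, so absolute
    # paths outside the root are caught by the containment check below.
    resolved = (root_resolved / candidate).resolve()
    if not resolved.is_relative_to(root_resolved):
        raise ValueError(f"{candidate!r} resolves outside {str(root_resolved)!r}")
    return resolved
```

Resolving before comparing is the important step: a purely textual prefix check would accept inputs such as reports/../../outside.txt.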
See docs/v0.7.12-plan.md. Shipped: 3 cloud-WORM
backend implementations (S3ObjectLockWORM /
AzureImmutableBlobWORM / GCSBucketLockWORM via
evidentia[worm-s3] / [worm-azure] / [worm-gcs] extras),
FAIR Monte Carlo simulation (risk quantify --method fair-mc),
GDPR Article 17 purge-flow (purge_immediately +
force_gdpr_purge operator override), CodeQL custom sanitizer
pack registering validate_within as a path-injection sanitizer,
bump_version.py inter-package pin tightening, release-checklist
Steps 5.5 + 9.5 doc-consistency + release-notes practices, and
3 cloud-WORM operator runbooks. Second consecutive PROCEED-CLEAN
/security-review (0 HIGH / 0 MEDIUM / 0 LOW). 2075 tests passing
across 188 source files.
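For readers unfamiliar with FAIR-style Monte Carlo, the core idea is to sample loss event frequency and loss magnitude from estimated ranges many times and read percentiles off the resulting distribution. The sketch below is a simplified illustration under stated assumptions: triangular distributions, annual loss modeled as frequency times magnitude, and made-up input ranges. It is not the model behind risk quantify --method fair-mc.

```python
import random
import statistics

# Each estimate is (low, most_likely, high). Values used below are illustrative.
Estimate = tuple[float, float, float]


def simulate_annual_loss(
    frequency: Estimate,
    magnitude: Estimate,
    trials: int = 10_000,
    seed: int = 0,
) -> dict[str, float]:
    """Sample frequency x magnitude and summarize the loss distribution."""
    rng = random.Random(seed)  # seeded so repeated runs are reproducible
    losses = sorted(
        # random.triangular takes (low, high, mode), hence the reordering.
        rng.triangular(frequency[0], frequency[2], frequency[1])
        * rng.triangular(magnitude[0], magnitude[2], magnitude[1])
        for _ in range(trials)
    )
    return {
        "mean": statistics.fmean(losses),
        "p50": losses[trials // 2],
        "p90": losses[int(trials * 0.9)],
        "max": losses[-1],
    }


# Hypothetical inputs: 0.1 to 2 events per year, 10k to 250k loss per event.
summary = simulate_annual_loss((0.1, 0.5, 2.0), (10_000, 50_000, 250_000), trials=2_000, seed=42)
```

Reporting p50 and p90 alongside the mean matters because loss distributions are typically right-skewed, so the mean alone understates tail exposure.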
See docs/v0.7.13-shipped.md. Wrap-up
release for the v0.7.x cycle. PR #18 (13 GH Actions major bumps)
merged post-ship. Codecov source_pkgs fix (Cobertura XML emits
full repo-relative file paths). P3 carry-overs closed (M-9
OSCAL UUID conformance + L-2 Vanta/Drata extended fields + L-4
SIG BYO debug logging + 5 of 9 v0.7.8 LOWs). release.yml
auto-populates GitHub Release body from CHANGELOG via new
extract_changelog_block.py (closes the v0.7.5→v0.7.12 stub-body
gap structurally). 10 historical release-body backfills
landed retroactively. Third consecutive PROCEED-CLEAN
/security-review (0 unfixed findings; 0 inline-fixes). Step 7
post-tag verification all sub-checks PASS + 2nd consecutive
pin-trap fix validation + 1st validation of G16 release body
substantiveness gate.
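Populating a release body from the changelog amounts to slicing out one version's section. A minimal sketch, assuming Keep a Changelog style headings of the form "## [x.y.z]"; the function name and heading format are assumptions, and the actual extract_changelog_block.py may differ. The sample versions and dates are placeholders.

```python
import re

_HEADING = re.compile(r"^## \[(?P<version>[^\]]+)\]")


def extract_block(changelog: str, version: str) -> str:
    """Return the text between the heading for version and the next heading."""
    lines = changelog.splitlines()
    start = None
    for index, line in enumerate(lines):
        match = _HEADING.match(line)
        if match is None:
            continue
        if start is not None:
            # Reached the next version heading: the block ends here.
            return "\n".join(lines[start:index]).strip()
        if match.group("version") == version:
            start = index + 1
    if start is None:
        raise KeyError(f"no changelog block for version {version}")
    return "\n".join(lines[start:]).strip()


# Placeholder changelog, not real release data.
SAMPLE = """# Changelog

## [1.1.0] - 2000-01-02

- Second change
- Third change

## [1.0.0] - 2000-01-01

- First change
"""

body = extract_block(SAMPLE, "1.1.0")
```

A body-length check on the returned text is one simple way to implement a substantiveness gate so that an empty block fails the release instead of publishing a stub.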
v0.7.14 — Frontend modernization + Codecov P2.1 + final v0.7.x hygiene + v0.8.0 G4 foundation — SHIPPED
See docs/v0.7.14-shipped.md. 7 of 8 PR
#21 frontend major bumps landed (TypeScript 5→6, ESLint 9→10,
plugin-react-hooks 5→7, plugin-react-refresh 0.4→0.5, jsdom
25→29, postcss + @types/node minors; tailwind 3→4 deferred to
v0.7.15). 3 deferred v0.7.8 LOWs closed (test-coverage gaps,
Tableau Windows tempfile via TemporaryDirectory, Databricks
LTS env-var). Codecov 0% RESOLVED via P2.1 attempt 1
(flag_management block removal); dashboard now shows 82.14%
on c0c9a31. container-build Wait extended to poll all 6
packages. Hash-pinned docker/requirements.txt preview lands
as v0.8.0 G4 foundation. Fourth consecutive PROCEED-CLEAN
/security-review.
See docs/v0.7.15-shipped.md. Tailwind 3→4
migration (CSS-first @theme blocks; @tailwindcss/vite plugin;
tw-animate-css replaces v3-era tailwindcss-animate),
SettingsPage refactor (key-based remount; lint rule promoted
warn→error), standing-rule sweep pre-commit hook
(file-content stage). Fifth consecutive PROCEED-CLEAN. Ship-cycle
hardening: post-ship commit fd36e78 extends release.yml
publish-container Wait step to all 6 packages (matches v0.7.14
P2.2 fix for container-build.yml).
Final v0.7.x release. PR #23 closes 2 Dependabot medium-severity
alerts (python-dotenv CVE — symlink-following in set_key;
vulnerable < 1.2.2). Adds the commit-msg pre-commit hook
variant that closes the gap left by v0.7.15's file-content-only
hook (catches leaks in commit-message body too). Publishes
docs/v0.7.15-shipped.md in-repo retrospective. Validates the
post-v0.7.15 release.yml Wait extension (commit fd36e78) on
its first release pipeline run. Refreshes the OpenSSF Silver
answer sheet with v0.7.16 ship state (Codecov 82.14%
test_statement_coverage80 MET via v0.7.14 P2.1 fix). Sixth
consecutive PROCEED-CLEAN. v0.7.x cycle CLOSED.
See docs/security-review-v0.8.0.md for the
full pre-tag review (5th canonical Pre-tag deliverable per the
pre-release-review v4 §G7) + docs/v0.8.0-plan.md
for the original plan. First minor release after the v0.7.x cycle
close. Lands the four AI-quality features that distinguish a
Vanta-class dashboard from a compliance-engineering tool:
- DFAH determinism harness (P0.1): evidentia eval stub-smoke CLI verb + DFAHarness library API per arXiv 2601.15322. New module evidentia_ai.eval with harness/metrics/seeds + result models. CI-gateable via --fail-on-determinism-rate-below. 4 new EventActions (started + determinism-violation + faithfulness-violation reserved + completed).
- Policy Reasoning Traces (P0.2): evidentia risk generate --emit-trace flag per arXiv 2509.23291. New TraceClaim + ReasoningTrace Pydantic models; optional RiskStatement.reasoning_trace field (backward-compat). OSCAL emit gains risk_statements_with_traces kwarg surfacing traces as Evidentia-namespaced back-matter resources with canonical JSON + SHA-256 (Sigstore-signable). Trestle pydantic.v1 round-trip preserves trace data. New EventAction AI_RISK_TRACE_EMITTED. v0.8.0 ships single-claim stub trace; v0.8.1 ships LLM-driven per-claim decomposition.
- MCP server (P0.3): NEW evidentia-mcp workspace member exposing 4 read-only tools (list_frameworks, get_control, gap_analyze, gap_diff) over stdio transport. evidentia mcp serve + evidentia mcp doctor. HTTP/SSE + CIMD richness defer to v0.8.1. PyPI Pending Publisher feature validated for the new evidentia-mcp project.
- Plugin contract scaffolding (P0.4): 4 ABCs in evidentia_core.plugins: AuthProvider, StorageBackend[T] (PEP 695 generic), MarketplaceProvider, BaseSaaSCollector. 3 reference implementations + discover_plugins() opt-in entry-point discovery.
- M-4 collector base-class refactor: Vanta, Drata, BitSight, SecurityScorecard inherit BaseSaaSCollector; per-collector scaffolding LOC drops ~60%. BitSight + SecurityScorecard override _auth_header() for HTTP Basic + custom Token schemes.
P1 architectural primitives:
- G3 Prometheus /metrics endpoint on evidentia serve (stdlib-only counter aggregator taps audit-event-firing path).
- G8 docs/evidence-integrity.md anti-tamper deployment guidance (3 deployment patterns + verification commands).
- G1 mutmut + G2 hypothesis + G4 Dockerfile --require-hashes flip deferred to v0.8.1 per pace constraints.
Image digest sha256:fa8df8028986bd005469a267db46dc25f834b47bf232566422b63f7e2f6b2c1f.
PyPI: 7 packages all at 0.8.0 with PEP 740 attestations verified.
SBOM 159 packages / 0 issues (osv-scanner clean). 2227 tests / 12
skipped, mypy strict 0/0 across 210 source files, ruff clean.
First PROCEED-CLEAN of the v0.8.x line. Step 7 post-tag
verification all 7 sub-checks PASS (PEP 740 / cosign / osv-scanner
/ docker run / fresh-venv install 6th consecutive
pin-trap validation / G16 release-body 7615 bytes 5th
consecutive auto-populate-from-CHANGELOG / Scorecard delta).
Two recurring code-scanning false positives dismissed
(py/partial-ssrf on BaseSaaSCollector; Pinned-Dependencies
on Dockerfile); 0 open code-scanning alerts at close.
Tag v0.8.1 at commit 3e520a0. Image digest
sha256:c9dfcfee90685b6b3232646d11eb43ebf4c6842847f6fe82cec52944b45ca352.
PyPI: 7 packages all at 0.8.1 with PEP 740 attestations
verified. Release pipeline first-fire PASS (3m56s).
Step 7 post-tag verification all sub-checks PASS: PEP 740 +
cosign + osv-scanner (159 packages / 0 issues) + docker run
smoke (89 frameworks + 9 crosswalks) + fresh-venv install
(7th consecutive pin-trap validation) + G16 release-body
8484 bytes (6th consecutive auto-populate-from-CHANGELOG).
0 open code-scanning alerts at close. Pre-release-review
v4 Continuous variant PROCEED-CLEAN — 8th consecutive of
the v0.7.x → v0.8.x line.
See docs/security-review-v0.8.1.md
for the full Pre-tag review. Aggressive ~4-week scope (Allen's
v0.8.1 cycle-open lock-in 2026-05-05) executed in a single
focused session.
ALL 12 v0.8.0-bucketed review findings closed — 2 HIGH
(logger record_event level filter, MetricsRegistry
encapsulation), 4 MEDIUM (collector _get non-dict raise,
FastMCP private API → public, F-V08-S3 /api/metrics auth
gate via Phase 3.3 AuthProvider middleware, LocalDirectoryMarketplace
manifest warning), 6 LOW (LocalTokenAuthProvider symlink-
rejection, doctor unbound vars, assert→ValueError under
PYTHONOPTIMIZE, BaseSaaSCollector PEP-695 generic rationale,
discover_plugins of_type kwarg, test defensive None checks).
LLM-driven richness landed:
- DFAH risk-determinism CLI verb — evidentia eval risk-determinism --context X --gaps Y runs the v0.8.0 DFAHarness against the live RiskStatementGenerator. CI-gateable via --fail-on-determinism-rate-below 0.95.
- PRT LLM-driven per-claim decomposition — RISK_STATEMENT_TRACE_PROMPT augments the system prompt when emit_trace=True. Instructor extracts 3-7 atomic claims with per-claim policy clause citations + self-introspected confidence. v0.8.0 stub trace remains as defensive fallback. Audit-log trace_kind=v0.8.1-llm vs v0.8.0-stub for auditor filtering.
Network surfaces:
- MCP HTTP/SSE transport — evidentia mcp serve --transport <stdio|sse|http> with --host + --port flags. Loopback-default; non-loopback warns at startup.
- FastAPI AuthProvider middleware — create_app(auth_provider=...) + evidentia serve --auth-token-file <path> ergonomic wiring. Closes v0.8.0 F-V08-S3 MEDIUM finding — /api/metrics + all data-bearing routes inherit the auth requirement. UNAUTHENTICATED_PATHS allowlist for liveness probes.
Deferred to v0.8.2 per §24.6 R6 (infra primitives benefit from a thoughtful integration plan, not rushed at cycle-end):
- G4 Dockerfile --require-hashes flip + reproducible-build verification (consumes v0.7.14 P1.5 hash-pinned docker/requirements.txt).
- G1 mutmut mutation-testing baseline ≥ 65%.
- G2 hypothesis property-based tests on crosswalk + normaliser.
- MCP CIMD richness (best explored against real MCP-client deployments).
- 2 NEW v0.8.1 findings: F-V81-S1 MEDIUM (HTTP/SSE file-path tool input gating), F-V81-S2 LOW (module-load AuthProvider → FastAPI lifespan).
Pre-release-review v4 Continuous variant PROCEED-CLEAN — 8th consecutive across v0.7.{11,12,13,14,15,16} + v0.8.0 + v0.8.1. 0 CRITICAL/HIGH unfixed at ship. 2240 tests / 13 skipped, mypy strict 0/0 across 211 source files, ruff clean.
v0.8.2 — Review-deferral closure + supply-chain hardening + test-quality + DFAH faithfulness — SHIPPED
Tag v0.8.2 at commit (TBD post-tag). Aggressive ~3-week scope
executed in a single focused session — closes 8 reservations
carried out of v0.8.1 (CIMD richness deferred further to v0.8.3
per §24.6 R6). 9th consecutive PROCEED-CLEAN of the v0.7.x →
v0.8.x line.
See docs/security-review-v0.8.2.md
for the full Pre-tag review.
Closures:
- F-V81-S1 — evidentia mcp serve --allow-root <path> flag gates file-path tool inputs (gap_analyze, gap_diff) via validate_within. Out-of-root paths surface as PathTraversalError (MCP tool error, not server crash). Non-loopback HTTP/SSE without --allow-root warns at startup.
- F-V81-S2 — AuthProvider construction moved from import-time module-level → FastAPI lifespan async context manager. Importing evidentia_api.app is now side-effect-free; env var EVIDENTIA_API_AUTH_TOKEN_FILE is read at app startup. AuthProviderMiddleware is always-attached + reads provider from request.app.state.auth_provider at dispatch (no-op when None preserves v0.8.0 backward-compat).
- G4 Dockerfile --require-hashes (foundation; activation deferred to v0.8.3) — docker/requirements.txt regenerated against the v0.8.2 dep tree (~140 transitive deps with SHA256 hashes); bump_version.py --regenerate-requirements wires regeneration into the version-bump flow. Activation deferred per §25.6 R1: the release.yml uv build is not byte-identical across hosts, so pre-tag hashes don't match PyPI. v0.8.3 closes via reproducible-build verification (SOURCE_DATE_EPOCH) OR release-pipeline regeneration step.
- G1 mutmut baseline — [tool.mutmut] config + weekly .github/workflows/mutmut.yml targeting gap_analyzer + risk_statements. docs/mutation-testing.md operator runbook ships.
- G2 hypothesis property-based tests — 8 new property tests in tests/property/ covering invariants on the gap-analyzer normalizer + the catalogs CrosswalkEngine. Configurable ci / dev profiles via tests/property/conftest.py.
- DFAH faithfulness scoring (P3.1) — second arXiv 2601.15322 metric. New evidentia_ai.eval.faithfulness module with FaithfulnessResult model + faithfulness_score() function using stdlib Jaccard token-overlap (no heavy ML deps). Default threshold 0.3. docs/dfah-faithfulness.md operator guide.
- First-class Sigstore signing for evidentia eval output (P3.2) — evidentia_ai.eval.signing module + CLI flags (--sign / --no-sign) + new evidentia eval verify subcommand. Tri-state default auto-detects via GITHUB_ACTIONS env. New EventAction.AI_EVAL_OUTPUT_SIGNED audit entry.
Quality at ship: 2277 tests / 14 skipped (was 2240 / 13 at v0.8.1), mypy strict 0/0 across ~215 source files, ruff clean. 0 CRITICAL/HIGH/MEDIUM findings; 3 LOW deferrals to v0.8.3.
Tag v0.8.3 at commit (TBD post-tag). Aggressive ~3-week scope
executed in a single focused session — closes 6 of 8 v0.8.2
carry-overs; MCP CIMD richness deferred to v0.8.4 (4th
cycle-deferral; per §24.6 R6 gated on empirical operator demand);
DFAHarness check_faithfulness=True wiring deferred to v0.8.4
polish. 10th consecutive PROCEED-CLEAN of the v0.7.x →
v0.8.x line.
See docs/security-review-v0.8.3.md
for the full Pre-tag review.
Closures:
- G4 Dockerfile --require-hashes ACTIVATED — Path 1 (SOURCE_DATE_EPOCH-driven reproducible builds) per §26.D. release.yml exports SOURCE_DATE_EPOCH=$(git log -1 --format=%ct HEAD) before uv build → byte-identical wheels across hosts → SHA256 hashes match between local pre-tag pip-compile + PyPI uploads. New release.yml build-twice verification step asserts sha256sum matches before publish. bump_version.py --regenerate-requirements wraps uv build (with SOURCE_DATE_EPOCH from HEAD) + pip-compile against locally-built wheels via --find-links=./dist/. Closes recurring Scorecard PinnedDependencies false-positive cycle (alerts #100 → #115 across v0.7.12 → v0.8.2) structurally + permanently.
- F-V82-S1 LOW: bump_version.py --regenerate-requirements auto-detects host platform; on non-Linux hosts auto-invokes pip-compile inside the pinned python:3.14-slim base image so Linux-only transitives (uvloop) resolve correctly.
- F-V82-S2 LOW: evidentia eval verify CLI replaces broad except Exception with specific SigstoreError subclass catches mapped to distinct exit codes (2 = infrastructure missing; 1 = cryptographic failure).
- F-V82-S3 LOW (transitive): paraphrase precision via P1.1.
- DFAH faithfulness sentence-transformers path (P1.1) — new evidentia_ai.eval.faithfulness_semantic module + opt-in [eval-faithfulness] extra carrying sentence-transformers. Default model all-MiniLM-L6-v2 (~90 MB); default threshold 0.7. Catches paraphrases that the v0.8.2 stdlib Jaccard baseline misses.
- LLM atomic-claim extraction (P1.2) — new evidentia_ai.eval.claim_extraction module + extract_claims() function decomposes any AI-generated artifact into atomic verifiable claims via LiteLLM-driven LLM call. Defensive parsing (strip bullets/numbering; drop empties). Empty input returns [] cost-aware. New EventAction.AI_EVAL_FAITHFULNESS_CHECKED reserved for v0.8.4 DFAHarness wiring.
- DFAH calibration corpus + threshold-tuning script (P1.3) — 50-entry corpus at tests/data/dfah-calibration/corpus.jsonl (4 categories; verbatim / paraphrase / semi-related / hallucination). New scripts/tune_faithfulness_threshold.py measures FPR/FNR across thresholds + recommends optimum via Youden's J. Empirically demonstrates the v0.8.2 Jaccard limitation: the bundled corpus's optimal Jaccard threshold is 0.85 (vs default 0.3) — paraphrase-heavy corpora drag the optimum upward.
Quality at ship: 2299 tests / 14 skipped (was 2277 / 14 at v0.8.2; +22 new tests across P1.1 + P1.2 + reproducible-build self-tests). mypy strict 0/0 across 220+ source files; ruff clean. 0 CRITICAL/HIGH/MEDIUM findings; 0 LOW unfixed.
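The threshold recommendation via Youden's J mentioned for the v0.8.3 tuning script can be sketched as follows. This is an illustration of the statistic (J = TPR − FPR, maximized over candidate thresholds), not the contents of scripts/tune_faithfulness_threshold.py; the function name, the input shape (parallel score and label lists), and the tie-breaking rule are assumptions.

```python
from collections.abc import Sequence


def youden_optimum(
    scores: Sequence[float],
    labels: Sequence[bool],
    thresholds: Sequence[float],
) -> tuple[float, float]:
    """Return (threshold, J) maximizing Youden's J = TPR - FPR.

    labels[i] is True when sample i is genuinely faithful; a sample is
    predicted faithful when scores[i] >= threshold. Ties keep the first
    (lowest-index) threshold, an arbitrary choice made for this sketch.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not thresholds:
        raise ValueError("at least one candidate threshold is required")

    best_t, best_j = thresholds[0], float("-inf")
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        fn = sum(1 for s, y in zip(scores, labels) if y and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < t)
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        j = tpr - fpr
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j
```

Sweeping a grid of thresholds over a labeled corpus this way shows where the false-positive rate starts to rise faster than the true-positive rate; which threshold it picks depends on the corpus's score distribution.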
Tag v0.8.4 at commit (TBD post-tag). Aggressive ~2-3 week
focused scope (executed in single session compression matching
v0.8.3 cadence). Closes the v0.8.3 ship-failure root cause via
G4 Path 2 (post-PyPI regeneration in release.yml —
sidesteps cross-platform reproducibility entirely) + the
v0.8.3 P1.2 deferred wiring (check_faithfulness=True
first-class on DFAHarness). MCP CIMD richness deferred 5th
time to v0.8.5; CLI flags + corpus expansion + real-LLM
integration tests deferred to v0.8.5.
See docs/security-review-v0.8.4.md
for the v4 Pre-tag-style closeout (PROCEED-CLEAN; 11th
consecutive of v0.7.x → v0.8.x line).
- G4 Dockerfile --require-hashes ACTIVATED via Path 2 — closes the recurring Scorecard PinnedDependencies false-positive cycle (alerts #100 → #116 across v0.7.12 → v0.8.3.1) structurally + permanently. release.yml's publish-container job now regenerates docker/requirements.txt against PyPI's just-published wheels via pip-compile --generate-hashes --no-emit-find-links BETWEEN the existing Wait-for-PyPI step + the docker build step. Hashes match because pip-compile downloads from PyPI's bytes in the Linux CI runner — same source as the container build's pip install. Cross-platform reproducibility no longer required. Built-in 3-attempt retry loop with 30s sleeps absorbs PyPI propagation lag. The committed docker/requirements.txt is preview state for operators reading the repo; release-time regeneration overwrites it ephemerally. Defense-in-depth: hash verification fires at pip-compile time + at install time (two distinct points in the supply chain).
- DFAHarness check_faithfulness=True wiring — closes the v0.8.3 P1.2 deferral. EvalSample schema gains optional source_clauses: list[str] | None = None field; EvalResult schema gains faithfulness_results: list[PromptFaithfulnessResult] list; DFAHarness.run() gains 5 new kwargs: check_faithfulness, faithfulness_threshold, faithfulness_method (jaccard | semantic), claim_extraction_fn (mock-callable injection point), faithfulness_score_fn (mock-callable injection point). EventAction.AI_EVAL_FAITHFULNESS_CHECKED (reserved-but-inactive in v0.8.0; ACTIVATED in v0.8.4) + EventAction.AI_EVAL_FAITHFULNESS_VIOLATION (reserved-but-inactive in v0.8.0; ACTIVATED in v0.8.4). Mock-callable injection points keep harness tests cost-zero (no LLM / sentence-transformers token burn in CI) while exercising real production code paths. Default callable resolution falls back to v0.8.3-shipped extract_claims + v0.8.2/v0.8.3-shipped faithfulness_score / faithfulness_score_semantic when callers don't inject mocks. 14 new unit tests across 5 test classes. Library + harness integration first-class; CLI flags (--check-faithfulness --source-clauses-file <yaml>) deferred to v0.8.5.
- pytest 100% green: 2313 passed / 14 skipped (was 2299 / 14 at v0.8.3.1 ship)
- mypy strict 0/0 across 220+ source files
- ruff clean
- Standing-rule keyword sweep clean across both v0.8.4-cycle commits
Tag v0.8.5 at commit (TBD post-tag). Aggressive ~2-3 week
focused scope (single-session compression matching v0.8.3 +
v0.8.4 cadence). Closes ALL 4 v0.8.4 carry-overs per Allen's
explicit Comprehensive scope + Implement-CIMD-now lock-in
(§28). 12th consecutive PROCEED-CLEAN of v0.7.x → v0.8.x line.
See docs/security-review-v0.8.5.md
for the v4 Pre-tag-style closeout (PROCEED-CLEAN; 12th
consecutive of v0.7.x → v0.8.x line).
- DFAH faithfulness CLI flags — evidentia eval risk-determinism --check-faithfulness --faithfulness-threshold N --faithfulness-method {jaccard,semantic} --source-clauses-file <yaml> operator-facing surface. Closes the v0.8.4 P1.2 CLI-surface deferral. Pre-condition validation rejects malformed inputs BEFORE any LLM call fires.
- DFAH calibration corpus expansion to 123 entries + per-framework subsets (corpus_nist.jsonl / corpus_ffiec.jsonl / corpus_iso27001.jsonl, 24 entries each across the 4 categories). tune_faithfulness_threshold.py --corpus-pattern <glob> for per-framework sweep. Empirical per-framework recommended thresholds documented.
- Real-LLM integration tests for extract_claims() + DFAHarness.run(check_faithfulness=True) end-to-end at tests/integration/test_eval/test_real_llm_extraction.py. Opt-in via EVIDENTIA_LLM_INTEGRATION=1 env var.
- MCP CIMD richness — implemented after 5 deferral cycles per Allen's "implement now" directive. New module evidentia_mcp.cimd with CIMDDocument (per RFC 7591) + CIMDRegistry (JSON-file-backed, version-tagged). evidentia mcp serve --cimd-registry <path> flag. Server-side attribute server.evidentia_cimd exposed for tool implementations. v0.8.5 ships the registry-loading + attachment infrastructure; per-tool scope enforcement at MCP-protocol level deferred to v0.8.6.
- pytest 100% green: 2338 passed / 17 skipped (was 2313/14 at v0.8.4 ship; +25 new across P1 + P3 + P4)
- mypy strict 0/0 across 216 source files
- ruff clean
- Standing-rule keyword sweep clean across all 4 v0.8.5-cycle commits
Tag v0.8.6 at commit eb0f331. Container digest
sha256:583d3849b5997edd2557530c48a32f085fa22ebbc2441bbeb2e7fcf7db8799a5.
Aggressive ~2-3 week comprehensive scope (single-session
compression matching v0.8.3 + v0.8.4 + v0.8.5 cadence).
Closes ALL 3 v0.8.5 carry-overs + 3 cycle-additions per
Allen's explicit Comprehensive scope + CIMD-first sequencing + v0.7.x-retrospective / v1.0-transition / audit-trail-layer additions lock-in (§29). 13th consecutive PROCEED-CLEAN of v0.7.x → v0.8.x line.
See docs/security-review-v0.8.6.md
for the v4 Pre-tag-style closeout (PROCEED-CLEAN; 13th
consecutive).
- CIMD scope enforcement at MCP-protocol level + per-call audit trail (P1) — closes the v0.8.5 P4 deferral. NEW evidentia_mcp.scope module monkey-binds FastMCP.call_tool with idempotency guard; per-call AI_MCP_TOOL_AUTHORIZED / AI_MCP_TOOL_DENIED audit events; --default-client-id CLI flag; deny paths raise McpError code -32602. Pass-through preserves v0.8.5 default no-gating behavior.
- Cohen's Kappa rater agreement script (P2) — closes the v0.8.5 P2 multi-rater methodology reservation. NEW scripts/compute_inter_rater_kappa.py ships κ formula + Landis-Koch interpretation + CI-gateable exit codes; rule-based jaccard rater mode probe → best κ = 0.4848 (moderate) at threshold 0.85 → ships as "single-rater + κ probe inconclusive" per §29 R3 mitigation; empirically demonstrates v0.8.3 sentence-transformers semantic path's necessity. Real LLM-assisted second rater + human second rater both reserved for v0.9.0 walk-through.
- Per-claim bootstrap-resampled confidence + framework-aware threshold defaults (P3) — FaithfulnessResult.confidence + framework fields; DEFAULT_THRESHOLDS_BY_FRAMEWORK_JACCARD map (NIST 0.60 / FFIEC 0.35 / ISO27001 0.30 per v0.8.5 P2 empirical sweep); resolve_threshold(framework, method) helper. CLI flag --faithfulness-threshold-mode {framework-aware,fixed} deferred to v0.8.7.
- docs/v0.7.x-retrospective.md (P4) — 18-release narrative (v0.7.0 → v0.7.16 over ~12 days).
- docs/v1.0-transition.md DRAFT (P5) — v1.0 theme candidates + acceptance gates.
- pytest 100% green: 2383 passed / 17 skipped (was 2338/17 at v0.8.5 ship; +45 new across P1 + P2 + P3)
- mypy strict 0/0 across 217 source files
- ruff clean
- Standing-rule keyword sweep clean across all 4 v0.8.6-cycle commits
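For readers unfamiliar with the κ statistic behind the v0.8.6 rater-agreement script, here is a minimal sketch of Cohen's Kappa with the Landis-Koch bands. It is not the code in scripts/compute_inter_rater_kappa.py; the function names and the handling of the degenerate case (expected agreement of 1) are assumptions.

```python
from collections import Counter
from collections.abc import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters' labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical label; treated as perfect
        # agreement here (an assumption for this sketch).
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def landis_koch(kappa: float) -> str:
    """Map kappa to the Landis-Koch (1977) agreement band."""
    if kappa < 0:
        return "poor"
    for upper, label in (
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ):
        if kappa <= upper:
            return label
    return "almost perfect"
```

Under these bands the reported probe value of 0.4848 falls in "moderate", matching the interpretation given above.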
Tag v0.8.7 at commit (TBD post-tag). Single focused session
per Allen's explicit cycle-open lock-in (§30: Single v0.8.7
wrap-up release + LLM-rater deferred to v0.9.0 + CIMD
signatures deferred to v1.0). 14th consecutive PROCEED-CLEAN
of v0.7.x → v0.8.x line. FINAL v0.8.x patch — v0.9.0 opens
with a clean slate.
See docs/security-review-v0.8.7.md
for the v4 Pre-tag-style closeout (PROCEED-CLEAN; 14th
consecutive).
- --faithfulness-threshold-mode {framework-aware,fixed} CLI flag (P2) — closes the v0.8.6 P3 CLI-surface deferral. Default framework-aware; explicit --faithfulness-threshold value always wins; framework-aware mode extracts framework from prompt_id (canonical <framework>:<control_id> format) + resolve_threshold(framework, method) lookup; fixed mode uses DEFAULT_FAITHFULNESS_THRESHOLD (0.30). Default --faithfulness-threshold changed from 0.3 → None sentinel; backward-compatible.
- 6 v0.8.6 cycle-close artifacts backfilled (P1; docs only) — security-review-v0.8.6.md + v0.8.6-plan.md + threat-model v0.8.6 delta + capability-matrix v0.8.6 snapshot + README v0.8.6 entry + ROADMAP v0.8.6 PLANNED → SHIPPED transition.
- pytest 100% green: 2386 passed / 17 skipped (was 2383/17 at v0.8.6 ship; +3 new from TestFaithfulnessThresholdMode)
- mypy strict 0/0 across 217 source files
- ruff clean
- Standing-rule keyword sweep clean across the v0.8.7-cycle commits
v0.9.0 SHIPPED 2026-05-15 — first minor of the v0.9.x line. Opens the federal-compliance theme per the 2026-04-28 §10 Q4 lock-in.
Phase 1 — POA&M data layer + state model: POAMState
5-state enum (planned / in_progress / overdue / completed /
verified) aligned to FedRAMP POA&M Template Completion Guide
v3.0 + NIST SP 800-53A Rev 5 Appendix F. Forward-only state
transitions; backward transitions programmatically blocked to
preserve auditor-defensible monotonic progress. Milestone
Pydantic record + ControlGap.poam_milestones optional list
(default-empty for v0.7.x + v0.8.x backward-compat). New
evidentia_core.poam sub-package + evidentia_core.poam_store
JSON file-store mirroring v0.7.9 vendor_store (atomic-write +
UUID-shape ID gate + validate_within path-traversal defense +
EVIDENTIA_POAM_STORE_DIR env override). 6 new EventActions.
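A minimal sketch of a forward-only state model like the one described for POAMState follows. The five state names come from the text; the class name, the error type, and the exact rule (any move to an earlier state is rejected, while skipping ahead and same-state no-ops are allowed) are assumptions, since the roadmap does not spell out the full transition table.

```python
from enum import Enum


class POAMStateSketch(str, Enum):
    """Illustrative stand-in for POAMState, declared in lifecycle order."""
    PLANNED = "planned"
    IN_PROGRESS = "in_progress"
    OVERDUE = "overdue"
    COMPLETED = "completed"
    VERIFIED = "verified"


# Declaration order doubles as the monotonic progress order.
_ORDER = list(POAMStateSketch)


class BackwardTransitionError(ValueError):
    """Raised when a transition would move a POA&M item backward."""


def transition(current: POAMStateSketch, target: POAMStateSketch) -> POAMStateSketch:
    """Return target if the move is not backward; otherwise raise."""
    if _ORDER.index(target) < _ORDER.index(current):
        raise BackwardTransitionError(
            f"cannot move from {current.value} back to {target.value}"
        )
    return target
```

Rejecting backward moves at the model layer means an audit log of state changes can only ever show progress, which is the auditor-defensibility property the text refers to.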
Phase 2 — POA&M CLI + REST + OSCAL emit: evidentia poam
Typer subcommand group (7 verbs: create / list / show / update /
milestone add|update / delete / calendar). /api/poam/*
FastAPI router (8 endpoints) mirroring v0.7.9 TPRM router shape + v0.7.8 F-V08-DAST-3 error-normalization. NEW evidentia_core.oscal.poam_exporter.gap_report_to_oscal_poam() emits OSCAL 1.1.2 plan-of-action-and-milestones JSON; each ControlGap → one (observation, risk, poam-item) triple with UUID cross-references; milestones as tracking-entries under risks[].remediations[]; back-matter SHA-256 integrity mirrors v0.7.0 finding-resource embedding. Default severity-filter is CRITICAL + HIGH per FedRAMP §3.1 auditor-default.
Phase 3 — CONMON cycle calendar (read-only):
evidentia_core.conmon pure-function library with 7 bundled
cadences (NIST 800-53 CA-7 monthly + FedRAMP ConMon × 3 +
CMMC L2 triennial + DoD RMF annual + OCC 2026-13a model-risk
annual). evidentia conmon CLI (3 verbs: list / next / check).
2 new EventActions. NO DAEMON — operators poll; the
evidentia conmon watch live-trigger daemon is reserved for
v1.0 per §31.1.
Step 5.A 14-item refinement batch (commit ceab880):
UUID canonicalization in poam_store + vendor_store prevents
duplicate-records-per-alias + non-conformant OSCAL UUID emit;
_enum_value extracted to evidentia_core.models.common;
stale-doc refreshes across governance + config + generation_context
references.
15th consecutive PROCEED-CLEAN of v0.7.x → v0.8.x → v0.9.x line. 2583 tests / 17 skipped / 227 source files / mypy strict 0/0 / ruff clean. Pre-release-review v4 Pre-tag full 7-step clearance + 3-invocation /security-review (diff-scoped + per-subsystem + final-gate) all CLEAR.
Phase 4 — Walk-through-as-validation: deferred to v0.9.1 per §31.A POA&M-first / walk-through-as-validation posture. v0.8.6 §29 P2 R3 single-rater κ probe inconclusive carry- forward acknowledged; domain-expert walk-through becomes the v0.9.1 reservation. v0.9.0 ships regardless.
Cycle opened 2026-05-15 after v0.9.0 ship. Plan file:
docs/v0.9.1-plan.md.
- Phase 1: CONMON REST router — 4 endpoints under /api/conmon/ (list, get, next, check) matching CLI parity. 17 integration tests.
- Phase 2: LLM-assisted second rater — scripts/llm_rater.py + --rule llm mode in compute_inter_rater_kappa.py. Temperature-0 deterministic labeling with JSONL sidecar persistence.
- Phase 3: Federal-compliance calibration corpus — corpus_federal.jsonl (24 entries; FedRAMP ConMon + POA&M + NIST 800-53 CA-7). Total corpus now 147 entries.
- Phase 4: Federal-SI walk-through scenarios — 10 scenarios (FS-1 through FS-10) in capability-matrix.md with persona, goal, surfaces exercised, expected outcome.
- Phase 5 (pending): Domain-expert walk-through execution (requires federal partner scheduling).
- Phase 6 (pending): Pre-release-review + version bump + ship.
(Originally PROPOSED as "AI governance foundation"; the AI governance theme deferred to v0.9.3 when the org migration consumed the v0.9.1 cycle.)
- CONMON REST router — 4 endpoints under /api/conmon/ (list, get, next, check) matching CLI parity.
- LLM-assisted second rater — scripts/llm_rater.py + --rule llm mode in compute_inter_rater_kappa.py. Temperature-0 deterministic labeling with JSONL sidecar persistence.
- Federal-compliance calibration corpus — corpus_federal.jsonl (24 entries; FedRAMP ConMon + POA&M + NIST 800-53 CA-7). Total corpus 147 entries.
- Federal-SI walk-through scenarios (FS-1 through FS-10) in capability-matrix.md.
The largest minor of the v0.9.x line so far. Combines both originally-PROPOSED themes (CONMON daemon Theme A + AI governance Theme B) into a single ship since v0.9.1 (org migration) + v0.9.2 (CONMON REST + LLM rater) consumed the originally-planned slots.
Theme A — CONMON daemon:
- evidentia conmon watch --poll — long-running daemon with state-file-driven slug→last_completed tracking, configurable poll interval, graceful SIGINT/SIGTERM shutdown.
- Basic alerting — SMTP (STARTTLS-only with has_extn assertion) + generic HTTP webhook (HMAC-SHA256 with timestamp-included signed material for capture-replay defense). File-backed dedup state + per-(slug, state) suppression. Secret-handling protocol enforced (file > env > error; CLI value flags rejected).
- Control health scoring — evidentia conmon health CLI + GET /api/conmon/health REST endpoint produce per-framework attention-bucket counts + cross-framework overall health score.
- ContinuousEvidenceSource plugin Protocol + NoopContinuousSource reference impl (production refs deferred to v0.9.4).
Theme B — AI governance:
- EU AI Act catalog enrichment — risk_tier + applies_to_annex_iii on every Article 9-15 control; tier promoted D→A.
- NIST AI RMF crosswalks — bidirectional mappings to EU AI Act (26 entries) + ISO 42001 (23 entries); confidence + rubric fields on catalog model.
- evidentia_core.ai_governance — classification + registry + registry_store (UUID validation + path-traversal guard + atomic write).
- evidentia ai-gov CLI (classify/register/list/get/delete) + /api/ai-gov/* REST router (5 endpoints with audit-event parity to CLI).
Carry-overs:
- LLM-rater κ recompute on 147-entry corpus (framework-agnostic κ = 0.8820; overall κ = 0.7956; 3 of 5 subsets PASS κ≥0.80).
- Docker/requirements drift CI gate.
- GHCR public-flip release-checklist item.
- API-stability.md DRAFT (v1.0 NORMATIVE commitment scope).
Consolidation pass after v0.9.3's aggressive single-session compression. Despite the originally-planned conservative pacing, shipped via the same aggressive single-session pattern. Closed the 2 deferred MEDIUMs + 1 HIGH from the v0.9.3 review + the LOW polish batch + the federal-SI walk-through reserved since v0.9.0. 19th consecutive PROCEED-CLEAN.
Phase 1 — Daemon hardening:
- P1.1 evidentia_core.security.FileLock (POSIX fcntl.flock / Windows msvcrt.locking) + --state-lock CLI flag wiring → closes F-V93-Q3 HIGH (CWE-362 race-condition).
- P1.2 webhook SSRF mitigation: default-deny http:// + loopback/RFC1918/link-local/reserved IPs; opt-in --webhook-allow-plaintext + --webhook-allow-private-network → closes F-V93-S2 MEDIUM (CWE-918).
- P1.3 token-bucket rate-limit middleware on POST /api/ai-gov/register + /classify + X-Idempotency-Key header support → closes F-V93-S10 LOW (CWE-770).
- P1.4 polish batch (F-V93-Q11 User-Agent + Q12 Windows latency doc + Q14 narrow except + S9 path-disclosure doc).
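The default-deny webhook policy in P1.2 can be sketched with the stdlib ipaddress module. This is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the two keyword arguments merely mirror the opt-in CLI flags, and the caller is assumed to pass in the addresses the hostname actually resolved to (the sketch does no DNS lookup itself).

```python
import ipaddress
from collections.abc import Iterable
from urllib.parse import urlsplit


def is_blocked_address(ip: str) -> bool:
    """True for loopback, private (RFC 1918 etc.), link-local, or reserved IPs."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_loopback
        or addr.is_private
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def check_webhook_url(
    url: str,
    resolved_ips: Iterable[str],
    *,
    allow_plaintext: bool = False,
    allow_private_network: bool = False,
) -> None:
    """Raise ValueError if the webhook target violates the default-deny policy."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported webhook scheme: {scheme!r}")
    if scheme == "http" and not allow_plaintext:
        raise ValueError("plaintext http:// webhooks are denied by default")
    if not allow_private_network:
        for ip in resolved_ips:
            if is_blocked_address(ip):
                raise ValueError(f"webhook resolves to a blocked address: {ip}")
```

Checking the resolved addresses rather than the hostname string matters because a public-looking name can resolve to an internal address; the connection should then be made to the same addresses that were checked.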
Phase 2 — Operator polish:
- P2.1 GET /api/conmon/daemon-status + sidecar JSON + --status-file CLI + CONMON_DAEMON_STATUS_QUERIED action.
- P2.2 evidentia conmon dedup-list CLI verb + AlertDeduper.list_entries() API.
- P2.3 evidentia ai-gov update + retire verbs wiring AI_SYSTEM_UPDATED + AI_SYSTEM_RETIRED.
Phase 3 — Federal-SI walk-through:
- P3.1 synthetic fixtures + recipe doc + smoke test.
- P3.2 3 walk-through-surfaced refinements (real cadence slugs, truncate-tolerant assertions, valid decision_role enum).
Phase 4 — Hygiene (P4.1 backfill skipped per cycle-open lock-in; P4.2 Codecov operator-completed; P4.3 DAST deferred to v0.9.5):
- P4.4 fixed flaky TestJiraStatus (real fix: assertion-scoping, NOT fixture leak as initially classified).
- P4.5 added workflow_dispatch to .github/workflows/test.yml.
- P4.6 token-rotation doc fix in docs/release-checklist.md.
2798 tests / 17 skipped / mypy strict 0 / 219 source files / ruff clean.
Theme: Walk-through-driven refinement + collaboration primitives + carry-over closure.
Phase 1 — Carry-over closure (6 sub-items):
- P1.1: pytest-randomly added to dev deps + random-order test sweep clean.
- P1.2: DAST tools (schemathesis + playwright) in dev deps; tests/dast/ scaffold with test_openapi_fuzz.py + playwright.config.ts.
- P1.3: 7 v0.9.3 LOW-bucket residuals closed (F-V93-S4 SSL context, S5 trust-boundary doc, S6 SIGINT race doc, S7 state-file size cap, S8 RFC 5321 recipient validation, Q4 dedup-state mtime cache, Q13 sleep_fn typing).
- P1.4: 8 v0.9.4 formal-review LOWs + 2 INFOs closed (FileLock fd leak / fcntl per-fd doc / rate-limit LRU spray / sleep_fn type / rate-limit GIL docstring / IPv6 scope-id sort / cross-process FileLock test / model_copy validator skip / Pydantic upgrade body-hash doc / replay-after-target-deleted regression).
- P1.5: shared evidentia_core.security.atomic_write_text helper + 4 v0.9.4 inline call sites refactored.
- P1.6: EVIDENTIA_TRUST_PROXY_HEADERS=1 auto-wires uvicorn's ProxyHeadersMiddleware in create_app().
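The atomic-write pattern that P1.5 consolidates into a shared helper is commonly implemented as "write to a temporary file in the same directory, then rename over the target". The sketch below shows that general technique; it is not the body of evidentia_core.security.atomic_write_text, and the function name, fsync call, and cleanup behavior are assumptions.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text_sketch(path: Path, text: str, encoding: str = "utf-8") -> None:
    """Write text so readers see either the old file or the new one, never a partial."""
    path = Path(path)
    # The temp file lives in the target directory so os.replace stays on one
    # filesystem, where the rename is atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding=encoding) as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort cleanup of the temp file on any failure.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Centralizing this in one helper means a crash mid-write leaves the previous state file intact at every call site, rather than depending on each call site getting the sequence right.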
Phase 2 — Operator polish:
- P2.1: AI-persona federal-SI walk-through validation (driven by Perplexity + WebSearch + training corpus on FedRAMP 20x, RFC-0024, OMB M-24-10, NIST AI RMF). 10 refinement findings closed; docs/walkthrough-validation-v0.9.5.md captures the artifact.
- P2.2: POA&M emit + OSCAL 1.1.2 plan-of-action-and-milestones added as Step 8 of the federal-SI walk-through.
- P2.3: daemon-status REST expansion — GET /api/conmon/daemon-history?limit=N rolling-history endpoint + Prometheus evidentia_conmon_daemon_* gauges at /api/metrics. New daemon CLI flags --history-file + --history-max-entries.
Phase 3 — Collaboration primitives (groundwork):
- P3.1: POA&M ownership fields — Milestone.owner + Milestone.reviewer + evidentia poam list --owner X --reviewer Y CLI + REST ?owner=X&reviewer=Y filter.
- P3.2: Append-only evidence versioning — EvidenceArtifact.version + lineage_id + predecessor_id fields + new_version() factory helper. Data-model + helper only at v0.9.5; WORM store-side enforcement lands v0.9.6.
- P3.3: Basic RBAC primitives — evidentia_core.rbac package with Role enum / RBACPolicy / check_permission + FastAPI require_role(action) dependency factory. EVIDENTIA_RBAC_POLICY_FILE env var loads policy at create_app(). Default permissive policy preserves v0.9.4 behavior. CLI-side RBAC enforcement deferred to v0.9.6.
Phase 4 — Hygiene: P4.1 backfill deferred (the v0.9.3 + v0.9.4 docs are the canonical pattern; backfill is portfolio polish, not blocking). P4.2 Codecov at 84.26% (vs 80% target). P4.3 uv.lock regenerated atomically at version bump.
2862 tests / 17 skipped / mypy strict 0 / ~225 source files / ruff clean / pytest-randomly seed-sweep clean.
Tag v0.9.6 (2026-05-18). Comprehensive ~3-week scope compressed
into a focused session per the v0.9.5 cycle-close lock-in (Allen's
"Comprehensive ~2-3 weeks" + "Phase 0 verification gate first" +
"CONMON MCP claim now" + "defer walk-through to v0.9.7" choices).
21st consecutive PROCEED-CLEAN of the v0.7.x → v0.8.x → v0.9.x
line.
Phase 0 — pre-cycle verification (BLOCKING, all PASSED): OSCAL 1.2.1 changelog confirmed schema-compatible with one observation type rename; OMB M-24-10 field set locked from agency compliance plans; mypy strict scout reported 0 errors post-cross-package re-resolution.
Phase 1 — CLI RBAC + flag normalization:
- NEW evidentia.cli._rbac.require_role_cli(action) Typer decorator mirroring evidentia_api.rbac_dependency.require_role. Shares evidentia_core.rbac.check_permission + action taxonomy (read / write / admin). Denial exits with code 77 (BSD EX_NOPERM).
- NEW evidentia.cli._rbac_lifecycle — process-lifetime singleton loader. Env vars EVIDENTIA_RBAC_POLICY_FILE + EVIDENTIA_RBAC_IDENTITY + new --rbac-identity global flag.
- conmon check --state-file canonical; --last-completed-file deprecated (DeprecationWarning; removal target v1.0).
Phase 2 — WORM evidence store + lineage CLI:
- NEW evidentia_core.evidence_store — append-only enforcement; refuses overwrite of <lineage>/v<N>.json; raises EvidenceWORMViolation with canonical recovery (call EvidenceArtifact.new_version()). UUID canonicalization + path-traversal protection.
- NEW evidentia_core.evidence_store_worm — optional cloud-WORM mirror composing with WORMBackend ABC (S3 / Azure / GCS).
- NEW evidentia evidence CLI — save (write-gated) + history <lineage> (read) + show <lineage> --version N (read).
- 3 new EventActions: EVIDENCE_VERSION_PERSISTED, EVIDENCE_WORM_VIOLATION_BLOCKED, EVIDENCE_LINEAGE_QUERIED.
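The append-only rule for the local evidence store (never overwrite an existing `<lineage>/v<N>.json`) can be sketched with exclusive-create file opens. This is a hypothetical illustration, not evidentia_core.evidence_store: the function and exception names are invented, and parsing the lineage ID as a UUID stands in for the UUID canonicalization + path-traversal protection the text mentions.

```python
import json
import uuid
from pathlib import Path


class WORMViolationSketch(Exception):
    """Illustrative stand-in for EvidenceWORMViolation."""


def save_version(root: Path, lineage_id: str, version: int, payload: dict) -> Path:
    """Persist one evidence version; refuse to overwrite an existing one."""
    if version < 1:
        raise ValueError("version must be >= 1")
    # uuid.UUID() rejects anything that is not a UUID, so values such as
    # '../etc' never reach the filesystem; str() yields the canonical form.
    canonical = str(uuid.UUID(lineage_id))
    target = Path(root) / canonical / f"v{version}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "x" fails if the file already exists, so the existence check
        # and the create happen in a single step.
        with open(target, "x", encoding="utf-8") as handle:
            json.dump(payload, handle, sort_keys=True)
    except FileExistsError as exc:
        raise WORMViolationSketch(
            f"{target.name} already exists for lineage {canonical}; "
            "create a new version instead of overwriting"
        ) from exc
    return target
```

Exclusive create closes the gap between "does the file exist?" and "write it" that a separate existence check would leave open.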
Phase 3 — AI-gov federal expansion:
- NEW evidentia_core.ai_governance.fips199 — FIPS199Categorization Pydantic model + high-water-mark validator per FIPS PUB 199 §3.
- NEW evidentia_core.ai_governance.omb_m_24_10 — OMBImpactCategory enum (rights / safety / both / neither) + triggers_minimum_practices() helper.
- NEW evidentia_core.ai_governance.scr — SCRForm matching FedRAMP template + classify_change() (routine / adaptive / transformative) + emit_scr_form() diff emitter + JSON / MD writers.
- Extended AISystemRegistryEntry with 4 Optional fields + NEW ATOReference submodel.
- NEW CLI verbs: ai-gov categorize-fips, ai-gov set-omb-impact, ai-gov update --ssp-reference, ai-gov update --emit-scr <path>.
- 3 new EventActions: AI_SYSTEM_FIPS_CATEGORIZED, AI_SYSTEM_OMB_CLASSIFIED, AI_SYSTEM_SCR_EMITTED.
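The high-water-mark rule behind the FIPS 199 validator takes the overall impact level as the highest of the confidentiality, integrity, and availability levels. The sketch below uses a plain dataclass rather than Pydantic to stay dependency-free; the class and field names are assumptions and do not reflect the real FIPS199Categorization model.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    """FIPS 199 potential-impact levels, ordered so max() works."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


def high_water_mark(confidentiality: Impact, integrity: Impact, availability: Impact) -> Impact:
    """Overall level is the highest of the three security objectives."""
    return max(confidentiality, integrity, availability)


@dataclass(frozen=True)
class CategorizationSketch:
    """Illustrative record that rejects an inconsistent overall level."""
    confidentiality: Impact
    integrity: Impact
    availability: Impact
    overall: Impact

    def __post_init__(self) -> None:
        expected = high_water_mark(
            self.confidentiality, self.integrity, self.availability
        )
        if self.overall != expected:
            raise ValueError(
                f"overall must be {expected.name} (high-water mark), "
                f"got {self.overall.name}"
            )
```

Validating at construction time keeps an understated overall level (for example MODERATE when integrity is HIGH) from ever entering the registry.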
Phase 4 — MCP first-mover + OSCAL upgrade + mypy + positioning:
- CONMON MCP first-mover CLAIMED: 4 new tools on evidentia_mcp.server (conmon_list_cadences, conmon_next_due, conmon_check_state, conmon_health) wrapping the v0.9.3 daemon. Verified-unclaimed at the v0.9.5 Q3 2026 quarterly resync; first-mover lock established ahead of FedRAMP CR26 mandatory adoption (Jan 1 2027).
- OSCAL 1.1.2 → 1.2.1 via single-source-of-truth OSCAL_SCHEMA_VERSION constant + observation types: ["finding"] → ["implementation-issue"] at one emit site.
- mypy strict gate extended to all 7 evidentia-* packages. 256 source files clean (was 223 of 247 at v0.9.5).
- Positioning: §6.1.A moat trinity + §6.1.B counter-positioning vs agentic GRC; README moat-trinity hook.
Phase 5 — Hygiene + validation + ship:
- Walk-through deferred to v0.9.7 per scope lock-in.
- docs/v0.9.6-plan.md + docs/v0.9.6-shipped.md + docs/security-review-v0.9.6.md all shipped per plan-first discipline + v4 G7.
3018 tests / 17 skipped / mypy strict 256 of 256 source files / ruff clean / pytest-randomly seed-sweep clean.
Tag v0.9.7 (2026-05-19). Comprehensive ~3-4 week scope
compressed into a focused session per the v0.9.6 cycle-close
lock-in (Allen's "comprehensive + walk-through deferred +
api-stability NORMATIVE + multi-tenant RBAC partial + CIMD signatures
groundwork" choices). 22nd consecutive PROCEED-CLEAN of the
v0.7.x → v0.8.x → v0.9.x line.
Phase 0 — pre-cycle verification (all PASSED): paramiko upstream still unpatched (carry-forward); RFC-0007 SCN required-field set captured; api-stability.md surface enumerated for NORMATIVE promotion.
Phase 1 — v0.9.6 carry-overs:
- P1.1 WORM auto-mirror (closes F-V96-worm-app-layer): NEW EVIDENTIA_EVIDENCE_AUTO_MIRROR_WORM + EVIDENTIA_EVIDENCE_WORM_BACKEND_FACTORY env vars. save_evidence() calls mirror_to_worm() after local-store write succeeds. Mirror failure non-fatal. 7 new tests.
- P1.2 CIMD scope-migration CLI verb (closes F-V96-conmon-mcp-cimd-migration): NEW evidentia mcp cimd-migrate <registry-path> verb. Adds v0.9.6 conmon_* MCP tools to each client's scope. Idempotent + atomic-write + --dry-run + --client-id filter. 9 new tests.
- P1.3 Codecov target bump: 80% → 85%.
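The idempotent, atomic, dry-run-capable behaviour described for the scope-migration verb might look roughly like the sketch below. The registry layout (a JSON object with a `clients` list, each entry holding `client_id` and `scope`) and the function name `migrate_registry` are assumptions; only the four tool names come from the v0.9.6 entry.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional

# Tool names from the v0.9.6 CONMON MCP entry; the registry layout is assumed.
CONMON_TOOLS = [
    "conmon_list_cadences",
    "conmon_next_due",
    "conmon_check_state",
    "conmon_health",
]


def migrate_registry(
    path: Path, dry_run: bool = False, client_id: Optional[str] = None
) -> int:
    """Add missing conmon_* tools to client scopes; return clients changed."""
    registry = json.loads(path.read_text(encoding="utf-8"))
    changed = 0
    for client in registry["clients"]:
        if client_id is not None and client["client_id"] != client_id:
            continue  # honour the optional client filter
        missing = [tool for tool in CONMON_TOOLS if tool not in client["scope"]]
        if missing:  # idempotent: a second run finds nothing to add
            client["scope"].extend(missing)
            changed += 1
    if changed and not dry_run:
        # Write a temp file in the same directory, then rename it over the
        # original so readers never observe a half-written registry.
        fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(registry, handle, indent=2, sort_keys=True)
        os.replace(tmp_name, path)
    return changed
```

Running it twice on the same file reports one changed client and then zero, and a dry run reports the pending change without touching the file.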
Phase 2 — v1.0 prep (headline):
- P2.1 api-stability.md → NORMATIVE: status flipped from DRAFT. v0.9.4-v0.9.6 surfaces backfilled (45+ models / 60+ EventActions / 18+ CLI commands / 8 MCP tools / 8 env vars). NEW "MCP tool contract" section + "Env-var public contract" section. Pre-v1.0 binding semantics now in force.
- P2.2 Deprecation calendar (NEW docs/deprecation-calendar.md): formal catalogue with conmon check --last-completed-file as anchor entry (target removal v1.0).
- P2.3 Multi-tenant RBAC primitives: NEW evidentia_core.rbac.multi_tenant module — TenantRBACPolicy, resolve_tenant_from_identity, check_permission_multi_tenant, load_multi_tenant_policy_from_file, from_single_tenant_policy backward-compat. 31 tests. CLI + REST integration deferred to v1.0.
- P2.4 CIMD signatures groundwork: NEW evidentia_mcp.signatures module — SignedToolOutput envelope, sign_tool_output, verify_tool_output, env-var-driven signer factory. 19 tests. FastMCP dispatch-layer auto-wrap deferred to v1.0.
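The sign/verify envelope idea in P2.4 can be sketched as follows. This is not the shipped `SignedToolOutput` implementation: it substitutes a shared-key HMAC-SHA256 for the project's signer factory, and the function names and envelope keys are assumptions. The canonical-JSON step (sorted keys plus a `default=str` fallback) reflects the encoding this roadmap mentions for `sign_tool_output()`.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def canonical_bytes(payload: Any) -> bytes:
    """Deterministic JSON: sorted keys, compact separators, str() fallback."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), default=str)
    return text.encode("utf-8")


def sign_output(payload: Any, key: bytes) -> Dict[str, Any]:
    """Wrap a tool result in an envelope carrying an HMAC over its canonical form."""
    digest = hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "algorithm": "hmac-sha256", "signature": digest}


def verify_output(envelope: Dict[str, Any], key: bytes) -> bool:
    """Recompute the HMAC and compare in constant time."""
    expected = hmac.new(
        key, canonical_bytes(envelope["payload"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, envelope["signature"])
```

Because the payload is canonicalized before signing, key order in the original dict does not affect the signature, while any change to a value does.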
Phase 3 — OSCAL SCR notification standard alignment (RFC-0007):
- SCRForm extended with 8 Optional RFC-0007 fields.
- NEW SCRForm.to_oscal_scr_notification() emitter — raises ValueError listing missing required fields. Per-category extras (Adaptive + Transformative pre-impl) auto-emitted.
- 8 new tests.
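The "Optional fields on the model, required at emit time" pattern described above could be sketched like this. The field names and the `REQUIRED_AT_EMIT` tuple are placeholders invented for illustration; the actual RFC-0007 required-field set is not listed in this roadmap.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional

# Placeholder names only; the real RFC-0007 required-field set is not shown here.
REQUIRED_AT_EMIT = ("change_title", "change_category", "planned_date")


@dataclass
class ScrDraft:
    """Hypothetical form: every field is Optional so drafts can be saved early."""

    change_title: Optional[str] = None
    change_category: Optional[str] = None
    planned_date: Optional[str] = None
    notes: Optional[str] = None

    def to_notification(self) -> Dict[str, str]:
        """Emit only when all required fields are set; name every gap at once."""
        missing = [name for name in REQUIRED_AT_EMIT if getattr(self, name) is None]
        if missing:
            raise ValueError("missing required fields: " + ", ".join(missing))
        return {key: value for key, value in asdict(self).items() if value is not None}
```

Listing all missing fields in a single error lets an operator fix a draft in one pass instead of discovering the gaps one at a time.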
Phase 4 — Q3 quarterly resync follow-ups:
- P4.1 Academic positioning sharpened: NEW §11.2.A "OSS-native reference implementation for computational compliance" frame citing Marino & Lane (arXiv 2601.04474), de la Chica & Martí-González (arXiv 2605.14744), FedRAMP CR26 + RFC-0024 readiness.
- P4.2 HF Hub GRC LLM eval-suite scaffolding (NEW docs/hf-eval-suite-scaffolding.md): documented planned dataset structure + publication path. Full publish deferred to v0.9.8+.
- P4.3 Conference outreach DEFERRED to v0.9.8+ (needs human-reviewed talk abstracts).
Phase 5 — Hygiene + ship:
- Walk-through deferred indefinitely per scope lock-in.
- docs/v0.9.7-plan.md + docs/security-review-v0.9.7.md shipped.
- scripts/bump_version.py --to 0.9.7 + uv.lock regen.
- Backfill v0.9.1 + v0.9.2 security-review docs deferred.
3092 tests / 17 skipped / mypy strict 258 of 258 source files / ruff clean.
Tag v0.9.8 (2026-05-21). Focused session wiring v0.9.7's
data/decision-only primitives into live surfaces, closing the
CR-V97 review polish, and clearing supply-chain + type-safety gaps
caught during the pre-tag review.
Phase 1 — v0.9.7 deferral closure:
- Multi-tenant RBAC integration (P1.3-P1.6): NEW --rbac-tenant CLI flag + tenant-aware policy auto-detection; FastAPI require_role derives the tenant claim from the authenticated principal (closes F-V97-multi-tenant-claim-spoofing); POA&M + evidence stores gain per-tenant directory roots; NEW RBAC_TENANT_BOUNDARY_CROSSED audit event; shared load_rbac_policy_auto.
- MCP dispatch-layer signing (P1.1): SignedToolOutput wired at the FastMCP tool-dispatch path; the signature rides in CallToolResult._meta as additive provenance.
- In-tree Sigstore-keyless MCP signer (P1.2): NEW evidentia_mcp.sigstore_signer (closes F-V97-mcp-signer-trust).
- HF Hub GRC eval suite (P1.9): FedRAMP Rev 5 High + CMMC L2 corpus subsets + dataset card + scripts/publish_hf_eval.py; combined corpus regenerated to 195 entries.
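The claim-spoofing fix in the first item rests on one rule: the tenant is taken from the authenticated principal, never from a caller-supplied value. A minimal sketch of that rule follows; `Principal`, `TenantBoundaryError`, and `authorize` are hypothetical names, not the project's `require_role` implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Principal:
    """Hypothetical authenticated identity, as produced by the auth layer."""

    subject: str
    tenant: str
    roles: FrozenSet[str]


class TenantBoundaryError(PermissionError):
    """Raised when a request names a tenant other than the principal's own."""


def authorize(
    principal: Principal, required_role: str, requested_tenant: Optional[str] = None
) -> str:
    """Return the tenant to operate in; it always comes from the principal."""
    if requested_tenant is not None and requested_tenant != principal.tenant:
        # A caller-supplied tenant is only ever checked, never trusted.
        raise TenantBoundaryError(
            f"{principal.subject} may not act in tenant {requested_tenant!r}"
        )
    if required_role not in principal.roles:
        raise PermissionError(f"{principal.subject} lacks role {required_role!r}")
    return principal.tenant
```

In a design like this, the boundary error is the natural place to emit an audit event such as the RBAC_TENANT_BOUNDARY_CROSSED action mentioned above.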
Phase 2 — CR-V97 review polish:
- Shared evidentia_core.factory_resolver (CR-V97-3 de-duplication + CR-V97-1 cached resolution).
- sign_tool_output() canonical-JSON encoding via default=str (CR-V97-4).
Phase 3 — supply-chain + type-safety:
- idna 3.11 → 3.15 (CVE-2026-45409).
- Three SigningContext.production() runtime breaks fixed (sigstore 4.2.0 API migration) + PostgreSQL collector type narrowing.
- The CI + release-checklist mypy gates aligned (--all-extras in CI; evidentia-mcp in the checklist command) so extra-gated type errors can no longer slip through.
Deferred: federal-SI walk-through validation (folded into the v1.0 self-test phase); paramiko CVE-2026-44405 LOW (a fix now exists upstream — carried forward to v0.9.9 as its own focused SSH-library major bump).
3250 tests / 14 skipped / mypy strict 262 of 262 source files / ruff clean.
Tag v0.9.9 (2026-05-21). A focused supply-chain patch — no source
or test code changed; dependency versions, CI workflow, supply-chain
tooling, and docs only.
Phase 1 — Dependabot queue clearance:
- Five grouped version-update PRs merged (the python-dev, npm-runtime, npm-dev, and github-actions groups + the Docker base-image digest), all CI-green.
- Three orphaned PRs closed — they targeted only docker/requirements.txt via a pip/uv-docker Dependabot ecosystem no longer present in .github/dependabot.yml; that file is regenerated from uv.lock at release time (G4 Path 2), so the PRs were superseded.
- .github/dependabot.yml audited — coverage confirmed complete.
Phase 2 — osv-scanner --sbom pre-push gate:
- NEW scripts/run_osv_scan.py + osv-scanner.toml allowlist + an osv-scan job in .github/workflows/test.yml + a Step 5 entry in docs/release-checklist.md. CI and the documented gate invoke one shared script. Closes the v0.9.8 gate-fidelity gap — the 16-row pre-push gate's Row 14 read Dependabot alerts, which suppress DISPUTED CVEs, so a disputed pyjwt advisory surfaced post-tag.
Phase 3 — paramiko CVE-2026-44405 closed:
- compliance-trestle 4.0.2 → 4.0.3 pulls paramiko 4.0.0 → 5.0.0, past the <= 4.0.0 vulnerable range. paramiko is a dev-only transitive dependency (via compliance-trestle, OSCAL round-trip tests); no Evidentia code imports it.
Deferred: the federal-SI domain-expert walk-through — deferred indefinitely per the resequencing above; runs before v1.0.0, after the operator self-test + demo/pitch phase.
3250 tests / 14 skipped / mypy strict 261 of 261 source files / ruff clean.
Opened 2026-05-21 following a competitive/integration research pass
(see docs/integration-survey.md and
docs/positioning-and-value.md §5.5 /
§5.6.A). The v0.10.x line is the home for the research-driven feature
surface: because it brings meaningful new feature surface, it is a
minor bump from v0.9.9 rather than a continuation of v0.9.x patches.
Themes (precise per-release boundaries are set per release plan):
- OCSF normalized findings schema — the keystone, and where v0.10.0 begins. Refactor evidence collectors to emit a canonical, framework-neutral finding aligned to the Open Cybersecurity Schema Framework, mapping into control gaps downstream. Decouples collector count from framework count and unlocks the integrations below.
- SARIF emit — evidentia gap emits SARIF 2.1.0 so gap analysis is a blocking PR check in GitHub / GitLab security dashboards.
- OCSF-based collectors — Prowler, AWS Security Hub, Trivy / Checkov ingestion, near-free once the normalized schema lands.
- MCP-as-backend + GRC Engineering Club interop — deepen the MCP tool/resource surface; publish a thin Evidentia MCP plugin into the GRC Engineering Club's grc-engineering-suite marketplace.
- Persona modes (auditor / engineer / TPRM) and YAML-driven catalog / control-tier definitions to broaden UX and contribution.
- AI-governance regulatory packs (EU AI Act Annex IV technical documentation, ISO/IEC 42001) and agentic-governance primitives (agent cards, tool-use permissioning) on the existing MCP / CIMD substrate.
- Map Evidentia onto the OpenSSF Gemara reference model in positioning material.
The full prioritized integration list and sequencing rationale are in
docs/integration-survey.md §7.
A dedicated phase sequenced after the v0.10.x research-driven feature surface is built: the maintainer works through the entire product hands-on to fully understand it end-to-end — exercising every CLI verb, REST endpoint, evidence collector, MCP tool, and UI surface — builds out project documentation / a project wiki, and produces demo recordings. This formalizes and expands the "operator self-test + demo phase" referenced in the v0.9.9 entry above; it runs before the v1.0 domain-expert walk-through, and any gaps it surfaces feed back into the backlog.
Sourced from the Phase B audit re-run + 6-stream Evidentia-integration
research synthesis (~/.claude/skills/pre-release-review/_audits/evidentia-integration-plan-2026-05-24.md).
Full plan at docs/v0.10.5-plan.md. Headline:
Evidentia ships 4 first-of-its-kind OSS artifacts, each currently
absent from the public ecosystem (confirmed via gh api search/code +
ecosystem scan):
- First public OSCAL serialization of OpenSSF OSPS Baseline (zero prior in usnistgov/oscal-content / oscal-club/awesome-oscal / IBM/compliance-trestle / lula-tool / OpenSCAP).
- First public OSPS-CONFORMANCE.md self-attestation (gh api search/code "OSPS-CONFORMANCE.md" returns 0).
- First Tier-A OSPS-Baseline bundled control catalog set in any GRC tool (3 maturity files matching existing fedramp-rev5-{low,moderate,high} + cmmc-2-{l1,l2,l3} precedent).
- First Apache-2.0 machine-readable EU AI Act ↔ ISO/IEC 42001 crosswalk (deferred to v0.11 per RF4; v0.10.5 sets up the bundled OSPS Baseline crosswalk infrastructure that v0.11 reuses).
v0.10.5 phases: (1) OSPS Baseline 3-maturity catalog set; (2) OSCAL
conversion + upstream PR to oscal-club/awesome-oscal (separate
publishing approval); (3) OSPS-CONFORMANCE.md + machine-readable
companion + CI gate; (4) SECURITY.md + security.txt + GitHub
Security Advisories enablement (separate gh api approval); (5)
EOL.md + docs/verification.md consumer-side cosign + PEP 740
recipes; (6) positioning §16 skip-by-reuse note; (9) evidentia-eval
workspace package extraction (Kimi audit close-out — DFAH harness
extracted from evidentia-ai/eval/ to a dedicated 8th package so air-gap
installs of the production runtime no longer transitively pull
the dev-time eval stack; evidentia_ai.eval.* retained as
deprecation shim through v0.11.x, removal in v0.12.0). ~2 weeks scope.
v0.10.6 — OSS first-mover artifacts + OSPS crosswalks + GitHub collector extension + hygiene — SHIPPED
Patch on v0.10.5 (released 2026-05-26). 17 cycle commits authored
2026-05-27. Tag v0.10.6. Carried out the v0.10.5
deferred Phases 1-5 OSS first-mover artifacts theme plus downstream
OSPS crosswalks + GitHub collector extension + post-v0.10.5 hygiene.
Headline shipments: OSPS Baseline 3-catalog bundle + first public
OSCAL Catalog 1.2.1 serialization of the OpenSSF OSPS Baseline
(first-mover claim verified via gh api search); OSPS-CONFORMANCE.md
self-attestation + verify-osps-conformance.yml CI gate that re-validates
every evidence link on push/PR/cron (first public open-source project
to ship this artifact); SECURITY.md refresh + .well-known/security.txt + GHSA
private vulnerability reporting (closes OSPS-VM-01/02/03 + CISA
SbD Pledge Goal 5); EOL.md + docs/verification.md consumer-facing
lifecycle + cosign + PEP 740 + osv-scanner + SLSA Provenance v1
verification recipes; 5 OSPS-Baseline crosswalks (NIST SSDF / NIST CSF
2.0 / EU CRA / PCI DSS 4.0 / NIST 800-161) shipped raw with
upstream-attested provenance disclaimer per the 2026-05-26 brainstorm
rigor decision (hand-verification deferred to v0.10.7);
CrosswalkDefinition extended additively with 3 optional provenance /
verification / verification_note fields;
evidentia_collectors.github.osps module with 16 populate_osps_* helpers
covering AC/BR/DO/GV/LE/QA/VM families + 4 additive GitHubClient
methods; workflow-permissions audit (advisory; v0.10.7 promotes to
blocking); Scorecard 6.2 → 6.5+ restoration via verify-changelog.yml
SHA pinning. Release-checklist Step 2.A captures the v0.10.5 LL-V105-1
partial-publish prevention (new-PyPI-project pending-publisher check
before tagging). Workspace ships 8 PyPI packages unchanged from v0.10.5
(no new packages this cycle, no LL-V105-1 recurrence risk). 3536 tests
pass / 14 skipped / 3550 collected across 279 source files (was 268
v0.10.5); mypy strict 0/0; ruff clean. Four §12 corrections-log entries
this cycle (see docs/v0.10.6-plan.md §12). OSCAL upstream contribution
PR at https://github.com/oscal-club/awesome-oscal/pull/59.
v0.10.7 — Web console (GUI v2) + gap-export, on a hygiene + automation-debt + wiki-fill + doc-accuracy base — SHIPPED
Patch on v0.10.6 (released 2026-05-27). Tag v0.10.7 (2026-05-30).
A web-UI + hardening + automation-debt + documentation cycle. The
headline end-user change is the web console: a full GUI v2 visual
refresh plus a real gap-report export/download surface (8 formats).
The hardening side closed the v0.10.6 code-quality reviewer backlog
(Groups A + D) and the 2 deferred Scorecard alerts, added a blocking
pre-push gate (now with never-skip version-anchor + frontend guards),
filled the in-repo wiki tree, added 7 operator-walkthrough guides,
fixed two real product bugs (TPRM + governance enum rendering), and
ran a doc-wide CLI-example accuracy sweep. Headline shipments:
- Web console (GUI v2) — full design-system refresh (federal-blue / deep-navy chrome, light/dark, self-hosted IBM Plex + favicons / PWA manifest / OG brand assets, every route + onboarding restyled; presentation-only with all API / SSE / Zustand wiring + accessibility preserved) and a real gap-report export/download of all 8 formats, guarded by an OpenAPI → TS type-parity drift-gate. Live-validated across all 8 routes with zero console errors.
- 7 operator-walkthrough guides + an input-schema reference, and two product bug-fixes — tprm dd-questionnaire ingest + the governance workflow run/advance status output rendered enum fields raw (the models store enums as strings under use_enum_values); fixed via the shared enum_value helper, TDD, with a sibling audit.
- Scorecard delta closed — verify-osps-conformance.yml pip install hash-pinned (#123 PinnedDependenciesID); sync-wiki.yml top-level token scope reduced to read-all with contents: write pushed down to the wiki-push job (#124 TokenPermissionsID).
- OSPS crosswalk reproducibility — scripts/catalogs/gen_osps_crosswalks.py deterministically rebuilds the 5 OSPS JSONs byte-for-byte from a single-source upstream-SHA constant (_osps_upstream.py) with a --check drift mode. The ~15 literal SHAs are now a generated artifact (next upstream bump = one-line constant change + regen). Note: A2 closed-via-reproducibility, not literal-deduplication — JSON can't reference a Python constant (see docs/v0.10.7-plan.md §12.2). Crosswalk SME hand-verification remains deferred (v0.11+).
- translate_url() extraction from verify-osps-conformance.yml into the tested scripts/verify_osps_conformance.py module.
- GitHub OSPS collector DRY pass — _unknown_finding() factory dedupes the UNKNOWN-branch boilerplate; _file_present_at_any now surfaces UNKNOWN (not FAIL) on all-5xx probes (honest signal).
- Workflow-permissions audit promoted to a blocking CI gate — audit_workflow_permissions.py --strict + # JUSTIFIED: parser + --json; new verify-workflow-perms.yml; 3 workflows carry JUSTIFIED annotations (PR-comment + issue-opening bots).
- Pre-push gate Layer 2 — hand-rolled .githooks/pre-push orchestrator (consistent with the existing .githooks/commit-msg; the pre-commit framework was rejected because it conflicts with this repo's core.hooksPath setup — see §12.3) running 7 blocking checks (action-pins, secrets, CHANGELOG-presence, docs-health --strict, workflow-perms --strict, uv.lock third-party pin-drift, OSPS-crosswalk drift) + bypass logging + docs/pre-push-gate.md. L1 (local Scorecard sweep) + L3 (warning-only) deferred.
- In-repo wiki content fill (~47 pages) — auto-generated canonical mirrors + reference pages (CLI / MCP tools / config / catalogs / crosswalks) + 7 per-package API pages + hand-authored, triple-validated concept / guide / compliance pages + FAQ; generators wired into sync-wiki.yml.
- Bundled evidentia.examples/sample-inventory.yaml in the evidentia wheel so the quickstart gap analyze is runnable for pip install users.
- Doc-wide CLI-example accuracy sweep — fixed gap analyze examples in README + both quickstarts + air-gapped guide to the real --inventory / --frameworks / --output / oscal-ar surface; corrected the federal-SI walkthrough Step-8 CLI.
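The crosswalk-reproducibility item in the list above (one upstream-SHA constant, byte-for-byte regeneration, a drift check) follows a general generate-then-compare pattern, sketched here. The constant value, JSON layout, and function names are assumptions for illustration and do not mirror the real gen_osps_crosswalks.py.

```python
import json
from pathlib import Path
from typing import List

# Placeholder value, not a real upstream commit.
UPSTREAM_SHA = "0" * 40


def render_crosswalk(name: str, mappings: List[List[str]]) -> bytes:
    """Render one crosswalk deterministically: sorted rows, sorted keys, fixed layout."""
    document = {
        "name": name,
        "upstream_sha": UPSTREAM_SHA,
        "mappings": sorted(mappings),
    }
    return (json.dumps(document, indent=2, sort_keys=True) + "\n").encode("utf-8")


def has_drifted(path: Path, expected: bytes) -> bool:
    """Drift means the committed file is missing or differs by even one byte."""
    return not path.exists() or path.read_bytes() != expected
```

A check mode then reduces to rendering every crosswalk in memory and failing if `has_drifted` is true for any of them; bumping the upstream pin means editing the single constant and regenerating.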
Two §12 accuracy corrections caught by the doc verify-everything
pass (see docs/v0.10.7-plan.md §12.5/§12.6): the CIMD
terminology misnomer (Client ID Metadata Document, OAuth scope —
distinct from the SignedToolOutput signing mechanism) corrected in
the wiki + api-stability.md; and the gap analyze CLI examples
that had never matched the shipped CLI, corrected doc-wide. Workspace
ships 8 PyPI packages unchanged from v0.10.6 (no new packages this
cycle). 3666 tests pass / 14 skipped / 3680 collected across 281
source files (was 279 v0.10.6); mypy strict 0/0; ruff clean.
Patch on v0.10.7 (released 2026-05-31). Tag v0.10.8 (2026-06-05).
Full plan: docs/v0.10.8-plan.md. Theme:
institutionalize the v0.10.7 quality discipline into the automatic
release mechanism, start enforcing CLI↔GUI feature parity, and begin
closing the GUI gap. First ship under /pre-release-review v5.2.
Headline shipments: the tag-time gate job in release.yml (publish
jobs blocked on a full gate-suite run, via the run_gate_suite.py
single source of truth) + the consistency.yml CI staleness mirror +
a real CI secret-scan; the cli-gui-parity.yaml manifest +
check_parity.py gate (advisory this cycle; GUI coverage 6.1% →
13.3%); 4 Tier-B GUI screens (POA&M / TPRM / ConMon / Explain); the
Phase G upkeep workflows (stale-branches, dependabot-automerge,
safeguards-resweep); 4 wiki guides + README hero refresh. The new
tag-gate proved itself on its first release — it correctly blocked
the initial publish on 3 real pre-publish issues (a pyjwt CVE wave, an
accepted-with-rationale aiohttp pair, and an eval-CLI test failure
only reproducible under the gate's full-extras environment). Post-ship
fix (no version bump): secret-scan.yml switched from the
license-gated gitleaks action to the MIT-licensed gitleaks binary.
Note: the required-signatures ruleset shipped with its admin bypass
still in place — the bypass removal (closing F-V107-1) moved to the
v0.10.9 cycle's Tier-4 cleanups.
Approved scope (Allen 2026-05-31):
- Release-hardening automation — a tag-time gate job in release.yml (pytest / mypy / ruff / version-consistency / docs-health / osv) that blocks the irreversible PyPI publish on a red or stale tree; a CI mirror of the version/doc-staleness guards (currently pre-push-only); auto-regenerate the README on CHANGELOG change; a real CI secret-scan.
- Commit-signature enforcement — a GitHub required-signatures ruleset on main with enforce_admins=true (server-side, closing the F-V107-1 admin-bypass), with the v0.10.7 local pre-push check as defense-in-depth.
- CLI↔GUI parity mechanism — a cli-gui-parity.yaml manifest + a scripts/check_parity.py CI gate (completeness + GUI-existence + debt-ratchet so new CLI work must ship its GUI surface). The current OpenAPI drift-gate keeps types in sync but enforces no feature parity.
- GUI build-out (start) — phased, API-exists-add-screen first (collect / tprm / poam / conmon / model-risk / integrations / ai-gov / explain), then build API+screen for governance / retention / evidence / oscal / eval. This cycle lands 4 Tier-B screens — POA&M / TPRM / ConMon / Explain (scope resolved 2026-06-02); the rest follow across v0.10.9+, driven down by the parity debt-ratchet.
- Operator-walkthrough wiki media — screenshots and/or video captured live during the self-led operator walkthrough.
- README hero refresh — centered OG title card + centered buttons.
- DONE this cycle: the Framework-detail .border-dest fix (validated 3×).
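The three checks named for the parity gate in the list above (completeness, GUI-existence, debt-ratchet) can be sketched as one function. The manifest shape used here (CLI verb mapped to a GUI route, or None when no screen exists yet) is an assumption; the real cli-gui-parity.yaml schema is not described in this roadmap.

```python
from typing import Dict, List, Optional, Set


def check_parity(
    cli_verbs: Set[str],
    gui_routes: Set[str],
    manifest: Dict[str, Optional[str]],
    max_debt: int,
) -> List[str]:
    """Return a list of parity errors; an empty list means the gate passes."""
    errors: List[str] = []

    # Completeness: every CLI verb must be declared in the manifest.
    unlisted = sorted(cli_verbs - set(manifest))
    if unlisted:
        errors.append("not in manifest: " + ", ".join(unlisted))

    # GUI-existence: a declared route must actually exist in the frontend.
    for verb in sorted(manifest):
        route = manifest[verb]
        if route is not None and route not in gui_routes:
            errors.append(f"{verb}: GUI route {route!r} does not exist")

    # Debt ratchet: verbs without a screen may not exceed the recorded ceiling.
    debt = sum(1 for route in manifest.values() if route is None)
    if debt > max_debt:
        errors.append(f"GUI debt {debt} exceeds ratchet {max_debt}")
    return errors
```

Lowering `max_debt` each time a screen ships is what makes the ratchet one-way: new CLI verbs cannot be added without either a GUI route or an explicit, reviewed increase of the ceiling.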
Resolved via brainstorming (2026-06-02): GUI scope = moderate (the
4 Tier-B screens above); the wiki is populated per-screen as built
(Phase E1 screenshots pulled into each screen's definition-of-done;
the operator-walkthrough video stays a live capture); execution runs
as 3 waves — (1) release-hardening + parity mechanism, (2) the 4 GUI
screens, (3) automation + polish — with a review checkpoint after each
wave and a consolidated /pre-release-review + /security-review +
/code-review gate before the (held) single push + tag.
Phase G — automatic-upkeep (resolved 2026-06-02): ADOPT this cycle — stale-branch-flagging workflow, Dependabot auto-merge (patch/minor, post-gate; needs the repo "Allow auto-merge" setting), quarterly safeguards re-sweep. DEFER to the automatic-upkeep backlog — doc/pointer-rot cadence, session → memory capture, a consolidate-memory pass (requires explicit approval; the private MEMORY.md index is over its size limit), and research-resync cadence.
Carry-over deferred backlog (from the v0.10.7 cycle, per
docs/v0.10.7-plan.md §6):
- CIMD-terminology scrub across the 4 active non-wiki docs — correct the "Cryptographic CIMD signatures" / "CIMD signing" misnomer in docs/ROADMAP.md, docs/capability-matrix.md, docs/positioning-and-value.md, docs/threat-model.md to distinguish SignedToolOutput signing (the real crypto feature) from CIMD (OAuth client-scope metadata). Careful per-hit pass, NOT a blind find-replace; append-only historical docs left untouched. Internal memory entries also flagged for correction. (See §12.5.)
- CatalogEntry phantom in api-stability.md — the catalog.py frozen-models row lists CatalogEntry, which doesn't exist in code (real models: FrameworkManifestEntry / CatalogControl / ControlCatalog / FrameworkMapping / CrosswalkDefinition). Correct the NORMATIVE row + regenerate its wiki mirror.
- OSPS crosswalk SME hand-verification — upgrade the 5 OSPS crosswalks from verification: "self-attested-via-upstream" to "hand-checked" where an SME confirms accuracy (SME-grade work; could fold into the v0.11 cycle).
- D2–D6 code-quality MINORs — gen_osps_crosswalks difflib + dynamic-load alignment; github/osps.py error-type narrowing + unreachable-fallback cleanup + a qa-02 UNKNOWN test; check_docs_health.py content-anchored cross-link allowlist (recurring absolute-line-number footgun); audit_workflow_permissions.py CRLF JUSTIFIED-parser test; pre-push L2 bash smoke tests + content-anchored self-exclude; wiki-generator code-span-aware link rewriting.
- Deploy MkDocs to GitHub Pages — the 7 auto-generated API pages link to a not-yet-deployed MkDocs site (mitigated: each also links to live GitHub source). Needs a pages: write Pages workflow.
- docs/v0.9.3-plan.md cross-link WARNs — 3 self-referential cross-project links in a historical plan doc; fix to plain relative links or accept as historical (low priority).
- OSPS-LE-01.01 DCO sign-off — needs GOVERNANCE.md + a second contributor.
- OSPS-VM-05.03 osv-scanner CI gate (verify-osv-scan.yml) — small enough to land standalone or fold into v0.11.
- Pre-push gate L1 / L3 — defer-or-skip; revisit if a new pattern justifies them.
Theme: close the v0.10.8 ship findings and skill-iteration debt, and
harden the release machinery that cycle built. Full plan:
docs/v0.10.9-plan.md (approved 2026-06-10; scope
= moderate, all eight items). Scope summary:
- eval CLI _resolve_sign OIDC graceful degrade — the product fix behind the v0.10.8 test fix. The eval CLI auto-signs when GITHUB_ACTIONS=true and sigstore is importable, but never checks OIDC-token obtainability, so it crashes (SigstoreSigningError) in any CI job lacking id-token: write. Check ACTIONS_ID_TOKEN_REQUEST_TOKEN and degrade gracefully (write unsigned + warn) instead of crashing.
- SF-V108-3 — check_uv_lock_pin_drift should diff the bump-commit's uv.lock specifically, not the aggregate push range, so a separately-committed dependency bump in the range no longer trips the false positive that forced a logged bypass in v0.10.8.
- SF-V108-4 — the release-safeguards-scaffolder G4 template defaults to the gitleaks binary for organization-owned repos (the gitleaks action needs a paid license on org repos — the bug Evidentia hit post-v0.10.8).
- parity.yml advisory → blocking — flip the CLI↔GUI parity gate now that a full cycle has run advisory.
- Deferred polish from the v0.10.8 review: widen PoamGap → ControlGap-Output[]; extract a shared lib/sse.ts (ExplainPage + RiskGeneratePage duplicate the SSE reader); safeguards-resweep exact-title idempotency.
- Watch-item: the accepted aiohttp client-cookie CVEs (osv-scanner.toml ignoreUntil 2026-12-04) — if litellm relaxes its exact aiohttp==3.13.4 pin so aiohttp can float to ≥ 3.14.0, drop both ignore entries and re-validate.
- Tier-4 post-v0.10.8 cleanups folded in: the Dependabot "Allow auto-merge" repo setting (prerequisite for dependabot-automerge.yml to actually merge); the F-V107-1 ruleset admin-bypass removal. (The wiki sync proved already automatic — sync-wiki.yml carried the v0.10.8 guides on push.)
- A competitive/market research refresh (the quarterly-ish resync; last full pass at v0.9.5) runs alongside the cycle; outputs land in docs/positioning-and-value.md.
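The OIDC graceful-degrade item at the top of the scope summary above amounts to one extra precondition before auto-signing. The sketch below shows that decision in isolation; `should_sign` is a hypothetical helper, not the project's `_resolve_sign`, and it only inspects the environment rather than calling sigstore.

```python
import warnings
from typing import Mapping


def should_sign(env: Mapping[str, str], sigstore_available: bool) -> bool:
    """Decide whether to auto-sign, degrading to unsigned output with a warning."""
    if env.get("GITHUB_ACTIONS") != "true" or not sigstore_available:
        return False  # not in CI, or signing library absent: never auto-sign
    if not env.get("ACTIONS_ID_TOKEN_REQUEST_TOKEN"):
        # The runner only exposes this variable when the job has been granted
        # the id-token: write permission, so its absence means no OIDC token.
        warnings.warn(
            "OIDC token unavailable (job lacks id-token: write?); "
            "writing unsigned output",
            RuntimeWarning,
            stacklevel=2,
        )
        return False
    return True
```

Passing the environment in as a mapping (for example `os.environ`) keeps the decision testable without mutating process state.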
Sourced from Phase B audit v3 + integration plan §"Per-release detailed integration plan" §v0.11. Substantive minor (~6-8 weeks):
- KSI (Key Security Indicators) emission per FedRAMP's machine-readable schemas (FRMR JSON; the FedRAMP/schemas JSON-Schema repo) — wires as third output mode on evidentia conmon alongside the 7 bundled cadences shipped v0.9.0. (Re-based 2026-06-10: KSIs are FRMR JSON, not OSCAL feeds — OSCAL remains the Rev5/RFC-0024 package format per NTC-0009; see integration-survey §8.2.) Evidentia's natural slot per Phase B Stream E4: OSS engine for the audit-quality middle layer between Trestle (raw OSCAL SDK) and RegScale (commercial FedRAMP package generator).
- Evaluate OSCAL 1.2.1 → 1.2.2 — OSCAL 1.2.2 released 2026-04-30; assess the schema delta against the current 1.2.1 surface before adopting.
- evidentia incident emit --format dora-art-17 (DoraIncident Pydantic record + classify_dora() per RTS 2024/1772 Art. 8 + Art. 9; auto-POA&M creation for 4h/24h/72h/1-month reporting clocks). First Apache-2.0 OSS DORA Art. 17 reference emitter (closed-source GRC vendors embed this inside paid platforms; no public OSS implementation exists). CIR 2025/302 Annex I/II/III/IV template alignment.
- nist-sp-800-218a-ai-coding Tier-B bundled catalog — 11 controls covering the AI-assisted-code-production subset of NIST SP 800-218A (vs the broader AI-model-development scope). Pair with docs/ai-coding-policy.md template (CLAUDE.md / .cursorrules / copilot-instructions.md skeleton ready for clients to fork). Strong dogfood narrative: Evidentia uses Claude Code to develop itself.
- AI-governance crosswalk enrichment 4-phase: (Phase 1) ISO 27001:2022 Amendment 1:2024 Climate as in-catalog addendum to iso-27001-2022.yaml; (Phase 2) NIST AI 600-1 GenAI Profile + ISO/IEC 23894 as Tier-B catalogs; (Phase 3) first Apache-2.0 machine-readable EU AI Act ↔ ISO/IEC 42001 crosswalk — docs/crosswalks/eu-ai-act-to-iso-42001.yaml clean-room from EU AI Act Annex III + ISO 42001 Clauses 4-10 + Annex A controls (zero public OSS equivalent); (Phase 4) Council of Europe AI Convention (CETS 225) Tier-C stub.
- evidentia vex publish --rekor — Sigstore Rekor attestation via cosign attest --type openvex. Closes OSPS-VM-04 maturity-3 control + CISA SbD Goal 6 alignment.
- VSA (Verification Summary Attestation) emit per SLSA v1.2 — evidentia oscal vsa <ar.json> → consumer-facing verification policy. Closes SLSA Source Track L2 claim path.
- Auto-generate docs/security-review-vX.Y.Z.md from per-run JSON via skill v5.1 Q9 mechanism.
- DORA-metrics extractor scripts/extract_dora_metrics.py — passive collection across 30+ Evidentia releases reading per-run JSONs → MTTR / lead-time / change-failure-rate / bypass-rate. Enables ESEM 2027 SEIP short-paper submission.
- arXiv preprint authored: "Evidentia: OSS Reference Implementation of Computational Compliance for Multi-Framework Regulatory Assurance" — 6-8pp, cites Marino & Lane (arXiv:2601.04474) blueprint, establishes priority before another impl beats Evidentia to the generalist-GRC-OSCAL niche.
- SARIF-ingestion collector (evidentia collect sarif) — the consume-side counterpart to the v0.10.0 SARIF emit, and the integration-survey.md §3 #5 candidate. One adapter ingests any SARIF 2.1.0 emitter (Trivy / Checkov / Semgrep / CodeQL, and the Clear Capabilities agentic-security scanner — see integration-survey.md §9) into SecurityFindings: maps SARIF level → Severity, preserves codeFlows taint traces + KEV/EPSS properties as provenance, and reuses the v0.10.1 OCSF collector's HTTPS/SSRF guard (--block-private-ips). Mirrors the evidentia_collectors.ocsf module; data-layer interop only (no third-party code dependency — see the integration-survey.md §9 licensing note). Design spec: sarif-ingestion-collector-design.md (control-agnostic default + attestation-gated candidate mappings from SARIF-native taxa / operator map / derived; reuses ControlMapping + OLIRRelationship).
- Refresh docs/integration-survey.md competitive section post-operator-deep-dive (incorporate AWS OSCAL MCP / Vanta MCP / ComplianceCow MCP / Snyk AI Trust Platform shifts).
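The "SARIF level → Severity" mapping planned for the SARIF-ingestion collector in the list above could take roughly the following shape. The `Severity` enum and the specific level-to-severity choices are assumptions (the real Evidentia enum and mapping may differ), and the fallback is simplified: it treats a missing `level` as "warning" and ignores the rule-configuration and `kind` lookups that SARIF 2.1.0 defines.

```python
from enum import Enum
from typing import Any, Dict, Iterator, Tuple


class Severity(Enum):
    """Hypothetical severity scale; the project's own enum may differ."""

    INFORMATIONAL = "informational"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# The four level values defined by SARIF 2.1.0 for a result.
LEVEL_TO_SEVERITY = {
    "none": Severity.INFORMATIONAL,
    "note": Severity.LOW,
    "warning": Severity.MEDIUM,
    "error": Severity.HIGH,
}


def severity_for(result: Dict[str, Any]) -> Severity:
    """Map one SARIF result; a missing level is treated as 'warning' (simplified)."""
    level = result.get("level", "warning")
    return LEVEL_TO_SEVERITY.get(level, Severity.MEDIUM)


def iter_findings(sarif: Dict[str, Any]) -> Iterator[Tuple[str, Severity]]:
    """Yield (ruleId, severity) for every result in every run of a SARIF log."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            yield result.get("ruleId", "<unknown>"), severity_for(result)
```

Keeping the mapping in one table makes it straightforward for an operator-supplied override to replace it later without touching the traversal code.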
Items deferred from Phase B audit v3 + integration plan §"v1.1+":
- Multi-tenant RBAC full CLI/REST wire (v0.11+ scope; primitives shipped v0.9.7).
- PR-time auto-blocking workflow (closes OSPS-VM-05 + VM-06 maturity-3 controls; 100-LOC workflow YAML).
- AIReg-Bench adapter (evidentia_eval/aireg_bench.py) — scores Evidentia against Marino & Lane benchmark; establishes computational-compliance reference-implementation priority.
- Auto-redaction script for per-run JSON publication (scripts/redact_for_publication.py) — strips client-PII; enables public dataset release alongside ESEM 2027 paper + pairs with MSR 2027 Mining Challenge candidacy.
- ESEM 2027 SEIP short-paper submission (~May 2027 deadline) OR ICSE 2027 Demonstrations track (4-6pp tool demo).
- Persona modes full UX (auditor / engineer / TPRM specialists) — scope post-v1.0.
- Hosted federal-cloud variant — scope post-v1.0.
- OpenSSF Best Practices Badge Gold tier — unblocked only when Polycentric-Labs has ≥2 active core maintainers (tied to organizational-onboarding milestone; specifics out of scope pre-v1.0).
- Architectural Tier 3 items from Phase B audit v3 (control-chart script ships skill-side; dynamic-install eBPF scan; etc.).
See docs/v1.0-transition.md for the full
narrative. v1.0 combines Candidate A (federal-compliance theme
accepted by domain expert) and Candidate B (public API contract
frozen). Acceptance gates include: domain-expert walk-through
completed, 1+ external operator validation, API stability docs
published, deprecation calendar, OpenSSF Gold tier (if achievable),
cryptographic CIMD signatures, and pre-release-review PROCEED-CLEAN.
Commercial packages (evidentia-pro, evidentia-enterprise,
evidentia-federal) launch post-v1.0 as separate PyPI packages with
proprietary licenses.
Every AI-generated risk statement gets scored against NIST SP 800-30 / IR 8286 criteria. Statements that fail validation are automatically regenerated with corrective instructions. Produces audit-survivable output that no other open-source tool guarantees.
Same infrastructure as the shipped AWS / GitHub / Jira implementations, more sources:
- evidentia-collectors[aws] — IAM Access Analyzer (AC-3, AC-6, IA-2)
- evidentia-collectors[github] — Dependabot alerts (SI-2; requires security-events scope)
- evidentia-collectors[okta] — MFA enforcement, inactive users, privileged account counts (AC-2, IA-2, IA-5)
- evidentia-integrations[servicenow] — push to sn_compliance_task via REST with OAuth 2.0
- evidentia-integrations[vanta] and [drata] — custom test results push via their public APIs
Reframes the cross-framework efficiency feature as "close N gaps across M frameworks with one remediation." CFOs and CISOs respond to ROI framing in ways they don't respond to "coverage %".
- Auto-generated TypeScript types from FastAPI's OpenAPI schema (hand-authored in v0.4.0; auto-gen removes the drift class entirely)
- Tauri desktop packaging option for offline-first users who prefer an installable app over evidentia serve
- Optional multi-user auth / RBAC for network deployments (localhost-only in v0.4.0 — v0.7.0+ adds token auth)
- RSA Archer integration — deferred indefinitely. Enterprise-only, requires an Archer instance to develop against, and the market has been moving to REST-native alternatives for years.
- COSO framework content — legally non-starter (AICPA copyright, same basis as the SOC 2 Tier-C stub treatment).
- Per-framework crosswalk auto-generation via LLM — rejected on correctness grounds. Crosswalks are audit-critical and need human-in-the-loop review. An LLM-authored crosswalk should be reviewed and committed, not generated at runtime.
PyPI Trusted Publisher (OIDC) migration: DONE in v0.7.0 for the
6 published evidentia-* packages. The legacy PYPI_API_TOKEN was
deleted from the pypi GitHub environment during v0.7.0 ship-day
housekeeping (verified absent post-v0.7.1 via
gh secret list --env=pypi --repo polycentric-labs/evidentia — zero
secrets remain at the repo or env level). The originally-queued
v0.7.1 deletion-verification step is therefore a no-op carried into
v0.7.2 only as a bookkeeping line in docs/v0.7.2-plan.md.