Skip to content

Contributing A Catalog

evidentia-wiki-sync[bot] edited this page May 29, 2026 · 1 revision

Auto-generated mirror. This page mirrors docs/contributing-a-catalog.md, the canonical source of truth. Do not edit this file directly; edit docs/contributing-a-catalog.md and re-run scripts/wiki/sync_mirrors.py.

Contributing a new framework catalog

v0.10.3+. Lowers the barrier to adding a new compliance framework to Evidentia. Catalogs can now be YAML or JSON — pick whichever is friendlier; the loader auto-detects by file extension.

TL;DR — the 3-file PR

  1. Add a catalog file at one of:
    • packages/evidentia-core/src/evidentia_core/catalogs/data/us-federal/<framework-id>.yaml
    • …/data/international/<framework-id>.yaml
    • …/data/stubs/<framework-id>.yaml (Tier-C, copyrighted text not bundled)
    • …/data/state-privacy/<framework-id>.yaml
    • …/data/threats/<framework-id>.yaml
    • …/data/mappings/<framework-id>.yaml
  2. Run python scripts/catalogs/regenerate_manifest.py to update data/frameworks.yaml (the manifest is auto-generated from the files on disk — never hand-edited).
  3. Open a PR with both files staged.

Why YAML (v0.10.3+) vs. JSON

JSON YAML
Comments ❌ Not supported # inline + block
Multi-line strings Awkward (\n escapes) ✅ `
Trailing commas Forbidden Allowed
Hand-edit friendliness Brittle Significantly better
Tooling Universal Universal (PyYAML, yamllint, etc.)

Both formats produce the same ControlCatalog; YAML is just the nicer surface for hand-authoring. Existing JSON catalogs stay JSON — there's no auto-migration. Convert one when you next edit it if you like.

Required schema (Evidentia-format)

framework_id: my-framework-id     # kebab-case; stable across versions
framework_name: "Display name"
version: "1.0"                    # framework version, not catalog version
source: "Upstream URL or citation"
tier: A | B | C | D                # see ATTRIBUTION.md
category: control                  # control | technique | vulnerability | obligation

# Tier-C only (control text is copyrighted):
placeholder: true
license_required: true
license_terms: "© <Holder>. Brief redistribution rationale."
license_url: "URL where users can license/download the authoritative text"

families:
  - "Family 1"
  - "Family 2"
controls:
  - id: AC-1
    title: "Control title"
    description: "Control text — or '[Licensed content — see license_url]' for Tier-C stubs"
    family: "Family 1"
    # Tier-C stub controls also carry:
    placeholder: true
    tier: C
    license_required: true
    license_url: "URL where users can license/download the authoritative text"

The manifest.py model validates every field; malformed catalogs get caught by the existing tests/unit/test_catalogs/test_all_bundled.py suite (which parametrizes a smoke test per bundled framework).

Worked examples

7-control stub — iso-27017-2015.yaml (v0.10.3 proof)

See packages/evidentia-core/src/evidentia_core/catalogs/data/stubs/iso-27017-2015.yaml for the first YAML catalog in the bundled set — a 7-control Tier-C cloud-services stub. The corresponding JSON equivalent was removed in v0.10.3; both formats produce identical ControlCatalog objects when loaded.

18-control stub — cis-controls-v8.1.yaml (v0.10.4 C3 richer proof)

See packages/evidentia-core/src/evidentia_core/catalogs/data/stubs/cis-controls-v8.1.yaml for a richer hand-authored example. The 18 CIS Critical Security Controls v8.1 are grouped by Implementation Group (IG1 / IG2 / IG3) using # ── ... ── comment headers — a pattern that's impossible in JSON. This demonstrates the actual contributor benefit of YAML for catalogs that need browsability beyond a flat control list.

The 7-control vs 18-control pair lets contributors see both ends of the size spectrum the YAML format is intended to serve.

When NOT to use YAML

  • OSCAL catalogs (e.g., NIST 800-53 Rev 5): stick with JSON. Upstream NIST publishes OSCAL JSON; the existing JSON files are byte-identical to NIST's release. Keep the chain of custody.
  • Machine-generated catalogs: JSON is easier for tooling to emit + diff against the upstream.

Other tier conventions

  • tier: A — verbatim-redistributable control text. Two sub-categories:
    • Public-domain works (US federal, NIST, CISA, FedRAMP).
    • Permissive-licensed works (Apache-2.0, MIT, CC-BY) where the upstream license explicitly permits verbatim redistribution. The catalog's license_terms field MUST cite the upstream license; the source URL pins the upstream commit/version so the chain-of-custody is auditable. Catalog bundles full control bodies in both cases.
  • tier: B — copyrighted but licensed for embedded distribution (rare).
  • tier: C — copyrighted; only IDs + titles bundled. Control body is a placeholder; operators run evidentia catalog import with their licensed copy to override.
  • tier: D — government edicts (statutes, regulations) — uncopyrightable; full text bundled.

See ATTRIBUTION.md for the redistribution rationale per tier.

OSCAL sidecar artifacts (v0.10.6+)

A bundled catalog MAY ship a companion OSCAL Catalog 1.2.1 serialization alongside the Evidentia-format YAML, named <framework-id>.oscal.json (or .oscal.yaml). Worked example: the OpenSSF OSPS Baseline ships three Evidentia per-maturity YAMLs (osps-baseline-m{1,2,3}.yaml) plus one OSCAL serialization (osps-baseline.oscal.json) covering the full 41-control surface.

OSCAL sidecars are downstream-consumption artifacts, not framework-manifest entries. The scripts/catalogs/regenerate_manifest.py scanner skips any file matching *.oscal.json / *.oscal.yaml so they don't register as separate frameworks. Validation: the OSCAL JSON MUST validate against trestle.oscal.catalog.Catalog (Trestle 4.0.3+ pydantic models) or oscal-cli validate. Authoring guidance:

  • Top-level: {"catalog": {"uuid": "<stable UUID v5>", "metadata": {...}, "groups": [...]}}
  • Each upstream family → one OSCAL group with class: "family".
  • Each control → one OSCAL control with lowercase hyphenated id (e.g. osps-ac-01) and uppercase class preserving the canonical upstream ID (e.g. OSPS-AC-01).
  • Per-control props: include a maturity-level prop where the upstream framework declares one (OSPS Baseline does).
  • parts: a statement part for the control objective + per- assessment-requirement objective parts (with applicability prop) + an optional guidance part holding upstream crosswalk references (informative; treat as self-attested by upstream).

Tests + CI

tests/unit/test_catalogs/test_all_bundled.py parametrizes a catalog-load smoke test per bundled framework. Adding a new catalog auto-extends the test suite — no new test file required for the catalog itself.

tests/unit/test_catalogs/test_yaml_loader.py (v0.10.3+) covers the YAML loader paths directly.

tests/unit/test_catalogs/test_osps_baseline.py (v0.10.6+) is a worked-example test suite for a multi-maturity Tier-A catalog with companion OSCAL sidecar. Mirror this pattern when contributing a similarly-structured catalog (per-maturity / per-baseline / per- profile slices of a single upstream source).

Clone this wiki locally