Merged
17 changes: 17 additions & 0 deletions .well-known/README.adoc
@@ -0,0 +1,17 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
= .well-known/ — RFC 8615 web metadata
:status: populated web metadata (see docs/decisions/ADR-0002-data-subdir-population.adoc)

RFC 8615 well-known URIs, published at the site root when this repo is served
via GitHub Pages. This is web-standard metadata, *not* a pipeline data sink —
it is listed here only because issue #5 grouped it with the data dirs.

Files::
* `ai.txt` — AI interaction policy (training/summarisation/generation
permissions; PMPL Emotional Lineage requirement). Points agents at
`0-AI-MANIFEST.a2ml`.
* `humans.txt` — humanstxt.org credits/colophon.
* `security.txt` — RFC 9116 security disclosure contact.

Writer: maintainers (hand-edited). Reader: web clients, crawlers, AI
agents, security researchers. Edited in place; not deleted.
21 changes: 21 additions & 0 deletions dispatch/README.adoc
@@ -0,0 +1,21 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
= dispatch/ — scan→triage work queue
:status: populated data store (see docs/decisions/ADR-0002-data-subdir-population.adoc)

Append-only JSONL work queue emitted by the panic-attacker scan → triage
stage. One JSON object per line; one finding per object.

Files::
* `dispatch-YYYY-MM-DD.jsonl` — the day's dispatched findings (immutable once
written).
* `pending.jsonl` — findings accepted but not yet actioned by a bot.
* `held.jsonl` — findings deferred or quarantined (e.g. low confidence,
manual-review required).

Record shape::
`action`, `category`, `confidence`, `auto_fixable`, `description`,
`pattern_id`, `recipe_id`, `replacement`, `repo`, `program_path`, `severity`,
`strategy`, `tier`, `timestamp`.

Writer: the triage stage of the ingest pipeline. Reader: the autofix /
review bots. Append-only — never rewritten in place.
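A minimal sketch of how a reader might consume this queue, assuming each line is a standalone JSON object carrying the fields listed under "Record shape". The helper names `parse_dispatch_lines` and `append_finding` are hypothetical and are not part of the pipeline.

```python
import json

# Field names copied from the record shape listed above.
DISPATCH_FIELDS = frozenset({
    "action", "category", "confidence", "auto_fixable", "description",
    "pattern_id", "recipe_id", "replacement", "repo", "program_path",
    "severity", "strategy", "tier", "timestamp",
})


def parse_dispatch_lines(lines):
    """Yield (record, missing_fields) for every non-blank JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        missing = sorted(DISPATCH_FIELDS - set(record))
        yield record, missing


def append_finding(path, record):
    """Append one finding as a single JSON line; existing lines are untouched."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
```

Opening the file in `"a"` mode is one simple way to respect the append-only rule: earlier lines are never rewritten.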
89 changes: 89 additions & 0 deletions docs/decisions/ADR-0002-data-subdir-population.adoc
@@ -0,0 +1,89 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
// Copyright (c) 2026 Jonathan D.A. Jewell (hyperpolymath) <j.d.a.jewell@open.ac.uk>
= ADR-0002: data subdirectories are populated and self-documented, not removed
:revdate: 2026-05-17
:status: Accepted

== Status

Accepted — 2026-05-17.

Resolves: https://github.com/hyperpolymath/verisimdb-data/issues/5[V-L3-S1].

Builds on: xref:ADR-0001-repo-purpose.adoc[ADR-0001] (the two-purpose framing;
these directories are purpose 1, the flat-file data store).

== Context

Issue #5 observed that `dispatch/`, `patterns/`, `recipes/`, `outcomes/`,
`policy/`, `health/` and `.well-known/` looked like "mostly empty placeholder
directories" and asked for a binary decision: *populate* them (one issue per
subdir to make the contents concrete) or *delete* them and rebuild as content
lands.

That premise is now stale. ADR-0001 already declared all of these subtrees as
the repository's flat-file data store, and the panic-attacker → triage →
autofix pipeline has since filled every one of them with production data:

* `dispatch/` — dated work-queue JSONL plus `pending.jsonl` / `held.jsonl`
* `patterns/` — `registry.json` (the cross-repo pattern registry)
* `recipes/` — ~46 `recipe-*.json` remediation recipes
* `outcomes/` — monthly applied-fix ledgers plus a fleet import
* `policy/` — `policy.ncl` (the Nickel baseline policy contract)
* `health/` — `sitrep.txt` and `hypatia.json` operational telemetry
* `.well-known/` — RFC 8615 web metadata (`ai.txt`, `humans.txt`,
`security.txt`)

There are no empty placeholder directories left in the tree. The only real
gap against the issue's acceptance criteria was the absence of a per-directory
README explaining what each one holds.

== Decision

**Populate, not remove.** The data subdirectories are a permanent, declared
part of the repository (per ADR-0001) and are now backed by real, actively
written content. They are *not* placeholders and must not be deleted.

To make the contents concrete and self-documenting — the spirit of the issue's
"one issue per subdir" alternative, achieved in one PR — each data directory
carries a `README.adoc` that states:

. what artefact(s) it holds and their on-disk format,
. who writes it (which pipeline stage / bot) and who reads it,
. whether it is append-only or rewritten in place,
. its retention / `.gitkeep` behaviour.

Splitting this into seven tracking issues was rejected: the directories are
already populated, so seven issues would each open and immediately close as
"add a README" with no design content. One ADR plus seven READMEs records the
decision and discharges the acceptance criteria together.
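The README requirement could be checked mechanically. The following is a sketch only: the directory list is taken from this ADR, while the function name `missing_readmes` is hypothetical and no such check is claimed to exist in the repository.

```python
from pathlib import Path

# The seven data directories named in this ADR.
DATA_DIRS = (
    "dispatch", "patterns", "recipes", "outcomes",
    "policy", "health", ".well-known",
)


def missing_readmes(repo_root, data_dirs=DATA_DIRS):
    """Return the data directories that do not contain a README.adoc file."""
    root = Path(repo_root)
    return [
        name for name in data_dirs
        if not (root / name / "README.adoc").is_file()
    ]
```

An empty list means every data directory is self-documented; a CI job could fail on a non-empty result.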

== Consequences

. The `dispatch/`, `patterns/`, `recipes/`, `outcomes/`, `policy/`, `health/`
and `.well-known/` directories each gain a `README.adoc`; no data files are
moved or removed.
. `outcomes/.gitkeep` is retained: the ledger is monthly, so the directory
can legitimately contain only `.gitkeep` at the start of a month, and that
is now documented rather than mistaken for an empty placeholder.
. Future data directories under purpose 1 must ship with a `README.adoc`
from the first commit, so that the "is this a placeholder?" question is
closed at creation time.
. ADR-0001's directory-layout list remains authoritative for *which* subtree
serves *which* purpose; this ADR documents the *contents* of the data
subtree.

== Alternatives considered

Delete the directories and rebuild as content lands::
Rejected. The content has already landed and is being written continuously by
the ingest/triage/autofix pipeline; deleting live data sinks would break that
pipeline and lose history. This alternative only made sense under the (now
false) assumption that the directories were empty.

One tracking issue per subdirectory::
Rejected as busywork. The directories are populated; each issue would reduce
to "add a README." The work is folded into this single ADR plus the seven
READMEs instead.

Move telemetry / web metadata out of the data store::
Out of scope. `health/` and `.well-known/` are small and co-located by
design; relocating them is a separate decision if either grows.
17 changes: 17 additions & 0 deletions health/README.adoc
@@ -0,0 +1,17 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
= health/ — operational telemetry
:status: populated data store (see docs/decisions/ADR-0002-data-subdir-population.adoc)

Latest-state operational telemetry for the data store and its pipeline.
These are *snapshots* (rewritten in place each run), not append-only logs —
history lives in `outcomes/` and `dispatch/`.

Files::
* `sitrep.txt` — human-readable situation report: last-run timestamp,
counts (scans / patterns / actions / outcomes / recipes), the
auto/review/report split, process up/down, contingency status, run time.
* `hypatia.json` — machine-readable health snapshot consumed by the
Hypatia scanner / Scorecard surface.

Writer: the pipeline's reporting stage. Reader: humans (sitrep) and the
Hypatia / health-monitoring tooling (json).
19 changes: 19 additions & 0 deletions outcomes/README.adoc
@@ -0,0 +1,19 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
= outcomes/ — applied-fix outcome ledger
:status: populated data store (see docs/decisions/ADR-0002-data-subdir-population.adoc)

Append-only JSONL ledger of what happened when a recipe was applied to a
repo. Feeds the recipe success/fail counters and health reporting.

Files::
* `YYYY-MM.jsonl` — one month of outcomes (immutable once the month closes).
* `*-fleet-import.jsonl` — bulk historical imports.
* `.gitkeep` — retained on purpose: the ledger is monthly, so at the start
of a month this directory can legitimately hold only `.gitkeep`. That is
the expected steady state, *not* an empty placeholder (see ADR-0002).

Record shape::
`pattern`, `repo`, `bot`, `outcome` (`success` / failure), `fixed_at`.

Writer: the autofix/review bots after each attempt. Reader: recipe
counter rollup and `health/`. Append-only.
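As an illustration of the counter rollup, the sketch below tallies ledger lines per pattern. It assumes that any `outcome` value other than `success` counts as a failure (the failure literal is not specified above), and `tally_outcomes` is a hypothetical name.

```python
import json
from collections import Counter, defaultdict


def tally_outcomes(lines):
    """Count attempts per pattern, split into successful and failed."""
    totals = defaultdict(Counter)
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumption: anything that is not "success" is treated as a failure.
        bucket = "successful" if record["outcome"] == "success" else "failed"
        totals[record["pattern"]]["attempts"] += 1
        totals[record["pattern"]][bucket] += 1
    return totals
```

Because the ledger is append-only, the tally can be recomputed from scratch at any time by replaying every monthly file.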
16 changes: 16 additions & 0 deletions patterns/README.adoc
@@ -0,0 +1,16 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
= patterns/ — canonical cross-repo pattern registry
:status: populated data store (see docs/decisions/ADR-0002-data-subdir-population.adoc)

`registry.json` — the single canonical pattern registry. It deduplicates
raw findings across every scanned repo into trackable, long-lived patterns.

Shape::
Top-level `description`, `last_updated`, and a `patterns` map keyed by
pattern id. Each entry: `id`, `category`, `description`, `pa_rule`,
`occurrences`, `first_seen`, `last_seen`, `recipe_id` (the remediation
recipe, or `null` if none yet), `repo_paths`.

Writer: the pattern-aggregation stage of the ingest pipeline (rewritten in
place on each run — it is a registry, not a log). Reader: triage (to attach
`recipe_id`) and reporting/health.
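A small sketch of reading the registry shape described above: it lists patterns whose `recipe_id` is still `null`. The function name is hypothetical, and the input is assumed to be the already-parsed contents of `registry.json`.

```python
def patterns_without_recipe(registry):
    """Return the ids of patterns that have no remediation recipe yet."""
    return sorted(
        pattern_id
        for pattern_id, entry in registry["patterns"].items()
        if entry.get("recipe_id") is None
    )
```

With the file on disk, the registry would typically be loaded with `json.load` before being passed in.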
18 changes: 18 additions & 0 deletions policy/README.adoc
@@ -0,0 +1,18 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
= policy/ — baseline policy contract
:status: populated data store (see docs/decisions/ADR-0002-data-subdir-population.adoc)

`policy.ncl` — the Nickel baseline policy contract for this data store.

Contents::
* `enforcement` — repo hygiene flags (`require_spdx_headers`,
`require_ci_security_checks`, `block_committed_secrets`,
`require_pinned_actions`).
* `triage` — confidence thresholds: `auto_fix_min_confidence`,
`review_min_confidence`.
* `version` — policy schema version.

Writer: maintainers (hand-edited; version-bumped on change). Reader:
`contractiles/must/Mustfile` consumes it to gate enforcement, and triage
reads the confidence thresholds to choose auto-fix vs. review vs. report.
Single file, edited in place.
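The three-way triage choice can be sketched as a threshold comparison. This is an assumption about how the thresholds are applied, not the pipeline's actual logic: the inclusive comparisons, the use of `auto_fixable` as a gate, and the function name `route_finding` are all illustrative.

```python
def route_finding(confidence, auto_fixable,
                  auto_fix_min_confidence, review_min_confidence):
    """Pick 'auto-fix', 'review' or 'report' for a single finding."""
    # Assumption: a finding must be both fixable and confident enough.
    if auto_fixable and confidence >= auto_fix_min_confidence:
        return "auto-fix"
    # Assumption: anything above the review threshold goes to a reviewer.
    if confidence >= review_min_confidence:
        return "review"
    return "report"
```

For example, with hypothetical thresholds of 0.9 and 0.6, a fixable finding at confidence 0.95 routes to `auto-fix`, one at 0.7 routes to `review`, and one at 0.3 is only reported.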
20 changes: 20 additions & 0 deletions recipes/README.adoc
@@ -0,0 +1,20 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
= recipes/ — remediation recipe knowledge base
:status: populated data store (see docs/decisions/ADR-0002-data-subdir-population.adoc)

One `recipe-*.json` file per remediation recipe — the knowledge base the
autofix bot consults to turn a matched pattern into a fix.

Shape (per file)::
`id`, `action`, `description`, `confidence`, `auto_fixable`, `fix_script`
(or `null`), `languages`, `match`, `replacement`, `pattern_ids` (the
patterns this recipe remediates), `proven_module`, `triangle_tier`
(eliminate / substitute / …), and running counters
`total_attempts` / `successful_fixes` / `failed_fixes`.

Naming::
`recipe-<slug>.json`; the `recipe-scorecard-*` family maps to OpenSSF
Scorecard checks.

Writer: curated, plus counter updates from outcome ingest. Reader: the
autofix bot and triage (recipe lookup by `pattern_id`).
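Recipe lookup by pattern id can be sketched by inverting each recipe's `pattern_ids` list into an index. Both helper names are hypothetical, and the inputs are assumed to be parsed `recipe-*.json` objects with the fields listed above.

```python
def index_by_pattern(recipes):
    """Map each pattern id to the ids of the recipes that remediate it."""
    index = {}
    for recipe in recipes:
        for pattern_id in recipe.get("pattern_ids", []):
            index.setdefault(pattern_id, []).append(recipe["id"])
    return index


def success_rate(recipe):
    """Return successful_fixes / total_attempts, or None with no attempts."""
    attempts = recipe["total_attempts"]
    if not attempts:
        return None
    return recipe["successful_fixes"] / attempts
```

Returning `None` for an untried recipe keeps "no data yet" distinct from a measured rate of zero.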