Add A2ML Record Dialect specification + conformance vectors #456
Merged
Specifies the TOML-like [section]/key=value surface that ~435 estate descriptile files (STATE/META/ECOSYSTEM/AGENTIC/NEUROSYM/PLAYBOOK/ANCHOR) already use but which was never formally defined. Grounded in a measured survey of the deployed corpus (1,798 .a2ml files), not the abandoned .scm-era specs.

The record dialect is a second concrete surface over the SAME typed core as the existing markup dialect (SPEC.adoc section 3): it reuses the validation modes, base invariants, Base Record vocabulary, profile mechanism, and canonical hash form unchanged. Not a rival format -- a co-equal reader. This is a structural fix for the "three coexisting format generations, no deprecation story" finding.

Adds:
- a2ml/RECORD-DIALECT-SPEC.adoc -- normative spec (RFC 2119, EBNF, core mapping, R0/R1/R2 conformance classes, normative divergences from TOML, security, versioning). Status: Draft -- not yet ratified, wired into CI, or exercised by a reference reader (see the spec's Appendix D).
- a2ml/tests/record-dialect/ -- executable conformance-vector suite: 8 valid + 12 invalid vectors, each mapped to a spec clause and required class, plus a dogfooded record-dialect MANIFEST.a2ml and a README.

Regenerated .machine_readable/REGISTRY.a2ml with scripts/build-registry.sh (the sanctioned generator; the file is not hand-edited). This updates the a2ml/ home source_hash for the new files and incidentally re-pins the k9-svc/ home hash, which was already drifted on main since 6df21b1 (the registry was not regenerated after that commit changed k9-svc files). The drift gate (build-registry.sh --check) passes after regeneration.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0112RkdER2wtwHdNmbEhThUz
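To make the shape of the [section]/key=value surface concrete, here is a minimal Python sketch of a line-oriented reader. It is not the grammar from RECORD-DIALECT-SPEC.adoc: the regular expressions, the `#` comment rule, the rejection of duplicate sections and keys, the `[metadata]` sample, and the names `read_records` and `RecordSyntaxError` are all assumptions made for illustration. Values are kept as raw strings because the typed-core mapping is defined by the spec, not by this sketch.

```python
import re

# Illustrative productions only; the real EBNF lives in the spec.
SECTION_RE = re.compile(r"^\[([A-Za-z0-9_.-]+)\]$")
PAIR_RE = re.compile(r"^([A-Za-z0-9_-]+)\s*=\s*(.+)$")


class RecordSyntaxError(ValueError):
    """Raised when a line fits neither illustrative production."""


def read_records(text):
    """Parse text into {section: {key: raw_value}}; values stay unparsed strings."""
    tree = {}
    current = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skipping blank lines and '#' comments is an assumption
        m = SECTION_RE.match(line)
        if m:
            name = m.group(1)
            if name in tree:
                raise RecordSyntaxError(f"line {lineno}: duplicate section [{name}]")
            current = tree[name] = {}
            continue
        m = PAIR_RE.match(line)
        if m is None:
            raise RecordSyntaxError(f"line {lineno}: not a section or key = value")
        if current is None:
            raise RecordSyntaxError(f"line {lineno}: key outside any section")
        key, value = m.group(1), m.group(2).strip()
        if key in current:
            raise RecordSyntaxError(f"line {lineno}: duplicate key {key!r}")
        current[key] = value
    return tree


# Hypothetical input; not taken from a deployed STATE.a2ml.
SAMPLE = """\
[metadata]
name = "example"
version = "0.1.0"
"""

print(read_records(SAMPLE))
```

A reader of this kind is only a starting point. The commit message notes normative divergences from TOML, so neither a stock TOML parser nor a naive reader like this one should be assumed to conform.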
…-architecture-6j606d # Conflicts: # .machine_readable/REGISTRY.a2ml



What & why
The seven canonical descriptile files that every estate repo deploys (`STATE`, `META`, `ECOSYSTEM`, `AGENTIC`, `NEUROSYM`, `PLAYBOOK`, `ANCHOR`) are written in a TOML-like `[section]`/`key = value` surface. Until now that surface was used but never specified: ~435 files across the estate conformed to a grammar that existed only by imitation. The existing `a2ml/SPEC.adoc` v1.1.0 specifies the markup dialect (`@directive:` blocks) and an `@record`/`key: value` form, but never the `[section]`/`key = value` record surface those files actually use.

This PR fills that gap with a normative spec grounded in a measured survey of the deployed corpus (1,798 `.a2ml` files), not the abandoned `.scm`-era specs.

Key design decision
The Record Dialect is not a rival format: it is a second concrete surface over the same typed core as the markup dialect (`SPEC.adoc` §3). It reuses, unchanged:

- the validation modes
- the base invariants
- the Base Record vocabulary
- the profile mechanism (`a2ml/*` profiles already target this dialect)
- the canonical hash form

One abstract model, two surfaces, no third generation. This is a structural fix for the audit finding that three format generations (`.scm` → TOML-like → A2ML profiles) coexist with no deprecation story.

Contents
- `a2ml/RECORD-DIALECT-SPEC.adoc`: the normative spec. It contains RFC-2119 language, an EBNF surface grammar (every production grounded in the corpus), the record-tree data model, translation to the A2ML core, R0/R1/R2 conformance classes, a normative "Divergences from TOML" table (so an implementer can't conform by piping input through a stock TOML parser), security considerations, an IANA note, versioning, and a worked example annotating a real deployed `STATE.a2ml`.
- `a2ml/tests/record-dialect/`: an executable conformance-vector suite. It holds 8 valid + 12 invalid vectors, each minimal and mapped to one spec clause + the class at which rejection is required; a dogfooded record-dialect `MANIFEST.a2ml` indexing them; and a `README.adoc`.

Status: Draft (deliberately)
Per the no-overclaim doctrine, the spec is marked `Draft` and states it MUST NOT be cited as ratified until a reference reader greens the vectors and a corpus conformance run is published. The path to `Stable` is in the spec's Appendix D:

- … `|| echo SKIP`)."
- … `…ZT00:00:00Z"`) as findings.
- … `a2ml/` home.
- … `Status` to `Stable`.

Registry regeneration: note for the reviewer
`.machine_readable/REGISTRY.a2ml` was regenerated with `scripts/build-registry.sh` (the sanctioned generator, not hand-edited). This updates the `a2ml/` home `source_hash` for the new files and incidentally re-pins the `k9-svc/` home hash, which was already drifted on `main` since `6df21b1` (the registry wasn't regenerated after that commit changed `k9-svc` files). `build-registry.sh --check` passes after regeneration. Flagging the k9-svc rehash explicitly since it's unrelated to the spec itself.

Guardrails respected
No licence content touched; no SPDX sweeps; new files carry correct SPDX from birth per the repo's MPL-2.0 / CC-BY-SA-4.0 classification.
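
For reviewers unfamiliar with the drift gate mentioned in the registry note, the following Python sketch shows the general idea of a pinned per-home hash going stale and being re-pinned. It is not the algorithm used by `scripts/build-registry.sh`: the hashing scheme (SHA-256 over sorted relative paths and file bytes), the function names `home_hash` and `check_drift`, and the file `service.txt` are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def home_hash(home: Path) -> str:
    """Hash every file under `home` in sorted path order (hypothetical scheme)."""
    digest = hashlib.sha256()
    for path in sorted(p for p in home.rglob("*") if p.is_file()):
        rel = path.relative_to(home).as_posix()
        digest.update(rel.encode("utf-8") + b"\0")
        digest.update(path.read_bytes() + b"\0")
    return digest.hexdigest()


def check_drift(homes: dict, pinned: dict) -> list:
    """Return names of homes whose current hash differs from the pinned one."""
    return sorted(name for name, path in homes.items()
                  if home_hash(path) != pinned.get(name))


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "k9-svc"
    home.mkdir()
    (home / "service.txt").write_text("v1\n")
    pinned = {"k9-svc": home_hash(home)}        # registry generated at this point
    (home / "service.txt").write_text("v2\n")   # a later commit changes a file
    drifted_before = check_drift({"k9-svc": home}, pinned)
    pinned = {"k9-svc": home_hash(home)}        # registry regenerated
    drifted_after = check_drift({"k9-svc": home}, pinned)

print(drifted_before, drifted_after)
```

Under this scheme, regenerating the registry re-pins every home whose files changed since the last run, which is why an unrelated home such as `k9-svc/` can be rehashed alongside the `a2ml/` home.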
🤖 Generated with Claude Code
https://claude.ai/code/session_0112RkdER2wtwHdNmbEhThUz