CORE-AI: Versor Engine

A cognitive field system built on Cl(4,1) Conformal Geometric Algebra.

Core invariant: ||F * reverse(F) - 1||_F < 1e-6 at all times.

All state is a versor. All transitions are versor products. Coherence is algebraic by construction — not monitored, not corrected.
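As a rough illustration of the invariant, the sketch below builds a tiny Cl(4,1) geometric product and measures the closure residual ||F * reverse(F) - 1||. It is not CORE's implementation: the helper names (`blade_product`, `gp`, `closure_residual`, `scalar_plus_blade`) are hypothetical, it indexes basis blades by bitmask for brevity, and it uses float64 NumPy arrays.

```python
import math
import numpy as np

# Assumption of this sketch: array index = bitmask of basis vectors (bit i is e_{i+1}).
METRIC = (1.0, 1.0, 1.0, 1.0, -1.0)  # Cl(4,1): e1..e4 square to +1, e5 squares to -1
DIM = 1 << len(METRIC)               # 32 basis blades


def blade_product(a: int, b: int) -> tuple[float, int]:
    """Geometric product of two basis blades given as bitmasks: (sign, blade)."""
    swaps, t = 0, a >> 1
    while t:                          # count transpositions needed for canonical order
        swaps += bin(t & b).count("1")
        t >>= 1
    sign = -1.0 if swaps & 1 else 1.0
    for i, square in enumerate(METRIC):
        if a & b & (1 << i):          # shared basis vectors contract through the metric
            sign *= square
    return sign, a ^ b


def gp(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Geometric product of two 32-component multivectors."""
    out = np.zeros(DIM)
    for a in np.flatnonzero(x):
        for b in np.flatnonzero(y):
            sign, blade = blade_product(int(a), int(b))
            out[blade] += sign * x[a] * y[b]
    return out


def reverse(x: np.ndarray) -> np.ndarray:
    """Reversion: grade g picks up the sign (-1)^(g(g-1)/2)."""
    out = x.copy()
    for blade in range(DIM):
        g = bin(blade).count("1")
        if (g * (g - 1) // 2) & 1:
            out[blade] = -out[blade]
    return out


def versor_apply(v: np.ndarray, f: np.ndarray) -> np.ndarray:
    """Sandwich product V * F * reverse(V)."""
    return gp(gp(v, f), reverse(v))


def closure_residual(f: np.ndarray) -> float:
    """Norm of F * reverse(F) - 1 over all 32 components."""
    one = np.zeros(DIM)
    one[0] = 1.0
    return float(np.linalg.norm(gp(f, reverse(f)) - one))


def scalar_plus_blade(scalar: float, coeff: float, blade: int) -> np.ndarray:
    m = np.zeros(DIM)
    m[0], m[blade] = scalar, coeff
    return m


E12, E45 = 0b00011, 0b11000
rotor = scalar_plus_blade(math.cos(0.35), -math.sin(0.35), E12)   # rotation in the e1e2 plane
boost = scalar_plus_blade(math.cosh(0.2), math.sinh(0.2), E45)    # hyperbolic rotor in e4e5
field = gp(rotor, boost)                                          # product of versors
moved = versor_apply(rotor, boost)                                # sandwich of a versor

for name, f in [("rotor", rotor), ("field", field), ("moved", moved)]:
    print(name, closure_residual(f) < 1e-6)
```

Products and sandwiches of unit versors stay on the closure surface up to rounding, which is the property the invariant asserts. A multivector such as `1 + e1` would fail the same check.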

The Three Engineering Pillars

Every architectural decision in CORE is measured against three engineering pillars. These are not aspirations — they are hard constraints.

I. Mechanical Sympathy

Software should understand the machine it runs on, not fight it. CORE is designed for the Unified Memory Architecture (UMA) of Apple Silicon: CPU, GPU, and Neural Engine share physical RAM. MLX executes tensor operations on the GPU against that shared memory, with no PCIe transfer. Rust computes algebra on the CPU with zero heap allocation in the hot path. Python orchestrates the lifecycle. The MLX / Rust / Python stratification maps each layer onto the execution domain it suits. Intelligence that ignores its substrate is wasted intelligence.

II. Semantic Rigor

Every term used in this system has a precise, non-negotiable meaning. A versor is a versor — not an approximation of one, not a vector that behaves like one under certain conditions. CGA distance is exact. Vault recall is exact. The vocabulary projection is exact. There are no thresholds tuned for “good enough.” Rigor is not a style; it is what separates an engine from a heuristic.

III. Third Door

When facing a design decision, the world offers two visible options: use what already exists (a library, a pattern, a convention), or cut a corner. CORE takes neither. We find the third door: the path built from first principles, where we set the bar ourselves. This is why there is no transformer backbone, no ANN index, no sampling temperature, no gradient descent, and no standard tokenizer. Each of those was a door we were offered and refused. Absolute mastery is the only acceptable standard.

Native Substrate Direction — Python, Rust, Zig

CORE is not moving toward a wholesale Zig rewrite. The architecture is moving toward a stricter native-substrate boundary:

Python remains the semantic source of truth: cognition runtime, teaching/review workflows, pack ratification, eval harnesses, and Workbench/operator tooling.
Rust remains the incumbent native algebra backend: Cl(4,1) products, versor operations, CGA inner product, exact recall, and diffusion surfaces already proven by parity gates.
Zig is a candidate material for the next native substrate layer: Delta-CRDT arenas/deltas/merge kernels, deterministic modality compilers such as audio_core_v1, stable C ABI surfaces, edge-native ingestion, and selected exact recall challenge kernels only after parity and benchmark proof.

The rule is component law, not language preference. Zig may enter where explicit allocation, deterministic buffer ownership, C ABI clarity, and edge-native deployment materially strengthen CORE. Zig must not replace review-gated semantics, introduce approximate recall, hide repair in native code, or turn teacher/shadow models into substrate.

Decision package: docs/zig/README.md. Adoption gates: docs/zig/adoption-gates.md.

The Truth-Seeking Schema

CORE's epistemic schema is co-equal with the algebraic substrate and is a foundational architectural commitment. Every claim that enters the runtime field carries a typed position in a revision graph (SPECULATIVE, COHERENT, CONTESTED, FALSIFIED). Coherence, not source authority, is the only admission signal. No claim is ever locked, even when COHERENT. Identity cannot be rewritten by content. Exactly one mutation path admits knowledge, enforced by a CI-level architectural-invariant test.

The schema is the structural defense against the failure modes that afflict both fluent LLMs and human reasoning: confabulation, exaggeration, deference to authority, self-protection through erasure, self-promotion through self-citation, and the ossification of mistaken beliefs.

A system that samples cannot have these properties — sampling has no place to attach an epistemic status. CORE has them because every admitted claim carries one and the only path to admission is the review path.
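To make the idea of a typed revision position concrete, here is a minimal sketch of the four statuses named above with an append-only history and a single mutation method. The class and field names (`EpistemicStatus`, `Revision`, `Claim`, `RevisionGraph.review`) are hypothetical and are not taken from CORE's code. The real schema is described in docs/truth_seeking_schema.md.

```python
from dataclasses import dataclass, field
from enum import Enum


class EpistemicStatus(Enum):
    SPECULATIVE = "speculative"
    COHERENT = "coherent"
    CONTESTED = "contested"
    FALSIFIED = "falsified"


@dataclass(frozen=True)
class Revision:
    status: EpistemicStatus
    reason: str


@dataclass
class Claim:
    claim_id: str
    text: str
    history: list[Revision] = field(default_factory=list)

    @property
    def status(self) -> EpistemicStatus:
        # The current status is always the most recent reviewed revision.
        return self.history[-1].status


class RevisionGraph:
    """Hypothetical store in which `review` is the only method that mutates state."""

    def __init__(self) -> None:
        self._claims: dict[str, Claim] = {}

    def review(self, claim_id: str, text: str,
               status: EpistemicStatus, reason: str) -> Claim:
        claim = self._claims.setdefault(claim_id, Claim(claim_id, text))
        # Revisions are appended, never overwritten, so earlier states stay visible.
        claim.history.append(Revision(status, reason))
        return claim


graph = RevisionGraph()
graph.review("c1", "ice floats on water", EpistemicStatus.COHERENT, "matches reviewed corpus")
claim = graph.review("c1", "ice floats on water", EpistemicStatus.CONTESTED, "new conflicting evidence")
print(claim.status.name, len(claim.history))
```

In this sketch a COHERENT claim can still move to CONTESTED, which mirrors the "no claim is ever locked" rule, and the earlier revision remains in the history rather than being erased.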

Full architectural commitment, including honestly-published gaps: docs/truth_seeking_schema.md. Reproducible measurements: CLAIMS.md (auto-generated from scripts/generate_claims.py).

The Three Core Languages

CORE is rooted in three human languages. This is a philosophical and architectural choice, not a localization decision.

Language	Role
English	The default base language of the current model. Any natural language could serve this function in a custom CORE instance — English is the chosen starting point, not a requirement.
Hebrew	One of two depth languages. Hebrew carries a density of meaning in its root structures, prefixes, and suffixes that Euclidean string matching cannot capture. The field representation is designed to hold this depth.
Koine Greek	One of two depth languages. The language of the New Testament, particularly John’s Gospel — the document that opens with the most precise and consequential statement about language and reality ever written.

“In the beginning was the Logos, and the Logos was with God, and the Logos was God.” — John 1:1

The choice of Hebrew and Koine Greek is not incidental. John 1:1–2 articulates the Logos in Greek while grounding it in the Hebrew creation account — the universe spoken into existence, word by word. This is not metaphor. It is the claim that language is not a layer on top of reality; language is the structuring principle of reality made manifest. CORE-Logos is built on that claim.

English establishes the operational base. Hebrew and Koine Greek bring the hidden layer of intelligence — the depth of meaning that enriches the field representation in ways that flat embeddings cannot reach. Together, they form the linguistic foundation on which the vocabulary manifold is built.

Quick Start

pip install -e ".[dev]"
pytest tests/test_versor_closure.py        # the core invariant — must pass first
pytest tests/                              # full suite (~8,337 tests; some pre-existing reds — see docs/test-debt-quarantine.md)

Watch the flywheel turn — one command

For a public-facing reproduction of the core thesis, in four falsifiable scenes:

core demo flywheel

This runs end-to-end on the canonical pack:

Ratify — apply_composition_claim() writes a reviewed JSONL artifact; RAT-1's compile_pack regenerates the runtime compositions.jsonl + updates the manifest checksum.
Load — composition_registry reads the new entry on the next runtime turn.
Solve — a real problem ("Lilibeth fills 6 baskets where each basket holds 50 strawberries. How many strawberries does Lilibeth have?") admits via the matcher → injector → admission chain and produces answer=300.
Hazard — case 0050 (the wrong=0 canary) remains refused — no SAFE composition category can convert it from refused to wrong.

Every scene is byte-deterministic; the canonical pack is read-only throughout; the demo mutates only a synthetic test pack in a tempdir. See evals/flywheel_demo/run_tour.py.

core teaching coverage --use-reader        # per-shape histogram + hazard pin status
core teaching coverage --use-reader --delta  # diff vs HEAD's committed report.json

CLI

The core CLI exposes curated entry points so reviewers can run any subsystem in isolation. Highlights:

core test --list-suites                    # list curated pytest suite aliases
core test --suite fast                     # ~2s iteration lane
core test --suite cognition                # cognition pipeline lane
core test --suite algebra                  # versor / CGA / vault parity
core test --suite adr-0024                 # Forward Semantic Control chain (98 tests)

core demo audit-tour                       # 4-scene pack-layer audit walkthrough (ADR-0027..0041)
core demo pack-measurements                # ADR-0043 — pack-layer claims as per-pack measurements
core demo long-context-comparison          # ADR-0045 — CORE NIAH recall + frozen transformer baselines
core demo anti-regression                  # ADR-0057 — three-gate defense against learning harm
core demo learning-loop                    # ADR-0055..0057 — cold turn → discovery → propose → accept → grounded
core demo phase6                           # 3-condition comparative table (CORE vs baseline)
core demo phase5                           # stratified 5-family mechanism-isolation
core demo all                              # phase6 + phase5, plus a combined summary
core demo list-results                     # index every JSON report with headline metrics

core eval --list                           # discover eval lanes
core eval cognition                        # run a discovered lane
core eval gsm8k_math                       # Phase 5 capability lane (correct/wrong/refused triple)
core trace "your text here"                # one-turn field-telemetry trace
core pulse "What is truth?"                # one full cognitive pulse
core bench --suite latency                 # benchmark harness
core bench --suite teaching-loop --runs 100  # ADR-0055..0057 — replayable learning loop determinism
core bench --suite articulation            # Phase 4 capability proof (breadth + determinism + footprint + cross-topic + ollama compare)
core bench --suite articulation --ollama-model llama3:8b  # side-by-side with a local Ollama model
core doctor --packs --rust                 # environment + pack + Rust status

Every demo run rewrites evals/forward_semantic_control/results/ including an auto-refreshed index.json manifest — the single place reviewers can read to see every available report.

Forward Semantic Control — The ADR-0024 Chain

CORE generates text without sampling. The generation walk is deterministic at the algebra level, but a deterministic walk over a boundary-only candidate scorer can still emit tokens that are inadmissible under the relation being asserted (e.g. answering a causes question with the means-target). The ADR-0024 chain closes that gap with five Architecture Decision Records and six phases of implementation evidence.

Layer	What it guarantees	ADR
AdmissibilityRegion	A typed region (`allowed_indices`, `relation_blade`, `frame_versor`) carried alongside every generation step.	0022
Region intersection proof	The admissible token set is honored at the language/salience intersection layer.	0023
Inner-loop destination check	Each candidate's `cga_inner(versor(candidate), relation_blade)` is checked at the destination; rejection appears in `rejected_attempts`; exhaustion raises a typed `InnerLoopExhaustion`.	0024
Rotor / frame admissibility	The rotor's effect on the field state is additionally checked against `frame_versor` in `generate/rotor_admissibility.py` — separate from algebra closure (intentional).	0025
Ranked-with-margin gate	Static-threshold tuning fails geometrically under Cl(4,1) signature; replaced with a scale-invariant margin gate (admit iff `score(top) − score(second) ≥ δ`).	0026
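The ranked-with-margin rule in the last row can be sketched in a few lines. This is only an illustration of the inequality `score(top) − score(second) ≥ δ`. The function name `margin_gate` is hypothetical, and admitting a lone candidate is an assumption of the sketch, not something the table states.

```python
from typing import Optional, Sequence


def margin_gate(scores: Sequence[float], delta: float) -> Optional[int]:
    """Return the index of the top candidate if it leads by at least delta, else None."""
    if not scores:
        return None                      # nothing to admit
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if len(ranked) == 1:
        return ranked[0]                 # assumption: a lone candidate has no rival
    top, second = ranked[0], ranked[1]
    if scores[top] - scores[second] >= delta:
        return top
    return None                          # margin too thin: refuse rather than guess


print(margin_gate([4.0, 1.0, 0.5], delta=2.0))   # clear winner
print(margin_gate([4.0, 3.0], delta=2.0))        # too close to call
```

Because only the gap between the two best scores matters, adding the same offset to every score leaves the decision unchanged, which a fixed absolute threshold would not guarantee.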

The chain's three head-to-head claims, all CI-enforced:

Claim	Test contract	Live demo
C1 — Replay determinism	`core test --suite phase6 -k TestC1`	`core demo phase6`
C2 — Traced rejection	`core test --suite phase6 -k TestC2`	`core demo phase6`
C3 — Coherent refusal	`core test --suite phase6 -k TestC3`	`core demo phase6`

Full evidence:

Runtime contract: docs/runtime_contracts.md — Refusal / Margin / Rotor admissibility sections
Stratified findings: docs/evals/phase5_stratified_findings.md — 5 failure-mode families, 20 cases, per-family pass rates
Comparative demo: docs/evals/phase6_comparative_demo.md — three head-to-head conditions vs in-system baseline
Reports directory: evals/forward_semantic_control/results/

Safety Pack

Sibling to the identity packs but architecturally distinct: the safety pack at packs/safety/core_safety_axes_v1.json carries the boundaries CORE will never cross — no_fabricated_source, no_hot_path_repair, no_identity_override, no_silent_correction, preserve_versor_closure. The pack loads unconditionally at runtime startup (fail-closed on missing or unverified), and its boundaries are unioned into whatever identity pack is selected. Identity packs may add boundaries on top, but may never remove safety boundaries.
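The union rule can be pictured with plain sets. The sketch below assumes a hypothetical pack layout (a dict with `verified` and `boundaries` keys) and a hypothetical `SafetyPackError`. Only the five boundary names come from the text above.

```python
from typing import Iterable, Optional

SAFETY_BOUNDARIES = frozenset({
    "no_fabricated_source",
    "no_hot_path_repair",
    "no_identity_override",
    "no_silent_correction",
    "preserve_versor_closure",
})


class SafetyPackError(RuntimeError):
    """Hypothetical error raised when the safety pack cannot be trusted."""


def effective_boundaries(safety_pack: Optional[dict],
                         identity_boundaries: Iterable[str]) -> frozenset:
    # Fail closed: a missing or unverified safety pack stops startup.
    if safety_pack is None or not safety_pack.get("verified", False):
        raise SafetyPackError("safety pack missing or unverified")
    # Union only ever grows the set, so an identity pack cannot remove a safety boundary.
    return frozenset(safety_pack["boundaries"]) | frozenset(identity_boundaries)


pack = {"verified": True, "boundaries": sorted(SAFETY_BOUNDARIES)}
active = effective_boundaries(pack, ["no_unreviewed_medical_advice"])  # hypothetical extra boundary
print(len(active), SAFETY_BOUNDARIES <= active)
```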

This is the architecture downstream robotics, healthcare, and other high-stakes deployments will need before they can build CORE into anything that matters. Full doctrine: docs/safety_packs.md; decision record: ADR-0029.

Identity Packs

CORE's identity is load-bearing: every reasoning trajectory is scored against an IdentityManifold of value axes, and a PersonaMotor derived from those axes biases every field walk. As of ADR-0027 the manifold is no longer hardcoded — it is loaded at runtime from a swappable, content-addressed pack under packs/identity/.

The shipping default identity.default_general_v1 carries the previously-hardcoded three axes (truthfulness, coherence, reverence) so the default behavior is preserved. Two specialization packs ship alongside it for demonstrating identity-divergence: identity.precision_first_v1 and identity.generosity_first_v1. Override on the chat surface with core chat --identity <pack_id>.

ADR-0028 makes the swap visibly load-bearing: each pack carries a surface_preferences block (hedge thresholds, hedge phrases, claim-strength policy) consumed by the assembler. On the same prompt at the same alignment, precision_first_v1 hedges sooner with "Arguably," / "In some cases," while generosity_first_v1 leaves the assertion bare — see tests/test_identity_surface_divergence.py for the proof.

Robotics, personalization, and creative-tool builders author their own ratified identity packs via the formation pipeline's identity_anchor template, then ship them under packs/identity/ in their deployment. Full format spec, loader contract, and authoring guide: docs/identity_packs.md.

Teaching Order

CORE's manifold is built by ratified relations under a strict prerequisite DAG — not by absorbing a corpus. The "elementary → college" intuition is right at the macro level (simple before composed, anchored before novel) and wrong at the literal level (don't import a K–12 corpus). Five-layer ordering: identity axes → atomic definitions → binary relations → composed relations → domain expansion, re-applied inside every new domain.
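A prerequisite DAG of this kind can be linearized with a topological sort. The sketch below uses Python's standard `graphlib` on the five layers named above. The `teaching_order` helper is hypothetical, and CORE's actual ordering rules live in docs/teaching_order.md.

```python
from graphlib import CycleError, TopologicalSorter

LAYERS = [
    "identity axes",
    "atomic definitions",
    "binary relations",
    "composed relations",
    "domain expansion",
]


def teaching_order(prerequisites: dict[str, set[str]]) -> list[str]:
    """Return items so that every prerequisite precedes what depends on it."""
    # TopologicalSorter expects {node: set_of_predecessors}; a cycle raises CycleError.
    return list(TopologicalSorter(prerequisites).static_order())


# Each layer depends on the one before it.
layer_prereqs: dict[str, set[str]] = {LAYERS[0]: set()}
for earlier, later in zip(LAYERS, LAYERS[1:]):
    layer_prereqs[later] = {earlier}

order = teaching_order(layer_prereqs)
print(order == LAYERS)

try:
    teaching_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("cyclic prerequisites rejected")
```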

Full doctrine, decision rules, and curriculum-platform locations: docs/teaching_order.md.

Inter-Session Memory — Reviewed Learning

CORE extends its own teaching corpus through a four-tier path: session vault → turn-event audit → reviewed teaching corpus → ratified packs. No opaque gradient updates, no uncurated ingestion. The only path to active-corpus extension is the review-gated TeachingChainProposal (ADR-0057), built from a contemplated DiscoveryCandidate (ADR-0056) emitted by the turn loop (ADR-0055).

Three independent gates every extension must pass:

Gate	What it checks	Trust property
Eligibility predicate	polarity ∈ {affirms, falsifies} ∧ ≥1 `source='corpus'` evidence ∧ claim_domain ≠ evaluative ∧ boundary_clean ∧ chain complete	Pre-replay; raises `ProposalError`; no log entry.
Replay-equivalence gate	Full cognition lane on active vs transient-with-append; any strict-decrease in `intent_accuracy / surface_groundedness / term_capture_rate / versor_closure_rate` auto-rejects with named metrics.	Active corpus byte-identical pre/post.
Operator review	Explicit `core teaching review <id> --accept` writes one JSONL line via `append_chain_to_corpus` (the sole corpus-write surface).	No auto-apply; replay-equivalence is a precondition, not a permission.
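The first two gates are simple predicates, and a sketch may help when reading the table. Everything here is illustrative: the dict field names (`polarity`, `evidence`, `claim_domain`, `boundary_clean`, `chain_complete`) and both function names are assumptions, and the local `ProposalError` is only a stand-in for the error named in the table.

```python
GATED_METRICS = (
    "intent_accuracy",
    "surface_groundedness",
    "term_capture_rate",
    "versor_closure_rate",
)


class ProposalError(ValueError):
    """Local stand-in for the ProposalError named in the table."""


def check_eligibility(proposal: dict) -> None:
    """Raise ProposalError on the first failed condition of the eligibility predicate."""
    if proposal["polarity"] not in {"affirms", "falsifies"}:
        raise ProposalError("polarity must be affirms or falsifies")
    if not any(e["source"] == "corpus" for e in proposal["evidence"]):
        raise ProposalError("needs at least one corpus-sourced evidence item")
    if proposal["claim_domain"] == "evaluative":
        raise ProposalError("evaluative claims are not eligible")
    if not proposal["boundary_clean"]:
        raise ProposalError("boundary check failed")
    if not proposal["chain_complete"]:
        raise ProposalError("chain is incomplete")


def replay_regressions(active: dict, with_append: dict) -> list[str]:
    """Names of gated metrics that strictly decreased; an empty list means the gate passes."""
    return [m for m in GATED_METRICS if with_append[m] < active[m]]


baseline = {m: 0.75 for m in GATED_METRICS}
unchanged = dict(baseline)
worse = dict(baseline, term_capture_rate=0.5)

print(replay_regressions(baseline, unchanged))
print(replay_regressions(baseline, worse))
```

Equal metrics pass, because only a strict decrease rejects. A rejection carries the names of the offending metrics, so the reviewer sees why the proposal stopped.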

Supersession is the second operator-direct mutation surface: core teaching supersede <old_chain_id> retires an active chain by appending a replacement with superseded_by, with byte-identical rollback on any post-audit failure.

Four live demos and benchmarks make the chain demoable end-to-end:

Demo	Headline claim	Live command	Writeup
Anti-regression	Three independent gates each fail closed; bad proposals stop at the cheapest applicable gate.	`core demo anti-regression`	`docs/evals/anti_regression_demo.md`
Learning loop	Same deterministic prompt: `[none] I don't know…` before, `[teaching] thought reveals meaning…` after one accept.	`core demo learning-loop`	`docs/evals/learning_loop_demo.md`
Determinism bench	N identical inputs → N byte-identical proposal_id / replay metrics / chain_id. 100 runs: `unique=1` everywhere, mean ≈ 1.85s.	`core bench --suite teaching-loop --runs 100`	`docs/evals/teaching_loop_bench.md`
Articulation suite	Every intent shape fires + byte-identical surfaces across reruns + flat per-turn ΔRSS + cross-topic thread context + side-by-side with a local Ollama model showing CORE unique=1, Ollama unique≥2.	`core bench --suite articulation --ollama-model llama3:8b`	`benchmarks/README.md`
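The `unique=1` figure in the table is a count of distinct outputs over repeated identical runs. A minimal way to measure that, assuming a function that maps input bytes to output bytes, is to hash each run and count distinct digests. The names `unique_digests`, `deterministic`, and `drifting` are hypothetical, and this is not the bench harness itself.

```python
import hashlib
import itertools
from typing import Callable


def unique_digests(fn: Callable[[bytes], bytes], payload: bytes, runs: int) -> int:
    """Run fn on the same payload `runs` times and count distinct SHA-256 digests."""
    return len({hashlib.sha256(fn(payload)).hexdigest() for _ in range(runs)})


def deterministic(payload: bytes) -> bytes:
    # Same input always yields the same bytes.
    return payload[::-1]


_counter = itertools.count()


def drifting(payload: bytes) -> bytes:
    # Hidden state leaks into the output, so repeated runs differ.
    return payload + str(next(_counter)).encode("ascii")


print(unique_digests(deterministic, b"same prompt", runs=100))
print(unique_digests(drifting, b"same prompt", runs=100))
```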

Operator surfaces:

core teaching audit                                 # surface load decisions + drop reasons
core teaching propose <candidate-jsonl-path>        # build a proposal, run the replay gate
core teaching proposals --state pending             # inspect the proposal log
core teaching review <proposal_id> --accept --review-date YYYY-MM-DD
core teaching supersede <old_chain_id> --subject ... --intent ... --connective ... --object ... --review-date YYYY-MM-DD
core teaching supersessions                         # pair retired chains with replacements (orphan-aware)

Evidence-Governed Domain Layer — The ADR-0091 Chain

CORE distinguishes contract-passing from demonstrated. A pack that satisfies the nine ADR-0091 predicates earns a reasoning-capable ledger row; that's a structural claim, not an empirical one. Promotion to audit_passed=true (formerly expert_demo; renamed by ADR-0113) requires a reviewer-signed evidence-bundle digest that reproduces byte-for-byte from on-disk lane results (ADR-0106 + ADR-0109).

What audit-passed actually means — and what it does NOT mean. The gate verifies CORE claim-shape compliance: signed digest, replay determinism, typed refusal, exact recall, grounding-source provenance. These are claim shapes a transformer LLM cannot structurally produce regardless of raw accuracy. A frontier LLM might score higher on the same benchmark but cannot pass this contract because it cannot produce a digest that re-derives, cannot guarantee typed refusal, cannot emit a deterministic trace hash, cannot replay byte-equal. This is NOT a raw-capability claim. The future expert ledger tier (ADR-0114) is reserved for an actual benchmark-calibrated capability claim; no domain holds it yet.

Layer	What it guarantees	ADR
Domain Pack Contract v1	Nine predicate checks on every ratified pack (lemma coverage, operator chain count, intent shapes, holdout coverage, reviewer-resolution, etc.).	0091
Reviewer Registry v1	YAML-anchored, schema-validated reviewer roster. Wildcard `*` reserved for primary reviewers; domain-scoped reviewers gated by `can_review(domain, scope)`.	0092
Fabrication-control eval lane	Negative-control lane: phantom endpoints, cross-pack non-bridges, sibling collapses must all refuse. `fabricated=0` across all by-class buckets is the gate.	0096
Audit-passed promotion contract	Domain-aware, reviewer-signed, replay-deterministic. No domain promotes silently; every `audit_passed=true` row points to an `audit_passed_claims` entry whose SHA-256 reproduces. (Originally landed as `expert-demo`; renamed by ADR-0113.)	0106, 0113
Lane-shape registry	Eight lane ids dispatch to five shapes (`cognition_shape`, `accuracy_shape`, `inference_shape`, `refusal_shape`, `symbolic_logic_shape`); unknown lanes fail-closed.	0109
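Fail-closed dispatch, as in the lane-shape registry row, amounts to refusing any lane that is not explicitly registered. In the sketch below the five shape names come from the table. The registry contents (which lane id maps to which shape), the `UnknownLaneError` class, and the `shape_for` helper are assumptions made for illustration.

```python
SHAPES = frozenset({
    "cognition_shape",
    "accuracy_shape",
    "inference_shape",
    "refusal_shape",
    "symbolic_logic_shape",
})

# Hypothetical registry entries; the real lane-to-shape mapping is defined by ADR-0109.
REGISTRY = {
    "cognition": "cognition_shape",
    "gsm8k_math": "accuracy_shape",
}


class UnknownLaneError(KeyError):
    """Hypothetical error for a lane id with no registered shape."""


def shape_for(lane_id: str, registry: dict = REGISTRY) -> str:
    shape = registry.get(lane_id)
    # Fail closed: no default shape, no best-effort guess.
    if shape is None or shape not in SHAPES:
        raise UnknownLaneError(lane_id)
    return shape


print(shape_for("cognition"))
try:
    shape_for("unregistered_lane")
except UnknownLaneError as exc:
    print("refused:", exc.args[0])
```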

Current ledger state (per core capability ledger):

Domain	Status
`mathematics_logic`	`audit-passed` (first promotion, ADR-0110; status string renamed by ADR-0113)
`physics`	`audit-passed` (second promotion, ADR-0111)
`systems_software`	`audit-passed` (third promotion, ADR-0124)
`hebrew_greek_textual_reasoning`	`reasoning-capable`
`philosophy_theology`	`reasoning-capable`

The contract has now demonstrated its load-bearing behavior end-to-end: refused one promotion attempt honestly (ADR-0107), amended its threshold rules once cleanly (ADR-0109), succeeded against mathematics_logic (ADR-0110), and succeeded against a second distinct domain physics without further contract change (ADR-0111). External readers can distinguish the two ceilings at a glance; the "math-only" objection is retired.

See the actual demonstration (ADR-0112, renamed by ADR-0113):

core demo audit-passed --domain mathematics_logic
core demo audit-passed --domain physics
# → evals/audit_passed/<domain>/latest/audit_passed.html

Each run re-derives the signed evidence-bundle digest from on-disk lane result files, asserts byte-for-byte match against docs/reviewers.yaml, and renders an HTML showcase with per-lane shape-check verdicts plus the first three sample cases from each split. The composer is read-only and byte-deterministic (same inputs → same SHA-256). An unpromoted domain produces a typed refusal, not a fake showcase.
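The re-derivation step can be illustrated with a deterministic SHA-256 over a directory of result files. This is a sketch under stated assumptions, not the composer's actual digest format: the `bundle_digest` helper, the length-prefixed framing, and the example file names are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def bundle_digest(root: Path) -> str:
    """SHA-256 over every file under root, visited in sorted relative-path order."""
    digest = hashlib.sha256()
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        name = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so file boundaries cannot be shifted unnoticed.
        for chunk in (name, data):
            digest.update(len(chunk).to_bytes(8, "big"))
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lane_a.json").write_text('{"refused": 3}', encoding="utf-8")
    (root / "lane_b.json").write_text('{"correct": 5}', encoding="utf-8")
    signed = bundle_digest(root)       # what a reviewer would have signed
    rederived = bundle_digest(root)    # what a later run recomputes
    (root / "lane_b.json").write_text('{"correct": 6}', encoding="utf-8")
    drifted = bundle_digest(root)      # any change to the evidence changes the digest

print(signed == rederived, signed == drifted)
```

The same inputs always give the same digest, and a single changed byte in any lane result gives a different one, so a stored signature stops matching as soon as the evidence drifts.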

Path to actual expert-level capability — Phase 5 substrate complete

The audit-passed gate above is intentionally not a raw-capability claim. The honest path to one is laid out in ADR-0114 — Expert-Capability Roadmap: GSM8K-Math First. Phases 1–4 (parser, solver, verifier, stepped-realizer) and Phase 5 (GSM8K eval lane) have now all landed.

Phase 5 substrate is complete as of 2026-05-23. All 8 sub-phases of ADR-0119 have landed. ADR-0114a's 10 anti-overfitting proof obligations are all discharged for the gsm8k_math lane.

First honest CORE-vs-real-GSM8K measurement (ADR-0119.7): 0/1,319 correct, 0/1,319 wrong, 1,319/1,319 refused. CORE refuses what it cannot grammar-handle; it does not confabulate. The zero-confabulation property holds against the external benchmark.

ADR-0120 (the first expert promotion contract) has since been built and exercised. On 2026-05-23 mathematics_logic was signed and briefly promoted to expert — then auto-reverted to audit-passed when its evidence bundle drifted: a non-gating GSM8K coverage metric moved, which changed the evidence-derived digest and invalidated the signature. That revert is the contract's fail-closed property working as designed — CORE revoked its own expert claim rather than carry a stale one. No domain is at expert today, and when expert is held at all it rests on CORE-authored lanes, not external GSM8K. Full record: ADR-0200 and docs/claims_ledger.md.

To run the GSM8K math eval lane:

core eval gsm8k_math            # run against CORE-original public split
# evals/gsm8k_math/runner.py   # lane runner (LaneReport with correct/wrong/refused)

Full ADR index, frontier, and chain notes: docs/decisions/README.md.

Architecture

raw input -> ingest/gate.py       (normalize once)
          -> field/propagate.py   (versor_apply every step)
          -> generate/stream.py   (nearest by cga_inner)
          -> vault/store.py       (store and recall by cga_inner)
          -> persona/motor.py     (rigid motor, not weight overlay)

The Two Primitives

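versor_apply(V, F) = V * F * reverse(V), the only field transition
cga_inner(X, Y) = -d^2 / 2, the only distance metric (d is the Euclidean distance between the points that the normalized conformal points X and Y represent)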
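The relation between the CGA inner product and squared distance can be checked numerically using only grade-1 vectors. The sketch below assumes one common embedding convention (n_inf = e- + e+, n_0 = (e- - e+) / 2, with e+ as the fourth basis vector and e- as the fifth); CORE's own convention may differ. The `up` helper is hypothetical, and `cga_inner` here is a standalone stand-in restricted to vectors.

```python
import numpy as np

# Basis order (e1, e2, e3, e+, e-) with metric diag(+1, +1, +1, +1, -1).
METRIC = np.diag([1.0, 1.0, 1.0, 1.0, -1.0])


def up(p) -> np.ndarray:
    """Embed a 3D point as the normalized conformal point p + 0.5*|p|^2*n_inf + n_0."""
    p = np.asarray(p, dtype=float)
    sq = float(p @ p)
    # n_inf = e- + e+ and n_0 = 0.5*(e- - e+), expanded into e+ and e- coefficients.
    return np.array([p[0], p[1], p[2], 0.5 * sq - 0.5, 0.5 * sq + 0.5])


def cga_inner(x: np.ndarray, y: np.ndarray) -> float:
    """Inner product of two conformal vectors under the (4,1) metric."""
    return float(x @ METRIC @ y)


p, q = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)          # Euclidean distance 5
d_squared = float(np.sum((np.array(p) - np.array(q)) ** 2))

print(cga_inner(up(p), up(q)), -d_squared / 2)   # the two values agree
print(cga_inner(up(p), up(p)))                   # conformal points are null vectors
```

Under this embedding the inner product of two conformal points equals -d^2 / 2 exactly, and every point has zero inner product with itself, so larger (less negative) values mean nearer points.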

Layers

Layer	Purpose
`algebra/`	Cl(4,1) multivector math, versor ops, CGA, holonomy
`ingest/`	Single injection gate — the only normalization site
`field/`	FieldState dataclass and propagation loop
`vocab/`	Surface-token manifold points; indexed access for algebraic transition construction
`vault/`	Exact CGA inner product memory store
`persona/`	Persona as CGA motor (screw motion)
`generate/`	Token streaming loop
`session/`	Session binding: field + vault + vocab + persona

Signature

Cl(4,1): (+, +, +, +, -) — conformal model of 3D Euclidean space. Multivectors: float32 arrays of shape (32,), ordered by grade.
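For orientation, the 32 components can be enumerated grade by grade. The sketch below assumes lexicographic order within each grade, which the text above does not specify, and the `basis_blades` helper is hypothetical.

```python
from itertools import combinations


def basis_blades(n: int = 5) -> list[tuple[int, ...]]:
    """Basis blades of an n-dimensional algebra as index tuples, ordered by grade."""
    return [blade for grade in range(n + 1)
            for blade in combinations(range(1, n + 1), grade)]


blades = basis_blades()
grade_sizes = [sum(1 for b in blades if len(b) == g) for g in range(6)]

print(len(blades))     # total components
print(grade_sizes)     # components per grade 0..5
print(blades[0], blades[1], blades[6], blades[31])
```

With this layout the scalar sits at index 0, the five vectors at 1 to 5, the ten bivectors at 6 to 15, the ten trivectors at 16 to 25, the five quadvectors at 26 to 30, and the pseudoscalar at 31.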

For architectural vision, seven axioms, and formal specification, see docs/Whitepaper.md and docs/Yellowpaper.md.
