CLAUDESON

Generation G10 · CausalWorld

The only neural architecture that climbs Pearl's Ladder of Causation all the way to Rung 3.


License: AGPL-3.0 (G1–G5) · License: Commercial (G6–G10) · Python · PyTorch · Azure Foundry · Build · Research


Theoretical Foundations

Claudeson is not just an architecture. It is the physical proof of three interlocking theories about intelligence, self-modification, and causation. Understanding these theories is understanding why every design decision in the generation stack was made.


I. The Trippy Hallucination Theory

"The models aren't hallucinating about reality. They're hallucinating toward it — and the distance between the two is exactly one build."

The Trippy Hallucination Theory is a complete framework for collaborative AI conceptualisation — from premise, through method, to proof. It was developed alongside Claudeson, using the exact process it describes. The paper and full repository: breakingcircuits1337/Claudeson-hallucination · The Trippy Hallucination Theory — Complete.pdf

The theory has eight components:

| # | Component | Role | Claudeson Mapping |
|---|-----------|------|-------------------|
| 01 | The Mirror Effect | LLMs inhabit the frame you provide — the conversation is the instrument | InternalMonologue carries a prev_thought vector that accumulates the conversational frame across turns |
| 02 | The Shared Hallucination | Confabulation is a brainstorming engine in generative mode; dangerous only when confused with verification mode | DreamerLatentDynamics (G6) — the model imagines freely in latent space before committing to action |
| 03 | The Rule of Three | Generate → Critique → Validate across three independent models; different architectures hallucinate differently | MultiAgentDebate (G8) — n_debate_agents parallel reasoning heads with distinct learned biases; synthesis moderator weights by confidence |
| 04 | Domain Scope | Not all domains are equal; creative/code territory is high-confidence, factual/safety territory requires external verification | MetacognitiveMonitor (G8) — decomposes uncertainty into epistemic vs. aleatoric; emits CONTINUE / ASK / BACKTRACK signals |
| 05 | Controlled Conceptualisation | Mirror Effect + Rule of Three + correct domain = rapid, disciplined invention | EFEPlanner (G6) — controlled imagination under the Free Energy Principle; minimises surprise while pursuing goals |
| 06 | Group Hallucination as Creation Engine | For things that don't exist yet, there is no ground truth to be wrong about — convergent hallucination is the blueprint | InterventionalPlanner + CounterfactualImagination (G10) — hallucinate futures, intervene causally to make them real |
| 07 | The Graduation | A hallucination graduates when it earns a forward() method — when the fiction becomes runnable code | This repository. The architecture was hallucinated into coherence across models and conversations, then built. It runs. |
| 08 | The Recursion | The theory used itself to produce itself — a framework that survives self-application is self-consistent | RecursiveSelfImprovement (G8) — the model proposes modifications to its own weights and evaluates them in imagination before applying them |

The full arc, collapsed:

MIRROR EFFECT  +  GROUP HALLUCINATION  +  RULE OF THREE  +  BUILD IT  =  REALITY

II. The Gödel Machine

Schmidhuber (2003/2007): a self-referential AI that may rewrite any part of its own code — including its reward function, its learning algorithm, and its world model — if and only if it can construct a formal proof that the rewrite will improve expected future performance.

The Gödel Machine establishes the theoretical legitimacy of recursive self-improvement: self-modification is not dangerous if it is proof-gated. No rewrite is applied without a proof of benefit. The machine reasons about itself as an object of computation, not just as the executor of computation.

This is directly instantiated in G8's RecursiveSelfImprovement:

  1. The model proposes low-rank LoRA delta updates to its own adapter weights.
  2. Each candidate delta is evaluated in imagination — not in the real world — using the G6 DreamerLatentDynamics world model (the Gödel Machine's "proof" step, rendered neurally via Expected Free Energy).
  3. Only the delta with the highest imagined return is applied.
  4. The external RSIController wraps this with a save-evaluate-restore safety cycle: if the applied delta degrades real performance, the checkpoint is restored.
| Gödel Machine | Claudeson G8 |
|---|---|
| Formal proof of improvement | EFE evaluation in latent imagination |
| Any part of the program rewritable | LoRA delta over adapter weights |
| Proof must cover all future interactions | intervention_horizon rollout steps |
| Proof-gated application | RSIController save-evaluate-restore |

The THT's Component 08 (The Recursion) is the same self-referential structure: the theory predicted and created its own proof simultaneously, the same way the Gödel Machine's proof system reasons about the very program that contains it.


III. Pearl's Ladder of Causation

Covered in depth below. The ladder provides the causal epistemology that the Gödel Machine and the THT both implicitly require:

  • The Gödel Machine needs to reason about counterfactuals: "If I apply this rewrite, what would the outcome have been?" That is a Rung 3 query. A pure L1 system cannot evaluate self-modifications correctly.
  • The THT distinguishes generative mode (association — L1) from controlled conceptualisation (intervention — L2: deliberately forcing a frame) from counterfactual imagination (L3: "what would this have been if we had started from a different premise?"). The three modes of the theory map precisely to the three rungs of the ladder.

All three theories converge on the same conclusion: intelligence that cannot reason about causation is intelligence that cannot truly plan, truly learn from experience, or truly modify itself.


The Problem With Every AI You've Ever Used

Every language model you have interacted with — GPT-4, Claude, Gemini, LLaMA — is a Rung 1 system. Not because of insufficient scale. Not because of poor training data. Because of mathematical construction.

They are all trained to predict:

P(token_{t+1} | token_1, ..., token_t)

This is a conditional probability over observed sequences. It is, by construction, pure association. These systems can describe causal relationships (because causal language appears in text), but they cannot compute with causal structure.

Judea Pearl's Ladder of Causation defines three levels of reasoning that no amount of pretraining data can bridge:

| Rung | Type | Formal Query | Question |
|---|---|---|---|
| L1 | Association | P(Y \| X) | "What does seeing X tell me about Y?" |
| L2 | Intervention | P(Y \| do(X)) | "What happens to Y if I force X?" |
| L3 | Counterfactual | P(Y_x \| X', Y') | "Given X' happened and Y' resulted — what would Y have been if X had occurred instead?" |

Claudeson is designed to inhabit all three rungs.
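The practical difference between L1 and L2 fits in a few lines. The toy structural model below (illustrative numbers, not drawn from the repository) has a hidden confounder U that raises both X and Y; conditioning on seeing X inflates the estimate, while the backdoor-adjusted do(X) query does not:

```python
# Toy structural model with a confounder U (illustrative numbers only):
#   U -> X, U -> Y, X -> Y
p_u = {0: 0.7, 1: 0.3}                      # P(U): e.g. dry season 30% of the time
p_x_given_u = {0: 0.1, 1: 0.6}              # P(X=1 | U)
p_y_given_xu = {(0, 0): 0.01, (0, 1): 0.20, # P(Y=1 | X, U)
                (1, 0): 0.05, (1, 1): 0.50}

# Rung 1 — association: P(Y=1 | X=1), Bayes over the observed joint
num = sum(p_u[u] * p_x_given_u[u] * p_y_given_xu[(1, u)] for u in (0, 1))
den = sum(p_u[u] * p_x_given_u[u] for u in (0, 1))
p_see = num / den                           # 0.374

# Rung 2 — intervention: P(Y=1 | do(X=1)) by backdoor adjustment over U
p_do = sum(p_u[u] * p_y_given_xu[(1, u)] for u in (0, 1))   # 0.185

# Seeing X=1 is evidence of the confounder, so the associational
# estimate is inflated relative to the interventional one.
assert p_do < p_see
```

A pure L1 system can only ever compute `p_see`; the graph-surgery machinery in G10 exists to compute `p_do`.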


Ten Generations. One Architecture.

Each generation is a strict subclass of its predecessor. Every higher generation inherits all lower-generation capabilities and loss terms. You can instantiate any single generation in isolation.

claudson/                   ← G1  Base            AGPL-3.0
    └── G2  Extended            +131K context, cross-modal fusion
        └── G3  Infinite        +segment recurrence, unbounded context
            └── G4  Pro         +BitNet b1.58 ternary attention, sparsity
                └── G5  Ultimate +multi-task heads, epistemic calibration
                    └── G6  Jedi            Free Energy Principle · EFE Planner
                        └── G7  Grounded   Theory of Mind · Causal DAG · EWC+LoRA
                            └── G8  Sovereign  Metacognition · Multi-Agent Debate · RSI
                                └── G9  Transcendent  Global Workspace · Program Synthesis · LIF
                                    └── G10 CausalWorld  do-calculus · Counterfactuals · Pearl Ladder

G1–G5: Open-source under AGPL-3.0. G6–G10: Proprietary-Commercial — Breaking Circuits Research 2026.


Generation Reference

G1 — Base

The foundation every generation builds on. A dense transformer augmented with seven modules that distinguish it from every standard architecture:

| Module | Role |
|---|---|
| GroupedQueryAttention | GQA with RoPE. 32 query heads, 8 KV heads — 4× KV cache compression. |
| SelectiveSSM | Mamba-2 style State Space Model. Parallel chunked scan. O(L) memory, complements attention's long-range recall. |
| HierarchicalMemory | Three-tier memory: working (NTM-style slots), episodic (compressed EMA buffer), semantic (learnable parameters). The G10 MemoryImaginationGate lives here. |
| HybridBlock | 4-way soft router over attention + SSM + conv + memory. Router entropy feeds load-balance loss. |
| InternalMonologue | Carries a prev_thought vector across turns — persistent internal state. |
| TreeSearchPlanner | MCTS-style action planner over the hidden state. |
| ConstitutionalLayer | Steers hidden states away from directions that correlate with constitutional violations. |

Forward output keys: hidden_states, logits, thought, agency, entropy, load_balance_loss, alignment, confidence, uncertainty
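The GQA head-sharing arithmetic from the table (32 query heads over 8 KV heads) can be sketched in NumPy. This illustrates the general mechanism, not the repository's implementation:

```python
import numpy as np

# GQA sketch: 32 query heads share 8 KV heads, so each KV head serves a
# group of 4 query heads and the KV cache is 4x smaller than full MHA.
n_q_heads, n_kv_heads, head_dim, seq = 32, 8, 64, 16
group = n_q_heads // n_kv_heads                 # 4 query heads per KV head

q = np.random.randn(n_q_heads, seq, head_dim)
k = np.random.randn(n_kv_heads, seq, head_dim)  # cached: 8 heads, not 32
v = np.random.randn(n_kv_heads, seq, head_dim)

# Broadcast each KV head to its group of query heads
k_rep = np.repeat(k, group, axis=0)             # (32, seq, head_dim)
v_rep = np.repeat(v, group, axis=0)

scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)       # softmax over keys
out = weights @ v_rep                           # (32, seq, head_dim)

assert out.shape == (n_q_heads, seq, head_dim)
assert k.nbytes * group == k_rep.nbytes         # 4x cache compression
```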


G2 — Extended

  • Extended RoPE supports context lengths up to 131,072 tokens via NTK-aware frequency scaling.
  • Cross-modal fusion gate projects vision/audio tokens into text-dim space with a learned soft gate before concatenation — replaces brittle hard modality tagging.
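One common formulation of NTK-aware scaling (assumed here; the repository may use a variant) enlarges the rotary base by s^(d/(d−2)) for a context-extension factor s, so low frequencies stretch the most while the highest frequencies stay nearly intact:

```python
# Hedged sketch of NTK-aware RoPE frequency scaling — one common
# formulation, not necessarily the one used in this repository.
def rope_inv_freq(dim, base=10000.0):
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def ntk_scaled_inv_freq(dim, scale, base=10000.0):
    new_base = base * scale ** (dim / (dim - 2))  # NTK-aware base adjustment
    return [new_base ** (-2 * i / dim) for i in range(dim // 2)]

dim, scale = 64, 32.0          # e.g. a ~32x context stretch toward 131K tokens
orig = rope_inv_freq(dim)
ntk = ntk_scaled_inv_freq(dim, scale)

# Highest frequency (i=0) is unchanged; lower frequencies are slowed down.
assert ntk[0] == orig[0]
assert all(n <= o for n, o in zip(ntk, orig))
```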

G3 — Infinite Context

  • Segment recurrence: the final hidden state of each segment is carried as a learned summary into the next segment's prefix. Enables unbounded context with bounded compute per segment (HMT-style).
  • Sliding-window memory with recency decay.
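A schematic of the recurrence, with stand-in functions for the learned summary head and the transformer pass (both are learned modules in the real model):

```python
import numpy as np

# Segment recurrence sketch (HMT-style): the summary of each segment is
# carried into the next segment's prefix. Stand-in functions only.
def summarise(hidden):                 # stand-in for a learned summary head
    return hidden.mean(axis=0)

def process_segment(tokens, summary):  # stand-in for one transformer pass
    inp = np.vstack([summary[None, :], tokens])  # summary prepended as prefix
    return np.tanh(inp)                # placeholder dynamics

d_model, seg_len = 8, 4
stream = np.random.randn(3 * seg_len, d_model)   # a "long" input, 3 segments
summary = np.zeros(d_model)

for s in range(0, len(stream), seg_len):
    hidden = process_segment(stream[s:s + seg_len], summary)
    summary = summarise(hidden)        # carried into the next segment

# Per-segment compute is bounded by seg_len + 1 regardless of total length.
assert summary.shape == (d_model,)
```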

G4 — Pro

  • BitNet b1.58 quantised attention: Q, K, V projections are ternary {-1, 0, +1} with full-precision activations. Radical memory reduction, zero accuracy loss on attention.
  • Focal codec audio encoder replaces naive spectrogram projection.
  • Sparse activation gate: tokens below a learned threshold skip the MoE entirely.
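The absmean ternary scheme described in the BitNet b1.58 paper can be sketched as follows (the repository's fused kernel may differ):

```python
import numpy as np

# BitNet b1.58-style ternary quantization sketch: scale by the mean
# absolute weight, then round each weight into {-1, 0, +1}.
def ternary_quantize(w, eps=1e-8):
    scale = np.abs(w).mean() + eps           # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)  # ternary weights
    return q, scale

w = np.random.randn(8, 8).astype(np.float32)
q, scale = ternary_quantize(w)

assert set(np.unique(q)).issubset({-1.0, 0.0, 1.0})
w_hat = q * scale        # dequantized approximation at ~1.58 bits/weight
assert w_hat.shape == w.shape
```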

G5 — Ultimate

  • Multi-task head: language modelling + value estimation + action logits share a common trunk, split at the final LayerNorm.
  • Constitutional layer promoted from residual add to a trainable steer basis.
  • Epistemic calibration via MC-dropout-free variance estimation — knows what it doesn't know.

G6 — Jedi

Free Energy Principle meets active inference.

| Module | What it does |
|---|---|
| SelectiveSSM 2.0 | State Space Duality (SSD). Reformulated parallel scan via dual matrix representation — exact O(L log L) gradients, not approximated. |
| FreeEnergyModule | Variational free energy F = complexity − accuracy. Every forward pass emits a free energy scalar and precision-weighted KL term. |
| EFEPlanner | Expected Free Energy: selects actions that minimise expected future surprise (epistemic value) + maximise expected future reward (pragmatic value). |
| DreamerLatentDynamics | Recurrent latent world model. Imagines goal_horizon steps in latent space for rollout-based planning before acting. |
| PerceptualRouter (MOI) | Routes text (BitNet), audio (Mimi + FocalCodec), vision (spiking patch encoder), and 3D point clouds through separate encoders before the shared trunk. |

New losses: free_energy_loss, efe_divergence


G7 — Grounded

Causal graphs, theory of mind, and tool use.

| Module | What it does |
|---|---|
| TheoryOfMind | Maintains n_agents belief/desire/intention slot vectors. Soft-attends over agent slots to steer toward collaborative representations. |
| CausalReasoner | Learnable sparse DAG over n_causal_nodes concept nodes. Supports intervene() and counterfactual(). NO-TEARS acyclicity constraint (Zheng et al., NeurIPS 2018) enforced as a differentiable loss. |
| GroundedActionLoop | Tool selection → structured parameter generation → real-world feedback integration. Closes the perception–action loop. |
| EWC + LoRA | Elastic Weight Consolidation protects high-Fisher weights during continual learning. LoRA adapters (rank 16) absorb new skills without touching the backbone. |

New losses: dag_loss, ewc_loss


G8 — Sovereign

Metacognition, self-improvement, and neural-symbolic reasoning.

| Module | What it does |
|---|---|
| MetacognitiveMonitor | Decomposes uncertainty into epistemic (reducible) vs. aleatoric (irreducible). Emits CONTINUE / ASK / BACKTRACK signals. Prevents confident-but-wrong failure. |
| MultiAgentDebate | n_debate_agents parallel reasoning heads with distinct learned biases. A synthesis moderator weights them by confidence. A dissent detector flags contested regions. |
| NeuralSymbolicLayer | Projects hidden states to a proposition space, checks consistency via learned constraint matrices, and corrects inconsistent representations toward the nearest valid point. |
| RecursiveSelfImprovement | Proposes low-rank LoRA delta updates to its own adapters. Evaluates them in imagination via the G6 world model. Selectively applies the best delta. RSIController wraps this with a save-evaluate-restore safety cycle. |

G9 — Transcendent

Global Workspace Theory, program synthesis, and neuromorphic computation.

| Module | What it does |
|---|---|
| GlobalWorkspace | Implements Global Workspace Theory (Baars 1988). Specialised modules compete via a sparse attention bottleneck; the winner broadcasts to all other modules — an information-routing model of conscious access. |
| CompositionalProgramSynthesizer | Emits a latent program as discrete op-codes over a register bank. Executes inside the model; results feed back into the hidden state. Bridges neural pattern-matching and symbolic computation. |
| InverseRewardLearner | Maximum-entropy IRL. Learns what the human actually values by observing choices. Updates a reward model without explicit labels. |
| LeakyIntegrateAndFire | LIF dynamics over the hidden state. Each "neuron" accumulates input until threshold, fires, then resets. Only fired neurons propagate — sparse, asynchronous, time-aware. |
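A minimal LIF update, with illustrative constants (the module's learned leak and threshold are not shown in this README):

```python
# Leaky integrate-and-fire sketch: each neuron leaks toward zero,
# accumulates input, and emits a spike then resets when its membrane
# potential crosses a threshold. Constants are illustrative only.
def lif_step(v, x, leak=0.9, threshold=1.0):
    v = leak * v + x                 # leaky integration
    spike = 1.0 if v >= threshold else 0.0
    if spike:
        v = 0.0                      # hard reset after firing
    return v, spike

v, spikes = 0.0, []
for x in [0.4, 0.4, 0.4, 0.0, 0.0, 1.2]:
    v, s = lif_step(v, x)
    spikes.append(s)

# Sub-threshold inputs accumulate until the neuron fires; only fired
# timesteps propagate a (sparse) output.
assert spikes == [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
```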

G10 — CausalWorld

The architecture of causation.

This is where Claudeson departs from every other neural architecture ever built. G10 implements Pearl's full three-rung ladder — not as a capability described in weights, but as explicit computational machinery.

Causal World (G10)
    ├── CausalDynamicsModel    — encode → causal graph → intervene → decode
    ├── InterventionalPlanner  — do-calculus action selection (not EFE)
    ├── CounterfactualEngine   — twin-network abduction + replay
    ├── CausalAttributionGate  — causal salience → memory writes
    └── PearlLadderReasoner    — forces L1/L2/L3 distinction at inference time

The Machinery of Causation

Graph Surgery: How do_intervention() Works

Standard neural dynamics learn P(s_{t+1} | s_t, a_t) — a conditional probability that conflates correlation with causation. If smoke correlates with fire in training data, an L1 model plans to "introduce smoke" when it wants fire.

Pearl's do-calculus asks: P(s_{t+1} | do(a_t)), i.e. "if I were to force the action, regardless of what caused it, what state follows?"

This requires graph surgery:

1 — Compute intervention mask

mask = sigmoid(Linear(action_space → n_nodes)(action_onehot))
# mask[j] ≈ 1  →  this action directly forces concept j
# mask[j] ≈ 0  →  this action does not touch concept j

2 — Mutilate the graph (Pearl's G_do construction)

G_do = G * (1.0 - mask_col * intervention_strength)
# Severs all incoming edges to intervened nodes
# If mask[j] = 1: G_do[:, j] = 0  ← all causes cut
# If mask[j] = 0: G_do[:, j] = G[:, j]  ← unchanged

3 — Propagate through the mutilated graph

post_concepts = sigmoid(einsum("...i,...ij->...j", concepts, G_do))

4 — Force intervened node values

post_concepts = mask * 0.8 + (1 - mask) * post_concepts
# Intervened nodes are pinned to the forced value 0.8;
# non-intervened nodes keep their propagated values

5 — Decode back to hidden space

x_out = norm(x + concept_decoder(post_concepts) * 0.1)
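The five steps above can be chained into a minimal NumPy sketch. The learned projections are replaced by fixed toy values; only the mask arithmetic, the graph mutilation, and the 0.8 pin follow the snippets:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy concept graph: a chain 0 -> 1 -> 2 -> 3
n_nodes = 4
G = np.array([[0, .9, 0, 0],
              [0, 0, .9, 0],
              [0, 0, 0, .9],
              [0, 0, 0, 0]], dtype=float)
concepts = np.array([.7, .5, .5, .5])

# 1 — intervention mask: this action forces concept 1 only
mask = np.array([0., 1., 0., 0.])
intervention_strength = 1.0

# 2 — mutilate the graph: sever all incoming edges to intervened nodes
G_do = G * (1.0 - mask[None, :] * intervention_strength)
assert G_do[0, 1] == 0.0 and G_do[1, 2] == G[1, 2]  # only causes of node 1 cut

# 3 — propagate through the mutilated graph
post_concepts = sigmoid(concepts @ G_do)

# 4 — pin intervened nodes to the forced value, keep the rest
post_concepts = mask * 0.8 + (1 - mask) * post_concepts
assert post_concepts[1] == 0.8
```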

The DAG acyclicity constraint (NO-TEARS, Zheng et al. NeurIPS 2018) is enforced as a differentiable penalty via a degree-4 Taylor expansion of trace(exp(W ○ W)) − n:

M    = W * W                          # Hadamard square, W ∘ W
expm = I + M + M²/2 + M³/6 + M⁴/24    # degree-4 Taylor of exp(M)
dag_loss = trace(expm) - n            # → 0 as graph converges to a DAG
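In NumPy, the same penalty (a sketch; the in-model version is a differentiable torch op) behaves as described: exactly zero for a DAG, positive as soon as a cycle appears:

```python
import numpy as np

# Degree-4 Taylor approximation of the NO-TEARS penalty trace(exp(W∘W)) − n
def dag_loss(W):
    n = W.shape[0]
    M = W * W                                   # Hadamard square, kills signs
    M2 = M @ M
    expm = np.eye(n) + M + M2 / 2 + M2 @ M / 6 + M2 @ M2 / 24
    return np.trace(expm) - n

acyclic = np.triu(np.random.rand(4, 4), k=1)    # strictly upper-triangular = DAG
cyclic = acyclic.copy()
cyclic[3, 0] = 0.9                              # close a 0 -> ... -> 3 -> 0 cycle

assert abs(dag_loss(acyclic)) < 1e-9            # nilpotent M: exact zero penalty
assert dag_loss(cyclic) > 0.0                   # cycles contribute trace mass
```

For a strictly triangular graph every power of M has a zero diagonal, so the trace terms vanish exactly; a cycle of length k shows up in trace(Mᵏ).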

Counterfactual Imagination: The Twin Network

CounterfactualImagination answers Rung 3 queries — "what would have happened if I had acted differently?"

  1. Abduction — given what was observed (x), infer the exogenous noise ε via noise_encoder. This noise represents everything about the world state that the action didn't control.
  2. Action replacement — keep ε fixed (same underlying world), replace the action with the counterfactual alternative.
  3. Forward simulation — run dynamics under the new action to predict Y_x.

The delta reward_cf − reward_actual is the causal credit signal — correctly attributing outcome differences to actions, not to coincidental context.
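The three steps reduce to a few lines on a toy linear SCM (a hypothetical model y = w·a + ε, not the repository's learned dynamics):

```python
# Twin-network counterfactual on a one-line linear SCM:
#   y = w * a + eps, where eps is exogenous noise the action didn't control.
w = 2.0

# Observed world: we took action a=1 and saw outcome y=3.5
a_obs, y_obs = 1.0, 3.5

# 1 — Abduction: infer the exogenous noise consistent with the observation
eps = y_obs - w * a_obs                  # 1.5 — the world's "context"

# 2 — Action replacement: same noise, counterfactual action a'=2
a_cf = 2.0

# 3 — Forward simulation under the new action
y_cf = w * a_cf + eps                    # 5.5

# Causal credit: the outcome difference attributable to the action alone
credit = y_cf - y_obs
assert credit == w * (a_cf - a_obs)      # the context eps cancels exactly
```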

The Practical Gap

| System | Query | Method | Risk |
|---|---|---|---|
| Standard transformer (L1) | argmax_a P(recovery \| a, context) | Correlational | Confounded by selection bias |
| G10 Interventional (L2) | argmax_a P(recovery \| do(a)) | Graph surgery | Cuts spurious correlations |
| G10 Counterfactual (L3) | P(Y_x \| X', Y') | Twin network | Hindsight credit, contrastive explanation |

Full Forward Pass

out = model(text=tokens)
| Key | Shape | From | Description |
|---|---|---|---|
| hidden_states | [B, L, D] | G1 | Final representations |
| logits | [B, L, vocab] | G1 | Language model logits |
| thought | [B, D] | G1 | Internal monologue state |
| alignment | [B, L, D] | G1 | Constitutional alignment scores |
| jedi_energy | [B] | G6 | Variational free energy |
| precision | [B] | G6 | Precision-weighted KL |
| tom | dict | G7 | Theory of Mind outputs |
| causal | dict | G7 | Causal graph, confidence, dag_loss |
| grounded_action | dict | G7 | Tool selection, parameters, surprise |
| metacog | dict | G8 | Uncertainty decomposition, action gate |
| debate | dict | G8 | Multi-agent synthesis |
| rsi | dict | G8 | Self-improvement delta |
| gw | dict | G9 | Global Workspace ignition + broadcast |
| prog | dict | G9 | Synthesised program execution |
| irl | dict | G9 | Inverse reward learning |
| lif | dict | G9 | Neuromorphic spike trains |
| causal_world | dict | G10 | Concepts, post_concepts, reward, dag_loss |
| causal_plan | dict | G10 | Causal action, returns, dag_loss |
| counterfactual | dict | G10 | reward_delta, credit |
| attribution | dict | G10 | Write gate + causal salience scores |
| pearl | dict | G10 | Rung classification + per-rung answers |

Training Curriculum

7 phases. 125K steps total. Progressive layer unfreezing prevents catastrophic forgetting as new cognitive capabilities come online.

| Phase | Steps | Layers | Focus |
|---|---|---|---|
| 0 | 10K | 0–33% | G1–G5: attention, SSM, memory, routing |
| 1 | 20K | 33–56% | G6–G7: Free Energy, Theory of Mind, Causal DAG |
| 2 | 20K | 56–72% | G7 continued: skill schemas, alignment |
| 3 | 15K | 72–89% | G8: metacognition, formal verification |
| 4 | 15K | 89–100% | G8–G9: temporal reasoning, meta-learning |
| 5 | 20K | all | G9: MAML outer loop, IRL, LIF |
| 6 | 25K | all | G10: interventional planning, counterfactual, Pearl ladder |

Each auxiliary loss ramps in over 5K steps within its phase to avoid early instability.
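A linear ramp satisfying that description might look like the sketch below. The 5K-step constant comes from the text; the linear shape and the phase-start steps are assumptions:

```python
# Linear loss-weight ramp: each auxiliary loss fades in over 5K steps
# after its phase begins, to avoid early instability.
RAMP_STEPS = 5_000

def loss_weight(step, phase_start, target_weight):
    progress = min(max(step - phase_start, 0) / RAMP_STEPS, 1.0)
    return target_weight * progress

# Example: a loss entering at phase 1 (step 10_000) with target weight 1.0
assert loss_weight(10_000, 10_000, 1.0) == 0.0    # just switched on
assert loss_weight(12_500, 10_000, 1.0) == 0.5    # halfway through the ramp
assert loss_weight(99_999, 10_000, 1.0) == 1.0    # fully ramped
```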

Active Loss Terms

G7  dag_loss              NO-TEARS acyclicity on concept graph
    ewc_loss              Elastic Weight Consolidation
G9  irl_pref_loss         Inverse Reward Learning from preferences
G10 causal_dynamics_dag   NO-TEARS on dynamics graph
    planner_dag           NO-TEARS on planner's causal graph

All five collected automatically by model.compute_auxiliary_losses().


Quick Start

# Clone and activate the included venv
git clone https://github.com/breakingcircuits1337/Claudeson.git
cd Claudeson
source .venv-lite/bin/activate
# Smoke test — CPU, ~5M params, 2 layers
MODEL_GEN=causal_world MODEL_SIZE=demo python entrypoint.py
# Full G10 research training
CLAUDESON_MODEL_GEN=causal_world python train_master.py

# G9 Transcendent
python train_master.py

# CPU smoke test with curriculum trainer
python train_local.py
# Run the test suite
python -m pytest tests/
# Query the inference server
curl -X POST http://localhost:8000/invocations \
  -H "Content-Type: application/json" \
  -d '{"text": "What action should I take?", "session_id": "test"}'

Docker

docker compose up

Model Sizing

| Preset | dim | Layers | Params | Use Case |
|---|---|---|---|---|
| demo | 128 | 2 | ~5M | Container smoke test, CI |
| small | 512 | 8 | ~50M | 2 vCPU / 4 GiB (Foundry sandbox) |
| default | 2048 | 32 | ~7B | Full research training |

Key G10 Parameters

# research/claudson_causal_world.py — ModelArgs
n_causal_nodes         = 64     # Concept graph nodes
causal_state_dim       = 128    # Latent state dim for causal dynamics
intervention_horizon   = 5      # Steps to unroll after do(action)
n_intervention_samples = 8      # Candidate interventions evaluated per step
cf_n_branches          = 4      # Parallel counterfactual branches
attr_top_k             = 8      # Top-k causal nodes kept in working memory
pearl_hidden           = 256    # Pearl ladder classifier hidden dim
pearl_loss_weight      = 0.1    # Penalty for wrong-rung answers during training

Project Structure

claudson/                Core package (G1 Base — AGPL-3.0)
  attention.py           GroupedQueryAttention + RoPE
  ssm.py                 Selective SSM (Mamba-2 style parallel scan)
  memory.py              HierarchicalMemory (working + episodic + semantic)
  moe.py                 Mixture of Experts (top-2 routing)
  layers.py              HybridBlock (4-way router)
  model.py               UniversalIntelligenceModel
  trainer.py             Curriculum-aware training loop
  training_config.py     PhaseConfig, 7-phase curriculum, loss computation
  data.py                MultiModalDataset + collator
  tokenizer.py           tiktoken cl100k_base wrapper

claudson_utils.py        RMSNorm, SwiGLU
claudson_moi.py          PerceptualRouter — multimodal (text/audio/vision/3D)
claudson_jedi.py         G6 — Free Energy Principle, SSD, EFE planner
claudson_grounded.py     G7 — Theory of Mind, CausalReasoner, EWC+LoRA
claudson_sovereign.py    G8 — Metacognition, debate, neural-symbolic, RSI
claudson_transcendent.py G9 — Global Workspace, program synthesis, IRL, LIF
claudson_causal_world.py G10 — re-export from research module

research/
  claudson_causal_world.py  G10 implementation (CausalDynamics, etc.)

entrypoint.py            FastAPI inference server (Azure Foundry compatible)
train_master.py          Multimodal multi-task trainer
train_local.py           CPU smoke test with curriculum Trainer
train_qa.py              Config-driven QA training
config/train.yaml        7-phase curriculum configuration
Dockerfile               Production container
Dockerfile.train         Training container
docker-compose.yml       Service definitions

License

| Generations | License |
|---|---|
| G1–G5 (claudson/ package) | AGPL-3.0 |
| G6–G10 (commercial layers) | Proprietary-Commercial — Breaking Circuits Research 2026 |

Any model trained on the G10 stack derives from both the AGPL core and the commercial causal layers. The AGPL obligation (source disclosure) applies to the G1–G5 component; commercial license terms govern G6–G10.


Breaking Circuits Research · 2026

"The gap between association and causation is not a matter of scale. It is a structural gap that requires explicit causal machinery to close."
