A windowed long-context continuity protocol with verifiable handoff integrity and tested incoherence detectors.
CorticalSwarm is a windowed continuity protocol for transformer inference over long inputs.
Instead of fitting everything into one attention window, it assigns overlapping 256-token windows to BrainStem processors and passes state between them using a hash-validated handoff protocol.
```bash
python verify_swarm.py
```

```
50/50 checks passed — Cortical Swarm verified.
```
The core claim about CorticalSwarm is stronger than "it produces good output":
Every chunk boundary is hash-verified, so corruption of the handed-off state shows up as a hash mismatch (barring SHA256 collisions). Every coherence check returns an explicit pass/fail result with an explanation; no check silently passes without evaluation. All 8 factual contradiction patterns in the test suite are detected.
GPT-2 (124M params) has a hard 1024-token context window.
Without CorticalSwarm, any document exceeding that limit has to be truncated, which silently discards earlier facts.
With CorticalSwarm, the 128-token overlap bridges every chunk boundary, so the model "remembers" what it just said.
```python
# Deterministic check (no generation needed).
# Assumes `tokenizer` is a GPT-2 tokenizer and SEED_TEXT is the demo's seed passage.
# Seed text (61 tokens) establishes ZEPHYR-7.
# Overlap = entire seed (61 < 128), so the entity is guaranteed to survive.
seed_ids = tokenizer.encode(SEED_TEXT)        # 61 tokens, contains ZEPHYR-7
prompt_ids = tokenizer.encode(" Chapter 2:")  # 3 tokens

naive_ctx = prompt_ids                # 3 tokens; ZEPHYR-7: ABSENT
cortical_ctx = seed_ids + prompt_ids  # 64 tokens; ZEPHYR-7: PRESENT
```

| Condition | Context size | ZEPHYR-7 in context | Entity accessible to model |
|---|---|---|---|
| Naive | 3 tokens | NO | Cannot reference it |
| CorticalSwarm | 64 tokens | YES | Can reference it |
Multi-chunk proof:
```bash
# 4 chunks x 250 tokens = 1000 generated tokens; the reported total of 1068
# (which presumably also counts seed and prompt tokens) exceeds the 1024-token hard limit
python examples/long_generation_demo.py --chunks 4 --chunk_tokens 250
```

```
Chunks generated: 4
Tokens per chunk: 250
Total tokens: 1068 (> 1024-token context window)
All handoffs valid: True
```
The model never attends to more than OVERLAP_SIZE + chunk_tokens tokens at once, yet the full document maintains knowledge continuity across all chunks. Small model. Big behavior.
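The bounded-context behavior described above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the demo's code: `fake_generate` is a hypothetical stand-in for a language model, `generate_chunked` is an illustrative name, and `OVERLAP_SIZE = 128` is taken from the overlap figure quoted earlier. The toy totals are not meant to reproduce the demo's numbers.

```python
from typing import Callable, List, Tuple

OVERLAP_SIZE = 128  # assumed from the 128-token overlap described above


def fake_generate(context: List[int], n_new: int) -> List[int]:
    """Hypothetical stand-in for a model: emits n_new dummy token ids."""
    start = (context[-1] + 1) if context else 0
    return list(range(start, start + n_new))


def generate_chunked(
    seed_ids: List[int],
    chunks: int,
    chunk_tokens: int,
    generate: Callable[[List[int], int], List[int]] = fake_generate,
) -> Tuple[List[int], int]:
    """Generate `chunks` chunks, feeding only the last OVERLAP_SIZE tokens forward.

    Returns the full document and the largest number of tokens that was ever
    in view at once (carried context plus newly generated tokens).
    """
    document = list(seed_ids)
    max_in_view = 0
    for _ in range(chunks):
        context = document[-OVERLAP_SIZE:]  # overlap carried across the boundary
        new_tokens = generate(context, chunk_tokens)
        max_in_view = max(max_in_view, len(context) + len(new_tokens))
        document.extend(new_tokens)
    return document, max_in_view


document, max_in_view = generate_chunked(list(range(61)), chunks=4, chunk_tokens=250)
print(len(document), max_in_view)
```

The document grows past 1024 tokens while `max_in_view` stays at or below `OVERLAP_SIZE + chunk_tokens`, which is the property the paragraph above describes.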
This claim rests on three checks, each tested independently:

| Check | What it catches | Test |
|---|---|---|
| `overlap_hash` | Gap in context between chunks | `test_missing_overlap_is_detected_not_silently_dropped` |
| `state_hash` | State corruption in transit | `test_corrupted_state_hash_always_detected` |
| `CoherenceVerifier.verify()` | Contradictions, semantic drift | `test_contradiction_detection_rate_8_patterns_100_percent` |
When incoherence is detected, the safety state automatically lowers the sampling temperature to suppress randomness:

```
confidence=0.95 → caution=0.05 → temperature=~1.0 (normal generation)
confidence=0.40 → caution=0.60 → temperature=~0.65 (suppressed)
must_abstain=True → temperature≤0.45 (maximum suppression)
```

Run `pytest tests/test_anti_gibberish.py -v` to see all 42 cases pass.
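The confidence-to-temperature mapping above can be sketched as a small function. The linear rule and the constants below are assumptions chosen to roughly reproduce the three quoted data points; the project's actual formula is not shown in this document, and `caution_from_confidence` and `temperature_for` are hypothetical names. The sketch also takes caution to be exactly `1 - confidence`, whereas a later snippet in this README only states that as a lower bound.

```python
BASE_TEMPERATURE = 1.0   # assumed normal sampling temperature
CAUTION_SLOPE = 0.58     # assumed: fitted so caution=0.60 gives roughly 0.65
ABSTAIN_CEILING = 0.45   # from the text: must_abstain caps temperature at 0.45


def caution_from_confidence(confidence: float) -> float:
    """Assumed rule: caution is the complement of confidence, clamped to [0, 1]."""
    return min(1.0, max(0.0, 1.0 - confidence))


def temperature_for(confidence: float, must_abstain: bool = False) -> float:
    """Lower the temperature linearly as caution rises; cap it when abstaining."""
    caution = caution_from_confidence(confidence)
    temperature = BASE_TEMPERATURE * (1.0 - CAUTION_SLOPE * caution)
    if must_abstain:
        temperature = min(temperature, ABSTAIN_CEILING)
    return temperature


for confidence in (0.95, 0.40):
    print(f"confidence={confidence:.2f} -> temperature={temperature_for(confidence):.2f}")
print(f"must_abstain -> temperature={temperature_for(0.95, must_abstain=True):.2f}")
```

A linear rule is only one way to fit the quoted points; any monotonically decreasing function of caution with the same cap would show the same qualitative behavior.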
Transformers have fixed context windows. When a task exceeds the window:
| Approach | Problem |
|---|---|
| Truncation | Silently loses early context |
| Sliding window | No explicit state continuity |
| Long-context fine-tuning | Expensive, degrades on shorter inputs |
| RAG | Retrieval quality, no guaranteed continuity |
```
Token stream (unbounded):
[0..255]      [128..383]    [256..511]    [384..639]   ...
    |             |             |             |
BrainStem-0   BrainStem-1   BrainStem-2   BrainStem-3
    |             |             |             |
CompressedState -> handoff -> CompressedState -> handoff -> ...
```
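The window layout in the diagram above follows from a 256-token window advanced by a 128-token stride. The sketch below computes those ranges; `window_ranges` is a hypothetical helper, not part of the package, and shortening the final window at the end of the stream is an assumption about tail handling.

```python
from typing import Iterator, Tuple

WINDOW_SIZE = 256   # tokens per BrainStem, from the text
OVERLAP_SIZE = 128  # tokens shared by adjacent windows, from the text


def window_ranges(n_tokens: int, window: int = WINDOW_SIZE,
                  overlap: int = OVERLAP_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) token indices for each overlapping window."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    stride = window - overlap
    start = 0
    while start < n_tokens:
        last = min(start + window, n_tokens) - 1  # final window may be shorter
        yield start, last
        if last == n_tokens - 1:
            break
        start += stride


for stem_id, (first, last) in enumerate(window_ranges(640)):
    print(f"BrainStem-{stem_id}: [{first}..{last}]")
# BrainStem-0: [0..255]
# BrainStem-1: [128..383]
# BrainStem-2: [256..511]
# BrainStem-3: [384..639]
```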
Key properties:
- Each `BrainStem` processes exactly 256 tokens (fits in any model's context)
- 128-token overlap between adjacent stems ensures continuity
- `StateCompressor` distills each window into semantic `Claim` objects (5-10x compression)
- `BrainStemHandoffProtocol` uses SHA256 hashes to guarantee state integrity across boundaries
- `CoherenceVerifier` detects contradictions, semantic drift, and temporal inconsistencies
```
[Token Stream]
      |
      v
[ContentRouter] ---- assigns tokens to BrainStem slots
      |
      v
[BrainStem x N] ---- each processes 256 tokens with overlap
      |
      v
[StateCompressor] -- distills to semantic claims + embeddings (CompressedState)
      |
      v
[BrainStemHandoffProtocol.create_packet()]
      |
      v
[BrainStemTransferPacket] -- SHA256-validated, JSON-serializable
      - overlap_hash: verifies token continuity
      - state_hash: verifies semantic state integrity
      - safety_state: caution level, must_abstain flag
      |
      v
[CoherenceVerifier] -- checks for contradictions, drift, temporal issues
      |
      v
[Next BrainStem]
```
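To make the "JSON-serializable" packet in the diagram above concrete, here is a minimal sketch of a packet-like dataclass with a dictionary round trip. `PacketSketch` and `SafetySketch` are hypothetical stand-ins whose fields follow the diagram; they are not the package's `BrainStemTransferPacket`, and the hash values are placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class SafetySketch:
    """Hypothetical stand-in for the packet's safety_state."""
    caution_level: float
    must_abstain: bool


@dataclass
class PacketSketch:
    """Hypothetical stand-in for a transfer packet; fields follow the diagram."""
    source_stem_id: int
    chunk_id: int
    token_range: Tuple[int, int]
    overlap_token_ids: List[int]
    overlap_hash: str
    state_hash: str
    safety_state: SafetySketch

    def to_dict(self) -> Dict[str, Any]:
        data = asdict(self)  # recurses into the nested dataclass
        data["token_range"] = list(self.token_range)  # JSON has no tuple type
        return data

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "PacketSketch":
        fields = dict(data)
        fields["token_range"] = tuple(fields["token_range"])
        fields["safety_state"] = SafetySketch(**fields["safety_state"])
        return cls(**fields)


packet = PacketSketch(
    source_stem_id=0, chunk_id=0, token_range=(0, 256),
    overlap_token_ids=[230, 231, 232],
    overlap_hash="placeholder-overlap-hash",
    state_hash="placeholder-state-hash",
    safety_state=SafetySketch(caution_level=0.05, must_abstain=False),
)
wire = json.dumps(packet.to_dict())  # e.g. sent to another process
restored = PacketSketch.from_dict(json.loads(wire))
assert restored == packet
```

Keeping the packet to plain integers, strings, and lists is what lets it cross a process boundary as JSON without custom encoders.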
```bash
pip install .
```

```python
from cortical_swarm.handoff_protocol import (
    BrainStemHandoffProtocol,
    BrainStemTransferPacket,
    HANDOFF_PROTOCOL_VERSION,
)
from cortical_swarm.state_compressor import CompressedState
import torch

# Build a compressed state (from any embedding source)
state = CompressedState(
    stem_id=0, chunk_id=0, claims=[],
    embeddings=torch.zeros(4, 128),
    token_range=(0, 256), confidence=0.95, metadata={},
)

# Create a handoff packet
packet = BrainStemHandoffProtocol.create_packet(
    source_stem_id=0,
    chunk_id=0,
    token_range=(0, 256),
    overlap_token_ids=[230, 231, 232, 233, 234],  # tokens shared with next stem
    state=state,
)

# Validate before consuming
result = BrainStemHandoffProtocol.validate_packet(packet, expected_state=state)
assert result.valid, result.errors
print(f"Packet valid: {result.valid}")
print(f"Caution level: {packet.safety_state.caution_level:.2f}")

# Serialize / deserialize (e.g., across processes)
d = packet.to_dict()
packet2 = BrainStemTransferPacket.from_dict(d)
assert packet2.overlap_hash == packet.overlap_hash
```

```python
from cortical_swarm.coherence_verifier import CoherenceVerifier
from cortical_swarm.state_compressor import CompressedState
import torch

verifier = CoherenceVerifier(d_state=128)

# Simulate three chunks of processing
states = [
    CompressedState(stem_id=0, chunk_id=i, claims=[], confidence=0.9,
                    embeddings=torch.randn(4, 128),
                    token_range=(i*256, (i+1)*256), metadata={})
    for i in range(3)
]
is_coherent, issues = verifier.verify(states[2], previous_states=states[:2])
print(f"Coherent: {is_coherent}, Issues: {len(issues)}")
```

```bash
pip install pytest torch
pytest tests/ -v
```

```
152 passed in 25.56s
```
| File | Tests | Coverage |
|---|---|---|
| `test_handoff_protocol.py` | 60 | Protocol, hashes, create/validate, serialization, tamper detection |
| `test_coherence_verifier.py` | 26 | Construction, first-chunk, contradictions, strict/non-strict |
| `test_anti_gibberish.py` | 42 | Overlap continuity, 8-pattern incoherence detection, safety feedback |
| `test_long_generation.py` | 24 | Real GPT-2 LLM integration: context-window proof, entity survival, multi-chunk |
```bash
python verify_swarm.py
```

Runs 50 checks. Requires only torch (already a dependency). No GPU, no weights, no API keys.

Verifies:
- Protocol version constant
- `CompressedState` data model
- `HandoffSafetyState` construction
- SHA256 hash helpers (determinism, sensitivity)
- `overlap_hash()`: order-sensitive, deterministic
- `state_hash()`: deterministic per stem/chunk
- `create_packet()`: all field correctness
- Safety state derivation (caution, must_abstain, critical issues)
- `validate_packet()`: valid, version error, hash tamper, negative token ids
- Serialization round-trip
- `CoherenceVerifier`: construction, first chunk always coherent
- Throughput > 1k packets/s
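The determinism and order-sensitivity properties in the list above can be illustrated with the standard library alone. This is a sketch under assumptions: the JSON encoding used as hash input is a guess at a reasonable canonical form, and `overlap_hash_sketch` and `overlap_matches` are hypothetical helpers, not the package's `overlap_hash()`.

```python
import hashlib
import json
from typing import List


def overlap_hash_sketch(token_ids: List[int]) -> str:
    """SHA256 over a compact JSON encoding of the token ids (assumed format)."""
    payload = json.dumps(token_ids, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def overlap_matches(token_ids: List[int], expected_hash: str) -> bool:
    """Recompute the hash on the receiving side and compare."""
    return overlap_hash_sketch(token_ids) == expected_hash


tokens = [230, 231, 232, 233, 234]
digest = overlap_hash_sketch(tokens)

assert overlap_matches(tokens, digest)                      # deterministic
assert not overlap_matches(list(reversed(tokens)), digest)  # order-sensitive
assert not overlap_matches(tokens[:-1] + [999], digest)     # tampering is detected
print(len(digest))  # hex digest length
```

Because the list is serialized in order, reordering or altering any token changes the digest, which is how a receiving stem could notice a gap or mutation in the overlap.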
```python
# packet.overlap_hash == SHA256(sorted_json(overlap_token_ids))
# Any mutation of overlap tokens → hash mismatch → validation fails
assert packet.overlap_hash == BrainStemHandoffProtocol.overlap_hash(packet.overlap_token_ids)
```

```python
# packet.state_hash is computed from claims + embeddings + stem_id + chunk_id
# Mutating any of these → hash mismatch
result = BrainStemHandoffProtocol.validate_packet(packet, expected_state=original_state)
assert result.valid
```

```python
# state.confidence = 0.3 → uncertainty = 0.7 → caution_level >= 0.7
# Critical coherence issues + coherence_passed=False → must_abstain = True
```

| Module | Description |
|---|---|
| `handoff_protocol.py` | SHA256-validated stem-to-stem transfer. No model required. |
| `state_compressor.py` | Compresses 256-token windows to CompressedState (claims + embeddings). |
| `coherence_verifier.py` | Detects contradictions, semantic drift, temporal inconsistencies. |
| `content_router.py` | Dynamically assigns token chunks to specialized stems. |
| `attention_bridge.py` | Cross-stem attention for global context propagation. |
| `brain_stem.py` | Individual processing unit (requires LayerCakeLMFixedABI). |
| `cortical_swarm.py` | Main coordinator for all stems. |
- Cortana: Safety-gating prototype for companion-style agents. CorticalSwarm provides the windowed long-context layer for Cortana's memory.
- MoA: Mixture-of-agents framework. Provides the `Claim`/`ModalityIR` types used by `StateCompressor`.
MIT