GitHub - BEKO2210/Javis: An associative Spiking Neural Network (SNN) memory co-processor for LLM agents, built in Rust. Javis leverages biologically inspired pattern completion as an efficient alternative to classic RAG. (Research-only)

An associative SNN memory co-processor for LLM agents, built in Rust.

A spiking neural network that stores knowledge as emergent cell assemblies and retrieves it through pattern completion. Sits between your retrieval layer and your LLM, returning a few decoded concepts instead of full document chunks.

⚠ Research-only license. Javis is licensed under PolyForm Noncommercial 1.0.0. It is not for production deployment, commercial integration, or any operational system that affects real people, real decisions, real services, or real infrastructure. See the research-use addendum for the licensor's intent statement.

Why Javis — the pitch in 30 seconds.
Architecture diagram — animated R1 → DG → R2 walk-through.
Performance profile — what the network actually does today.
Architecture (text) — the iter-44 plasticity stack and decoder.
Quick start — clone → build → recall.
Live 3D brain — the in-browser visualisation.
Plasticity — STDP / iSTDP / BCM / R-STDP / homeostasis.
Token efficiency — Javis vs. naive RAG.
Production readiness — tracing, metrics, container, MSRV, deny.
Project structure — the workspace layout.
Tests — what runs in CI and what the gates are.
Iterations — every hypothesis, pre-fixed criterion, and verdict from iter-00 to today.
References — papers behind the plasticity choices.
License — research-use only.

Why Javis

Modern LLM pipelines spend most of their token budget on retrieval context. Naive RAG ships an entire chunk to the model on every query — even when only a single fact inside that chunk matters.

Javis flips the architecture: knowledge is stored as emergent cell assemblies in a spiking neural network. A query is a partial cue; pattern completion inside the network reactivates the relevant assembly; only the few decoded concepts go to the LLM.

Naive RAG:   "Rust is a systems programming language focused on memory
              safety and ownership; the borrow checker prevents data races
              at compile time."                                       63 tokens
Javis:       "rust"                                                    2 tokens

That gap is the whole pitch. Whether it holds at scale — and where it stops holding — is reported below in plain numbers, not slogans.

Architecture diagram

The iter-60+ DG bridge architecture, with spike pulses propagating from R1 (input cortex, k=20 SDR) through the DG hash (4 000 cells, k-of-n sparse code) and the direct perforant path (configurable scale, iter-64 sweet-spot at 0.3) into R2 (memory cortex, 2 000 LIF, 70 % E + 30 % I, fully recurrent with STDP / iSTDP / BCM / homeostasis / intrinsic-θ / R-STDP / heterosynaptic / structural plasticity). A 3-second loop:

Animated architecture: R1 → DG (mossy fibres) + direct perforant path → R2, with spike pulses propagating along axons

The blue paths (R1 → DG → R2 mossy fibres) are always on at full weight; the dashed grey path (direct R1 → R2 perforant) is the iter-64 mechanism diagnosis axis — at scale 0.0 (DG-only, iter-63 baseline) the brain separates cues but does not learn cue → target on the decoder; at scale 0.3 it does (iter-64 axis C verdict, see Iterations).

Performance profile

Measured on a deterministic 100-sentence / 286-vocabulary benchmark (cargo run --release -p eval --example scale_benchmark -- --sentences 100). Reproducible from a single --seed; no external dataset, no network.

What survives

Property	Value
Self-recall (query concept always retrievable)	100 %
Token reduction vs naïve-RAG baseline	35 – 45 %
Decoder latency at vocab ≤ 300	sub-millisecond
Self-recall test suite	113 / 113 passing

The first row is the architectural claim that Javis stands behind: train a concept once, recall it deterministically. The second row is the headline number — modest but real, on a non-toy corpus. The third row makes Javis practical as a co-processor in front of an LLM.

Known limits (iter ≤24 baseline)

These are the failure modes a senior reviewer would find on day one. Better to publish them than have someone tweet them.

Limit	Measured value	Mechanism
Associative recall	≈ 2 %	Of all words that genuinely co-occur with the query in the corpus, only about 2 % are decoded. Javis returns the query plus 5 noise words, not the expected 5–10 related concepts.
Cross-domain bleed	4.7 / 6 decoded words	At N > 50 distinct concepts the R2 layer (2 000 neurons, K=220, 11 % sparsity) saturates; iSTDP can no longer build separating walls between engrams, so unrelated domains leak into each other.
Engram capacity	≈ 50 concepts	Geometric upper bound from R2_size / KWTA_K = 2 000 / 220 ≈ 9 fully-orthogonal engrams; with overlap-tolerance about 50 work cleanly before interference dominates.
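The capacity bound in the last row is plain integer arithmetic. A minimal sketch follows; the helper names `orthogonal_engram_bound` and `sparsity` are illustrative assumptions, not functions from the Javis crates:

```rust
/// Upper bound on fully orthogonal (non-overlapping) engrams:
/// each engram occupies `kwta_k` cells out of `r2_size`.
fn orthogonal_engram_bound(r2_size: usize, kwta_k: usize) -> usize {
    assert!(kwta_k > 0, "kwta_k must be non-zero");
    r2_size / kwta_k
}

/// Fraction of the layer that is active per engram (k / n).
fn sparsity(r2_size: usize, kwta_k: usize) -> f64 {
    assert!(r2_size > 0, "r2_size must be non-zero");
    kwta_k as f64 / r2_size as f64
}
```

With the figures from the table, `orthogonal_engram_bound(2_000, 220)` evaluates to 9 and `sparsity(2_000, 220)` to 0.11, matching the 11 % sparsity quoted for the cross-domain bleed row.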

What changes in iter 43 (this branch)

R2 was scaled from 2 000 → 10 000 neurons, recurrent connectivity was sparsened from p=0.10 → p=0.03, KWTA was reduced from 220 → 100 (1 % sparsity), and iSTDP was retuned for aggressive LTD on co-active E-targets. The 113 existing tests still pass at the new topology. Updated cross-bleed and recall numbers will be recorded in notes/43-topology-scaling.md once the benchmark run completes.
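The KWTA change is easier to picture with the selection step written out. This is a generic k-winners-take-all sketch, not the routine in `crates/snn-core`; the function name `kwta` and the tie-breaking rule (lower index wins) are assumptions made for the example:

```rust
/// Return the indices of the `k` largest activations, sorted by index.
/// If `k` exceeds the number of cells, every index is returned.
fn kwta(activations: &[f32], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..activations.len()).collect();
    // Sort by activation, descending; ties go to the lower index so the
    // result is deterministic.
    idx.sort_by(|&a, &b| {
        activations[b]
            .total_cmp(&activations[a])
            .then(a.cmp(&b))
    });
    idx.truncate(k);
    // Report winners in index order, which is convenient for set overlap.
    idx.sort_unstable();
    idx
}
```

Lowering `k` relative to the layer size is what produces the sparser code: at 10 000 cells and `k = 100`, only 1 % of R2 can be active in any fingerprint.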

What changes in iter 47a (this branch)

The iter-46 negative-margin diagnosis identified the R1 → R2 forward drive as the dominant factor in the cue's R2 response. Iter-47a tests the literature-grounded fix (Brunel scaling + Diehl-Cook adaptive threshold) through a sequential 4-epoch sweep with pre-fixed acceptance criteria. Result, on the same 16 + 16 pair / vocab-32 corpus, seed 42:

INTER_WEIGHT	r2_act mean	tgt_hit	selectivity	margin
0.5	0.8	0.00	-0.0005	-0.01
1.0	139	2.59	-0.0005	-0.02
0.7	507 (cascade)	9.38	-0.0047	-0.02

iter-47a-2 alone does not flip the margin sign. But the diagnosis is sharper than iter-46's: at INTER_WEIGHT = 1.0, target_hit_mean grew monotonically over 4 epochs (1.16 → 2.59) and selectivity_index rose from -0.022 to -0.0005 — the right direction. The 0.7 bistability (recurrent cascade in epoch 3) is the key second-order finding: hard sparsity control (k-WTA, iter-48 entry) is necessary, not optional. The iter-47 metrics (r2_active_pre_teacher_{mean,p10,p90}, selectivity_index) are wired to A/B-test it cleanly. See notes/47a.

What changes in iter 46 (this branch)

The pair-association harness from iter-45 grows a teacher-forcing training arm: a deterministic per-word canonical R2-E SDR (canonical_target_r2_sdr) and a drive_with_r2_clamp primitive that injects target spikes directly into R2, bypassing the random R1 → R2 forward path. It also adds a six-phase trial schedule (cue → delay → prediction → teacher → reward → tail) with plasticity gating around the prediction window so evaluation never contaminates training, an anti-causal STDP timing fix (cue lead-in before the clamp), and a --association-training-gate-r1r2 flag to attenuate forward drive during the prediction phase only.

Honest result on the same 16-pair + 16-noise corpus, seed 42: target_clamp_hit_rate = 1.00 across every teacher epoch (the clamp itself works perfectly), but correct_minus_incorrect_margin stays in [-0.06, -0.03] — the canonical-target cells fire less than the rest under cue-only recall, even with the timing fix and the R1 → R2 gate. The first non-zero prediction_top3_before_teacher = 0.02 appears at epoch 3 with homeostasis on, but does not stabilise above the 9.4 % chance floor in any run. The bottleneck has moved from iter-45's "we can't measure it" to iter-46's "we can measure it; here is the number". See notes/46 for the full chain of measurements and the next-iter (47) directions (reduce INTER_WEIGHT, add an association-bridge region, or make R1 → R2 itself learnable).
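The six-phase schedule can be pictured as a small state enum. The sketch below is an illustration under stated assumptions, not the harness code: it assumes plasticity is disabled only in the prediction phase, that the clamp is applied only in the teacher phase, and that the forward-drive gate is a single multiplier (`gate` is a hypothetical parameter):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Cue,
    Delay,
    Prediction,
    Teacher,
    Reward,
    Tail,
}

impl Phase {
    /// Trial order as listed in the text.
    const ORDER: [Phase; 6] = [
        Phase::Cue,
        Phase::Delay,
        Phase::Prediction,
        Phase::Teacher,
        Phase::Reward,
        Phase::Tail,
    ];

    /// Assumption: learning is switched off only while the prediction is
    /// read out, so evaluation cannot change the weights.
    fn plasticity_enabled(self) -> bool {
        self != Phase::Prediction
    }

    /// Assumption: target spikes are clamped into R2 only in the teacher phase.
    fn r2_clamp_active(self) -> bool {
        self == Phase::Teacher
    }

    /// R1 → R2 forward gain. `gate` (hypothetical, 0.0..=1.0) attenuates
    /// the forward drive during the prediction phase only.
    fn forward_gain(self, gate: f32) -> f32 {
        if self == Phase::Prediction {
            gate
        } else {
            1.0
        }
    }
}
```

Keeping the gating rules on the enum makes it hard for a trial loop to forget one of them when iterating over `Phase::ORDER`.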

What changes in iter 45 (this branch)

A reward-aware pair-association benchmark (cargo run --release -p eval --example reward_benchmark) that finally lets the iter-44 R-STDP / dopamine machinery be exercised: 16 (cue, target) pairs with 16 distractor pairs, staggered cue → target training, per-trial reward delivery, per-epoch top-1 / top-3 readout. Pure STDP is run as the baseline arm.

The honest reading: neither arm reaches above-chance accuracy in the available training time. R-STDP shows a small advantage on noise suppression (mean noise-top-3 0.10 vs pure STDP 0.16) but the architecture's R1 → R2 forward path dominates the cue's R2 representation, leaving STDP too little room to grow strong recurrent associations. The infrastructure is in place; the next experiment (teacher-forcing the target SDR into R2 during training) is documented in notes/45.
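For readers unfamiliar with three-factor learning, the core of a reward-modulated rule fits in a few lines. This is a textbook-style sketch, not the `reward.rs` implementation; the struct names, the exponential trace, and all parameter values are assumptions:

```rust
/// One synapse under a three-factor (reward-modulated) rule.
struct Synapse {
    weight: f32,
    /// Decaying memory of recent pre/post coincidences.
    eligibility: f32,
}

/// Hypothetical parameter set for the sketch.
struct RStdpParams {
    tau_e_ms: f32,
    learning_rate: f32,
    w_min: f32,
    w_max: f32,
}

impl Synapse {
    /// A pair-STDP event does not touch the weight; it only tags the synapse.
    fn tag(&mut self, stdp_delta: f32) {
        self.eligibility += stdp_delta;
    }

    /// Advance by `dt_ms`. The weight moves only when `reward` is non-zero,
    /// in proportion to whatever eligibility is left at that moment.
    fn step(&mut self, dt_ms: f32, reward: f32, p: &RStdpParams) {
        let dw = p.learning_rate * reward * self.eligibility;
        self.weight = (self.weight + dw).clamp(p.w_min, p.w_max);
        self.eligibility *= (-dt_ms / p.tau_e_ms).exp();
    }
}
```

The trace is what lets a reward that arrives after the spikes still credit the synapses that caused them, which is the temporal gap pure pair-STDP cannot bridge.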

What changes in iter 44.1 (this branch)

A decoder confidence floor via --decode-threshold (default 0.0 = pre-iter-44 behaviour, recommended 0.2 for the 32-sentence corpus). The original decode_top always returned k results even when the highest scoring engram sat right at the random-overlap baseline (KWTA_K / R2_E = 12.5 %). The floor omits low-confidence matches instead of filling the slot with garbage.

Measured on the same 32-sentence corpus, seed 42, --iter44 off:

`--decode-threshold`	FP / Q	Token reduction	Self-recall
`0.0` (pre-iter-44)	4.50	38.9 %	100 %
`0.20`	0.62	79.7 %	100 %
`0.30`	0.00	84.7 %	100 %

That is FP − 86 % and token reduction × 2.0 with no plasticity change at all; the SNN's engrams were already orthogonal, the decoder just refused to admit it.
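A confidence floor of this kind can be sketched as a filter applied before the top-k cut. The code below is an illustration only: `decode_with_floor` is a hypothetical name, and scoring an engram by the fraction of its fingerprint found in the active set is an assumption about how overlap is measured:

```rust
use std::collections::HashSet;

/// Score every engram by the fraction of its fingerprint that is active,
/// drop scores below `threshold`, and return at most `k` best matches.
fn decode_with_floor<'a>(
    active: &HashSet<usize>,
    dictionary: &'a [(String, HashSet<usize>)],
    k: usize,
    threshold: f32,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = dictionary
        .iter()
        .filter(|(_, fingerprint)| !fingerprint.is_empty())
        .map(|(word, fingerprint)| {
            let overlap = fingerprint.intersection(active).count() as f32;
            (word.as_str(), overlap / fingerprint.len() as f32)
        })
        // With threshold = 0.0 nothing is filtered, so k slots are always filled.
        .filter(|&(_, score)| score >= threshold)
        .collect();
    // Best score first; ties broken alphabetically for determinism.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(b.0)));
    scored.truncate(k);
    scored
}
```

The design point is that the floor shortens the result list instead of padding it: a query with one strong match returns one concept, not `k`.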

What changes in iter 44 (this branch)

Seven new biology-grade plasticity mechanisms join the existing LIF / STDP / iSTDP / homeostasis / BTSP stack, all opt-in and default-off so every pre-iter-44 test stays bit-identical.

Honest benchmark result: on the deterministic 32-sentence corpus (seed 42), the new mechanisms do not improve recall over the iter-43 baseline out of the box — off 4.4 %, stability 4.4 %, tuned 2.7 %, full 1.6 %. Heterosynaptic / BCM scale weights uniformly per post and don't change the kWTA fingerprint; reward-modulated STDP and replay both need longer training windows or a reward signal the current eval harness does not provide. The stack is infrastructure for the next benchmark — multi-epoch streaming corpora and reward-aware retrieval — see notes/44 for the full reading. The mechanisms themselves:

Triplet STDP (Pfister-Gerstner 2006) — frequency-dependent LTP.
Reward-modulated STDP with eligibility traces — three-factor learning, gated by Brain::set_neuromodulator(...) (the dopamine surrogate). Closes the temporal-credit-assignment loop that pure pair-STDP cannot solve.
BCM metaplasticity — sliding LTP/LTD threshold per post-neuron; stops the runaway-LTP failure mode under sustained drive.
Intrinsic plasticity — adaptive per-neuron threshold; every cell drifts towards its target rate, no dead or saturated neurons.
Heterosynaptic L2 normalisation — the direct fix for the R2 saturation problem in notes/43. Hard-bounds each post-neuron's incoming-weight budget.
Structural plasticity — sprout new edges between repeatedly co-active E cells, prune persistently-dormant ones. Engram capacity stops being a hard topology constant.
Offline replay / consolidation — Brain::consolidate(...) drives the top-k engram cells in pulses with full plasticity on, the way slow-wave-sleep replay deepens hippocampal engrams.
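As one concrete example from the list above, a per-post L2 weight budget can be enforced with a single rescale. This is a sketch of the general technique; the function name and the choice to rescale only when the budget is exceeded are assumptions, not the `heterosynaptic.rs` code:

```rust
/// Rescale one post-neuron's incoming weights so that their L2 norm
/// does not exceed `budget`. Weights under budget are left untouched.
fn normalise_incoming_l2(incoming: &mut [f32], budget: f32) {
    let norm = incoming.iter().map(|w| w * w).sum::<f32>().sqrt();
    if norm > budget && norm > 0.0 {
        let scale = budget / norm;
        for w in incoming.iter_mut() {
            *w *= scale;
        }
    }
}
```

Because every incoming weight is multiplied by the same factor, the ranking of inputs to that neuron is preserved, which is consistent with the benchmark observation that uniform per-post scaling does not change the kWTA fingerprint.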

Switch the whole stack on in the live viz with JAVIS_ITER44=1 cargo run -p viz --release.

The full architectural rationale, composition into the existing pipeline, and 15 new tests are documented in notes/44-breakthrough-plasticity.md.

Reproducibility

# Train + evaluate on 100 sentences. ~5 min wall on R2=10 000.
cargo run --release -p eval --example scale_benchmark \
    -- --sentences 100 --queries 30 --decode-k 6 --seed 42

# Smaller smoke run for CI / quick checks (~30 s):
cargo run --release -p eval --example scale_benchmark -- --sentences 32

The benchmark prints a Markdown summary table; redirect stdout to capture it verbatim into a release note.

Architecture

The full pipeline runs end-to-end: text in, encoded into a Sparse Distributed Representation, injected into R1 (input cortex), routed via address-event spikes into R2 (memory cortex) where STDP, iSTDP and homeostasis shape an engram, then read out by kWTA and an engram dictionary back into a list of text concepts.

Every box on the diagram corresponds to a real Rust module:

Stage	Module
`Text → SDR`	`crates/encoders`
`R1 / R2 / AER`	`crates/snn-core`
`Plasticity`	`crates/snn-core` (`stdp`, `istdp`, `homeostasis`)
`Decode`	`crates/encoders/src/decode.rs`
`Eval / RAG`	`crates/eval`
`LLM (Anthropic)`	`crates/llm`
`Live UI`	`crates/viz`
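The first stage, Text → SDR, can be illustrated with a hashed k-of-n code. This is a minimal sketch assuming a salted `DefaultHasher` and a 1 000-bit space with k = 20; the function `encode_word` and the salting scheme are illustrative and may differ from `crates/encoders`:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

/// Map a word to `k` distinct active bits out of `n`.
/// The same word always yields the same bit set.
fn encode_word(word: &str, n: usize, k: usize) -> BTreeSet<usize> {
    assert!(k <= n, "cannot pick more active bits than the space holds");
    let mut bits = BTreeSet::new();
    let mut salt: u64 = 0;
    // Re-hash with an increasing salt until k distinct bits are collected.
    while bits.len() < k {
        let mut hasher = DefaultHasher::new();
        word.hash(&mut hasher);
        salt.hash(&mut hasher);
        bits.insert((hasher.finish() % n as u64) as usize);
        salt += 1;
    }
    bits
}

/// Number of shared active bits between two SDRs.
fn overlap(a: &BTreeSet<usize>, b: &BTreeSet<usize>) -> usize {
    a.intersection(b).count()
}
```

A sentence can then be represented as the union of its word SDRs, and `overlap` gives the similarity measure that the decode stage relies on.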

Quick start

# build everything
cargo build --release

# run the full test suite (all tests should pass)
cargo test --release

# minimal 30-line demo printing RAG vs Javis token saving
cargo run --release -p eval --example hello_javis

# fire up the live 3D brain in a browser
cargo run -p viz --release --bin javis-viz
# → http://127.0.0.1:7777

Optional persistent brain:

cargo run -p viz --release -- --snapshot brain.json
# trains the bootstrap corpus, persists on Ctrl-C, reloads on next start

Optional real Claude API calls (otherwise the LLM adapter runs in mock mode):

ANTHROPIC_API_KEY=sk-ant-... cargo run -p viz --release
# the "send both to Claude" button now fires real calls

Run with Docker

A multi-stage Dockerfile plus a docker-compose.yml brings up the full observability stack — Javis, Prometheus, and Grafana — in one command:

docker compose up --build

URL	What
http://localhost:7777	Javis 3D brain (WebSocket + frontend)
http://localhost:7777/metrics	Prometheus exposition
http://localhost:9090	Prometheus UI (already scraping Javis)
http://localhost:3000	Grafana, Javis dashboard pre-provisioned

The brain state lives on a named volume (javis-data:/app/data), so docker compose restart saves a brain.snapshot.json on shutdown and reloads it on startup — no retraining needed.

The Grafana instance runs anonymous-admin and the Prometheus datasource is auto-wired — meant for local-dev only, see docker-compose.yml for the relevant GF_AUTH_* flags before exposing it anywhere.

Live 3D brain

Open http://127.0.0.1:7777 and you get a Three.js / 3d-force-graph view of the live brain:

Two anatomical lobes — R1 input cortex (blue) and R2 memory cortex (yellow) with embedded inhibitory cells (pink)
Spike pulses light each neuron as it fires, fading back over ~220 ms
A side panel streams phase, live spike rates, decoded concepts, the token saving headline and the actual RAG-vs-Javis payloads
Two text inputs let you live-train sentences and live-query the brain
A "send both to Claude" button fires both payloads to the Anthropic API in parallel and shows the answers + real input/output token counts

Plasticity

Javis composes the thirteen biologically motivated mechanisms listed below, each opt-in:

Mechanism	Purpose	Reference
LIF dynamics	leaky integrate-and-fire neurons with refractory period	classical
Pair STDP (E)	Hebbian potentiation between excitatory neurons	Bi & Poo 1998
iSTDP	heterosynaptic plasticity at I→E, gives engram selectivity	Vogels et al. 2011
Asymmetric homeostasis	scale-only-down multiplicative renormalisation	Turrigiano 2008
BTSP soft bounds	`Δw = a · trace · (w_max − w)` instead of hard clamp	Bittner 2017 / Milstein 2024
Contextual engrams	fingerprints captured during co-activity, not post-hoc	Tonegawa engram-cell line
Triplet STDP (iter-44)	frequency-dependent LTP via slow `r2` / `o2` traces	Pfister & Gerstner 2006
Reward-modulated STDP (iter-44)	three-factor learning, dopamine-gated eligibility tag	Frémaux & Gerstner 2016; Izhikevich 2007
Metaplasticity (BCM) (iter-44)	sliding LTP/LTD threshold per post-neuron	BCM 1982; Cooper & Bear 2012
Intrinsic plasticity (SFA) (iter-44)	adaptive per-neuron threshold	Desai 1999; Chrol-Cannon 2014
Heterosynaptic L1/L2 norm (iter-44)	per-post incoming-weight budget	Royer & Paré 2003; Field 2020
Structural plasticity (iter-44)	sprout + prune to grow/shrink topology	Yang 2009; Holtmaat & Svoboda 2009
Offline replay / consolidation (iter-44)	drives top-k engram cells with plasticity on	Buzsáki 2015; Wilson & McNaughton 1994

The math behind each lives in crates/snn-core/src/{stdp,istdp,homeostasis,metaplasticity,intrinsic,heterosynaptic,structural,reward,replay}.rs, the trade-offs are documented in notes/, and the full iter-44 rationale is in notes/44-breakthrough-plasticity.md.
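The soft-bound rule from the table, `Δw = a · trace · (w_max − w)`, is short enough to show directly. The function below is a sketch with a hypothetical name; it assumes `a · trace` stays within [0, 1], which keeps the weight from overshooting `w_max`:

```rust
/// Soft-bounded potentiation: the step shrinks as `w` approaches `w_max`,
/// so no hard clamp is needed while `a * trace` stays in [0, 1].
fn btsp_soft_update(w: f32, a: f32, trace: f32, w_max: f32) -> f32 {
    w + a * trace * (w_max - w)
}
```

Applied repeatedly with `a = 0.5`, `trace = 1.0`, and `w_max = 1.0`, a weight starting at 0.0 moves to 0.5, then 0.75, then 0.875: each update closes half of the remaining gap.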

Token efficiency — the small-corpus picture

Two integration tests measure Javis against a naïve RAG baseline on small, hand-curated corpora. The numbers here are favourable to Javis (each query returns a single decoded concept, full RAG returns the whole paragraph) and are the floor of the architecture's reach, not its ceiling:

Corpus	Mean RAG	Mean Javis	Mean reduction
3 paragraphs about programming languages	27 tok	2.3 tok	91.3 %
5 Wikipedia-shaped paragraphs (geology, transport, biology, …)	60 tok	2.0 tok	96.6 %

These are the ideal-conditions numbers. For the benchmark that includes every realistic failure mode — cross-bleed, missed co-occurrences, decoder saturation — read Performance profile above.

cargo test -p eval --release token_efficiency  -- --nocapture
cargo test -p eval --release wiki_benchmark    -- --nocapture
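When reading a "mean reduction" column it helps to know that averaging per-query reductions and comparing the two column means are different calculations. The sketch below shows both; the function names are illustrative, and which variant the integration tests report is not stated here:

```rust
/// Fractional token saving for one query. `rag_tokens` must be non-zero.
fn reduction(rag_tokens: usize, javis_tokens: usize) -> f64 {
    assert!(rag_tokens > 0, "RAG payload must contain at least one token");
    1.0 - javis_tokens as f64 / rag_tokens as f64
}

/// Average of the per-query reductions; each query counts equally.
fn mean_of_reductions(pairs: &[(usize, usize)]) -> Option<f64> {
    if pairs.is_empty() {
        return None;
    }
    let sum: f64 = pairs.iter().map(|&(rag, javis)| reduction(rag, javis)).sum();
    Some(sum / pairs.len() as f64)
}

/// Reduction computed from total tokens; long payloads weigh more.
fn reduction_of_means(pairs: &[(usize, usize)]) -> Option<f64> {
    if pairs.is_empty() {
        return None;
    }
    let rag: usize = pairs.iter().map(|p| p.0).sum();
    let javis: usize = pairs.iter().map(|p| p.1).sum();
    Some(reduction(rag, javis))
}
```

For the opening example in this README (63 tokens against 2), `reduction(63, 2)` is about 0.968.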

Production readiness

What separates Javis from a typical research demo:

Observability (notes 24–26)

Endpoint	Purpose
`tracing` + `RUST_LOG`	structured logs, JSON mode via `JAVIS_LOG_FORMAT=json`, per-WebSocket-session spans
`GET /health`	liveness — always 200
`GET /ready`	readiness — JSON with `sentences`, `words`, `llm` mode
`GET /metrics`	Prometheus exposition: counters, histograms (5 ms – 30 s buckets), gauges

Supply-chain (notes 27–30)

Tool	Where	Catches
`cargo-deny`	CI `deny` job	RustSec advisories, license drift, banned/duplicate crates, unknown sources
Pinned MSRV (1.86)	CI `msrv` job	accidental use of newer-rustc-only features
Dependabot	weekly	grouped `cargo` and `github-actions` updates
`cargo doc -D warnings`	CI `docs` job	broken intra-doc links, invalid codeblock attrs

Container (notes 32–33)


Aspect	Detail
Multi-stage `Dockerfile`	`rust:1.86-bookworm` builder → `debian:bookworm-slim` runtime, ~150 MB final
Non-root user	`javis` (uid 1000) with `tini` as PID 1
HEALTHCHECK	`curl /health`, 15 s interval
Snapshot volume	`javis-data:/app/data` survives restarts
Optional CA secret	for sandbox / corporate-proxy environments

Performance baselines (note 31, local x86_64 Linux)

Path	Time
`Network::step` (1 000 neurons, sparse, passive)	3.2 µs
`Network::step` (1 000 neurons, sparse, +STDP)	3.4 µs
`Network::step_immutable` (1 000 neurons, recall path, post-SoA)	2.7 µs
`Brain::step` (two regions × 1 000)	7.7 µs
`encode_sentence` (18 words)	21 µs
`decode_strict` (vocab 1 000)	253 µs

End-to-end load profile (note 41, against docker compose stack)

Concurrent WS clients	Throughput	p50 / p99 latency	Server-mean
1	138 ops/s	7.2 / 8.9 ms	5.8 ms
10	430 ops/s	22.5 / 41 ms	7.5 ms
50	436 ops/s	116 / 197 ms	7.6 ms
100	432 ops/s	229 / 486 ms	7.6 ms

Recall runs against an Arc<RwLock<Inner>> with a per-call BrainState, so multiple recalls proceed in parallel. After the SoA refactor (note 41), server-side latency is ~7.6 ms across all concurrency levels — Brain step is now ~4.5 ms / recall, ws-stream 0.31 ms, decode 0.13 ms.
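The locking pattern described above can be sketched with the standard library alone. The field names inside `Inner` and `BrainState`, and the toy arithmetic in `recall`, are placeholders; only the shape (shared read lock plus per-call scratch state) reflects the text:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Shared, read-mostly network data (placeholder contents).
struct Inner {
    weights: Vec<f32>,
}

/// Scratch state owned by a single recall call.
struct BrainState {
    potentials: Vec<f32>,
}

/// Take a read lock, allocate private state, and compute a toy response.
fn recall(shared: &Arc<RwLock<Inner>>, cue: usize) -> f32 {
    let inner = shared.read().expect("lock poisoned");
    let mut state = BrainState {
        potentials: vec![0.0; inner.weights.len()],
    };
    for (v, w) in state.potentials.iter_mut().zip(&inner.weights) {
        *v = w * cue as f32;
    }
    state.potentials.iter().sum()
}

/// Run one recall per cue on its own thread; read locks do not block each other.
fn parallel_recalls(shared: Arc<RwLock<Inner>>, cues: &[usize]) -> Vec<f32> {
    let handles: Vec<_> = cues
        .iter()
        .map(|&cue| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || recall(&shared, cue))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("recall thread panicked"))
        .collect()
}
```

Training would take the write lock instead, so it excludes recalls only for the duration of a weight update.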

CI runs eight jobs on every push: fmt, clippy -D warnings, test, doc-tests, deny, msrv, docs, benches (compile-only).

Project structure

javis/
├── crates/
│   ├── snn-core/   ─ LIF neurons, STDP, iSTDP, homeostasis, BTSP, AER routing
│   ├── encoders/   ─ Text → SDR (DefaultHasher, k-of-n) + EngramDictionary
│   ├── eval/       ─ Token-efficiency benchmarks vs. naive RAG
│   ├── llm/        ─ Anthropic API adapter (real + deterministic mock)
│   └── viz/        ─ Axum + WebSocket server, 3D-force-graph frontend
├── notes/          ─ 43 research notes — every decision documented
├── scripts/        ─ End-to-end sanity check + load test (Python)
├── deploy/         ─ Prometheus + Grafana provisioning for docker-compose
└── assets/         ─ Logo and architecture diagram (programmatic SVG)

Tests

cargo test --release

Suite	Tests	Validates
`snn-core`	74	LIF dynamics, pair / triplet STDP & iSTDP, homeostasis, BCM metaplasticity, intrinsic plasticity, heterosynaptic L2, structural sprout/prune, offline replay/consolidation, reward-modulated STDP / eligibility, BTSP soft bounds, BTSP plateau-eligibility (iter-67: tag accumulation, plateau-arm threshold, one-shot potentiation, disarm-after-silence, weight clamp, off-path bit-identity), E/I balance, multi-region routing, snapshot serde, assembly formation, bounds-checked APIs, heap pending queue, AMPA/NMDA/GABA channels, read-only step equivalence
`encoders`	26	SDR union/overlap, hash determinism, top-k decode, threshold-floor decode (iter 44.1), injection, full pattern completion
`eval`	28	RAG-vs-Javis token efficiency, Wikipedia scaling, intra-topic recall, contextual mode, scale-bench smoke, iter-65 / iter-66 / iter-66.5 reward-bench snapshots, axis-sweep harness, postmortem diagnostics
`llm`	3	Anthropic adapter mock contract, token heuristic
`viz`	16	WebSocket smoke, train+recall, ask both, snapshot round-trip, `/health` + `/ready`, `/metrics`, concurrency cap (deflaked iter-67-γ.4 chore), snapshot schema migration (v1→v2)
Doc-tests	3	Public quick-start examples in `snn-core` and `encoders`
Total	150	with zero clippy warnings workspace-wide (1 test ignored — long-running multi-region soak)

Iterations

Every iteration is logged in notes/. Each note is a single hypothesis, a pre-fixed acceptance criterion, and the measurement that either confirms or falsifies it. The chain is the public artefact.

Latest snapshot (iter-67-γ.1.1 Gate-B Class C · γ.4 pre-registered). The iter-66 CA3/CA1 split + iter-66.5 eval-aligned R-STDP did not produce a robust C1-target signal on its own; iter-67 introduced BTSP (Behavioral-Timescale Synaptic Plasticity, Bittner 2017 / Magee & Grienberger 2020) as the binding rule on R2-E → C1. Locked configuration γ.1.1 (reports/gate_a_gamma_1_1_config.md) cleared Gate-A (3/4 seeds PASS at last-8 mean ≥ 0.05). The full Gate-B 8-seed run (reports/gate_b_gamma_1_1_8seed_summary.md) lands at 5/8 PASS, mean(last-8) = 0.0693 ± 0.0375 — Class (C) Partial per the locked acceptance matrix. Diagnostic: training-side metrics bit-identical across all 8 seeds; failures are eval-phase fingerprint discrimination, with all 3 FAIL seeds showing DEGRADING-or-flat per-cue trajectories. γ.1.1 binds via fingerprint geometry (w_ratio ≈ 1.000 universally), not via weight magnitude. The (C) branch's locked fallback, iter-67-γ.4 (per-post target-gating with non-target depression), is now implemented + pre-registered (reports/gate_b_gamma_4_entry.md) awaiting the 8-seed compute on the same locked seed set {0..7}. → reports/gate_b_gamma_1_1_8seed_summary.md, reports/gate_b_gamma_4_entry.md, notes/67, notes/66.5, notes/66

Phase 0 — Bio foundations · iter 00–19

20 iterations: core SNN, encoder, decoder, viz, persistence

#	Topic
00	Architecture sketch
01	snn-core baseline
02	Assembly formation + throughput budget
03	E/I balance + sparse adjacency
04	Multi-region AER
05	Encoder stub
06	Pattern completion
07	Homeostatic scaling
08	Pattern completion + homeostasis
09	Decoder
10	Multi-concept coexistence
11	iSTDP — intrinsic selectivity
12	Token-efficiency benchmark
13	Live viz iter 1 (raster)
14	Live viz iter 2 (3D brain)
15	Live viz iter 3 (persistent training)
16	Live viz iter 4 (Claude API)
17	Persistence (snapshots)
18	Wikipedia scaling
19	Two decode modes

Phase 1 — Production polish · iter 20–43

24 iterations: hardening, CI, observability, deploy, scaling

#	Topic
20	Bio-inspired optimisations: contextual engrams + BTSP
21	Architecture hardening: dead code, bounds checks, lints
22	Min-heap pending queue, AMPA/NMDA/GABA, zero lints
23	Production polish: CI, doc-tests, examples, CHANGELOG
24	Structured logging via `tracing` (RUST_LOG, JSON, spans)
25	`/health` + `/ready` probes
26	Prometheus metrics: `/metrics` endpoint
27	Supply-chain hygiene: `cargo-deny`
28	MSRV pinned to Rust 1.86
29	Dependabot (cargo + GH-actions, weekly)
30	`cargo doc -D warnings` as CI gate
31	Criterion benchmarks for `step` / encode / decode
32	Container & deploy: Docker + Compose + Prom + Grafana
33	Docker stack verified end-to-end + snapshot volume
34	End-to-end sanity script + Grafana datasource fix
35	Load test: ~141 recalls/sec sustained, no leak
36	Concurrency cap: Semaphore + 503/Retry-After
37	Snapshot schema versioning: v2 + migration chain
38	Read-only recall: `step_immutable` + `RwLock`, 2.5×
39	Profile-driven LIF rewrite: pre-summed channels, 1.5×
40	Pipeline profile: brain compute is 77 % of recall
41	AoS → SoA + WS fire-and-forget: 1.40× pipeline, 2× LIF
42	Validation-at-scale: 100-sentence FP/FN/recall benchmark
43	Topology scaling: R2 2 000→10 000, sparser, retuned iSTDP

Phase 2 — Associative learning research · iter 44–66

The pair-association track. Each row = one hypothesis, pre-fixed acceptance, measurable outcome. Verdict: ✅ pass · ⚠ partial / diagnosis · ❌ fail · 🚀 architectural pivot.

#	Headline	Verdict	Note
44	Plasticity stack: triplet-STDP, R-STDP, BCM, intrinsic, heterosynaptic, structural, replay	✅ landed	→
44.1	Decoder confidence floor `--decode-threshold`: FP −86 %, token reduction +2×	✅	→
45	Reward harness: dopamine + eligibility tag exercised end-to-end	⚠ no convergence	→
46	Teacher-forcing: 6-phase + R2 clamp + anti-causal STDP fix; clamp = 1.00	⚠ R1→R2 dominates	→
47a	Forward-drive sweep + Diehl-Cook θ: first monotone learning signal at INTER_WEIGHT = 1.0	⚠ collapses ep ≥ 5	→
47a-pm	Postmortem: oscillatory bursts, θ effect 0.05 mV (< 0.3 % of LIF swing) — pivots iter-48 plan	⚠ diagnosis	→
48	iSTDP-tightening (Vogels 2011): selectivity flips +0.014 stable	⚠ acceptance 1.5/3	→
48-sat	Phase-A 16-ep saturation: peak ep 1–4, hard collapse ep 5; iSTDP cumulative over-inhibition	❌ acceptance 0/3	→
49	iSTDP bounds & schedule sweep (3 axes): 0/3 produce learning	❌ iSTDP not the lever	→
50	Arm B reproduction `--iter46-baseline`: `selectivity_index` was wrong metric for 5 iterations	⚠ measurement bug	→
51	Arm B 16-epoch saturation: top-3 mean 0.107 vs random 0.094, 95 % CI includes random	❌ chance-level	→
52	Untrained control `--no-plasticity`: trained vs untrained Δ = 0.068, ≈ 2.2 σ	⚠ measurement question	→
53	Decoder-relative Jaccard (cross-cue + same-cue + Δ-of-Δ); 4 seeds × 16 ep	❌ Δ-of-Δ = −0.121	→
54	Hard-decorrelated R1 → R2 init (disjoint blocks per cue); paired t(3) ≈ −16, p ≪ 0.001	✅ Δ-of-Δ = +0.160	→
55	Epoch sweep 16/32/64: per-doubling Δ −0.054 → −0.016, asymptote ~0.21	⚠ saturation	→
56	Clamp-strength sweep 125/250/500: trained 0.272 → 0.245 → 0.230, 5× tighter std at c500	⚠ magnitude-limited	→
57	Phase-length sweep 40/80/120: t40 best, t80 catastrophic, t120 recovers — non-monotone	⚠ ceiling holds	→
58	Geometry-vs-plasticity diagnostic: vocab=32 → 64 raises trained_cross +0.192	✅ architecture floor	→
59	R2 capacity scaling: Δ deepens 13× (R2 2 000 → 4 000) but absolute floor moves only 0.04	⚠ branch-B mixed	→
60	DG bridge (R1 → DG → R2, k-of-n hashed SDRs): trained cross 0.45 → 0.03 (−94 %)	🚀 architecture pivot	→
61	DG full replication 4 seeds × 32 ep: cross robust; 2/4 seeds erode same-cue (0.875, 0.898)	⚠ recall instability	→
62	Recall-mode `--plasticity-off-during-eval`: same-cue = 1.000 on 4/4 seeds, post-eval L2 bit-identical	✅ stability solved	→
63	Direct cue→target metric on DG brain: `target_top3_overlap` mean across epochs, threshold = `max(0.05, μ + 2σ) = 0.0621` locked from calibration	❌ Branch (B) FAIL · Δ̄ = −0.003, t(3) = −0.18	→
64	Mechanism diagnosis (3 axes complete). Axis C `value=0.3` (perforant + DG): α at 4 seeds (smoke `Δ̄=+0.019`, full `Δ̄=+0.016`, n_pos=3/4 both phases). Axis A + Axis B both narrow-window: every non-default value is sub-floor or locked-state (`Δ = 0 bit-for-bit` on most seed-value points); the iter-46 defaults are highly tuned	⚠ Mechanism candidate (axis C) for iter-65	→
65	Perforant path robustness check. Axis C value=0.3 at 8 seeds × 32 ep: `Δ̄ = +0.0068`, `t(7) = +0.779`, `n_pos = 4/8` (chance level) → Branch (C) Reject. The 4-seed α was a sample-frequency artefact of a true ~50 % success-rate distribution. Original 4 seeds reproduced bit-identical to iter-64; new seeds 1, 3, 4 split mostly-negative; seed=99 deterministic outlier persists	❌ 4-seed α was sample artefact	→
66	Deep-research literature pivot. 28 peer-reviewed sources (Marr, Treves & Rolls, O'Reilly & McClelland, Norman & O'Reilly, Schapiro, Cassenaer & Laurent, Bellec, Izhikevich, Frémaux & Gerstner, Bittner, Magee & Grienberger, Krotov & Hopfield, Ramsauer, Willshaw, Kanerva, …) converge on: current architecture lacks a CA1-equivalent heteroassociative readout. Recommendation: CA3/CA1 split (Mechanism M1) — new C1 layer with target-presence-gated three-factor R-STDP on R2 → C1, primary metric `c1_target_top3_overlap`	🚀 Architecture pivot recommended	→
66.5	iter-66 readout pivot. Eval-aligned R-STDP on R2-E → C1: drop the canonical R2 target SDR from the teacher Phase 4 clamp so R-STDP trains on the natural cue-driven R2 pattern instead of the canonical pattern (Path-1 fix). Improves cue→target binding in single-seed smoke but multi-seed `c1_target_top3_overlap` still flat at 0	❌ R-STDP alone insufficient — pivot to BTSP	→
67	BTSP (Bittner 2017 / Magee & Grienberger 2020) on R2-E → C1. Plateau-eligibility kernel: per-synapse tag accumulates on every pre-spike with 200 ms decay; per-post-cell `burst_trace` arms plateau at ≥ 5 spikes / 30 ms; on disarm→arm transition, all incoming tagged synapses receive `Δw = +strength × tag` one-shot. iter-67-γ.1 / γ.1.1 sweeps locked an E/I-split partial-echo-state config (E=1.0, I=0.3, R2-isolation OFF). γ.1.1 cleared Gate-A (3/4 seeds PASS); Gate-B 8-seed run 5/8 PASS, mean(last-8) = 0.069 ± 0.038 → Class (C) Partial per locked acceptance matrix. γ.4 fallback (per-post target-gating + non-target depression) implemented and pre-registered, awaiting compute	⚠ Class C — γ.4 next	→ · Gate-A · Gate-B · γ.4 ENTRY

Where we are. iter-63 closed the Jaccard chain by re-introducing the iter-44/45 decoder-relative top3_accuracy on the DG-enabled brain. The iter-64/65 mechanism-diagnosis axes failed at 8 seeds: Branch (C) Reject for the perforant-path α. The iter-66 deep-research scan recommended a CA3/CA1 split (Mechanism M1), and iter-66/66.5 implemented the C1 readout with target-presence-gated R-STDP. Multi-seed c1_target_top3_overlap was still flat at 0, so R-STDP alone is insufficient on the binding pathway.

iter-67 introduces BTSP (Bittner 2017 / Magee & Grienberger 2020) as the plateau-gated retroactive-potentiation rule on R2-E → C1, with a 200 ms eligibility window that bridges the iter-46 cue → delay → prediction → teacher window that pair-STDP cannot reach. Locked configuration γ.1.1 cleared Gate-A (3/4 seeds PASS), the first iter-66+ configuration with a multi-seed-confirmed non-zero C1 readout signal. The Gate-B 8-seed run (reports/gate_b_gamma_1_1_8seed_summary.md) lands at 5/8 PASS, mean(last-8) = 0.0693 ± 0.0375, t(7) vs 0 = 5.23: Class (C) Partial. Per-seed diagnostic: all 3 FAIL seeds show DEGRADING-or-flat trajectories; all 5 PASS seeds show improving trajectories; training-side metrics are bit-identical across all 8 seeds. γ.1.1 binds via fingerprint geometry (w_ratio ≈ 1.000 universally), not via per-class weight magnitude.

The (C) branch's locked fallback, iter-67-γ.4 (per-post target-gating with non-target depression), is now implemented in crates/snn-core/src/btsp.rs (BtspParams::non_target_depression_strength) and pre-registered in reports/gate_b_gamma_4_entry.md with the same locked seed set {0..7}, the same Gate-B acceptance matrix, and four explicit hypotheses (H1 lift FAIL seeds, H2 preserve PASS seeds, H3 weight separation w_ratio > 1.05, H4 trajectory pattern eliminated). γ.1.1 numerics are bit-identical when --c1-btsp-non-target-depression-strength = 0.0 (verified: 6/6 BTSP tests + 11/11 eval tests PASS unchanged).
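The plateau-eligibility kernel summarised in the iter-67 row can be sketched as follows. The constants (200 ms tag decay, 5 spikes in 30 ms) come from the text; everything else is an assumption for illustration, including the `BtspCell` type, the sliding-window burst detector standing in for `burst_trace`, and the unit tag increment. It is not the code in `btsp.rs`, and it omits the γ.4 non-target depression:

```rust
use std::collections::VecDeque;

const TAG_TAU_MS: f32 = 200.0; // tag decay constant
const BURST_WINDOW_MS: f32 = 30.0; // burst detection window
const BURST_SPIKES: usize = 5; // post-spikes needed to arm the plateau

/// One post-synaptic cell with its incoming synapses.
struct BtspCell {
    weights: Vec<f32>,
    tags: Vec<f32>,
    recent_post_spikes: VecDeque<f32>,
    armed: bool,
}

impl BtspCell {
    fn new(n_inputs: usize) -> Self {
        Self {
            weights: vec![0.0; n_inputs],
            tags: vec![0.0; n_inputs],
            recent_post_spikes: VecDeque::new(),
            armed: false,
        }
    }

    /// Exponential decay of every eligibility tag over `dt_ms`.
    fn decay_tags(&mut self, dt_ms: f32) {
        let factor = (-dt_ms / TAG_TAU_MS).exp();
        for tag in &mut self.tags {
            *tag *= factor;
        }
    }

    /// Every pre-spike adds to that synapse's tag.
    fn on_pre_spike(&mut self, synapse: usize) {
        self.tags[synapse] += 1.0;
    }

    /// Forget post-spikes older than the window; disarm once the burst ends.
    fn prune(&mut self, now_ms: f32) {
        while let Some(&t) = self.recent_post_spikes.front() {
            if now_ms - t > BURST_WINDOW_MS {
                self.recent_post_spikes.pop_front();
            } else {
                break;
            }
        }
        if self.recent_post_spikes.len() < BURST_SPIKES {
            self.armed = false;
        }
    }

    /// Register a post-spike. Returns true only on the disarm → arm
    /// transition, when tagged synapses get `Δw = strength × tag` once.
    fn on_post_spike(&mut self, now_ms: f32, strength: f32, w_max: f32) -> bool {
        self.recent_post_spikes.push_back(now_ms);
        self.prune(now_ms);
        let now_armed = self.recent_post_spikes.len() >= BURST_SPIKES;
        let plateau = now_armed && !self.armed;
        if plateau {
            for (w, tag) in self.weights.iter_mut().zip(&self.tags) {
                *w = (*w + strength * tag).min(w_max);
            }
        }
        self.armed = now_armed;
        plateau
    }
}
```

Because potentiation fires only on the transition into the armed state, a sustained burst strengthens the tagged synapses once rather than on every spike.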

References

The plasticity rules and architectural choices come from current SNN literature. Key papers:

T. P. Vogels et al. — Inhibitory Plasticity Balances Excitation and Inhibition · Science 2011
A. D. Milstein et al. — Rapid memory encoding in a recurrent network model with BTSP · PLOS Comp Bio 2023
L. Bittner et al. — Behavioral Time Scale Synaptic Plasticity · Nature Comms 2024
Caligiore et al. — Selective inhibition in CA3 · PLOS Comp Bio 2024
L. Hu et al. — Dynamic and selective engrams emerge with memory consolidation · Nature Neurosci. 2024

License

PolyForm Noncommercial 1.0.0 — research, education, and personal study only. Not for commercial use, production deployment, or any operational system that affects real people, real decisions, real services, or real infrastructure.

The full text is in LICENSE along with a project-specific, non-binding research-use addendum spelling out the licensor's intent.

Researchers using Javis are expected to:

Run experiments in controlled / sandboxed environments, not on production infrastructure.
Not deploy any derivative work as a service to third parties.
Cite the project and the relevant notes/NN-*.md iteration when publishing results.
Disclose modifications to the codebase when sharing experimental results that depend on those modifications.

Commercial licenses are not currently offered. Inquiries: open an issue.

The Cargo.toml license field uses the standard LicenseRef-PolyForm-Noncommercial-1.0.0 SPDX-LicenseRef form; cargo-deny [licenses].private = { ignore = true } is set in deny.toml so transitive dependencies are still license-checked against the SPDX allow-list, but the workspace's own crates (all publish = false) are not flagged for the non-SPDX identifier.

