An associative SNN memory co-processor for LLM agents, built in Rust.
A spiking neural network that stores knowledge as emergent cell assemblies and retrieves it through pattern completion. Sits between your retrieval layer and your LLM, returning a few decoded concepts instead of full document chunks.
⚠ Research-only license. Javis is licensed under PolyForm Noncommercial 1.0.0. It is not for production deployment, commercial integration, or any operational system that affects real people, real decisions, real services, or real infrastructure. See the research-use addendum for the licensor's intent statement.
- Why Javis — the pitch in 30 seconds.
- Architecture diagram — animated R1 → DG → R2 walk-through.
- Performance profile — what the network actually does today.
- Architecture (text) — the iter-44 plasticity stack and decoder.
- Quick start — clone → build → recall.
- Live 3D brain — the in-browser visualisation.
- Plasticity — STDP / iSTDP / BCM / R-STDP / homeostasis.
- Token efficiency — Javis vs. naive RAG.
- Production readiness — tracing, metrics, container, MSRV, deny.
- Project structure — the workspace layout.
- Tests — what runs in CI and what the gates are.
- Iterations — every hypothesis, pre-fixed criterion, and verdict from iter-00 to today.
- References — papers behind the plasticity choices.
- License — research-use only.
Modern LLM pipelines spend most of their token budget on retrieval context. Naive RAG ships an entire chunk to the model on every query — even when only a single fact inside that chunk matters.
Javis flips the architecture: knowledge is stored as emergent cell assemblies in a spiking neural network. A query is a partial cue; pattern completion inside the network reactivates the relevant assembly; only the few decoded concepts go to the LLM.
Naive RAG: "Rust is a systems programming language focused on memory
safety and ownership; the borrow checker prevents data races
at compile time." 63 tokens
Javis: "rust" 2 tokens
That gap is the whole pitch. Whether it holds at scale — and where it stops holding — is reported below in plain numbers, not slogans.
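The arithmetic behind that gap can be sketched in a few lines. The function name `token_reduction_pct` is hypothetical and not part of the eval crate; the 63 and 2 token counts are the ones from the example above.

```rust
/// Percentage of tokens saved when `javis_tokens` replace `rag_tokens`.
/// Returns 0.0 for an empty baseline so the division is always defined.
fn token_reduction_pct(rag_tokens: u32, javis_tokens: u32) -> f64 {
    if rag_tokens == 0 {
        return 0.0;
    }
    let saved = rag_tokens as f64 - javis_tokens as f64;
    100.0 * saved / rag_tokens as f64
}
```

For the example payloads, `token_reduction_pct(63, 2)` comes out at roughly 96.8 %. This is the best-case figure for a single query; the benchmark numbers reported below are lower.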
The iter-60+ DG bridge architecture, with spike pulses propagating from R1 (input cortex, k=20 SDR) through the DG hash (4 000 cells, k-of-n sparse code) and the direct perforant path (configurable scale, iter-64 sweet-spot at 0.3) into R2 (memory cortex, 2 000 LIF, 70 % E + 30 % I, fully recurrent with STDP / iSTDP / BCM / homeostasis / intrinsic-θ / R-STDP / heterosynaptic / structural plasticity). A 3-second loop:
The blue paths (R1 → DG → R2 mossy fibres) are always on at full
weight; the dashed grey path (direct R1 → R2 perforant) is the iter-64
mechanism diagnosis axis — at scale 0.0 (DG-only, iter-63 baseline) the
brain separates cues but does not learn cue → target on the decoder; at
scale 0.3 it does (iter-64 axis C verdict, see Iterations).
Measured on a deterministic 100-sentence / 286-vocabulary benchmark
(cargo run --release -p eval --example scale_benchmark -- --sentences 100).
Reproducible from a single --seed; no external dataset, no network.
| Property | Value |
|---|---|
| Self-recall (query concept always retrievable) | 100 % |
| Token reduction vs naïve-RAG baseline | 35 – 45 % |
| Decoder latency at vocab ≤ 300 | sub-millisecond |
| Self-recall test suite | 113 / 113 passing |
The first row is the architectural claim that Javis stands behind: train a concept once, recall it deterministically. The second row is the headline number — modest but real, on a non-toy corpus. The third row makes Javis practical as a co-processor in front of an LLM.
These are the failure modes a senior reviewer would find on day one. Better to publish them than have someone tweet them.
| Limit | Measured value | Mechanism |
|---|---|---|
| Associative recall | ≈ 2 % | Of all the words that genuinely co-occur with the query in the corpus, only ~ 2 % are decoded. Javis returns the query plus 5 noise words, not the expected 5–10 related concepts. |
| Cross-domain bleed | 4.7 / 6 decoded words | At N > 50 distinct concepts the R2 layer (2 000 neurons, K=220, 11 % sparsity) saturates; iSTDP can no longer build separating walls between engrams, so unrelated domains leak into each other. |
| Engram capacity | ≈ 50 concepts | Geometric upper bound from R2_size / KWTA_K = 2 000 / 220 ≈ 9 fully-orthogonal engrams; with overlap-tolerance about 50 work cleanly before interference dominates. |
R2 was scaled from 2 000 → 10 000 neurons, recurrent connectivity sparsened
from p=0.10 → p=0.03, KWTA from 220 → 100 (1 % sparsity), iSTDP retuned for
aggressive LTD on co-active E-targets. The 113 existing tests still pass at
the new topology. Updated cross-bleed and recall numbers will be recorded in
notes/43-topology-scaling.md once the
benchmark run completes.
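As a rough check on the capacity figures, the two ratios used above (sparsity and the fully-orthogonal bound R2_size / KWTA_K) can be computed directly. This is a sketch only; `kwta_sparsity` and `orthogonal_engram_bound` are illustrative names, not functions from snn-core.

```rust
/// Fraction of R2 cells that stay active under k-winners-take-all.
fn kwta_sparsity(r2_size: usize, k: usize) -> f64 {
    k as f64 / r2_size as f64
}

/// Number of completely disjoint k-cell engrams that fit into `r2_size`
/// cells (integer division, so partial engrams are not counted).
fn orthogonal_engram_bound(r2_size: usize, k: usize) -> usize {
    if k == 0 {
        0
    } else {
        r2_size / k
    }
}
```

With the original topology (2 000 cells, K=220) this gives 11 % sparsity and a bound of 9; with the rescaled topology (10 000 cells, K=100) it gives 1 % sparsity and a bound of 100. The bound only counts engrams with zero overlap, so the usable capacity with tolerated overlap still has to come from the benchmark.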
The iter-46 negative-margin diagnosis identified the R1 → R2 forward drive as the dominant factor in the cue's R2 response. Iter-47a tests the literature-grounded fix (Brunel scaling + Diehl-Cook adaptive threshold) through a sequential 4-epoch sweep with pre-fixed acceptance criteria. Result, on the same 16 + 16 pair / vocab-32 corpus, seed 42:
| INTER_WEIGHT | r2_act mean | tgt_hit | selectivity | margin |
|---|---|---|---|---|
| 0.5 | 0.8 | 0.00 | -0.0005 | -0.01 |
| 1.0 | 139 | 2.59 | -0.0005 | -0.02 |
| 0.7 | 507 (cascade) | 9.38 | -0.0047 | -0.02 |
iter-47a-2 alone does not flip the margin sign. But the
diagnosis is sharper than iter-46's: at INTER_WEIGHT = 1.0,
target_hit_mean grew monotonically over 4 epochs (1.16 → 2.59)
and selectivity_index rose from -0.022 to -0.0005 — the right
direction. The 0.7 bistability (recurrent cascade in epoch 3) is
the key second-order finding: hard sparsity control (k-WTA, iter-48
entry) is necessary, not optional. The iter-47 metrics
(r2_active_pre_teacher_{mean,p10,p90}, selectivity_index) are
wired to A/B-test it cleanly. See
notes/47a.
The pair-association harness from iter-45 grows a teacher-forcing
training arm: a deterministic per-word canonical R2-E SDR
(canonical_target_r2_sdr) and a drive_with_r2_clamp primitive
that injects target spikes directly into R2 — bypassing the
random R1 → R2 forward path. Plus a six-phase trial schedule
(cue → delay → prediction → teacher → reward → tail) with
plasticity gating around the prediction window so evaluation
never contaminates training, an anti-causal STDP timing fix
(cue lead-in before the clamp), and a --association-training-gate-r1r2 flag to attenuate forward drive during the prediction
phase only.
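One way to picture the schedule and its gating is a small enum. This is a sketch under stated assumptions: the type `TrialPhase`, its methods, and the `attenuation` parameter are hypothetical, and plasticity is assumed to be switched off for the prediction phase only.

```rust
/// The six trial phases named in the schedule, in execution order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrialPhase {
    Cue,
    Delay,
    Prediction,
    Teacher,
    Reward,
    Tail,
}

impl TrialPhase {
    /// Fixed order of one trial.
    const SCHEDULE: [TrialPhase; 6] = [
        TrialPhase::Cue,
        TrialPhase::Delay,
        TrialPhase::Prediction,
        TrialPhase::Teacher,
        TrialPhase::Reward,
        TrialPhase::Tail,
    ];

    /// Assumption: learning is frozen while the prediction is read out,
    /// so evaluation cannot leak into the weights.
    fn plasticity_enabled(self) -> bool {
        self != TrialPhase::Prediction
    }

    /// Scale applied to the R1 -> R2 forward drive. With the gate flag set,
    /// only the prediction phase is attenuated; every other phase runs at 1.0.
    fn forward_drive_scale(self, gate_r1r2: bool, attenuation: f32) -> f32 {
        if gate_r1r2 && self == TrialPhase::Prediction {
            attenuation
        } else {
            1.0
        }
    }
}
```

A trial loop would walk `TrialPhase::SCHEDULE` and consult both methods before stepping the network in each phase.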
Honest result on the same 16-pair + 16-noise corpus, seed 42:
target_clamp_hit_rate = 1.00 across every teacher epoch (the
clamp itself works perfectly), but correct_minus_incorrect_margin
stays in [-0.06, -0.03] — the canonical-target cells fire
less than the rest under cue-only recall, even with the
timing fix and the R1 → R2 gate. The first non-zero
prediction_top3_before_teacher = 0.02 appears at epoch 3 with
homeostasis on, but does not stabilise above the 9.4 % chance
floor in any run. The bottleneck has moved from iter-45's "we
can't measure it" to iter-46's "we can measure it; here is the
number". See notes/46 for the
full chain of measurements and the next-iter (47) directions
(reduce INTER_WEIGHT, add an association-bridge region, or
make R1 → R2 itself learnable).
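The 9.4 % chance floor is consistent with uniform guessing of 3 words out of the 32-word vocabulary. That reading is an assumption about how the floor was derived, and `top_k_chance` is an illustrative helper rather than part of the harness.

```rust
/// Probability that a uniformly random top-k list contains one specific
/// target word out of `vocab` candidates.
fn top_k_chance(k: usize, vocab: usize) -> f64 {
    if vocab == 0 {
        return 0.0;
    }
    k.min(vocab) as f64 / vocab as f64
}
```

`top_k_chance(3, 32)` is 0.09375, which rounds to the 9.4 % quoted above; a top-3 accuracy has to clear that value before it counts as learning.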
A reward-aware pair-association benchmark
(cargo run --release -p eval --example reward_benchmark) that
finally lets the iter-44 R-STDP / dopamine machinery be exercised:
16 (cue, target) pairs with 16 distractor pairs, staggered
cue → target training, per-trial reward delivery, per-epoch
top-1 / top-3 readout. Pure STDP is run as the baseline arm.
The honest reading: neither arm reaches above-chance accuracy in
the available training time. R-STDP shows a small advantage on
noise suppression (mean noise-top-3 0.10 vs pure STDP 0.16)
but the architecture's R1 → R2 forward path dominates the cue's
R2 representation, leaving STDP too little room to grow strong
recurrent associations. The infrastructure is in place; the next
experiment (teacher-forcing the target SDR into R2 during
training) is documented in
notes/45.
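For orientation, the three-factor update that the R-STDP arm exercises can be sketched as below. This is a generic eligibility-trace formulation, not the code in snn-core: the `Synapse` and `RStdpParams` types, their fields, and the exponential trace decay are all assumptions made for illustration.

```rust
/// One plastic synapse with a decaying eligibility trace.
struct Synapse {
    weight: f32,
    eligibility: f32,
}

/// Hypothetical parameters of the three-factor rule.
struct RStdpParams {
    tau_e_ms: f32,
    learning_rate: f32,
    w_min: f32,
    w_max: f32,
}

impl Synapse {
    /// Factors one and two: decay the trace over `dt_ms`, then add the
    /// pair-STDP contribution of the latest pre/post spike pairing.
    fn tag(&mut self, stdp_delta: f32, dt_ms: f32, p: &RStdpParams) {
        let decay = (-dt_ms / p.tau_e_ms).exp();
        self.eligibility = self.eligibility * decay + stdp_delta;
    }

    /// Factor three: the neuromodulator converts the trace into an actual
    /// weight change. With dopamine at zero nothing is learned.
    fn apply_reward(&mut self, dopamine: f32, p: &RStdpParams) {
        let dw = p.learning_rate * dopamine * self.eligibility;
        self.weight = (self.weight + dw).clamp(p.w_min, p.w_max);
    }
}
```

The point of the split is that a pairing only marks the synapse; the weight moves later, when the per-trial reward arrives, which is what lets a delayed reward be credited to the earlier cue and target activity.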
A decoder confidence floor via --decode-threshold (default 0.0
= pre-iter-44 behaviour, recommended 0.2 for the 32-sentence
corpus). The original decode_top always returned k results even
when the highest scoring engram sat right at the random-overlap
baseline (KWTA_K / R2_E = 12.5 %). The floor omits low-confidence
matches instead of filling the slot with garbage.
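The idea of the floor fits in a few lines. The sketch below is a stand-in, not the crate's decoder: `decode_with_floor` and its `(word, overlap score)` input are hypothetical, and scores are assumed to be non-negative overlaps.

```rust
/// Return at most `k` words, best score first, dropping every candidate
/// whose score is below `threshold`.
fn decode_with_floor(scores: &[(&str, f32)], k: usize, threshold: f32) -> Vec<String> {
    let mut ranked: Vec<(&str, f32)> = scores
        .iter()
        .copied()
        .filter(|&(_, score)| score >= threshold)
        .collect();
    // Highest score first; total_cmp gives a total order even for NaN.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
        .into_iter()
        .take(k)
        .map(|(word, _)| word.to_string())
        .collect()
}
```

With a threshold of 0.0 every non-negative score passes, so the function always fills up to `k` slots, which matches the pre-iter-44 behaviour described above. Raising the threshold above the random-overlap baseline lets the decoder return fewer than `k` words.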
Measured on the same 32-sentence corpus, seed 42, --iter44 off:
| --decode-threshold | FP / Q | Token reduction | Self-recall |
|---|---|---|---|
| 0.0 (pre-iter-44) | 4.50 | 38.9 % | 100 % |
| 0.20 | 0.62 | 79.7 % | 100 % |
| 0.30 | 0.00 | 84.7 % | 100 % |
At threshold 0.20 that is 86 % fewer false positives per query (4.50 → 0.62) and 2.0 × the token reduction (38.9 % → 79.7 %) with no plasticity change at all; the SNN's engrams were already orthogonal, the decoder just refused to admit it.
Seven new biology-grade plasticity mechanisms join the existing LIF / STDP / iSTDP / homeostasis / BTSP stack, all opt-in and default-off so every pre-iter-44 test stays bit-identical.
Honest benchmark result: on the deterministic 32-sentence corpus
(seed 42), the new mechanisms do not improve recall over the
iter-43 baseline out of the box — off 4.4 %, stability 4.4 %,
tuned 2.7 %, full 1.6 %. Heterosynaptic / BCM scale weights
uniformly per post and don't change the kWTA fingerprint;
reward-modulated STDP and replay both need longer training windows
or a reward signal the current eval harness does not provide. The
stack is infrastructure for the next benchmark — multi-epoch
streaming corpora and reward-aware retrieval — see
notes/44 for the full
reading. The mechanisms themselves:
- Triplet STDP (Pfister-Gerstner 2006): frequency-dependent LTP.
- Reward-modulated STDP with eligibility traces: three-factor learning, gated by Brain::set_neuromodulator(...) (the dopamine surrogate). Closes the temporal-credit-assignment loop that pure pair-STDP cannot solve.
- BCM metaplasticity: sliding LTP/LTD threshold per post-neuron; stops the runaway-LTP failure mode under sustained drive.
- Intrinsic plasticity: adaptive per-neuron threshold; every cell drifts towards its target rate, no dead or saturated neurons.
- Heterosynaptic L2 normalisation: the direct fix for the R2 saturation problem in notes/43. Hard-bounds each post-neuron's incoming-weight budget.
- Structural plasticity: sprout new edges between repeatedly co-active E cells, prune persistently-dormant ones. Engram capacity stops being a hard topology constant.
- Offline replay / consolidation: Brain::consolidate(...) drives the top-k engram cells in pulses with full plasticity on, the way slow-wave-sleep replay deepens hippocampal engrams.
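To make the heterosynaptic item concrete, here is a minimal sketch of an L2 budget on one post-neuron's incoming weights. The function name and the choice to rescale only when the budget is exceeded are assumptions; the real rule lives in snn-core and may differ.

```rust
/// Rescale the incoming weights of one post-neuron so that their L2 norm
/// never exceeds `budget`. Weights under the budget are left untouched.
fn normalise_incoming_l2(weights: &mut [f32], budget: f32) {
    let norm = weights.iter().map(|w| w * w).sum::<f32>().sqrt();
    if norm > budget && norm > 0.0 {
        let scale = budget / norm;
        for w in weights.iter_mut() {
            *w *= scale;
        }
    }
}
```

Because every weight of a post-neuron is multiplied by the same factor, the ranking of its inputs is preserved. That fits the benchmark observation above that the rule does not change the kWTA fingerprint.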
Switch the whole stack on in the live viz with
JAVIS_ITER44=1 cargo run -p viz --release.
The full architectural rationale, composition into the existing
pipeline, and 15 new tests are documented in
notes/44-breakthrough-plasticity.md.
# Train + evaluate on 100 sentences. ~5 min wall on R2=10 000.
cargo run --release -p eval --example scale_benchmark \
-- --sentences 100 --queries 30 --decode-k 6 --seed 42
# Smaller smoke run for CI / quick checks (~30 s):
cargo run --release -p eval --example scale_benchmark -- --sentences 32

The benchmark prints a Markdown summary table; redirect stdout to capture it verbatim into a release note.
The full pipeline runs end-to-end: text in, encoded into a Sparse Distributed Representation, injected into R1 (input cortex), routed via address-event spikes into R2 (memory cortex) where STDP, iSTDP and homeostasis shape an engram, then read out by kWTA and an engram dictionary back into a list of text concepts.
Every box on the diagram corresponds to a real Rust module:
| Stage | Module |
|---|---|
| Text → SDR | crates/encoders |
| R1 / R2 / AER | crates/snn-core |
| Plasticity | crates/snn-core (stdp, istdp, homeostasis) |
| Decode | crates/encoders/src/decode.rs |
| Eval / RAG | crates/eval |
| LLM (Anthropic) | crates/llm |
| Live UI | crates/viz |
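The Text → SDR stage can be pictured as a deterministic k-of-n hash. The sketch below uses the standard library's `DefaultHasher` with a salt counter to pick k distinct bits; the function names, the salting scheme, and the SDR width used in the usage note are assumptions, not the encoder's actual layout.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

/// Map a word to `k` distinct active indices out of `n`.
/// The same word always yields the same set within one Rust toolchain.
fn encode_word(word: &str, n: usize, k: usize) -> BTreeSet<usize> {
    assert!(k <= n, "cannot activate more bits than the SDR has");
    let mut active = BTreeSet::new();
    let mut salt: u64 = 0;
    // Re-hash with an increasing salt until k distinct indices are found.
    while active.len() < k {
        let mut hasher = DefaultHasher::new();
        word.hash(&mut hasher);
        salt.hash(&mut hasher);
        active.insert((hasher.finish() % n as u64) as usize);
        salt += 1;
    }
    active
}

/// Number of active bits two SDRs share.
fn overlap(a: &BTreeSet<usize>, b: &BTreeSet<usize>) -> usize {
    a.intersection(b).count()
}
```

For example, `encode_word("rust", 1000, 20)` returns 20 indices below 1000 and returns the same set on every call. Decoding then amounts to comparing the overlap between the recalled activity and each stored fingerprint.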
# build everything
cargo build --release
# run the full test suite (every test should pass)
cargo test --release
# minimal 30-line demo printing RAG vs Javis token saving
cargo run --release -p eval --example hello_javis
# fire up the live 3D brain in a browser
cargo run -p viz --release --bin javis-viz
# → http://127.0.0.1:7777

Optional persistent brain:
cargo run -p viz --release -- --snapshot brain.json
# trains the bootstrap corpus, persists on Ctrl-C, reloads on next start

Optional real Claude API calls (otherwise the LLM adapter runs in mock mode):
ANTHROPIC_API_KEY=sk-ant-... cargo run -p viz --release
# the "send both to Claude" button now fires real calls

A multi-stage Dockerfile plus a docker-compose.yml brings up the
full observability stack (Javis, Prometheus, and Grafana) in one
command:
docker compose up --build

| URL | What |
|---|---|
| http://localhost:7777 | Javis 3D brain (WebSocket + frontend) |
| http://localhost:7777/metrics | Prometheus exposition |
| http://localhost:9090 | Prometheus UI (already scraping Javis) |
| http://localhost:3000 | Grafana, Javis dashboard pre-provisioned |
The brain state lives on a named volume (javis-data:/app/data),
so docker compose restart saves a brain.snapshot.json on
shutdown and reloads it on startup — no retraining needed.
The Grafana instance runs anonymous-admin and the Prometheus
datasource is auto-wired — meant for local-dev only, see
docker-compose.yml for the relevant GF_AUTH_* flags before
exposing it anywhere.
Open http://127.0.0.1:7777 and you get a Three.js / 3d-force-graph view of
the live brain:
- Two anatomical lobes — R1 input cortex (blue) and R2 memory cortex (yellow) with embedded inhibitory cells (pink)
- Spike pulses light each neuron as it fires, fading back over ~220 ms
- A side panel streams phase, live spike rates, decoded concepts, the token saving headline and the actual RAG-vs-Javis payloads
- Two text inputs let you live-train sentences and live-query the brain
- A "send both to Claude" button fires both payloads to the Anthropic API in parallel and shows the answers + real input/output token counts
Javis composes twelve biologically-motivated plasticity mechanisms, each opt-in:
| Mechanism | Purpose | Reference |
|---|---|---|
| LIF dynamics | leaky integrate-and-fire neurons with refractory period | classical |
| Pair STDP (E) | Hebbian potentiation between excitatory neurons | Bi & Poo 1998 |
| iSTDP | heterosynaptic plasticity at I→E, gives engram selectivity | Vogels et al. 2011 |
| Asymmetric homeostasis | scale-only-down multiplicative renormalisation | Turrigiano 2008 |
| BTSP soft bounds | Δw = a · trace · (w_max − w) instead of hard clamp | Bittner 2017 / Milstein 2024 |
| Contextual engrams | fingerprints captured during co-activity, not post-hoc | Tonegawa engram-cell line |
| Triplet STDP (iter-44) | frequency-dependent LTP via slow r2 / o2 traces | Pfister & Gerstner 2006 |
| Reward-modulated STDP (iter-44) | three-factor learning, dopamine-gated eligibility tag | Frémaux & Gerstner 2016; Izhikevich 2007 |
| Metaplasticity (BCM) (iter-44) | sliding LTP/LTD threshold per post-neuron | BCM 1982; Cooper & Bear 2012 |
| Intrinsic plasticity (SFA) (iter-44) | adaptive per-neuron threshold | Desai 1999; Chrol-Cannon 2014 |
| Heterosynaptic L1/L2 norm (iter-44) | per-post incoming-weight budget | Royer & Paré 2003; Field 2020 |
| Structural plasticity (iter-44) | sprout + prune to grow/shrink topology | Yang 2009; Holtmaat & Svoboda 2009 |
| Offline replay / consolidation (iter-44) | drives top-k engram cells with plasticity on | Buzsáki 2015; Wilson & McNaughton 1994 |
The math behind each lives in crates/snn-core/src/{stdp,istdp,homeostasis,metaplasticity,intrinsic,heterosynaptic,structural,reward,replay}.rs,
the trade-offs are documented in notes/, and the full iter-44
rationale is in notes/44-breakthrough-plasticity.md.
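As an illustration of the soft-bound row in the table, the update Δw = a · trace · (w_max − w) can be written as a one-liner. The function name is hypothetical, and the sketch assumes 0 ≤ a · trace ≤ 1.

```rust
/// Soft-bounded potentiation: the step shrinks as `w` approaches `w_max`,
/// so the weight converges to the bound without an explicit clamp.
fn soft_bound_potentiate(w: f32, a: f32, trace: f32, w_max: f32) -> f32 {
    w + a * trace * (w_max - w)
}
```

Starting at w = 0.5 with a = 0.1, trace = 1.0 and w_max = 1.0, one step gives 0.55. Each further step closes 10 % of the remaining gap, so repeated potentiation saturates smoothly instead of piling weights up at a hard ceiling.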
Two integration tests measure Javis against a naïve RAG baseline on small, hand-curated corpora. The numbers here are favourable to Javis (each query returns a single decoded concept, full RAG returns the whole paragraph) and are the floor of the architecture's reach, not its ceiling:
| Corpus | Mean RAG | Mean Javis | Mean reduction |
|---|---|---|---|
| 3 paragraphs about programming languages | 27 tok | 2.3 tok | 91.3 % |
| 5 Wikipedia-shaped paragraphs (geology, transport, biology, …) | 60 tok | 2.0 tok | 96.6 % |
These are the ideal-conditions numbers. For the benchmark that includes every realistic failure mode — cross-bleed, missed co-occurrences, decoder saturation — read Performance profile above.
cargo test -p eval --release token_efficiency -- --nocapture
cargo test -p eval --release wiki_benchmark -- --nocapture

What separates Javis from a typical research demo:
Observability (notes 24–26)
| Endpoint | Purpose |
|---|---|
| tracing + RUST_LOG | structured logs, JSON mode via JAVIS_LOG_FORMAT=json, per-WebSocket-session spans |
| GET /health | liveness: always 200 |
| GET /ready | readiness: JSON with sentences, words, llm mode |
| GET /metrics | Prometheus exposition: counters, histograms (5 ms – 30 s buckets), gauges |
Supply-chain (notes 27–30)
| Tool | Where | Catches |
|---|---|---|
| cargo-deny | CI deny job | RustSec advisories, license drift, banned/duplicate crates, unknown sources |
| Pinned MSRV (1.86) | CI msrv job | accidental use of newer-rustc-only features |
| Dependabot | weekly | grouped cargo and github-actions updates |
| cargo doc -D warnings | CI docs job | broken intra-doc links, invalid codeblock attrs |
Container (notes 32–33)
| | |
|---|---|
| Multi-stage Dockerfile | rust:1.86-bookworm builder → debian:bookworm-slim runtime, ~150 MB final |
| Non-root user | javis (uid 1000) with tini as PID 1 |
| HEALTHCHECK | curl /health, 15 s interval |
| Snapshot volume | javis-data:/app/data survives restarts |
| Optional CA secret | for sandbox / corporate-proxy environments |
Performance baselines (note 31, local x86_64 Linux)
| Path | Time |
|---|---|
| Network::step (1 000 neurons, sparse, passive) | 3.2 µs |
| Network::step (1 000 neurons, sparse, +STDP) | 3.4 µs |
| Network::step_immutable (1 000 neurons, recall path, post-SoA) | 2.7 µs |
| Brain::step (two regions × 1 000) | 7.7 µs |
| encode_sentence (18 words) | 21 µs |
| decode_strict (vocab 1 000) | 253 µs |
End-to-end load profile (note 41, against docker compose stack)
| Concurrent WS clients | Throughput | p50 / p99 latency | Server-mean |
|---|---|---|---|
| 1 | 138 ops/s | 7.2 / 8.9 ms | 5.8 ms |
| 10 | 430 ops/s | 22.5 / 41 ms | 7.5 ms |
| 50 | 436 ops/s | 116 / 197 ms | 7.6 ms |
| 100 | 432 ops/s | 229 / 486 ms | 7.6 ms |
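One way to read this table is through Little's law. Assuming the load generator is closed-loop (each client waits for its reply before sending the next request, which is an assumption about the test script, not something stated here), mean client latency should be close to clients / throughput. The helper name below is illustrative.

```rust
/// Little's law for a closed loop with no think time:
/// latency = concurrent clients / throughput, reported in milliseconds.
fn closed_loop_latency_ms(clients: u32, throughput_ops_per_s: f64) -> f64 {
    1000.0 * clients as f64 / throughput_ops_per_s
}
```

Plugging in the rows gives about 7.2 ms, 23.3 ms, 114.7 ms and 231.5 ms, each within a few percent of the measured p50. Under that assumption the latency growth beyond 10 clients would be queueing in front of a server that saturates at roughly 430 ops/s, which is consistent with the flat server-mean column.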
Recall runs against an Arc<RwLock<Inner>> with a per-call
BrainState, so multiple recalls proceed in parallel. After the
SoA refactor (note 41), mean server-side latency stays between
5.8 ms and 7.6 ms across all measured concurrency levels: Brain step
is now ~4.5 ms / recall, ws-stream 0.31 ms, decode 0.13 ms.
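The locking pattern can be sketched with the standard library alone. Only the `Arc<RwLock<Inner>>` shape and the per-call `BrainState` come from the text above; the fields of both structs and the toy recall logic are placeholders.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Shared, read-mostly network data (placeholder field).
struct Inner {
    weights: Vec<f32>,
}

/// Scratch state owned by exactly one recall call (placeholder field).
struct BrainState {
    potentials: Vec<f32>,
}

/// One recall: takes a read lock, so it never blocks other readers, and
/// writes only into its own BrainState.
fn recall(shared: &RwLock<Inner>, cue: &[usize]) -> f32 {
    let inner = shared.read().expect("lock poisoned");
    let mut state = BrainState {
        potentials: vec![0.0; inner.weights.len()],
    };
    for &i in cue {
        // Out-of-range cue indices are ignored.
        if let Some(w) = inner.weights.get(i) {
            state.potentials[i] += *w;
        }
    }
    state.potentials.iter().sum()
}

/// Run one recall per cue on its own thread and collect results in order.
fn parallel_recalls(shared: Arc<RwLock<Inner>>, cues: Vec<Vec<usize>>) -> Vec<f32> {
    let handles: Vec<_> = cues
        .into_iter()
        .map(|cue| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || recall(&shared, &cue))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("recall thread panicked"))
        .collect()
}
```

Training would take the write lock instead, so the design trades brief writer exclusivity for lock-free-feeling reads on the hot recall path.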
CI runs eight jobs on every push: fmt, clippy -D warnings,
test, doc-tests, deny, msrv, docs, benches (compile-only).
javis/
├── crates/
│ ├── snn-core/ ─ LIF neurons, STDP, iSTDP, homeostasis, BTSP, AER routing
│ ├── encoders/ ─ Text → SDR (DefaultHasher, k-of-n) + EngramDictionary
│ ├── eval/ ─ Token-efficiency benchmarks vs. naive RAG
│ ├── llm/ ─ Anthropic API adapter (real + deterministic mock)
│ └── viz/ ─ Axum + WebSocket server, 3D-force-graph frontend
├── notes/ ─ 43 research notes — every decision documented
├── scripts/ ─ End-to-end sanity check + load test (Python)
├── deploy/ ─ Prometheus + Grafana provisioning for docker-compose
└── assets/ ─ Logo and architecture diagram (programmatic SVG)
cargo test --release

| Suite | Tests | Validates |
|---|---|---|
| snn-core | 74 | LIF dynamics, pair / triplet STDP & iSTDP, homeostasis, BCM metaplasticity, intrinsic plasticity, heterosynaptic L2, structural sprout/prune, offline replay/consolidation, reward-modulated STDP / eligibility, BTSP soft bounds, BTSP plateau-eligibility (iter-67: tag accumulation, plateau-arm threshold, one-shot potentiation, disarm-after-silence, weight clamp, off-path bit-identity), E/I balance, multi-region routing, snapshot serde, assembly formation, bounds-checked APIs, heap pending queue, AMPA/NMDA/GABA channels, read-only step equivalence |
| encoders | 26 | SDR union/overlap, hash determinism, top-k decode, threshold-floor decode (iter 44.1), injection, full pattern completion |
| eval | 28 | RAG-vs-Javis token efficiency, Wikipedia scaling, intra-topic recall, contextual mode, scale-bench smoke, iter-65 / iter-66 / iter-66.5 reward-bench snapshots, axis-sweep harness, postmortem diagnostics |
| llm | 3 | Anthropic adapter mock contract, token heuristic |
| viz | 16 | WebSocket smoke, train+recall, ask both, snapshot round-trip, /health + /ready, /metrics, concurrency cap (deflaked iter-67-γ.4 chore), snapshot schema migration (v1→v2) |
| Doc-tests | 3 | Public quick-start examples in snn-core and encoders |
| Total | 150 | with zero clippy warnings workspace-wide (1 test ignored: long-running multi-region soak) |
Every iteration is logged in notes/. Each note is a single
hypothesis, a pre-fixed acceptance criterion, and the measurement that
either confirms or falsifies it. The chain is the public artefact.
Latest snapshot (iter-67-γ.1.1 Gate-B Class C · γ.4 pre-registered). The iter-66 CA3/CA1 split + iter-66.5 eval-aligned R-STDP did not produce a robust C1-target signal on their own; iter-67 introduced BTSP (Behavioral-Timescale Synaptic Plasticity, Bittner 2017 / Magee & Grienberger 2020) as the binding rule on R2-E → C1. Locked configuration γ.1.1 (reports/gate_a_gamma_1_1_config.md) cleared Gate-A (3/4 seeds PASS at last-8 mean ≥ 0.05). The full Gate-B 8-seed run (reports/gate_b_gamma_1_1_8seed_summary.md) lands at 5/8 PASS, mean(last-8) = 0.0693 ± 0.0375: Class (C) Partial per the locked acceptance matrix.

Diagnostic: training-side metrics are bit-identical across all 8 seeds; failures are eval-phase fingerprint discrimination, with all 3 FAIL seeds showing DEGRADING-or-flat per-cue trajectories. γ.1.1 binds via fingerprint geometry (w_ratio ≈ 1.000 universally), not via weight magnitude. The (C) branch's locked fallback, iter-67-γ.4 (per-post target-gating with non-target depression), is now implemented and pre-registered (reports/gate_b_gamma_4_entry.md), awaiting the 8-seed compute on the same locked seed set {0..7}.

→ reports/gate_b_gamma_1_1_8seed_summary.md, reports/gate_b_gamma_4_entry.md, notes/67, notes/66.5, notes/66
20 iterations: core SNN, encoder, decoder, viz, persistence
24 iterations: hardening, CI, observability, deploy, scaling
The pair-association track. Each row = one hypothesis, pre-fixed acceptance, measurable outcome. Verdict: ✅ pass · ⚠ partial / diagnosis · ❌ fail · 🚀 architectural pivot.
| # | Headline | Verdict | Note |
|---|---|---|---|
| 44 | Plasticity stack: triplet-STDP, R-STDP, BCM, intrinsic, heterosynaptic, structural, replay | ✅ landed | → |
| 44.1 | Decoder confidence floor --decode-threshold: FP −86 %, token reduction +2× | ✅ | → |
| 45 | Reward harness: dopamine + eligibility tag exercised end-to-end | ⚠ no convergence | → |
| 46 | Teacher-forcing: 6-phase + R2 clamp + anti-causal STDP fix; clamp = 1.00 | ⚠ R1→R2 dominates | → |
| 47a | Forward-drive sweep + Diehl-Cook θ: first monotone learning signal at INTER_WEIGHT = 1.0 | ⚠ collapses ep ≥ 5 | → |
| 47a-pm | Postmortem: oscillatory bursts, θ effect 0.05 mV (< 0.3 % of LIF swing) — pivots iter-48 plan | ⚠ diagnosis | → |
| 48 | iSTDP-tightening (Vogels 2011): selectivity flips +0.014 stable | ⚠ acceptance 1.5/3 | → |
| 48-sat | Phase-A 16-ep saturation: peak ep 1–4, hard collapse ep 5; iSTDP cumulative over-inhibition | ❌ acceptance 0/3 | → |
| 49 | iSTDP bounds & schedule sweep (3 axes): 0/3 produce learning | ❌ iSTDP not the lever | → |
| 50 | Arm B reproduction --iter46-baseline: selectivity_index was wrong metric for 5 iterations | ⚠ measurement bug | → |
| 51 | Arm B 16-epoch saturation: top-3 mean 0.107 vs random 0.094, 95 % CI includes random | ❌ chance-level | → |
| 52 | Untrained control --no-plasticity: trained vs untrained Δ = 0.068, ≈ 2.2 σ | ⚠ measurement question | → |
| 53 | Decoder-relative Jaccard (cross-cue + same-cue + Δ-of-Δ); 4 seeds × 16 ep | ❌ Δ-of-Δ = −0.121 | → |
| 54 | Hard-decorrelated R1 → R2 init (disjoint blocks per cue); paired t(3) ≈ −16, p ≪ 0.001 | ✅ Δ-of-Δ = +0.160 | → |
| 55 | Epoch sweep 16/32/64: per-doubling Δ −0.054 → −0.016, asymptote ~0.21 | ⚠ saturation | → |
| 56 | Clamp-strength sweep 125/250/500: trained 0.272 → 0.245 → 0.230, 5× tighter std at c500 | ⚠ magnitude-limited | → |
| 57 | Phase-length sweep 40/80/120: t40 best, t80 catastrophic, t120 recovers — non-monotone | ⚠ ceiling holds | → |
| 58 | Geometry-vs-plasticity diagnostic: vocab=32 → 64 raises trained_cross +0.192 | ✅ architecture floor | → |
| 59 | R2 capacity scaling: Δ deepens 13× (R2 2 000 → 4 000) but absolute floor moves only 0.04 | ⚠ branch-B mixed | → |
| 60 | DG bridge (R1 → DG → R2, k-of-n hashed SDRs): trained cross 0.45 → 0.03 (−94 %) | 🚀 architecture pivot | → |
| 61 | DG full replication 4 seeds × 32 ep: cross robust; 2/4 seeds erode same-cue (0.875, 0.898) | ⚠ recall instability | → |
| 62 | Recall-mode --plasticity-off-during-eval: same-cue = 1.000 on 4/4 seeds, post-eval L2 bit-identical | ✅ stability solved | → |
| 63 | Direct cue→target metric on DG brain: target_top3_overlap mean across epochs, threshold = max(0.05, μ + 2σ) = 0.0621 locked from calibration | ❌ Branch (B) FAIL · Δ̄ = −0.003, t(3) = −0.18 | → |
| 64 | Mechanism diagnosis (3 axes complete). Axis C value=0.3 (perforant + DG): α at 4 seeds (smoke Δ̄=+0.019, full Δ̄=+0.016, n_pos=3/4 both phases). Axis A + Axis B both narrow-window: every non-default value is sub-floor or locked-state (Δ = 0 bit-for-bit on most seed-value points); the iter-46 defaults are highly tuned | ⚠ Mechanism candidate (axis C) for iter-65 | → |
| 65 | Perforant path robustness check. Axis C value=0.3 at 8 seeds × 32 ep: Δ̄ = +0.0068, t(7) = +0.779, n_pos = 4/8 (chance level) → Branch (C) Reject. The 4-seed α was a sample-frequency artefact of a true ~50 % success-rate distribution. Original 4 seeds reproduced bit-identical to iter-64; new seeds 1, 3, 4 split mostly-negative; seed=99 deterministic outlier persists | ❌ 4-seed α was sample artefact | → |
| 66 | Deep-research literature pivot. 28 peer-reviewed sources (Marr, Treves & Rolls, O'Reilly & McClelland, Norman & O'Reilly, Schapiro, Cassenaer & Laurent, Bellec, Izhikevich, Frémaux & Gerstner, Bittner, Magee & Grienberger, Krotov & Hopfield, Ramsauer, Willshaw, Kanerva, …) converge on: current architecture lacks a CA1-equivalent heteroassociative readout. Recommendation: CA3/CA1 split (Mechanism M1), a new C1 layer with target-presence-gated three-factor R-STDP on R2 → C1, primary metric c1_target_top3_overlap | 🚀 Architecture pivot recommended | → |
| 66.5 | iter-66 readout pivot. Eval-aligned R-STDP on R2-E → C1: drop the canonical R2 target SDR from the teacher Phase 4 clamp so R-STDP trains on the natural cue-driven R2 pattern instead of the canonical pattern (Path-1 fix). Improves cue→target binding in single-seed smoke but multi-seed c1_target_top3_overlap still flat at 0 | ❌ R-STDP alone insufficient; pivot to BTSP | → |
| 67 | BTSP (Bittner 2017 / Magee & Grienberger 2020) on R2-E → C1. Plateau-eligibility kernel: per-synapse tag accumulates on every pre-spike with 200 ms decay; per-post-cell burst_trace arms plateau at ≥ 5 spikes / 30 ms; on disarm→arm transition, all incoming tagged synapses receive Δw = +strength × tag one-shot. iter-67-γ.1 / γ.1.1 sweeps locked an E/I-split partial-echo-state config (E=1.0, I=0.3, R2-isolation OFF). γ.1.1 cleared Gate-A (3/4 seeds PASS); Gate-B 8-seed run 5/8 PASS, mean(last-8) = 0.069 ± 0.038 → Class (C) Partial per locked acceptance matrix. γ.4 fallback (per-post target-gating + non-target depression) implemented and pre-registered, awaiting compute | ⚠ Class C, γ.4 next | → · Gate-A · Gate-B · γ.4 ENTRY |
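The plateau-eligibility kernel from the iter-67 row can be sketched as follows. This is a simplified reading, not the code in crates/snn-core/src/btsp.rs: the type names, the exponential form of the 200 ms decay, the sliding spike window, and disarming only on the next post-spike are all assumptions.

```rust
const TAG_TAU_MS: f32 = 200.0; // assumed exponential time constant
const BURST_WINDOW_MS: f32 = 30.0;
const BURST_SPIKES: usize = 5;

struct BtspSynapse {
    weight: f32,
    tag: f32,
}

/// One post-cell with its incoming synapses and burst bookkeeping.
struct BtspPost {
    synapses: Vec<BtspSynapse>,
    recent_post_spikes_ms: Vec<f32>,
    armed: bool,
}

impl BtspPost {
    fn new(n_inputs: usize, initial_weight: f32) -> Self {
        BtspPost {
            synapses: (0..n_inputs)
                .map(|_| BtspSynapse { weight: initial_weight, tag: 0.0 })
                .collect(),
            recent_post_spikes_ms: Vec::new(),
            armed: false,
        }
    }

    /// Let every eligibility tag decay over `dt_ms`.
    fn decay_tags(&mut self, dt_ms: f32) {
        let factor = (-dt_ms / TAG_TAU_MS).exp();
        for s in &mut self.synapses {
            s.tag *= factor;
        }
    }

    /// A pre-spike on `input` adds to that synapse's tag.
    fn on_pre_spike(&mut self, input: usize) {
        self.synapses[input].tag += 1.0;
    }

    /// Register a post-spike. Returns true only on the disarm -> arm
    /// transition, when all tagged synapses are potentiated once.
    fn on_post_spike(&mut self, now_ms: f32, strength: f32, w_max: f32) -> bool {
        self.recent_post_spikes_ms
            .retain(|&t| now_ms - t < BURST_WINDOW_MS);
        self.recent_post_spikes_ms.push(now_ms);
        let burst = self.recent_post_spikes_ms.len() >= BURST_SPIKES;
        let fired = burst && !self.armed;
        if fired {
            for s in &mut self.synapses {
                s.weight = (s.weight + strength * s.tag).min(w_max);
            }
        }
        self.armed = burst;
        fired
    }
}
```

In this sketch a synapse tagged by one pre-spike jumps from 0.1 to 0.6 on the fifth post-spike of a burst (strength 0.5), untagged synapses stay put, and further spikes inside the same burst do nothing until the cell has disarmed.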
Where we are. iter-63 closed the Jaccard chain by re-introducing
the iter-44/45 decoder-relative top3_accuracy on the DG-enabled brain.
iter-64/65 mechanism-diagnosis axes failed at 8 seeds — Branch (C)
Reject for the perforant-path α. iter-66 deep-research scan
recommended a CA3/CA1 split (Mechanism M1), and iter-66/66.5
implemented the C1 readout with target-presence-gated R-STDP. Multi-
seed c1_target_top3_overlap was still flat at 0 — R-STDP alone is
insufficient on the binding pathway.
iter-67 introduces BTSP (Bittner 2017 / Magee & Grienberger 2020)
as the plateau-gated retroactive-potentiation rule on R2-E → C1, with
a 200 ms eligibility window that bridges the iter-46 cue → delay →
prediction → teacher window pair-STDP cannot reach. Locked
configuration γ.1.1 cleared Gate-A (3/4 seeds PASS) —
the first iter-66+ configuration with a multi-seed-confirmed
non-zero C1 readout signal. Gate-B 8-seed run (reports/gate_b_gamma_1_1_8seed_summary.md) lands at 5/8 PASS, mean(last-8)
= 0.0693 ± 0.0375, t(7) vs 0 = 5.23 — Class (C) Partial.
Per-seed diagnostic: all 3 FAIL seeds show DEGRADING-or-flat
trajectories; all 5 PASS seeds show improving trajectories;
training-side metrics are bit-identical across all 8 seeds.
γ.1.1 binds via fingerprint geometry (w_ratio ≈ 1.000 universally),
not via per-class weight magnitude.
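The quoted t(7) value follows from the reported mean and spread if the ± figure is read as the sample standard deviation over the 8 seeds. That reading is an assumption, and `one_sample_t` is an illustrative helper.

```rust
/// One-sample t statistic against the null mean `mu0`:
/// t = (mean - mu0) / (sd / sqrt(n)).
fn one_sample_t(mean: f64, sd: f64, n: u32, mu0: f64) -> f64 {
    (mean - mu0) / (sd / (n as f64).sqrt())
}
```

`one_sample_t(0.0693, 0.0375, 8, 0.0)` evaluates to about 5.23, matching the number above. The statistic only says the mean differs from zero; it does not change the 5/8 per-seed PASS count that drives the Class (C) verdict.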
The (C) branch's locked fallback iter-67-γ.4 (per-post target-gating
with non-target depression) is now implemented in crates/snn-core/src/btsp.rs (BtspParams::non_target_depression_strength) and
pre-registered in reports/gate_b_gamma_4_entry.md with the same
locked seed set {0..7}, the same Gate-B acceptance matrix, and four
explicit hypotheses (H1 lift FAIL seeds, H2 preserve PASS seeds, H3
weight separation w_ratio > 1.05, H4 trajectory pattern eliminated).
γ.1.1 numerics are bit-identical when --c1-btsp-non-target-depression-strength = 0.0 (verified: 6/6 BTSP tests + 11/11 eval tests PASS
unchanged).
The plasticity rules and architectural choices come from current SNN literature. Key papers:
- T. P. Vogels et al.: Inhibitory Plasticity Balances Excitation and Inhibition in Sensory Pathways and Memory Networks · Science 2011
- A. D. Milstein et al.: Rapid memory encoding in a recurrent network model with BTSP · PLOS Comp Bio 2023
- L. Bittner et al.: Behavioral Time Scale Synaptic Plasticity · Nature Comms 2024
- Caligiore et al.: Selective inhibition in CA3 · PLOS Comp Bio 2024
- L. Hu et al.: Dynamic and selective engrams emerge with memory consolidation · Nature Neurosci. 2024
PolyForm Noncommercial 1.0.0 — research, education, and personal study only. Not for commercial use, production deployment, or any operational system that affects real people, real decisions, real services, or real infrastructure.
The full text is in LICENSE along with a project-specific,
non-binding research-use addendum
spelling out the licensor's intent.
Researchers using Javis are expected to:
- Run experiments in controlled / sandboxed environments, not on production infrastructure.
- Not deploy any derivative work as a service to third parties.
- Cite the project and the relevant notes/NN-*.md iteration when publishing results.
- Disclose modifications to the codebase when sharing experimental results that depend on those modifications.
Commercial licenses are not currently offered. Inquiries: open an issue.
The Cargo.toml license field uses the standard
LicenseRef-PolyForm-Noncommercial-1.0.0 SPDX-LicenseRef form;
cargo-deny [licenses].private = { ignore = true } is set in
deny.toml so transitive dependencies are still license-checked
against the SPDX allow-list, but the workspace's own crates (all
publish = false) are not flagged for the non-SPDX identifier.