v0.3.0 — Structured lateral exchange (validated)

The first release with a statistically validated answer to PTG's core
question: does decentralized lateral exchange between cortical columns improve
answer quality over a monolithic equal-compute baseline?

Short answer: the raw lateral-text medium is quality-neutral to negative
and does not scale. The structured medium (bounded claim-excerpts + a
synthesis directive), on a 4B-class model, is quality-positive and stable across
4 → 150 columns (~80–85% win over the equal-call no-lateral control, p ≈ 0).

This is a research release, not a production claim. Every number is directional,
judged against pre-registered decision bars, and caveated in the findings docs.

Highlights

The mechanism: structured lateral exchange

ptg-runtime adds LateralContextMode::{Raw, Structured}. Structured mode
injects a bounded, char-safe claim-excerpt of each neighbor's prediction plus
a synthesis directive, never the full verbatim prediction.
The mode is selected with --lateral-mode raw|structured in ptg and ptg-bench.
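
To make the difference between the two modes concrete, here is a minimal sketch of how a structured context could be assembled. Only the enum name and its variants come from ptg-runtime; the excerpt cap, the directive wording, and the function names (`char_safe_excerpt`, `lateral_context`) are assumptions made for illustration, not the shipped implementation.

```rust
/// How a column sees its neighbors' predictions (names mirror the release notes).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum LateralContextMode {
    Raw,
    Structured,
}

/// Hypothetical cap on excerpt length, counted in chars rather than bytes.
const MAX_EXCERPT_CHARS: usize = 40;

/// Hypothetical directive text; the real wording is not given in these notes.
const SYNTHESIS_DIRECTIVE: &str =
    "Synthesize the claims above with your own reasoning; do not copy them.";

/// Truncate to at most `max_chars` characters without splitting a UTF-8 code point.
fn char_safe_excerpt(text: &str, max_chars: usize) -> &str {
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text, // already within the bound
    }
}

/// Build the lateral context a column receives from its neighbors.
pub fn lateral_context(mode: LateralContextMode, neighbors: &[&str]) -> String {
    match mode {
        // Raw: every neighbor's full prediction, verbatim.
        LateralContextMode::Raw => neighbors.join("\n"),
        // Structured: one bounded excerpt per neighbor, then the directive.
        LateralContextMode::Structured => {
            let mut out = String::new();
            for (i, prediction) in neighbors.iter().enumerate() {
                let excerpt = char_safe_excerpt(prediction.trim(), MAX_EXCERPT_CHARS);
                out.push_str(&format!("[neighbor {} claim] {}\n", i, excerpt));
            }
            out.push_str(SYNTHESIS_DIRECTIVE);
            out
        }
    }
}

fn main() {
    let neighbors = ["The answer is 42 because the series converges after six terms, naïvely summed."];
    println!("{}", lateral_context(LateralContextMode::Structured, &neighbors));
}
```

Counting characters rather than bytes is what keeps the truncation from panicking on multi-byte text such as "naïvely".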

The evidence arc (raw → structured, 4 → 150 cols)

Run	Lateral win rate	Echo rate	Notes
raw, 4-col e2b	coin flip (11 vs 12)	25%	mechanism activates, no quality gain
raw, 150-col e2b	14%	57%	catastrophic at scale
structured, 4-col e4b	78.4%	11%	the medium, not the concept
structured, 50-col e4b (powered)	85.1%	6.7%	p ≈ 10⁻⁶
structured, 150-col e4b (powered)	82.4%	8.4%	p ≈ 0, CI [78%, 86%]

Length confounding is ruled out at every scale: the lateral condition wins even
when its draft is shorter. The effect saturates at ~80–85%; it does not keep
strengthening past ~50 columns.
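
One common way to check a win/loss split against chance is an exact two-sided sign test. The sketch below applies it to the raw 4-column row, reading "11 vs 12" as lateral wins versus control wins among decisive pairs (an assumption). These notes do not say which test ptg-judge uses, so treat this as an illustration of the idea rather than the project's method; the function names are hypothetical.

```rust
/// Probability of exactly `k` successes in `n` fair coin flips.
/// Plain f64 arithmetic: fine for a few hundred pairs, not for thousands.
fn binom_pmf_half(n: u32, k: u32) -> f64 {
    // Build C(n, k) multiplicatively, using the smaller of k and n - k.
    let k = k.min(n - k);
    let mut coeff = 1.0_f64;
    for i in 0..k {
        coeff = coeff * f64::from(n - i) / f64::from(i + 1);
    }
    coeff * 0.5_f64.powi(n as i32)
}

/// Two-sided exact sign test under the null "each side wins with probability 0.5".
/// Sums the probability of every outcome that is no more likely than the observed one.
fn sign_test_p(wins: u32, losses: u32) -> f64 {
    let n = wins + losses;
    let observed = binom_pmf_half(n, wins);
    let p: f64 = (0..=n)
        .map(|k| binom_pmf_half(n, k))
        .filter(|&pk| pk <= observed * (1.0 + 1e-9)) // tolerance for float ties
        .sum();
    p.min(1.0)
}

fn main() {
    // Assumption: 11 lateral wins vs 12 control wins, ties excluded.
    let p = sign_test_p(11, 12);
    println!("raw, 4-col: two-sided sign-test p = {:.3}", p);
}
```

For 11 vs 12 the split is as even as 23 decisive pairs allow, so the test returns a p-value of essentially 1, which matches the "coin flip" label in the table.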

Infrastructure: bounded column concurrency

The 150-column ceiling that blocked e4b for most of development was not
server capacity. It was unbounded client fan-out: join_all over all columns
fired 150 concurrent requests at a 4-slot server. Fixed with
CorticalMesh.max_concurrent_column_ticks and the ptg-bench flag --column-concurrency.
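
The idea behind the fix can be sketched without any async runtime. The example below swaps in a plain thread pool from the standard library, so it is not the `join_all`-based code path in ptg-runtime; in async code a semaphore or a buffered stream would play the same role. `tick_columns_bounded` and `fake_column_tick` are hypothetical names, and the tick is a stand-in for the real HTTP request.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

/// Stand-in for one column's model call (hypothetical; the real tick is an HTTP request).
fn fake_column_tick(col: usize) -> usize {
    thread::sleep(Duration::from_millis(1));
    col * 2
}

/// Run one tick per column with at most `max_concurrent` ticks in flight.
/// Returns the results in column order plus the peak in-flight count observed.
fn tick_columns_bounded(n_columns: usize, max_concurrent: usize) -> (Vec<usize>, usize) {
    let next = AtomicUsize::new(0); // next column index to claim
    let in_flight = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);
    let results = Mutex::new(vec![0usize; n_columns]);

    thread::scope(|s| {
        // A fixed number of workers is what bounds the fan-out.
        for _ in 0..max_concurrent.max(1) {
            s.spawn(|| loop {
                let col = next.fetch_add(1, Ordering::SeqCst);
                if col >= n_columns {
                    break;
                }
                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                let out = fake_column_tick(col);
                in_flight.fetch_sub(1, Ordering::SeqCst);
                results.lock().unwrap()[col] = out;
            });
        }
    });

    (results.into_inner().unwrap(), peak.into_inner())
}

fn main() {
    // 150 columns against a limit of 4, mirroring the 4-slot server in the notes.
    let (results, peak) = tick_columns_bounded(150, 4);
    println!("ticked {} columns, peak in flight = {}", results.len(), peak);
}
```

With the limit set to the server's slot count, the server never sees more simultaneous requests than it can serve, regardless of how many columns the mesh has.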

Benchmark + judge methodology

The methodology rests on two tools: ptg-bench (conditions, per-tick
observability, routing, scale flags) and ptg-judge (a programmatic perturbation
delta as the primary measure, corroborated by a blind LLM judge, plus an echo
screen, a determinism gate, and a length control). Pre-registered decision bars
were set before every run.
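
One plausible reading of the length control is a subset check: recompute the lateral win rate over only those pairs where the lateral draft is shorter than the control draft, and see whether the advantage survives. The sketch below shows that check under this assumption. The `JudgedPair` struct, its fields, and the sample pairs are all hypothetical and are not data from any run.

```rust
/// One judged pair: the lateral draft versus the equal-call control draft.
struct JudgedPair {
    lateral_chars: usize,
    control_chars: usize,
    lateral_won: bool,
}

/// Win rate of the lateral condition over the pairs that satisfy `keep`.
/// Returns None when no pair qualifies, so an empty subset is never read as 0%.
fn win_rate<F: Fn(&JudgedPair) -> bool>(pairs: &[JudgedPair], keep: F) -> Option<f64> {
    let kept: Vec<&JudgedPair> = pairs.iter().filter(|p| keep(*p)).collect();
    if kept.is_empty() {
        return None;
    }
    let wins = kept.iter().filter(|p| p.lateral_won).count();
    Some(wins as f64 / kept.len() as f64)
}

fn main() {
    // Made-up pairs, for illustration only.
    let pairs = vec![
        JudgedPair { lateral_chars: 100, control_chars: 200, lateral_won: true },
        JudgedPair { lateral_chars: 150, control_chars: 200, lateral_won: true },
        JudgedPair { lateral_chars: 300, control_chars: 200, lateral_won: true },
        JudgedPair { lateral_chars: 120, control_chars: 200, lateral_won: false },
    ];
    let overall = win_rate(&pairs, |_| true);
    let when_shorter = win_rate(&pairs, |p| p.lateral_chars < p.control_chars);
    println!("overall: {:?}, lateral shorter: {:?}", overall, when_shorter);
}
```

If the win rate in the shorter-draft subset stays well above 50%, longer output alone cannot explain the judge's preference.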

Honest caveats (please read before citing)

Survivorship at 150 cols: 3/15 mesh runs failed (persistent HTTP 500 in
MATH columns, retry-exhausted) and were excluded from the powered judge. If
those runs would have been low quality, excluding them inflates the 82.4%. This
is the most important caveat and the top open item.
Single model: only gemma-4-e4b tested at scale.
Temperature-0 nondeterminism: the server is not perfectly deterministic at
temp 0; some control pairs were excluded as unstable.
The 150-col 1p1r run's 93% was small-sample optimism; the powered 82.4% is the
figure to cite.
ptg-belief (typed belief/evidence layer) is deferred — structured text
exchange works without it.

What's next

Survivorship follow-up (a 0-mesh-failure run).
A4 explicit self-revision control (lateral exchange vs "reconsider your
answer" at equal call budget).
Semantic embedding convergence (§9.3), blocked on the embeddings endpoint.

Full evidence + methodology: see docs/ROADMAP.md and the
docs/STRUCTURED_LATERAL_*.md series.
