PTG v0.2.0 — Phase 3: convergence depth + lateral routing

Pre-release

@dirvine dirvine released this 25 Jun 14:59
· 30 commits to main since this release

⚠️ Pre-release. A research artifact for internal team experimentation, not a stable API. PTG requires a live llama-server and a gated Gemma model.

Phase 3 lands two new mechanisms built around the lateral homogenization signal we surfaced in v0.1.0: the mesh tends to converge toward the dominant interpretation and erase minority frames. v0.2.0 gives you tools to measure that and control it.

What's new since v0.1.0

1. Diversity-preserving lateral routing (--routing-policy) — Phase 3B

Until now every column heard from all its neighbors. That homogenizes: a confident column overwrites its neighbors' opinions. v0.2.0 lets a column listen selectively:

# only hear the 2 most confident neighbors
--routing-policy confidence-top-k --routing-k 2

# diversity mode: hear neighbors that say DIFFERENT things, not just confident ones
--routing-policy diversity --routing-k 2

Diversity mode is the homogenization mitigation. It anchors on the most confident neighbor, then deliberately picks neighbors that disagree (by token overlap) with what it's already heard — so dissident/niche frames survive instead of being voted away. Every routing decision is captured per-tick in tick_outputs.routes (route_weight + confidence per source), so homogenization can now be measured, not just observed.
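The selection rule described above can be sketched as a greedy MMR-style loop over token-Jaccard similarity. This is a minimal illustration, not the PTG implementation: the `Neighbor` struct, the `select_diverse` function, the whitespace tokenizer, and the `lambda` trade-off weight are all assumptions introduced here.

```rust
use std::collections::HashSet;

/// Hypothetical neighbor message; field names are illustrative, not PTG's types.
#[derive(Debug, Clone)]
struct Neighbor {
    id: usize,
    confidence: f64,
    prediction: String,
}

/// Lowercased whitespace tokens as a set (assumed tokenizer).
fn tokens(text: &str) -> HashSet<String> {
    text.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical.
fn jaccard(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 1.0;
    }
    a.intersection(b).count() as f64 / union as f64
}

/// Greedy MMR-style selection of up to `k` neighbor ids.
/// The first pick is the most confident neighbor (nothing has been heard yet,
/// so the similarity penalty is zero). Each later pick maximizes
/// `lambda * confidence - (1 - lambda) * max_similarity_to_already_picked`.
fn select_diverse(neighbors: &[Neighbor], k: usize, lambda: f64) -> Vec<usize> {
    let sets: Vec<HashSet<String>> = neighbors.iter().map(|n| tokens(&n.prediction)).collect();
    let mut picked: Vec<usize> = Vec::new(); // indices into `neighbors`
    while picked.len() < k.min(neighbors.len()) {
        let mut best: Option<(usize, f64)> = None;
        for i in 0..neighbors.len() {
            if picked.contains(&i) {
                continue;
            }
            let max_sim = picked
                .iter()
                .map(|&j| jaccard(&sets[i], &sets[j]))
                .fold(0.0, f64::max);
            let score = lambda * neighbors[i].confidence - (1.0 - lambda) * max_sim;
            if best.map_or(true, |(_, s)| score > s) {
                best = Some((i, score));
            }
        }
        match best {
            Some((i, _)) => picked.push(i),
            None => break,
        }
    }
    picked.iter().map(|&i| neighbors[i].id).collect()
}
```

With `lambda = 1.0` the loop ignores overlap and reduces to plain confidence ranking, which is roughly what `confidence-top-k` describes. Lower values of `lambda` increasingly favor neighbors whose wording differs from what has already been selected.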

2. Prediction-stability convergence (--min-prediction-similarity) — Phase 3A

The mesh used to stop when self-reported confidence said "I'm sure." But a confident model just says 0.95 every tick, so the old signal couldn't tell a settled mesh from a stuck one. v0.2.0 adds a second stop signal based on whether predictions have actually stopped changing (word overlap), independent of confidence:

--min-prediction-similarity 0.85

The CLI now prints which criterion stopped the epoch:

convergence: prediction token-similarity stabilized
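As a rough sketch of what a prediction-stability check could look like: compare each column's prediction with its prediction from the previous tick and stop once the word overlap clears the threshold. The function names, the use of Jaccard overlap on whitespace tokens, and the requirement that every column clears the threshold (rather than, say, the mean) are assumptions made for illustration. The release notes only say the signal is based on word overlap.

```rust
use std::collections::HashSet;

/// Lowercased whitespace tokens as a set (assumed tokenizer).
fn token_set(text: &str) -> HashSet<String> {
    text.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// Token-Jaccard similarity between two predictions, in [0, 1].
/// Two empty predictions count as identical.
fn token_similarity(a: &str, b: &str) -> f64 {
    let (sa, sb) = (token_set(a), token_set(b));
    let union = sa.union(&sb).count();
    if union == 0 {
        return 1.0;
    }
    sa.intersection(&sb).count() as f64 / union as f64
}

/// True when every column's current prediction overlaps its previous-tick
/// prediction by at least `min_similarity`. Confidence is never consulted,
/// so a model that reports 0.95 on every tick cannot trigger this on its own.
fn predictions_stabilized(prev: &[&str], curr: &[&str], min_similarity: f64) -> bool {
    !curr.is_empty()
        && prev.len() == curr.len()
        && prev
            .iter()
            .zip(curr.iter())
            .all(|(p, c)| token_similarity(p, c) >= min_similarity)
}
```

In this sketch, a mesh whose columns keep rewording their answers is reported as unsettled however confident they claim to be. Whether the real check aggregates per column or across the mesh is not stated in these notes.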

Pilot signal (single run — a lead, not a result)

On the operator/automation-fault prompt, the 9-column torus behaved differently under the two routing policies:

  • all: the psych column collapsed to physics "catastrophic failure threshold" language at 0.98 conf
  • diversity --routing-k 2: the psych column retained its own operator frame at 0.92 conf

That's the central Phase 3 hypothesis — topology/routing controls homogenization — showing up on a live run. Do not cite this as a finding; it's one run. It's a lead that needs the A3 scaled benchmark to become evidence.

Quick start (unchanged from v0.1.0)

git clone https://github.com/saorsa-labs/brain && cd brain
git checkout v0.2.0

scripts/start-gemma4-qat.sh        # http://127.0.0.1:18136

cargo run -p ptg-cli --bin ptg -- \
    --vllm-url http://127.0.0.1:18136 --model gemma-4-e2b-qat \
    --column-pack examples/column-packs/abstraction-ladder-9.toml \
    --topology torus --torus-width 3 --torus-height 3 --columns 9 \
    --routing-policy diversity --routing-k 2 \
    --min-ticks 2 --ticks 3 --max-tokens 2048 --temperature 0 \
    --input "<your prompt>"
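For orientation, here is a small sketch of which columns are adjacent on a 3x3 torus like the one configured above. It assumes a row-major layout and a four-neighbor (up, down, left, right) wraparound grid. The notes do not say how PTG numbers columns or which neighborhood it uses, so `torus_neighbors` is a hypothetical helper, not part of the CLI or crates.

```rust
/// Wraparound neighbors (up, down, left, right) of `index` on a
/// `width` x `height` torus, assuming row-major column numbering.
/// Returns an empty list for degenerate grids or out-of-range indices.
fn torus_neighbors(index: usize, width: usize, height: usize) -> Vec<usize> {
    if width == 0 || height == 0 || index >= width * height {
        return Vec::new();
    }
    let (row, col) = (index / width, index % width);
    let up = (row + height - 1) % height;
    let down = (row + 1) % height;
    let left = (col + width - 1) % width;
    let right = (col + 1) % width;
    let mut out = vec![
        up * width + col,
        down * width + col,
        row * width + left,
        row * width + right,
    ];
    out.sort_unstable();
    out.dedup(); // on very small grids the same cell can appear twice
    out.retain(|&n| n != index); // a 1-wide or 1-high grid wraps onto itself
    out
}
```

Under these assumptions every one of the 9 columns has exactly four neighbors, so `--routing-k 2` would mean each column hears half of its neighborhood per tick.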

Three things to try

# A) Baseline: everyone hears everyone (the old behavior)
--topology torus --torus-width 3 --torus-height 3 --columns 9

# B) Same prompt, diversity routing — do niche columns survive?
--topology torus --torus-width 3 --torus-height 3 --columns 9 \
  --routing-policy diversity --routing-k 2

# C) Deliberately ambiguous prompt (jumbled words) — does diversity keep the
#    mesh from collapsing to one reading?

What we'd love to hear back: does diversity routing help on your prompts (more useful, less groupthink) or hurt (columns can't agree)? Your prompts will tell us more than ours did.

What's in this release (cumulative)

  • 5-crate Rust workspace, panic-free, clippy-clean (-D warnings), 95 tests
  • ptg CLI: --topology {default,ring,ring-bi,torus,fully-connected,small-world}, --column-pack, --routing-policy {all,confidence-top-k,diversity}, --routing-k, --min-ticks, --min-prediction-similarity, --max-tokens, --temperature, --dry-run, --probe, multimodal --image-url
  • Convergence: confidence (mean/delta/cosine) + model-independent prediction-stability, with convergence_reason reporting
  • Routing: All / ConfidenceTopK / diversity-preserving (MMR via token-Jaccard), observable per-tick via TickOutputs.routes
  • scripts/start-gemma4-qat.sh portable launcher
  • ptg-bench + ptg-judge binaries
  • Docs: Tutorial, Specification, Architecture, Roadmap, Benchmarking

What's NOT proven yet (being honest)

  • All homogenization/routing observations are single runs, not statistics. The effects were real on the prompts we tried, but there are not enough runs to call them a result.
  • Routing decisions are recorded in the runtime's tick_outputs.routes and will surface in ptg-bench JSON once ptg-bench is extended to emit them. A plain ptg run shows the effect in the predictions, not in a route log.
  • Semantic cosine convergence (§9.3) remains blocked — the live server returns HTTP 501 on /v1/embeddings.

Validation

  • cargo fmt --all --check
  • cargo clippy --all-features --all-targets -- -D warnings -D clippy::panic -D clippy::unwrap_used -D clippy::expect_used ✅ (0 issues)
  • cargo check --workspace --all-targets
  • cargo test ✅ (95 passed, 0 failed)
  • Live-validated against gemma-4-e2b-qat (both new mechanisms, multiple topologies).

License

Dual-licensed under MIT OR Apache-2.0.