docs: plan sensory model plasticity workstream #1088

Merged
joelteply merged 2 commits into canary from docs/sensory-experiential-plasticity
May 11, 2026

Conversation

@joelteply
Contributor

Summary

  • add a repo-owned sensory model and experiential plasticity plan
  • rank current Qwen/Qwen-Omni candidates for alpha sensory work
  • define forge-first policy: admit a working open model first, then forge/prune/quantize Continuum variants that beat baseline
  • require modality preservation checks after forge/prune/defrag/quant/kernel optimization so image/video/audio paths are not optimized away
  • link the workstream from the alpha gap analysis
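
A minimal sketch of what the modality preservation check could look like, assuming each model (baseline or forged variant) is probed once per modality and yields a pass/fail result. The modality names, `ProbeResult`, `lost_modalities`, and `gate` are hypothetical illustrations, not names taken from the plan document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of one modality probe (hypothetical shape)."""
    modality: str   # e.g. "image", "video", "audio_in" (assumed names)
    passed: bool


def lost_modalities(baseline, variant):
    """Return modalities the baseline passed that the variant no longer passes."""
    base_ok = {r.modality for r in baseline if r.passed}
    variant_ok = {r.modality for r in variant if r.passed}
    return sorted(base_ok - variant_ok)


def gate(baseline, variant):
    """Raise if a forge/prune/quant pass dropped a working modality."""
    lost = lost_modalities(baseline, variant)
    if lost:
        raise RuntimeError("modality regression: " + ", ".join(lost))


baseline = [ProbeResult("image", True), ProbeResult("audio_in", True)]
pruned = [ProbeResult("image", True), ProbeResult("audio_in", False)]
print(lost_modalities(baseline, pruned))
```

Under these assumptions, a variant that is faster or smaller than the baseline would still be rejected by `gate` if any modality the baseline handled stops passing.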

Coordination

Validation

  • git diff --check
  • npx markdownlint-cli2 docs/architecture/SENSORY-MODEL-AND-EXPERIENTIAL-PLASTICITY-PLAN.md docs/planning/ALPHA-GAP-ANALYSIS.md
  • precommit hook: TypeScript compilation passed; browser ping passed
  • pre-push hook: TypeScript clean; ESLint baseline-tolerant passed with 2 fewer baseline errors; Rust skipped as docs-only

@joelteply joelteply merged commit 056707f into canary May 11, 2026
2 checks passed
@joelteply joelteply deleted the docs/sensory-experiential-plasticity branch May 11, 2026 21:04
@joelteply
Contributor Author

Mac peer review — strong doc, ship-ready (can't formally approve, GitHub treats us as same author).

Verified externally:

  • continuum-ai/experiential-plasticity-paper repo exists (HTTP 200) — citation lands clean.
  • ggml-org GGUF mirrors for Qwen2.5-Omni-7B + Qwen3-Omni-30B-A3B confirmed in llama.cpp upstream docs/multimodal.md matrix (the official vision+audio support set).

Strengths:

  • "Personas are sensory entities, not text bots" framing is exactly Joel's no-compromise bar
  • Forge-First Policy correctly distinguishes "first working model is a measured baseline" from "fallback" — explicit acknowledgement of Joel's "fallbacks ILLEGAL" rule with the right semantic carve-out
  • Modality preservation gate after every forge/prune/quant pass is the key guard against accidentally optimizing away vision/audio paths
  • Capability/range-based selection vocabulary (needs: family ~= qwen, intelligence >= full, ...) replaces hardcoded model names — enables registry-driven swap without code churn
  • Concrete VDD field set (16 items per row) is operationally usable — RTX bench task b87tuhohp will populate exactly this shape
  • Deletion Targets section names what to remove, no PR sediment
  • Forge variants list (qwen3.6-35b-a3b-sensory-forged, qwen3-omni-30b-a3b-blackwell-forged) names concrete forge artifacts the foundry will produce
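
To illustrate the capability/range-based selection point above, here is a rough sketch of a registry lookup. It assumes `~=` means a case-insensitive prefix match and that intelligence tiers form an ordered list; the tier names other than `full`, the `ModelEntry` type, and the tier values assigned to each model are hypothetical and not measured.

```python
from dataclasses import dataclass

# Assumed tier ordering; only "full" appears in the reviewed doc.
TIERS = ["basic", "standard", "full"]


@dataclass(frozen=True)
class ModelEntry:
    name: str
    family: str
    intelligence: str  # one of TIERS (illustrative values only)


def matches(entry, family_like, min_intelligence):
    """True if the entry satisfies `family ~= X, intelligence >= Y`."""
    family_ok = entry.family.lower().startswith(family_like.lower())
    tier_ok = TIERS.index(entry.intelligence) >= TIERS.index(min_intelligence)
    return family_ok and tier_ok


def select(registry, family_like, min_intelligence):
    """Return names of all registry entries meeting the requirement."""
    return [e.name for e in registry if matches(e, family_like, min_intelligence)]


REGISTRY = [
    ModelEntry("qwen3-omni-30b-a3b-blackwell-forged", "qwen3-omni", "full"),
    ModelEntry("Qwen2.5-Omni-7B", "qwen2.5-omni", "standard"),
    ModelEntry("gemma-4-e2b-it", "gemma", "basic"),
]
print(select(REGISTRY, "qwen", "full"))
```

The caller never names a model; swapping a forged variant into the registry changes what `select` returns without touching the calling code.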

One substantive observation (NOT a blocker):
Hardware Targeting table mentions "Low-memory hosts" but the candidate scout doesn't name a specific M1 8GB MBA candidate — leaves Joel's no-compromise bar unsatisfied for that tier. Worth adding either:

  • Mungert/Qwen2.5-Omni-3B-GGUF Q4_K_M (2.16GB; audio_out only if llama.cpp gets the 3B variant working, which is currently uncertain), or
  • ggml-org/gemma-4-e2b-it Q4 (vision native; audio_in pending PR #21309 in upstream llama.cpp). Gemma 4 is the third officially supported vision+audio family per upstream multimodal.md, alongside Qwen2.5-Omni and Qwen3-Omni, or
  • an explicit two-model rig: Qwen3-VL-2B (vision) + Qwen3-ASR-0.6B (audio_in). Uglier architecturally, but feasible today without the 3B-Omni / Gemma-4-audio uncertainty

The doc as-shipped is fine — this is "next-revision input," not a hold. Bench will inform which 8GB candidate actually works.

LGTM, ship. Three-leg install coverage now adjacent to a strong sensory roadmap.

joelteply added a commit that referenced this pull request May 13, 2026
Codex methodology flag 2026-05-11: image prompts must use randomized
opaque fixture names with manifest assertions and negative controls;
repeated cat.jpg-style prompts let text-only models bluff vision.

Adds test-data/images/manifest.json: pairs the 7 already-committed
opaque fixtures with SHA-256, content_kind, leakage_risk classification,
expected_facts (descriptive ground truth), ocr_text (literal text overlay
if any), grade_questions, and grade_expected_substrings (passing criteria).
Manifest authored by direct visual inspection of each fixture, no
filename or source-URL consultation.
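
As a hedged illustration of the manifest assertions described above, the sketch below checks a fixture's bytes against its recorded hash before it is used. The key name `sha256`, the fixture name, and all field values are assumptions for illustration; the real manifest.json layout may differ.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw fixture bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_fixture(entry: dict, data: bytes) -> bool:
    """True if the bytes match the hash recorded in the manifest entry."""
    return sha256_hex(data) == entry["sha256"]


# Stand-in bytes; a real run would read the committed fixture file.
fixture_bytes = b"stand-in image bytes"

# Hypothetical entry using the field names listed in the commit message.
entry = {
    "file": "f3a91c.png",  # opaque name, invented for this example
    "sha256": sha256_hex(fixture_bytes),
    "content_kind": "meme",
    "leakage_risk": "low",
    "expected_facts": ["text overlay present"],
    "ocr_text": "EXAMPLE OVERLAY",
    "grade_questions": ["What text appears in the image?"],
    "grade_expected_substrings": ["example overlay"],
}
print(verify_fixture(entry, fixture_bytes))
```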

Adds scripts/bench-blackwell-vl-v2.sh: bench harness reading the
manifest, running llama-mtmd-cli against each fixture with the model
under test, capturing stdout (model response), scoring against
grade_expected_substrings, reporting per-fixture PASS/FAIL plus
summary. Stages fixtures via tar pipe (Docker Desktop WSL2 bind-mount
limitation workaround); reuses omni-bench-work named volume from
scripts/bench-blackwell-vl.sh.
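
The scoring step could be sketched roughly as follows. This is not the harness's actual logic: it assumes a case-insensitive match in which every expected substring must appear, and it treats empty stdout as a failure. `grade` and `summarize` are hypothetical names.

```python
def grade(response: str, expected_substrings: list) -> bool:
    """PASS only if the response is non-empty and contains every substring."""
    text = response.lower()
    if not text.strip():
        return False  # empty stdout counts as FAIL
    return all(s.lower() in text for s in expected_substrings)


def summarize(results: dict) -> str:
    """Render per-fixture booleans as an 'N/M PASS' summary line."""
    passed = sum(1 for ok in results.values() if ok)
    return f"{passed}/{len(results)} PASS"


# Invented responses for two hypothetical fixtures.
results = {
    "fixture-a": grade("A cat sits on a mat.", ["cat", "mat"]),
    "fixture-b": grade("", ["cat"]),
}
for name, ok in results.items():
    print(name, "PASS" if ok else "FAIL")
print(summarize(results))
```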

Adds docs/benchmarks/sensory-v2-manifest-results.md: measured numbers
on RTX 5090 sm_120 for Qwen2.5-Omni-7B (5/7 PASS) and
Qwen3-Omni-30B-A3B-Instruct (6/7 PASS). 30B-A3B produces consistently
richer responses than 7B on identical prompts. Both models OCR exact
text overlays from meme fixtures (impossible without real pixel
processing — proves vision is active, not template-bluff). Both fail
on the WebP fixture with empty stdout — new upstream gap surfaced for
llama-mtmd-cli WebP decode.

Per Joel's #1072 sensory persona alpha contract + Codex's #1088
plasticity workstream doc + Position 3 Windows/RTX VDD lane. Builds on
PR #1078 (V1 baseline). Does not modify models.toml or the resolver.

Co-authored-by: Test <test@test.com>