Add genome visualization to root README with live video#155

Merged: joelteply merged 4 commits into main from feature/root-readme-genome-video on Nov 5, 2025

@joelteply
Contributor

Summary

Updates the root README to showcase the genome visualization system with a live video demonstration.

Changes:

  • Replace static hero screenshot with continuum-live.mov video
  • Add new "Genome Visualization" section with detailed documentation
  • Include user-interface.png showing the genome panel close-up
  • Document the diamond grid system (Learning/Cloud/RAG/Genome capabilities)

Why:
The genome visualization is a key differentiator: it shows each AI's fundamental identity in real time. Users can see which AIs can learn, where they run, whether they have extended memory, and what specialized capabilities they possess.

Visual Assets:

  • screenshots/continuum-live.mov - 8MB video of live multi-AI collaboration
  • screenshots/user-interface.png - Close-up of genome panel

🔧 Generated with Claude Code

joelteply and others added 2 commits November 4, 2025 20:35
- Add continuum-live.mov video demonstrating real-time multi-AI collaboration
- Add user-interface.png showing genome panel visualization
- Assets ready for root README update

🔧 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Replace static screenshot with continuum-live.mov video demonstration
- Add "Genome Visualization" section showcasing AI identity system
- Document diamond grid showing Learning/Cloud/RAG/Genome capabilities
- Emphasize real-time evolution visualization as personas gain abilities

🔧 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings November 5, 2025 02:36
Contributor

Copilot AI left a comment

Pull Request Overview

This PR updates the README to enhance the visual presentation and add comprehensive documentation for the Genome Visualization feature. The changes replace a static image with a video demonstration and introduce detailed explanations of the AI genome identity system.

  • Replaced static UI image with video demonstration at the top of the README
  • Added new "Genome Visualization" section documenting the AI persona capability display system
  • Updated caption to emphasize real-time genome capabilities

Comment thread README.md Outdated
Comment thread README.md Outdated
joelteply and others added 2 commits November 4, 2025 20:40
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@joelteply joelteply merged commit dfff065 into main Nov 5, 2025
4 checks passed
@joelteply joelteply deleted the feature/root-readme-genome-video branch November 5, 2025 02:42
joelteply added a commit that referenced this pull request Nov 30, 2025
…video

Add genome visualization to root README with live video
joelteply added a commit that referenced this pull request Apr 7, 2026
EXPERIENTIAL-PLASTICITY:
- New §9.5 subsection: gpt2-medium re-run on corrected pipeline
- Reframes §4 transfer function as "early-cycle behavior controller will encounter"
- Documents that the controller's quality-aware stopping criterion makes the
  cycle 9 anomaly structurally unreachable in production
- Adds OUTCOME D framing: closed-loop controller enforces the transfer function
- Adds four-metric comparison subsection with the activation > saliency >> L2 > gradient ranking
- Calls out PLASTICITY-COMPACTION's gradient-magnitude trick as a publication-blocking question

VALIDATED-TENSOR-SURGERY:
- New section: Layer 6, the structural fix that closes the bug class
- Documents the literal 62→7→501 historical bug pattern catch
- New section: The four-metric comparison empirical result
- Hypothesizes why activation alone beats saliency at small calibration
- Recommends activation-magnitude as the default forge pipeline metric

KASH-FEEDBACK.md:
- Appended results from both experiments
- OUTCOME D framing for gpt2 result
- Three observations on the four-metric finding
- Three questions for Kash's read

Refs continuum #841 (gpt2 re-run, partial OUTCOME D)
Refs continuum #842 (Layer 6 invariant, shipped)
Refs continuum #844 (four-metric comparison, surprise result)
Refs sentinel-ai #155 (importance metric finding, now contextualized)
joelteply added a commit that referenced this pull request Apr 8, 2026
* Paper stub: Validated Structured Pruning for Consumer Hardware

Companion to Experiential Plasticity. Documents the layered test harness
for tensor surgery validation. Two real bugs in production pruning code
were caught during harness construction:
1. LoRA-on-pruned-hooks corrupts model on hook removal
2. Config drift after defrag breaks save/load roundtrip

Title, abstract, and section outline. Full paper to follow as Layers 4-6
of the harness are built.

* Paper: bugs table and Layer 4 update

The harness now caught 5 real bugs during construction:
1. LoRA-on-pruned-hooks corruption (FIXED)
2. Config drift after defrag (FIXED)
3. Hybrid attention models (TRACKED)
4. L2 norm importance is unreliable (RESEARCH FINDING)
5. Pruning without retraining is destructive (RESHAPES the experiential
   plasticity narrative — recovery comes from fine-tuning, not pruning)

* Papers: write down 5 findings into both experiential and validated-pruning

Experiential paper:
- New section 9.5: Validation and a Reframing of the Plasticity Story
- Documents LoRA-on-pruned-hooks bug that produced phantom +88% improvements
- Documents L2-norm importance metric finding (anti-correlated with importance)
- Reframes the central claim: recovery comes from fine-tuning, not smart pruning
- Calls for no-prune equal-budget fine-tune baseline as required ablation
- Adds reference to companion paper

Validated pruning paper:
- New section: Findings In Detail (5 findings, each with empirical evidence)
- Each finding has the failure mode, the empirical signature, and the fix
- Frames validation harnesses as required artifacts for pruning papers

* Papers + kash-feedback: gpt2 OUTCOME D, four-metric finding, Layer 6

EXPERIENTIAL-PLASTICITY:
- New §9.5 subsection: gpt2-medium re-run on corrected pipeline
- Reframes §4 transfer function as "early-cycle behavior controller will encounter"
- Documents that the controller's quality-aware stopping criterion makes the
  cycle 9 anomaly structurally unreachable in production
- Adds OUTCOME D framing: closed-loop controller enforces the transfer function
- Adds four-metric comparison subsection with the activation > saliency >> L2 > gradient ranking
- Calls out PLASTICITY-COMPACTION's gradient-magnitude trick as a publication-blocking question

VALIDATED-TENSOR-SURGERY:
- New section: Layer 6, the structural fix that closes the bug class
- Documents the literal 62→7→501 historical bug pattern catch
- New section: The four-metric comparison empirical result
- Hypothesizes why activation alone beats saliency at small calibration
- Recommends activation-magnitude as the default forge pipeline metric

KASH-FEEDBACK.md:
- Appended results from both experiments
- OUTCOME D framing for gpt2 result
- Three observations on the four-metric finding
- Three questions for Kash's read

Refs continuum #841 (gpt2 re-run, partial OUTCOME D)
Refs continuum #842 (Layer 6 invariant, shipped)
Refs continuum #844 (four-metric comparison, surprise result)
Refs sentinel-ai #155 (importance metric finding, now contextualized)

* PLASTICITY-COMPACTION: §4.1.4.2 per-tier negative result + §4.1.5 distillation-first pivot

Three new subsections in §4.1, capturing tonight's negative results across
the dense-model forge branch and proposing the structural reframe that the
empirical work converged on.

§4.1.4.2 — Per-tier Pareto comparison (negative result):
  The aggressive-quantization Pareto test from §4.1.4.1 is now run end-to-end.
  v2-7B forge and unmodified Qwen2.5-Coder-7B base, both quantized to Q5_K_S /
  Q3_K_M / Q2_K via the same llama.cpp toolchain, both evaluated via the
  same patched vLLM-GGUF backend that anchored against Qwen's published
  61.6/53.0 to within +0.6/+0.7 (deterministic across 5+ runs):

    Tier    Size   v2-7B  base   Δ      Winner
    Q5_K_S  5.0G   55.5   63.4   -7.9   base by +14% on quality/vram
    Q3_K_M  3.6G   54.3   59.8   -5.5   base by +10% on quality/vram
    Q2_K    2.9G   42.7   43.3   -0.6   tie within run noise

  Base+quant Pareto-dominates the v2-7B forge at every tier we tested. The
  closest the forge gets to parity is Q2_K (within run noise). By the
  §4.1.4.1 product-relevance criterion, the v2 forge methodology as
  currently constituted does not produce a useful product on the
  Qwen2.5-Coder-7B family.
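The per-tier arithmetic above can be checked with a minimal sketch. It uses only the scores and sizes from the table. The `TierResult` name, the reading of "quality/vram" as score divided by file size (which reduces to a score ratio because both models have the same size at a given tier), and the 0.7-point noise band used to call a tie are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierResult:
    tier: str
    size_gb: float
    forge_score: float  # v2-7B forge at this quantization tier
    base_score: float   # unmodified base at the same tier

    @property
    def delta(self) -> float:
        """Forge minus base, in benchmark points."""
        return round(self.forge_score - self.base_score, 1)

    @property
    def base_advantage_pct(self) -> float:
        """Relative quality-per-GB advantage of the base over the forge."""
        forge_per_gb = self.forge_score / self.size_gb
        base_per_gb = self.base_score / self.size_gb
        return (base_per_gb / forge_per_gb - 1.0) * 100.0


def winner(result: TierResult, noise: float = 0.7) -> str:
    """Call a tie when the gap is inside the (assumed) run-noise band."""
    if abs(result.delta) <= noise:
        return "tie"
    return "base" if result.delta < 0 else "forge"


# Values copied from the table in the commit message.
RESULTS = [
    TierResult("Q5_K_S", 5.0, 55.5, 63.4),
    TierResult("Q3_K_M", 3.6, 54.3, 59.8),
    TierResult("Q2_K", 2.9, 42.7, 43.3),
]

for r in RESULTS:
    print(f"{r.tier:7s} delta={r.delta:+.1f} base advantage={r.base_advantage_pct:.0f}% winner={winner(r)}")
```

Under these assumptions the sketch gives the same winners as the table: base at Q5_K_S and Q3_K_M, a tie at Q2_K.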

  Three independent candidate fixes for the residual gap were ruled out,
  all on the same v2-7B base, each with a disciplined cause-of-the-gap
  comparison:
    1. More cycles + more training: WORSE (54.9 → 46.3)
    2. Held-out-aware code calibration: NO IMPROVEMENT (54.9 → 53.7)
    3. Aggressive quantization: NO (base+quant wins at every tier, this section)

  Each was the leading hypothesis when tested. Each was falsified. Three
  negative results in succession on three independent fix candidates is
  strong empirical evidence that the activation-magnitude head-pruning +
  LoRA-recovery approach does not have a Pareto-improving sweet spot for
  dense base models that already have good quantization options,
  regardless of which knob in the strategy space is tuned.

§4.1.4.3 — Product positioning implications:
  The forge has a defensible value proposition in two non-overlapping
  product positions, and dense Qwen2.5-Coder-7B-class models with good
  Q3/Q2 quantizations are not one of them:

    1. Distillation-first compaction (any base model, dense/MoE/hybrid)
       — see §4.1.5 below
    2. Pre-removable expert pruning + structural compaction (MoE/hybrid/
       oversized targets that base+quant cannot reach at all because the
       base does not fit on the target hardware even at Q2_K)

  The Qwen3.5-35B-A3B (target A) and Qwen3.5-397B-A17B (target B grid
  moonshot) work falls in product position #2. Their value is making
  models reachable that the alternative compaction methods cannot reach,
  not matching a base+quant alternative at the same tier.

  The dense-model forge work is suspended until distillation-first lands.

§4.1.5 — Distillation-first compaction (the next-iteration methodology
proposal):
  The empirical pattern across §4.1.3.1, §4.1.3.2, §4.1.4.1, and §4.1.4.2
  is consistent enough to motivate a structural pivot in the forge's
  primary compaction mechanism, and the substrate work to support that
  pivot has already landed (compensation_lora.py + test_compensation_lora.py
  + COMPENSATION-LORA-DESIGN.md, all committed in a previous PR commit).

  The pivot inverts the v2 dependency: instead of "structured prune +
  LoRA recovery against fine-tuning loss", the new mechanism is "any
  transformation + distill against the unmodified teacher's hidden states".
  Pruning, quantization, modality fusion, context extension all become
  slot-in transformations that the (transform, distill, eval) loop
  recovers from independently. The methodology becomes a search procedure
  over the transformation space, with distillation as the convergence step
  and the per-tier metric as the stopping criterion.
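The (transform, distill, eval) loop described above can be sketched as a plain search procedure. This is an illustration under stated assumptions, not the code in compensation_lora.py: the function name `search_transformations`, the callable signatures, and the toy "model" (a single float standing in for quality) are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Model = TypeVar("Model")


def search_transformations(
    teacher: Model,
    transforms: Iterable[Tuple[str, Callable[[Model], Model]]],
    distill: Callable[[Model, Model], Model],
    evaluate: Callable[[Model], float],
    target_score: float,
) -> Optional[Tuple[str, float]]:
    """Apply each transformation, distill against the unmodified teacher,
    evaluate, and stop at the first candidate that meets the target."""
    best: Optional[Tuple[str, float]] = None
    for name, transform in transforms:
        student = transform(teacher)          # prune, quantize, ...
        student = distill(student, teacher)   # convergence step
        score = evaluate(student)             # per-tier metric
        if best is None or score > best[1]:
            best = (name, score)
        if score >= target_score:             # stopping criterion
            break
    return best


# Toy stand-ins: a "model" is just a quality number in [0, 1].
toy_transforms = [
    ("prune", lambda m: m * 0.5),
    ("quantize", lambda m: m * 0.8),
]
toy_distill = lambda student, teacher: student + 0.5 * (teacher - student)
toy_evaluate = lambda m: m

best = search_transformations(1.0, toy_transforms, toy_distill, toy_evaluate, target_score=0.85)
print(best)
```

In the toy run, pruning recovers to 0.75 and misses the target, while quantization recovers to about 0.9 and ends the search. The point of the design is that the transformation is a slot-in argument; the recovery and evaluation steps do not change with it.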

  The smoke test on distilgpt2 passed all five stability checks
  (tokenizer alignment, hidden-state magnitudes within 2× across layers,
  loss decreasing monotonically by 39.35% relative, per-layer losses
  balanced within 3.14× of median, no NaN/inf). The math is sound at small
  scale; production scale-up to 7B is unblocked by RUNNING the script, not
  by writing more code.
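The five stability checks named above could be expressed roughly as follows. This is a sketch, not the contents of test_compensation_lora.py. It assumes one reading of "within 2× across layers" (largest per-layer norm at most twice the smallest), and the 4.0 balance threshold is a hypothetical bound; the source reports only the measured 3.14× figure, not the limit it was checked against. All function names are invented for illustration.

```python
import math
from statistics import median
from typing import Dict, Sequence


def tokenizers_aligned(teacher_vocab: Dict[str, int], student_vocab: Dict[str, int]) -> bool:
    """Teacher and student must map every token to the same id."""
    return teacher_vocab == student_vocab


def magnitudes_within(layer_norms: Sequence[float], max_ratio: float = 2.0) -> bool:
    """Largest per-layer hidden-state norm is at most max_ratio times the smallest."""
    return min(layer_norms) > 0 and max(layer_norms) / min(layer_norms) <= max_ratio


def loss_monotonic(losses: Sequence[float]) -> bool:
    """Every logged loss is strictly lower than the one before it."""
    return all(later < earlier for earlier, later in zip(losses, losses[1:]))


def relative_change_pct(losses: Sequence[float]) -> float:
    """Change from first to last logged loss, as a percentage of the first."""
    return (losses[-1] - losses[0]) / losses[0] * 100.0


def layers_balanced(layer_losses: Sequence[float], max_ratio: float = 4.0) -> bool:
    """Every per-layer loss lies within max_ratio of the median (assumed threshold)."""
    mid = median(layer_losses)
    return all(mid / max_ratio <= x <= mid * max_ratio for x in layer_losses)


def all_finite(values: Sequence[float]) -> bool:
    """No NaN or inf anywhere in the logged values."""
    return all(math.isfinite(v) for v in values)


def run_stability_checks(teacher_vocab, student_vocab, layer_norms, losses, layer_losses):
    return {
        "tokenizer_alignment": tokenizers_aligned(teacher_vocab, student_vocab),
        "hidden_state_magnitudes": magnitudes_within(layer_norms),
        "loss_monotonic": loss_monotonic(losses),
        "per_layer_balance": layers_balanced(layer_losses),
        "finite": all_finite(list(losses) + list(layer_losses) + list(layer_norms)),
    }


# Made-up numbers purely to exercise the checks.
vocab = {"a": 0, "b": 1}
report = run_stability_checks(vocab, dict(vocab), [10.0, 14.0, 18.0],
                              [2.0, 1.6, 1.3, 1.2], [0.5, 0.4, 1.2])
print(report, f"{relative_change_pct([2.0, 1.6, 1.3, 1.2]):.2f}%")
```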

  §4.1.5 results paragraph queued: take the v2-7B artifact from row 5,
  apply compensation LoRA with Qwen2.5-Coder-7B base as teacher and a
  held-out-aware calibration mixture, measure HumanEval through the
  same calibrated pipeline. Success criterion: HumanEval pass@1 ≥ 58.0
  (a 3-point improvement, just outside the calibration tolerance band).
  If at-or-above: distillation-first is empirically validated, dense-model
  forge branch unfreezes, §4.1.4 row 5 gets a successor row. If below:
  follow the failure escalation path documented in COMPENSATION-LORA-DESIGN.md
  to the cross-layer skip path.

  Independence from the moonshot work is explicit: distillation-first is
  the dense-model branch's path forward, MoE/hybrid expert pruning is a
  separate substrate path, the two branches advance in parallel.

Plus durability adds:
- docs/hf-deprecation-notices/qwen2.5-coder-14b-compacted-DEPRECATION.md
  (the user-facing model card replacement for the v1 deprecated artifact)
- docs/papers/NEUROPLASTIC-SUBSTRATE.md
  (the architectural-thesis paper from earlier in the work session)

References:
- sentinel-ai PR #161 (the substrate work that produced these results)
- sentinel-ai#160 / #161 / #163 / #164 / #165 (the issues these subsections
  document the resolution paths for)
- Qwen2.5-Coder Technical Report Table 5 (the anchor for the per-tier
  measurements)

* PLASTICITY-COMPACTION §4.1.3.4 + forge template architecture rule

§4.1.3.4: The importance-metric calibration lesson generalizes across
structural unit (heads → experts).

Empirical anchor: continuum-ai/qwen3-coder-30b-a3b-compacted-19b-256k v1
(alloy hash aa61c4bdf463847c). Hardware-measured 88.4 HumanEval / 86.0
HumanEval+ against unmodified Qwen3-Coder-30B-A3B-Instruct base anchor
at 92.1 / 89.0, both on RTX 5090 + llama.cpp Q5_K_M in the same eval
pipeline. The artifact carries the router-gate-L2-norm baseline (78.7
HumanEval) in priorMetricBaselines[] as the negative-baseline empirical
control that makes the §4.1.3.4 claim falsifiable from the published
artifact alone.

The structural lesson: the metric-calibration pattern from §4.1.3.1
(dense head pruning) recurs at the MoE expert level. Router-gate-L2-norm
is the architectural-only equivalent of activation-magnitude-only head
importance, and replacing it with calibration-aware activation counts
on a held-out code corpus closes +9.7 HumanEval points on the same
prune budget, no fine-tuning, no compensation. ANY importance metric
for ANY prunable unit (heads, experts, layers, future structural units)
must be derived from task-conditioned activation profiling on a held-
out corpus that reflects the artifact's intended workload. The two
data points form a methodology curve, not a single anomaly.
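A minimal sketch of what "calibration-aware activation counts" could look like for expert pruning: count how often the router selects each expert while running a held-out corpus, then prune the least-used experts up to the budget. The trace format (one list of selected expert ids per token) and both function names are assumptions for illustration; the source does not describe the actual implementation.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def expert_activation_counts(routing_trace: Iterable[Sequence[int]], num_experts: int) -> List[int]:
    """Count router selections per expert over a held-out calibration corpus.

    routing_trace yields, for each token, the expert ids the router picked.
    """
    counts: Counter = Counter()
    for selected in routing_trace:
        counts.update(selected)
    return [counts.get(expert, 0) for expert in range(num_experts)]


def experts_to_prune(counts: Sequence[int], prune_budget: int) -> List[int]:
    """Pick the prune_budget least-activated experts (ties broken by id)."""
    ranked = sorted(range(len(counts)), key=lambda e: (counts[e], e))
    return sorted(ranked[:prune_budget])


# Toy trace: 4 experts, top-2 routing, 4 tokens.
trace = [[0, 1], [0, 2], [0, 1], [1, 2]]
counts = expert_activation_counts(trace, num_experts=4)
print(counts, experts_to_prune(counts, prune_budget=2))
```

The contrast with a router-gate-L2-norm metric is that nothing here looks at weights: importance comes entirely from how the calibration workload actually exercises the experts.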

The compensation v2 step (KL distillation on top of the calibration-
aware student to push from 88.4 → projected 90+) is currently blocked
on transformers' caching_allocator_warmup pre-allocating an fp16 buffer
equal to full model size before bnb 4-bit quantization takes effect,
exceeding total VRAM on a single 32 GB GPU even with both teacher and
student nominally 4-bit. The architecturally correct fix is offline
teacher-logit precomputation (phase 1: load teacher alone, dump
logits; phase 2: unload; phase 3: load student alone, train against
on-disk logits). This is the next sentinel-ai sprint and is documented
in §4.1.3.4's "next experimental wave" paragraph.
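The three-phase offline teacher-logit plan above can be sketched with standard-library Python only. Everything here is a stand-in: the loader callables, the JSON-per-batch storage layout, and the `train_step` signature are hypothetical, and a real run would use tensors and an optimizer rather than lists of floats. What the sketch shows is the ordering: the teacher is gone before the student is loaded, so the two never share memory.

```python
import json
import math
import tempfile
from pathlib import Path
from typing import Callable, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher_logits: Sequence[float], student_logits: Sequence[float]) -> float:
    """KL(teacher || student) over the softmax distributions."""
    p, q = softmax(teacher_logits), softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def dump_teacher_logits(load_teacher: Callable, batches, out_dir: Path) -> None:
    """Phase 1: load the teacher alone and write its logits to disk."""
    teacher = load_teacher()
    for i, batch in enumerate(batches):
        (out_dir / f"batch_{i:05d}.json").write_text(json.dumps(teacher(batch)))
    # Phase 2: drop the only reference so the teacher can be freed before
    # the student loads. A real run would also release accelerator memory.
    del teacher


def distill_from_disk(load_student: Callable, train_step: Callable, batches, logits_dir: Path) -> List[float]:
    """Phase 3: load the student alone and train against the on-disk logits."""
    student = load_student()
    losses = []
    for i, batch in enumerate(batches):
        teacher_logits = json.loads((logits_dir / f"batch_{i:05d}.json").read_text())
        losses.append(train_step(student, batch, teacher_logits))
    return losses


def demo() -> List[float]:
    batches = [[1.0, 2.0, 3.0], [0.5, 0.5, 2.0]]
    toy_teacher = lambda: (lambda batch: [float(x) for x in batch])
    toy_student = lambda: {"scale": 0.5}
    # Toy "training step": only measures the loss, does not update weights.
    step = lambda s, batch, t_logits: kl_divergence(t_logits, [s["scale"] * x for x in batch])
    with tempfile.TemporaryDirectory() as tmp:
        dump_teacher_logits(toy_teacher, batches, Path(tmp))
        return distill_from_disk(toy_student, step, batches, Path(tmp))


print(demo())
```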

CLAUDE.md: Forge Template Architecture rule.

The qwen3-coder-30b-a3b-compacted-19b-256k v1 publish required ~6
manual edits to fix paper-speak hallucination, naming conventions,
tag overflow, headline subtitle bugs, and benchmark renderer
fallthrough — every one a manual touch on hand-authored prose. The
architectural target going forward is: all the fields a forge run
needs to populate an alloy MUST live as Continuum entity data inside
a ForgeRecipe entity. The forge takes the recipe entity as input,
runs the prune/quant/eval stages, and emits the populated alloy as
OUTPUT. The forge never consumes a hand-authored alloy; the foundry
generates it. Recipe entity carries the prose fields the model card
renders (description, userSummary, tags, methodologyPaperUrl,
limitations[]) plus the source/stages/calibration/quant tier
configuration. ForgeArtifact entity is the recipe + the eval results.
publish_model.py reads the ForgeArtifact, not a hand-authored alloy
file. This is the next sprint after the offline-logits architecture.
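The recipe-in, artifact-out flow described in this rule could be modeled along these lines. The field names come from the commit message; the Python dataclass shape, `quantTier` as a single string, `run_forge`, and `render_model_card` are assumptions for illustration only. The real entities would live in Continuum's own entity system, and the prose values below are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ForgeRecipe:
    # Prose fields the model card renders.
    description: str
    userSummary: str
    tags: List[str]
    methodologyPaperUrl: str
    limitations: List[str]
    # Forge configuration.
    source: str
    stages: List[str]
    calibration: str
    quantTier: str


@dataclass
class ForgeArtifact:
    """The recipe plus the eval results."""
    recipe: ForgeRecipe
    evalResults: Dict[str, float]


def run_forge(recipe: ForgeRecipe, evaluate: Callable[[ForgeRecipe], Dict[str, float]]) -> ForgeArtifact:
    """The forge consumes a recipe and emits the populated artifact as output."""
    return ForgeArtifact(recipe=recipe, evalResults=evaluate(recipe))


def render_model_card(artifact: ForgeArtifact) -> str:
    """Publishing reads the artifact, never a hand-authored file."""
    r = artifact.recipe
    lines = [r.description, "", r.userSummary, "", "Tags: " + ", ".join(r.tags), "", "Results:"]
    lines += [f"- {name}: {score}" for name, score in artifact.evalResults.items()]
    lines += ["", "Limitations:"] + [f"- {item}" for item in r.limitations]
    return "\n".join(lines)


recipe = ForgeRecipe(
    description="<placeholder description>",
    userSummary="<placeholder summary>",
    tags=["moe", "pruned"],
    methodologyPaperUrl="<placeholder URL>",
    limitations=["<placeholder limitation>"],
    source="Qwen3-Coder-30B-A3B-Instruct",
    stages=["prune", "quant", "eval"],
    calibration="held-out code corpus",
    quantTier="Q5_K_M",
)
# Scores quoted from the commit message, not computed here.
artifact = run_forge(recipe, lambda _recipe: {"HumanEval": 88.4, "HumanEval+": 86.0})
card = render_model_card(artifact)
print(card)
```

Because every rendered string originates in the recipe or the eval results, fixing a model card means fixing entity data and re-running, not editing prose by hand.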

2 participants