Oak Dev-inter edited this page Apr 23, 2026 · 1 revision

v0.8.0 — Tier Cascade + Local Embeddings

Released: 2026-04-15

Theme: Route tasks across tiers within one mission. Add semantic search over project knowledge.

TL;DR

One mission is rarely all-hard or all-easy. v0.8.0 introduces Tier Cascade: Opus plans, Sonnet writes, Haiku greps. Each step runs on the cheapest tier that can do it competently. A new local embedding layer (sentence-transformers) gives /kasi-search semantic queries over .kasidit/knowledge/ without phoning home.

Why

The preceding release (v0.7.4) locked tier behavior at session start: whichever model the user opened with drove the entire mission. That spent Opus budget on mechanical work and left Haiku to handle tasks that require reasoning. v0.8.0 decomposes missions into sub-tasks routed by reasoning requirement:

  • Plan (what to do, why, in what order) → Opus
  • Work (write code, follow plan) → Sonnet
  • Grep / scan / list (mechanical lookups, audits by checklist) → Haiku

This matches what was observed on SWE-bench: Opus wastes budget on grep calls; Haiku hallucinates on architecture.
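The routing rule above can be pictured as a small lookup table. This is only a sketch: the `Tier` enum, the `ROUTES` table, and the task-kind strings are hypothetical names chosen to mirror the three bullets, not identifiers from the plugin.

```python
from enum import Enum


class Tier(Enum):
    OPUS = "opus"
    SONNET = "sonnet"
    HAIKU = "haiku"


# Hypothetical task kinds, mirroring the Plan / Work / Grep-scan-list bullets.
ROUTES = {
    "plan": Tier.OPUS,
    "work": Tier.SONNET,
    "grep": Tier.HAIKU,
    "scan": Tier.HAIKU,
    "list": Tier.HAIKU,
}


def route(task_kind: str) -> Tier:
    """Return the tier for a task kind; unknown kinds are rejected, not guessed."""
    try:
        return ROUTES[task_kind]
    except KeyError:
        raise ValueError(f"no route defined for task kind {task_kind!r}") from None


if __name__ == "__main__":
    for kind in ("plan", "work", "grep"):
        print(kind, "->", route(kind).value)
```

Rejecting unknown kinds is a choice made for this sketch; the notes do not say how unclassified sub-tasks are handled.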

What's new

Tier Cascade orchestration

A mission that enters Tier Cascade mode follows this sequence:

Opus: mission → plan (file list, step sequence, trade-offs)
  ↓ dispatch brief to Sonnet
Sonnet: plan → implementation (code, tests, docs)
  ↓ dispatch brief to Haiku (optional)
Haiku: implementation → audit (checklist pass, grep verification)
  ↓
Opus: synthesize → report

Each dispatch uses the same brief format later formalized in v0.9.1 (MISSION / INPUTS / CONSTRAINTS / EXPECTED OUTPUT / PRIOR CONTEXT).
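To make the five-section brief concrete, here is a minimal sketch of how one might be represented and rendered. Only the section names come from these notes; the `Brief` class, its field types, and the plain-text layout are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical container for one dispatch brief."""
    mission: str
    inputs: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    expected_output: str = ""
    prior_context: str = ""

    def render(self) -> str:
        """Render the brief as plain text, one headed section per field."""
        def bullets(items):
            return "\n".join(f"- {item}" for item in items) or "- (none)"

        sections = [
            ("MISSION", self.mission),
            ("INPUTS", bullets(self.inputs)),
            ("CONSTRAINTS", bullets(self.constraints)),
            ("EXPECTED OUTPUT", self.expected_output or "(none)"),
            ("PRIOR CONTEXT", self.prior_context or "(none)"),
        ]
        return "\n\n".join(f"{name}\n{body}" for name, body in sections)


if __name__ == "__main__":
    brief = Brief(
        mission="Implement step 2 of the plan",
        inputs=["plan.md"],
        constraints=["Do not change public function names"],
        expected_output="A patch plus passing tests",
    )
    print(brief.render())
```

Keeping the sections in a fixed order means every tier receives the same shape of brief, whichever tier wrote it.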

/kasi-cascade command

Explicit entry point for cascade orchestration. Invoke it when a mission is large enough to benefit from multi-tier routing; small missions stay single-tier.

Local embedding layer

A sentence-transformers model (default: all-MiniLM-L6-v2, 384-dim) runs on the user's machine. It embeds every markdown file under .kasidit/knowledge/ and indexes the results. Queries are local: no API calls, and no data leaves the machine.

Entry point: /kasi-search <query> returns ranked snippets.
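Before files can be embedded and returned as snippets, they have to be cut into pieces that each point back at a location. The sketch below shows one possible way, using fixed-size line windows. The `chunk_markdown` function and the 20-line window are assumptions, not the plugin's actual chunking scheme. In a real setup each chunk's text would then go to the embedding model, for example through `SentenceTransformer.encode` from the sentence-transformers library.

```python
from pathlib import Path


def chunk_markdown(root: Path, lines_per_chunk: int = 20):
    """Yield (path, first_line, last_line, text) for each chunk of every .md file under root.

    Line numbers are 1-indexed and inclusive. Chunks that are only whitespace are skipped.
    """
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for start in range(0, len(lines), lines_per_chunk):
            block = lines[start:start + lines_per_chunk]
            text = "\n".join(block).strip()
            if text:
                yield path, start + 1, start + len(block), text


if __name__ == "__main__":
    for path, first, last, _ in chunk_markdown(Path(".kasidit/knowledge")):
        print(f"{path}:{first}-{last}")
```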

/kasi-search command

/kasi-search "how did we handle CSRF last time"

Returns top-k semantic matches from .kasidit/knowledge/ with file path, line range, and similarity score. Intended for "we solved this before somewhere" recall.
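Ranking by similarity score is commonly done with cosine similarity between the query vector and each indexed vector. The sketch below assumes that metric and a flat in-memory index; neither is confirmed by these notes. The `top_k` function and the three-dimensional toy vectors are illustrative only, and real vectors from the default model would have 384 dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=3):
    """Rank index entries against the query.

    index is a list of (path, line_range, vector) tuples.
    Returns up to k (score, path, line_range) tuples, best match first.
    """
    scored = [(cosine(query_vec, vec), path, span) for path, span, vec in index]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:k]


if __name__ == "__main__":
    # Toy index with made-up file names and vectors.
    index = [
        ("auth.md", (1, 20), [1.0, 0.0, 0.0]),
        ("deploy.md", (21, 40), [0.0, 1.0, 0.0]),
        ("csrf.md", (1, 12), [0.9, 0.1, 0.0]),
    ]
    for score, path, (first, last) in top_k([1.0, 0.0, 0.0], index, k=2):
        print(f"{score:.3f}  {path}:{first}-{last}")
```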

What changed vs v0.7.4

  • Added: Tier Cascade orchestration as a formal pattern
  • Added: /kasi-cascade and /kasi-search commands
  • Added: local embedding layer under plugins/kasidit/embedding/
  • Refined: dispatch brief format (finalized in v0.9.1)

Breaking changes

None. Tier Cascade is opt-in via /kasi-cascade; /kasi-search requires embedding setup but does not break anything else.

Migration

The first run of /kasi-search downloads the embedding model (~80 MB) and indexes .kasidit/knowledge/. After that, queries reuse the local model and index, so they skip the download and indexing steps.

Tier Cascade requires access to multiple tiers in the current environment. In single-tier setups (e.g. Haiku-only), /kasi-cascade falls back to single-tier behavior.
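The fallback can be sketched as a function that maps cascade steps to tiers from whatever tiers are available. The step names, the `CASCADE` table, and `assign_tiers` are hypothetical. The notes describe only the full-cascade and single-tier cases, so this sketch refuses partial tier sets instead of guessing a policy for them.

```python
# Step -> preferred tier, following the cascade order described in these notes.
CASCADE = {
    "plan": "opus",
    "implement": "sonnet",
    "audit": "haiku",
    "synthesize": "opus",
}


def assign_tiers(available):
    """Map each cascade step to a tier, falling back to one tier when only one exists."""
    available = set(available)
    if not available:
        raise ValueError("no tiers available")
    if len(available) == 1:
        # Single-tier setup: every step runs on the only tier there is.
        only = next(iter(available))
        return {step: only for step in CASCADE}
    missing = set(CASCADE.values()) - available
    if missing:
        # Behavior with a partial tier set is not specified in the notes.
        raise ValueError(f"partial tier sets are not covered by this sketch: missing {sorted(missing)}")
    return dict(CASCADE)


if __name__ == "__main__":
    print(assign_tiers({"haiku"}))
    print(assign_tiers({"opus", "sonnet", "haiku"}))
```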

Known limitations

  • Embedding model is English-dominant. Thai / mixed-language content works but ranks lower.
  • Index is not watched — re-run /kasi-search --rebuild after large knowledge updates.
  • Cascade dispatch requires the harness to support multi-model routing in one session.
