Optimization Passes

Sasha Lopashev edited this page Jun 27, 2026 · 1 revision

Migaki optimization is a sequence of transformations over mIR.

Each pass should:

  • declare its name and version,
  • list required input capabilities,
  • return an updated plan,
  • emit a plan diff,
  • emit evidence,
  • emit warnings when assumptions are uncertain.
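
// Contract that every optimization pass implements.
interface MigakiPass {
  name: string
  version: string
  // Capabilities the incoming plan must already provide.
  inputCapabilities?: string[]
  // Capabilities the returned plan provides.
  outputCapabilities?: string[]
  apply(plan: MIRPlan, context: PassContext): Promise<PassResult>
}

// Everything a pass returns from apply().
type PassResult = {
  plan: MIRPlan         // updated plan
  diff: MIRPlanDiff     // what changed relative to the input plan
  evidence: Evidence[]  // support for the changes made
  warnings: Warning[]   // raised when assumptions are uncertain
}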
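
The contract above could be driven by a small runner. The following is a minimal sketch, not the Migaki implementation: `SketchPlan`, `SketchResult`, `SketchPass`, `runPasses`, and `dropEmptyBlocks` are hypothetical names, the plan shape is a placeholder because this page does not define `MIRPlan`, and the `PassContext` argument is omitted. The capability handling (skip a pass with a warning when an input capability is missing, add `outputCapabilities` after a successful run) is an assumption.

```typescript
// Placeholder shapes: the real mIR types are not defined on this page.
type SketchPlan = { blocks: string[]; capabilities: string[] };
type SketchResult = {
  plan: SketchPlan;
  diff: string[]; // human-readable change notes stand in for MIRPlanDiff
  evidence: string[];
  warnings: string[];
};

interface SketchPass {
  name: string;
  version: string;
  inputCapabilities?: string[];
  outputCapabilities?: string[];
  apply(plan: SketchPlan): Promise<SketchResult>;
}

// Apply passes in order, accumulating diffs, evidence, and warnings.
async function runPasses(plan: SketchPlan, passes: SketchPass[]): Promise<SketchResult> {
  const total: SketchResult = { plan, diff: [], evidence: [], warnings: [] };
  for (const pass of passes) {
    const available = new Set(total.plan.capabilities);
    const missing = (pass.inputCapabilities ?? []).filter((cap) => !available.has(cap));
    if (missing.length > 0) {
      // Assumed policy: skip the pass and say why, rather than fail the run.
      total.warnings.push(`${pass.name}@${pass.version} skipped: missing ${missing.join(", ")}`);
      continue;
    }
    const result = await pass.apply(total.plan);
    // Assumed policy: a successful pass adds its output capabilities to the plan.
    const caps = new Set(result.plan.capabilities);
    for (const cap of pass.outputCapabilities ?? []) caps.add(cap);
    total.plan = { ...result.plan, capabilities: Array.from(caps) };
    total.diff.push(...result.diff);
    total.evidence.push(...result.evidence);
    total.warnings.push(...result.warnings);
  }
  return total;
}

// Example pass: drops empty blocks and records what it did.
const dropEmptyBlocks: SketchPass = {
  name: "drop-empty-blocks",
  version: "0.0.1",
  outputCapabilities: ["no-empty-blocks"],
  async apply(plan) {
    const blocks = plan.blocks.filter((b) => b.length > 0);
    const removed = plan.blocks.length - blocks.length;
    return {
      plan: { ...plan, blocks },
      diff: removed > 0 ? [`removed ${removed} empty block(s)`] : [],
      evidence: [`scanned ${plan.blocks.length} block(s)`],
      warnings: [],
    };
  },
};
```

Calling `runPasses({ blocks: ["a", "", "b"], capabilities: [] }, [dropEmptyBlocks])` resolves to a result whose plan keeps the two non-empty blocks and carries the `no-empty-blocks` capability.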

v0 Passes

Exact Duplicate Context Elimination

Remove context blocks with identical content when their mutability and provenance allow deduplication.

This is one of the safest early passes because it is deterministic and inspectable.
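
A minimal sketch of the idea, under stated assumptions: `ContextBlock`, its `mutable` and `source` fields, and `dedupeExact` are hypothetical, and the policy shown (only immutable blocks from the same source are merged) is one possible reading of "mutability and provenance allow deduplication", not the documented rule.

```typescript
// Hypothetical block shape; `mutable` and `source` stand in for the
// mutability and provenance metadata mentioned above.
type ContextBlock = { id: string; content: string; mutable: boolean; source: string };

type DedupResult = {
  kept: ContextBlock[];
  removed: { id: string; duplicateOf: string }[];
};

function dedupeExact(blocks: ContextBlock[]): DedupResult {
  const firstSeen = new Map<string, ContextBlock>();
  const result: DedupResult = { kept: [], removed: [] };
  for (const block of blocks) {
    // Assumed policy: mutable blocks are never merged.
    if (block.mutable) {
      result.kept.push(block);
      continue;
    }
    // Key on source and content so blocks with different provenance stay apart.
    const key = JSON.stringify([block.source, block.content]);
    const original = firstSeen.get(key);
    if (original !== undefined) {
      // Record which block this one duplicated so the diff stays inspectable.
      result.removed.push({ id: block.id, duplicateOf: original.id });
    } else {
      firstSeen.set(key, block);
      result.kept.push(block);
    }
  }
  return result;
}
```

Because the comparison is byte-for-byte on content, running the function twice on the same input gives the same `kept` and `removed` lists, which is what makes the pass easy to audit.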

Stable Prefix Detection

Identify fixed instructions, tool definitions, examples, and policies that can form a stable prompt prefix.

The pass should report cache layout opportunities without pretending every provider handles caching the same way.
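
One way to detect a stable prefix is to compare the leading segments of several observed requests. The sketch below assumes a hypothetical `Segment` shape with a `kind` tag and a hypothetical `stablePrefixLength` function; it only measures the shared prefix and says nothing about how any provider caches it.

```typescript
// Hypothetical segment shape; "dynamic" marks content that changes per request.
type Segment = {
  kind: "instruction" | "tool" | "example" | "policy" | "dynamic";
  text: string;
};

// Count the leading segments that are non-dynamic and identical across all requests.
function stablePrefixLength(requests: Segment[][]): number {
  const first = requests[0];
  if (first === undefined) return 0;
  let length = 0;
  while (length < first.length) {
    const candidate = first[length];
    if (candidate === undefined || candidate.kind === "dynamic") break;
    const index = length;
    const shared = requests.every(
      (req) => req[index]?.kind === candidate.kind && req[index]?.text === candidate.text,
    );
    if (!shared) break;
    length++;
  }
  return length;
}
```

A result of `2` for a set of requests means the first two segments are candidates for a stable prefix; whether that helps depends on the provider, which is why the pass reports rather than assumes.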

Token and Cost Estimation

Estimate token counts and provider costs before and after optimization.

The pass should label estimates as estimates and include the model/provider assumptions used.
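
A rough sketch of an estimate that carries its own assumptions. Everything here is hypothetical: `CostAssumptions`, `estimateCost`, and `compareEstimates` are illustrative names, the characters-per-token ratio is a crude heuristic rather than a tokenizer, and the price is a made-up input to the function, not a real provider rate.

```typescript
// Assumptions travel with the estimate so a reader can see what it rests on.
type CostAssumptions = {
  model: string;
  charsPerToken: number; // heuristic ratio, not a real tokenizer
  usdPerMillionInputTokens: number; // caller-supplied price
};

type CostEstimate = {
  isEstimate: true;
  tokens: number;
  usd: number;
  assumptions: CostAssumptions;
};

function estimateCost(text: string, assumptions: CostAssumptions): CostEstimate {
  const tokens = Math.ceil(text.length / assumptions.charsPerToken);
  const usd = (tokens / 1_000_000) * assumptions.usdPerMillionInputTokens;
  return { isEstimate: true, tokens, usd, assumptions };
}

// Compare prompt text before and after optimization under the same assumptions.
function compareEstimates(before: string, after: string, assumptions: CostAssumptions) {
  const b = estimateCost(before, assumptions);
  const a = estimateCost(after, assumptions);
  return { before: b, after: a, tokensSaved: b.tokens - a.tokens };
}
```

Keeping `isEstimate` and `assumptions` on the result makes it harder for downstream reports to present the numbers as measured values.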

Prompt-Cache Layout Reporting

Suggest provider-specific cache layout decisions, such as stable prefix preservation or explicit breakpoint placement where supported.

v0 can report cache opportunities before it mutates provider requests.
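
A report-only version might look like the sketch below. `CacheCapability`, `LayoutSuggestion`, and `suggestCacheLayout` are hypothetical, and the capability fields are placeholders: real providers differ, and their values should come from each provider's own documentation rather than from this example.

```typescript
// Hypothetical capability record describing one provider's caching behavior.
type CacheCapability = {
  provider: string;
  explicitBreakpoints: boolean; // can a request mark where a cached prefix ends?
  minCacheableTokens: number; // assumed minimum prefix size worth reporting
};

type LayoutSuggestion = {
  provider: string;
  action: "place-breakpoint" | "preserve-prefix" | "none";
  reason: string;
};

// Produce a suggestion only; nothing here touches the outgoing request.
function suggestCacheLayout(stablePrefixTokens: number, cap: CacheCapability): LayoutSuggestion {
  if (stablePrefixTokens < cap.minCacheableTokens) {
    return {
      provider: cap.provider,
      action: "none",
      reason: `estimated stable prefix of ${stablePrefixTokens} tokens is below the assumed minimum of ${cap.minCacheableTokens}`,
    };
  }
  if (cap.explicitBreakpoints) {
    return {
      provider: cap.provider,
      action: "place-breakpoint",
      reason: "provider is described as supporting explicit breakpoints",
    };
  }
  return {
    provider: cap.provider,
    action: "preserve-prefix",
    reason: "no explicit breakpoints; keep the stable prefix byte-identical and first",
  };
}
```

Returning a `reason` alongside each `action` keeps the report inspectable before any later version starts mutating requests.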

Simple Retry and Fallback Planning

Represent retry boundaries and fallback behavior explicitly.

Retries should target the smallest failed branch possible, not blindly replay the entire workflow.
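
The "smallest failed branch" idea can be sketched as a tree walk. The `Step` shape, the `retryBoundary` flag, and `retryTargets` are hypothetical; the sketch assumes `failed` means the step failed on its own, not because a child failed, and that a failure with no enclosing boundary falls back to replaying from the root.

```typescript
// Hypothetical workflow node. `failed` means this step itself failed.
type Step = {
  id: string;
  retryBoundary: boolean; // may this subtree be re-run on its own?
  failed: boolean;
  children: Step[];
};

// Return the ids of the smallest retryable subtrees that cover every failure.
function retryTargets(root: Step): string[] {
  const targets: string[] = [];

  // Returns true when the subtree holds a failure not yet covered by a boundary.
  function visit(step: Step): boolean {
    let uncovered = step.failed;
    for (const child of step.children) {
      if (visit(child)) uncovered = true;
    }
    if (uncovered && step.retryBoundary) {
      targets.push(step.id);
      return false; // this boundary covers everything below it
    }
    return uncovered;
  }

  // Assumed fallback: with no enclosing boundary, replay from the root.
  if (visit(root)) targets.push(root.id);
  return targets;
}
```

For a workflow where only one fetch under a `retrieve` boundary failed, the function returns `["retrieve"]` and leaves sibling branches untouched.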

Basic Static Routing

Route known low-risk steps, such as ranking or classification, to a cheaper model or mock backend when constraints allow.

Static routing should be benchmarked against simple baselines before being treated as an improvement.
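
A static routing rule can be as small as a lookup. In this sketch `StepKind`, `RoutingConstraints`, `RouteDecision`, and `routeStep` are hypothetical, the low-risk set simply mirrors the examples in the text above, and the target names are supplied by the caller rather than taken from any real provider.

```typescript
type StepKind = "ranking" | "classification" | "generation";
type RoutingConstraints = { allowCheaperModel: boolean };
type RouteDecision = { stepKind: StepKind; target: string; reason: string };

// Assumed low-risk set, taken from the examples in the text above.
const LOW_RISK_STEPS: ReadonlySet<StepKind> = new Set<StepKind>(["ranking", "classification"]);

function routeStep(
  kind: StepKind,
  constraints: RoutingConstraints,
  targets: { primary: string; cheap: string },
): RouteDecision {
  if (LOW_RISK_STEPS.has(kind) && constraints.allowCheaperModel) {
    return {
      stepKind: kind,
      target: targets.cheap,
      reason: "low-risk step and constraints allow a cheaper target",
    };
  }
  return {
    stepKind: kind,
    target: targets.primary,
    reason: "step is not in the low-risk set or constraints forbid rerouting",
  };
}
```

Because each decision records its `reason`, the routed and unrouted runs can be compared step by step when benchmarking against a baseline that always uses the primary target.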

Later Passes

  • Near-duplicate retrieval chunk detection.
  • Dead context removal.
  • Tool definition pruning.
  • Memory compaction planning.
  • Provider-aware rate-limit routing.
  • Latency-aware routing.
  • Validator parallelization.
  • Human approval insertion.
  • Trace redaction.

Experimental Passes

These should stay behind explicit flags and acceptance tests:

  • semantic compression,
  • semantic deduplication,
  • semantic caching,
  • history summarization,
  • learned routing,
  • speculative branch execution.
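
One possible gate for the passes listed above, as a sketch only: the pass identifiers, `GateState`, and `canRunExperimental` are hypothetical, and the rule shown (a pass runs only when its flag is set and its acceptance tests have passed) is an assumed reading of "explicit flags and acceptance tests".

```typescript
// Hypothetical identifiers for the experimental passes listed above.
const EXPERIMENTAL_PASSES = [
  "semantic-compression",
  "semantic-deduplication",
  "semantic-caching",
  "history-summarization",
  "learned-routing",
  "speculative-branch-execution",
] as const;

type ExperimentalPass = (typeof EXPERIMENTAL_PASSES)[number];

type GateState = {
  enabledFlags: ReadonlySet<string>; // flags the operator turned on explicitly
  acceptancePassed: ReadonlySet<string>; // passes whose acceptance tests succeeded
};

// Both conditions must hold; a flag alone is not enough.
function canRunExperimental(pass: ExperimentalPass, gate: GateState): boolean {
  return gate.enabledFlags.has(pass) && gate.acceptancePassed.has(pass);
}
```

With this shape, turning on a flag for a pass that has not passed acceptance tests still leaves the pass disabled.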

Pass Safety Ladder

Safer:

  • exact duplicate context elimination,
  • dead context removal,
  • tool definition pruning,
  • stable prefix construction,
  • prompt-cache layout planning,
  • token counting,
  • cost estimation.

Riskier:

  • semantic compression,
  • semantic deduplication,
  • semantic caching,
  • history summarization,
  • retrieval chunk rewriting.