
Integrate caveman-style token compression as companion skill #46

@jbaruch

Summary

Integrate caveman-style output token compression into IIKit as an optional companion mode. IIKit skills are verbose: they produce long explanations, status updates, and artifact summaries that consume many tokens without adding proportional value.

Motivation

  • IIKit sessions are token-heavy. A full specify → plan → tasks → implement cycle can consume substantial output tokens on status updates, explanations, and artifact summaries alone.
  • Caveman demonstrates ~65% output token reduction with no loss of technical accuracy.
  • A referenced March 2026 study found brevity constraints improved accuracy by 26 percentage points on certain benchmarks.
  • Input compression of IIKit's own skill prompts and CLAUDE.md could reduce session-start token cost by ~46%.

Scope

Output Compression

  • Add a caveman mode toggle (e.g., /iikit-caveman on|off or a context.json flag)
  • When active, all skill output (status messages, artifact summaries, validation reports) uses terse caveman-style language
  • Code blocks, file paths, requirement IDs (FR-XXX, SC-XXX, T001), and artifact content pass through untouched
  • Three levels: lite (grammar intact, filler removed), full (fragments, no articles), ultra (telegraphic)
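The pass-through rule above can be sketched roughly as follows. This is a minimal illustration, not caveman's actual algorithm: the filler and article word lists, the `compress` function name, and the path pattern are all assumptions made for the example, and only the lite and full levels are shown (ultra is omitted).

```python
import re

FENCE = "`" * 3  # built programmatically to keep this example easy to embed

# Hypothetical word lists; the issue does not specify what counts as filler.
FILLER = re.compile(r"\b(?:just|really|basically|actually|simply)\s+", re.IGNORECASE)
ARTICLES = re.compile(r"\b(?:a|an|the)\s+", re.IGNORECASE)

# Spans that pass through untouched: fenced code, inline code,
# requirement/task IDs (FR-001, SC-002, T001), and slash-separated paths.
PROTECTED = re.compile(
    "(" + FENCE + ".*?" + FENCE
    + r"|`[^`\n]+`"
    + r"|\b(?:FR|SC)-\d+\b|\bT\d{3}\b"
    + r"|(?:\.{0,2}/)?[\w.-]+(?:/[\w.-]+)+)",
    re.DOTALL,
)


def compress(text: str, level: str = "lite") -> str:
    """Compress prose at the given level, leaving protected spans as-is."""
    if level not in ("lite", "full"):
        raise ValueError(f"unsupported level in this sketch: {level}")
    # With a single capturing group, re.split alternates between
    # unprotected text (even indexes) and protected spans (odd indexes).
    parts = PROTECTED.split(text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            out.append(part)  # protected span: copy verbatim
            continue
        part = FILLER.sub("", part)  # lite: drop filler, keep grammar
        if level == "full":
            part = ARTICLES.sub("", part)  # full: also drop articles
        out.append(part)
    return "".join(out)
```

Splitting on the protected pattern first, then rewriting only the remaining prose, keeps the guarantee structural: a compression rule cannot touch an ID or code span because it never sees one.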

Input Compression

  • Apply caveman-compress to IIKit's own skill prompt files to reduce session-start token load
  • Keep human-readable originals alongside compressed versions (like caveman's CLAUDE.md / CLAUDE.original.md pattern)
  • Measure before/after token counts for the full skill set
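One way the before/after measurement might look is sketched below. It assumes the `NAME.original.md` / `NAME.md` pairing mentioned above and uses a rough four-characters-per-token estimate, which is a stand-in; a real measurement would substitute the tokenizer of the target model. The function names are hypothetical.

```python
from pathlib import Path


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token). This is an assumption;
    # replace it with a real tokenizer for accurate numbers.
    return round(len(text) / 4)


def reduction_pct(original: str, compressed: str) -> float:
    """Percentage of estimated tokens saved by the compressed version."""
    before = estimate_tokens(original)
    after = estimate_tokens(compressed)
    if before == 0:
        return 0.0
    return 100.0 * (before - after) / before


def measure_skill_set(root: Path, suffix: str = ".original.md") -> dict[str, float]:
    """Pair each NAME.original.md with NAME.md and report the reduction."""
    results: dict[str, float] = {}
    for original in sorted(root.rglob(f"*{suffix}")):
        compressed = original.with_name(original.name[: -len(suffix)] + ".md")
        if compressed.exists():
            results[str(compressed.relative_to(root))] = reduction_pct(
                original.read_text(encoding="utf-8"),
                compressed.read_text(encoding="utf-8"),
            )
    return results
```

Calling `measure_skill_set(Path("."))` from the repository root would return a mapping from each compressed file to its estimated savings, which can be compared against the input target.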

Integration Points

  • SessionStart hook or context.json flag to activate
  • Per-skill opt-in (some skills like /iikit-01-specify produce user-facing artifacts that should stay verbose; status messages can compress)
  • Compatible with existing caveman installation if user already has it
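A context.json flag with per-skill opt-in could be read along these lines. The `caveman` key and its `enabled`, `level`, and `skills` fields are hypothetical names chosen for this sketch, as is the default level; the issue does not define a schema.

```python
import json
from dataclasses import dataclass

LEVELS = ("lite", "full", "ultra")


@dataclass(frozen=True)
class CavemanConfig:
    enabled: bool = False
    level: str = "lite"  # assumed default; not specified in the issue
    skills: frozenset = frozenset()  # skills that opted in to compression

    def applies_to(self, skill: str) -> bool:
        # Compression is off unless globally enabled AND the skill opted in.
        return self.enabled and skill in self.skills


def load_config(raw_json: str) -> CavemanConfig:
    """Parse the (hypothetical) "caveman" section of a context.json payload."""
    section = json.loads(raw_json).get("caveman", {})
    level = section.get("level", "lite")
    if level not in LEVELS:
        raise ValueError(f"unknown caveman level: {level}")
    return CavemanConfig(
        enabled=bool(section.get("enabled", False)),
        level=level,
        skills=frozenset(section.get("skills", [])),
    )
```

Re-reading the file before each skill run, rather than caching the result at session start, is one simple way to let a toggle take effect without a restart.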

Non-Goals

  • Does NOT change the IIKit workflow or phase structure
  • Does NOT compress artifact content (spec.md, plan.md, tasks.md remain full prose)
  • Does NOT affect thinking/reasoning tokens (only output)

Success Criteria

  • Measurable token reduction (target: 40%+ on output, 30%+ on input)
  • No loss of technical information in compressed output
  • Artifact quality unchanged (same spec.md, plan.md, tasks.md content)
  • User can toggle on/off without restarting session
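The first two criteria could be checked automatically with something like the sketch below. Counting requirement IDs and inline code spans is only a proxy for "no loss of technical information", and the function names are illustrative assumptions; the default thresholds are the targets stated above.

```python
import re
from collections import Counter

# Tokens whose disappearance would signal lost technical information:
# requirement/task IDs (FR-001, SC-002, T001) and inline code spans.
TECH_TOKEN = re.compile(r"\b(?:FR|SC)-\d+\b|\bT\d{3}\b|`[^`\n]+`")


def lost_technical_tokens(original: str, compressed: str) -> list[str]:
    """Return technical tokens that occur fewer times after compression."""
    before = Counter(TECH_TOKEN.findall(original))
    after = Counter(TECH_TOKEN.findall(compressed))
    return sorted((before - after).elements())


def meets_targets(
    output_reduction_pct: float,
    input_reduction_pct: float,
    output_target: float = 40.0,
    input_target: float = 30.0,
) -> bool:
    """Compare measured reductions against the targets in this issue."""
    return (
        output_reduction_pct >= output_target
        and input_reduction_pct >= input_target
    )
```

An empty list from `lost_technical_tokens` does not prove the compressed text is equivalent, but a non-empty one is a cheap, reliable signal that a protected token was dropped.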

Prior Art

Labels: enhancement
