Integrate caveman-style token compression as companion skill #46

## Summary

Integrate [caveman](https://github.com/JuliusBrussee/caveman)-style output token compression into IIKit as an optional companion mode. IIKit skills are verbose: they produce long explanations, status updates, and artifact summaries that consume many tokens without adding proportional value.

## Motivation

- IIKit sessions are token-heavy. A full specify → plan → tasks → implement cycle can consume substantial output tokens on status updates, explanations, and artifact summaries alone.
- Caveman demonstrates ~65% output token reduction with no loss of technical accuracy.
- A referenced [March 2026 study](https://arxiv.org/abs/2503.02948) found brevity constraints *improved* accuracy by 26 percentage points on certain benchmarks.
- Input compression of IIKit's own skill prompts and CLAUDE.md could reduce session-start token cost by ~46%.

## Scope

### Output Compression
- Add a `caveman` mode toggle (e.g., `/iikit-caveman on|off` or a `context.json` flag)
- When active, all skill output (status messages, artifact summaries, validation reports) uses terse caveman-style language
- Code blocks, file paths, requirement IDs (FR-XXX, SC-XXX, T001), and artifact content pass through untouched
- Three levels: `lite` (grammar intact, filler removed), `full` (fragments, no articles), `ultra` (telegraphic)
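
As a rough illustration of the pass-through rule and the three levels above, here is a minimal sketch. The word lists, the regular expressions, and the `compress` function are all hypothetical; they are not caveman's actual implementation, and real "telegraphic" output would need more than word deletion.

```python
import re

# Spans that must pass through untouched (patterns are assumptions).
PROTECTED = re.compile(
    r"```.*?```"                            # fenced code blocks
    r"|`[^`\n]+`"                           # inline code
    r"|\b(?:FR|SC)-\d+\b"                   # requirement IDs such as FR-001, SC-002
    r"|\bT\d{3}\b"                          # task IDs such as T001
    r"|(?:\.{0,2}/)?[\w.-]+(?:/[\w.-]+)+",  # file paths containing a slash
    re.DOTALL,
)

# Illustrative word lists only; each level drops everything the previous one does.
FILLER = {"just", "really", "basically", "actually", "simply", "very"}
ARTICLES = {"a", "an", "the"}
AUXILIARIES = {"is", "are", "was", "were", "be", "been", "has", "have", "will"}

DROP = {
    "lite": FILLER,
    "full": FILLER | ARTICLES,
    "ultra": FILLER | ARTICLES | AUXILIARIES,
}

WORD = re.compile(r"([A-Za-z']+)([ \t]*)")


def _compress_prose(text: str, level: str) -> str:
    """Delete dropped words (and the whitespace after them) from plain prose."""
    drop = DROP[level]

    def keep(match: re.Match) -> str:
        return "" if match.group(1).lower() in drop else match.group(0)

    return WORD.sub(keep, text)


def compress(text: str, level: str = "full") -> str:
    """Compress prose between protected spans; copy protected spans verbatim."""
    parts = []
    pos = 0
    for match in PROTECTED.finditer(text):
        parts.append(_compress_prose(text[pos:match.start()], level))
        parts.append(match.group(0))
        pos = match.end()
    parts.append(_compress_prose(text[pos:], level))
    return "".join(parts)
```

Splitting on protected spans first, then compressing only the gaps, keeps the "untouched" guarantee independent of whatever the prose compressor does.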

### Input Compression
- Apply caveman-compress to IIKit's own skill prompt files to reduce session-start token load
- Keep human-readable originals alongside compressed versions (like caveman's `CLAUDE.md` / `CLAUDE.original.md` pattern)
- Measure before/after token counts for the full skill set
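
The before/after measurement could be sketched as below, assuming each compressed prompt `X.md` sits next to its original `X.original.md`. The `estimate_tokens` heuristic is a crude stand-in (words plus punctuation marks), not a real model tokenizer, and the function names are hypothetical.

```python
import re
from pathlib import Path


def estimate_tokens(text: str) -> int:
    """Crude proxy: count words and punctuation marks.

    A real measurement would use the target model's tokenizer.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def measure(skill_dir: str) -> list:
    """Return (compressed file name, tokens before, tokens after) per file pair."""
    rows = []
    for original in sorted(Path(skill_dir).rglob("*.original.md")):
        compressed = original.with_name(
            original.name.replace(".original.md", ".md")
        )
        if not compressed.exists():
            continue  # no compressed counterpart yet
        before = estimate_tokens(original.read_text(encoding="utf-8"))
        after = estimate_tokens(compressed.read_text(encoding="utf-8"))
        rows.append((compressed.name, before, after))
    return rows


def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens saved relative to the original."""
    return 0.0 if before == 0 else 100.0 * (before - after) / before
```

Summing the `before` and `after` columns across all rows gives a single figure for the full skill set, which can then be compared against the input target under Success Criteria.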

### Integration Points
- Activate via a SessionStart hook or a `context.json` flag
- Per-skill opt-in: skills such as `/iikit-01-specify` produce user-facing artifacts that should stay verbose, while their status messages can be compressed
- Compatible with an existing caveman installation, if the user already has one
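
One way the flag and per-skill opt-in might fit together is sketched below. The `caveman` key, its fields, the `status`/`artifact` distinction, and the skill name `iikit-02-plan` are hypothetical; the actual `context.json` schema is not defined in this issue.

```python
import json
from pathlib import Path
from typing import Optional

LEVELS = ("lite", "full", "ultra")

# Hypothetical defaults: mode off, no skill opted in.
DEFAULTS = {"enabled": False, "level": "full", "skills": {}}


def load_context(path: str) -> dict:
    """Read context.json; a missing file means default settings."""
    file = Path(path)
    if not file.exists():
        return {}
    return json.loads(file.read_text(encoding="utf-8"))


def resolve_level(context: dict, skill: str, kind: str) -> Optional[str]:
    """Return the compression level for this output, or None for verbose.

    kind is "status" for status messages or "artifact" for artifact content.
    """
    cfg = {**DEFAULTS, **context.get("caveman", {})}
    if not cfg["enabled"]:
        return None
    if kind == "artifact":
        return None  # artifact content is never compressed
    if not cfg["skills"].get(skill, False):
        return None  # skill has not opted in
    if cfg["level"] not in LEVELS:
        raise ValueError(f"unknown caveman level: {cfg['level']!r}")
    return cfg["level"]
```

Under this sketch, a skill would call `resolve_level(load_context("context.json"), "iikit-02-plan", "status")` before emitting each message. Re-reading the file on every call is one simple way to let a toggle take effect without restarting the session.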

## Non-Goals
- Does NOT change the IIKit workflow or phase structure
- Does NOT compress artifact content (spec.md, plan.md, tasks.md remain full prose)
- Does NOT affect thinking/reasoning tokens (only output)

## Success Criteria
- Measurable token reduction (target: 40%+ on output, 30%+ on input)
- No loss of technical information in compressed output
- Artifact quality unchanged (same spec.md, plan.md, tasks.md content)
- Users can toggle the mode on or off without restarting the session

## Prior Art
- [caveman](https://github.com/JuliusBrussee/caveman) — 65% output compression, 46% input compression
- [caveman-compress](https://github.com/JuliusBrussee/caveman/tree/main/caveman-compress) — input token optimization
