v0.4.0
Highlights
Structured JSON output for AI agents
lessence --format json emits a JSONL stream where each folded group is a self-contained JSON record. Every record carries per-token-type rollup metadata — distinct counts, deterministic sample values, and a raw time range — so agents can answer triage questions ("which pods?", "how many distinct UUIDs?", "when did this start?") from a single invocation without re-reading the raw log.
kubectl logs pod/api | lessence --format json \
  | jq 'select(.type == "group" and .count >= 100) | .variation.UUID'
Output is byte-identical across runs (excluding elapsed_ms). Full schema: docs/format-json-schema.md.
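For agents that consume the stream from a script rather than through jq, the same filter can be sketched in Python. This is a minimal illustration, not part of lessence: it relies only on the field names visible in the jq example (type, count, variation.UUID), and it assumes elapsed_ms is a top-level key on whichever record carries it. The function names are hypothetical; docs/format-json-schema.md remains the authority on the record layout.

```python
import json


def select_large_groups(lines, min_count=100):
    """Mirror the jq filter: yield .variation.UUID for big group records."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the stream
        record = json.loads(line)
        if record.get("type") == "group" and record.get("count", 0) >= min_count:
            # Like jq, yield None (null) when the group has no UUID variation.
            yield record.get("variation", {}).get("UUID")


def strip_elapsed(lines):
    """Parse a JSONL stream and drop elapsed_ms so two runs can be compared.

    Assumption: elapsed_ms is a top-level key; adjust if the schema differs.
    """
    records = []
    for line in lines:
        if line.strip():
            record = json.loads(line)
            record.pop("elapsed_ms", None)
            records.append(record)
    return records
```

Both functions accept any iterable of lines, so passing sys.stdin works when the script sits at the end of the pipeline. Comparing strip_elapsed(run_a) == strip_elapsed(run_b) is one way to check the determinism claim on your own logs.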
Enriched text-mode compact marker
The collapsed-group marker now shows what actually varies inside each group:
# Before
[+1273 similar, varying: UUID, hash, name, path, timestamp]
# After
[+1273 similar | 13:07:09 → 13:45:17 | uuid×14, path×3 {/var/lib/pods/a, /var/lib/pods/b, /var/lib/pods/c}, hash×1273]
Time range, distinct counts per token type, and inline sample values (when the complete set is small enough to show) — all derived from the same rollup metadata that powers the JSON output.
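The marker layout can be reproduced with a short Python sketch. It is an illustration of the format shown above, not lessence's actual rendering code: the function names and the tuple shape are hypothetical, and the rule "inline the values only when the complete set has at most 3 members" is an assumption based on the text inline threshold of 3 listed in these release notes.

```python
INLINE_THRESHOLD = 3  # text inline threshold stated in the release notes


def render_token(name, distinct, samples):
    """Render one token type as name×count, with inline values if small."""
    part = f"{name}×{distinct}"
    # Assumed rule: show values only when the complete set fits the threshold.
    if distinct <= INLINE_THRESHOLD and len(samples) == distinct:
        part += " {" + ", ".join(samples) + "}"
    return part


def render_marker(similar, start, end, tokens):
    """Assemble the collapsed-group marker from count, time range, rollups."""
    body = ", ".join(render_token(*token) for token in tokens)
    return f"[+{similar} similar | {start} → {end} | {body}]"


# Inputs taken from the "After" example above.
marker = render_marker(
    1273,
    "13:07:09",
    "13:45:17",
    [
        ("uuid", 14, []),
        ("path", 3, ["/var/lib/pods/a", "/var/lib/pods/b", "/var/lib/pods/c"]),
        ("hash", 1273, []),
    ],
)
print(marker)
```

With these inputs the sketch prints the same line as the "After" example: only path is expanded, because it is the only token type whose complete set fits within the threshold.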
Corpus-calibrated rollup parameters
The three rollup constants (sample count K=7, distinct-value cap=64, text inline threshold=3) are derived from P95/P99 measurements across an 18-file real-world log corpus, not guessed. Reproducible via cargo bench --bench calibrate_rollup. See docs/rollup-calibration.md for the methodology and evidence.
What's Changed
- Add --format json emitting JSONL with group and summary records
- Compute per-group rollup metadata (distinct counts, deterministic samples) at flush time
- Enrich text-mode compact marker with time range and per-type variation rollups
- Calibrate rollup parameters against full corpus (K=7, cap=64, threshold=3)
- Restore criterion bench suite for performance gating
- Document JSON schema, rollup calibration methodology, and bench procedures
Full Changelog: v0.3.1...v0.4.0