v0.4.4
Highlights
Smarter Grouping
The grouping engine compared lines byte-by-byte by position, so a single token inserted at the front of an otherwise identical line — a node name, a severity prefix — cascaded into "no match" and split what should have been one group. Grouping is now token-based (longest common subsequence over the normalized line), so insertions no longer break folds. On a real 70k-line kubelet log the same input compresses to 380 lines instead of 941, with distinct messages preserved — for example, 32 near-identical "Operation for volume X failed" groups that differed only in volume IDs collapse into 10 honest ones. --threshold keeps its meaning: the percentage of tokens two lines must share. One consequence: distinct-but-similar messages that clear the bar now genuinely fold — raise --threshold (e.g. 85) if you want per-message granularity in dense groups.
Pipeline-Clean stdout
lessence app.log | grep ERROR used to receive the statistics footer mixed into the log output. The footer now goes to stderr — stdout carries only log lines, no -q required for piping. In the same spirit: a misspelled input file now exits 1 (like cat and grep) instead of silently succeeding, so scripts notice.
Fewer False Detections
Five detectors learned to leave ordinary text alone:
- the word "request" in plain prose is no longer rewritten as
request_id=<UUID> - parenthesized counts like
(3)or(137)are no longer rewritten as PIDs — a process name must be attached,sshd(1234)-style - epoch timestamps and hex-looking words ("defaced") are no longer eaten as hashes
- dotted code identifiers like
hibernate.SQLorscope.goare no longer detected as hostnames — and real hostnames now get their ownFQDNcategory instead of masquerading as IPv4 addresses --disable-patterns brackets/json/key-valuenow actually disables every matching detector (two ran unconditionally before)
Accurate Statistics
--stats-json and the JSON summary lumped ports into "ips", JSON tokens into "paths", and six unrelated categories into "percentages". Every pattern category now has its own counter — the numbers finally mean what they say.
Bounded Work on Hostile Lines
A single long line of repeated key=value tokens triggered quadratic work in the key-value detector — a crafted 1 MB line could stall for minutes. The work is now linear: a 200 KB reproduction drops from 0.30s to 0.02s.
Ten of these eleven fixes came from a single fresh-eyes audit of the codebase by Claude Fable 5; the key=value stall had been flagged earlier by an automated threat-model/vuln-scan pass and was fixed in the same sweep.
0.4.4 (2026-06-09)
Bug Fixes
- --disable-patterns brackets/json/key-value now disables all matching detectors (0b87792)
- --stats-json and JSON summary report accurate per-category pattern counts (23aea53)
- dotted code identifiers like hibernate.SQL are no longer detected as hostnames (13e530a)
- epoch timestamps and hex-looking words are no longer detected as hashes (a585a63)
- exit with code 1 when an input file cannot be opened (b10701f)
- log lines containing the word "request" are no longer rewritten as request IDs (1446f48)
- parenthesized counts like "(3)" are no longer rewritten as PIDs (a67adf6)
- similarity grouping now tolerates inserted tokens instead of splitting groups (349493a)
- statistics footer now goes to stderr, keeping stdout clean for pipelines (7893fcd)
Performance
- flush remaining groups in O(n) instead of O(n^2) (2fcb135)
- long key=value lines no longer stall the key-value detector (1485e11)
Full changelog: v0.4.3...v0.4.4