[HDX-3964] Add event pattern mining to CLI (Shift+P) #2106
kodiakhq[bot] merged 11 commits into main
🔴 Tier 4: Critical. Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.
Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
🦋 Changeset detected. Latest commit: 00c8507. The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
E2E Test Results: ✅ All tests passed • 132 passed • 3 skipped • 1032s
Tests ran across 4 shards in parallel.
- Press P to mine event patterns using the Drain algorithm from common-utils
- Pattern list shows count, percentage, and mined template
- Press l/Enter to expand a pattern and view its sample events
- Press h/Esc to go back, P/Esc to return to events view
- Following mode is paused while in pattern view

…mated counts
- Issue a separate ORDER BY rand() LIMIT 100000 sampling query when P is pressed
- Issue a parallel count() query to get the true total event count
- Estimate pattern counts via sampleMultiplier (totalCount / sampledRowCount)
- Display estimated counts with a ~ prefix and percentage of total
- Fix alias resolution: compute WITH clauses from source select aliases so Lucene searches like level:error work in pattern/count queries

…play
- Pattern sample query now uses defaultTableSelectExpression instead of just body+timestamp, so sample rows have all fields (Timestamp, service, level, Body, etc.)
- Fix timestamp column detection to use source.timestampValueExpression
- Remove Pct and Trend columns from pattern view
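The count estimation described in these commits can be sketched roughly as below. This is a minimal illustration, assuming the sample is uniform; `SampledPattern`, `EstimatedPattern`, and `estimatePatternCounts` are hypothetical names, not the PR's actual types or functions. Only `sampleMultiplier`, `totalCount`, and `sampledRowCount` come from the commit message.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
interface SampledPattern {
  template: string;
  sampleCount: number; // rows in the random sample that matched this template
}

interface EstimatedPattern extends SampledPattern {
  estimatedCount: number; // scaled up to the full result set
  label: string; // display string with the "~" prefix
}

function estimatePatternCounts(
  patterns: SampledPattern[],
  totalCount: number,
  sampledRowCount: number,
): EstimatedPattern[] {
  // Guard against an empty sample so we never divide by zero.
  const sampleMultiplier =
    sampledRowCount > 0 ? totalCount / sampledRowCount : 1;

  return patterns
    .map((p) => {
      const estimatedCount = Math.round(p.sampleCount * sampleMultiplier);
      return { ...p, estimatedCount, label: `~${estimatedCount}` };
    })
    .sort((a, b) => b.estimatedCount - a.estimatedCount);
}
```

When the total is no larger than the sample limit, the multiplier is 1 and the estimates equal the sampled counts.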
## Summary

Adds a pattern mining feature to the CLI, accessible via `Shift+P`. This mirrors the web app's Pattern Table functionality but runs entirely in TypeScript, with no Pyodide/Python WASM needed.

**Linear:** https://linear.app/hyperdx/issue/HDX-3964

## What changed

### 1. Drain library in common-utils (`packages/common-utils/src/drain/`)

Ported the [browser-drain](https://github.com/DeploySentinel/browser-drain) TypeScript library into `@hyperdx/common-utils`. This is a pure TypeScript implementation of the Drain3 log template mining algorithm, including:

- `TemplateMiner` / `TemplateMinerConfig`: main API
- `Drain`: core algorithm with prefix tree and LRU cluster cache
- `LogMasker`: regex-based token masking (IPs, numbers, etc.)
- `LruCache`: custom LRU cache matching Python Drain3's eviction semantics
- 11 Jest tests ported from the original `node:test` suite

### 2. CLI pattern view (`packages/cli/src/components/EventViewer/`)

**Keybinding:** `Shift+P` toggles pattern view (pauses follow mode, restores on exit)

**Data flow (mirrors web app's `useGroupedPatterns`):**

- Issues `SELECT ... ORDER BY rand() LIMIT 100000` to randomly sample up to 100K events
- Issues parallel `SELECT count()` to get true total event count
- Feeds sampled log bodies through the TypeScript `TemplateMiner`
- Estimates pattern counts via `sampleMultiplier = totalCount / sampledRowCount`
- Computes time-bucketed trend data per pattern

**UI:**

- Pattern list with columns: Est. Count (with `~` prefix), Pattern
- `l`/`Enter` expands a pattern to show its sample events (full table columns)
- `h`/`Esc` returns to pattern list
- `j/k/G/g/Ctrl+D/Ctrl+U` navigation throughout
- Loading spinner while sampling query runs

**Alias fix:** Pattern and count queries compute `WITH` clauses from the source's `defaultTableSelectExpression` so Lucene searches using aliases (e.g. `level:error` where `level` is an alias for `SeverityText`) resolve correctly.
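To give a feel for the masking step that precedes template mining, here is a deliberately simplified sketch. It is not the ported `LogMasker` or `Drain` API: the mask rules, the `<IP>` and `<NUM>` tokens, and the `maskLine` and `groupByMaskedTemplate` helpers are assumptions made for illustration, and the grouping is a plain exact match on the masked string rather than Drain's prefix-tree clustering.

```typescript
// Illustrative mask rules only; the real LogMasker configuration may differ.
const MASKS: Array<{ pattern: RegExp; token: string }> = [
  // IPv4-like addresses first, so their octets are not masked as plain numbers.
  { pattern: /\b\d{1,3}(?:\.\d{1,3}){3}\b/g, token: "<IP>" },
  { pattern: /\b\d+\b/g, token: "<NUM>" },
];

// Replace variable-looking tokens with placeholders.
function maskLine(line: string): string {
  return MASKS.reduce(
    (acc, { pattern, token }) => acc.replace(pattern, token),
    line,
  );
}

// Count how many lines collapse to each masked template.
function groupByMaskedTemplate(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    const template = maskLine(line);
    counts.set(template, (counts.get(template) ?? 0) + 1);
  }
  return counts;
}
```

Drain goes further than this by clustering lines whose tokens are merely similar, so lines that differ in unmasked words can still share a template.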
### New files

- `packages/common-utils/src/drain/`: 7 source files + barrel index
- `packages/common-utils/src/__tests__/drain.test.ts`
- `packages/cli/src/components/EventViewer/usePatternData.ts`
- `packages/cli/src/components/EventViewer/PatternView.tsx`
- `packages/cli/src/components/EventViewer/PatternSamplesView.tsx`

### Modified files

- `packages/cli/src/api/eventQuery.ts`: added `buildPatternSampleQuery`, `buildTotalCountQuery`, `buildAliasWithClauses`
- `packages/cli/src/components/EventViewer/EventViewer.tsx`: wired in pattern state + rendering
- `packages/cli/src/components/EventViewer/useKeybindings.ts`: added P, l, h keybindings + pattern/sample navigation
- `packages/cli/src/components/EventViewer/SubComponents.tsx`: added P to help screen

### Demo

https://github.com/user-attachments/assets/50a2edfc-8891-43ae-ab86-b96fca778c66
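The view transitions described in this PR (events, pattern list, pattern samples, with follow mode paused and restored) can be modeled as a small state function. This is a sketch under assumptions, not the code in `useKeybindings.ts`: the `ViewerState` shape, the `onKey` function, and the key names `"enter"` and `"escape"` are hypothetical, and any key not mentioned in the description is treated as a no-op.

```typescript
type View = "events" | "patterns" | "samples";

// Hypothetical state shape for illustration.
interface ViewerState {
  view: View;
  following: boolean;
  // Whether follow mode was on before pattern view was entered.
  resumeFollowing: boolean;
}

function onKey(state: ViewerState, key: string): ViewerState {
  if (state.view === "events") {
    if (key === "P") {
      // Entering pattern view pauses follow mode and remembers its prior value.
      return {
        view: "patterns",
        following: false,
        resumeFollowing: state.following,
      };
    }
  } else if (state.view === "patterns") {
    if (key === "l" || key === "enter") {
      return { ...state, view: "samples" };
    }
    if (key === "P" || key === "escape") {
      // Leaving pattern view restores follow mode.
      return {
        view: "events",
        following: state.resumeFollowing,
        resumeFollowing: false,
      };
    }
  } else if (state.view === "samples") {
    if (key === "h" || key === "escape") {
      return { ...state, view: "patterns" };
    }
  }
  // Keys not covered by the description leave the state unchanged.
  return state;
}
```

Keeping the transitions in a pure function like this makes the `Esc` behavior easy to test, since it means "back one level" in both the samples view and the pattern list.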
Co-authored-by: peter-leonov-ch <209667683+peter-leonov-ch@users.noreply.github.com>