Skip to content

feat(mcp): add event_patterns display type to query tool#2250

Merged
kodiakhq[bot] merged 10 commits into
mainfrom
brandon/mcp-event-patterns
May 12, 2026
Merged

feat(mcp): add event_patterns display type to query tool#2250
kodiakhq[bot] merged 10 commits into
mainfrom
brandon/mcp-event-patterns

Conversation

@brandon-pereira
Copy link
Copy Markdown
Member

@brandon-pereira brandon-pereira commented May 8, 2026

Summary

  • Port the Event Patterns feature to the MCP server's hyperdx_query tool as a new displayType: "event_patterns" option
  • Extract shared minePatterns() function into @hyperdx/common-utils/drain to eliminate duplication between the MCP server and CLI
  • Add integration tests covering the new display type

Screenshot

MCP
Screenshot 2026-05-08 at 3 18 05 PM

CLI (to ensure backwards compatible)
Screenshot 2026-05-08 at 4 00 03 PM

How it works

  1. Claude calls hyperdx_query with displayType: "event_patterns" and a sourceId
  2. The handler auto-detects the body column (Body for logs, SpanName for traces), or accepts an explicit bodyExpression override
  3. Two ClickHouse queries run in parallel: a random sample (default 10K rows, configurable via sampleSize) and a total count
  4. The shared minePatterns() function from @hyperdx/common-utils runs the Drain template mining algorithm, groups results by cluster, computes estimated counts, and generates time-bucketed trend data
  5. Response includes pattern templates with <*> wildcards, estimated/sample counts, trend arrays, and sample body texts

References

Resolves HDX-4142

Port the Event Patterns feature to the MCP server's query tool so Claude
can surface recurring log/trace patterns during investigations.

- Add displayType 'event_patterns' to hyperdx_query with sampleSize and
  bodyExpression parameters
- Extract shared minePatterns() into @hyperdx/common-utils/drain to
  eliminate duplication between MCP server and CLI
- Auto-detect body column from source kind (Body for logs, SpanName for
  traces) with explicit override support
- Sample random events from ClickHouse, run Drain template mining,
  return patterns sorted by estimated frequency with time-bucketed trends
- Add 8 integration tests covering log/trace sources, filters, custom
  body expressions, validation, and schema serialization
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 8, 2026

🦋 Changeset detected

Latest commit: 39e3346

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
@hyperdx/api Minor
@hyperdx/common-utils Patch
@hyperdx/cli Patch
@hyperdx/app Minor
@hyperdx/otel-collector Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link
Copy Markdown

vercel Bot commented May 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment May 12, 2026 1:31pm

Request Review

@github-actions github-actions Bot added the review/tier-3 Standard — full human review required label May 8, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

🟡 Tier 3 — Standard

Introduces new logic, modifies core functionality, or touches areas with non-trivial risk.

Why this tier:

  • Diff size: 797 production lines changed (Tier 2 max: < 250)
  • Cross-layer change: touches backend (packages/api) + shared utils (packages/common-utils)

Review process: Full human review — logic, architecture, edge cases.
SLA: First-pass feedback within 1 business day.

Stats
  • Production files changed: 6
  • Production lines changed: 797 (+ 617 in test files, excluded from tier calculation)
  • Branch: brandon/mcp-event-patterns
  • Author: brandon-pereira

To override this classification, remove the review/tier-3 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

E2E Test Results

All tests passed • 177 passed • 3 skipped • 1231s

Status Count
✅ Passed 177
❌ Failed 0
⚠️ Flaky 2
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

PR Review

Scope: Adds event_patterns displayType to MCP hyperdx_query, extracts shared minePatterns() into @hyperdx/common-utils/drain, and adds integration + unit tests. Refactor is well-contained and the CLI consumer is correctly migrated.

Findings

  • ⚠️ whereSnippet is always emitted in Lucene syntax (${bodyColumn}:"..."), but the response doesn't tell the agent which whereLanguage to pair it with. If the agent earlier called with whereLanguage: 'sql', blindly reusing whereSnippet will fail. → Either include whereLanguageHint: 'lucene' in the response, or update the schema description for whereSnippet to explicitly say "use with whereLanguage: 'lucene'".
  • ⚠️ whereSnippet breaks when bodyColumn is a bracket/quoted expression like SpanAttributes['http.url'] — Lucene field names don't support brackets/quotes unescaped. The validation regex SAFE_BODY_EXPR_CHARS explicitly permits these. → Skip emitting whereSnippet when bodyColumn isn't a bare identifier, or escape it for Lucene.
  • ⚠️ eventPatterns.ts:386 calls resolveBodyExpression(source) passing the Mongoose SourceDocument, but the helper's SourceBodyFields interface uses kind: string. source.kind is the Mongoose enum value — works at runtime, but consider typing SourceBodyFields.kind as SourceKind for consistency with the constant comparison.
  • ℹ️ flattenBody is exported from @hyperdx/common-utils/drain but only used internally by minePatterns. Consider keeping it module-private unless there's a planned external caller.
  • ℹ️ eventPatterns.ts:317 — the eslint-disable-next-line no-useless-escape comment is correctly applied; the \[\]\- escapes inside the character class are intentional for clarity, so this is fine.

Strengths

  • bodyExpression is validated against an allowlist that rejects spaces/operators — prevents SQL injection in the select clause.
  • ✅ Schema-level cross-field validation rejects sampleSize / bodyExpression on non-event_patterns display types (good defense-in-depth).
  • ✅ Sample + count queries run in parallel via Promise.all; 30s max_execution_time cap is sensible.
  • ✅ Excellent test coverage: seeded-data integration test verifies pattern monotonic ordering, <*> placeholders, and non-zero trend buckets.
  • ✅ Refactor genuinely deduplicates — CLI loses ~111 lines and the shared module is well-typed with TRow generic.

No critical bugs or security issues. The Lucene/SQL whereSnippet mismatch is the only item worth resolving before merge.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Deep Review

Scope: PR #2250 vs 41eefec7 — 9 files, ~1,125 lines added across packages/api/src/mcp/tools/query/, new shared packages/common-utils/src/drain/mine-patterns.ts, CLI refactor, plus tests and a changeset.

Intent: Port the Event Patterns feature to the MCP server as displayType: "event_patterns", and extract the Drain-mining pipeline into common-utils so the MCP server and CLI share one implementation.

✅ No critical issues found. The diff has no exploitable injection (the caller-supplied bodyExpression is gated by SAFE_BODY_EXPR_CHARS plus splitAndTrimWithBracket, and the auto-detect path falls within the existing source-admin trust boundary). The strongest concerns are a likely-broken advertised affordance (whereSnippetsearch chain), reliability gaps around the parallel ClickHouse queries, and a security-critical allowlist that ships with only happy-path tests.

🟡 P2 -- recommended

  • packages/api/src/mcp/tools/query/eventPatterns.ts:300-311 -- The whereSnippet emits a quoted Lucene phrase against an explicit field (e.g. Body:"GET 200 OK"), which the HyperDX queryParser compiles to exact-equality SQL (Body = 'GET 200 OK'); rows whose body had words between the literal tokens never match, so the advertised pattern→search chain returns zero rows on the happy path.
    • Fix: Build the snippet as AND-joined bare substring terms against the implicit field, or target the implicit <body> field with a quoted phrase so it routes through the fieldSearch substring path; add an integration test that round-trips a mined whereSnippet through a follow-up search and asserts non-empty rows.
    • correctness, agent-native
  • packages/api/src/mcp/tools/query/eventPatterns.ts:300-311 -- When bodyColumn is SpanAttributes['http.url'] (the form the bodyExpression description explicitly invites), the resulting snippet SpanAttributes['http.url']:"…" contains [, ], and ' in the field-name position, which the Lucene grammar does not accept; the agent's chained search either errors or silently no-ops.
    • Fix: Translate bracket-syntax bodyColumn into the dot-notation form Lucene accepts (e.g. SpanAttributes.http.url) before emitting whereSnippet, or document the constraint in the bodyExpression schema description.
    • correctness, agent-native
  • packages/api/src/mcp/tools/query/eventPatterns.ts:308-310 -- Patterns composed entirely of <*> placeholders silently emit whereSnippet: ""; the tool description tells the agent "use it as the where parameter," so an agent that follows the documented contract issues an unfiltered search.
    • Fix: Omit the whereSnippet field (or set it to null with a sibling unanchored: true flag) when no literal tokens remain, so the agent's schema-aware reasoning can detect the unavailable case.
    • agent-native
  • packages/api/src/mcp/tools/query/eventPatterns.ts:111-131 -- The caller-supplied bodyExpression is gated by SAFE_BODY_EXPR_CHARS, but the auto-detect path through resolveBodyExpression reads source.bodyExpression/source.implicitColumnExpression directly with no character allowlist, so a source-stored value containing function calls, subqueries, or comma-balanced expressions interpolates raw into SELECT … as __hdx_pattern_body; this is within the existing source-admin trust model but contradicts the in-file comment that promises injection rejection.
    • Fix: Run the auto-detected expression through the same SAFE_BODY_EXPR_CHARS check (or a centralized expression validator) before interpolation so the security contract is honored on both paths.
    • adversarial, kieran-typescript
  • packages/api/src/mcp/tools/query/eventPatterns.ts:147,176 -- ORDER BY rand() DESC LIMIT sampleSize (default 10k, max 25k) forces a full-set scan plus partial sort over the matching time range with no max_memory_usage or max_rows_to_read limit; combined with the 100/min MCP rate limit and the two parallel queries per call, a single team member can amplify ClickHouse load substantially.
    • Fix: Switch to ClickHouse's SAMPLE n / SAMPLE 0.x primitive where the table has a SAMPLE BY clause, or add max_memory_usage / max_rows_to_read settings alongside max_execution_time, and consider lowering the default time window when the caller does not narrow it.
    • performance, adversarial
  • packages/common-utils/src/drain/mine-patterns.ts:122-127 -- miner.addLogMessage runs in a synchronous for loop over up to 25k rows on the API request thread; Drain prefix-tree traversal plus token-similarity scoring at that size stalls the Node event loop for hundreds of ms to several seconds, blocking every other API request including unrelated tool calls and health checks.
    • Fix: Chunk the loop with await new Promise(r => setImmediate(r)) every ~500 rows, or run the Drain pass in a worker_threads pool when row counts exceed a threshold.
    • performance
  • packages/api/src/mcp/tools/query/eventPatterns.ts:206-220 -- Promise.all rejects on the first failed query but leaves the surviving query running for up to the full max_execution_time (30s) because no abort_signal is passed even though queryChartConfig accepts one; under burst failures this wastes 30s of ClickHouse execution time per orphan.
    • Fix: Wire an AbortController, pass opts.abort_signal to both queryChartConfig calls, and .abort() in the catch block so the surviving query terminates immediately.
    • reliability
  • packages/api/src/mcp/tools/query/eventPatterns.ts:206-232 -- A single try/catch wraps both queries, so a transient count() failure (the heavier query, with no LIMIT) discards a successfully-sampled patterns response even though minePatterns already supports sampleMultiplier=1 fallback when totalCount=0.
    • Fix: Wrap the count query in its own try/catch, treat its failure as totalCount=0 plus a diagnostic field in the response, and only hard-fail on sample query failure.
    • reliability
  • packages/api/src/mcp/tools/query/eventPatterns.ts:212,218 -- max_execution_time: 30 is hardcoded per parallel query and exceeds typical MCP/agent interactive tool-call SLAs (<10s); when the sample query saturates the timeout, the caller waits 30s for a generic error envelope and may retry, amplifying the orphan-query problem above.
    • Fix: Lower the default to ~10-15s, plumb the setting through source.querySettings so it stays user-configurable, and consider returning a "narrow your time range" hint when the timeout fires.
    • performance, reliability
  • packages/common-utils/src/drain/mine-patterns.ts:113,130 -- getTimestamp falls back to startDate only on null/undefined (?? startDate.getTime()), not on NaN; an unparseable timestamp string yields new Date(NaN)-keyed buckets that never appear in allBuckets, so counts silently drop from the trend output without surfacing any error.
    • Fix: Treat non-finite results from getTimestamp as falling back to startDate.getTime() (e.g. Number.isFinite(t) ? t : startDate.getTime()).
    • reliability
  • packages/api/src/mcp/tools/query/eventPatterns.ts:304-307 -- samples are returned as plain body strings, dropping the per-sample timestamp, service, and severity that the CLI/web PatternGroup carries; an agent investigating "when did this fire" or "which service produced this" must make a second search call to recover information the sample query already had.
    • Fix: Include the __hdx_pattern_ts value on each sample (it is already fetched) and consider adding service/severity for log sources, gated by the existing trimToolResponse budget.
    • agent-native
  • packages/common-utils/src/drain/mine-patterns.ts:101-107 -- The trend bucket granularity is computed internally but never surfaced; agents that interpret a trend (e.g. "this pattern spiked in the last bucket") must subtract adjacent timestamps to recover the bucket width.
    • Fix: Surface granularity (or bucketSeconds) alongside timeRange in the response envelope from eventPatterns.ts.
    • agent-native
  • packages/api/src/mcp/__tests__/queryTool.test.ts -- The SAFE_BODY_EXPR_CHARS security allowlist and the splitAndTrimWithBracket length check ship with happy-path tests only ('Body', 'SpanName', 'SeverityText'); no test confirms that injection-style inputs like "Body) OR (1=1", "Body, password", or "Body; DROP TABLE x" are rejected with the documented error, so a future regex loosening would not fail any test.
    • Fix: Add a small parameterized test case in the event_patterns describe block that passes each injection-style bodyExpression, asserts isError === true, and asserts the error text matches the documented "single column expression" message.
    • testing, security, adversarial, kieran-typescript
  • packages/cli/src/components/EventViewer/usePatternData.ts -- The CLI refactor that swaps inline Drain mining for the shared minePatterns() has zero tests under packages/cli/; the "backwards compatible" claim in the PR description (and the secondary CLI screenshot) is the only verification that sampleCount → count mapping, full-EventRow sample preservation, and estimatedCount sort order survived the refactor.
    • Fix: Extract the post-minePatterns mapping into a pure helper and add a unit test under packages/cli/src/components/EventViewer/ asserting that the per-pattern shape (count === sampleCount, samples preserve full EventRow, sorted by estimatedCount desc) matches the pre-refactor behavior.
    • testing
  • packages/api/src/mcp/__tests__/queryTool.test.ts:262-291 -- The 'should respect sampleSize parameter' and 'should respect where filter' tests assert only isError is falsy and content has length 1, providing false coverage: a regression where sampleSize is ignored or the where filter is silently dropped would still pass.
    • Fix: Seed > sampleSize rows and assert output.result.sampledRows <= sampleSize; for where, seed a mix of INFO and ERROR rows and assert only ERROR-bodied patterns appear when the SQL filter is applied.
    • testing
🔵 P3 nitpicks (10)
  • packages/api/src/mcp/tools/query/eventPatterns.ts:60-61 -- SAFE_BODY_EXPR_CHARS name and comment imply full syntax validation, but the regex is a conservative character allowlist; strings that pass it (e.g. Body:foo:bar) can still fail at the SQL parser.
    • Fix: Rename to BODY_EXPR_CHAR_ALLOWLIST and reword the comment as "reject body expressions containing characters outside a conservative allowlist; final validity is enforced by the ClickHouse parser."
  • packages/api/src/mcp/tools/query/eventPatterns.ts:132-145 -- The "Could not determine body column" error tells the agent to "specify bodyExpression explicitly" but does not point at the discovery primitive.
    • Fix: Append "Call hyperdx_list_sources to see available columns on this source" to the message and to the bodyExpression validation error.
  • packages/api/src/mcp/tools/query/eventPatterns.ts:343-344 -- The output is JSON.stringifyed twice (once on the original, once on the trimmed) just to detect whether trimToolResponse modified it; on a near-trim-budget response this can serialize several MB of data redundantly.
    • Fix: Have trimToolResponse return { data, trimmed: boolean }, or check the existing __hdx_trimmed sentinel that trimObject already writes onto the root.
  • packages/api/src/mcp/tools/query/eventPatterns.ts:304-307 -- literalTokens escapes only \ and "; Lucene metacharacters in pattern words (:, (, ), +, -) survive into the phrase context and may interact unexpectedly with encodeSpecialTokens' preprocessing of http://, localhost:N, etc.
    • Fix: Pass each literal token through the same Lucene-escape helper the parser's own tests use, or add a unit test that feeds pattern bodies containing each metacharacter back through SearchQueryBuilder to assert sensible output.
  • packages/common-utils/src/drain/mine-patterns.ts:822-825 -- clusterCounts is built in a redundant second pass over clustered; the count can be tracked inline on each group struct during the first pass.
    • Fix: Add a count field to the group record and increment on both branches of the if (existing).
  • packages/common-utils/src/drain/mine-patterns.ts:683,698 -- PatternGroup<TRow = Record<string, unknown>> and MinePatternResult<TRow = …> have a default type parameter but no extends bound, so external code can instantiate PatternGroup<number> even though MinePatternOptions/minePatterns enforce the constraint.
    • Fix: Change both declarations to <TRow extends Record<string, unknown> = Record<string, unknown>>.
  • packages/common-utils/src/drain/mine-patterns.ts:140 -- toStartOfInterval(new Date(tsMs), granularity) (UTC clock-time floor) replaces the CLI's previous Math.floor(ts / granularityMs) * granularityMs (pure epoch-ms floor); these agree on common granularities but diverge for irregular ones like 7m/13m, so the "backwards compatible" claim is approximate.
    • Fix: Either acknowledge the alignment-with-SQL semantics improvement explicitly in the changeset, or add a unit test fixture that pins behavior on a 7-minute granularity.
  • packages/cli/src/components/EventViewer/usePatternData.ts:11-29 -- The import { type TrendBucket } followed by export type { TrendBucket } re-exports a type no other file in packages/cli/src imports from this hook.
    • Fix: Drop the re-export; consumers can import TrendBucket directly from @hyperdx/common-utils/dist/drain.
  • packages/common-utils/src/drain/index.ts:7-13 -- Six new symbols are exported from the drain barrel, but only minePatterns and TrendBucket have consumers outside the new module; flattenBody, PatternGroup, MinePatternOptions, MinePatternResult widen the public surface preemptively.
    • Fix: Export only the consumed symbols and promote the rest from module-local to barrel when a second consumer needs them.
  • .changeset/dry-donkeys-add.md -- @hyperdx/common-utils is bumped as patch despite adding six new exported symbols (strict semver would call this minor); package is private: true so external impact is bounded, but the changelog entry will mislead future maintainers.
    • Fix: Bump @hyperdx/common-utils to minor to match the additive surface change.

Reviewers (12): correctness, testing, maintainability, project-standards, agent-native, learnings-researcher, security, performance, api-contract, reliability, adversarial, kieran-typescript.

Testing gaps:

  • Empty-sample "no data found" hint return and ClickHouse query failed / Pattern mining failed catch arms are not exercised by any test.
  • trimToolResponse "Result was trimmed for context size" note branch is never asserted.
  • Mutual-exclusion validation that rejects sampleSize / bodyExpression on non-event_patterns displayTypes (schemas.ts:618-625) has no negative test.
  • The whereSnippetsearch chain advertised in the tool description is never round-tripped through an integration test.
  • Seeded event_patterns integration test uses a Date.now() window against ClickHouse-written timestamps; pin to fixed dates around the seeded rows to harden against clock skew and Drain ordering sensitivity.

@brandon-pereira brandon-pereira requested review from a team and dhable and removed request for a team May 11, 2026 19:01
@kodiakhq kodiakhq Bot merged commit f6a1d02 into main May 12, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge review/tier-3 Standard — full human review required

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants