Skip to content

feat: vector similarity-based behavioral pattern detection#375

Merged
BYK merged 1 commit into
mainfrom
feat-pattern-echo
May 18, 2026
Merged

feat: vector similarity-based behavioral pattern detection#375
BYK merged 1 commit into
mainfrom
feat-pattern-echo

Conversation

@BYK
Copy link
Copy Markdown
Owner

@BYK BYK commented May 18, 2026

Summary

Adds pattern echo detection to the distillation pipeline. When a new distillation segment has high embedding similarity to segments from previous sessions, it signals repeated user behavior. The system extracts the common pattern via LLM and creates a preference knowledge entry.

How it works

  1. After each gen-0 distillation segment is created, its embedding is compared against all previous segments for the same project via vectorSearchAllDistillations()
  2. Segments with cosine similarity >= 0.78 from different sessions are counted as "echoes"
  3. When 2+ echoes from distinct sessions are found, the pattern extraction LLM synthesizes the common behavioral pattern
  4. A preference entry is created at confidence 0.8 (auto-extracted, below user-stated 1.0)
  5. ltm.create() dedup guard prevents duplicate entries
  6. Rate limited to 1 extraction per session per 10 minutes

Detected patterns in eval

  • "Always use centralized error handling with try/catch and semantic HTTP codes"
  • "Never use ORMs — always use raw parameterized SQL via pg"
  • "Always use const by default; only use let when reassigned"
  • "Always follow a strict post-implementation checklist for API endpoints"

Limitations

Pattern echoes detect topically similar behaviors (same phrasing across sessions). They miss abstract behavioral patterns where the surface text varies — e.g., "always asks for tests after implementation" when the tests are for REST endpoints in one session and React components in another. These require higher-level temporal/sequential reasoning.

Files

  • packages/core/src/pattern-echo.ts (NEW) — detection logic, ~170 lines
  • packages/core/src/prompt.tsPATTERN_ECHO_SYSTEM prompt + patternEchoUser()
  • packages/core/src/distillation.ts — hook into distillSegment(), await when urgent

Adds pattern echo detection to the distillation pipeline. After each
distillation segment is created, compares its embedding against all
previous segments for the same project. When a segment is similar to
2+ prior segments from different sessions (cosine >= 0.78), uses the
curator LLM to extract the common behavioral pattern and create a
preference knowledge entry.

Detected patterns in eval: error handling conventions, ORM avoidance,
const-by-default, post-implementation checklists.

Also adds session recording/replay and --scenarios flag to eval harness
for ~90% cheaper subsequent eval runs.
@BYK BYK self-assigned this May 18, 2026
@BYK BYK merged commit 3b57cd0 into main May 18, 2026
10 checks passed
@BYK BYK deleted the feat-pattern-echo branch May 18, 2026 22:12
This was referenced May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant