feat(insights): exact-first saved insights search#61611
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: abba632d4e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Size Change: 0 B Total Size: 81.9 MB ℹ️ View Unchanged
|
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
frontend/src/scenes/saved-insights/SavedInsights.tsx:1085-1087
When `appliedMatchMode === 'fuzzy'` returns zero results (`insights.count === 0`) but `hasIncludesMatches === true`, the banner renders as a bare "Show exact matches" button with no surrounding text — the `insights.count > 0` guard suppresses the "Showing results similar to…" sentence. A common trigger: any unambiguous term like "sales" where every trigram match is also an exact match, so `fuzzy_only_qs` is empty. The user sees an empty table and a floating button with no explanation of why the similar search returned nothing.
```suggestion
<div className="flex items-center gap-2 text-secondary text-sm mt-2">
{insights.count > 0 ? (
<span>Showing results similar to "{filters.search}".</span>
) : (
<span>No additional similar results found.</span>
)}
{hasIncludesMatches && (
```
### Issue 2 of 2
products/product_analytics/backend/api/insight.py:1402
**Missing `match_mode` enum validation**
`match_mode` is accepted from the query string but never validated against its documented enum (`auto`, `includes`, `fuzzy`). An unexpected value (e.g. `match_mode=typo`) falls through to the `elif has_includes:` / `else:` branches (effectively `auto` behaviour) while `applied_match_mode` in the response correctly reflects the resolved phase. The behaviour is safe, but callers receive no error signal and the OpenAPI contract is silently unenforced. A short `ValidationError` guard would make the contract explicit.
Reviews (1): Last reviewed commit: "feat(insights): two-phase saved insights..." | Re-trigger Greptile |
pauldambra
left a comment
There was a problem hiding this comment.
Note
🤖 Automated comment by QA Swarm — not written by a human
QA Swarm multi-perspective review (qa-team specialists, paul-reviewer, xp-reviewer, security-audit). See inline comments; full verdict in the summary comment.
|
Note 🤖 Automated comment by QA Swarm — not written by a human Multi-perspective review: qa-team (specialists + generalists), paul-reviewer, xp-reviewer, security-audit Verdict: 💬 APPROVE WITH NITSClean, well-tested, self-contained change — no correctness or security blockers. Two convergent items are worth a look (search query cost; response-metadata coupling); everything else is LOW/NIT. Key findings🟡 MEDIUM
🟢 LOW
⚪ NIT
Convergence (highest confidence)
Praise
Reviewer summaries
Automated by QA Swarm — not a human review |
There was a problem hiding this comment.
Two substantive unresolved review concerns need human judgment: (1) a convergent MEDIUM performance finding (three DB queries per search request — two exists() probes plus the main paginated query — including an expensive trigram similarity scan that runs on every ?search= call) that was raised by multiple reviewers and never marked resolved or deferred; (2) a P2 Codex finding that the new response fields (match_mode, has_includes_matches, has_fuzzy_matches) are dynamically patched onto response.data after super().list() and are absent from any serializer or @extend_schema response declaration, violating the project's documented API schema guidelines.
|
🎭 Playwright report · View test results → ❌ 1 failed test:
🔁 Flake verification failed (--repeat-each=10):
The report only shows the tests under verification. View report → Fix these before merging. These issues are not necessarily caused by your changes. |
049c383 to
ffce385
Compare
There was a problem hiding this comment.
No showstoppers. The implementation uses a clean single-query approach annotating _is_exact via Case/When, avoiding the extra exists() probes that earlier reviewers flagged (those comments are on older commits). Schema exposure, generated types, and backward compatibility are all handled correctly.
pauldambra
left a comment
There was a problem hiding this comment.
Note
🤖 Automated comment by QA Swarm — not written by a human
QA Swarm re-review of the simplified single-list design (qa-team, paul-reviewer, xp-reviewer, security-audit). The two convergent MEDIUM items from the first review are resolved by the simplification. See inline; verdict in the summary comment.
|
Note 🤖 Automated comment by QA Swarm — not written by a human Multi-perspective re-review (qa-team, paul-reviewer, xp-reviewer, security-audit) of the simplified single-list design. Verdict: 💬 APPROVE WITH NITSThe simplification landed cleanly. The two convergent MEDIUM items from the first review — the unconditional double Key findings (all LOW / NIT — none blocking)🟢 LOW
⚪ NIT
Convergence (highest confidence)
Praise
Reviewer summaries
Automated by QA Swarm — not a human review |
ffce385 to
26edba8
Compare
There was a problem hiding this comment.
All substantive review concerns were addressed in follow-up commits: the double-exists() probe is replaced by a single annotated query, schema exposure is handled via @extend_schema_field on the serializer, and match_mode validation was added. The remaining unresolved comments are advisory LOW/NIT items explicitly deemed "fine as-is." The change is backward-compatible (nullable field, null when no search) with consistent generated-type updates and good test coverage including a query-count guard.
26edba8 to
b78d670
Compare
There was a problem hiding this comment.
All substantive review concerns from earlier iterations (double-exists probes, missing schema annotation, missing enum validation) were addressed in follow-up commits. The current implementation is a clean single-query approach using Case/When to annotate _is_exact, backward-compatible nullable field, consistent generated-type updates, and solid test coverage including a query-count guard.
|
✅ Visual changes approved by @pauldambra — baseline updated in 10 changed, 2 new. |
b78d670 to
69de1bf
Compare
There was a problem hiding this comment.
Clean single-query implementation using Case/When to annotate exact vs. fuzzy match per row, with backward-compatible nullable field, proper schema annotation via @extend_schema_field, consistent generated-type updates, and a query-count guard test. All substantive concerns from earlier review rounds were addressed; remaining unresolved comments are low/nit self-authored notes.
Saved-insights search returns literal substring ("exact") and fuzzy trigram
("similar") matches together in one relevance-ranked list, ordered exact-first.
Each result carries `search_match_type` (exact|similar) and the UI badges the
"similar" rows. A single query with no mode parameter or response flags, so
agents and MCP get the same exact-first results from one insights-list call.
Generated-By: PostHog Code
Task-Id: 63a82e0c-6cc0-4cb0-8898-1f8f78586f7a
69de1bf to
6fcece5
Compare
There was a problem hiding this comment.
Clean single-query implementation using Case/When to annotate exact vs. fuzzy match per row; all substantive bot-review concerns from earlier iterations were addressed in follow-up commits, remaining comments are LOW/NIT items explicitly marked as non-blocking. Backward-compatible nullable field with consistent generated-type updates and solid test coverage including a query-count guard.
|
Note 🤖 Automated comment by PostHog Code — not written by a human Resolved the review threads that have been addressed:
Left open for a human call (NITs I considered but chose not to change):
|
2 updated Run: 8ea2a8cf-1b21-4d83-9298-9a6be014f035 Co-authored-by: pauldambra <984817+pauldambra@users.noreply.github.com>
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
There was a problem hiding this comment.
Clean single-query implementation adding exact-first search with per-row search_match_type annotation; backward-compatible nullable field, consistently updated generated types, and solid test coverage including a query-count guard. All substantive concerns from earlier review rounds were addressed.

Problem
Saved-insights search routed the whole query through Postgres trigram similarity with loose thresholds, so short terms like
revenuesurfaced fuzzy, noisy matches with no way to tell exact matches from approximate ones — for the UI or for agents/MCP hitting the same endpoint.Changes
GET /api/environments/:id/insights/?search=now returns literal substring ("exact") matches and fuzzy trigram ("similar") matches together in one relevance-ranked list, ordered exact-first. Each result carriessearch_match_type(exact|similar); it is null when the list isn't filtered bysearch.match_modeparameter, no cross-set response flags — the response describes the results, so it's self-describing for agents/MCP (no skill needed to drive it).SearchResultsStorybook story (Scenes-App/Saved Insights) exercises the badge + ordering for visual review._apply_searchannotates_is_exactfrom the icontains/tag predicate and orders-_is_exact, -_search_score, name; asearch_match_typeserializer field exposes it.This collapses an earlier two-phase (
match_mode+has_includes/has_fuzzyflags) design that briefly lived in this PR — simplified to one labeled list to cut consumer-side complexity while keeping all ranking server-side.How did you test this code?
I'm an agent (PostHog Code); no manual UI clicks beyond a local Storybook render of the new story. Automated, run locally:
test_insight.py: union exact-first ordering + per-rowsearch_match_type(exact vs similar),search_match_typenull without search, plus the pre-existingtest_list_filter_by_search_*and the N+1 query-count snapshot (21 pass).savedInsightsLogic.test.ts: per-rowsearch_match_typecarried through to results, plus the existing suite (12 pass).tsgoclean for every touched file;ruff/oxlintclean; OpenAPI + MCP regenerated and the MCP tool-schema snapshot updated.Automatic notifications
Docs update
No user-facing docs change.
🤖 Agent context
Authored by an agent (PostHog Code, Claude Opus) completing a Slack thread on noisy insight search.
match_modedesign (exact-only / fuzzy "show similar" toggle), reviewed by a multi-agent swarm. The requester then chose to reduce result/contract complexity: collapse to one exact-first list with a per-rowsearch_match_type, keeping the smarts server-side so agents/MCP get the same behavior from a singleinsights-listcall with no special handling.search_match_typelives onQueryBasedInsightModel(not the baseInsightModel) to avoid breaking the exhaustive activity-log describer record.Created with PostHog Code