feat(taxonomic-filter): explicit opt-in for stale event results #59816
## Problem
The taxonomic filter's Events and Custom Events tabs included event
definitions that hadn't been seen for more than 30 days alongside fresh
events, marking them visually with a "Stale" tag but not hiding them.
Users searching for a common term (`url`, `email`, `utm`) frequently ended
up with stale or legacy events at positions where they expected current
ones. The MCP `event_definitions` tool had no way to make the same
distinction.
## Changes
Backend:
- New `STALE_EVENT_DAYS = 30` constant in `posthog/taxonomy/taxonomy.py`.
- New `?exclude_stale=true` query parameter on the
`/api/projects/.../event_definitions/` list endpoint. Filters by
`last_seen_at` against `NOW() - INTERVAL '30 days'`. Event definitions
with a NULL `last_seen_at` (manually created, not yet ingested) are
kept so newly-defined events remain discoverable.
- `@extend_schema(parameters=[...])` on the list method so the flag
surfaces in OpenAPI / generated MCP tool definitions with a
description telling LLM callers to retry with `exclude_stale=false`
when a search returns zero results.
- Three parameterized backend tests in a dedicated `APIBaseTest`
subclass (the existing test class is wrapped in `freeze_time`, which
Postgres `NOW()` does not respect).
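
The `exclude_stale` semantics described above can be modelled in memory. The sketch below is an illustration only, not the Django/SQL implementation: `survivesExcludeStale` and `daysAgo` are hypothetical names, and the strict `>` comparison is an assumption, since the description only says the filter compares `last_seen_at` against `NOW() - INTERVAL '30 days'`.

```typescript
// Illustrative in-memory model of the exclude_stale predicate; not the actual SQL.
const STALE_EVENT_DAYS = 30
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Hypothetical helper: true when a definition would still be listed with exclude_stale=true.
function survivesExcludeStale(lastSeenAt: Date | null, now: Date): boolean {
    // NULL last_seen_at (manually created, not yet ingested) is always kept.
    if (lastSeenAt === null) {
        return true
    }
    const cutoff = now.getTime() - STALE_EVENT_DAYS * MS_PER_DAY
    // Assumption: strict comparison, so an event last seen exactly 30 days ago is excluded.
    return lastSeenAt.getTime() > cutoff
}

const now = new Date('2024-06-30T00:00:00Z')
const daysAgo = (days: number): Date => new Date(now.getTime() - days * MS_PER_DAY)

console.log(survivesExcludeStale(daysAgo(1), now)) // true
console.log(survivesExcludeStale(daysAgo(45), now)) // false
console.log(survivesExcludeStale(null, now)) // true
```

Passing `now` explicitly keeps the model deterministic, which sidesteps the `freeze_time` versus Postgres `NOW()` mismatch that the dedicated test class works around.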
Frontend:
- `STALE_EVENT_SECONDS` in `lib/constants` now derives from a frontend
`STALE_EVENT_DAYS = 30` so the threshold has a single value on each
side rather than drifting independently.
- `infiniteListLogic.loadRemoteItems` adds `exclude_stale=true` to its
`searchParams` for the Events / Custom Events tabs when the new
`includeStaleEvents` reducer is `false`. Putting the flag in
searchParams (rather than baking it into the tab's `endpoint` URL via
`taxonomicGroups`) keeps the group definitions stable and avoids
pushing the `taxonomicGroups` selector past kea's 16-input cap.
- New `setIncludeStaleEvents` action + reducer in
`taxonomicFilterLogic`. Resets back to `false` on `setSearchQuery`
and `setActiveTab` so the user has to re-opt-in for every new query.
- Empty-state "Include stale events" button shown only on the Events
and Custom Events tabs after a search that returned no fresh
matches.
- Telemetry additions (all additive — existing payload shapes
unchanged):
- `taxonomic_filter_search_query` gains `excludeStale`.
- `taxonomic filter item selected` gains `wasStale` (only set for
Events / Custom Events selections).
- New `taxonomic filter include stale toggled` event captured when
the user clicks the empty-state button.
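
The interaction between the `includeStaleEvents` reset and the `exclude_stale` search parameter can be sketched as a plain reducer. This is a simplified model, not the kea code: the tab identifiers, the `search` parameter name, and the `reduce` / `buildSearchParams` helpers are assumptions made for illustration.

```typescript
// Simplified model; the real logic lives in kea (taxonomicFilterLogic / infiniteListLogic).
type TabName = 'events' | 'custom_events' | 'persons' // hypothetical tab identifiers

interface FilterState {
    searchQuery: string
    activeTab: TabName
    includeStaleEvents: boolean
}

type FilterAction =
    | { type: 'setSearchQuery'; searchQuery: string }
    | { type: 'setActiveTab'; activeTab: TabName }
    | { type: 'setIncludeStaleEvents'; includeStaleEvents: boolean }

function reduce(state: FilterState, action: FilterAction): FilterState {
    if (action.type === 'setSearchQuery') {
        // A new query resets the opt-in, so the user must re-opt-in.
        return { ...state, searchQuery: action.searchQuery, includeStaleEvents: false }
    }
    if (action.type === 'setActiveTab') {
        // Switching tabs resets the opt-in as well.
        return { ...state, activeTab: action.activeTab, includeStaleEvents: false }
    }
    return { ...state, includeStaleEvents: action.includeStaleEvents }
}

// The flag is added to the search params rather than baked into the endpoint URL.
function buildSearchParams(state: FilterState): Record<string, string> {
    const params: Record<string, string> = { search: state.searchQuery }
    const isEventTab = state.activeTab === 'events' || state.activeTab === 'custom_events'
    if (isEventTab && !state.includeStaleEvents) {
        params['exclude_stale'] = 'true'
    }
    return params
}

let state: FilterState = { searchQuery: 'url', activeTab: 'events', includeStaleEvents: false }
console.log(buildSearchParams(state)) // { search: 'url', exclude_stale: 'true' }
state = reduce(state, { type: 'setIncludeStaleEvents', includeStaleEvents: true })
console.log(buildSearchParams(state)) // { search: 'url' }
state = reduce(state, { type: 'setSearchQuery', searchQuery: 'email' })
console.log(buildSearchParams(state)) // { search: 'email', exclude_stale: 'true' }
```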
MCP: types regenerated via `hogli build:openapi` so the new parameter
appears in `services/mcp/src/api/generated.ts` and
`products/event_definitions/frontend/generated/api.schemas.ts`.
## How did you test this code?
This PR was written by an agent. Automated tests run:
- `hogli test posthog/api/test/test_event_definition.py` — 37/37 pass,
including the three new parameterized exclude_stale cases.
- `hogli test frontend/src/lib/components/TaxonomicFilter/` —
362/362 pass.
- `ruff check` and `ruff format` on touched Python files — clean.
- `pnpm format` on touched frontend files — clean.
- `hogli build:openapi` — confirmed `exclude_stale` shows up in the
generated MCP and frontend types with the retry-on-empty hint.
The agent did not manually click through the UI.
## 🤖 Agent context
Written by PostHog Code (Claude Opus 4.7). Discussion before
implementation considered whether the API should auto-fall-back to
including stale results on zero matches; we explicitly chose **not**
to, so the contract stays "`exclude_stale` is a pure filter" — the MCP
client is responsible for retrying with `exclude_stale=false` if it
wants stale results.
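
Because the server treats `exclude_stale` as a pure filter, any fallback lives in the caller. A minimal sketch of such a client-side retry follows; the `EventDefinitionSearch` fetcher signature and `searchWithStaleFallback` are hypothetical, and a real client would call the `event_definitions` list endpoint instead of the fake used here.

```typescript
// Hypothetical fetcher signature standing in for a call to the event_definitions list endpoint.
type EventDefinitionSearch = (search: string, excludeStale: boolean) => Promise<string[]>

interface SearchOutcome {
    results: string[]
    includedStale: boolean
}

async function searchWithStaleFallback(
    search: string,
    fetchDefinitions: EventDefinitionSearch
): Promise<SearchOutcome> {
    // First attempt: fresh definitions only.
    const fresh = await fetchDefinitions(search, true)
    if (fresh.length > 0) {
        return { results: fresh, includedStale: false }
    }
    // The API never falls back on its own, so the caller retries explicitly.
    const all = await fetchDefinitions(search, false)
    return { results: all, includedStale: true }
}

// Fake fetcher for illustration: the only match is a stale definition.
const fakeFetch: EventDefinitionSearch = async (_search, excludeStale) =>
    excludeStale ? [] : ['legacy_signup']

searchWithStaleFallback('signup', fakeFetch).then((outcome) => console.log(outcome))
```

Returning `includedStale` alongside the results lets the caller tell the user (or the LLM) that the matches came from the stale set.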
The frontend originally tried to thread `includeStaleEvents` through
`taxonomicGroups` as a 17th selector input — kea's selector tuple type
caps at 16, which broke logic mounting in tests. Reworked to inject
`exclude_stale` into searchParams inside `infiniteListLogic` instead,
which also avoids `taxonomicGroups` recomputing every time the toggle
flips.
This change defaults users to seeing fewer rows on the Events tab —
that's an ordering / position-0 effect. The taxonomic-filter skill
flags such changes for explicit human sign-off; the human asked for
this design.
Generated-By: PostHog Code
Task-Id: 2649f7ae-c1f7-40ae-8866-be024f3f1285
🎭 Playwright didn't run on this PR. Your changes touch code that could affect E2E behavior, but Playwright is now opt-in via label to keep CI cost down. Most PRs don't need this. Real regressions still get caught on master and fix-forward.

Size Change: +4.32 kB (+0.01%). Total Size: 79.8 MB.

ClickHouse migration SQL per cloud environment: No ClickHouse migrations changed in this PR.
Note: 🤖 Automated comment by QA Swarm, not written by a human

Verdict: 💬 APPROVE WITH NITS (MEDIUM)

Four reviewer perspectives applied (qa-team, paul-reviewer, xp-reviewer, security-audit). The stale-events behaviour is well-scoped, the SQL is parameterised, tenant isolation is preserved by the upstream […]. Findings below are all non-blocking.

🟡 MEDIUM
- [security + database] SQL predicate has no supporting index: […]
- [paul + xp] Unsafe-ish cast: The cast is guarded by […]

🟢 LOW / NIT
- [paul] Comment in reducer says what not why: The 4-line block comment explains the resetting behaviour. The reducer itself (with […]
- [paul] Empty-state-only placement of the toggle: Once you've clicked "Include stale events" and the list populates, the toggle disappears. You can only turn it back off by setting a new search or switching tabs (which is the implicit reset). I wonder if folks will hunt for it. data-attr is in place, which is great; the […]
- [xp] STALE_EVENT_DAYS duplicated across frontend + backend: Both constants are […]
- [security] Tenant isolation preserved (informational): Confirmed: the new predicate is appended to […]

Convergence
Per-reviewer assessment
pauldambra left a comment

Note: 🤖 Automated comment by QA Swarm, not written by a human
Inline findings from a 4-perspective review (qa-team, paul-reviewer, xp-reviewer, security-audit). Top-level summary in a separate comment.
- `event_definition.py`: add comment documenting the access-pattern assumption for the new `last_seen_at` predicate. The team-scoped `(team_id, name)` index already narrows the row set per tenant and the response is paginated, so the unindexed `last_seen_at` filter is bounded by upstream constraints. Flagged by qa-team-database + security-audit (convergent) on PR #59816; the data warehouse can still re-EXPLAIN this if the largest tenants start showing it in slow-query logs.
- `taxonomicFilterLogic.tsx`: replace `item as EventDefinition` cast with a structural `'last_seen_at' in item` guard. Documents the shape contract for the `wasStale` telemetry property and lets `isDefinitionStale` accept the runtime-shaped object without a bare-faced type assertion. Flagged by paul-reviewer + xp-reviewer (convergent) on PR #59816.

No behaviour change. Tests still pass (`test_event_definition.py::TestEventDefinitionExcludeStale` 3/3, `taxonomicFilterLogic.test.ts` 63/63).

Generated-By: PostHog Code
Task-Id: 2649f7ae-c1f7-40ae-8866-be024f3f1285
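
The structural guard mentioned in this commit can be sketched as a user-defined type guard. The shapes below are hypothetical and trimmed down for illustration, `hasLastSeenAt` and `wasStale` are illustrative names rather than the real helpers, and treating a `null` `last_seen_at` as "not stale" is an assumption.

```typescript
// Hypothetical, trimmed-down shapes; the real types carry many more fields.
interface EventDefinitionLike {
    name: string
    last_seen_at: string | null
}
type TaxonomicItem = { name: string } | EventDefinitionLike

const STALE_EVENT_DAYS = 30
const STALE_EVENT_SECONDS = STALE_EVENT_DAYS * 24 * 60 * 60

// Structural check instead of an `item as EventDefinition` assertion.
function hasLastSeenAt(item: TaxonomicItem): item is EventDefinitionLike {
    return 'last_seen_at' in item
}

// Returns undefined for items that are not event definitions, mirroring
// "only set for Events / Custom Events selections".
function wasStale(item: TaxonomicItem, now: Date): boolean | undefined {
    if (!hasLastSeenAt(item)) {
        return undefined
    }
    if (item.last_seen_at === null) {
        return false // assumption: never-seen definitions are not counted as stale
    }
    const ageSeconds = (now.getTime() - new Date(item.last_seen_at).getTime()) / 1000
    return ageSeconds > STALE_EVENT_SECONDS
}

const selectedAt = new Date('2024-06-30T00:00:00Z')
console.log(wasStale({ name: 'plain item' }, selectedAt)) // undefined
console.log(wasStale({ name: 'old', last_seen_at: '2024-01-01T00:00:00Z' }, selectedAt)) // true
console.log(wasStale({ name: 'new', last_seen_at: '2024-06-29T00:00:00Z' }, selectedAt)) // false
```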
Additive feature that safely filters stale events via a new opt-in API parameter. SQL injection is not a concern since the interval value comes from a hardcoded constant. The missing-index concern from the bot review is explicitly acknowledged in a code comment with documented reasoning (team-scoped index + pagination narrows rows before the predicate). All other inline feedback was NITs or positive.
Reviews (1): Last reviewed commit: "fix(taxonomic-filter): address QA Swarm ..."
The comment said the predicate runs after `(team_id, name)` narrows the row set, but the actual pre-filter uses `project_id` (from `create_event_definitions_sql`). Corrected to "project-scoped pre-filter"; the performance reasoning is unchanged. Caught by qa-team in second qa-swarm pass on PR #59816.

Generated-By: PostHog Code
Task-Id: 2649f7ae-c1f7-40ae-8866-be024f3f1285
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 54a5c8f5ae
Note: 🤖 Automated comment by QA Swarm, not written by a human

Multi-perspective review pass 2, HEAD […]

Verdict: 💬 APPROVE WITH NITS

The second qa-swarm pass on the updated HEAD found no new blocking issues. Two MEDIUM convergent findings worth tracking:

Key findings
- 🟡 MEDIUM (convergent: qa-team + xp): […]
- 🟡 MEDIUM (xp): OnceAndOnlyOnce violation ([…]
- 🟢 LOW (paul): […]
- 🟢 LOW (qa-team): No test for the exact 30-day boundary case (fresh=1d, stale=45d, ancient=365d, never-seen=null; missing: exactly=30d).

Fix committed
Convergence
Reviewer summaries

Automated by QA Swarm, not a human review
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
Additive, opt-in feature with backward-compatible defaults. SQL uses parameterized intervals from a hardcoded constant (no injection risk), test coverage is solid, and generated types are properly updated. All substantive review comments are resolved; the lone unresolved Codex P2 comment about the strict > boundary at exactly 30 days is a negligible real-world difference, not a showstopper.
Shows the "Include stale events" button that appears when a search on the Events tab returns no results because all matches are filtered out by `exclude_stale=true`.

Generated-By: PostHog Code
Task-Id: 2649f7ae-c1f7-40ae-8866-be024f3f1285
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
Purely additive, backward-compatible feature. The new exclude_stale SQL predicate uses a properly parameterized interval from a hardcoded constant (no injection risk), null values are handled correctly to keep newly-defined events discoverable, tests cover default/false/true cases, and all generated types are updated. All substantive review comments are resolved.
✅ Visual changes approved by @pauldambra: baseline updated in 4 changed.
3 updated Run: 43e7ea7f-15b4-4db0-b3c3-b4ff17dec2ae Co-authored-by: pauldambra <984817+pauldambra@users.noreply.github.com>
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
Additive, backward-compatible opt-in filter. SQL uses a parameterized interval from a hardcoded constant (no injection risk), null handling keeps newly-defined events discoverable, tests cover default/false/true cases, generated types are updated, and all substantive review comments are resolved.
3 updated Run: 31ab1f99-122a-4d85-895d-2c10c351d1d4 Co-authored-by: pauldambra <984817+pauldambra@users.noreply.github.com>
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
Purely additive, backward-compatible opt-in feature. SQL uses a parameterized interval from a hardcoded constant (no injection risk), null handling keeps newly-defined events discoverable, tests cover default/false/true cases, and generated types are updated in both consumer packages. All substantive review comments are resolved.
7 updated Run: e63c9606-52ba-460c-97a0-13705b59c5f7 Co-authored-by: pauldambra <984817+pauldambra@users.noreply.github.com>
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
Purely additive, backward-compatible opt-in feature. The SQL predicate uses a properly parameterized interval from a hardcoded constant (no injection risk), null handling preserves newly-defined events, tests cover all three cases, generated types are updated in both consumer packages, and all substantive review comments are resolved.
4 updated Run: e3f7810a-6170-47ae-b81d-c66137864fc0 Co-authored-by: pauldambra <984817+pauldambra@users.noreply.github.com>
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
Purely additive, backward-compatible opt-in feature. The SQL predicate is parameterized from a hardcoded constant (no injection risk), null handling correctly preserves never-seen events, tests cover all three cases, and generated types are updated in both consumer packages. All inline review comments are resolved with no substantive unaddressed concerns.