feat(heatmaps): Add cohort filtering to heatmaps#59909
Conversation
|
🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down. Add the Most PRs don't need this. Real regressions still get caught on master and fix-forward. |
|
Size Change: 0 B Total Size: 79.9 MB ℹ️ View Unchanged
|
Prompt To Fix All With AIFix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
frontend/src/lib/components/heatmaps/heatmapDataLogic.ts:190
**Cohort IDs serialisation mismatch with backend**
`cohort_ids` is passed to `encodeParams` as a raw `number[]`, but the backend `validate_cohort_ids` deserialises it with `json.loads(value)`, which expects a JSON-encoded string like `"[1,2]"`. `encodeParams` (from kea-router) serialises arrays as repeated query-string entries (`?cohort_ids=1&cohort_ids=2`), and Django's `CharField` serialiser receives only one of those values; `loads("1")` produces an integer, which immediately fails the `isinstance(cohort_ids, list)` check. Notice that `points` on the same call is explicitly wrapped in `JSON.stringify` — `cohort_ids` needs the same treatment. The same issue applies to all three call sites in this file.
### Issue 2 of 3
products/web_analytics/backend/api/heatmaps_api.py:389-398
**Multiple cohorts are ANDed, not ORed**
Each cohort ID appends its own predicate to `predicate_expressions`, and the entire list is later combined with `ast.And(exprs=exprs)`. Selecting two cohorts therefore requires a visitor to be in *both* cohorts simultaneously. If the intent is to show data for visitors belonging to *any* of the selected cohorts, the per-cohort sub-queries should be combined with OR and the result ANDed into the outer list as a single expression: `ast.Or(exprs=[...per-cohort-exprs...])`. Even if AND semantics are intentional, this is worth documenting so future reviewers don't assume OR.
### Issue 3 of 3
products/web_analytics/backend/api/heatmaps_api.py:292-295
**Feature-flag gate silently drops cohort filter instead of rejecting the request**
When the flag is off, `cohort_ids` is popped and the query runs without the filter, so the caller receives unfiltered results with no indication the filter was ignored. A caller (or a UI bug) that sends cohort IDs while the flag is disabled will silently get wrong data. Returning a 403 or a validation error would make the boundary explicit. The same pattern repeats in the `events` action.
Reviews (1): Last reviewed commit: "Add cohort filtering to heatmaps" | Re-trigger Greptile |
| viewport_width_max: values.viewportRange.max, | ||
| aggregation, | ||
| filter_test_accounts, | ||
| cohort_ids: cohort_ids && cohort_ids.length > 0 ? cohort_ids : undefined, |
There was a problem hiding this comment.
Cohort IDs serialisation mismatch with backend
cohort_ids is passed to encodeParams as a raw number[], but the backend validate_cohort_ids deserialises it with json.loads(value), which expects a JSON-encoded string like "[1,2]". encodeParams (from kea-router) serialises arrays as repeated query-string entries (?cohort_ids=1&cohort_ids=2), and Django's CharField serialiser receives only one of those values; loads("1") produces an integer, which immediately fails the isinstance(cohort_ids, list) check. Notice that points on the same call is explicitly wrapped in JSON.stringify — cohort_ids needs the same treatment. The same issue applies to all three call sites in this file.
Prompt To Fix With AI
This is a comment left during a code review.
Path: frontend/src/lib/components/heatmaps/heatmapDataLogic.ts
Line: 190
Comment:
**Cohort IDs serialisation mismatch with backend**
`cohort_ids` is passed to `encodeParams` as a raw `number[]`, but the backend `validate_cohort_ids` deserialises it with `json.loads(value)`, which expects a JSON-encoded string like `"[1,2]"`. `encodeParams` (from kea-router) serialises arrays as repeated query-string entries (`?cohort_ids=1&cohort_ids=2`), and Django's `CharField` serialiser receives only one of those values; `loads("1")` produces an integer, which immediately fails the `isinstance(cohort_ids, list)` check. Notice that `points` on the same call is explicitly wrapped in `JSON.stringify` — `cohort_ids` needs the same treatment. The same issue applies to all three call sites in this file.
How can I resolve this? If you propose a fix, please make it concise.| cohort_ids_expr = placeholders.get("cohort_ids") | ||
| if isinstance(cohort_ids_expr, ast.Array): | ||
| for cohort_id_expr in cohort_ids_expr.exprs: | ||
| predicate_expressions.append( | ||
| parse_expr( | ||
| "distinct_id IN (SELECT distinct_id FROM person_distinct_ids " | ||
| "WHERE person_id IN COHORT {cohort_id})", | ||
| {"cohort_id": cohort_id_expr}, | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Multiple cohorts are ANDed, not ORed
Each cohort ID appends its own predicate to predicate_expressions, and the entire list is later combined with ast.And(exprs=exprs). Selecting two cohorts therefore requires a visitor to be in both cohorts simultaneously. If the intent is to show data for visitors belonging to any of the selected cohorts, the per-cohort sub-queries should be combined with OR and the result ANDed into the outer list as a single expression: ast.Or(exprs=[...per-cohort-exprs...]). Even if AND semantics are intentional, this is worth documenting so future reviewers don't assume OR.
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/web_analytics/backend/api/heatmaps_api.py
Line: 389-398
Comment:
**Multiple cohorts are ANDed, not ORed**
Each cohort ID appends its own predicate to `predicate_expressions`, and the entire list is later combined with `ast.And(exprs=exprs)`. Selecting two cohorts therefore requires a visitor to be in *both* cohorts simultaneously. If the intent is to show data for visitors belonging to *any* of the selected cohorts, the per-cohort sub-queries should be combined with OR and the result ANDed into the outer list as a single expression: `ast.Or(exprs=[...per-cohort-exprs...])`. Even if AND semantics are intentional, this is worth documenting so future reviewers don't assume OR.
How can I resolve this? If you propose a fix, please make it concise.| if request_serializer.validated_data.get("cohort_ids") and not _heatmaps_cohort_filter_enabled( | ||
| cast(User, request.user), self.team | ||
| ): | ||
| request_serializer.validated_data.pop("cohort_ids") |
There was a problem hiding this comment.
Feature-flag gate silently drops cohort filter instead of rejecting the request
When the flag is off, cohort_ids is popped and the query runs without the filter, so the caller receives unfiltered results with no indication the filter was ignored. A caller (or a UI bug) that sends cohort IDs while the flag is disabled will silently get wrong data. Returning a 403 or a validation error would make the boundary explicit. The same pattern repeats in the events action.
Prompt To Fix With AI
This is a comment left during a code review.
Path: products/web_analytics/backend/api/heatmaps_api.py
Line: 292-295
Comment:
**Feature-flag gate silently drops cohort filter instead of rejecting the request**
When the flag is off, `cohort_ids` is popped and the query runs without the filter, so the caller receives unfiltered results with no indication the filter was ignored. A caller (or a UI bug) that sends cohort IDs while the flag is disabled will silently get wrong data. Returning a 403 or a validation error would make the boundary explicit. The same pattern repeats in the `events` action.
How can I resolve this? If you propose a fix, please make it concise.
ClickHouse migration SQL per cloud environmentNo ClickHouse migrations changed in this PR. |
arthurdedeus
left a comment
There was a problem hiding this comment.
Nice! Love the videos
|
👋 I analyzed this PR for documentation updates, but the cohort filtering feature is gated behind the I'll hold off on creating documentation until the flag is removed and the feature is publicly available. Once that happens, the heatmaps documentation will need to be updated to cover the new cohort filtering capability. |
Problem
Previously, it was impossible to filter heatmaps by cohort. This made it difficult to see how different groups interacted with your app
Changes
Adds a new cohort filter on the heatmaps page.
Screen.Recording.2026-05-25.at.9.49.38.AM.mov
How did you test this code?
Locally
👉 Stay up-to-date with PostHog coding conventions for a smoother review.
Publish to changelog?
No
Docs update
🤖 Agent context