feat(hogql): add HogQL expression breakdowns to retention#60853
Merged
Conversation
Retention already supported breakdowns by event, person, group, cohort and data-warehouse person property, but not by a custom HogQL/SQL expression like trends does. Add the missing piece on both ends: - backend: handle breakdown_type "hogql" in breakdown_extract_expr by parsing the value as a HogQL expression (and stripping display-only aliases), instead of falling through to the event-property path where it was misread as a property name - frontend: surface the HogQL expression option in the retention breakdown picker's taxonomic group types Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Project convention: Python tests don't use doc comments. Move the rationale to an inline comment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
posthog/hogql_queries/insights/retention/test/test_retention_query_runner.py:4004-4058
**Non-parameterized test for a multi-case feature**
The project convention prefers parameterized tests. This test verifies a single expression (`upper(properties.browser)`), but the change introduces logic with two distinct code paths worth covering independently: the expression-evaluation path and the alias-stripping path (`strip_user_aliases`). A parameterized test could cover both, e.g. one case for a plain expression and one for `upper(properties.browser) AS browser_upper` (with alias). Without the alias case, the `strip_user_aliases` call goes untested and a regression there would silently break HogQL breakdowns that include user-supplied `AS` aliases.
Reviews (1): Last reviewed commit: "chore(retention): drop docstring from ho..." | Re-trigger Greptile |
Contributor
|
Size Change: 0 B Total Size: 80.8 MB ℹ️ View Unchanged
|
Adds a `... AS browser_upper` case so strip_user_aliases is exercised, not just the plain-expression path. Addresses Greptile review feedback. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
|
🎭 Playwright report · View test results →
These issues are not necessarily caused by your changes. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Retention insights support breakdowns by event property, person property, feature flag, group, cohort, and data-warehouse person property, but not by a custom HogQL/SQL expression. Trends already supports this (the "HogQL expression" option in the breakdown picker), and users have asked for the same on retention so they can break down by a derived or computed value rather than a single raw property.
Changes
Two small additions, one on each side of the stack:
posthog/hogql_queries/insights/retention/utils.py):breakdown_extract_exprnow handlesbreakdown_type == "hogql". The value is parsed as a HogQL expression viaparse_expr(with display-onlyASaliases stripped, matching trends), instead of falling through to the event-property branch where the whole expression was misread as a property name. The parsed expression flows through the sametoString()/ifNull()wrapping as every other breakdown type, sobreakdown_valuestays a String and the response shape is unchanged.tag_contains_user_hogql()is called for the same query tagging trends applies to user-supplied HogQL.TaxonomicBreakdownPopover.tsx): addsTaxonomicFilterGroupType.HogQLExpressionto the taxonomic group types offered in the retention breakdown picker, in the same relative position it occupies for trends. No reordering of existing options.The schema already carried
breakdownFilter: BreakdownFilteronRetentionQueryandMultipleBreakdownType.HOGQL, so no schema or migration changes were needed. HogQL breakdowns are not cohort breakdowns, so they route through the standard_base_query()path; the cohort-specific union path is untouched.How did you test this code?
Automated (agent-run):
test_retention_with_breakdown_hogql_expression, which breaks down byupper(properties.browser)and asserts the buckets come back uppercased (CHROME/SAFARI/FIREFOX). That only happens if the expression is evaluated as SQL rather than read as a property name. It also checks the retention matrix for one bucket. Passes locally.-k breakdown). Every non-actor breakdown test passes (event, person, group, cohort, data-warehouse, virtual property). The actor-query tests fail only because the local personhog gRPC service was down; confirmed they fail identically on master without this change, so it is an environment limitation, not a regression. They run against personhog in CI.ty check), ruff lint/format, JS formatting.Manual:
upper(properties.$browser)as the breakdown and confirmed the retention chart split into the expected computed buckets.Automatic notifications
🤖 Agent context
Co-authored by Claude Code (Opus 4.8)
Investigation found that retention breakdown support was effectively complete except for the HogQL/SQL expression type: the backend
breakdown_extract_exprhad nohogqlbranch (so a HogQL breakdown was silently misinterpreted as an event-property name), and the frontend breakdown picker omitted theHogQLExpressiongroup for retention only. The trends implementation (posthog/hogql_queries/insights/trends/breakdown.py) was used as the reference for parsing and alias-stripping behavior so the two stay consistent.A note for the security-minded reviewer: parsing user input as HogQL is intentional and matches trends. HogQL is resolved and printed through the standard pipeline that scopes to the team, enforces field-access rules, and parameterizes literals;
strip_user_aliasesremoves display-onlyASclauses that would corrupt the wrapping SQL.Considered but left out of scope: multiple simultaneous HogQL breakdowns (retention is single-breakdown only today) and any interaction with the in-progress retention pre-aggregation read path (HogQL breakdowns fall back to the standard query path).