fix(customer-analytics): Optimize usage metrics query runner with interval grouping #54348

Merged
arthurdedeus merged 1 commit into master from posthog-code/usage-metrics-query-optimization on Apr 22, 2026

Conversation

@arthurdedeus arthurdedeus commented Apr 13, 2026

Problem

The usage metrics query runner generated one SELECT per metric, UNION ALL'd together, each independently scanning the events table. With multiple metrics configured, this meant redundant scans of the same data — all metrics for a given entity (person/group) share the same base filter.

Changes

  • Refactored query runner to group metrics by interval and build one query per interval group using countIf/sumIf conditional aggregation
  • Daily bucketing with toStartOfDay + gap-filling in Python, replacing per-metric parse_select template queries
  • Single events table scan per interval group instead of per-metric UNION ALL
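The approach in the list above can be sketched in plain Python. This is only an in-memory illustration of the idea, not the runner's actual code: the `Metric` class, its fields, and the helper names are hypothetical, and the loop merely mimics what `countIf`/`sumIf` bucketed by `toStartOfDay` would compute in a single scan, followed by gap-filling for days with no rows.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Metric:
    # Hypothetical stand-in for a configured usage metric.
    name: str
    event: str
    interval_days: int
    math: str = "count"  # "count" or "sum"
    prop: Optional[str] = None  # property summed when math == "sum"


def group_by_interval(metrics):
    """Group metrics that share an interval so they can share one scan."""
    groups = defaultdict(list)
    for metric in metrics:
        groups[metric.interval_days].append(metric)
    return dict(groups)


def aggregate_daily(events, metrics):
    """One pass over events, mimicking countIf/sumIf bucketed by day."""
    totals = defaultdict(float)  # (metric name, day) -> value
    for event in events:
        day = event["timestamp"].date()
        for metric in metrics:
            if event["event"] != metric.event:
                continue
            if metric.math == "sum":
                totals[(metric.name, day)] += float(event["properties"].get(metric.prop, 0))
            else:
                totals[(metric.name, day)] += 1
    return totals


def fill_gaps(totals, metric_name, start, end):
    """Return one value per day in [start, end], using 0 for days with no rows."""
    days = (end - start).days + 1
    return [totals.get((metric_name, start + timedelta(days=i)), 0) for i in range(days)]
```

Each event is visited once per interval group regardless of how many metrics the group contains, which is the property the single-scan query is after.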

How did you test this code?

Existing test suite passes with updated snapshots confirming the new query structure. hogli test products/customer_analytics/backend/hogql_queries/test/test_usage_metrics_query_runner.py — 25 tests pass.

Publish to changelog?

No

Docs update

🤖 LLM context

Authored by PostHog Code (Claude Code).


Created with PostHog Code

@github-actions

Hey @arthurdedeus! 👋
This pull request seems to contain no description. Please add useful context, rationale, and/or any other information that will help make sense of this change now and in the distant Mars-based future.

@arthurdedeus arthurdedeus changed the title fix(customer-analytics): optimize usage metrics query runner with interval grouping fix(customer-analytics): Optimize usage metrics query runner with interval grouping Apr 13, 2026

greptile-apps Bot commented Apr 13, 2026

This is a comment left during a code review.
Path: products/customer_analytics/backend/hogql_queries/usage_metrics_query_runner.py
Line: 222-224

Comment:
**`datetime.now()` called independently from query builder**

`_process_group_results` recomputes `date_to`/`date_from`/`prev_date_from` via a fresh `datetime.now()`, but `_build_interval_group_query` already called `datetime.now()` at line 121 to embed timestamps in the SQL. If any time passes between these two calls (including query execution time), the date window used to interpret results won't match the one encoded in the query. At a day boundary this means the wrong days are included in `current_dates`/`previous_dates`, silently producing incorrect totals.

The fix is to compute the reference time once in `_calculate` and pass it to both methods:

```python
# In _calculate, before the for-loop:
date_to = datetime.now(tz=ZoneInfo("UTC"))

# Pass it down:
query = self._build_interval_group_query(interval, group, date_to=date_to)
results = self._process_group_results(response, interval, group, date_to=date_to)
```

Then update both method signatures to accept `date_to: datetime` and remove the internal `datetime.now()` calls.
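To make the failure mode concrete, here is a small sketch with hypothetical timestamps. The `window_bounds` helper and the assumption that each window spans `interval_days` back from `date_to` are illustrative only; the runner's real window computation may differ.

```python
from datetime import datetime, timedelta, timezone


def window_bounds(date_to, interval_days):
    """Derive previous and current period starts from one reference time."""
    date_from = date_to - timedelta(days=interval_days)
    prev_date_from = date_from - timedelta(days=interval_days)
    return prev_date_from, date_from, date_to


# Two independent clock reads that straddle midnight UTC (hypothetical values).
query_time = datetime(2026, 4, 13, 23, 59, 59, tzinfo=timezone.utc)
processing_time = datetime(2026, 4, 14, 0, 0, 2, tzinfo=timezone.utc)

# The two reads land on different calendar days, so the day lists would differ.
drifted = window_bounds(query_time, 7)[2].date() != window_bounds(processing_time, 7)[2].date()

# Sharing one reference time keeps the query and the result processing aligned.
shared = window_bounds(query_time, 7)
```

Three seconds between the two reads are enough to shift every day boundary in the window by one.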

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(customer-analytics): optimize usage ..."

Comment thread products/customer_analytics/backend/hogql_queries/usage_metrics_query_runner.py Outdated
@arthurdedeus arthurdedeus marked this pull request as draft April 13, 2026 21:02
@arthurdedeus arthurdedeus force-pushed the posthog-code/usage-metric-sum-aggregation-frontend branch from 9b7a48c to 3e24102 Compare April 14, 2026 09:53
@arthurdedeus arthurdedeus changed the base branch from posthog-code/usage-metric-sum-aggregation-frontend to graphite-base/54348 April 14, 2026 10:06
@arthurdedeus arthurdedeus force-pushed the posthog-code/usage-metrics-query-optimization branch from 853c9b3 to 01e2feb Compare April 20, 2026 22:23
@arthurdedeus arthurdedeus changed the base branch from graphite-base/54348 to posthog-code/usage-metric-sum-aggregation-frontend April 20, 2026 22:23

blacksmith-sh Bot commented Apr 20, 2026

Found 2 test failures on Blacksmith runners:

Failures

  • TestDataWarehouseManagedViewSetModel/test_sync_views_creates_views
  • TestSavedQueryDagSyncIntegration/test_materialize_updates_node_type


tests-posthog Bot commented Apr 20, 2026

Query snapshots: Backend query snapshots updated

Changes: 1 snapshot (1 modified, 0 added, 0 deleted)

What this means:

  • Query snapshots have been automatically updated to match current output
  • These changes reflect modifications to database queries or schema

Next steps:

  • Review the query changes to ensure they're intentional
  • If unexpected, investigate what caused the query to change

@arthurdedeus arthurdedeus marked this pull request as ready for review April 22, 2026 13:46
@arthurdedeus arthurdedeus self-assigned this Apr 22, 2026
@arthurdedeus arthurdedeus requested a review from a team April 22, 2026 13:46

greptile-apps Bot commented Apr 22, 2026

This is a comment left during a code review.
Path: products/customer_analytics/backend/hogql_queries/usage_metrics_query_runner.py
Line: 90-102

Comment:
**Repeated DB queries from dropping `@cached_property`**

`_get_usage_metrics()` was previously a `@cached_property` that executed the ORM query once per runner instance. It is now a plain method called independently from `get_cache_payload()`, `to_query()`, and `_calculate()` — each invocation issues a separate `SELECT` against the database. For callers that touch both `get_cache_payload()` (to check/populate the cache) and `_calculate()` (to run the query), the metrics table is read at least twice per request cycle.

To avoid this, either restore `@cached_property`, or keep the method but cache its result as `self._usage_metrics` on first call, or have the three callers share a single result computed once in `_calculate`/`to_query`.
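A minimal sketch of the caching behaviour using the standard library's `functools.cached_property`. The `FakeMetricsStore` and `Runner` classes are hypothetical stand-ins for the ORM query and the query runner; only the caching pattern is the point.

```python
from functools import cached_property


class FakeMetricsStore:
    """Hypothetical stand-in for the ORM query; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def fetch(self):
        self.calls += 1
        return ["metric_a", "metric_b"]


class Runner:
    def __init__(self, store):
        self._store = store

    @cached_property
    def usage_metrics(self):
        # Evaluated on first access, then stored on the instance.
        return self._store.fetch()

    def get_cache_payload(self):
        return {"metrics": list(self.usage_metrics)}

    def calculate(self):
        return len(self.usage_metrics)
```

Calling `get_cache_payload()` and then `calculate()` on the same `Runner` instance hits the store once; a plain method in place of the property would hit it twice.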

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: products/customer_analytics/backend/hogql_queries/usage_metrics_query_runner.py
Line: 236

Comment:
**Off-by-one in `previous_dates` upper bound**

The upper bound for the previous period is derived as `(date_from - timedelta(seconds=1)).date()`. The expression works, but it is a roundabout way of expressing `date_from.date() - timedelta(days=1)`, and it only yields that value when `date_from` falls on midnight (strictly, within the first second of the day); for any later time of day it returns `date_from`'s own date. The cleaner and more direct form is `(date_from - timedelta(days=1)).date()`, which is explicitly "one day before `date_from`'s date" and matches the original whenever `date_from` is midnight-aligned.

```suggestion
        previous_dates = self._date_range(prev_date_from.date(), (date_from - timedelta(days=1)).date())
```
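Whether the two forms agree depends on the time of day carried by `date_from`. The check below uses hypothetical timestamps, and `date_range` is an assumed inclusive helper standing in for `_date_range`, whose real implementation is not shown here.

```python
from datetime import date, datetime, timedelta


def upper_bound_seconds(date_from: datetime) -> date:
    # Original form: step back one second, then truncate to a date.
    return (date_from - timedelta(seconds=1)).date()


def upper_bound_days(date_from: datetime) -> date:
    # Suggested form: step back one full day, then truncate to a date.
    return (date_from - timedelta(days=1)).date()


def date_range(start: date, end: date) -> list:
    """Hypothetical inclusive day range; empty when end is before start."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


midnight = datetime(2026, 4, 15, 0, 0, 0)
midday = datetime(2026, 4, 15, 12, 0, 0)
```

With a midnight-aligned `date_from` both forms return the previous day. With a midday `date_from` the one-second form returns `date_from`'s own date, so that day would also be counted in the previous period.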

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "test(backend): update query snapshots"

Comment thread products/customer_analytics/backend/hogql_queries/usage_metrics_query_runner.py Outdated
@arthurdedeus arthurdedeus changed the base branch from posthog-code/usage-metric-sum-aggregation-frontend to graphite-base/54348 April 22, 2026 17:32
@arthurdedeus arthurdedeus force-pushed the posthog-code/usage-metrics-query-optimization branch from f18bbea to 3cc2265 Compare April 22, 2026 19:55
@arthurdedeus arthurdedeus requested review from a team as code owners April 22, 2026 19:55
@assign-reviewers-posthog assign-reviewers-posthog Bot requested a review from a team April 22, 2026 19:55
@arthurdedeus arthurdedeus changed the base branch from graphite-base/54348 to master April 22, 2026 19:59
…erval grouping

Generated-By: PostHog Code
Task-Id: 59552aa7-0856-47b1-86c6-034bb0a4ac82
@arthurdedeus arthurdedeus force-pushed the posthog-code/usage-metrics-query-optimization branch from 3cc2265 to 76f742d Compare April 22, 2026 20:06
@arthurdedeus arthurdedeus removed request for a team April 22, 2026 20:07
@arthurdedeus arthurdedeus merged commit 8af7ea7 into master Apr 22, 2026
228 checks passed
@arthurdedeus arthurdedeus deleted the posthog-code/usage-metrics-query-optimization branch April 22, 2026 20:52

deployment-status-posthog Bot commented Apr 22, 2026

Deploy status

| Environment | Status | Deployed At | Workflow |
| --- | --- | --- | --- |
| dev | ✅ Deployed | 2026-04-22 21:17 UTC | Run |
| prod-us | ✅ Deployed | 2026-04-22 21:27 UTC | Run |
| prod-eu | ✅ Deployed | 2026-04-22 21:30 UTC | Run |
