You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The daily trace-clustering coordinator was emitting $ai_generation_clusters and $ai_trace_clusters events on behalf of teams that had nothing to cluster, and those empty events kept the same teams re-eligible for clustering the next day. The result was a self-perpetuating loop that inflated the apparent set of AI-active teams by ~50% versus reality, and roughly 70% of cluster events in the last week carried $ai_total_items_analyzed = 0.
Three colluding bugs:
The DailyTraceClusteringWorkflow did not short-circuit when the compute activity returned an empty result — it still ran labeling, aggregates, and emission, writing a cluster event with zero items.
Team discovery derived its trigger-event list from AIEventType, which includes the cluster events themselves. Any team that had an empty cluster event written on its behalf stayed eligible for the next 30 days regardless of customer activity.
The discovery window (30 days) was decoupled from the workflow window (7 days), so even without bug Reports that users will need #1 discovery would enroll teams whose data the workflow then ignored.
Changes
Fix 1 — Workflow short-circuit (workflow.py): after the compute activity returns, if compute_result.items is empty, return a zero-item ClusteringResult immediately and skip labeling, aggregates, and emission. Stops the loop at the source.
Fix 2 — Decoupled discovery trigger list (llm_analytics_usage_report.py, team_discovery.py): added LLM_ANALYTICS_DISCOVERY_TRIGGER_EVENTS containing only customer-emitted signals (no $ai_*_clusters, no $ai_*_summary, no $ai_tag). get_teams_with_ai_events takes the trigger list as a required parameter; usage-report and discovery pass their respective constants explicitly.
Fix 3 — Aligned discovery window (team_discovery.py, clustering coordinator): DEFAULT_DISCOVERY_LOOKBACK_DAYS is now 7 (matches DEFAULT_LOOKBACK_DAYS). The clustering coordinator passes lookback_days=inputs.lookback_days through TeamDiscoveryInput, so discovery scopes its eligibility query to the workflow's own window.
How did you test this code?
I'm an agent. Automated tests only — no manual verification.
New TestEmptyComputeResultShortCircuit::test_empty_compute_skips_label_aggregates_and_emit drives the workflow with a compute activity returning an empty ClusteringComputeResult and asserts that the label/aggregates/emit activities are never invoked and that no event is written.
New TestGetTeamIdsForLlmAnalytics::test_uses_discovery_trigger_events_not_report_list asserts the narrow LLM_ANALYTICS_DISCOVERY_TRIGGER_EVENTS list is passed to the ClickHouse query and excludes the cluster/summary/tag events.
New TestGetTeamIdsForLlmAnalytics::test_lookback_uses_inputs_value verifies the discovery window now derives from TeamDiscoveryInput.lookback_days.
Existing TestGetLlmaWorkflowConfig tests updated for the simplified LLMAWorkflowConfig (no more discovery_lookback_days field).
Existing usage-report tests updated to pass LLM_ANALYTICS_REPORT_TRIGGER_EVENTS explicitly.
Full local run of trace_clustering/tests/test_workflow.py, trace_clustering/tests/test_coordinator.py, and llm_analytics/test_team_discovery.py — 53/53 pass. Ruff lint and format clean on all modified files.
Publish to changelog?
no
Docs update
No docs change needed.
🤖 Agent context
Authored by PostHog Code (Claude Opus 4.7) under my supervision.
The bug analysis (three colluding causes, prod impact numbers, historical chronology) was provided up-front so the agent had concrete file:line targets and could focus on implementation rather than discovery.
Two notable course corrections during the session:
Avoided over-preserving the FF override hook for discovery_lookback_days. First pass kept a full FF-payload override path (config field as int | None, fallback ternary in the activity, dedicated tests). On reflection, with the coordinator now passing lookback_days through explicitly, the override hook was speculative — removed the field, validation, ternary, and ~3 obsolete test cases. Can be added back if any team actually needs to widen the window without a deploy.
Avoided over-engineering get_teams_with_ai_events's signature. First pass made trigger_events optional with a None default that fell back to the broad report list. Made the parameter required and updated both call sites (usage-report and discovery) to pass their constants explicitly — call sites now self-document which list is in play.
Rejected a _shortcircuit_pipeline helper. Tried extracting the short-circuit return into a helper function; the data lives across inputs, compute_result, and local workflow state with no single carrier object, so the helper had 6 loose parameters and was less readable than the inline version. Kept the inline early return.
No manual UI verification — pure backend / Temporal workflow logic. Verification per the original spec's ClickHouse query will need to happen post-deploy.
🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down.
Add the run-playwright label if you want an E2E sweep before merging — CI will pick it up automatically.
Most PRs don't need this. Real regressions still get caught on master and fix-forward.
posthog/tasks/llm_analytics_usage_report.py, line 239-258 (link)
The SQL named parameter llm_analytics_report_trigger_events is now used by both the usage-report caller and the discovery caller, so the name no longer reflects its purpose and will confuse anyone reading the query. Renaming it to trigger_events keeps the SQL aligned with the function signature.
Prompt To Fix With AI
This is a comment left during a code review.
Path: posthog/tasks/llm_analytics_usage_report.py
Line: 239-258
Comment:
The SQL named parameter `llm_analytics_report_trigger_events` is now used by both the usage-report caller and the discovery caller, so the name no longer reflects its purpose and will confuse anyone reading the query. Renaming it to `trigger_events` keeps the SQL aligned with the function signature.
How can I resolve this? If you propose a fix, please make it concise.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---### Issue 1 of 2
posthog/temporal/llm_analytics/team_discovery.py:143-146
**Misleading comment after `lookback_days` change**
The comment says "FF payload overrides TeamDiscoveryInput when set" but the lines that follow do the opposite for `lookback_days`: it now reads from `inputs.lookback_days` (the Temporal input), not from the feature-flag config. A reader scanning this block will incorrectly infer that both `sample_percentage` and `lookback_days` are sourced from the FF payload, when only the former is. The comment should be split or reworded to reflect that `sample_percentage` comes from the FF config while `lookback_days` comes from `inputs`.
### Issue 2 of 2
posthog/tasks/llm_analytics_usage_report.py:239-258
The SQL named parameter `llm_analytics_report_trigger_events` is now used by both the usage-report caller and the discovery caller, so the name no longer reflects its purpose and will confuse anyone reading the query. Renaming it to `trigger_events` keeps the SQL aligned with the function signature.
```suggestion query = """ SELECT DISTINCT team_id FROM events WHERE event IN %(trigger_events)s AND timestamp >= %(begin)s AND timestamp < %(end)s """ with tags_context( product=Product.LLM_ANALYTICS, feature=Feature.QUERY, kind="celery", id=CELERY_TASK_ID, name="Get teams with AI observability trigger events", workload=Workload.OFFLINE.value, ): results = sync_execute( query, { "trigger_events": trigger_events,```
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The daily trace-clustering coordinator was emitting
$ai_generation_clustersand$ai_trace_clustersevents on behalf of teams that had nothing to cluster, and those empty events kept the same teams re-eligible for clustering the next day. The result was a self-perpetuating loop that inflated the apparent set of AI-active teams by ~50% versus reality, and roughly 70% of cluster events in the last week carried$ai_total_items_analyzed = 0.Three colluding bugs:
DailyTraceClusteringWorkflowdid not short-circuit when the compute activity returned an empty result — it still ran labeling, aggregates, and emission, writing a cluster event with zero items.AIEventType, which includes the cluster events themselves. Any team that had an empty cluster event written on its behalf stayed eligible for the next 30 days regardless of customer activity.Changes
workflow.py): after the compute activity returns, ifcompute_result.itemsis empty, return a zero-itemClusteringResultimmediately and skip labeling, aggregates, and emission. Stops the loop at the source.llm_analytics_usage_report.py,team_discovery.py): addedLLM_ANALYTICS_DISCOVERY_TRIGGER_EVENTScontaining only customer-emitted signals (no$ai_*_clusters, no$ai_*_summary, no$ai_tag).get_teams_with_ai_eventstakes the trigger list as a required parameter; usage-report and discovery pass their respective constants explicitly.team_discovery.py, clustering coordinator):DEFAULT_DISCOVERY_LOOKBACK_DAYSis now 7 (matchesDEFAULT_LOOKBACK_DAYS). The clustering coordinator passeslookback_days=inputs.lookback_daysthroughTeamDiscoveryInput, so discovery scopes its eligibility query to the workflow's own window.How did you test this code?
I'm an agent. Automated tests only — no manual verification.
TestEmptyComputeResultShortCircuit::test_empty_compute_skips_label_aggregates_and_emitdrives the workflow with a compute activity returning an emptyClusteringComputeResultand asserts that the label/aggregates/emit activities are never invoked and that no event is written.TestGetTeamIdsForLlmAnalytics::test_uses_discovery_trigger_events_not_report_listasserts the narrowLLM_ANALYTICS_DISCOVERY_TRIGGER_EVENTSlist is passed to the ClickHouse query and excludes the cluster/summary/tag events.TestGetTeamIdsForLlmAnalytics::test_lookback_uses_inputs_valueverifies the discovery window now derives fromTeamDiscoveryInput.lookback_days.TestGetLlmaWorkflowConfigtests updated for the simplifiedLLMAWorkflowConfig(no morediscovery_lookback_daysfield).LLM_ANALYTICS_REPORT_TRIGGER_EVENTSexplicitly.trace_clustering/tests/test_workflow.py,trace_clustering/tests/test_coordinator.py, andllm_analytics/test_team_discovery.py— 53/53 pass. Ruff lint and format clean on all modified files.Publish to changelog?
no
Docs update
No docs change needed.
🤖 Agent context
Authored by PostHog Code (Claude Opus 4.7) under my supervision.
The bug analysis (three colluding causes, prod impact numbers, historical chronology) was provided up-front so the agent had concrete file:line targets and could focus on implementation rather than discovery.
Two notable course corrections during the session:
discovery_lookback_days. First pass kept a full FF-payload override path (config field asint | None, fallback ternary in the activity, dedicated tests). On reflection, with the coordinator now passinglookback_daysthrough explicitly, the override hook was speculative — removed the field, validation, ternary, and ~3 obsolete test cases. Can be added back if any team actually needs to widen the window without a deploy.get_teams_with_ai_events's signature. First pass madetrigger_eventsoptional with aNonedefault that fell back to the broad report list. Made the parameter required and updated both call sites (usage-report and discovery) to pass their constants explicitly — call sites now self-document which list is in play._shortcircuit_pipelinehelper. Tried extracting the short-circuit return into a helper function; the data lives acrossinputs,compute_result, and local workflow state with no single carrier object, so the helper had 6 loose parameters and was less readable than the inline version. Kept the inline early return.No manual UI verification — pure backend / Temporal workflow logic. Verification per the original spec's ClickHouse query will need to happen post-deploy.