chore(hogql): tune parse_*_seconds histogram buckets for sub-ms parses #60414
Merged
The Histogram instances for parse_expr_seconds, parse_order_expr_seconds, parse_select_seconds, and parse_full_template_string_seconds were created with the default Prometheus buckets, whose lowest bound is 5ms. Real parses run an order of magnitude or more below that: rust-py around 10μs, cpp typically under 1ms. Every typical parse landed in the lowest bucket, so histogram_quantile was useless at this scale.
Replace with a 1-2-5 progression from 5μs through 10s, which keeps usable resolution across the full range while still capturing pathological queries that take seconds. Means (_sum / _count) are unaffected, so existing dashboards built on those keep working.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
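A minimal sketch of how a 1-2-5 progression over that range could be generated. The helper `one_two_five` and the name `PARSE_BUCKETS` are hypothetical and do not come from the PR. A strict 1-2-5 series from 5μs through 10s has 20 bounds, while the PR description reports 14 buckets, so the merged tuple presumably differs from this one; treat the values as illustrative only.

```python
from decimal import Decimal


def one_two_five(low: str, high: str) -> tuple[float, ...]:
    """Return 1-2-5 bucket upper bounds between low and high (inclusive), in seconds.

    Decimal arithmetic avoids float drift such as 5 * 1e-6 != 5e-6.
    """
    lo, hi = Decimal(low), Decimal(high)
    bounds = []
    exponent = lo.adjusted()  # decade that contains the lower limit
    while Decimal(10) ** exponent <= hi:
        for mantissa in (1, 2, 5):
            bound = Decimal(mantissa).scaleb(exponent)  # mantissa * 10**exponent
            if lo <= bound <= hi:
                bounds.append(float(bound))
        exponent += 1
    return tuple(bounds)


# Hypothetical name: strict 1-2-5 bounds from 5 microseconds through 10 seconds.
PARSE_BUCKETS = one_two_five("0.000005", "10")
```

A tuple like this would then be passed as the `buckets=` argument when each `Histogram` is constructed; `prometheus_client` appends the `+Inf` bucket itself.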
georgemunyoro approved these changes May 29, 2026
mayteio pushed a commit that referenced this pull request May 29, 2026
#60414) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem
#60201 shipped per-backend parse timings on the existing Prometheus histograms parse_expr_seconds, parse_order_expr_seconds, parse_select_seconds, and parse_full_template_string_seconds in posthog/hogql/parser.py. Looking at the live metrics, the default Prometheus buckets (lowest bound 5ms) are far too coarse for these parses. rust-py runs around 10μs, cpp is typically under 1ms, and pathological queries can take seconds. With the default buckets, every typical parse lands in the lowest bucket and histogram_quantile is effectively useless at this scale.
Means (_sum / _count) are fine, and the current dashboards use those, but quantiles are not usable.
Changes
Add an explicit buckets= tuple on the four parse_*_seconds histograms: a 1-2-5 progression from 5μs through 10s (14 buckets). This is the entire change; no other files are touched.
How did you test this code?
Agent-authored (Claude Code); requires human review. This is a histogram bucket config change with no logic touches. Python syntax check on posthog/hogql/parser.py passes. No tests assert on bucket values; CI will run the full backend suite. No manual or UI testing was done.
Publish to changelog?
no, internal observability tuning.
Docs update
No user-facing changes.
🤖 Agent context
Source: observation from #60201 once the shadow-parser rollout reached Grafana. The existing histograms were created with prometheus_client's default buckets, which start at 5ms; with rust-py parses around 10μs and cpp typically sub-ms, p50/p90/p99 all collapsed into the lowest bucket. The fix sets buckets= on the same Histogram instances and leaves everything else untouched.
Behavior note for the rollout window: quantile queries that span the bucket-change boundary will look odd for a short period (old samples have only the 5ms-and-up buckets, new samples have the full 5μs-through-10s range). Mean-based dashboards (_sum / _count) are unaffected.
Cardinality: modest. 14 buckets × 4 rules × small number of backends × small number of versions, which is well within typical Prometheus sizing.
Tools: Claude Code (Edit, Bash, gh).
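To illustrate the quantile collapse described in the agent context, here is a rough sketch that estimates quantiles from cumulative bucket counts using linear interpolation within the matching bucket, which is a simplified approximation of what `histogram_quantile` does. The sample durations are made up for illustration, `FINE_BUCKETS` is a strict 1-2-5 series rather than the merged tuple, and `estimate_quantile` is a hypothetical helper, not PostHog or Prometheus code.

```python
from bisect import bisect_left

# Finite default buckets shipped with prometheus_client (seconds).
DEFAULT_BUCKETS = (.005, .01, .025, .05, .075, .1, .25, .5, .75,
                   1.0, 2.5, 5.0, 7.5, 10.0)

# Illustrative strict 1-2-5 bounds from 5 microseconds through 10 seconds.
FINE_BUCKETS = tuple(
    float(f"{m}e{e}")
    for e in range(-6, 2)
    for m in (1, 2, 5)
    if 5e-6 <= float(f"{m}e{e}") <= 10.0
)


def cumulative_counts(samples, bounds):
    """Cumulative counts per upper bound (le), with a trailing +Inf bucket."""
    counts = [0] * (len(bounds) + 1)
    for s in samples:
        counts[bisect_left(bounds, s)] += 1  # first bound >= s
    running, out = 0, []
    for c in counts:
        running += c
        out.append(running)
    return out


def estimate_quantile(q, samples, bounds):
    """Estimate quantile q (0 < q <= 1) by interpolating inside one bucket."""
    cum = cumulative_counts(samples, bounds)
    rank = q * cum[-1]
    i = next(idx for idx, c in enumerate(cum) if c >= rank)
    if i == len(bounds):  # rank falls in +Inf: report the highest finite bound
        return bounds[-1]
    lower = bounds[i - 1] if i > 0 else 0.0  # lowest bucket starts at 0
    below = cum[i - 1] if i > 0 else 0
    return lower + (bounds[i] - lower) * (rank - below) / (cum[i] - below)


# Made-up durations: 90 parses at 10 microseconds, 9 at 400 microseconds, 1 at 2 s.
samples = [10e-6] * 90 + [400e-6] * 9 + [2.0]

p50_default = estimate_quantile(0.5, samples, DEFAULT_BUCKETS)
p50_fine = estimate_quantile(0.5, samples, FINE_BUCKETS)
```

With these samples, `p50_default` comes out around 2.5ms because 99 of the 100 observations share the 5ms bucket and the estimate is interpolated across it, while `p50_fine` stays inside the 5μs to 10μs bucket where the samples actually sit.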