Materialized view guide for clickstack #5351
| import generated_sql from '@site/static/images/clickstack/materialized_views/generated_sql.png'; | ||
| import accelerated_visual from '@site/static/images/clickstack/materialized_views/accelerated_visual.png'; | ||
|
|
||
| <BetaBadge/> |
I have removed the beta badge from the actual in-app feature - I can add it back if we think there are major gaps?
| Rather than creating separate materialized views for each query and chart, you can combine these into a single view aggregating by service name and status code. This single view can compute multiple metrics such as count, average duration, max duration, and also percentiles, which can then be reused across several visualizations. An example query, combining the above, is shown below: | ||
|
|
||
| ```sql | ||
| SELECT avg(Duration), max(Duration), count(), quantiles(0.95,0.99)(Duration), toStartOfMinute(Timestamp) as time, ServiceName, StatusCode |
nit - quantiles is not mentioned above but is included in the combined query
| `count` SimpleAggregateFunction(sum, UInt64), | ||
| `avg__Duration` AggregateFunction(avg, UInt64), | ||
| `max__Duration` SimpleAggregateFunction(max, Int64), | ||
| `quantiles__Duration` AggregateFunction(quantiles(0.95, 0.99), Int64) |
You use `quantiles` throughout this doc, but HyperDX uses `quantile` (singular).
I think we found that `quantileState()` supports various quantile levels, so `quantiles` shouldn't be needed.
Also nit - is there a reason to have inconsistent types for Duration (Int64 vs UInt64?)
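For reference, a minimal sketch of how the target table and view could look with a single type for `Duration` throughout and a singular `quantile` state. The table names `otel_traces_1m` and `otel_traces_1m_mv`, the source table `otel_traces`, and the choice of `UInt64` are assumptions for illustration, not taken from the doc under review. The sketch does not verify whether one `quantile` state can serve several levels at merge time.

```sql
-- Hypothetical target table: every Duration-derived column uses UInt64.
CREATE TABLE otel_traces_1m
(
    `time` DateTime,
    `ServiceName` LowCardinality(String),
    `StatusCode` LowCardinality(String),
    `count` SimpleAggregateFunction(sum, UInt64),
    `avg__Duration` AggregateFunction(avg, UInt64),
    `max__Duration` SimpleAggregateFunction(max, UInt64),
    `quantile__Duration` AggregateFunction(quantile(0.95), UInt64)
)
ENGINE = AggregatingMergeTree
ORDER BY (time, ServiceName, StatusCode);

-- Hypothetical view that fills the table on insert into otel_traces
-- (assumes otel_traces.Duration is UInt64).
CREATE MATERIALIZED VIEW otel_traces_1m_mv TO otel_traces_1m AS
SELECT
    toStartOfMinute(Timestamp) AS time,
    ServiceName,
    StatusCode,
    count() AS count,
    avgState(Duration) AS avg__Duration,
    max(Duration) AS max__Duration,
    quantileState(0.95)(Duration) AS quantile__Duration
FROM otel_traces
GROUP BY time, ServiceName, StatusCode;
```

The argument type inside each `AggregateFunction(...)` has to match the type of the expression passed to the corresponding `-State` function, which is why mixing `Int64` and `UInt64` across columns for the same source column is worth avoiding.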
| - The **visualization granularity** isn't a multiple of the view's granularity. | ||
| - The **aggregation function** requested by the query isn't present in the view. | ||
| - The query uses **custom count expressions**, such as `count(if(...))`, that can't be derived from the view's aggregation states. | ||
|
|
| - The query is filtering by a column that is not included in the view | |
| ClickStack first determines whether a materialized view is eligible for the query by checking: | ||
| - **Time coverage**: the query's time range must fall entirely within the materialized view's available data range. | ||
| - **Granularity**: the visualization's time bucket must be equal to or coarser than the view's granularity. | ||
| - **Aggregations**: the requested metrics must be present in the view and computable from its aggregation states. |
| - **Aggregations**: the requested metrics must be present in the view and computable from its aggregation states. | |
| - **Aggregations**: the requested metrics must be present in the view and computable from its aggregation states. | |
| - **Filter compatibility**: the columns that the query filters on must be present as dimension columns in the view. |
| Because of these rules: | ||
|
|
||
| - **Don't create materialized views with a 10-minute granularity**. | ||
| ClickStack supports 15-minute granularity for charts and alerts, but not 10-minute. A 10-minute materialized view would therefore be incompatible with common 15-minute visualizations and alerts. |
Technically we support 10m charts too, we just won't auto-select 10m granularity anymore. Alerts don't support 10m.
| ClickStack supports 15-minute granularity for charts and alerts, but not 10-minute. A 10-minute materialized view would therefore be incompatible with common 15-minute visualizations and alerts. | |
| ClickStack supports 15-minute granularity for charts and alerts, which would not be compatible with a 10-minute materialized view. |
| ClickStack supports 15-minute granularity for charts and alerts, but not 10-minute. A 10-minute materialized view would therefore be incompatible with common 15-minute visualizations and alerts. | ||
| - Prefer **1-minute** or **1-hour** granularities, which compose cleanly with most chart and alert configurations. | ||
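To illustrate why 1-minute views compose cleanly, here is a rough sketch of rolling a 1-minute view up into 15-minute buckets. The table `otel_traces_1m` and its columns are hypothetical, and this is not necessarily the SQL that ClickStack generates.

```sql
-- Assumes a hypothetical 1-minute view table otel_traces_1m with columns:
--   time DateTime, ServiceName,
--   count SimpleAggregateFunction(sum, UInt64),
--   avg__Duration AggregateFunction(avg, UInt64)
SELECT
    toStartOfInterval(time, INTERVAL 15 MINUTE) AS bucket,
    ServiceName,
    sum(count) AS requests,                  -- partial counts add up
    avgMerge(avg__Duration) AS avg_duration  -- avg states merge exactly
FROM otel_traces_1m
WHERE time >= now() - INTERVAL 1 DAY
GROUP BY bucket, ServiceName
ORDER BY bucket, ServiceName;
```

Each 15-minute bucket is made up of exactly fifteen 1-minute rows, so the rollup is exact. With a 10-minute view, the row covering 10:10 to 10:20 would straddle the 10:15 boundary and could not be assigned to a single 15-minute bucket.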
|
|
||
| Higher granularity (for example, 1 hour) produces smaller views and lower storage overhead, while lower granularity (for example, 1 minute) provides more flexibility for fine-grained analysis. Choose the smallest granularity that supports your critical workflows. |
I think high granularity = short interval, so this is backwards?
| Higher granularity (for example, 1 hour) produces smaller views and lower storage overhead, while lower granularity (for example, 1 minute) provides more flexibility for fine-grained analysis. Choose the smallest granularity that supports your critical workflows. | |
| Lower granularity (for example, 1 hour) produces smaller views and lower storage overhead, while higher granularity (for example, 1 minute) provides more flexibility for fine-grained analysis. Choose the largest granularity that supports your critical workflows. |
|
|
||
| Different quantile functions have different performance and storage characteristics: | ||
|
|
||
| - `quantiles` produces larger sketches on disk but are cheaper to compute at insert time. |
| - `quantiles` produces larger sketches on disk but are cheaper to compute at insert time. | |
| - `quantile` produces larger sketches on disk but are cheaper to compute at insert time. |
|
|
||
| For more complex workloads, multiple materialized views can be used to support different access patterns. Examples include: | ||
|
|
||
| - **High-resolution recent data with coarse historical views** |
I don't think we support this well right now, since the minimum date is not dynamic (we can't say "only use this high-resolution MV if the date range is in the last 3 days").
| Views with very fine granularity increase storage size and insert-time overhead, while coarse-grained views reduce flexibility. Granularity must be chosen carefully to match expected query patterns. | ||
|
|
||
| - **dimension explosion** | ||
| Adding many grouping dimensions significantly increases view size and can reduce effectiveness. Views should include only commonly used grouping and filtering columns. |
| Adding many grouping dimensions significantly increases view size and can reduce effectiveness. Views should include only commonly used grouping and filtering columns. | |
| Adding many grouping dimensions significantly increases view size and can reduce effectiveness. Views should include only commonly used grouping and filtering columns with relatively low cardinality. |
Summary
A guide to materialized views (MVs) in ClickStack.
Checklist