Support for PromQL #57545
Comments
This schema looks strange. But as I wrote somewhere in the related threads, I believe we need to store the vector for some larger granularity (i.e. an hour) as a single value in a column.

It would be beneficial to include a
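One way to read the suggestion above is to pack an hour's worth of samples per series into array columns. A minimal sketch in ClickHouse SQL, assuming an illustrative table and column names (none of these come from the issue itself):

```sql
-- Hypothetical sketch: one row per (series, hour) instead of one row per sample.
CREATE TABLE metrics_hourly
(
    fingerprint UInt64,                -- hash identifying the series
    hour        DateTime,              -- timestamp truncated to the hour
    timestamps  Array(DateTime64(3)),  -- sample timestamps within that hour
    values      Array(Float64)         -- corresponding sample values
)
ENGINE = MergeTree
ORDER BY (fingerprint, hour);
```

This trades point-lookup granularity for far fewer rows and better compression on steady scrape intervals.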
https://github.com/metrico/qryn does something similar, but outside of ClickHouse. Quite a few issues arise when you start to use ClickHouse as a backend for Prometheus-like data.
Interesting feature. This would help remove one more moving part from the observability backend stack by storing metrics alongside traces and logs in ClickHouse.
This is from OpenTelemetry Exporter -
I'd be happy with a limited solution that maps the Prometheus data into a more conventional table structure.

How would the Prometheus metrics be mapped to the table structures? This could be achieved by decorating the table with hints for the write driver (for example, with special keywords in the comment fields). This approach would of course not be the "full" Prometheus experience, because you would need to predefine the schemas, but from a performance standpoint it would be superior. As an added bonus, the query ergonomics would be better for non-PromQL uses as well. With some tooling, the schema management could also be fairly painless. This approach would not work for all cases, but where the Prometheus data is structured and well known, it would be sufficient and would leverage the advantages of ClickHouse.
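The "conventional table per metric, with hints for the write driver" idea could look roughly like this. A hypothetical sketch; the table name, columns, and the comment-hint format are all illustrative, not an existing convention:

```sql
-- Hypothetical typed table for one well-known metric family.
-- The table COMMENT carries a made-up hint a write driver could parse.
CREATE TABLE node_cpu_seconds
(
    ts       DateTime64(3),
    instance LowCardinality(String),
    cpu      LowCardinality(String),
    mode     LowCardinality(String),
    value    Float64
)
ENGINE = MergeTree
ORDER BY (instance, cpu, mode, ts)
COMMENT 'prom:node_cpu_seconds_total{instance,cpu,mode}';
```

Because the labels become ordinary typed columns in the sorting key, both compression and filter performance should beat a generic key-value layout for this metric.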
For that you do not need this solution at all. If you know the most common attributes, you can create materialized columns for them. There is a feature for this. Parsing of the Prometheus data format is just
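The materialized-column approach mentioned above can be sketched as follows, assuming an illustrative `samples` table where the raw label set is stored as JSON (table and column names are hypothetical):

```sql
-- Hypothetical sketch: extract a frequently queried label into a
-- materialized column so it is computed once, at insert time.
CREATE TABLE samples
(
    labels   String,         -- raw label set, e.g. as JSON
    ts       DateTime64(3),
    value    Float64,
    instance LowCardinality(String)
        MATERIALIZED JSONExtractString(labels, 'instance')
)
ENGINE = MergeTree
ORDER BY (instance, ts);
```

Queries can then filter on `instance` directly without re-parsing `labels` at read time.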
Hi, are there any updates? Thanks!
@fzyzcjy Stay tuned 😌
@nikitamikhaylov Wow, it will be implemented? Looking forward to it! (I personally am interested in using ClickHouse as an alternative to Prometheus for handling system metrics.)
A table per resource will indeed perform better. But as you noted, schema creation, insertion, and query routing across a bunch of tables will become a nightmare over time. In addition, managing the cardinality is also a challenge.
This definitely depends on the use case. In my situation a steady state has been reached and additional metrics are not that common, so optimizing for performance has higher value than providing flexibility: schema creation is a one-off cost and a fairly standard template can be used. The insertion/query routing could be done either with materialized views and async inserts or with a fairly simple external driver. I'm not really sure how cardinality would be a challenge: I have ClickHouse tables where the cardinality is extremely high without significant performance impact. I would argue that with a schema, cardinality is easier to deal with than in a table where the data ordering is decoupled from the actual hierarchy (e.g. metadata and data in separate tables).
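The "materialized views for routing" idea could be sketched like this, assuming a hypothetical generic landing table `samples_raw` (JSON labels plus timestamp and value) and a hypothetical typed target table `node_cpu_seconds`; all names are illustrative:

```sql
-- Hypothetical sketch: route generic incoming samples into a typed
-- per-metric table with a materialized view.
CREATE MATERIALIZED VIEW route_node_cpu TO node_cpu_seconds AS
SELECT
    ts,
    JSONExtractString(labels, 'instance') AS instance,
    JSONExtractString(labels, 'cpu')      AS cpu,
    JSONExtractString(labels, 'mode')     AS mode,
    value
FROM samples_raw
WHERE JSONExtractString(labels, '__name__') = 'node_cpu_seconds_total';
```

Writers then insert only into `samples_raw` (possibly with async inserts), and ClickHouse fans the rows out to the typed tables, so no external routing logic is needed.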
+1 |
The Prometheus Query Language (PromQL) is a powerful tool that lets the user select and aggregate time series data efficiently. https://prometheus.io/docs/prometheus/latest/querying/basics/
Use case
Describe the solution you'd like
The task is divided into several sub-tasks:
PromQL interpreter
The biggest challenge is a handful of functions like `rate` and `increase`: functions which take a possibly huge range vector and depend on the order of elements in this vector. The PromQL parser and the other functions (simple, aggregate, and window ones) are doable.
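To illustrate why these functions are order-dependent: `increase` over a window is roughly "last sample minus first sample" per series, but the real implementation must also scan the samples in timestamp order to detect counter resets and to extrapolate to the window boundaries. A deliberately simplified sketch over a hypothetical `samples(timestamp, value, fingerprint)` table, ignoring resets and extrapolation:

```sql
-- Hypothetical, simplified approximation of increase() over 5 minutes.
-- Real rate()/increase() must additionally handle counter resets,
-- which requires visiting the range vector in timestamp order.
SELECT
    fingerprint,
    argMax(value, timestamp) - argMin(value, timestamp) AS approx_increase
FROM samples
WHERE timestamp >= now() - INTERVAL 5 MINUTE
GROUP BY fingerprint;
```

The gap between this approximation and full PromQL semantics is exactly where the interpreter work lies.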
We could even support different "dialects" of this language, e.g. MetricsQL or Grafana's LogQL.
Table structure
Similar to what we have right now for Kusto or PRQL, we can put it under the setting `sql_dialect`. When `sql_dialect` is set to `promql`, ClickHouse will read only from tables with the specified structure. Specifically:

- `(timestamp, value, fingerprint)`. The fingerprint could be just a hash of all the tags associated with the metric.
- `(tag_key, tag_value, fingerprint)`
- `(date, fingerprint, tags_array)`
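The three structures listed here could be sketched as DDL like the following. Only the column tuples come from the issue; the table names, types, and engines are illustrative assumptions:

```sql
-- Hypothetical DDL for the proposed structures; names and engines are illustrative.
CREATE TABLE samples
(
    timestamp   DateTime64(3),
    value       Float64,
    fingerprint UInt64          -- hash of all tags of the series
)
ENGINE = MergeTree
ORDER BY (fingerprint, timestamp);

CREATE TABLE tags
(
    tag_key     LowCardinality(String),
    tag_value   String,
    fingerprint UInt64
)
ENGINE = ReplacingMergeTree
ORDER BY (tag_key, tag_value, fingerprint);

CREATE TABLE series
(
    date        Date,
    fingerprint UInt64,
    tags_array  Array(String)
)
ENGINE = ReplacingMergeTree
ORDER BY (date, fingerprint);
```

A PromQL selector would resolve matching fingerprints via `tags`/`series` and then read the samples from the first table.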
The way to define the names for these tables could vary: it could be another user-level setting or part of the configuration file.
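Under this proposal, a client session might look like the following. The `sql_dialect` value `promql` is what the issue proposes, not an existing setting, and the query itself is a standard PromQL example:

```sql
-- Hypothetical usage sketch of the proposed dialect switch.
SET sql_dialect = 'promql';

-- After the switch, the client sends PromQL instead of SQL:
rate(http_requests_total{job="api-server"}[5m])
```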
Native integration with Prometheus API:
What needs to be done for native integration with Prometheus and a drop-in replacement?