Add optional backend for ClickHouse storage #6194
Add a new optional storage backend backed by a ClickHouse database. This should improve performance over the FrostDB backend and help users ingest data from more nodes and query it efficiently. This commit is the first working version of the code; improvements will follow in later commits.
- move Symbolizer interface to the symbolizer module
- remove deduplication and ordering of results on the Go side, relying on the database instead
- in QueryRange, move grouping by labels to ClickHouse
- in QuerySingle and QueryMerge, match FrostDB aggregation behavior
- remove dead code
- formatting
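As a rough illustration of what moving grouping and ordering into the database can look like, here is a minimal sketch. It is not the code from this PR: `buildRangeQuery`, the table name `samples`, and the columns `timestamp`, `value`, and `labels` are all assumptions made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// buildRangeQuery is a hypothetical helper, not code from this PR. It builds
// a query whose GROUP BY and ORDER BY clauses make the database return one
// already-sorted row per (timestamp, label set), so the Go side does not have
// to deduplicate or sort. Table and column names are assumptions.
func buildRangeQuery(table string, groupLabels []string) string {
	cols := []string{"timestamp"}
	for _, l := range groupLabels {
		// Assumes labels live in a JSON column named "labels" whose paths
		// can be read as sub-columns. Real code must validate or quote
		// label names before placing them in SQL.
		cols = append(cols, "labels."+l)
	}
	keys := strings.Join(cols, ", ")
	return fmt.Sprintf(
		"SELECT %s, sum(value) AS total FROM %s WHERE timestamp BETWEEN ? AND ? GROUP BY %s ORDER BY %s",
		keys, table, keys, keys,
	)
}

func main() {
	// The two "?" placeholders are meant to be bound to the range start and end.
	fmt.Println(buildRangeQuery("samples", []string{"node", "comm"}))
}
```

With this shape the caller only scans rows in the order they arrive; the aggregation itself runs inside ClickHouse.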
The function was moved to that module so it makes sense for the test to be there too. Remove test that only checked constants had specific values.
|
Awesome! Thanks for this. Will review it next week |
213a51a to 597c762
|
This looks great! Thanks for the contribution. I would like to mark this as an experimental feature for now. As such can we move the clickhouse flags into the |
Hide them by default as it's an experimental feature
|
Sure! Done in the latest commit |
|
Hey, kindly pinging, if you get a chance to take another look 👀 |
thorfour left a comment
Thanks for the contribution!
|
I just wanted to point out that this is experimental and hidden. There is a possible future where we do a major overhaul of Parca, and in an effort like that this type of feature may be dropped. Nevertheless, we want to enable users in the meantime to do what they would like, but there is no guarantee for the longevity of this feature. |
It would be interesting to know more about this, do you have some kind of public roadmap? |
|
There isn't anything public yet since we haven't fully made up our minds. Our internal systems at Polar Signals have diverged dramatically from Parca, to the point where Parca maintenance has suffered, since we are effectively maintaining two projects that achieve much the same things (hence why we're OK with merging things like this even though it's very far from our original goal of having a single statically linked binary akin to Prometheus for profiling). An idea has emerged that maybe we will be open-sourcing some of the work we've done at Polar Signals, but that would mean very dramatic changes in Parca, since we've practically fully moved to Rust at Polar Signals. |
Adds an optional flag to enable storage of traces in a ClickHouse server. This enables better scalability for users who wish to ingest and query data from many nodes, as the current FrostDB storage struggles with heavy use cases.
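The flag-driven choice of backend can be pictured with the sketch below. It is only an illustration under stated assumptions: the `Store` interface, both store types, and `newStore` are hypothetical names, and the real interface shared by the two backends in Parca is larger than a single method.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical stand-in for the public interface that both
// backends implement; it is reduced to one method for the example.
type Store interface {
	Name() string
}

// frostDBStore stands in for the default FrostDB-based backend.
type frostDBStore struct{}

func (frostDBStore) Name() string { return "frostdb" }

// clickHouseStore stands in for the optional ClickHouse backend.
type clickHouseStore struct {
	addr string // server address; the field name is an assumption
}

func (clickHouseStore) Name() string { return "clickhouse" }

// newStore picks the backend from configuration. The parameters mirror a
// boolean "enable ClickHouse" flag plus an address flag; the actual flag
// names in the PR are not reproduced here.
func newStore(useClickHouse bool, addr string) (Store, error) {
	if !useClickHouse {
		return frostDBStore{}, nil
	}
	if addr == "" {
		return nil, errors.New("clickhouse backend enabled but no address given")
	}
	return clickHouseStore{addr: addr}, nil
}

func main() {
	s, err := newStore(true, "localhost:9000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("selected backend:", s.Name())
}
```

Because callers depend only on the interface, the rest of the server does not need to know which backend was selected.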
This introduces a new module implementing the same public interface as the parcacol module, based on FrostDB, and initializes the selected backend depending on the flag. Traces are inserted into and queried from a MergeTree table, the basic engine in ClickHouse. This could be modified to be a ReplicatedMergeTree to enable more complex setups. For the labels, it uses a column with the JSON type, which stores the most frequent labels in separate columns for query efficiency (this was a design objective of FrostDB AFAIK).
Caveats