Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Refactoring Databend Metrics Component with Prometheus #9422

Closed
flaneur2020 opened this issue Dec 30, 2022 · 6 comments
Closed

Feature: Refactoring Databend Metrics Component with Prometheus #9422

flaneur2020 opened this issue Dec 30, 2022 · 6 comments
Assignees

Comments

@flaneur2020
Copy link
Member

flaneur2020 commented Dec 30, 2022

Summary

metrics-rs is a general abstraction layer across different metrics solutions like statsd, prometheus, new-relic, etc.

However, a general abstraction have a bit additional cost above a raw proemtheus client:

In a pulling metrics model, the memory allocation is mostly fixed for a set of labels. However in a push model, sometimes we need a buffer to store the recent metrics temporarly, like:

metrics-rs/metrics#245 (comment)

Nowadays prometheus is the de facto standard in the metrics area, we can consider take a raw prometheus client without considering pushing metrics to others like statsd, it have some benefits:

  • better performance: with proper metrics practise, the memory allocation of metrics should be O(1), no local buffered queue at all
  • reducing the abstraction layers: we can reduce abstraction layer about metrics, to make the code path easier to understand.
  • better coding standard: currently we have different metric practices among different modules, we can learn about the metrics practices in the prometheus community to unify our metrics practices.

We can choose between:

@Xuanwo
Copy link
Member

Xuanwo commented Dec 30, 2022

OpenDAL will support both native proemtheus client and metrics without histogram:

@flaneur2020 flaneur2020 changed the title Feature: switch to prometheus's rust_client instead of metrics-rs Feature: switch to prometheus's client instead of metrics-rs Dec 30, 2022
@flaneur2020
Copy link
Member Author

@everpcpc can you help take a look? quq

@Xuanwo
Copy link
Member

Xuanwo commented Dec 30, 2022

Something we can refactor during this issue: collect all metrics together as a GlobalMetricsCollector and implement macros on it to implement helper methods like self.inc_query_read_bytes() to make simplify the code.

@PsiACE PsiACE changed the title Feature: switch to prometheus's client instead of metrics-rs Feature: Refactoring Databend Metrics Component with Prometheus Apr 29, 2023
@youngsofun
Copy link
Member

youngsofun commented May 17, 2023

do we need Summary(Quantiles)?

both crate prometheus-client and prometheus do not support Summary(Quantiles) now, they only support Buckets.
the relevant pr/issues are prometheus/client_rust#67 (comment) and tikv/rust-prometheus#5, the latter has some disscussion.

Summary is not aggregatable.
but do not have the burden of choosing the right buckets.

Buckets is much easier to impl, without worry about OOM.

in contrast, metric-rs use Summary by default. which means we are using it (but not enabled by default)

meta:

  • snapshot_sent_seconds
  • meta_grpc_client_request_duration_ms

query:

  • rpc_delay_seconds
  • mysql.process_request_duration
  • interpreter.usedtime
  • query_duration_ms
  • query_write_io_bytes_cost_ms
  • query_scan_io_bytes_cost_ms

background:

this issue is now a project of OSPP https://summer-ospp.ac.cn/org/prodetail/236460552
the tasks will be very different if we need to support Summary.

personally, I think buckets is more useful in production. Just to confirm.
@flaneur2020 cc @everpcpc @drmingdrmer @PsiACE @sundy-li @dantengsky

@drmingdrmer
Copy link
Member

I'd prefer a just enough solution like the Bucket.

In cases such as databend-query or meta, the distribution of a metric value is predictable, making a versatile Summary cumbersome for such needs. :)

@flaneur2020
Copy link
Member Author

flaneur2020 commented Aug 31, 2023

I'll take this issue because there are some needs to cover more metrics. Please contact me if you're interested to work on this together.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants