-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add exemplar support #395
Add exemplar support #395
Conversation
Well, I've done the propagation..one way. It was hard to retrofit the whole cold/hot shard stuff with an extra optional struct which I don't know how to do atomic style for. Also not cared for protobuf beyond what was necessary. But can now do: let mut exemplar_labels = HashMap::new();
exemplar_labels.insert("traceID".into(), trace_id);
let ex = Exemplar::new_with_labels(duration, exemplar_labels);
reconcile_duration_histogram
.with_label_values(&[])
.observe_with_exemplar(duration, ex); and text encoder outputs:
|
have had to subvert the histogramcore sharding a bit, wasn't sure it made any sense to do a hot/cold path for exemplars when you have to sync write the whole struct, but maybe i am wrong. for now it's sitting straight as a vector of optional exemplars (same length as bucket len) and writes to them are sync, which probably makes the existing sync mechanism useless. not sure how to make this better. could feature flag exemplars? |
need to lock to write it to write, so don't really see a point so lift them up one level
Is there anything left to do on this PR? Anything I can do to help? |
Well, I guess the main thing is that no matter how much I wrangle prometheus, it does not accept my output, even though it looks like it should conform to the spec. I have a demo app that is outputting:
in its /metrics but prometheus fails to parse it with a:
I have a demo implementation in kube-rs/controller-rs#12, and it's definitely the exemplar output that is failing, so not really sure what to do :-( |
Exemplars must have the timestamp after the example value, just like the error says. |
@bobrik : The error is asking for a histogram timestamp or a new record, not an exemplar timestamp (which is optional by the spec). These two are both valid: $ echo 'foo_controller_reconcile_duration_seconds_bucket{le="0.01"} 1' | promtool check metrics
foo_controller_reconcile_duration_seconds_bucket no help text
$ echo 'foo_controller_reconcile_duration_seconds_bucket{le="0.01"} 1 1623574027' | promtool check metrics
foo_controller_reconcile_duration_seconds_bucket no help text But none of these are: $ echo 'foo_controller_reconcile_duration_seconds_bucket{le="0.01"} 1 1623574027 # {trace_id="69d991d2ef39b15e71e2f42eb02a9b55"} 0.006"' | promtool check metrics
error while linting: text format parsing error in line 1: spurious string after timestamp: " # {trace_id=\"69d991d2ef39b15e71e2f42eb02a9b55\"} 0.006\""
$ echo 'foo_controller_reconcile_duration_seconds_bucket{le="0.01"} 1 1623574027 # {trace_id="69d991d2ef39b15e71e2f42eb02a9b55"} 0.006 1623574027"' | promtool check metrics
error while linting: text format parsing error in line 1: spurious string after timestamp: " # {trace_id=\"69d991d2ef39b15e71e2f42eb02a9b55\"} 0.006 1623574027\""
$ echo 'foo_controller_reconcile_duration_seconds_bucket{le="0.01"} 1 1623574027 # ' | promtool check metrics
error while linting: text format parsing error in line 1: spurious string after timestamp: " # " and without both optional timestamps (orig error): $ echo 'foo_controller_reconcile_duration_seconds_bucket{le="0.01"} 1 # {trace_id="69d991d2ef39b15e71e2f42eb02a9b55"} 0.006"' | promtool check metrics
error while linting: text format parsing error in line 1: expected integer as timestamp, got "#" |
This reverts commit e8c1983.
Prometheus has 2 metrics format: Prometheus text format, and OpenMetrics. Only OpenMetrics supports examplars. There are a few other differences between the Text Format and OpenMetrics. |
this is probably a backwards incompatible change that requres the new openmetrics content-type header.
Thanks a lot @roidelapluie ! That solves the mystery. I can get this client to work with openmetrics format, but it required a different exposition ending ( So while this does seem parseable by prometheus if we skip optional exemplar timestamps and include the header, I am starting to fear that this PR might be a big one to merge if it requires changing the entire content format :( Edit for anyone is visiting this in the future. I have stopped working on it. This PoC works for histograms, but it needs a lot of work in the library to support two metric formats. |
We are not evolving the text format anymore, we want to promote OpenMetrics. If you plan for adding full openmetrics support for Rust, do not hesitate to reach for help. There are a few differences between both formats, and the library ideally would handle both (via negotiation in the headers). |
This is a quick draft go at adding exemplar support from #393.
Is this something you would be willing to accept if it was cleaned up? Note that it doesn't compile yet.
Have added them to counters atm, but looks like there are a lot of layers between the Counter and the encoding bits so haven't gotten that far. If this is welcome, then any pointers on how to best propagate the data to the encoders would be appreciated.