-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse created timestamps from OpenMetrics-Text format #13506
base: main
Are you sure you want to change the base?
Conversation
a529cad
to
78a9972
Compare
6743011
to
424b6bf
Compare
424b6bf
to
e3ffab3
Compare
// The difference between an usual metric and the created timestamp is that for | ||
// the created timestamp, the metric name ends with "_created". | ||
// | ||
// Created timestamps are always exposed right after the metric they are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right after
Is that part of the OpenMetrics spec? I think in practice I have only seen _created
as the last metric in a MetricFamily, but I haven't read anything in the spec that requires that. It does have to be with the other samples, but e.g. _created
before _total
might be possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the example in their specification, I see _created lines for each metric point and not MetricFamily.
In the section Overall Structure, we have one created line for acme_http_router_request_seconds_created{path="/api/v1",method="GET"}
and another for acme_http_router_request_seconds_created{path="/api/v2",method="POST"}
(notice the label difference while using the same name)
but e.g. _created before _total might be possible
Do you mean for the same MetricPoint? All the examples I've seen have the _created line as the last line of a MetricPoint, but I couldn't find anything in the spec saying that this order is a MUST. I'm not sure if we should expect something different 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct, I meant MetricPoint, sorry!
I agree that all of the samples show _created
as the last line, but that doesn't seem enforced so I think it is possible and valid OpenMetrics for _created
to come first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😭😭😭😭
That makes things so much more complicated 🥲
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it might be worth getting an opinion from someone more involved in OpenMetrics than myself as I agree, it would be nice not to have to worry about the order!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not more involved in OpenMetrics, but I think it's totally fine that an experimental feature relies on this accidental property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, the order of _created is not guaranteed to be the last. But also I don't understand you worries @ArthurSens -- if it's last we already have the problem of having to "buffer" previous samples related to the _created metric which is potentially at the end. So it does not matter if it's end, middle or first. We still have to buffer samples (and corresponding series, which should be the same) no matter what, so we can add potential 0 sample OR in future add it to metadata.
Right now CreatedTimestamp() returns number ONLY when parser hits _created metric, which is clearly not enough for OM as we already written sample for related series no? We have to change bit this logic to e.g. buffer series in (OpenMetrics parsing only).
To me that means e.g. having different append
function all together for OM, and that's fine. It should motivate, heavy changes to newer OM protocol to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure yet(doing research at the moment), but I believe all official client libraries that offer OpenMetrics support today expose _created
lines after the related metric. If I can confirm this, and seeing that this feature is experimental, could we rely on this order for this PR?
Expecting different order for the _created lines will make the code a lot more complex, we could make reviews easier if we split this up in different work streams :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, but my point stays Arthur, how it makes the code more complex if it's before , in the middle or after the line? The main challenge is that you have to keep buffering samples somewhere BEFORE appending until you know that _created line. This is because we try to be lean and stream parsing line by line.
This is the main complexity to the code, not searching where that line is or something 🤔 I also don't see that buffering in this PR, can you elaborate a planned algorithm for parsing with OM text then you envision (with and without assumption of _created line being last)? Maybe in pseudocode here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few stylistic thoughts.
Note there are some merge clashes , also a CI error.
if et, err = p.Next(); err != nil { | ||
if errors.Is(err, io.EOF) { | ||
err = nil | ||
if !skipCallNextEntry { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if it would be simpler to re-arrange so Next
has always been called at the start of the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see how that could be done... (or maybe I couldn't understand what you mean)
For OM-text, we need to parse 2 lines to identify created timestamps and the parser can't go back once a line is parsed, therefore we need to skip parsing yet another line if we haven't found a _created line in the last call 🤔
e3ffab3
to
e79b9ed
Compare
whooops, a bad push accidentally closed the PR 😅 |
6282cf4
to
419ed90
Compare
I can see that the changes I've made have broken the fuzz testing, but I do not understand how I'm supposed to fix it 🤔, could someone point me in the right direction? And the go tests are passing locally... not sure if this one is flaky or not 😬 |
I think the Go test is currently flakey, I'm seeing various build failures on other PRs. |
Looks like you just need to update
|
Oh wow, not sure how my IDE didn't catch that 🤔 |
cb0dc26
to
a7226df
Compare
Conflicts resolved again, reviews are appreciated 😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start! Thanks! Sorry for delays in reviews.
However, something is off with the current flow for OM text, shouldn't we buffer appends or potential series related to _created metric?
Let's figure this out https://github.com/prometheus/prometheus/pull/13506/files#r1506306215
@@ -64,7 +64,10 @@ _metric_starting_with_underscore 1 | |||
testmetric{_label_starting_with_underscore="foo"} 1 | |||
testmetric{label="\"bar\""} 1 | |||
# TYPE foo counter | |||
foo_total 17.0 1520879607.789 # {id="counter-test"} 5` | |||
foo_total 17.0 1520879607.789 # {id="counter-test"} 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change order of _created to explictly show (and test) how it will be handled.
// CT is an experimental feature. For now, we don't need to fail the | ||
// scrape on errors updating the created timestamp, log debug. | ||
level.Debug(sl.l).Log("msg", "Error when appending CT in scrape loop", "series", string(met), "ct", *ctMs, "t", t, "err", err) | ||
if sl.enableCTZeroIngestion { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest burning that _created metric in fire in this mode (not ingest it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't get it 😬 what do you mean with "in this mode"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you are (trying) to do this already with that skip line? (in this mode == when CT zero ingestion, so CT handling is enabled. if enabled we should probably kill _created metrics too once used).
// The difference between an usual metric and the created timestamp is that for | ||
// the created timestamp, the metric name ends with "_created". | ||
// | ||
// Created timestamps are always exposed right after the metric they are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, the order of _created is not guaranteed to be the last. But also I don't understand you worries @ArthurSens -- if it's last we already have the problem of having to "buffer" previous samples related to the _created metric which is potentially at the end. So it does not matter if it's end, middle or first. We still have to buffer samples (and corresponding series, which should be the same) no matter what, so we can add potential 0 sample OR in future add it to metadata.
Right now CreatedTimestamp() returns number ONLY when parser hits _created metric, which is clearly not enough for OM as we already written sample for related series no? We have to change bit this logic to e.g. buffer series in (OpenMetrics parsing only).
To me that means e.g. having different append
function all together for OM, and that's fine. It should motivate, heavy changes to newer OM protocol to fix this.
56772d3
to
f189fd0
Compare
Signed-off-by: Arthur Silva Sens <arthur.sens@coralogix.com>
f189fd0
to
d19875d
Compare
Fix #12980
The already existing feature-flag 'created-timestamp-zero-ingestion' is extended to also cover OpenMetrics-Text format.
Once enabled, _created lines, that before were treated as a whole new timeseries, will be appended as a synthetic zero to the timeseries they are related to.