Expose `up_info` metric with label for potential errors #3068
RichiH added the kind/enhancement label Aug 14, 2017
This is event logging, which doesn't belong as a metric. There's also a really high bar for adding any metric to be ingested on every scrape, and a metric with unbounded cardinality is not likely to meet that bar. This was already discussed on #2317.
It's not unbounded, as the source is Prometheus and it's only fixed strings.
It's unbounded, as it may contain arbitrary URLs and error messages. This is something for logging and/or tracing.
brian-brazil added the wont-fix label Aug 21, 2017
brian-brazil closed this Aug 21, 2017
Given our recent discussions in OpenMetrics about ENUM and STATESET, it is probably possible to find some middle ground in this use case. I would love to get a somewhat wider discussion regarding this going, but maybe it's best to add something like this to the dev summit.
RichiH reopened this May 28, 2018
OpenMetrics doesn't change anything; this is still event logging with unbounded cardinality. This can't be represented as an enum.
You are thinking of the maximally flexible case, whereas I was, and still
am, thinking of having a limited set of information which can be encoded.
For example, no one would require this system to carry dynamic URLs or
return text. Storing HTTP return codes might make sense, but is still
relatively large in space. Library integrations signalling state and
Prometheus using this and other information to categorize would be even
more curated.
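
To make the "limited set of information" idea concrete, here is a minimal Python sketch of collapsing free-form scrape errors into a small, fixed set of categories. Everything in it is an assumption for illustration: the `ScrapeErrorKind` enum, the `classify` helper, and the substring rules are hypothetical and are not taken from Prometheus or OpenMetrics. It only shows that the set of possible values stays bounded no matter what the raw error text contains.

```python
from enum import Enum


class ScrapeErrorKind(Enum):
    """Hypothetical, fixed set of error categories (not a Prometheus API)."""
    NONE = "none"
    TIMEOUT = "timeout"
    CONNECTION_REFUSED = "connection_refused"
    HTTP_STATUS = "http_status"
    OTHER = "other"


# Illustrative substring rules; the matched phrases are assumptions about
# what a scrape error message might contain, not an authoritative list.
_SUBSTRING_RULES = [
    ("context deadline exceeded", ScrapeErrorKind.TIMEOUT),
    ("connection refused", ScrapeErrorKind.CONNECTION_REFUSED),
    ("server returned http status", ScrapeErrorKind.HTTP_STATUS),
]


def classify(error_text):
    """Map an arbitrary error string onto one of the fixed categories."""
    if not error_text:
        return ScrapeErrorKind.NONE
    lowered = error_text.lower()
    for needle, kind in _SUBSTRING_RULES:
        if needle in lowered:
            return kind
    # Anything unrecognized falls into a single catch-all bucket,
    # so URLs and dynamic text never leak into the category value.
    return ScrapeErrorKind.OTHER


# Made-up raw errors: each one is distinct because of the embedded host.
raw_errors = [
    "Get http://host-%d:9100/metrics: connect: connection refused" % i
    for i in range(100)
]
distinct_raw = len(set(raw_errors))
distinct_kinds = len({classify(e) for e in raw_errors})
```

In this sketch the 100 distinct raw strings collapse into a single category, and the number of categories can never exceed the size of the enum. The trade-off is that the dynamic detail (which URL, which message) is discarded, which is the part the other side of this thread argues belongs in logs.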
You are asking it to return dynamic text and URLs, as that's what the error from a failed scrape looks like. I see what you are trying to do, but metrics are fundamentally not suitable for this use case. The only thing it is sane for us to expose as part of every single scrape that happens is |
There's no information that wasn't considered originally, so closing again.
brian-brazil closed this Jun 13, 2018
lock bot commented Mar 22, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
RichiH commented Aug 14, 2017
`up` is obviously somewhat special. While it's possible to navigate to `/targets`, wait for the target list to load, and then search for the job/instance having issues, that's hardly a nice way to get at the underlying issue. `/api/v1/targets` exists as well, but that also lives outside of what PromQL can access.
It would be simpler to have an `up_info` or similar which could expose the current error state and use that as part of alerting.
This comes close to event logging, but I would argue that this is a somewhat special case: it's built into Prometheus, so it's hard to get context otherwise.
https://github.com/RichiH/OpenMetrics/issues/3 is somewhat related to this FR.
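
For reference, the `/api/v1/targets` route mentioned above can be consumed outside PromQL. The Python sketch below pulls the failing targets and their last error out of a response of that shape. The field names (`activeTargets`, `labels`, `lastError`, `health`) follow the Prometheus HTTP API as I understand it; the sample payload, host names, and the `failing_targets` helper are made up for illustration, and no network call is made.

```python
import json

# Hypothetical sample shaped like a /api/v1/targets response.
# Hosts, jobs, and error text are invented for this sketch.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"labels": {"job": "node", "instance": "host-a:9100"},
       "scrapeUrl": "http://host-a:9100/metrics",
       "lastError": "",
       "health": "up"},
      {"labels": {"job": "node", "instance": "host-b:9100"},
       "scrapeUrl": "http://host-b:9100/metrics",
       "lastError": "context deadline exceeded",
       "health": "down"}
    ],
    "droppedTargets": []
  }
}
""")


def failing_targets(response):
    """Return (job, instance, last_error) for each active target not reported as up.

    Targets whose health is anything other than "up" are included,
    which also covers targets that have not been scraped yet.
    """
    failures = []
    for target in response.get("data", {}).get("activeTargets", []):
        if target.get("health") != "up":
            labels = target.get("labels", {})
            failures.append((
                labels.get("job", ""),
                labels.get("instance", ""),
                target.get("lastError", ""),
            ))
    return failures


failures = failing_targets(SAMPLE_RESPONSE)
```

This gets at the error text, but only from an external script polling the API, so it cannot feed an alerting rule directly, which is the gap the feature request describes.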