More flexible label name pattern #1178
Comments
I think we should be very wary of extending what we allow, as it limits our ability to add new syntax in the future.
This is an acknowledged flaw in how the current Prometheus<->Alertmanager API developed; it will be resolved as part of the new alertmanager.
The problem here is much broader: Prometheus has no knowledge of what any labels mean, as that is going to be almost entirely defined by the context of the particular organisation that's using it. Adding first-class namespace support doesn't really add anything for our users: they just end up with longer label names, and for the things that have more standard names (e.g. job and alert summary) the semantics are no clearer.
The labelmap relabel action was added for exactly this purpose.
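As a rough illustration of what `labelmap` covers here, a minimal sketch of a scrape config that copies discovered Kubernetes pod labels into target labels (the job name is hypothetical, and the syntax is that of present-day configuration files):

```yaml
scrape_configs:
  - job_name: kubernetes-pods        # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Copy every discovered pod label (already sanitized into
      # __meta_kubernetes_pod_label_<name>) to a plain target label.
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
```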
I think this is not the first time this comes up. While a dot should be fine, a
I want to use the dot so that
I've used systems in the past where
We should always carefully consider any changes. That's the point of this issue. The statement can never be false though, so it's not an argument.
Please elaborate on how this is solved. The new API has annotations, which is just another word for payload. We still get sets of key/value pairs. Any assumptions we make about those are an implicit acknowledgement of their source.
The using organisation and the components it uses. That is exactly the point of the proposal though.
There are also users of our users and users of components Prometheus uses (AM), who benefit greatly from any reliable definition of labels that have a broader meaning.
It's a tool to transform labels, nothing more.
Sorry, this has been an ongoing pseudo-argument for many months now. I absolutely do not see why syntactic sugar for which there has been absolutely no need over an extended time justifies blocking things that have actual value. Also, this does not work by definition, as job can have any unicode character. I see that there are potential problems when this propagates into metrics. One could limit this to label names for external communication.
We added new things with specific meaning to the alertmanager code, rather than making them separate structured fields; generatorURL is the most obvious one there. With the new API we won't do so: anything which the alertmanager depends on at a code level should be an explicit field in the API.
Did you lose some words here?
My argument is that this broader meaning doesn't exist in any useful sense. It's hard to get two teams in the same company to agree on what label taxonomies to use, let alone between companies. Target and alert labels tend to be deployment and team specific (though there are a few things like "severity" that are a de facto standard, and within a company there tends to be some notion of "datacenter"). In my experience alert routing ends up being handled on a team by team basis. Every team ends up with their own part of the routing tree that they set up according to their particular needs.
It allows you to feed labels defined in other systems into Prometheus as labels. Did you mean something else?
I've used this and seen it used to great benefit previously; multi-tenant Prometheus setups, and setups used by multiple groups who aren't all experts, pose special challenges. This helps prevent one group inadvertently taking out another's monitoring (or their own). If you don't have such a setup you're mostly fine, but if you do this is essential. I consider most of the examples of recording rules we have to be dangerous due to lacking a job label, as we're setting users up for an outage. It's too easy to inadvertently muck up some other job's data.
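To illustrate the recording-rule concern with a hedged sketch (metric, rule, and group names are hypothetical, and the YAML rule-group format shown is the current one rather than the format of the time):

```yaml
groups:
  - name: example-rules                 # hypothetical group name
    rules:
      # Dangerous: aggregating away `job` lets any job that exposes
      # http_requests_total silently pollute the recorded series.
      # - record: path:http_requests:rate5m
      #   expr: sum by (path) (rate(http_requests_total[5m]))

      # Safer: keep the job label, so each job's data stays separate.
      - record: job_path:http_requests:rate5m
        expr: sum by (job, path) (rate(http_requests_total[5m]))
```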
Hmm, what are you thinking here? I can see potential uses for remote storage, but I'm not seeing it for alerting or federation.
No, we explicitly agreed after long back-and-forth that there will be labels and annotations as the only sources for data (aside from start and end times for an alert). I hope we do not have to reopen this discussion.
Okay, I sense there might be some misunderstanding. I do not want to promote prefixing all labels everywhere. I care about labels that affect the processing behavior of components. They are not just free-form key/value pairs to slice and dice along, but parameters.
The Alertmanager is designed in a way that alerts can be pushed from any system. Prometheus alerts certainly have some annotation data that may be worth submitting to AM and can be used in notification templates. There is also potential for more than one hop. Say I set up an alert that is dispatched to a webhook which performs some automatic action; the system behind that webhook has a parameter that I define in my alert via labels or annotations. I see that the line gets blurry for labels local to an organizational setup, e.g.
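To make the notification-template point concrete, a hedged sketch of an Alertmanager receiver that surfaces an alert annotation in an e-mail subject (the receiver name, address, and annotation key are hypothetical):

```yaml
receivers:
  - name: oncall-mail                    # hypothetical receiver
    email_configs:
      - to: oncall@example.org           # hypothetical address
        headers:
          # Pull the `summary` annotation set by the Prometheus alert rule
          # into the notification, without Alertmanager interpreting it.
          Subject: '{{ .CommonAnnotations.summary }}'
```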
That agreement was for what comes from the alert rule itself. GeneratorURL belongs as a separate field, similar to the times; I don't believe there are any others (HA might require some, but the user shouldn't be exposed to them).
I see where you're coming from now. The way I'd handle that is that there'd be some labelset that restricts which alerts the system behind the webhook looks at, whether via the alertmanager routing config or in the system itself, so you know it has the intended meaning.
Per the above, the problem isn't just the clash of that particular label; it's a clash within the labelspace of your organisation. I can see how what you're proposing would help, but it's just solving one part of your label namespacing. You still need to do the rest for all the org-specific labels, so you're not saving much overall.
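As a hedged sketch of the routing-based restriction described above (label names, receiver names, addresses, and the URL are hypothetical):

```yaml
route:
  receiver: default
  routes:
    # Only alerts carrying this organisation-defined labelset ever
    # reach the remediation webhook, so it can trust their meaning.
    - match:
        remediation: automatic          # hypothetical label
        team: frontend                  # hypothetical label
      receiver: auto-remediate

receivers:
  - name: default
    email_configs:
      - to: oncall@example.org          # hypothetical address
  - name: auto-remediate
    webhook_configs:
      - url: http://remediator.example.org/v1/alerts   # hypothetical URL
```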
To me, acknowledging something as a problem and then providing a workaround that will 'mostly be fine' makes me feel uneasy; structuring something that is conceptually structured is not bad. As you said elsewhere, there is such a thing as being too generic. To make this short: as the discussion lacks opinions and participation, it is fair to say that the far-reaching changes needed to implement this are just unrealistic. So we can prune this one out to focus on other things. Since you brought up the shortcut for
fabxc closed this Oct 28, 2015
(/me still in catch-up mode. Just as an afterthought.) Allowing anything that has a different meaning in PromQL as part of a label name would be a truly fundamental syntax change. "dot-job" notation would be a fundamental change of another kind. In any case, all three things I've mentioned above are different problems (albeit not completely orthogonal), and any single one would require pretty deep changes. None of them is on the agenda right now.
lock bot commented Mar 24, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
fabxc commented Oct 23, 2015
Prometheus and Kubernetes have common ground in their usage of labels, which makes them a very good fit for each other. However, Kubernetes has a more flexible pattern for label names. Prometheus only allows alphanumeric characters and `_`; Kubernetes additionally allows prefixing using `.` and `/` (detailed description).
When retrieving labels/annotations from Kubernetes via service discovery, we currently convert all `.` and `/` characters to `_` to fit into our pattern. This partially breaks an idiomatic interface between the two and may cause ambiguity and obfuscation.
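To illustrate the conversion being described, a hedged sketch of how a prefixed Kubernetes label surfaces on the Prometheus side (the label key and value are hypothetical):

```yaml
# Kubernetes pod manifest fragment with a prefixed (hypothetical) label:
metadata:
  labels:
    example.com/tier: frontend

# Prometheus label names must match [a-zA-Z_][a-zA-Z0-9_]*, so service
# discovery currently rewrites '.' and '/' to '_' and exposes:
#   __meta_kubernetes_pod_label_example_com_tier = "frontend"
```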
Less-than-ideal integration with a single third-party component is of course not enough reason to change our model. However, I think adopting the Kubernetes pattern has advantages for Prometheus as well.
In our retrieval layer we have a multitude of labels with well-defined semantics that are used internally and set by the user in their configuration (specifically relabeling rules).
The Alertmanager is a component that should ideally have zero knowledge of Prometheus. Currently, this is only partially true because certain labels or annotations (currently named 'payload') are interpreted in a specific way. The semantics are implied, which inhibits the flexibility to integrate well with other systems via templating.
More broadly, Prometheus currently has no way to expose labels with well-defined meaning to other systems without them implicitly interpreting their source. On the same note, it offers other systems no way to feed those labels into Prometheus explicitly.
More or less, we are currently achieving a subset of those possibilities with our internal labels, which are prefixed by `__`. For seamless integration with other systems, the idea of a global namespace `prometheus.io/<label>` is superior. There are some implications that would have to be solved by breaking or simply sticking with 'legacy', i.e. `__name__` would more appropriately be `prometheus.io/metric_name`, and `alertname` would be `prometheus.io/alert_name`. But that can be handled as far as I can see.
Normal metric labels are generally not affected by this, even though it is possible.
Adopting the Kubernetes approach will make it easier for us to keep track of labels within our ecosystem that have a well-defined meaning and improve explicitness in their usage in third-party components.