-
Notifications
You must be signed in to change notification settings - Fork 621
Parameterize metric/label name validation scheme #11848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e47a7e5
to
78ba72c
Compare
pkg/mimirtool/rules/rules.go
Outdated
@@ -255,7 +256,8 @@ func (r RuleNamespace) Validate(groupNodes []rulefmt.RuleGroupNode) []error { | |||
func ValidateRuleGroup(g rwrulefmt.RuleGroup, node rulefmt.RuleGroupNode) []error { | |||
var errs []error | |||
for i, r := range g.Rules { | |||
for _, err := range r.Validate(node.Rules[i]) { | |||
// TODO(juliusmh): | |||
for _, err := range r.Validate(node.Rules[i], validation.LegacyNamingScheme) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: How should we handle validation checks in mimirtool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can think about it after the initial review :D
78ba72c
to
549cafd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, gave it a quick review. Nice work! Please check my questions :)
if e.limits.NameValidationScheme(tenantID) == prom_validation.LegacyNamingScheme { | ||
return prom_validation.LegacyNamingScheme, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it bad to enforce the legacy name validation scheme on the query path? All it will do is prevent the user from querying any metrics they ingested in the past with the UTF-8 name validation scheme?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR tries to maintain the current behavior, and currently, we validate label names in label_join
, label_replace
and count_values
with legacy scheme. Queries that don't use those functions support UTF8, today. This can be seen in the tests I did.
The question we have to answer: is there any scenario in which data returned by a querier or frontend is fed back to mimir without validation (e.g. to ingesters directly, or with label validation disabled)? If the answer is no, we can safely remove the validation. I'm assuming the answer is yes though, based on this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@charleskorn WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that data will always be validated when written back by rulers. They don't skip validation.
However, I believe we should validate names in at least label_replace
and label_join
, as not validating may lead to unexpected results:
- functions that consume the results of
label_replace
andlabel_join
may have undefined behaviour with invalid label names - a user testing a recording or alerting rule as a query may not realise that their rule will later fail to write samples when evaluated by rulers
pkg/streamingpromql/operators/aggregations/count_values_test.go
Outdated
Show resolved
Hide resolved
c1310a3
to
3888883
Compare
pkg/cardinality/request.go
Outdated
@@ -262,10 +263,10 @@ func extractLabelNames(values url.Values) ([]model.LabelName, error) { | |||
|
|||
labelNames := make([]model.LabelName, 0, len(labelNamesParams)) | |||
for _, labelNameParam := range labelNamesParams { | |||
labelName := model.LabelName(labelNameParam) | |||
if !labelName.IsValid() { | |||
if !validation.UTF8NamingScheme.IsValidLabelName(labelNameParam) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the naming scheme used here be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering in this case whether this code should use a parameter (instead of hard coding), but not let it be user configurable since, as I explain above, I'm not sure we want to enforce the legacy naming scheme in queries. WDYT @charleskorn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know enough about this code path to know what makes sense here. Does Prometheus apply validation to the label names here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does Prometheus apply validation to the label names here as well?
@juliusmh Did you check?
In any case, this is an example of read path isn't it, where we don't want to enforce the "legacy" naming scheme? If so, I think it could a non-configurable parameter/argument, just to avoid hard coding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afaik, the api/v1/cardinality/*
endpoints are Mimir specific.
The closest is probably db stats in prometheus admin API (calculated here), where you can get some analysis on label name/value cardinality, which, as far as I can see, doesn't involve any validation. But I'm not super confident in that analysis.
756ef29
to
448dde2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is starting to look good. We should keep that test case though.
35aeae6
to
cc723b6
Compare
cc723b6
to
936bd5c
Compare
As discussed @aknuds1 :
For promql, the main problem is that query opts are directly constructed and passed to query engine in prometheus' v1.NewApi. Unless we want to rewrite that handler we need to somehow "modify" You're right, for MQE this can be implemented differently, see this commit. The wrapper, is/was a way of unifying this logic for promql/mimir query engine paths. |
936bd5c
to
0522664
Compare
you're right 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dimitarvdimitrov requests a test case for an invalid label name in the cardinality API. Could you add that @juliusmh?
Also, could you add a CHANGE type changelog entry about how the read path now supports all valid UTF-8 strings at least for the label value cardinality API (/api/v1/cardinality/label_values
)? Please see Slack discussion regarding the latter.
6fc01ca
to
b83fffe
Compare
Given this was a bit chaotic, quick summary of what was addressed:
and what is still open:
|
thanks @juliusmh . i think only the last thread still needs resolving. i left a suggestion |
72ecc48
to
25e0428
Compare
CHANGELOG.md
Outdated
* [CHANGE] Query-frontend: Add support for utf8 label/metric names in `/api/v1/cardinality/{label_values|label_values|active_series}` endpoints. #11848. | ||
* [CHANGE] Querier: Add support for utf8 metric names, support utf8 label names in `label_join`, `label_replace` and `count_values` PromQL functions. #11848. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* [CHANGE] Query-frontend: Add support for utf8 label/metric names in `/api/v1/cardinality/{label_values|label_values|active_series}` endpoints. #11848. | |
* [CHANGE] Querier: Add support for utf8 metric names, support utf8 label names in `label_join`, `label_replace` and `count_values` PromQL functions. #11848. | |
* [CHANGE] Query-frontend: Add support for UTF-8 metric and label names in `/api/v1/cardinality/{label_values|label_values|active_series}` endpoints. #11848. | |
* [CHANGE] Querier: Add support for UTF-8 metric and label names in `label_join`, `label_replace`, and `count_values` PromQL functions. #11848. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what looks like an accidental test change. Also please review my suggestions.
25e0428
to
5cc2c4b
Compare
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
…(rather than NameValidatingEngine)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to update mimir-prometheus.
5cc2c4b
to
65bc6f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
What this PR does
Direct and indirect references to the global name validation scheme were removed in favor of a per-tenant override.
TODO:
streamingpromqlcompat.NameValidatingEngine
is the right approach: Decided to do UTF8 validation in query path.Which issue(s) this PR fixes or relates to
Depends on
Fixes: #11503
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
. If changelog entry is not needed, please add thechangelog-not-needed
label to the PR.about-versioning.md
updated with experimental features.