Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/operator: create Select method for rule selection #5221

Merged

Conversation

slashpai
Copy link
Contributor

Fixes #5191

Signed-off-by: Jayapriya Pai slashpai9@gmail.com

Description

Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request.
If it fixes a bug or resolves a feature request, be sure to link to that issue.

Type of change

What type of changes does your code introduce to the Prometheus operator? Put an x in the box that apply.

  • CHANGE (fix or feature that would cause existing functionality to not work as expected)
  • FEATURE (non-breaking change which adds functionality)
  • BUGFIX (non-breaking change which fixes an issue)
  • ENHANCEMENT (non-breaking change which improves existing functionality)
  • NONE (if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)

Changelog entry

Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.

Update logic on selecting prometheus rules by rejecting invalid ones instead of failing reconciliation

@slashpai slashpai requested a review from a team as a code owner December 12, 2022 14:13
@slashpai
Copy link
Contributor Author

This is not yet done for Thanos, still working on updating PR and removing duplicates hence moving to draft :)

@slashpai slashpai marked this pull request as draft December 12, 2022 14:17
@slashpai slashpai marked this pull request as ready for review December 21, 2022 12:59
@slashpai
Copy link
Contributor Author

Removing draft for initial review

Copy link
Contributor

@JoaoBraveCoding JoaoBraveCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 👍

@slashpai
Copy link
Contributor Author

Need to implement for Thanos as well. I will look in reducing duplicate code between Thanos and Prometheus

@slashpai
Copy link
Contributor Author

slashpai commented Jan 3, 2023

@simonpasquier Can you take another look at this?

Copy link
Contributor

@simonpasquier simonpasquier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we factorize the 2 selectRules() methods (probably in pkg/operator)?

pkg/operator/rules.go Outdated Show resolved Hide resolved
@slashpai
Copy link
Contributor Author

@simonpasquier PTAL when you get a chance :)

pkg/operator/rules.go Outdated Show resolved Hide resolved
pkg/operator/rules.go Outdated Show resolved Hide resolved
pkg/operator/rules.go Outdated Show resolved Hide resolved
pkg/operator/rules.go Outdated Show resolved Hide resolved
@slashpai slashpai changed the title pkg/prometheus: Update selectRules method to be robust pkg/operator: create Select method for rule selection Jan 19, 2023
pkg/operator/rules.go Outdated Show resolved Hide resolved
@slashpai
Copy link
Contributor Author

I updated with changes suggested, need to remove a few units as well. Updating that as well

Copy link
Contributor

@simonpasquier simonpasquier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a few nits :)

pkg/thanos/rules.go Show resolved Hide resolved
pkg/operator/rules.go Outdated Show resolved Hide resolved
pkg/operator/rules.go Outdated Show resolved Hide resolved
pkg/operator/rules.go Outdated Show resolved Hide resolved
pkg/operator/rules.go Outdated Show resolved Hide resolved
@slashpai slashpai force-pushed the refactor_rule_selection branch 2 times, most recently from d0c112f to df1eefe Compare January 20, 2023 06:38
@slashpai
Copy link
Contributor Author

@simonpasquier Addressed all comments except the nsLabeler (added comment in line)

if !strings.Contains(content, "partial_response_strategy: warn") {
t.Fatalf("expected `partial_response_strategy` to be set in PrometheusRule as `warn`")

}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are tests moved from Prometheus package

Copy link
Contributor

@JoaoBraveCoding JoaoBraveCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet refactoring 👍 Great job JP!

Copy link
Contributor

@simonpasquier simonpasquier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but you'll have to rebase on main :)

@slashpai
Copy link
Contributor Author

lgtm but you'll have to rebase on main :)

@simonpasquier rebased

@simonpasquier
Copy link
Contributor

I realize that we don't have an e2e test verifying that the operator doesn't select invalid rules now. As the validating webhook config watches for all namespaces, an invalid rule can never reach the operator. I'd suggest to tweak the namespaceSelector field of the webhook configuration to exclude namespaces with a certain label:

namespaceSelector:
  matchExpressions:
  - key: excludeFromWebhook
    operator: NotIn
    values: ["true"]

Then a rule created in a namespace with the excludeFromWebhook: "true" would skip the validation and we can check that the operator discards it.

@slashpai slashpai force-pushed the refactor_rule_selection branch 2 times, most recently from 06ac5ac to 7a66710 Compare January 30, 2023 06:33
@slashpai slashpai force-pushed the refactor_rule_selection branch 4 times, most recently from 0af5c62 to d522534 Compare March 9, 2023 11:34
defer testCtx.Cleanup(t)
ns := framework.CreateNamespace(context.Background(), t, testCtx)
framework.SetupPrometheusRBAC(context.Background(), t, testCtx, ns)
ruleFilesNamespaceSelector := map[string]string{"excludeFromWebhook": "true", "role": "rulefile"}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note on labels added:

"role": "rulefile" label is added here so that operator selects the rule created for evaulation. We use this label in ruleSelector in framework.MakeBasicPrometheus() as well as framework..MakeBasicRule()

"excludeFromWebhook": "true" is added to skip webhook validation and validate if operator doesn't fail reconcilation due to invalid rule

@slashpai
Copy link
Contributor Author

slashpai commented Mar 9, 2023

@simonpasquier Rebased and updated the e2e test to check prometheus reconciliation status after invalid rule created

@slashpai
Copy link
Contributor Author

Just for more validation I tested this change manually as well checked the e2e test against main branch on which refactoring is not there. Prometheus fails to reconcile without this change. Also the metric prometheus_operator_managed_resources{controller="prometheus",resource="PrometheusRule",state="rejected"} 1 is updated correctly

@slashpai
Copy link
Contributor Author

@simonpasquier Can you take another look at this one, updated test :)

reorganized rule selection code from prometheus
and thanos package to reduce duplicate code.

This change will also make sure
there is no reconcillation failure on controller
when a bad PrometheusRule object is created.
Instead of a reconcillation failure, controller
will just reject the bad PrometheusRule.

Fixes prometheus-operator#5191

Signed-off-by: Jayapriya Pai <slashpai9@gmail.com>
@simonpasquier
Copy link
Contributor

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reject invalid PrometheusRule objects beforehand rather than failing the reconciliation
3 participants