Migrate alerting mixin from cluster-monitoring-operator #613

Merged
merged 5 commits into from Jul 13, 2021

Conversation

Contributor

@lilic lilic commented Jun 18, 2021

Since upstream etcd uses a jsonnet mixin, I used jsonnet and jb to generate the manifests, as discussed. We also override the OpenShift-specific settings. In the next steps I will add fixes for the alerts; this PR only brings in what we already have in OpenShift today via cluster-monitoring-operator: https://github.com/openshift/cluster-monitoring-operator/blob/master/jsonnet/control-plane.libsonnet#L8
The original PrometheusRule manifest file is located at https://github.com/openshift/cluster-monitoring-operator/blob/7f4925a7203622d70b3007fbddfb6bc5cce6c1d9/assets/control-plane/etcd-prometheus-rule.yaml.

Reqs:

Mainly opening this so we can discuss the approach; I can clean up the generation scripts (they are hacky).


# Generate jsonnet mixin prometheusrule manifest.

cd jsonnet && jb update && jsonnet -J vendor main.jsonnet | gojsontoyaml > ../manifests/0000_90_etcd-operator_03_prometheusrule.yaml
Contributor

Question: could we look into adding this to github.com/openshift/build-machinery-go so the rest of the control-plane components can emulate this process if they would like? It would then also become part of our Makefile workflow (make update).

Contributor Author

@lilic lilic Jun 18, 2021

Yes, that sounds good. If you don't mind, I would do that in a follow-up.

Note that this would only be used if we fix things upstream or once a new release is brought in.

Contributor

@hexfusion hexfusion Jun 18, 2021

Should we add a linter test as part of the process? Is jsonnet-lint useful?

Contributor Author

@lilic lilic Jun 18, 2021

I use the jsonnetfmt integration in vim, so everything gets formatted correctly, but I haven't tried jsonnet-lint. I can look into it.
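
For illustration, a minimal shell sketch of what a combined format and lint check could look like. The function name check_jsonnet and the JSONNETFMT / JSONNET_LINT override variables are hypothetical; the sketch assumes the jsonnetfmt and jsonnet-lint binaries from go-jsonnet are on the PATH and that jsonnet-lint accepts -J for the vendor directory. Nothing like this is part of the PR.

```shell
# Tool names can be overridden, e.g. when the binaries live outside PATH.
# (Hypothetical variables, not defined anywhere in this repository.)
JSONNETFMT="${JSONNETFMT:-jsonnetfmt}"
JSONNET_LINT="${JSONNET_LINT:-jsonnet-lint}"

# check_jsonnet VENDOR_DIR FILE...
# Returns non-zero if any file fails the format check or the linter.
check_jsonnet() {
  local vendor="$1"; shift
  local rc=0 f
  for f in "$@"; do
    # --test makes jsonnetfmt exit non-zero when the file is not already formatted.
    "$JSONNETFMT" --test "$f" || { echo "needs formatting: $f" >&2; rc=1; }
    # -J adds the vendored mixin directory to the import search path.
    "$JSONNET_LINT" -J "$vendor" "$f" || { echo "lint failed: $f" >&2; rc=1; }
  done
  return "$rc"
}
```

From the jsonnet directory this would be called as check_jsonnet vendor main.jsonnet; every file is checked before the function returns, so one run reports all problems.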

@hexfusion
Contributor

/approve

@openshift-ci openshift-ci bot added the approved label Jun 18, 2021
@hexfusion
Contributor

/test e2e-agnostic-upgrade


@lilic lilic changed the title from "WIP: Migrate alerting mixin from cluster-monitoring-operator" to "Migrate alerting mixin from cluster-monitoring-operator" Jun 21, 2021
@lilic
Contributor Author

lilic commented Jun 21, 2021

/hold

Until the CMO PR is approved, so that both are merged on the same day into the same nightly.

@openshift-ci openshift-ci bot added the do-not-merge/hold label Jun 21, 2021
@hexfusion
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jun 21, 2021
namespace: 'openshift-etcd-operator',
annotations:
{
'include.release.openshift.io/ibm-cloud-managed': 'true',
Contributor

@marun marun Jun 21, 2021

I think there is a non-zero chance that developers external to the team may need to add more of these annotations in the future. Maybe inject a comment into the manifest after conversion to yaml (since json doesn't support comments) indicating that the file is generated (and maybe how to update the inputs) and add a verify check that prevents a manual change from being committed (since it would be overwritten by a subsequent generate invocation)?

I'm guessing this is what @hexfusion meant by updating make, though, so it can be done in a follow-up.
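
A minimal shell sketch of the two ideas above (a generated-file header plus a verify check). The header wording and the function names add_generated_header and verify_generated are assumptions made for illustration; the PR does not define them, and a follow-up may well do this differently.

```shell
# Hypothetical header text; the thread does not specify any wording.
GENERATED_HEADER='# Generated from jsonnet/main.jsonnet. DO NOT EDIT; change the jsonnet inputs and regenerate.'

# Read generated YAML on stdin and write it to stdout with the header on top.
# The comment is added after the JSON-to-YAML step because JSON has no comments.
add_generated_header() {
  printf '%s\n' "$GENERATED_HEADER"
  cat
}

# verify_generated COMMITTED_FILE
# Read freshly generated YAML on stdin and compare it, with the header added,
# against the committed manifest. Any diff is printed to stderr.
verify_generated() {
  local committed="$1"
  if add_generated_header | diff -u "$committed" - >&2; then
    echo "OK: $committed is up to date"
  else
    echo "ERROR: $committed differs from the generated output; regenerate it" >&2
    return 1
  fi
}
```

With the generation command from this PR, the generate step would pipe the gojsontoyaml output through add_generated_header before writing the manifest, and a verify step would pipe the same output into verify_generated ../manifests/0000_90_etcd-operator_03_prometheusrule.yaml. A hand-edited manifest then fails the check because it no longer matches what the inputs produce.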

Contributor Author

As discussed, I will do this in a follow-up and sync with other folks on it.

@lilic
Contributor Author

lilic commented Jun 23, 2021

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold label Jun 23, 2021
@lilic
Contributor Author

lilic commented Jun 23, 2021

/retest

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

8 similar comments

@lilic
Copy link
Contributor Author

lilic commented Jun 24, 2021

/hold

It seems etcdHighNumberOfLeaderChanges fires, but only in e2e-gcp-five-control-plane-replicas. The alert itself was not changed, so I am not sure what is going on; I will have a look.

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

19 similar comments

@hexfusion
Contributor

/hold 5 node seems broken

@openshift-ci openshift-ci bot added the do-not-merge/hold label Jun 26, 2021
@lilic
Contributor Author

lilic commented Jul 13, 2021

/retest

1 similar comment

@hexfusion
Contributor

upstream test needs fixed, failure not related to code.

/override ci/prow/e2e-gcp-five-control-plane-replicas

@openshift-ci
Contributor

openshift-ci bot commented Jul 13, 2021

@hexfusion: Overrode contexts on behalf of hexfusion: ci/prow/e2e-gcp-five-control-plane-replicas

In response to this:

upstream test needs fixed, failure not related to code.

/override ci/prow/e2e-gcp-five-control-plane-replicas

@lilic
Contributor Author

lilic commented Jul 13, 2021

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold label Jul 13, 2021
@openshift-merge-robot openshift-merge-robot merged commit 5c17c5c into openshift:master Jul 13, 2021
@lilic lilic deleted the move-alerts branch July 13, 2021 13:43
Labels
approved, lgtm

5 participants