OTA-1817: Adding unit tests for the extend recommended alerts#2262
OTA-1817: Adding unit tests for the extend recommended alerts#2262hongkailiu wants to merge 2 commits intoopenshift:mainfrom
Conversation
pkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert:
Add Additional Alerts to the OCP Update Precheck
Component teams want to raise special alerts to the `oc adm upgrade recommend ...`
command for consideration by cluster-admins considering a future cluster update.
And we also give `label-maintainers` the ability to add or remove alerts from
the recommend-precheck by a new label `openShiftUpdatePrecheck`[1]
We can follow below steps to generate these test files:
Launch a cluster by cluster-bot:
$ launch 4.22 aws
Cluster Bot gives the default (X.509 certificate based) kubeconfig
returned by the installer, but Thanos alert-retrieval requires a
token. So I created one using the cluster-version operator's
namespace:
$ TOKEN="$(oc -n openshift-cluster-version create token default)"
And Cluster Bot clusters use self-signed certificate, so I collected
all possible Kube API server and ingress TLS certificates in a file:
$ oc -n openshift-kube-apiserver get -o json secrets | jq -r '.items[] | select(.type == "kubernetes.io/tls").data["tls.crt"] | @base64d' >ca.crt
$ oc -n openshift-ingress get -o json secrets | jq -r '.items[] | select(.type == "kubernetes.io/tls").data["tls.crt"] | @base64d' >>ca.crt
Prepare test data for warning alerts:
$ cat <<EOF | oc -n openshift-cluster-version create -f -
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: warning-testing
namespace: openshift-cluster-version
annotations:
kubernetes.io/description: Alerting rules for testing.
spec:
groups:
- name: VirtWarningPrecheck
rules:
- alert: VirtHandlerDaemonSetRolloutFailing
annotations:
summary: Test summary.
description: Test description.
expr: |
up{job="cluster-version-operator"} == 1
labels:
severity: warning
namespace: openshift-cluster-version
reason: UpdatingPrometheusFailed
openShiftUpdatePrecheck: "true"
- name: VMCannotBeEvictedWarningNormal
rules:
- alert: VMCannotBeEvicted
annotations:
summary: Test summary.
description: Test description.
expr: |
up{job="cluster-version-operator"} == 1
labels:
severity: warning
namespace: openshift-cluster-version
reason: UpdatingPrometheusFailed
openShiftUpdatePrecheck: "false"
- name: CustomWarningPrecheck
rules:
- alert: TestAlert
annotations:
summary: Test summary.
description: Test description.
expr: |
up{job="cluster-version-operator"} == 1
labels:
severity: warning
namespace: openshift-cluster-version
reason: UpdatingPrometheusFailed
openShiftUpdatePrecheck: "true"
EOF
retrieved ClusterVersion YAML with:
$ oc get -o yaml clusterversion version >pkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-alert-cv.yaml
retrieve the cluster's alerts:
$ OC_ENABLE_CMD_INSPECT_ALERTS=true oc adm inspect-alerts --token "${TOKEN}" --certificate-authority ca.crt | jq . > pkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-alert-alerts.json
then I generated the output fixtures with:
$ UPDATE=yes go test -v ./pkg/cli/admin/upgrade/recommend/...
delete above alerts:
$ oc delete PrometheusRule/warning-testing -n openshift-cluster-version
Wait some minutes, then repeat above step to generate new test data for critical alearts:
Create critical alerts:
cat <<EOF | oc -n openshift-cluster-version create -f -
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: critical-testing
namespace: openshift-cluster-version
annotations:
kubernetes.io/description: Alerting rules for testing.
spec:
groups:
- name: VirtCriticalPrecheck
rules:
- alert: VirtHandlerDaemonSetRolloutFailing
annotations:
summary: Test summary.
description: Test description.
expr: |
up{job="cluster-version-operator"} == 1
labels:
severity: critical
namespace: openshift-cluster-version
reason: UpdatingPrometheusFailed
openShiftUpdatePrecheck: "true"
- name: VMCannotBeEvictedCriticalPrecheck
rules:
- alert: VMCannotBeEvicted
for: 1h
annotations:
summary: Test summary.
description: Test description.
expr: |
up{job="cluster-version-operator"} == 1
labels:
severity: critical
namespace: openshift-cluster-version
reason: UpdatingPrometheusFailed
openShiftUpdatePrecheck: "true"
EOF
Generate test data:
$ oc get -o yaml clusterversion version >pkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert-cv.yaml
$ OC_ENABLE_CMD_INSPECT_ALERTS=true oc adm inspect-alerts --token "${TOKEN}" --certificate-authority ca.crt | jq . >pkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert-alerts.json
$ UPDATE=yes go test -v ./pkg/cli/admin/upgrade/recommend/...
Then delate critical alerts:
$ oc delete PrometheusRule/critical-testing -n openshift-cluster-version
[1]: https://issues.redhat.com/browse/OTA-1864
Instead of skipping the cases, each new case is mapped to a random version. - The code to skip the cases are no longer needed. - It adds the coverage for the case "error: no updates available".
|
@hongkailiu: This pull request references OTA-1817 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughAdds example fixtures and test data for alert handling scenarios in the upgrade recommendation CLI tool, including JSON alert responses, Kubernetes manifests, and corresponding CLI output examples for both recommended and critical alert situations. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes 🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Review rate limit: 9/10 reviews remaining, refill in 6 minutes. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hongkailiu The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/cli/admin/upgrade/recommend/examples_test.go (1)
83-85: Fail fast when aspecific versionmapping is missing.
At Line 83-Line 85, an unmappedcvleavesvariant.outputSuffixempty, which can produce confusing fixture-path behavior. Add an explicit guard so missing mappings fail with a clear message.Proposed refactor
var version string if version = variant.versions[cv]; version != "" { variant.outputSuffix = fmt.Sprintf(variant.outputSuffixPattern, version) + } else if variant.outputSuffixPattern != "" { + t.Fatalf("missing specific-version fixture mapping for %s", cv) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/cli/admin/upgrade/recommend/examples_test.go` around lines 83 - 85, The code currently lets an unmapped cv produce an empty variant.outputSuffix; change it to fail fast by checking the lookup result from variant.versions[cv] and erroring when empty. Specifically, after retrieving version := variant.versions[cv] (or when assigning into version), if version == "" call t.Fatalf (or panic if outside a test helper) with a clear message identifying the missing mapping (include cv and variant/outputSuffixPattern) so missing specific-version mappings are caught immediately instead of creating empty outputSuffix values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/cli/admin/upgrade/recommend/examples_test.go`:
- Around line 83-85: The code currently lets an unmapped cv produce an empty
variant.outputSuffix; change it to fail fast by checking the lookup result from
variant.versions[cv] and erroring when empty. Specifically, after retrieving
version := variant.versions[cv] (or when assigning into version), if version ==
"" call t.Fatalf (or panic if outside a test helper) with a clear message
identifying the missing mapping (include cv and variant/outputSuffixPattern) so
missing specific-version mappings are caught immediately instead of creating
empty outputSuffix values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 7e68ed96-c0a7-4706-bb16-bc85abe741a1
📒 Files selected for processing (11)
pkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-alert-alerts.jsonpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-alert-cv.yamlpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-alert.outputpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-alert.show-outdated-releases-outputpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-alert.version-1.2.3-not-important-outputpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert-alerts.jsonpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert-cv.yamlpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert.outputpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert.show-outdated-releases-outputpkg/cli/admin/upgrade/recommend/examples/4.22.0-extend-recommended-critical-alert.version-1.2.3-not-important-outputpkg/cli/admin/upgrade/recommend/examples_test.go
|
/retest-required |
|
@hongkailiu: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Re-open #2213
Summary by CodeRabbit