
monitoring-plugin: add e2e-management-api Prow job #79673

Open
sradco wants to merge 1 commit into openshift:main from sradco:monitoring-plugin-management-api-e2e

@sradco sradco commented May 25, 2026

Summary

  • Adds a new optional presubmit job e2e-management-api for openshift/monitoring-plugin
  • Adds a new step registry entry monitoring-plugin-tests-management-api that deploys the plugin backend on a provisioned AWS cluster, port-forwards it to localhost, and runs the Go backend integration tests (make test-e2e)
  • Trigger: /test e2e-management-api

Context

The test/e2e/ directory in openshift/monitoring-plugin contains Go integration tests for the Alerting Management API that hit the real plugin HTTP server and verify PrometheusRule CRDs are created/deleted on the cluster. These tests were not previously wired into any Prow job.

This PR fixes that gap by adding the Prow job and step. The step:

  1. Deploys the monitoring-plugin backend as a Deployment in openshift-monitoring
  2. Port-forwards it to localhost:9001
  3. Sets PLUGIN_URL and KUBECONFIG and runs make test-e2e

Related: openshift/monitoring-plugin#942

Made with Cursor

Summary by CodeRabbit

This PR adds CI infrastructure in the OpenShift CI repo to run Alerting Management API end-to-end tests for the openshift/monitoring-plugin repository.

What changed (practical terms)

  • Added an optional presubmit Prow job e2e-management-api to the monitoring-plugin CI config. The job targets the openshift-org-aws cluster profile, uses the ipi-aws workflow, and is triggerable with the comment: /test e2e-management-api.
  • Added a step-registry entry monitoring-plugin-tests-management-api that deploys and runs the monitoring-plugin backend and then executes the repository's Go e2e tests.

Key behaviors of the new step

  • Deploys the monitoring-plugin backend (built/run from source) into the openshift-monitoring namespace on a provisioned AWS cluster, port-forwards the backend to localhost:9001, and waits for /health to be reachable.
  • Exports PLUGIN_URL (pointing at localhost:9001) and relies on ci-operator-injected KUBECONFIG so tests can create namespaces and verify PrometheusRule CRD lifecycle.
  • Runs the existing Go integration tests under test/e2e/ via make test-e2e.
  • Resource requests: 2 CPU, 4Gi memory (8Gi limit); timeout 30m; 60s grace period; includes cleanup trap to stop background processes.

Other changes

  • Added OWNERS for the new step-registry directory.
  • Regenerated presubmits via prowgen to fix generated-config/metadata formatting.

Impact

Wires previously-unexecuted Alerting Management API integration tests into Prow CI so they run (on demand) against a real plugin HTTP server and a provisioned cluster, validating PrometheusRule CRD creation/deletion end-to-end.

coderabbitai Bot commented May 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds an optional CI e2e job e2e-management-api and its step registry, metadata, OWNERS, and a Bash runner that starts the monitoring-plugin backend locally, waits for /health, sets PLUGIN_URL, and executes make test-e2e.

Changes

Monitoring Plugin Management API E2E Test

| Layer / File(s) | Summary |
| --- | --- |
| CI job configuration: ci-operator/config/openshift/monitoring-plugin/openshift-monitoring-plugin-main.yaml | Adds the optional e2e-management-api test job configured for cluster_profile: openshift-org-aws, test: monitoring-plugin-tests-management-api, and workflow: ipi-aws. |
| Step registry, metadata, OWNERS: ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-ref.yaml, ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-ref.metadata.json, ci-operator/step-registry/monitoring-plugin/tests/management-api/OWNERS | Adds the monitoring-plugin-tests-management-api step reference (30m timeout, 60s grace), resource requests (2 CPU, 4Gi memory) and 8Gi memory limit, metadata linking the YAML and owners, and OWNERS file with approvers/reviewers. |
| Test runner implementation: ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-commands.sh | Bash script: strict modes, optional proxy sourcing, configures Go caches, starts backend via go run ./cmd/plugin-backend.go on port 9001 in background, traps EXIT to kill the PID, polls http://localhost:9001/health up to 15 times, exports PLUGIN_URL, and runs make test-e2e. |
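
The runner flow summarized in the table could be sketched roughly as follows. This is an illustration under stated assumptions, not the script from the PR: the helper names (start_backend, stop_backend, run_e2e), the proxy file name, and the Go cache paths are hypothetical, and how the port reaches the backend is not shown in the source. The sketch also swaps the polling loop for curl's built-in retry options, which is a different technique from the one the summary describes.

```shell
#!/usr/bin/env bash
# Sketch only. Helper names, the proxy file name and the cache paths are
# assumptions made for illustration; only PLUGIN_URL, KUBECONFIG, port 9001,
# /health and "make test-e2e" come from the PR text.

PORT="${PORT:-9001}"
BACKEND_PID=""

# Launch the given command in the background and remember its PID.
start_backend() {
  "$@" &
  BACKEND_PID=$!
}

# Stop the background process if one was started; safe to call repeatedly.
stop_backend() {
  if [ -n "${BACKEND_PID}" ]; then
    kill "${BACKEND_PID}" 2>/dev/null || true
  fi
}

run_e2e() {
  set -euo pipefail
  trap stop_backend EXIT

  # Optional proxy settings (assumed file name under ci-operator's SHARED_DIR).
  if [ -f "${SHARED_DIR:-}/proxy-conf.sh" ]; then
    . "${SHARED_DIR}/proxy-conf.sh"
  fi

  # Writable Go caches (assumed locations).
  export GOCACHE="${GOCACHE:-/tmp/go-build-cache}"
  export GOMODCACHE="${GOMODCACHE:-/tmp/go-mod-cache}"

  # The source does not show how the port is passed; add that here if needed.
  start_backend go run ./cmd/plugin-backend.go

  # One initial attempt plus 14 retries gives at most 15 probes.
  # A non-zero exit aborts the run because of "set -e".
  curl --fail --silent --output /dev/null \
    --retry 14 --retry-delay 2 --retry-connrefused \
    "http://localhost:${PORT}/health"

  # KUBECONFIG is expected to be injected by ci-operator.
  export PLUGIN_URL="http://localhost:${PORT}"
  make test-e2e
}
```

Calling run_e2e from the root of a monitoring-plugin checkout would run the whole flow; the EXIT trap stops the backend whether the tests pass or fail.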

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels: lgtm, approved, rehearsals-ack

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Ipv6 And Disconnected Network Test Compatibility | ⚠️ Warning | The test runner script uses IPv4-only URL construction pattern (localhost:port) without IPv6 brackets, which fails in IPv6-only disconnected environments where localhost resolves to ::1. | Update URL construction to be IPv6-compatible: when the host is an IPv6 literal, wrap it in brackets (for example http://[::1]:${PORT}); a hostname such as localhost takes no brackets. |
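
For reference, brackets in a URL authority apply only to IPv6 literals, so a hostname such as localhost stays bare. A minimal sketch of host-aware URL construction follows; build_url is a hypothetical helper, not part of the PR, and it treats any host containing a colon as an IPv6 literal.

```shell
# build_url HOST PORT
# Hypothetical helper: prints an http URL, bracketing HOST only when it
# looks like an IPv6 literal (contains a colon and is not bracketed yet).
build_url() {
  case "$1" in
    \[*\]) printf 'http://%s:%s\n' "$1" "$2" ;;    # already bracketed
    *:*)   printf 'http://[%s]:%s\n' "$1" "$2" ;;  # IPv6 literal
    *)     printf 'http://%s:%s\n' "$1" "$2" ;;    # hostname or IPv4 address
  esac
}
```

With this helper, PLUGIN_URL="$(build_url "${HOST}" "${PORT}")" would work for localhost, 127.0.0.1 and ::1 alike, where HOST is an assumed variable.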
✅ Passed checks (11 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'monitoring-plugin: add e2e-management-api Prow job' accurately summarizes the main change: adding a new e2e-management-api Prow job for the monitoring-plugin project. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR adds CI infrastructure only (YAML, shell scripts, metadata, OWNERS) with no Go test files or Ginkgo test definitions. Check for Ginkgo test name stability is not applicable. |
| Test Structure And Quality | ✅ Passed | PR adds CI configuration and bash test runner, not Ginkgo test code. Custom check for Ginkgo test structure is not applicable to these CI configuration files. |
| Microshift Test Compatibility | ✅ Passed | This PR adds CI configuration (YAML and bash scripts) to run existing e2e tests, but does not add any new Ginkgo test code. The custom check is not applicable when no new tests are introduced. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR adds CI configuration only; no new Ginkgo test code is included. The referenced tests are existing code in monitoring-plugin repo, not new tests added here. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds CI test configuration and scripts only; no deployment manifests, operators, or scheduling constraints targeting control-plane nodes, pod anti-affinity, or topology assumptions are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | This PR adds CI configuration for regular Go e2e tests, not OTE binaries. The check is inapplicable as no OTE test binaries are added or modified. |


@openshift-ci openshift-ci Bot requested review from etmurasaki and jgbernalp May 25, 2026 13:16
@sradco sradco force-pushed the monitoring-plugin-management-api-e2e branch from 0ba39e4 to 6432570 Compare May 25, 2026 13:16
openshift-ci Bot commented May 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sradco
Once this PR has been reviewed and has the lgtm label, please assign jgbernalp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sradco sradco force-pushed the monitoring-plugin-management-api-e2e branch from 6432570 to d5b7863 Compare May 25, 2026 13:17
sradco commented May 25, 2026

@simonpasquier, @jgbernalp, @machadovilaca please review this PR

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-commands.sh (1)

37-46: ⚡ Quick win

Consider a more robust port-forward readiness check.

The 5-second sleep works but doesn't verify the port-forward is actually ready. A polling loop checking port availability would be more reliable.

♻️ Optional: Poll for port readiness
 echo "Starting port-forward to localhost:${PORT}..."
 oc --kubeconfig="${KUBECONFIG}" -n "${NAMESPACE}" \
   port-forward "deployment/${DEPLOY_NAME}" "${PORT}:${PORT}" &
 PF_PID=$!
 trap "kill ${PF_PID} 2>/dev/null || true; \
   oc --kubeconfig=${KUBECONFIG} -n ${NAMESPACE} delete deployment/${DEPLOY_NAME} \
      service/${DEPLOY_NAME} --ignore-not-found" EXIT
 
-# Wait for port-forward to be ready
-sleep 5
+# Wait for port-forward to be ready
+echo "Waiting for port-forward to be ready..."
+for i in {1..30}; do
+  if curl -s -o /dev/null -w "%{http_code}" "http://localhost:${PORT}/health" | grep -q "200\|404"; then
+    echo "Port-forward is ready"
+    break
+  fi
+  sleep 1
+done

Note: Adjust the health check endpoint as needed for your service.
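
The suggested loop above falls through silently if the port never opens, and it does not notice when the port-forward process dies. One way to cover both cases is sketched below; wait_for_forward and WAIT_DELAY are hypothetical names introduced here, and the probe command is passed in by the caller rather than fixed.

```shell
# wait_for_forward PID ATTEMPTS PROBE...
# Hypothetical helper. Returns 0 as soon as PROBE succeeds, 1 if ATTEMPTS
# probes all fail, and 2 if the process PID exits before PROBE succeeds.
# WAIT_DELAY (seconds between probes, default 1) is an assumed knob.
wait_for_forward() {
  local pid="$1" attempts="$2" i=1
  shift 2
  while [ "${i}" -le "${attempts}" ]; do
    # Give up early when the background port-forward is gone.
    if ! kill -0 "${pid}" 2>/dev/null; then
      echo "port-forward (pid ${pid}) exited before becoming ready" >&2
      return 2
    fi
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "${WAIT_DELAY:-1}"
    i=$((i + 1))
  done
  echo "port-forward not ready after ${attempts} attempts" >&2
  return 1
}
```

In the script from the diff this might be called as wait_for_forward "${PF_PID}" 30 curl -s -o /dev/null "http://localhost:${PORT}/health" || exit 1. Without --fail, curl exits 0 on any HTTP response, which matches the intent of accepting both 200 and 404 as proof that the port is open.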

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-commands.sh`
around lines 37 - 46, Replace the fixed "sleep 5" with a polling readiness check
that verifies the background port-forward (PF_PID) actually opened the local
port: after launching oc port-forward for "deployment/${DEPLOY_NAME}" and
capturing PF_PID, loop until either a TCP connection to localhost:${PORT}
succeeds (e.g., using nc or curl to a health endpoint) or a timeout elapses, and
also ensure the loop exits with error if PF_PID dies; keep the existing trap
that kills PF_PID and deletes resources but change the readiness logic to poll
localhost:${PORT} (or a service health path) instead of a blind sleep.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 559220c1-cdcf-41e5-a65c-c8d2a2ccdc0c

📥 Commits

Reviewing files that changed from the base of the PR and between 98d41cb and 0ba39e4.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift/monitoring-plugin/openshift-monitoring-plugin-main-presubmits.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (4)
  • ci-operator/config/openshift/monitoring-plugin/openshift-monitoring-plugin-main.yaml
  • ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-commands.sh
  • ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-ref.metadata.json
  • ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-ref.yaml

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-commands.sh`:
- Around line 32-39: The health-check loop that polls
"http://localhost:${PORT}/health" should fail fast if the backend never becomes
ready: after the for-loop that runs curl (the readiness probe loop), add a check
that verifies success (e.g., a boolean/ready flag or re-run curl) and if not
ready print an error like "Backend did not become ready" and exit 1 so tests
(make test-e2e) are not run; apply the same change to the second identical loop
around the later health check so both loops abort the script on timeout instead
of continuing.
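
The fail-fast behavior requested here can be sketched with a ready flag, as the comment suggests. poll_health is a hypothetical helper name and the probe is supplied by the caller; only the error message text is taken from the review comment.

```shell
# poll_health ATTEMPTS DELAY PROBE...
# Hypothetical helper: runs PROBE up to ATTEMPTS times, sleeping DELAY
# seconds after each failure. Returns non-zero with an error message when
# the probe never succeeds, so the caller can abort before running tests.
poll_health() {
  local attempts="$1" delay="$2" i=1 ready=false
  shift 2
  while [ "${i}" -le "${attempts}" ]; do
    if "$@"; then
      ready=true
      break
    fi
    sleep "${delay}"
    i=$((i + 1))
  done
  if [ "${ready}" != true ]; then
    echo "Backend did not become ready" >&2
    return 1
  fi
}
```

A call such as poll_health 15 2 curl -sf -o /dev/null "http://localhost:${PORT}/health" || exit 1 placed before make test-e2e would stop the step on timeout instead of running the tests against a backend that never came up. The 2-second delay is an assumption; the source only states the attempt count.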
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 616c3a69-5f18-4dd0-89ec-9584515d3dd7

📥 Commits

Reviewing files that changed from the base of the PR and between 0ba39e4 and d5b7863.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift/monitoring-plugin/openshift-monitoring-plugin-main-presubmits.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (4)
  • ci-operator/config/openshift/monitoring-plugin/openshift-monitoring-plugin-main.yaml
  • ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-commands.sh
  • ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-ref.metadata.json
  • ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-ref.yaml
✅ Files skipped from review due to trivial changes (1)
  • ci-operator/step-registry/monitoring-plugin/tests/management-api/monitoring-plugin-tests-management-api-ref.metadata.json

@sradco sradco force-pushed the monitoring-plugin-management-api-e2e branch from c9650d7 to e8df0ad Compare May 25, 2026 14:47
Add a new optional presubmit that starts the monitoring-plugin backend
directly from source (go run) and runs the Go e2e tests against the live
Alerting Management API.

Changes:
- ci-operator/config: add e2e-management-api test stanza (always_run:
  false, optional: true) using cluster_profile openshift-org-aws and
  workflow ipi-aws
- ci-operator/jobs: regenerate presubmits via prowgen
- step-registry/monitoring-plugin/tests/management-api: new step with
  commands.sh, ref.yaml, metadata.json, and OWNERS

Signed-off-by: Shirly Radco <sradco@redhat.com>
@sradco sradco force-pushed the monitoring-plugin-management-api-e2e branch from b0b7d81 to 8c326a9 Compare May 25, 2026 15:56
@openshift-merge-bot

[REHEARSALNOTIFIER]
@sradco: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-monitoring-plugin-main-e2e-management-api | openshift/monitoring-plugin | presubmit | Presubmit changed |

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

openshift-ci Bot commented May 25, 2026

@sradco: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
