
NO-JIRA: increase precondition cluster health timeout to 10m#30893

Open
fonta-rh wants to merge 1 commit into openshift:main from fonta-rh:bump-precondition-timeout

Conversation

@fonta-rh
Contributor

Summary

  • Bump preconditionClusterHealthyTimeout from 5 minutes to 10 minutes
  • Pacemaker stop sequences on TNF nodes take 4-8 minutes, making the previous 5-minute precondition timeout insufficient for disruptive test lanes
  • This prevents premature test skips when operators are still stabilizing after a prior disruptive test

Test plan

  • Verify disruptive TNF CI jobs no longer skip tests due to precondition timeouts when operators are still progressing
  • Confirm non-disruptive test lanes are unaffected (precondition checks pass quickly on healthy clusters)

🤖 Generated with Claude Code

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 17, 2026
@openshift-ci-robot

@fonta-rh: This pull request explicitly references no jira issue.


@coderabbitai

coderabbitai bot commented Mar 17, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 733b1c4 and 4524133.

📒 Files selected for processing (1)
  • test/extended/two_node/utils/common.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/extended/two_node/utils/common.go

Walkthrough

A single timeout constant in the two-node test utilities was increased from 5 minutes to 10 minutes; no other code, control flow, or behavior was modified.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Timeout Configuration: test/extended/two_node/utils/common.go | Changed constant preconditionClusterHealthyTimeout from 5 * time.Minute to 10 * time.Minute. preconditionEtcdHealthyTimeout unchanged. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions



@openshift-ci openshift-ci bot requested review from jaypoulz and qJkee March 17, 2026 12:46
@jaypoulz
Contributor

/approve
/lgtm

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 17, 2026
Pacemaker stop sequences on TNF nodes take 4-8 minutes, making the
5-minute precondition timeout insufficient for disruptive test lanes.
Bump to 10 minutes to avoid premature test skips.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fonta-rh fonta-rh force-pushed the bump-precondition-timeout branch from 733b1c4 to 4524133 Compare March 17, 2026 15:11
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 17, 2026
@jaypoulz
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 17, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fonta-rh, jaypoulz


@openshift-ci-robot

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@fonta-rh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 4524133 | link | true | /test e2e-metal-ipi-ovn-ipv6 |


@fonta-rh
Contributor Author

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-degraded-techpreview

@openshift-ci
Contributor

openshift-ci bot commented Mar 18, 2026

@fonta-rh: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-degraded-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c54fc3c0-22b5-11f1-9128-7d59fef5ef5e-0


Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
