NO-JIRA: Tolerate restarts for kubevirt external infra #3451

davidvossel · 2024-01-22T18:55:31Z

This PR lets the test tolerate a single restart, but only for KubeVirt running on external infra. The centralized KubeVirt infra test still does not tolerate any unexpected restarts.

The KubeVirt platform has two modes: centralized infra (where HCP and VMs run on the same OCP cluster) and external infra (where HCP and VMs run on separate OCP clusters).

When we test external infra, we run HCP KubeVirt nested within HCP KubeVirt. This is a complex environment, and it is difficult to ensure predictable performance in it. We occasionally see random pods in the HCP namespace restart in this nested environment because the kubelet reports "Error: context deadline exceeded". This is likely a result of etcd latency within this environment.
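For reviewers who want the rule in one place, here is a minimal sketch of the tolerance described above. It is not the code changed in this PR: the names `Platform`, `allowedRestarts`, `ContainerRestarts`, and `overLimit` are hypothetical, and applying the limit per container is an assumption made for illustration.

```go
package main

import "fmt"

// Platform is a hypothetical stand-in for the platform type used by the e2e tests.
type Platform string

const (
	KubeVirt Platform = "KubeVirt"
	AWS      Platform = "AWS"
)

// allowedRestarts returns how many restarts the check tolerates.
// Only KubeVirt on external infra gets a budget of one; everything else gets zero.
func allowedRestarts(p Platform, externalInfra bool) int32 {
	if p == KubeVirt && externalInfra {
		return 1
	}
	return 0
}

// ContainerRestarts pairs a container with its observed restart count.
type ContainerRestarts struct {
	Pod       string
	Container string
	Restarts  int32
}

// overLimit returns the containers whose restart count exceeds the limit.
// Assumption: the limit applies per container, not to the namespace total.
func overLimit(observed []ContainerRestarts, limit int32) []ContainerRestarts {
	var bad []ContainerRestarts
	for _, c := range observed {
		if c.Restarts > limit {
			bad = append(bad, c)
		}
	}
	return bad
}

func main() {
	// Made-up observations, for illustration only.
	observed := []ContainerRestarts{
		{Pod: "pod-a", Container: "main", Restarts: 1},
		{Pod: "pod-b", Container: "main", Restarts: 0},
	}

	// Compare the same observations under both KubeVirt modes.
	for _, external := range []bool{false, true} {
		limit := allowedRestarts(KubeVirt, external)
		bad := overLimit(observed, limit)
		fmt.Printf("externalInfra=%v limit=%d violations=%d\n", external, limit, len(bad))
	}
}
```

With this shape, the centralized infra run keeps failing on any restart, while the external infra run only fails once a container restarts more than once.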

openshift-ci-robot · 2024-01-22T18:55:35Z

@davidvossel: This pull request explicitly references no jira issue.

nunnatsa

/lgtm

openshift-ci · 2024-01-23T15:54:16Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidvossel, nunnatsa

Needs approval from an approver in each of these files:

~~OWNERS~~ [davidvossel]

openshift-ci-robot · 2024-01-23T18:53:56Z

/retest-required

Remaining retests: 0 against base HEAD 9b08dcf and 2 for PR HEAD 221de09 in total

openshift-ci-robot · 2024-01-24T06:15:13Z

/retest-required

Remaining retests: 0 against base HEAD 6c84753 and 1 for PR HEAD 221de09 in total

nunnatsa · 2024-01-24T14:02:36Z

/retest-required

openshift-ci-robot · 2024-01-24T17:53:06Z

/retest-required

Remaining retests: 0 against base HEAD d7b8d75 and 0 for PR HEAD 221de09 in total

openshift-ci-robot · 2024-01-24T19:50:49Z

/hold

Revision 221de09 was retested 3 times: holding

davidvossel · 2024-01-24T20:28:14Z

/retest-required

nunnatsa · 2024-01-25T07:02:14Z

/unhold

openshift-ci-robot · 2024-01-25T07:31:33Z

/retest-required

Remaining retests: 0 against base HEAD 8d96c1a and 2 for PR HEAD 221de09 in total

nunnatsa · 2024-01-25T10:19:55Z

/retest

davidvossel · 2024-01-25T14:15:38Z

/test verify

qinqon · 2024-01-25T14:49:20Z

/lgtm

nunnatsa · 2024-01-25T14:49:28Z

/lgtm

openshift-ci-robot · 2024-01-25T15:21:11Z

/retest-required

Remaining retests: 0 against base HEAD 83fb2da and 2 for PR HEAD e9904a3 in total

openshift-ci-robot · 2024-01-25T20:04:30Z

/retest-required

Remaining retests: 0 against base HEAD 330501b and 1 for PR HEAD e9904a3 in total

openshift-ci-robot added the jira/valid-reference label Jan 22, 2024

openshift-ci bot added the do-not-merge/needs-area label Jan 22, 2024

openshift-ci bot requested review from enxebre and sjenning January 22, 2024 18:56

openshift-ci bot added the area/testing and approved labels and removed the do-not-merge/needs-area label Jan 22, 2024

nunnatsa approved these changes Jan 23, 2024

openshift-ci bot assigned nunnatsa Jan 23, 2024

openshift-ci bot added the lgtm label Jan 23, 2024

openshift-ci bot added the do-not-merge/hold label Jan 24, 2024

openshift-ci bot removed the do-not-merge/hold label Jan 25, 2024

Tolerate restarts for kubevirt external infra

e9904a3

Signed-off-by: David Vossel <davidvossel@gmail.com>

davidvossel force-pushed the tolerate-external-infra-restart branch from 221de09 to e9904a3 on January 25, 2024 14:48

openshift-ci bot removed the lgtm label Jan 25, 2024

openshift-ci bot assigned qinqon Jan 25, 2024

openshift-ci bot added the lgtm label Jan 25, 2024

openshift-merge-bot bot merged commit 05168af into openshift:main Jan 26, 2024
11 of 12 checks passed
