Bug 1835146: Cluster restore should not stop network pods on bare-metal #352

retroflexer · 2020-05-15T14:06:34Z

Currently, the cluster backup script backs up the static pod resources "kube-apiserver-pod.yaml", "kube-controller-manager-pod.yaml", "kube-scheduler-pod.yaml", and "etcd-pod.yaml". However, when the cluster-restore script runs, it stops all static pods (including the network pods in the bare-metal case).

After the restore operation, only the kube static pods and etcd are restored; the network pods are not, because they were never backed up. As a result, the restore fails and the cluster cannot scale up fully.

This PR fixes the problem by stopping only the kube static pods and leaving the network pods running.
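To illustrate the idea, here is a minimal sketch and not the actual change in `bindata/etcd/cluster-restore.sh`. It assumes static pods are stopped by moving their manifests out of the kubelet's static pod directory, and it stops only the four manifests named above. The names `BACKED_UP_PODS`, `stop_backed_up_pods`, and `example-network-pod.yaml` are hypothetical, and the demo runs against a throwaway temp directory rather than a real node.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical list: the manifests the description says are backed up.
BACKED_UP_PODS=(
  "kube-apiserver-pod.yaml"
  "kube-controller-manager-pod.yaml"
  "kube-scheduler-pod.yaml"
  "etcd-pod.yaml"
)

# Move only the listed manifests from manifest_dir to stopped_dir.
# Any other manifest (for example a network pod) is left in place.
stop_backed_up_pods() {
  local manifest_dir="$1" stopped_dir="$2" name
  mkdir -p "${stopped_dir}"
  for name in "${BACKED_UP_PODS[@]}"; do
    if [ -e "${manifest_dir}/${name}" ]; then
      mv "${manifest_dir}/${name}" "${stopped_dir}/${name}"
      echo "stopped ${name}"
    fi
  done
}

# Demo in a temporary directory (assumption: stands in for the real manifest dir).
DEMO_DIR="$(mktemp -d)"
mkdir -p "${DEMO_DIR}/manifests"
touch "${DEMO_DIR}/manifests/etcd-pod.yaml" \
      "${DEMO_DIR}/manifests/kube-apiserver-pod.yaml" \
      "${DEMO_DIR}/manifests/example-network-pod.yaml"

stop_backed_up_pods "${DEMO_DIR}/manifests" "${DEMO_DIR}/stopped"

# List what is still in the manifest directory after the selective stop.
ls "${DEMO_DIR}/manifests"
```

Driving the loop from an explicit allow-list means a manifest that was never backed up is never moved, so it does not need to be restored afterwards.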

openshift-ci-robot · 2020-05-15T14:08:32Z

@retroflexer: This pull request references Bugzilla bug 1835146, which is invalid:

expected the bug to target the "4.5.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1835146: Cluster restore should not stop network pods on bare-metal

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

retroflexer · 2020-05-15T14:18:44Z

/bugzilla refresh

openshift-ci-robot · 2020-05-15T14:18:49Z

@retroflexer: This pull request references Bugzilla bug 1835146, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.5.0) matches configured target release for branch (4.5.0)
bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

/bugzilla refresh

openshift-ci-robot · 2020-05-15T14:19:33Z

@retroflexer: This pull request references Bugzilla bug 1835146, which is valid.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.5.0) matches configured target release for branch (4.5.0)
bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1835146: Cluster restore should not stop network pods on bare-metal

retroflexer · 2020-05-16T00:43:08Z

/retest

bindata/etcd/cluster-restore.sh

hexfusion

/lgtm

openshift-ci-robot · 2020-05-19T01:14:01Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hexfusion, retroflexer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [hexfusion]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

hexfusion · 2020-05-19T01:14:04Z

/retest

openshift-bot · 2020-05-19T01:19:36Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-19T02:11:35Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-19T02:51:32Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-19T03:07:22Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci-robot · 2020-05-19T03:52:59Z

@retroflexer: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-azure | `3c88a62` | link | `/test e2e-azure` |
| ci/prow/e2e-aws-disruptive | `3c88a62` | link | `/test e2e-aws-disruptive` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-bot · 2020-05-19T03:55:37Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci-robot · 2020-05-19T04:11:58Z

@retroflexer: All pull requests linked via external trackers have merged: openshift/cluster-etcd-operator#352. Bugzilla bug 1835146 has been moved to the MODIFIED state.

In response to this:

Bug 1835146: Cluster restore should not stop network pods on bare-metal

retroflexer · 2020-05-20T14:40:19Z

/cherrypick release-4.4

openshift-cherrypick-robot · 2020-05-20T14:40:28Z

@retroflexer: new pull request created: #357

In response to this:

/cherrypick release-4.4

Cluster restore should not stop network pods on bare-metal

3c88a62

openshift-ci-robot requested review from alaypatel07 and mfojtik May 15, 2020 14:06

retroflexer changed the title ~~Cluster restore should not stop network pods on bare-metal~~ Bug 1835146: Cluster restore should not stop network pods on bare-metal May 15, 2020

openshift-ci-robot added the bugzilla/severity-urgent label (Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting.) May 15, 2020

openshift-ci-robot added the bugzilla/invalid-bug label (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.) May 15, 2020

openshift-ci-robot added the bugzilla/valid-bug label (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.) May 15, 2020

openshift-ci-robot removed the bugzilla/invalid-bug label (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.) May 15, 2020

hexfusion reviewed May 18, 2020

bindata/etcd/cluster-restore.sh

hexfusion approved these changes May 19, 2020

openshift-ci-robot assigned hexfusion May 19, 2020

openshift-ci-robot added the lgtm (Indicates that a PR is ready to be merged.) and approved (Indicates a PR has been approved by an approver from all required OWNERS files.) labels May 19, 2020

openshift-merge-robot merged commit ac29d52 into openshift:master May 19, 2020

openshift-cherrypick-robot mentioned this pull request May 20, 2020

Bug 1836270: Cluster restore should not stop network pods on bare-metal #357

Merged

retroflexer deleted the fix-cluster-restore branch June 1, 2021 23:56
