Bug 1809665: The console should wait until it is out of rotation to shut down #385

smarterclayton · 2020-02-28T23:04:19Z

When a pod is marked deleted, endpoints are updated immediately, but
the change takes time to propagate to load balancers, routers, and
nodes. A component that wishes to remain available during upgrades
must wait longer than the default propagation interval for these
changes, to avoid having requests delivered to pods that are
shutting down.

Change the console to wait 25s before terminating the serving
process, and allow up to 40s on the node, so that all front ends
have time to drain. The minimum interval is how long a connection
behind the router can take to drain once new connections stop
being created, i.e.

wait = time to propagate endpoints (5s) +
       time for router reload (5s) +
       time for longest request to finish (15s)
     = 25s

This change should result in the console not being disrupted during
upgrade (0s disruption to the console route).

Downtime documented in https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/19447

smarterclayton · 2020-02-28T23:12:35Z

/hold

while testing

smarterclayton · 2020-02-29T22:00:07Z

/test images

smarterclayton · 2020-03-03T03:27:45Z

/hold cancel

Verified that route ingresses remain available during upgrade when testing all three PRs together: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/19806

This is ready for review.

openshift-ci-robot · 2020-03-03T16:17:29Z

@smarterclayton: This pull request references Bugzilla bug 1809665, which is invalid:

expected the bug to be in one of the following states: NEW, ASSIGNED, ON_DEV, POST, POST, but it is MODIFIED instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1809665: The console should wait until it is out of rotation to shut down

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

smarterclayton · 2020-03-03T16:19:03Z

/cherry-pick release-4.4

openshift-cherrypick-robot · 2020-03-03T16:19:04Z

@smarterclayton: once the present PR merges, I will cherry-pick it on top of release-4.4 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.4

smarterclayton · 2020-03-03T16:19:09Z

/cherry-pick release-4.3

openshift-cherrypick-robot · 2020-03-03T16:19:10Z

@smarterclayton: once the present PR merges, I will cherry-pick it on top of release-4.3 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.3

smarterclayton · 2020-03-03T16:19:13Z

4.4 bug is 1809667 and 4.3 bug is 1809668

benjaminapetersen

/approve
/lgtm

Looking into the e2e test failure to see if it's a flake.

benjaminapetersen · 2020-03-03T16:25:30Z

/retest

Feb 28 23:36:09.600: INFO:  > ERROR: (gcloud.compute.instance-groups.list-instances) could not parse resource [] 
...
 fail [k8s.io/kubernetes/test/e2e/apimachinery/resource_quota.go:166]: Unexpected error:
    <*errors.errorString | 0xc0002c43f0>: {
        s: "timed out waiting for the condition",
}
timed out waiting for the condition

smarterclayton · 2020-03-03T16:47:50Z

/retest

openshift-ci-robot · 2020-03-03T16:49:31Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: benjaminapetersen, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [benjaminapetersen]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2020-03-03T18:19:00Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-03T20:55:07Z

/retest

Please review the full test history for this PR and help us cut down flakes.

smarterclayton · 2020-03-03T21:34:21Z

/bugzilla refresh

openshift-ci-robot · 2020-03-03T21:34:57Z

@smarterclayton: This pull request references Bugzilla bug 1809665, which is invalid:

expected the bug to be in one of the following states: NEW, ASSIGNED, ON_DEV, POST, POST, but it is ON_QA instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/bugzilla refresh

smarterclayton · 2020-03-03T21:35:46Z

/bugzilla refresh

openshift-ci-robot · 2020-03-03T21:35:50Z

@smarterclayton: This pull request references Bugzilla bug 1809665, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.5.0) matches configured target release for branch (4.5.0)
bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

/bugzilla refresh

smarterclayton · 2020-03-03T21:36:44Z

/retest

openshift-bot · 2020-03-04T00:11:33Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T00:23:04Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T08:51:38Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T09:44:01Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T12:31:30Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T13:23:21Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T13:36:20Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T14:55:21Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-03-04T15:07:08Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci-robot · 2020-03-04T18:06:10Z

@smarterclayton: All pull requests linked via external trackers have merged. Bugzilla bug 1809665 has been moved to the MODIFIED state.

In response to this:

Bug 1809665: The console should wait until it is out of rotation to shut down

openshift-cherrypick-robot · 2020-03-04T18:06:44Z

@smarterclayton: new pull request created: #387

In response to this:

/cherry-pick release-4.4

openshift-cherrypick-robot · 2020-03-04T18:07:00Z

@smarterclayton: new pull request created: #388

In response to this:

/cherry-pick release-4.3

openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 28, 2020

openshift-ci-robot requested review from benjaminapetersen and spadgett February 28, 2020 23:04

smarterclayton mentioned this pull request Feb 28, 2020

Bug 1809665: The oauth server should wait until it is out of rotation to shut down openshift/cluster-authentication-operator#252

Merged

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 28, 2020

smarterclayton mentioned this pull request Mar 2, 2020

Bug 1809665: Start graceful shutdown on SIGTERM openshift/router#94

Merged

openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 3, 2020

smarterclayton changed the title ~~The console should wait until it is out of rotation to shut down~~ Bug 1809665: The console should wait until it is out of rotation to shut down Mar 3, 2020

openshift-ci-robot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Mar 3, 2020

benjaminapetersen approved these changes Mar 3, 2020

openshift-ci-robot assigned benjaminapetersen Mar 3, 2020

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 3, 2020

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 3, 2020

openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Mar 3, 2020

openshift-ci-robot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Mar 3, 2020

openshift-merge-robot merged commit a9b8732 into openshift:master Mar 4, 2020

openshift-cherrypick-robot mentioned this pull request Mar 4, 2020

[release-4.4] Bug 1809667: The console should wait until it is out of rotation to shut down #387

Merged

openshift-cherrypick-robot mentioned this pull request Mar 4, 2020

[release-4.3] Bug 1809668: The console should wait until it is out of rotation to shut down #388

Closed

openshift-ci-robot mentioned this pull request Apr 13, 2020

Bug 1809665: Re-add pod disruption budget for ingress controllers openshift/cluster-ingress-operator#387

Merged
