Bug 1796147: pkg/server: serve config only to master in bootstrap server #1421

runcom · 2020-01-29T22:17:42Z

The new cluster etcd operator flow is:

1) start bootstrap mcs
2) start etcd on bootstrap
3) wait for bootstrapping to finish, i.e. at least one control-plane node is ready and there is an MCS running on the cluster
4) turn down bootstrap mcs

The above gives workers a chance to grab the ignition config from the
bootstrap server, which now stays up longer. However, by the time they
attempt to create a CSR, the kube-apiserver has rotated that bootstrap
chain of trust out, which causes the workers to error out with:

Jan 29 19:55:20 ip-10-0-130-205 hyperkube[2623]: E0129 19:55:20.869251 2623 certificate_manager.go:421] Failed while requesting a signed certificate from the master: cannot create certificate signing request: Unauthorized

As a result, the workers are never able to join the cluster.

This patch makes the bootstrap server refuse to serve configuration to every
pool except master, which keeps workers from grabbing the wrong config
from the wrong server. Workers will keep polling for configuration and will
eventually grab the correct one from the server running within the new cluster.

Signed-off-by: Antonio Murdaca <runcom@linux.com>

openshift-ci-robot · 2020-01-29T22:17:48Z

@runcom: This pull request references Bugzilla bug 1796147, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Bug 1796147: pkg/server: serve config only to master in bootstrap server

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kikisdeliveryservice · 2020-01-29T22:19:12Z

cc: @crawford

crawford · 2020-01-29T22:22:12Z

/lgtm

openshift-ci-robot · 2020-01-29T22:23:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crawford, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [runcom]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

crawford · 2020-01-30T00:33:47Z

It's not terribly surprising that e2e-gcp-upgrade failed since it's using the bootstrap process prior to this change.

/override ci/prow/e2e-gcp-upgrade

openshift-ci-robot · 2020-01-30T00:33:49Z

@crawford: Overrode contexts on behalf of crawford: ci/prow/e2e-gcp-upgrade

In response to this:

It's not terribly surprising that e2e-gcp-upgrade failed since it's using the bootstrap process prior to this change.

/override ci/prow/e2e-gcp-upgrade

openshift-ci-robot · 2020-01-30T00:37:06Z

@runcom: All pull requests linked via external trackers have merged. Bugzilla bug 1796147 has been moved to the MODIFIED state.

In response to this:

Bug 1796147: pkg/server: serve config only to master in bootstrap server

openshift-ci-robot · 2020-01-30T02:18:48Z

@runcom: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-gcp-upgrade | `c58b8f9` | link | `/test e2e-gcp-upgrade` |
| ci/prow/e2e-aws-scaleup-rhel7 | `c58b8f9` | link | `/test e2e-aws-scaleup-rhel7` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

vrutkovs · 2020-01-30T15:11:05Z

/cherrypick fcos

openshift-cherrypick-robot · 2020-01-30T15:11:15Z

@vrutkovs: new pull request created: #1423

In response to this:

/cherrypick fcos

openshift-ci-robot added the bugzilla/valid-bug label (indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting) Jan 29, 2020

openshift-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Jan 29, 2020

openshift-ci-robot requested review from cgwalters and ericavonb January 29, 2020 22:18

openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Jan 29, 2020

openshift-ci-robot assigned crawford Jan 29, 2020

openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Jan 29, 2020

openshift-merge-robot merged commit 4932c47 into openshift:master Jan 30, 2020

runcom deleted the no-workers-bootstrap-mcs branch January 30, 2020 07:26

openshift-cherrypick-robot mentioned this pull request Jan 30, 2020

[fcos] pkg/server: serve config only to master in bootstrap server #1423

Merged

stbenjam mentioned this pull request Feb 7, 2020

Bug 1800746: baremetal: only respond to dhcp for control plane mac's openshift/installer#3079

Merged
