Handle updates to certs or CA info #34
Conversation
If the certificates or CA info is updated (re-issued, transferred to a different CA, etc.), the charm needs to handle that by updating the files on disk (already done) and restarting the services (added here).
I do think that we probably don't need to restart kube-scheduler, but I also don't see any real harm in doing it. All that said, I recommend removing the …
Updated per review. As per the other PRs (copying here for completeness), this was tested manually on AWS by doing the following:
In testing, during this step, I found that there is a race that can leave kubernetes-master in a bad state where kube-apiserver is no longer able to connect to etcd. The following sequence occurred:
I confirmed with … In this state, all kubectl calls fail. Hooks on kubernetes-master and kubernetes-worker are affected (staying in the "executing" state for a long time as they retry kubectl commands), and the cluster has not recovered on its own. Removing easyrsa did not help.
Flannel also failed to switch to the new etcd client cert info. From …
I suspect flannel, calico, canal, and tigera-secure-ee all need similar work to account for this transition, as all of these services connect directly to etcd.
@Cynerva Please take a look at the following PRs to address the issue in the other charms as well:
Ready for review. Tested the latest commit as follows:
Some comments:
This now depends on charmed-kubernetes/cdk-addons#133. I updated the addon restart code to select on a new `cdk-restart-on-ca-change` label. Tested again as described in #34 (comment); it went well. Needs review.
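Label-based selection means the restart code no longer needs a hardcoded list of addons: it just deletes the pods carrying the label and lets their controllers recreate them with the new CA material. Here is a minimal sketch of that idea; the label name comes from this PR, but the helper itself and its injectable `runner` parameter are illustrative assumptions, not the charm's real API.

```python
import subprocess


def restart_labeled_addons(runner=subprocess.check_call,
                           label="cdk-restart-on-ca-change=true",
                           namespace="kube-system"):
    """Delete addon pods that opt in via the restart label.

    Their Deployments/DaemonSets recreate the pods, which then pick up
    the updated CA info. `runner` is injectable purely for testing.
    """
    cmd = ["kubectl", "delete", "pod", "-n", namespace, "-l", label]
    runner(cmd)
    return cmd
```

The charm would call this after the new ServiceAccount secrets are in place, since a pod restarted before its token is re-signed would come back up with stale credentials.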
Do not merge, needs retest. Updated kubectl calls in this PR to use `kubernetes_common.kubectl`.
Done testing, awaiting approval. |
```diff
@@ -2466,7 +2466,7 @@ def restart_addons_for_ca():
     service_accounts = []
     for namespace, name in service_account_names:
         output = kubectl(
-            'kubectl', 'get', 'ServiceAccount', name,
+            'get', 'ServiceAccount', name,
```
Totally read over that, good catch.
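The bug fixed in the diff above is easy to hit: the charm's `kubectl` helper prepends the binary name itself, so passing `'kubectl'` as the first argument produces a command like `kubectl ... kubectl get ...`, which fails. A minimal stand-in illustrates the contract (the kubeconfig path and the exact argument layout are assumptions for the sketch; the real helper in `kubernetes_common` also executes the command):

```python
def kubectl(*args):
    """Stand-in for the charm's kubectl helper: it supplies the binary
    name and kubeconfig itself, so callers pass only the subcommand."""
    kubeconfig = "/root/.kube/config"  # assumed path, for illustration
    return ["kubectl", "--kubeconfig=" + kubeconfig] + list(args)
```

With this contract, `kubectl('kubectl', 'get', ...)` yields a duplicated `kubectl` token as a bogus subcommand, while `kubectl('get', ...)` builds the intended command.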
* Handle updates to certs or CA info: if the certificates or CA info is updated (re-issued, transferred to a different CA, etc.), the charm needs to handle that by updating the files on disk (already done) and restarting the services (added here).
* Remove kick_api_server in favor of update_certs
* Detect changes to etcd client cert and ensure restart
* Lower timeouts for kube-system pod status check
* Restart addons after CA changes
* Ensure ServiceAccount secrets are updated before restarting addons
* Use cdk-restart-on-ca-change label to select addons for restart
* Use kubernetes_common.kubectl and clear_flag
* Fix broken call to kubectl('kubectl', ...)