
Cannot delete keypair secrets #8944

Closed · marek-obuchowicz opened this issue Apr 20, 2020 · 4 comments · Fixed by #8945

@marek-obuchowicz (Author)

Recreating issue #6482, which was closed due to inactivity. This is a confirmed bug: we just lost our whole cluster created with kops, and we can't rotate credentials using the method described at https://github.com/kubernetes/kops/blob/master/docs/rotate-secrets.md. So we have a complete Kubernetes cluster down right now, and the problem described in this issue prevents us from bringing the cluster back online while following the official docs. Hence, reopening a valid and important ticket.

  1. What kops version are you running? The command kops version will display this information.

$ kops version
Version 1.11.0; also confirmed with 1.15.0 and 1.16.0.

  2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running, or provide the Kubernetes version specified as a kops flag.

Irrelevant.

  3. What cloud provider are you using?

AWS

  4. What commands did you run? What is the simplest way to reproduce this issue?

$ kops delete secret keypair kube-controller-manager
I0219 15:22:22.716650 15341 certificate.go:106] Ignoring unexpected PEM block: "RSA PRIVATE KEY"

error deleting secret: error deleting certificate: error loading certificate "s3:////pki/private/kube-controller-manager/.key": could not parse certificate
  5. What happened after the commands executed?

They failed.

  6. What did you expect to happen?

I expected the command to remove the kube-controller-manager keypair, per your documentation: https://github.com/kubernetes/kops/blob/master/docs/rotate-secrets.md

  7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

Irrelevant to this issue.

  8. Please run the commands with the most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

$ kops delete secret keypair kube-controller-manager -v10
I0219 15:23:35.129669 15348 factory.go:68] state store s3:///
I0219 15:23:35.409810 15348 s3context.go:194] found bucket in region "eu-central-1"
I0219 15:23:35.409867 15348 s3fs.go:220] Reading file "s3:////config"
I0219 15:23:36.054560 15348 s3fs.go:257] Listing objects in S3 bucket "" with prefix "/pki/private/kube-controller-manager/"
I0219 15:23:36.095834 15348 s3fs.go:285] Listed files in s3:////pki/private/kube-controller-manager: [s3:////pki/private/kube-controller-manager/.key s3:////pki/private/kube-controller-manager/keyset.yaml]
I0219 15:23:36.096162 15348 s3fs.go:220] Reading file "s3:////pki/private/kube-controller-manager/.key"
I0219 15:23:36.170662 15348 certificate.go:106] Ignoring unexpected PEM block: "RSA PRIVATE KEY"

error deleting secret: error deleting certificate: error loading certificate "s3:////pki/private/kube-controller-manager/.key": could not parse certificate
  9. Anything else we need to know?

Please don't let your bots close this issue, and take it seriously.
This was already reported in #5318.

@marek-obuchowicz (Author)

@justinsb I decided to mention you, as this issue can have a critical impact on cluster stability (unable to rotate credentials; I currently have a cluster down because the TLS certs expired and can't be rotated). A similar issue has already been reported several times and always seems to be auto-closed, without any triage or discussion. The issue has existed at least since version 0.11.x.

The ability to correctly rotate credentials is a critical maintenance task required in cluster operations.

@marek-obuchowicz (Author)

Workaround: delete pki/issued and pki/private directly from the S3 bucket which stores the kops state.
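
For example, with the AWS CLI (a sketch; the bucket and cluster names below are placeholders for your own state store and cluster):

$ aws s3 rm s3://my-kops-state-store/my.example.com/pki/issued/ --recursive
$ aws s3 rm s3://my-kops-state-store/my.example.com/pki/private/ --recursive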

@olemarkus (Member)

After deleting the objects above, did you manage to continue?
I have fixed the bug this issue is about, and I thought I would continue with the rotate-secrets docs, but it won't provision new certs for me.

/assign

@marek-obuchowicz (Author)

What worked for me: the whole credential-rotation process was, in the end, much more involved:

  • delete the objects from S3 (pki/issued and pki/private)
  • kops update cluster --yes (it says it generates new certs; I use the Terraform output, so I ran terraform plan and terraform apply after this step)
  • kops rolling-update cluster --cloudonly --master-interval=10s --node-interval=10s --force --yes
  • kops export kubecfg

After that, log in via SSH to each master and delete the certs from the EBS volume:
sudo find /mnt/ -name server.* | xargs -I {} sudo rm {}
sudo find /mnt/ -name me.* | xargs -I {} sudo rm {}

Reboot all master nodes.
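
A minimal sketch combining the two steps above, assuming SSH access as admin; the master hostnames are hypothetical, substitute your own:

for host in master-eu-central-1a master-eu-central-1b master-eu-central-1c; do
  ssh admin@"$host" "sudo find /mnt/ -name 'server.*' -o -name 'me.*' | xargs -I {} sudo rm {} && sudo reboot"
done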

After this step, etcd should correctly form a working cluster using the new certificates.

Next, I proceeded with "deleting all service accounts" as described in rotate-secrets.md. As a consequence I had to delete all pods (they couldn't start because of missing service account tokens); they were recreated correctly and I got the k8s cluster up and running again.
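
A sketch of those last steps, assuming working kubectl access with the new credentials (rotate-secrets.md describes the canonical procedure; the loop below is just one way to do it):

for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  kubectl -n "$ns" delete secret --field-selector type=kubernetes.io/service-account-token
  kubectl -n "$ns" delete pods --all
done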
