Ignore controller user_data changes to allow plugin updates #335

Merged
merged 1 commit into master from ignore-controller-user-data on Oct 29, 2018

Conversation

dghubble (Member) commented Oct 28, 2018

  • Updating the terraform-provider-ct plugin is known to produce a user_data diff in all pre-existing clusters. Applying that diff to a pre-existing cluster destroys controller nodes
  • Ignore changes to controller user_data. Once all managed clusters use a release containing this change, it is possible to update the terraform-provider-ct plugin (worker user_data will still be modified); see the sketch after this list
  • Changing the module ref for an existing cluster and re-applying is still NOT supported (although this PR would protect controllers from being destroyed)
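The change amounts to telling Terraform to ignore user_data on the controller instances. A minimal sketch of the idea, assuming an AWS controller resource; the resource, variable, and data source names below are illustrative, not the actual module contents:

resource "aws_instance" "controllers" {
  count = "${var.controller_count}"

  instance_type = "${var.controller_type}"
  # rendered Ignition from the ct provider (hypothetical data source name)
  user_data = "${data.ct_config.controller-ignitions.rendered}"

  lifecycle {
    # skip diffs when a newer ct provider renders different Ignition
    ignore_changes = ["user_data"]
  }
}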

Related:

dghubble (Member, Author) commented:

Once all managed clusters run v1.12.2 or higher, you can optionally update to the v0.3.0 terraform-provider-ct plugin. Terraform only pins plugin versions per-module for "official" providers, so updating the plugin binary is global. Back up the old terraform-provider-ct plugin and download the new terraform-provider-ct plugin in its place.
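A minimal sketch of the binary swap, assuming the plugin is installed as a third-party provider under ~/.terraform.d/plugins (paths and version names are illustrative):

# back up the old third-party plugin binary
mv ~/.terraform.d/plugins/terraform-provider-ct_v0.2.1 ~/terraform-provider-ct_v0.2.1.bak
# place the downloaded v0.3.0 binary where Terraform discovers plugins
cp ./terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.3.0
chmod +x ~/.terraform.d/plugins/terraform-provider-ct_v0.3.0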

In a Terraform config directory with clusters that are v1.12.2 or higher, re-run init and plan.

terraform init
terraform plan

Double-check that no diffs are proposed for controller nodes, as applying those would destroy cluster state. If the plan is clean, proceed with the apply.

terraform apply

Worker node user-data will be altered slightly (the new plugin generates Ignition with a newer version number). Rolling out workers happens differently depending on the platform:

AWS

AWS creates a new worker ASG, then removes the old ASG. New workers will join the cluster, then old workers will disappear. Expect terraform apply to hang during this process.
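One way to follow the roll, assuming kubectl is pointed at the cluster, is to watch old workers leave and replacements register:

kubectl get nodes --watch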

Azure

Azure edits the worker scale set in-place instantly. Manually terminate workers to create replacement workers using the new user-data.
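For example, with the Azure CLI you might delete a couple of scale set instances and let the scale set bring up replacements on the new user-data (resource group, scale set name, and instance IDs are illustrative):

az vmss delete-instances --resource-group example-cluster --name example-workers --instance-ids 0 1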

Bare-Metal

No action is needed. Bare-Metal machines do not re-PXE unless explicitly made to do so.

DigitalOcean

DigitalOcean destroys the existing worker nodes and DNS records, then creates new workers and records. Since DigitalOcean lacks a "managed group" or "auto-scaling group" equivalent, you must taint the secret copy step so a kubeconfig is secure-copied to the new workers; otherwise they won't join the cluster.

terraform apply
# old workers destroyed, new workers created

terraform state list | grep null_resource
terraform taint -module digital-ocean-nemo null_resource.copy-secrets.0
...

terraform apply

Google Cloud

Google Cloud creates a new worker template and edits the worker instance group in-place instantly. Manually terminate workers to create replacement workers using the new user-data (or just wait 24 hours if they're preemptible instances).
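For example, with gcloud you might delete a worker and let the managed instance group recreate it from the updated template (instance name and zone are illustrative):

gcloud compute instances delete example-worker-hk3q --zone us-central1-c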

@dghubble dghubble merged commit 0e71f7e into master Oct 29, 2018
@dghubble dghubble deleted the ignore-controller-user-data branch November 4, 2018 05:32