
ibm_container_vpc_cluster - update_all_workers not working #1952

Closed
pauljegouic opened this issue Oct 8, 2020 · 6 comments
Labels: service/Kubernetes Service (Issues related to Kubernetes Service)

@pauljegouic

Hello,

Setting update_all_workers does not trigger any worker update:

  • not even after a manual update through the GUI,
  • not even when forcing kube_version to 1.18.9 (the master itself is updated correctly).
# module.******_k8s.module.iks.ibm_container_vpc_cluster.cluster will be updated in-place
  ~ resource "ibm_container_vpc_cluster" "cluster" {
        albs                            = [
            {
                alb_type               = "private"
                disable_deployment     = false
                enable                 = false
                id                     = "private-crbtov77lf059gpfv8g2fg-alb1"
                load_balancer_hostname = ""
                name                   = ""
                resize                 = false
                state                  = "disabled"
            },
            {
                alb_type               = "private"
                disable_deployment     = false
                enable                 = false
                id                     = "private-crbtov77lf059gpfv8g2fg-alb2"
                load_balancer_hostname = ""
                name                   = ""
                resize                 = false
                state                  = "disabled"
            },
            {
                alb_type               = "public"
                disable_deployment     = false
                enable                 = true
                id                     = "public-crbtov77lf059gpfv8g2fg-alb1"
                load_balancer_hostname = "e782ea05-eu-de.lb.appdomain.cloud"
                name                   = ""
                resize                 = false
                state                  = "enabled"
            },
            {
                alb_type               = "public"
                disable_deployment     = false
                enable                 = true
                id                     = "public-crbtov77lf059gpfv8g2fg-alb2"
                load_balancer_hostname = "e782ea05-eu-de.lb.appdomain.cloud"
                name                   = ""
                resize                 = false
                state                  = "enabled"
            },
        ]
        crn                             = "crn:v1:bluemix:public:containers-kubernetes:eu-de:a/b47236314fa44796b08c3558d97e7d1c:btov77lf059gpfv8g2fg::"
        disable_public_service_endpoint = false
        flavor                          = "cx2.4x8"
      ~ force_delete_storage            = false -> true
        id                              = "btov77lf059gpfv8g2fg"
        ingress_hostname                = "imad-************-cluster-6ee6c8df39940f0fcb1b8a02364e6ccb-0000.eu-de.containers.appdomain.cloud"
        ingress_secret                  = (sensitive value)
      ~ kube_version                    = "1.17.12" -> "1.18.9"
        master_status                   = "Ready"
        master_url                      = "https://c2.eu-de.containers.cloud.ibm.com:24212"
        name                            = "imad-************-cluster"
        pod_subnet                      = "172.17.128.0/18"
        private_service_endpoint_url    = "https://c2.private.eu-de.containers.cloud.ibm.com:24212"
        public_service_endpoint_url     = "https://c2.eu-de.containers.cloud.ibm.com:24212"
        resource_controller_url         = "https://cloud.ibm.com/kubernetes/clusters"
        resource_crn                    = "crn:v1:bluemix:public:containers-kubernetes:eu-de:a/b47236314fa44796b08c3558d97e7d1c:btov77lf059gpfv8g2fg::"
        resource_group_id               = "40469f682f5a48ee8f5060da1959d111"
        resource_group_name             = "************-integration"
        resource_name                   = "imad-************-cluster"
        resource_status                 = "normal"
        service_subnet                  = "172.21.0.0/16"
        state                           = "normal"
        tags                            = [
            "applicationname:************",
            "environmentid:imad",
            "environmenttype:dev",
            "project:************",
            "terraform:true",
            "usage:demo",
        ]
      ~ update_all_workers              = false -> true
        vpc_id                          = "r010-50124cb6-bbbc-4f99-bfcf-6a081fcc3ab8"
        wait_till                       = "IngressReady"
        worker_count                    = 2
        worker_labels                   = {
            "ibm-cloud.kubernetes.io/worker-pool-id" = "btov77lf059gpfv8g2fg-fa041c8"
        }

      + kms_config {
          + crk_id           = (known after apply)
          + instance_id      = (known after apply)
          + private_endpoint = true
        }

        zones {
            name      = "eu-de-1"
            subnet_id = "02b7-d9ae2e24-00b9-4788-9a00-075ed4fdfb86"
        }
        zones {
            name      = "eu-de-2"
            subnet_id = "02c7-690b0562-75a8-4790-8511-4c280f00929b"
        }
    }

Here is the result:

Master's info: [screenshot]

Workers info: [screenshot]
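
For context, here is a minimal sketch of the kind of configuration behind the plan above (resource names, variables, and module wiring are simplified and illustrative, not the exact module code):

resource "ibm_container_vpc_cluster" "cluster" {
  name               = "example-cluster"   # real name redacted in the plan above
  vpc_id             = var.vpc_id
  flavor             = "cx2.4x8"
  worker_count       = 2
  kube_version       = "1.18.9"            # bumped from 1.17.12
  update_all_workers = true                # expected to roll the workers as well
  wait_till          = "IngressReady"

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id_zone1        # illustrative variable
  }
  zones {
    name      = "eu-de-2"
    subnet_id = var.subnet_id_zone2        # illustrative variable
  }
}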

@Anil-CM
Copy link
Contributor

Anil-CM commented Oct 30, 2020

@ifs-pauljegouic we addressed these issues in PR #1989.

The PR addresses the following:

  1. Worker node Kube versions are now updated serially.
  2. The worker node upgrade is decoupled from the master upgrade: to update only the worker nodes, use update_all_workers.
  3. A new parameter, wait_for_worker_update, is introduced so you can avoid spending extra time waiting for the worker node kube version update.
    Note: skipping the wait will cause cluster downtime.

We are also coming up with a new resource for kube version upgrades to handle all of these requirements.
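
As a rough sketch of how these parameters fit together (values are illustrative; check the provider documentation for defaults and exact behavior):

resource "ibm_container_vpc_cluster" "cluster" {
  name         = "example-cluster"
  vpc_id       = var.vpc_id
  flavor       = "cx2.4x8"
  worker_count = 2
  kube_version = "1.18.9"

  # Roll the worker nodes to the new version as well, one node at a time.
  update_all_workers = true

  # Wait for the worker updates to finish during the apply; skipping the wait
  # shortens the apply but, per the note above, causes cluster downtime.
  wait_for_worker_update = true

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id
  }
}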

@hkantare
Collaborator

Fix is available in the latest release:
https://github.com/IBM-Cloud/terraform-provider-ibm/releases/tag/v1.14.0

@TBradCreech

@Anil-CM and @hkantare, I wanted to share my testing results using the IBM Terraform provider 1.14.0:

  • I can confirm that updating kube_version to a new major.minor AND changing update_all_workers from false to true causes both the master AND the workers to be upgraded.
  • Additionally, the worker upgrades are done one worker at a time. Great.

HOWEVER, while the test described above could move a cluster from 1.17 to the latest 1.18, imagine changing kube_version to 1.19 and doing a new init/generate/apply. Recall that update_all_workers is still set to true from the previous test. The expectation would be that the master AND the workers are upgraded to 1.19.

Unfortunately, in this case, Terraform recognizes the change in kube_version and moves the master to 1.19, but the workers are not upgraded. It's as if the logic is coded (incorrectly) to require update_all_workers to change from false to true for the new fix from this issue to be executed.

As a further test, I moved that setting back to false, then back to true, and saw the workers finally get upgraded (as long as kube_version changed too).

It should not be necessary to toggle update_all_workers from true to false and back to true in order for your new fix to be invoked. It should be enough that it is set to true: when Terraform recognizes a change in kube_version (or elsewhere) and there is a worker upgrade to apply, Terraform should apply it.
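
In other words, with update_all_workers already true, the only diff between applies would be the version bump, and that alone should be enough to roll the workers (plan-style sketch, versions illustrative):

      ~ kube_version                    = "1.18.9" -> "1.19.3"
        update_all_workers              = true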

@bemahone

@TBradCreech - I experienced the same behavior today. However, even after toggling update_all_workers from true to false and back to true (with terraform plan/apply in between each toggle), only 1 out of 3 workers was updated, and the Schematics job ran for nearly an hour and a half. It eventually failed with this error message:

2020/11/20 20:22:56 Terraform apply | module.iks_cluster.ibm_container_vpc_cluster.cluster[0]: Still modifying... [id=<redacted>, 1h12m11s elapsed]
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform apply | Error: Error waiting for cluster (<redacted>) worker nodes kube version to be updated: timeout while waiting for state to become 'normal' (last state: 'updating', timeout: 1h0m0s)
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform apply |   on cluster/cluster.tf line 15, in resource "ibm_container_vpc_cluster" "cluster":
 2020/11/20 20:23:00 Terraform apply |   15:  resource "ibm_container_vpc_cluster" "cluster" {
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform APPLY error: Terraform APPLY errorexit status 1
 2020/11/20 20:23:00 Could not execute action

The workers and cluster were all normal in the ibmcloud CLI...
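
The 1h0m0s in the error looks like a default update timeout for the resource. Assuming ibm_container_vpc_cluster honors Terraform's standard timeouts block in this provider version (worth confirming in the provider docs), extending it should give several workers enough time to roll one at a time, for example:

resource "ibm_container_vpc_cluster" "cluster" {
  name               = "example-cluster"
  vpc_id             = var.vpc_id
  flavor             = "cx2.4x8"
  worker_count       = 3
  kube_version       = "1.18.9"
  update_all_workers = true

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id
  }

  # Assumption: the resource supports a configurable update timeout.
  timeouts {
    update = "3h"
  }
}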

kavya498 added the service/Kubernetes Service label on Mar 30, 2021
@hkantare
Collaborator

Now we support two ways to update the patch version: update_all_workers and patch_version.

More details can be found in #1978.
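
A short sketch of the two options side by side (the patch_version value and its exact format are illustrative; see #1978 and the provider documentation for the expected format and how the two arguments interact):

resource "ibm_container_vpc_cluster" "cluster" {
  name               = "example-cluster"
  vpc_id             = var.vpc_id
  flavor             = "cx2.4x8"
  worker_count       = 2
  kube_version       = "1.19"           # master major.minor
  update_all_workers = true             # option 1: roll all workers when the version changes
  patch_version      = "1.19.8_1528"    # option 2: target a specific worker patch (illustrative value)

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id
  }
}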

@hkantare
Collaborator

Closing the issue.
