
ibm_container_vpc_cluster - update_all_workers not working #1952

Closed
pauljegouic opened this issue Oct 8, 2020 · 6 comments
Labels: service/Kubernetes Service (Issues related to Kubernetes Service)

@pauljegouic

Hello,

Setting update_all_workers does not trigger any worker update:

  • not even after a manual update through the GUI,
  • not even when forcing kube_version to 1.18.9 (the master itself is updated correctly).
# module.******_k8s.module.iks.ibm_container_vpc_cluster.cluster will be updated in-place
  ~ resource "ibm_container_vpc_cluster" "cluster" {
        albs                            = [
            {
                alb_type               = "private"
                disable_deployment     = false
                enable                 = false
                id                     = "private-crbtov77lf059gpfv8g2fg-alb1"
                load_balancer_hostname = ""
                name                   = ""
                resize                 = false
                state                  = "disabled"
            },
            {
                alb_type               = "private"
                disable_deployment     = false
                enable                 = false
                id                     = "private-crbtov77lf059gpfv8g2fg-alb2"
                load_balancer_hostname = ""
                name                   = ""
                resize                 = false
                state                  = "disabled"
            },
            {
                alb_type               = "public"
                disable_deployment     = false
                enable                 = true
                id                     = "public-crbtov77lf059gpfv8g2fg-alb1"
                load_balancer_hostname = "e782ea05-eu-de.lb.appdomain.cloud"
                name                   = ""
                resize                 = false
                state                  = "enabled"
            },
            {
                alb_type               = "public"
                disable_deployment     = false
                enable                 = true
                id                     = "public-crbtov77lf059gpfv8g2fg-alb2"
                load_balancer_hostname = "e782ea05-eu-de.lb.appdomain.cloud"
                name                   = ""
                resize                 = false
                state                  = "enabled"
            },
        ]
        crn                             = "crn:v1:bluemix:public:containers-kubernetes:eu-de:a/b47236314fa44796b08c3558d97e7d1c:btov77lf059gpfv8g2fg::"
        disable_public_service_endpoint = false
        flavor                          = "cx2.4x8"
      ~ force_delete_storage            = false -> true
        id                              = "btov77lf059gpfv8g2fg"
        ingress_hostname                = "imad-************-cluster-6ee6c8df39940f0fcb1b8a02364e6ccb-0000.eu-de.containers.appdomain.cloud"
        ingress_secret                  = (sensitive value)
      ~ kube_version                    = "1.17.12" -> "1.18.9"
        master_status                   = "Ready"
        master_url                      = "https://c2.eu-de.containers.cloud.ibm.com:24212"
        name                            = "imad-************-cluster"
        pod_subnet                      = "172.17.128.0/18"
        private_service_endpoint_url    = "https://c2.private.eu-de.containers.cloud.ibm.com:24212"
        public_service_endpoint_url     = "https://c2.eu-de.containers.cloud.ibm.com:24212"
        resource_controller_url         = "https://cloud.ibm.com/kubernetes/clusters"
        resource_crn                    = "crn:v1:bluemix:public:containers-kubernetes:eu-de:a/b47236314fa44796b08c3558d97e7d1c:btov77lf059gpfv8g2fg::"
        resource_group_id               = "40469f682f5a48ee8f5060da1959d111"
        resource_group_name             = "************-integration"
        resource_name                   = "imad-************-cluster"
        resource_status                 = "normal"
        service_subnet                  = "172.21.0.0/16"
        state                           = "normal"
        tags                            = [
            "applicationname:************",
            "environmentid:imad",
            "environmenttype:dev",
            "project:************",
            "terraform:true",
            "usage:demo",
        ]
      ~ update_all_workers              = false -> true
        vpc_id                          = "r010-50124cb6-bbbc-4f99-bfcf-6a081fcc3ab8"
        wait_till                       = "IngressReady"
        worker_count                    = 2
        worker_labels                   = {
            "ibm-cloud.kubernetes.io/worker-pool-id" = "btov77lf059gpfv8g2fg-fa041c8"
        }

      + kms_config {
          + crk_id           = (known after apply)
          + instance_id      = (known after apply)
          + private_endpoint = true
        }

        zones {
            name      = "eu-de-1"
            subnet_id = "02b7-d9ae2e24-00b9-4788-9a00-075ed4fdfb86"
        }
        zones {
            name      = "eu-de-2"
            subnet_id = "02c7-690b0562-75a8-4790-8511-4c280f00929b"
        }
    }

Here is the result:

Master's info: [screenshot]

Workers info: [screenshot]
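
For context, here is a minimal sketch of the kind of configuration behind the plan above (resource names, variables, and module wiring are simplified and illustrative, not the exact module code):

resource "ibm_container_vpc_cluster" "cluster" {
  name               = "example-cluster"   # real name redacted in the plan above
  vpc_id             = var.vpc_id
  flavor             = "cx2.4x8"
  worker_count       = 2
  kube_version       = "1.18.9"            # bumped from 1.17.12
  update_all_workers = true                # expected to roll the workers as well
  wait_till          = "IngressReady"

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id_zone1        # illustrative variable
  }
  zones {
    name      = "eu-de-2"
    subnet_id = var.subnet_id_zone2        # illustrative variable
  }
}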

@Anil-CM
Copy link
Contributor

Anil-CM commented Oct 30, 2020

@ifs-pauljegouic we addressed these issues in PR #1989.

The PR addresses the following:

  1. Worker node Kube versions are now updated serially.
  2. The worker node upgrade is decoupled from the master upgrade: to update only the worker nodes, use update_all_workers.
  3. A new parameter, wait_for_worker_update, is introduced so you can avoid spending extra time waiting for the worker node kube version update.
    Note: skipping the wait will cause cluster downtime.

We are also coming up with a new resource for kube version upgrades to handle all of these requirements.
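
As a rough sketch of how these parameters fit together (values are illustrative; check the provider documentation for defaults and exact behavior):

resource "ibm_container_vpc_cluster" "cluster" {
  name         = "example-cluster"
  vpc_id       = var.vpc_id
  flavor       = "cx2.4x8"
  worker_count = 2
  kube_version = "1.18.9"

  # Roll the worker nodes to the new version as well, one node at a time.
  update_all_workers = true

  # Wait for the worker updates to finish during the apply; skipping the wait
  # shortens the apply but, per the note above, causes cluster downtime.
  wait_for_worker_update = true

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id
  }
}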

@hkantare
Collaborator

Fix is available in the latest release:
https://github.com/IBM-Cloud/terraform-provider-ibm/releases/tag/v1.14.0

@TBradCreech

@Anil-CM and @hkantare, I wanted to share my testing results using the IBM Terraform provider 1.14.0:

  • I can confirm that updating kube_version to a new major.minor AND changing update_all_workers from false to true causes both the master AND the workers to be upgraded.
  • Additionally, the worker upgrades are done one worker at a time. Great.

HOWEVER, while the test described above could move a cluster from 1.17 to the latest 1.18, imagine changing kube_version to 1.19 and doing a new init/generate/apply. Recall that update_all_workers is still set to true from the previous test. The expectation would be that the master AND the workers are upgraded to 1.19.

Unfortunately, in this case, Terraform recognizes the change in kube_version and moves the master to 1.19, but the workers are not upgraded. It's as if the logic is coded (incorrectly) to require update_all_workers to change from false to true for the new fix from this issue to be executed.

As a further test, I moved that setting back to false, then back to true, and saw the workers finally get upgraded (as long as kube_version changed too).

It should not be necessary to toggle update_all_workers from true to false and back to true in order for your new fix to be invoked. It should be enough that it is set to true: when Terraform recognizes a change in kube_version (or elsewhere) and there is a worker upgrade to apply, Terraform should apply it.
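
In other words, with update_all_workers already true, the only diff between applies would be the version bump, and that alone should be enough to roll the workers (plan-style sketch, versions illustrative):

      ~ kube_version                    = "1.18.9" -> "1.19.3"
        update_all_workers              = true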

@bemahone

@TBradCreech - I experienced the same behavior today. However, even after toggling update_all_workers from true to false and back to true (with terraform plan/apply in between each toggle), only 1 out of 3 workers was updated, and the Schematics job ran for nearly an hour and a half. It eventually failed with this error message:

2020/11/20 20:22:56 Terraform apply | module.iks_cluster.ibm_container_vpc_cluster.cluster[0]: Still modifying... [id=<redacted>, 1h12m11s elapsed]
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform apply | Error: Error waiting for cluster (<redacted>) worker nodes kube version to be updated: timeout while waiting for state to become 'normal' (last state: 'updating', timeout: 1h0m0s)
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform apply |   on cluster/cluster.tf line 15, in resource "ibm_container_vpc_cluster" "cluster":
 2020/11/20 20:23:00 Terraform apply |   15:  resource "ibm_container_vpc_cluster" "cluster" {
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform apply | 
 2020/11/20 20:23:00 Terraform APPLY error: Terraform APPLY errorexit status 1
 2020/11/20 20:23:00 Could not execute action

The workers and cluster were all normal in the ibmcloud CLI...
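
The 1h0m0s in the error looks like a default update timeout for the resource. Assuming ibm_container_vpc_cluster honors Terraform's standard timeouts block in this provider version (worth confirming in the provider docs), extending it should give several workers enough time to roll one at a time, for example:

resource "ibm_container_vpc_cluster" "cluster" {
  name               = "example-cluster"
  vpc_id             = var.vpc_id
  flavor             = "cx2.4x8"
  worker_count       = 3
  kube_version       = "1.18.9"
  update_all_workers = true

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id
  }

  # Assumption: the resource supports a configurable update timeout.
  timeouts {
    update = "3h"
  }
}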

kavya498 added the service/Kubernetes Service label on Mar 30, 2021
@hkantare
Collaborator

Now we support two ways to update the patch version: update_all_workers and patch_version.

More details can be found in #1978.
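
A short sketch of the two options side by side (the patch_version value and its exact format are illustrative; see #1978 and the provider documentation for the expected format and how the two arguments interact):

resource "ibm_container_vpc_cluster" "cluster" {
  name               = "example-cluster"
  vpc_id             = var.vpc_id
  flavor             = "cx2.4x8"
  worker_count       = 2
  kube_version       = "1.19"           # master major.minor
  update_all_workers = true             # option 1: roll all workers when the version changes
  patch_version      = "1.19.8_1528"    # option 2: target a specific worker patch (illustrative value)

  zones {
    name      = "eu-de-1"
    subnet_id = var.subnet_id
  }
}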

@hkantare
Collaborator

Closing the issue.
