fix: Update GKE version #51

Closed
wants to merge 4 commits into from

@flamarion flamarion commented Feb 27, 2023

This PR upgrades the Kubernetes cluster to a newer stable version, since the current version is no longer supported.

module.wandb.module.database.google_sql_user.wandb: Creation complete after 1s [id=wandb//tf-perms-gcp-star-sheep]
╷
│ Error: googleapi: Error 400: Master version "1.22.12-gke.500" is unsupported., badRequest
│
│   with module.wandb.module.app_gke.google_container_cluster.default,
│   on ../terraform-google-wandb/modules/app_gke/main.tf line 1, in resource "google_container_cluster" "default":
│    1: resource "google_container_cluster" "default" {
│
╵

The currently supported versions in the STABLE channel:

gcloud container get-server-config --flatten="channels" --filter="channels.channel=STABLE" \
    --format="yaml(channels.channel,channels.validVersions)"
Fetching server config for europe-west2-a
---
channels:
  channel: STABLE
  validVersions:
  - 1.24.9-gke.1500
  - 1.23.14-gke.1800
  - 1.22.16-gke.2000
  - 1.21.14-gke.14600
  - 1.21.14-gke.14100

It also adds a variable for specifying the Kubernetes version, along with validation of the accepted versions.


validation {
condition = regex("^1.2[2-4].*", var.gke_version)

flamarion (Contributor Author):

This has been tested.

for a in "1.22.16-gke.2000" "1.23.14-gke.1800" "1.24.9-gke.1500" "1.21.14-gke.14600" "1.25" ; do echo "length(regexall(\"^1.2[2-4].*\", \"${a}\")) > 0" | terraform console; done
true
true
true
false
false

jsbroks (Member):

This still seems wrong. You are limiting the versions to 1.22, 1.23, and 1.24, which means we would need to update it again for version 1.25.

Something like this would make more sense:

^1.[0-9]*.[0-9]*-gke.[0-9]*$
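
To make the difference between the two patterns in this thread concrete, here is a minimal Python sketch. It is only an approximation: Python's `re` module stands in for Terraform's regex engine (an assumption that holds for patterns this simple), the helper `accepts` is hypothetical, and the version `1.25.0-gke.100` is a made-up example, not a real release. `STRICT_PATTERN` is not from the PR; it is one possible tightening that escapes the dots and requires at least one digit per component, because an unescaped `.` matches any character and `*` also matches nothing.

```python
import re

# Pattern from the PR: only 1.22, 1.23 and 1.24 pass.
PR_PATTERN = r"^1.2[2-4].*"
# Pattern suggested in review: any 1.x.y-gke.z, but with unescaped dots.
REVIEW_PATTERN = r"^1.[0-9]*.[0-9]*-gke.[0-9]*$"
# Hypothetical stricter variant (not from the PR): literal dots, non-empty numbers.
STRICT_PATTERN = r"^1\.[0-9]+\.[0-9]+-gke\.[0-9]+$"


def accepts(pattern: str, version: str) -> bool:
    """Return True when the pattern matches somewhere in the version string."""
    return re.search(pattern, version) is not None


if __name__ == "__main__":
    candidates = [
        "1.22.16-gke.2000",
        "1.24.9-gke.1500",
        "1.21.14-gke.14600",
        "1.25.0-gke.100",  # made-up version, for illustration only
        "1x25y0-gkez",     # malformed on purpose
    ]
    for version in candidates:
        print(
            version,
            accepts(PR_PATTERN, version),
            accepts(REVIEW_PATTERN, version),
            accepts(STRICT_PATTERN, version),
        )
```

The malformed string passes the review pattern but not the stricter one, which is the main reason to escape the dots if the open-ended form is adopted.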

flamarion (Contributor Author):

The valid GKE versions in the STABLE channel are:

  - 1.24.9-gke.1500
  - 1.23.14-gke.1800
  - 1.22.16-gke.2000
  - 1.21.14-gke.14600
  - 1.21.14-gke.14100

But I'm fine with opening it up to any other version if you believe that is what we want.

@flamarion (Contributor Author) commented:

Upgrading the cluster and the node pool takes quite some time: in this run, about 7 minutes for the cluster and about 21 minutes for the node pool.

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.wandb.module.app_gke.google_container_cluster.default will be updated in-place
  ~ resource "google_container_cluster" "default" {
        id                          = "projects/tmp-terraform-permissions/locations/europe-west2-a/clusters/tf-perms-gcp-cluster"
      ~ min_master_version          = "1.22.16-gke.2000" -> "1.23.14-gke.1800"
        name                        = "tf-perms-gcp-cluster"
        # (27 unchanged attributes hidden)

        # (16 unchanged blocks hidden)
    }

  # module.wandb.module.app_gke.google_container_node_pool.default will be updated in-place
  ~ resource "google_container_node_pool" "default" {
        id                          = "projects/tmp-terraform-permissions/locations/europe-west2-a/clusters/tf-perms-gcp-cluster/nodePools/default-pool-relieved-dinosaur"
        name                        = "default-pool-relieved-dinosaur"
      ~ version                     = "1.22.16-gke.2000" -> "1.23.14-gke.1800"
        # (9 unchanged attributes hidden)

        # (4 unchanged blocks hidden)
    }

Cluster:

module.wandb.module.app_gke.google_container_cluster.default: Still modifying... [id=projects/tmp-terraform-permissions/loca...-west2-a/clusters/tf-perms-gcp-cluster, 6m30s elapsed]
module.wandb.module.app_gke.google_container_cluster.default: Still modifying... [id=projects/tmp-terraform-permissions/loca...-west2-a/clusters/tf-perms-gcp-cluster, 6m40s elapsed]
module.wandb.module.app_gke.google_container_cluster.default: Still modifying... [id=projects/tmp-terraform-permissions/loca...-west2-a/clusters/tf-perms-gcp-cluster, 6m50s elapsed]
module.wandb.module.app_gke.google_container_cluster.default: Modifications complete after 6m54s [id=projects/tmp-terraform-permissions/locations/europe-west2-a/clusters/tf-perms-gcp-cluster]

Node Pool:

module.wandb.module.app_gke.google_container_node_pool.default: Still modifying... [id=projects/tmp-terraform-permissions/loca...dePools/default-pool-relieved-dinosaur, 20m50s elapsed]
module.wandb.module.app_gke.google_container_node_pool.default: Still modifying... [id=projects/tmp-terraform-permissions/loca...dePools/default-pool-relieved-dinosaur, 21m0s elapsed]
module.wandb.module.app_gke.google_container_node_pool.default: Still modifying... [id=projects/tmp-terraform-permissions/loca...dePools/default-pool-relieved-dinosaur, 21m10s elapsed]
module.wandb.module.app_gke.google_container_node_pool.default: Still modifying... [id=projects/tmp-terraform-permissions/loca...dePools/default-pool-relieved-dinosaur, 21m20s elapsed]
module.wandb.module.app_gke.google_container_node_pool.default: Modifications complete after 21m28s [id=projects/tmp-terraform-permissions/locations/europe-west2-a/clusters/tf-perms-gcp-cluster/nodePools/default-pool-relieved-dinosaur]

@jsbroks (Member) commented Mar 1, 2023

Wait do we need to set a min master version? Should terraform use the latest when we deploy the instance?

@flamarion (Contributor Author) commented:

> Wait do we need to set a min master version? Should terraform use the latest when we deploy the instance?

The default version for the STABLE channel is not the latest one; it is 1.23:

gcloud container get-server-config --zone=europe-west2 --flatten=channels --filter="channels.channel=STABLE" --format="value(channels.defaultVersion)"
Fetching server config for europe-west2
1.23.14-gke.1800

1.24 is available, but it's not the default for the STABLE channel. If you think we need to go to 1.25, you need to switch the channel from STABLE to RAPID.
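
The gap between "default" and "latest" in the STABLE channel can be checked against the version list quoted earlier in this thread. The following is a rough Python sketch, not part of the PR: the helper `parse_gke_version` is hypothetical, and the version strings are copied from the `gcloud` output above. Versions are compared numerically because plain string comparison would, for example, rank `1.24.9` above `1.24.10`.

```python
import re


def parse_gke_version(version: str) -> tuple:
    """Split 'MAJOR.MINOR.PATCH-gke.BUILD' into a tuple of integers."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version)
    if match is None:
        raise ValueError(f"unexpected GKE version format: {version!r}")
    return tuple(int(part) for part in match.groups())


# Valid versions and default version as reported for the STABLE channel above.
valid_versions = [
    "1.24.9-gke.1500",
    "1.23.14-gke.1800",
    "1.22.16-gke.2000",
    "1.21.14-gke.14600",
    "1.21.14-gke.14100",
]
default_version = "1.23.14-gke.1800"

# Highest valid version by numeric comparison of each component.
latest = max(valid_versions, key=parse_gke_version)

if __name__ == "__main__":
    print("default:", default_version)
    print("latest: ", latest)
```

With this list, the highest valid version is a 1.24 release while the channel default stays on 1.23, which matches the explanation above.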

@flamarion flamarion closed this Mar 1, 2023
@flamarion flamarion deleted the updated_gke_version branch March 1, 2023 14:55