Unclear when to use google_container_cluster or google_container_node_pool #475

Closed
paddycarver opened this issue Sep 27, 2017 · 8 comments · Fixed by GoogleCloudPlatform/magic-modules#1329

Comments

@paddycarver (Contributor)

The google_container_cluster resource has a node_pool field that can be used to define the cluster's node pools, but there is also a separate google_container_node_pool resource that can define node pools in a cluster. There's no guidance on when or how to use each, whether they should be used together, or why they're separated in the first place.

@paddycarver (Contributor, Author)

I think we can probably resolve this by updating the documentation pages for both of these resources to explain that google_container_cluster should manage the node pools when you have a single, authoritative list of node pools; this should generally be the common case. However, google_container_node_pool should be used when you want to distribute authority over node pool configuration in a cluster, e.g. when an infrastructure team manages the cluster and each developer team manages its own node pools, sometimes with different requirements. We should probably also note that google_container_node_pool won't remove node pools that are added outside of Terraform. And we should show how to use lifecycle.ignore_changes to make google_container_cluster work with google_container_node_pool.
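
Roughly, a minimal sketch of that split (the resource names, zone, and pool sizes below are placeholders, not a final recommendation): the cluster ignores its own node_pool attribute, and a separate google_container_node_pool resource owns the actual pool.

resource "google_container_cluster" "shared" {
  name               = "shared-cluster"
  zone               = "us-central1-a"
  initial_node_count = 1

  # Let google_container_node_pool resources own the pools; ignore the
  # cluster's own view of them so plans stay clean.
  lifecycle {
    ignore_changes = ["node_pool"]
  }
}

# Managed separately, e.g. by the team that needs this pool.
resource "google_container_node_pool" "team_a" {
  name       = "team-a-pool"
  cluster    = "${google_container_cluster.shared.name}"
  zone       = "us-central1-a"
  node_count = 2
}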

@rochdev commented Nov 8, 2017

@paddycarver Just to confirm my understanding: in order to manage node pools using google_container_node_pool, the following steps have to be taken:

  1. Create a google_container_cluster which will create a default pool, and use lifecycle.ignore_changes to ignore the pool.
  2. Create a null_resource that will delete that pool
  3. Create any number of pools using google_container_node_pool

My reasoning is that by using lifecycle.ignore_changes, no changes can ever be made to that node pool, so it should simply be removed and replaced with a google_container_node_pool.

Are there other ways to manage updatable node pools externally?

@matti commented Jan 18, 2018

I think I'm finally successful with:

resource "google_container_cluster" "stateful" {
  lifecycle {
    ignore_changes = ["node_pool"]
  }
  node_pool = {}
}

This will create an extra node pool (I don't understand how a null_resource can be used to delete it; that sounds awful), but now it works as expected. If this is the correct way to go, this is the example that should be in the docs.

EDIT: not so sure anymore. I'm giving up on separate google_container_node_pool resources and just inlining them in my google_container_cluster (it's a massive list). I don't understand how this got so complex; there's clearly something wrong with this design or the docs.

EDIT 2: well, that prevents me from removing a node pool in the future without recreating the cluster.

@mattdodge commented Jan 23, 2018

I may be in the minority here, but I do think that in production you should almost always manage your cluster and your node pools separately, primarily because of @matti's second edit: any change to an inline node pool requires the entire cluster to go down and come back up, so no zero-downtime deploys are possible. That does leave you with that pesky default node pool, though. Terraform is in a tough spot here; I think the fault really lies with GCP's inability to launch a cluster without any node pool (despite the fact that you can delete all of the node pools afterwards).

Anyway, I posted it here too, but here's an example of how to use a null_resource to delete the default node pool after the cluster is created.

resource "google_container_cluster" "cluster" {
  name = "my-cluster"
  zone = "us-west1-a"
  initial_node_count = 1
}

resource "google_container_node_pool" "pool" {
  name = "my-cluster-nodes"
  node_count = "3"
  zone = "us-west1-a"
  cluster = "${google_container_cluster.cluster.name}"
  node_config {
    machine_type = "n1-standard-1"
  }
  # Delete the default node pool before spinning this one up
  depends_on = ["null_resource.default_cluster_deleter"]
}

resource "null_resource" "default_cluster_deleter" {
  provisioner "local-exec" {
    command = <<EOF
      gcloud container node-pools \
	--project my-project \
	--quiet \
	delete default-pool \
	--cluster ${google_container_cluster.cluster.name}
EOF
  }
}

@roobert commented Apr 23, 2018

For anyone else who finds this issue: it looks like there is now a remove_default_node_pool parameter (#1245).

The following config will create a cluster (cluster0) with two attached node pools (nodepool0 and nodepool1) and no default node pool:

"resource" "google_container_cluster" "cluster0" {
  "name" = "cluster0"
  "zone" = "europe-west1-b"
  "remove_default_node_pool" = true
  "additional_zones" = ["europe-west1-c", "europe-west1-d"]
  "node_pool" = {
    "name" = "default-pool"
  }
  "lifecycle" = {
    "ignore_changes" = ["node_pool"]
  }
}

"resource" "google_container_node_pool" "nodepool0" {
  "name" = "nodepool0"
  "cluster" = "cluster0"
  "node_count" = 1
  "zone" = "europe-west1-b"
  "depends_on" = ["google_container_cluster.cluster0"]
  "node_config" = {
    "machine_type" = "f1-micro"
  }
}

"resource" "google_container_node_pool" "nodepool1" {
  "name" = "nodepool1"
  "cluster" = "cluster0"
  "node_count" = 3
  "zone" = "europe-west1-d"
  "depends_on" = ["google_container_cluster.cluster0"]
}

Updating node pool properties and adding or deleting node pools on the cluster seems to behave as expected.

I think this issue is probably still valid as it's not really clear from the docs whether this is the preferred method for managing node pools or not.
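
For comparison, a minimal sketch of the same idea without the inline node_pool block or ignore_changes, assuming remove_default_node_pool can be combined with initial_node_count on its own (names, zones, and machine types are placeholders):

resource "google_container_cluster" "cluster0" {
  name                     = "cluster0"
  zone                     = "europe-west1-b"
  initial_node_count       = 1
  remove_default_node_pool = true
}

resource "google_container_node_pool" "nodepool0" {
  name       = "nodepool0"
  cluster    = "${google_container_cluster.cluster0.name}"
  zone       = "europe-west1-b"
  node_count = 1

  node_config {
    machine_type = "f1-micro"
  }
}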

@michaelbannister (Contributor)

According to the docs, GKE chooses the master VM’s size based on the initial number of nodes, so if you’re going to have a large cluster, you may want that initial number to be bigger than 1, even though you’re going to delete it!
https://kubernetes.io/docs/admin/cluster-large/#size-of-master-and-master-components
If anyone knows this to be outdated, I’d love to hear it :)

@rochdev commented Apr 25, 2018

@michaelbannister This only seems to apply when using the kube-up.sh script to manage the masters yourself on GCE. With GKE, however, the masters are managed by Google, in which case it becomes their responsibility to scale them to support your nodes.

@ghost commented Mar 4, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 4, 2019