
Suggestion for new Kubernetes resource(s) #120

Closed
bkircher opened this issue Jan 25, 2021 · 12 comments

@bkircher
Contributor

Current example of how to spin up a k8s cluster using the gridscale_paas resource: terraform-examples/managed-k8s/cluster.tf.

I find this hard to use. (Why is k8s_worker_node_storage exposed when I am not allowed to change it? See #118.) In particular, errors happen very late, after apply, at the API layer. (IMO, getting a 400 back from the API after doing terraform plan && terraform apply should be treated as a bug in this provider.)

Proposal: add new resources dedicated only to k8s (and possibly later more for other PaaS offerings)

To get available versions, we could point the user to the gscloud tool:

$ gscloud kubernetes versions
1.16.15-gs2
1.17.14-gs1
1.18.12-gs1
1.19.4-gs1

(Given that gscloud has implemented issue #113.)

This version slug could then be used in the version parameter (see below). Behind the scenes, we could use it to find the current service_template_uuid before planning.

That might even avoid binding a service_template_uuid directly to a resource, freeing us from the problems that arise when clusters are updated on gridscale's side without the change being reflected in the local TF state.

Example:

resource "gridscale_kubernetes_cluster" "mycluster" {
  name   = "mycluster"
  version = "1.19.4-gs1"
  labels = ["foo", "staging"]

  timeouts {
    create = "15m"
  }

  worker_pool {
    node_count = 3
    cores = 2  # cores per node
    memory = 4  # mem per node
    storage = 30  #
  }
}

Additionally, we could possibly add a way to retrieve the kubeconfig from the API after creation and make it directly available as an attribute in TF.
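
A minimal sketch of what exposing the kubeconfig as a computed attribute could look like, assuming terraform-plugin-sdk v2; the resource and attribute names here are placeholders, not decided names:

package gridscale

import (
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// resourceK8sClusterSketch is a trimmed-down, hypothetical resource schema
// showing only how a kubeconfig could be exposed to Terraform configurations.
func resourceK8sClusterSketch() *schema.Resource {
	return &schema.Resource{
		Schema: map[string]*schema.Schema{
			"name": {
				Type:     schema.TypeString,
				Required: true,
			},
			// Filled in by the read function once the cluster is up, so the
			// value can be wired into outputs or other providers directly.
			"kubeconfig": {
				Type:      schema.TypeString,
				Computed:  true,
				Sensitive: true,
			},
		},
	}
}

Users could then reference that attribute in an output or pass it on to the kubernetes/helm providers.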

Resource limits could be transparently encoded in the "worker_pool".

Not sure yet how to bring in the security zone here.

If we someday implement auto-scaling boundaries, this could also be handled in the "worker_pool" as max/min bounds or similar.

Thoughts?

@nvthongswansea
Contributor

@bkircher This sounds like a good idea to me. However, I would like the gscloud command output to be like this:

$ gscloud kubernetes versions
1.16.15-gs2 xxx-xxx-xxx-xxx
1.17.14-gs1 xxx-xxx-xxx-xxx
1.18.12-gs1 xxx-xxx-xxx-xxx
1.19.4-gs1 xxx-xxx-xxx-xxx

The user would copy the corresponding UUID of the k8s template and put it in the tf file. The reason is that we would not need to call the API to fetch all the templates and search for the UUID of the version.

@twiebe
Member

twiebe commented Feb 3, 2021

@bkircher I like it. UX will benefit from this.

Some notes:

  • Instead of version (e.g. 1.16.4-gs1), let the user define the release (e.g. 1.16). There is only ever one version of a release available at a time, and it is subject to change at any time. Existing clusters are updated automatically on a regular basis.
  • There is the concept of Node Pools. Node Pools let you define different node configurations, each with cores/memory/storage/storage type and the number of nodes. While this is currently not available (you can only define one pool), it is planned, and the terraform provider could already prepare for it (see the schema sketch after this list).
  • As for your question Why is k8s_worker_node_storage exposed when I am not allowed to change it?: to allow for the initial configuration. Vertical scaling, i.e. changing node resources, is also a planned feature.
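
To prepare for that, the provider could already model node_pool as a repeatable nested block and simply cap it at one element until the API supports more. A rough schema sketch, assuming terraform-plugin-sdk v2 and placeholder attribute names:

package gridscale

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

// nodePoolSchemaSketch models node_pool as a list of nested objects. MaxItems
// can later be raised from 1 once the backend supports multiple pools, without
// breaking existing configurations.
func nodePoolSchemaSketch() *schema.Schema {
	return &schema.Schema{
		Type:     schema.TypeList,
		Required: true,
		MaxItems: 1, // lift this once the API supports more than one pool
		Elem: &schema.Resource{
			Schema: map[string]*schema.Schema{
				"name":         {Type: schema.TypeString, Required: true},
				"node_count":   {Type: schema.TypeInt, Required: true},
				"cores":        {Type: schema.TypeInt, Required: true},
				"memory":       {Type: schema.TypeInt, Required: true},
				"storage":      {Type: schema.TypeInt, Required: true},
				"storage_type": {Type: schema.TypeString, Optional: true},
			},
		},
	}
}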

@bkircher
Contributor Author

bkircher commented Feb 7, 2021

Thanks for the feedback @twiebe!

Instead of version (e.g. 1.16.4-gs1), let the user define the release (e.g. 1.16). There is only ever one version of a release available at a time and it is subject to change at any time. Existing clusters are updated automatically on a regular basis.

Will do. Does this also imply that a release is never changed on the backend side?

@nvthongswansea
Contributor

@bkircher @twiebe I am working on this in the k8s branch. You can check out the branch, build, and test.

@bkircher
Contributor Author

👍 But let's first determine what the schema should look like before writing code.

With @twiebe's suggestions, and together with @itakouna, we came up with this:

resource "gridscale_kubernetes_cluster" "mycluster" {
  name   = "mycluster"
  release = "1.19"
  labels = ["foo", "staging"]

  timeouts {
    create = "15m"
  }

  # Node pool attached to this k8s resource; possibly multiple with different properties
  node_pool {
      name = "my_node_pool"
      node_count = 3
      cores = 2  # cores per node
      memory = 4  # mem per node
      storage = 80  # … uses a default storage type
  }

  node_pool {
      name = "my_io_intensive_node_pool"
      node_count = 6
      cores = 2
      memory = 4
      storage = 30
      storage_type = "insane"
  }
}

Note:

  • We use "release" here from objects/paas/service_templates API. In the implementation, we retrieve the service template UUID that corresponds to that release version combo.
  • We added node_pool, there can be multiple in a cluster. The cluster scheduler can assign certain workloads to certain node pools.
  • Node pools are really part of the cluster. You delete the cluster; node pools get purged.
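
As a rough illustration of that lookup (not the actual implementation), the provider could list the service templates and pick the one matching the requested release. The objects/paas/service_templates path comes from the note above; the auth headers, JSON field names, and the "kubernetes" category value are assumptions to be checked against the API docs:

package gridscale

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// paasTemplate holds only the fields needed for the lookup; the JSON keys are
// assumptions based on the objects/paas/service_templates note above.
type paasTemplate struct {
	ObjectUUID string `json:"object_uuid"`
	Category   string `json:"category"`
	Release    string `json:"release"`
}

// serviceTemplateUUIDForRelease finds the current service template UUID for a
// given Kubernetes release (e.g. "1.19"). apiURL, userUUID and token are
// simplified placeholders for whatever the provider already has configured.
func serviceTemplateUUIDForRelease(ctx context.Context, apiURL, userUUID, token, release string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, apiURL+"/objects/paas/service_templates", nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("X-Auth-UserId", userUUID)
	req.Header.Set("X-Auth-Token", token)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	// Assumed response shape: a map of UUID -> template properties.
	var body struct {
		Templates map[string]paasTemplate `json:"paas_service_templates"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return "", err
	}
	for _, t := range body.Templates {
		if t.Category == "kubernetes" && t.Release == release {
			return t.ObjectUUID, nil
		}
	}
	return "", fmt.Errorf("no kubernetes service template found for release %q", release)
}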

@bkircher
Contributor Author

bkircher commented Mar 5, 2021

gscloud kubernetes releases will be in v0.10.0 of gscloud. See gridscale/gscloud#113

@bkircher
Contributor Author

bkircher commented Mar 5, 2021

@nvthongswansea what are your thoughts on the schema proposal above? Do you think we can start implementing it?

@nvthongswansea
Contributor

@nvthongswansea what are your thoughts on the schema proposal above? Do you think we can start implementing it?

Yes, I'm on it. The thing is that the current API only allows a single node_pool. Correct me if I'm wrong @bkircher @twiebe @itakouna.

@nvthongswansea
Contributor

@bkircher @itakouna one question, please. How does the backend know when to use a specific node_pool? Is it defined by the user (e.g. by setting a weight for each node_pool)?

@nvthongswansea
Contributor

nvthongswansea commented Apr 6, 2021

@bkircher @itakouna I've added an example of the new k8s resource.

bkircher pushed a commit that referenced this issue Apr 7, 2021
Add a new k8s resource to TF provider. See #120.

What changes:

- Add new tf resource "gridscale_k8s".
- Add "gridscale_k8s" resource's docs.

What it does:

- Manage k8s cluster resources in gridscale.
- Validate gsk (gridscale k8s cluster) parameters, which is much better than what we had before (see the sketch below).
- Allow the user to input k8s_release (e.g. 1.19) instead of a service_template_uuid. Optionally, the values of `release` can be retrieved via gscloud (since the 0.10 release).
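
For illustration only: checks like this can run at plan time via a ValidateFunc, which is what addresses the "errors happen very late" complaint from the opening comment. A hedged sketch of such a check for the release parameter, not necessarily the exact validation added here:

package gridscale

import (
	"fmt"
	"regexp"
)

// releasePattern matches "<major>.<minor>", e.g. "1.19".
var releasePattern = regexp.MustCompile(`^\d+\.\d+$`)

// validateK8sRelease is a sketch of a ValidateFunc (terraform-plugin-sdk
// signature) that rejects malformed release strings at plan time, before any
// API call happens.
func validateK8sRelease(v interface{}, key string) ([]string, []error) {
	release, ok := v.(string)
	if !ok {
		return nil, []error{fmt.Errorf("%s: expected a string", key)}
	}
	if !releasePattern.MatchString(release) {
		return nil, []error{fmt.Errorf("%s: %q is not a valid release; expected <major>.<minor>, e.g. \"1.19\"", key, release)}
	}
	return nil, nil
}
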
@itakouna
Contributor

itakouna commented Apr 8, 2021

@bkircher @itakouna one question, please. How does the backend know when to use a specific node_pool? Is it defined by the user (e.g. by setting a weight for each node_pool)?

The kube-scheduler of the k8s cluster assigns workloads to a specific node pool based on labels and taints that the k8s administrator/user adds. However, those are out of scope for this issue.

@nvthongswansea
Contributor

nvthongswansea commented Apr 15, 2021

Done in #140.
