Add Management Cluster page with details on cluster updates #823
| ## Overview of Management Clusters | ||
| As we are fully convinced of Kubernetes as a platform for building platforms, we build all our management clusters based on Kubernetes. Giant Swarm leverages the concept of [“Operators”](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) to control all resources that clusters need as [“Custom Resources”](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/). |
This is taken from the architecture docs but I think it works as an intro.
| Our workload clusters are versioned using Giant Swarm releases. From a workload cluster point of view, upgrades are described in [cluster upgrades]({{< relref "/general/cluster-upgrades" >}}). | ||
| For management clusters, when we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster. | ||
| We deploy new instances of the operators with the new version to avoid impacting existing workload clusters. These new operators will not become active until existing workload clusters are upgraded or a new workload cluster is created. |
To me this is the key point, as it's how we isolate MC updates from workload clusters.
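As a rough illustration of that isolation, here is a minimal Python sketch. It is not Giant Swarm's implementation: the `WorkloadCluster` and `Operator` types, the `releases` field, the operator name, and the version numbers are all hypothetical, made up only to show the idea that a newly deployed operator version stays idle until a workload cluster moves to a release that references it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadCluster:
    name: str
    release: str  # hypothetical: the release this cluster currently runs


@dataclass(frozen=True)
class Operator:
    name: str
    version: str
    releases: frozenset  # hypothetical: releases that reference this operator version

    def reconciles(self, cluster: WorkloadCluster) -> bool:
        # An operator instance only acts on clusters whose release references it.
        return cluster.release in self.releases


def active_operators(operators, clusters):
    """Return the operators that have at least one cluster to reconcile."""
    return [op for op in operators if any(op.reconciles(c) for c in clusters)]


# Both versions run side by side on the management cluster (illustrative values).
old_op = Operator("example-operator", "1.0.0", frozenset({"11.0.0"}))
new_op = Operator("example-operator", "2.0.0", frozenset({"12.0.0"}))
clusters = [WorkloadCluster("prod", "11.0.0")]

print([op.version for op in active_operators([old_op, new_op], clusters)])
```

In this sketch, deploying `new_op` changes nothing for the `prod` cluster. The new version only picks up work once `prod` is upgraded to a release that references it, or once a new cluster is created on that release.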
| The components that are not part of a Giant Swarm release, like our monitoring stack, are deployed via app collections. We run the latest version of each component, and the collection is managed by our legacy deployment component called draughtsman. | ||
| The components are deployed as App custom resources using our app platform. Changes are tested in a number of internal management clusters before being deployed to all management clusters. |
I think it's best to be explicit here about the test process for MC changes, but shout if that isn't helpful or needed.
Further above, App Platform is written in title case. Please adjust for consistency.
| ## Management Cluster infrastructure upgrades | ||
| TBC |
Do we need to mention Terraform updates here?
If so, I think these security docs explain it well: https://docs.giantswarm.io/security/operational-layers/
I think we do, even if we're planning to move to operators in the future.
Maybe it's helpful to have a pointer towards that.
I've added it and included why we want to move to Operators and how they are better than Terraform and Kops.
@teemow I took ideas from https://www.giantswarm.io/management-cluster for that. Thanks, I hadn't seen that page.
| Our workload clusters are versioned using Giant Swarm releases. From a workload cluster point of view, upgrades are described in [cluster upgrades]({{< relref "/general/cluster-upgrades" >}}). | ||
| For management clusters, when we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster. | ||
| We deploy new instances of the operators with the new version to avoid impacting existing workload clusters. These new operators will not become active until existing workload clusters are upgraded or a new cluster is created. |
To me this is the key point, as it's how we isolate MC updates from workload clusters.
This looks right to me
@JosephSalisbury This is what there is so far. I'd love your feedback on whether the 2 TBC sections are needed. If so, I can continue, or feel free to take over if you prefer. Thx!
stone-z
left a comment
LGTM, just some assorted nitpick suggestions
Co-authored-by: Zach Stone <zach@giantswarm.io>
Co-authored-by: Marian Steinbach <marian@giantswarm.io>
rossf7
left a comment
Oshratn
left a comment
I added a few nitpicks. Apart from that, it is clear and informative.
| Our workload clusters are versioned using [workload cluster releases]({{< relref "/general/releases" >}}). From a workload cluster point of view, upgrades are described in [cluster upgrades]({{< relref "/general/cluster-upgrades" >}}). | ||
| When we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster. |
| When we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster. | |
| When we publish a new Giant Swarm release, we create a new release custom resource and any new operator versions are deployed to the management cluster. |
OK, this should be close now. Fishing for any final comments or approvals. Otherwise I will merge later this morning.
Going with this. Thanks for all the help <3 Happy to do a follow-up if there is feedback that misses the train.
Towards https://github.com/giantswarm/giantswarm/issues/15700
Please remember to