@rossf7 rossf7 commented Feb 16, 2021

Towards https://github.com/giantswarm/giantswarm/issues/15700

Please remember to

  • Give the PR a meaningful title -- it will be shown in the release notes
  • Add (or remove) user questions associated with any updated docs

@rossf7 rossf7 self-assigned this Feb 16, 2021

## Overview of Management Clusters

As we are fully convinced of Kubernetes as a platform for building platforms, we build all our management clusters based on Kubernetes. Giant Swarm leverages the concept of [“Operators”](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) to control all resources that clusters need as [“Custom Resources”](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/).

Contributor Author

This is taken from the architecture docs but I think it works as an intro.

Our workload clusters are versioned using Giant Swarm releases. From a workload cluster point of view, upgrades are described in [cluster upgrades]({{< relref "/general/cluster-upgrades" >}}).

For management clusters, when we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster.
We deploy new instances of the operators with the new version to avoid impacting existing workload clusters. These new operators will not become active until existing workload clusters are upgraded or a new workload cluster is created.
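
The isolation described above can be sketched roughly as follows. This is only an illustration, not Giant Swarm's actual implementation: the `Cluster` and `Operator` classes, the `operator_version` field, and the version strings are all hypothetical, and the sketch assumes that each operator instance reconciles only the clusters pinned to its own version.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    """A workload cluster pinned to one operator version (hypothetical model)."""
    name: str
    operator_version: str


@dataclass
class Operator:
    """One operator instance running in the management cluster (hypothetical)."""
    version: str

    def reconciles(self, cluster: Cluster) -> bool:
        # Assumption: an instance acts only on clusters pinned to its own version.
        return cluster.operator_version == self.version


def active_operators(operators: List[Operator], clusters: List[Cluster]) -> List[str]:
    """Return the versions of operators that have at least one cluster to reconcile."""
    return [op.version for op in operators
            if any(op.reconciles(c) for c in clusters)]


# Both operator versions are deployed side by side.
operators = [Operator("1.0.0"), Operator("2.0.0")]
clusters = [Cluster("prod", "1.0.0")]

before_upgrade = active_operators(operators, clusters)

# Upgrading the workload cluster hands it over to the new operator instance.
clusters[0].operator_version = "2.0.0"
after_upgrade = active_operators(operators, clusters)
```

In this sketch the 2.0.0 instance is deployed but has nothing to reconcile until the cluster is upgraded, which is the property that keeps management cluster updates from touching existing workload clusters.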

Contributor Author

This to me is the key point, as it's how we isolate MC updates from workload clusters.


The components that are not part of a Giant Swarm release, like our monitoring stack, are deployed via app collections. We run the latest version of each component and the collection is managed by our legacy deployment component called draughtsman.

The components are deployed as App custom resources using our app platform. Changes are tested in a number of internal management clusters before being deployed to all management clusters.

Contributor Author

I think it's best to be explicit here about the test process for MC changes, but shout if that isn't helpful / needed.

Member

Further above, App Platform has been title case. Please adjust for consistency.


## Management Cluster infrastructure upgrades

TBC

Contributor Author

Do we need to mention Terraform updates here?

If so, I think these security docs explain it well: https://docs.giantswarm.io/security/operational-layers/

Contributor

I think we do, even if we're planning on moving to using operators in the future.

Contributor

Maybe it's helpful to have a pointer towards that.

Contributor Author

I've added it and included why we want to move to Operators and how they are better than Terraform and Kops.

@teemow I took ideas from https://www.giantswarm.io/management-cluster for that. Thanks, I hadn't seen that page.

Our workload clusters are versioned using Giant Swarm releases. From a workload cluster point of view, upgrades are described in [cluster upgrades]({{< relref "/general/cluster-upgrades" >}}).

For management clusters, when we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster.
We deploy new instances of the operators with the new version to avoid impacting existing workload clusters. These new operators will not become active until existing workload clusters are upgraded or a new cluster is created.

Contributor Author

This to me is the key point, as it's how we isolate MC updates from workload clusters.

Contributor

This looks right to me

rossf7 commented Feb 17, 2021

@JosephSalisbury This is what there is so far.

I'd love your feedback on whether the 2 TBC sections are needed. If so, I can continue, or feel free to take over if you prefer. Thx!

@rossf7 rossf7 changed the title from "Add docs for management cluster updates" to "Add Management Cluster page with details on cluster updates" Feb 17, 2021
@rossf7 rossf7 marked this pull request as ready for review February 17, 2021 14:43
@rossf7 rossf7 requested review from a team, othylmann, puja108 and teemow February 17, 2021 14:44

@stone-z stone-z left a comment

LGTM, just some assorted nitpick suggestions

Co-authored-by: Zach Stone <zach@giantswarm.io>
rossf7 and others added 2 commits February 17, 2021 18:51
Co-authored-by: Marian Steinbach <marian@giantswarm.io>

@rossf7 rossf7 left a comment

@stone-z @marians Thanks for the reviews. All changes made.

@Oshratn Oshratn left a comment

I added a few nitpicks. Apart from that, it is clear and informative.


Our workload clusters are versioned using [workload cluster releases]({{< relref "/general/releases" >}}). From a workload cluster point of view, upgrades are described in [cluster upgrades]({{< relref "/general/cluster-upgrades" >}}).

When we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster.

Contributor

Suggested change
- When we publish a new Giant Swarm release, we create a new Release custom resource and any new operator versions are deployed to the management cluster.
+ When we publish a new Giant Swarm release, we create a new release custom resource and any new operator versions are deployed to the management cluster.

rossf7 commented Feb 18, 2021

OK this should be close now.

Fishing for any final comments or approvals. Otherwise will merge later this morning.

rossf7 commented Feb 18, 2021

Going with this. Thanks for all the help <3

Happy to do a follow-up if there is feedback that misses the train.

@rossf7 rossf7 merged commit 5e54925 into master Feb 18, 2021
@rossf7 rossf7 deleted the mgmt-cluster-updates branch February 18, 2021 10:58