Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Helm Refactoring #13259

Merged
merged 1 commit into from
Oct 12, 2020
Merged

Helm Refactoring #13259

merged 1 commit into from
Oct 12, 2020

Conversation

seanmwinn
Copy link
Contributor

@seanmwinn seanmwinn commented Sep 22, 2020

Signed-off-by: Sean Winn sean@isovalent.com

Full refactor of helm charts:

  1. Re-structure helm charts into a single chart for cilium.

  2. Configure sane default values at the chart level for each component, removing the need for all global values previously implemented.

  3. Maintain upgrade compatibility with prior versions of cilium.

Fixes: #13210

Helm charts have been fully re-structured to a single chart for cilium with no dependency on sub-charts. More than 170 global values have been properly scoped to the cilium chart with sane defaults defined. Users upgrading from prior versions of cilium should be sure to read the upgrade guide for specific instructions.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 22, 2020
@seanmwinn seanmwinn added the release-note/major This PR introduces major new functionality to Cilium. label Sep 22, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 22, 2020
Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't do a deep review yet since I'm still mulling over the best way to efficiently do so for such a large change. I did however note a few points below where it seems like the charts are not taking into account changes made into the master branch of the Cilium tree, such as hubble-relay TLS functionality and likely also the "local redirect policy" RBAC (amongst other things).

One reassuring point would be if you run make -C install/kubernetes, does it produce any diff against the quick-install / experimental-install YAMLs? If yes, why? (If this command breaks, then well that could do with some debugging and I'm happy to help debug why, feel free to reach out).

Step 2 of your PR description, "Maintain upgrade compatibility" suggests that upgrade should continue to work. Two questions from my side on this:

  1. Do we need any changes to the upgrade guide to account for this?
  2. Does the upgrade CI test still pass?

(I can run the CI to get feedback on the second point)

.github/workflows/smoke-test-ipv6.yaml Outdated Show resolved Hide resolved
@seanmwinn seanmwinn marked this pull request as ready for review September 25, 2020 22:40
@seanmwinn seanmwinn requested review from a team as code owners September 25, 2020 22:40
@seanmwinn seanmwinn requested review from a team September 25, 2020 22:40
@seanmwinn seanmwinn requested a review from a team as a code owner September 25, 2020 22:40
Copy link
Member

@tgraf tgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The end result of this looks awesome. Much cleaner.

The PR description states that compatibility is maintained. How? I assume that existing values will be ignored and users will have to convert old helm values to new helm values. Is that correct? Do we need a script or a guide on how to convert old values to new values?

Documentation/install/upgrade.rst Outdated Show resolved Hide resolved
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fantastic overall! Very small comments to address below.

install/kubernetes/Makefile Outdated Show resolved Hide resolved
vendor/github.com/fsnotify/fsnotify/.travis.yml Outdated Show resolved Hide resolved
Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How of curiosity, how will helm decide if we end up with the same option in 2 different sub-charts?

also, the quick-install.yaml file is a lot different from the current root. Ideally the generated file should be the same, or similar to the existing one.

@tgraf
Copy link
Member

tgraf commented Sep 29, 2020

I've reviewed this PR a second time to make sure I understand what it changes. Based on the comments in the meeting on Monday, I was under the impression that this would also remove the need for the compatibility values which, if I understand this PR correctly, are still in-tact and unchanged after this PR.

The main change this PR achieves seems to be to merge all agent-related sub-charts into a single parent chart to avoid global variables. The benefit from this is that a single Values file can control all aspects of the agent.

The cost of this change is that all Helm value names are changing which means a hard breakage unless we take action to convert existing Values files. If I understand it correctly, helm upgrade will break hard unless we do something about it.

It seems vital that we understand what we are asking the user to do in the event of an upgrade in order to migrate to this new model. I'm not sure I understand what we are about to ask users.

@seanmwinn
Copy link
Contributor Author

@tgraf your understanding of the upgrade being broken by this change is not entirely accurate. I have been using helm upgrade to migrate from existing 1.7x and 1.8x installations to ones using this helm chart as a basis of my own testing. The upgradeCompatibility flag is removed from the values file, and any leftover pieces in the config map have zero effect. I still have a little bit of cleanup to do there, apparently, but upgrades totally work with small changes on the user's part which can be easily documented.

The upgrade guide has the start of what will become the upgrade directions, but I have been working toward getting all of the tests to pass as a primary focus, and they have been secondary and are still not complete. I'm expecting to complete the last pieces today/tomorrow now that I got all the base tests passing as of last night. I will need to also see what CI produces next. I'm about to kick it off.

My offer from the Monday call stands to demonstrate how this works in an upgrade scenario. I've tested it thoroughly, and I'm confident it will not break upgrades for users, however, they will need to follow the upgrade guide to be successful. The other piece which I think still needs some clarification across the team is how to set default values for each released version of a chart.

@seanmwinn
Copy link
Contributor Author

test-me-please

@seanmwinn seanmwinn requested a review from a team as a code owner September 29, 2020 21:57
@seanmwinn
Copy link
Contributor Author

test-me-please

@rolinh rolinh added the area/helm Impacts helm charts and user deployment experience label Sep 30, 2020
@seanmwinn
Copy link
Contributor Author

test-me-please

4 similar comments
@seanmwinn
Copy link
Contributor Author

test-me-please

@seanmwinn
Copy link
Contributor Author

test-me-please

@seanmwinn
Copy link
Contributor Author

test-me-please

@seanmwinn
Copy link
Contributor Author

test-me-please

@joestringer joestringer merged commit e9cb43c into cilium:master Oct 12, 2020
rolinh added a commit that referenced this pull request Oct 13, 2020
It looks like this file was wrongly committed as part of the Helm
refactoring PR (#13259).

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
rolinh added a commit that referenced this pull request Oct 13, 2020
PR #11577 added checks to ensure that the following conditions are met:

  - Hubble UI requires Hubble Relay
  - Hubble Relay requires Hubble

These checks were pruned as part of the Helm refactoring PR (#13259) so
this commit adds them back. This ensures that invalid Hubble settings
such as enabling UI without enable Relay are caught early to avoid
confusion.

Fixes: #13537

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
rolinh added a commit that referenced this pull request Oct 13, 2020
PR #13259 move helm chart files around and the CODEOWNERS file was not
updated accordingly.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
tklauser pushed a commit that referenced this pull request Oct 14, 2020
PR #13259 move helm chart files around and the CODEOWNERS file was not
updated accordingly.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
tklauser pushed a commit that referenced this pull request Oct 14, 2020
PR #11577 added checks to ensure that the following conditions are met:

  - Hubble UI requires Hubble Relay
  - Hubble Relay requires Hubble

These checks were pruned as part of the Helm refactoring PR (#13259) so
this commit adds them back. This ensures that invalid Hubble settings
such as enabling UI without enable Relay are caught early to avoid
confusion.

Fixes: #13537

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
tklauser pushed a commit that referenced this pull request Oct 14, 2020
It looks like this file was wrongly committed as part of the Helm
refactoring PR (#13259).

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
gandro added a commit that referenced this pull request Oct 15, 2020
PR #13259 made the service account configuration a top level value
(`.Values.serviceAccounts`). This commit removes the now unused service
accounts under `hubble.ui.serviceAccount` and
`hubble.relay.serviceAccount`, which are supposed to be configured via
`serviceAccounts.ui` and `serviceAccounts.relay`.

In addition, also fixes the service account value in the Hubble UI
cluster role and cluster role binding.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
joestringer pushed a commit that referenced this pull request Oct 15, 2020
PR #13259 made the service account configuration a top level value
(`.Values.serviceAccounts`). This commit removes the now unused service
accounts under `hubble.ui.serviceAccount` and
`hubble.relay.serviceAccount`, which are supposed to be configured via
`serviceAccounts.ui` and `serviceAccounts.relay`.

In addition, also fixes the service account value in the Hubble UI
cluster role and cluster role binding.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
runPath: "/var/run/cilium"
# nativeRoutingCIDR allows to explicitly specify the CIDR for native routing. This
# value corresponds to the configured cluster-cidr.
# nativeRoutingCIDR:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanmwinn Here "nativeRoutingCIDR" is commented out. Was that an intended change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/helm Impacts helm charts and user deployment experience release-note/major This PR introduces major new functionality to Cilium.
Projects
No open projects
1.9.0
  
Awaiting triage
Development

Successfully merging this pull request may close these issues.

Helm: Helm Chart Refactoring helm: inconsistency in hubble-relay chart