
More consistent helm package versions #1251

Closed

Conversation

@TBBle TBBle (Contributor) commented Aug 26, 2020

What this PR does / Why we need it:

There are a few fixes here for the generated Helm charts: some low-hanging fruit from #1244 and a fix for some issues noticed while working on #1246.

The major visible fix is that Helm charts packaged from non-release builds will now have a unique AppVersion (the container image tag), so that it's possible to know which version of Open Match you have installed. I haven't changed the (SemVer) Version, as that's not low-hanging fruit per #1244.

This also has the effect that such Helm charts will point to the specific container image tag from which they were built, rather than using a floating image tag. This will avoid surprising issues in an Open Match deployment from a non-release build. It's generally a bad idea to use floating image tags in a Kubernetes Deployment. The manifest YAML generated from the same builds did not use the floating image tags, so this problem was only visible to Helm users.

In order to reduce the risk of similar issues in future, the manifest YAML is generated from the packaged chart, not the chart source. At this point, this probably makes no difference, but as other improvements flow from #1246, it will reduce one point of divergence between the manifest YAML and the Helm package. CI deployments still use the chart source because they need to generate TLS certificates. Fixing this (and hence supporting TLS certificates for the packaged chart) is also not low-hanging fruit.
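For illustration only, rendering the manifest from the packaged chart rather than the chart source is roughly the following (Helm 3 syntax; the exact command, paths and flags in the Makefile differ):

```sh
# Hypothetical sketch: render the manifest YAML from the packaged chart archive
# instead of the chart source directory (paths are illustrative).
helm template open-match build/chart/open-match-0.0.0-dev.tgz \
  --namespace open-match > install/open-match.yaml
```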

I also fixed an issue where the open-match-telemetry subchart was not having its dependencies updated, meaning it declared one version of the third-party charts it depends on but was deploying much older versions in practice.
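The fix boils down to also running a dependency update against the subchart, roughly as below (the subchart path is illustrative):

```sh
# Hypothetical sketch: refresh the telemetry subchart's own dependencies,
# not just the top-level chart's.
helm dependency update install/helm/open-match/charts/open-match-telemetry
```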

Which issue(s) this PR fixes:

Improvements towards #1244

Special notes for your reviewer:

Paul "Hampy" Hampson added 2 commits August 26, 2020 11:19
This looks like a refactoring oversight: The existing update was only
updating the top-level chart, but included the incubator repo which is
only needed by the telemetry subchart.

Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>
This ensures that anything done in the `helm package` call is applied to
the YAML output consistently.

Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>
Paul "Hampy" Hampson added 4 commits August 26, 2020 12:41
This allows distinguishing builds that have the same Version, i.e.
the ones produced by non-release builds, and also builds towards making
the default global.image.tag value be `.Chart.AppVersion`, to ensure
that Helm charts are not pointing at a floating image tag.

Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>
Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>
This fixes an issue where the published Helm charts, unlike the
generated manifest YAMLs, point at a floating container image tag in
non-release builds.

Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>
Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>
@TBBle TBBle force-pushed the more_consistent_helm_package_versions branch from 7e9cbf0 to 211c798 on August 26, 2020 02:41
@TBBle TBBle (Contributor, Author) commented Aug 26, 2020

I've just looked at the last-published 0.0.0-dev chart, and I realised that it did package a set of TLS secrets. I'm not sure if that is intentional or not, since the Makefile was set up to only depend on the secrets for the CI deployments, not the YAML or Helm packages, but I guess there's an ordering thing going on there.

@TBBle TBBle marked this pull request as ready for review August 26, 2020 05:51
@TBBle TBBle (Contributor, Author) commented Aug 27, 2020

While testing an Open Match version without this fix, I noticed that the currently-published 0.0.0-dev Helm chart doesn't work, because it is looking for images like gcr.io/open-match-public-images/openmatch-backend:0.0.0-dev, but the images from the auto-builds are pushed to gcr.io/open-match-build/openmatch-backend:0.0.0-dev.

I've pushed a fix for this in 955ffda. However, as far as I know, I can't get the generated Helm charts from PR CI builds right now.

Edit: I added a commit to include the Helm chart in the Build Artifacts on Cloud Build. However, to actually download them (or any artifact) I had to strip the #... off the URL, which means I'm probably getting the latest build, since the URLs in the build artifacts are the same apart from the #.... Still, I was able to confirm that I saw the expected values in the Helm chart for this PR, both the AppVersion and the image registry/tag values.

Paul "Hampy" Hampson added 2 commits August 27, 2020 20:55
This ensures that non-release Helm charts are usable out-of-the-box, by
patching in the registry at helm package time rather than overriding
during the test and YAML-generation stages.

Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>
Signed-off-by: Paul "Hampy" Hampson <p_hampson@wargaming.net>

lint-chart: build/toolchain/bin/helm$(EXE_EXTENSION) build/toolchain/bin/ct$(EXE_EXTENSION)
	(cd $(REPOSITORY_ROOT)/install/helm; $(HELM) lint $(OPEN_MATCH_HELM_NAME))
	$(CHART_TESTING) lint --all --chart-yaml-schema $(TOOLCHAIN_BIN)/etc/chart_schema.yaml --lint-conf $(TOOLCHAIN_BIN)/etc/lintconf.yaml --chart-dirs $(REPOSITORY_ROOT)/install/helm/
	$(CHART_TESTING) lint-and-install --all --chart-yaml-schema $(TOOLCHAIN_BIN)/etc/chart_schema.yaml --lint-conf $(TOOLCHAIN_BIN)/etc/lintconf.yaml --chart-dirs $(REPOSITORY_ROOT)/install/helm/

build/chart/open-match-$(BASE_VERSION).tgz: build/toolchain/bin/helm$(EXE_EXTENSION) lint-chart

patch-chart-image-values:
	$(SED_REPLACE) 's| registry: .*| registry: $(REGISTRY)|g' $(REPOSITORY_ROOT)/install/helm/open-match/values.yaml


values.yaml is committed, so running any command which calls into this would cause unwanted churn on that file, right?

Why are you replacing the current helm_image_tags?

@TBBle TBBle (Contributor, Author) commented Sep 10, 2020

Because right now, the generated chart contains floating image tags (0.0.0-dev), and I didn't think imagePullPolicy: Always was set here. (I checked: it is set by default in values.yaml... a different code smell.)

So installing this chart will mean your pods run whatever version of the image happened to be built from master at the time of the first pull for that pod on its node. Over time, this means you'll end up with a mix of different versions of the same image, which seems like a huge risk.

@TBBle TBBle (Contributor, Author) commented Sep 10, 2020

Patching the registry is needed because the auto-build images are not pushed to gcr.io/$(OPEN_MATCH_PUBLIC_IMAGES_PROJECT_ID) (which is what's in the committed values.yaml), but to gcr.io/$(OPEN_MATCH_BUILD_PROJECT_ID).

The idea here is to test the same thing we ship. Before this change, the behaviour was to generate templates from the Helm chart, and patch those generated files for CI, leaving the Helm chart untested and invalid out-of-the-box.

Another option would be to make the CI helm calls, and users of the 0.0.0-dev chart, pass --set global.image.registry=gcr.io/$(OPEN_MATCH_BUILD_PROJECT_ID) to Helm on each invocation (or, when using it as a subchart, hard-code that in the parent chart's values.yaml).
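For illustration, that per-invocation override would look something like the following (release name and chart reference are placeholders, not taken from this PR):

```sh
# Hypothetical invocation: point a 0.0.0-dev chart at the auto-build registry
# at install time instead of patching values.yaml in the package.
helm install open-match open-match/open-match \
  --version 0.0.0-dev \
  --set global.image.registry=gcr.io/open-match-build
```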

A different solution would be to have the Helm chart template its registry value so that, when it detects a Git SHA being used as the image tag, it replaces gcr.io/$(OPEN_MATCH_PUBLIC_IMAGES_PROJECT_ID) with gcr.io/$(OPEN_MATCH_BUILD_PROJECT_ID), unless the user has overridden the registry directly.

I feel like that would be less clear than having the same script that pushes the images also tell the Helm package where it pushed those images to, but it would mean the Helm chart is packaged from the unadulterated source, which is nice.

@TBBle TBBle (Contributor, Author) commented:

Anyway, the change made by this target should not be committed, since it's patching in build-specific values.


Right, I don't want commands making changes to files which shouldn't be committed (exception: anything in .gitignore, obviously). At best it'd be annoying to revert, especially when there are other changes to the same file; at worst, changes accidentally make their way into a commit.

Why is simply setting the variables when building the charts not the right course of action here?

Sorry for the delay reviewing this - I wasn't involved in writing the charts or the build system, so I need time to become familiar with them and figure out the best solution here.

@TBBle TBBle (Contributor, Author) commented Sep 16, 2020

You can't set variables except .Chart.Version and .Chart.AppVersion when calling helm package. The best you can do is change the values.yaml before calling helm package. That's why I suggested, as a TODO, making the default image tag .Chart.AppVersion: we can set that at package time, which removes 50% of this editing.
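For reference, a minimal sketch of the only per-package overrides helm package offers (the chart path and the IMAGE_TAG variable are illustrative placeholders):

```sh
# Hypothetical sketch: the only chart metadata you can override at package time
# is the chart version and the app version.
helm package install/helm/open-match \
  --version 0.0.0-dev \
  --app-version "${IMAGE_TAG}" \
  --destination build/chart/
```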

To get rid of the other 50% of the editing, we'd instead need a template macro that reacts to .Chart.AppVersion or .Chart.Version, recognises an auto-build, and replaces the existing hard-coded default repo with one of two hard-coded default repos, selected by, e.g., the presence of -dev in .Chart.Version.

However, both of the above changes are complicated by the current multi-chart setup, as the sub-charts always have version 0.0.0-dev, even in release builds, because we don't use helm package for them but source them inline, which means there's no opportunity to set .Chart.Version and .Chart.AppVersion at package time.

We could, as a separate change, modify the build system to do helm package for the subcharts rather than helm dep update on the parent charts, perhaps. But per #1244, the whole subcharting setup in Open Match is pretty messy, and I'd rather spend the time unwinding that, if I was going to spend time on it. (And any Open Match time I spend has higher priorities, like the outstanding documentation updates for #1235.)

Perhaps with some clever scripting, the values.yaml changes could be reverted as part of the Make execution? Although obviously that won't fly if the process fails and the user just runs it again, because then the next run would start from the already-modified files and "restore" them back to the same modified state.
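As a purely hypothetical sketch (not part of this PR), a patch-package-restore sequence with a cleanup trap is one way to avoid leaving values.yaml modified even when the build fails part-way:

```sh
# Hypothetical sketch: keep a pristine copy, patch, package, and always restore.
# GNU sed syntax shown; the Makefile would use $(SED_REPLACE) for portability.
VALUES=install/helm/open-match/values.yaml
cp "${VALUES}" "${VALUES}.orig"
trap 'mv "${VALUES}.orig" "${VALUES}"' EXIT
sed -i 's| registry: .*| registry: gcr.io/open-match-build|g' "${VALUES}"
helm package install/helm/open-match --destination build/chart/
```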

To be fair, this is a pretty common problem I've hit with container image repositories and registries when there are separate 'release' and 'auto-build' output locations. There was a proposal for Helm to formalise image-registry handling and make issues like this easier to solve cleanly, but it didn't seem to go anywhere.


> You can't set variables except .Chart.Version and .Chart.AppVersion when calling helm package.

(╯°□°)╯︵ ┻━┻

@TBBle TBBle (Contributor, Author) commented Sep 16, 2020

There was briefly a feature to do exactly that, but it broke other workflows, e.g., stripped all the comments from the values.yaml, and was backed out. The feature might come back in the future, but it's been a slow, 2-year burn so far. So I'm not personally building any CI pipelines around the assumption I'll have it in a meaningful timeframe.


I was thinking about this more: would it make sense to switch to "building" the charts?

Basically:

  • Copy chart folder from install/ into build/ (which is in .gitignore)
  • Build subcharts and put them in the proper folder
  • Edit the version and image info
  • Build the actual chart

This would also get rid of the committed subchart archives (which kind of shows that there are already build steps for the charts; they're just a bit of a hack today). A rough sketch of the flow is below.
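The sketch (paths, registry and the IMAGE_TAG variable are illustrative, not from the existing Makefile):

```sh
# Hypothetical sketch of "building" the chart in build/ instead of packaging
# the committed source in install/ directly.
rm -rf build/helm && mkdir -p build/helm
cp -r install/helm/open-match build/helm/
# Fetch external dependencies; in-repo subcharts would just be copied into charts/.
helm dependency update build/helm/open-match
# Stamp build-specific image info into the copy, never into the committed source.
sed -i 's| registry: .*| registry: gcr.io/open-match-build|g' build/helm/open-match/values.yaml
helm package build/helm/open-match \
  --app-version "${IMAGE_TAG}" \
  --destination build/chart/
```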

@TBBle TBBle (Contributor, Author) commented:

Yup, that would definitely be an option. And if you're doing that, I'd just copy the current subcharts into the charts/ folder, rather than building them separately and copying the tarballs in, since (as far as I recall) they are not deployed separately anywhere anyway. In effect, Helm is just extracting those tarballs internally and treating the whole thing as one big directory anyway, so it removes a needless step of "building" the subcharts when they don't really have an independent life outside the open-match chart.

You'd still run helm dep up or helm dep build (I forget which one is in the OM workflow) to update external charts, but that won't touch chart directories in the charts/ folder.

@Laremere commented:

There's been a lot of good discussion here, but I don't think the solution as-is will be merged. I'm closing this PR to reflect that.

Better helm building/packaging is a high priority maintenance issue for OM.
