Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chart version range specifications #62

Closed
kanatohodets opened this issue Dec 21, 2018 · 5 comments

Comments

@kanatohodets
Copy link
Contributor

commented Dec 21, 2018

In order to keep a Chart ecosystem healthy, it is important for some range of Chart updates to be applied quickly and consistently. Encoding a specific version into Application objects and then updating it with every change means that new versions are unlikely to be applied across the whole population of Application objects: the Chart maintainers might release a new version, but only some Application owners will action the upgrade, which represents toil for them.

To address this, we think it makes sense for Shipper to develop the ability to work with semver version range specifications for Charts. In this way, Application owners can specify a version constraint that they're comfortable with (for example, upwards float on minor or bugfix versions), and get those updates automatically.

Approach

Allow a version range specification as part of the version key of the chart field in Application objects; for example, a floating patch specification: ~1.1.0. When a Release is created as part of a rollout, Shipper should resolve that range specification to a concrete, pinned version, such as 1.1.6. The Release should get that pinned version in the version key.

Release will not support version range constraints: Releases must always have a concrete version. This ensures that aborts and rollbacks always go back to a predictable version. Version range resolution is a feature of the higher level Application interface.

Considerations

Become a real Helm repo client

In order to resolve the version range spec, Shipper will need to become a fully fledged Helm repo client which processes index.yaml. Right now it just combines the version and repoUrl and appends .tgz to try to download a Chart directly, which is incorrect behavior. If we were to keep this behavior, repoUrl should be renamed to chartUrl.

Speed

Shipper should check for an updated index.yaml as soon as it needs to do a rollout. This allows CI/CD pipelines to push a new chart and then immediately deploy it.

Robustness to repo unavailability

Shipper currently caches Charts aggressively. This is in order to prevent rollouts with a previously-deployed Chart from failing due to a broken Chart repository. We want to maintain this property as much as possible: we'd like to avoid going over the wire to contact the chart repo unless absolutely required. As such:

  1. We always try to use a cached Chart tarball when dealing with concrete versions.
  2. If the index.yaml fetch fails when resolving a version range specification, we should try to fall back to a cached index.yaml.

This means that a version range resolution may not return the latest possible version if the Chart repo is down. In this case Shipper should indicate the resolution failure or fallback using a Condition on the Application (or similar).

@isutton

This comment has been minimized.

Copy link
Contributor

commented Dec 21, 2018

Release will not support version range constraints: Releases must always have a concrete version. This ensures that aborts and rollbacks always go back to a predictable version. Version range resolution is a feature of the higher level Application interface.

It is worth to mention that in the case of a rollback that the user is responsible for adding a range specification for the next rollout, otherwise the Application's chart will be pinned (resolved to a single version) for subsequent rollouts and this might not be what the user expects.

@luisdavim

This comment has been minimized.

Copy link

commented Feb 1, 2019

Regarding helm repo availability, what about supporting having a docker registry and helm repo per cluster by allowing some king of pattern or templating in the manifest? For example having an https://github.com/goharbor/harbor on each application cluster being synced from another one running on the manager cluster?

@juliogreff juliogreff added this to the release-0.5 milestone May 13, 2019

@icanhazbroccoli icanhazbroccoli self-assigned this May 20, 2019

@juliogreff

This comment has been minimized.

Copy link
Contributor

commented Jul 23, 2019

Closed by #105

@juliogreff juliogreff closed this Jul 23, 2019

@icanhazbroccoli

This comment has been minimized.

Copy link
Contributor

commented Aug 2, 2019

release-0.5 exposed a significant flaw in the implementation. In the worst case scenario application releases were re-creating by Shipper uncontrollably causing an extensive resource consumption. The root cause is in the env hashing strategy: in the proposed solution app.spec.template.chart.version stays the original one (read: unresolved), whereas the release.spec.environment.chart.version contains a resolved chart version. On a resync cycle Shipper detected env hash mismatch between the app and the contender and decided to create another release right away. The history repeats until the resource limit is exhausted.

I'm reopening this issue for the sake of connecting it to an upcoming release. Once the development group is convinced on the newly proposed implementation, we can close this issue "for real".

icanhazbroccoli added a commit that referenced this issue Aug 2, 2019

Reworked chart version range resolution strategy refs #62
This change is aiming to address the problem we encountered during
exploiting release-0.5. In this release we introduced helm chart range
functionality, where an application.spec.template.chart.version could
contain a SemVer constraint and be resolved to a material chart version.
In the original implementation we decided to keep
application.spec.template.chart.version as given and preserve resolved
chart version in release.spec.environment.chart.version. Under these
circumstances we broke 2 things: normal rollout and rollbacks.

Rollouts were broken due to an always-mismatch between env hashes of the
application and it's contender: due to the difference between the
versions, env hashes were never the same and therefore Shipper was
continuously re-creating new releases. On top of that, an attempt to
re-rollout the same app with a constraint would fail as well even if we
had fixed the hash problem: they would be the same, which Shipper
recofnises as a no-op.

Rollbacks were breaking the contract due to the environment back-copy
operations: during a rollback Shipper copies contender spec back to the
application. This meant we're loosing the initial information about the
original SemVer constraint.

This change takes a different approach. In essence,
application.spec.template.chart.version is being updated on the release
stage: the chart version is being materialised and preserved in the
application spec instead of the initial constraint. The constraint is
now preserved in application annotations among with the chart name and
the resolved version. These 3 components are being used to detect
whether a specific chart version has already been resolved and whether
it's consistent with the one specified in the spec. In this case the
rest of the algorithm stays in place: releases are being created with
the updated materialised chart spec, and rollbacks don't loose any data.

Signed-off-by: Oleg Sidorov <oleg.sidorov@booking.com>

icanhazbroccoli added a commit that referenced this issue Aug 7, 2019

Reworked chart version range resolution strategy refs #62
This change is aiming to address the problem we encountered during
exploiting release-0.5. In this release we introduced helm chart range
functionality, where an application.spec.template.chart.version could
contain a SemVer constraint and be resolved to a material chart version.
In the original implementation we decided to keep
application.spec.template.chart.version as given and preserve resolved
chart version in release.spec.environment.chart.version. Under these
circumstances we broke 2 things: normal rollout and rollbacks.

Rollouts were broken due to an always-mismatch between env hashes of the
application and it's contender: due to the difference between the
versions, env hashes were never the same and therefore Shipper was
continuously re-creating new releases. On top of that, an attempt to
re-rollout the same app with a constraint would fail as well even if we
had fixed the hash problem: they would be the same, which Shipper
recofnises as a no-op.

Rollbacks were breaking the contract due to the environment back-copy
operations: during a rollback Shipper copies contender spec back to the
application. This meant we're loosing the initial information about the
original SemVer constraint.

This change takes a different approach. In essence,
application.spec.template.chart.version is being updated on the release
stage: the chart version is being materialised and preserved in the
application spec instead of the initial constraint. The constraint is
now preserved in application annotations among with the chart name and
the resolved version. These 3 components are being used to detect
whether a specific chart version has already been resolved and whether
it's consistent with the one specified in the spec. In this case the
rest of the algorithm stays in place: releases are being created with
the updated materialised chart spec, and rollbacks don't loose any data.

Signed-off-by: Oleg Sidorov <oleg.sidorov@booking.com>

parhamdoustdar added a commit that referenced this issue Aug 7, 2019

Merge pull request #143 from bookingcom/olegs/fix-release-0.5-chart-r…
…anges

Reworked chart version range resolution strategy refs #62
@icanhazbroccoli

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2019

The recent release 0.5.0-alpha.4 is closing this issue (https://github.com/bookingcom/shipper/releases/tag/v0.5.0-alpha.4)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.