Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for using OCI references as the dependency repository #7613

Closed
wants to merge 1 commit into from

Conversation

phroggyy
Copy link

@phroggyy phroggyy commented Feb 17, 2020

Summary

This PR introduces the ability to specify OCI references as the repository for chart dependencies. This means it will be possible to specify:

# Chart.yaml
dependencies:
- name: nginx
  version: "1.2.3"
  repository: "oci://nginx:latest"

This MR does not (yet anyway) add support for directly installing OCI-compatible images via helm install, i.e helm install --name nginx oci://nginx, although that addition should be fairly easy.

Points of discussion

Versions

Currently, the helm versioning system relies on SemVer. This is great, and something I think should definitely be kept. However, with OCI registries, we have no way of determining a version or any way of indexing. Thus, we have some options:

  1. Ignore the version: this is what the current code does, and I don't like it. This will rely solely on the image name and tag for the download.
  2. Enforce SemVer tagging and disallow image tags in the image reference. This would mean that the download would just use the version as the image tag. e.g version: 1.10 and repository: oci://nginx would download nginx:1.10. I'm not a fan of enforcing this SemVer tagging, as it makes Helm's system less flexible than what image tags allow, and may disrupt workflows for application charts that developers might be used to (e.g using the commit ref and latest as tags)
  3. Require either version or a tag. This would mean that if no tag is specified, users are required to specify the version, which will then be used as the image tag. If a tag is provided, we'll use that, and not require the version to be specified (as it will be whatever image was specified). This is my personal favourite. One downside to this is that the "either or" might be confusing, and I can see it confusing developers if using both version and tag. Or if a tag is not specified, knowing that the version is what's being used.

I'd love to see some discussion around the versioning, as that's currently the main problem I can see with an OCI implementation. As mentioned, my preferred option is some variation of the third option. Most importantly, I'd like to allow Helm users to use specific image tags shall they wish to do so.

Limitations

It is not possible to do provenance verification at download-time (meaning this would be incompatible with install --verify) as no provenance file is present in the image.

Notes

Resolves #6593

@helm-bot helm-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 17, 2020
Copy link
Contributor

@jdolitsky jdolitsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor comments, but overall looks good!

cmd/helm/dependency_update.go Outdated Show resolved Hide resolved
pkg/getter/getter.go Outdated Show resolved Hide resolved
pkg/getter/registrygetter.go Outdated Show resolved Hide resolved
pkg/getter/registrygetter.go Outdated Show resolved Hide resolved
@jdolitsky
Copy link
Contributor

@phroggyy looks like CI is failing - can you try running tests locally?

@jdolitsky
Copy link
Contributor

Im personally a fan of #3 suggestion, "require either version or a tag. This would mean that if no tag is specified, users are required to specify the version"

that sounds reasonable to me. this is actually how it works now with helm chart save

@phroggyy phroggyy force-pushed the feat/oci-dependencies branch 3 times, most recently from 9982f31 to e8059b2 Compare February 18, 2020 19:13
@helm-bot helm-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 22, 2020
@phroggyy phroggyy changed the title WIP: Add support for using OCI references as the dependency repository Add support for using OCI references as the dependency repository Feb 22, 2020
internal/test/net.go Outdated Show resolved Hide resolved
Copy link
Contributor

@jdolitsky jdolitsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty great. Looking forward to this. LGTM

internal/test/net.go Outdated Show resolved Hide resolved
pkg/downloader/manager.go Outdated Show resolved Hide resolved
@phroggyy phroggyy force-pushed the feat/oci-dependencies branch 2 times, most recently from 87232d6 to a7cf177 Compare March 8, 2020 17:30
@zendril
Copy link

zendril commented Mar 11, 2020

@phroggyy What happens when the registry is airgapped?

Today when using a chart itself, you have a default value for registry/repo:tag, and most charts allow you to override , <repo (aka image)> and . That way you can if your images need to come from an internally hosted registry (instead of public internet like docker.io), then you have that flexibility.

From what I currently see, it is nice that the dependent chart can be specified, but it is then "hard-coded" to pull from a specific registry at helm install/pull time.

With current charts, the tgz is completely self contained. Is that to continue to be the case?
If so, then this question is moot.

If we are expecting that the dependent charts can be dynamically resolved at install/pull time, then I think we need to account for this override scenario.

@zendril
Copy link

zendril commented Mar 11, 2020

After chat on slack, confirmed that the tgz will still remain as it is today. All transitive chart dependencies are still packed inside. This means that the OCI references are only at build time, not runtime.

All good. :)

@jdolitsky
Copy link
Contributor

This affects mostly just experimental stuff, so I think we are safe to merge.

Other maintainers: biggest change is here is adding methods to the Getter interface:

	Filename(url *url.URL, version string) string
	// Allow the getter to alter the download URL before retrieval
	URL(u *url.URL, version string) (string, error)

Copy link
Member

@bacongobbler bacongobbler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few questions about the design changes to the Getter interface in this PR

pkg/downloader/manager.go Show resolved Hide resolved
pkg/getter/getter.go Outdated Show resolved Hide resolved
pkg/getter/getter.go Outdated Show resolved Hide resolved
@phoenix-bjoern
Copy link

@bacongobbler can you please continue with the code review and merge this PR before the 3.2.0 release? This is the most important feature to get the OCI charts working.
Thanks.

@bacongobbler
Copy link
Member

bacongobbler commented Mar 25, 2020

I am busy with other work at the moment and cannot continue with the review in the near future. Those comments were merely observations about what the getter package was originally written for, and this PR appears to change its originally intended behaviour. I would suggest reaching out to another maintainer for review.

@DracoBlue
Copy link

Hello,

as the oci registries are called "registries" on the official docs and as sub command (https://helm.sh/docs/topics/registries/), wouldn't it be better to call it like this:

# Chart.yaml
dependencies:
- name: nginx
  version: "1.2.3"
  registry: "oci://nginx:latest"

@phroggyy
Copy link
Author

I had a look into refactoring the getter bit as had been discussed. My initial thoughts were to let the downloader set the filename, but this would be a breaking change and does indeed break existing tests. Going down the GetWithDetails route is feasible, but still doesn't feel right to me; we're making every other getter re-implement identical behaviour, just to be able to treat the registry getter differently. If we could require a tgz extension, one option would be to have the downloader substitute : for - and ensure a tgz extension. That seems like a less intrusive way of doing it, and not requiring re-implementing identical behaviour across all-but-one getters.

@sajayantony
Copy link

Singularity has a similar model of {scheme}://{name} reference
There are differences in SymVer 2.0 grammar and the Reference spec which disables certain characters like + so trying to represent the tag as a projection of the version will need some canonicalization which could lead to more issues. I would try to just reference the tag directly and avoid inference to avoid the similar issues like those that come due tolatest.

@jdolitsky
Copy link
Contributor

@phroggyy apologies for delay here. I think the GetWithDetails is an acceptable solution. Could you rebase and address the DCO and Circle errors?

@anschoewe
Copy link

I would love to see this merged. It would make working with OCI sub-charts/dependencies in an umbrella chart so much easier. I'll try to learn go in the near future so I can actually help improve the tool myself.

@phroggyy
Copy link
Author

@jdolitsky sorry, been away for a couple days. I'll address those issues today.

@hickeyma
Copy link
Contributor

@jdolitsky @bacongobbler Would you be happy to merge this PR?

@bacongobbler
Copy link
Member

I want to take a quick look at something first to verify before merging, but LGTM

Copy link
Member

@bacongobbler bacongobbler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the code in here violates licensing terms. We cannot accept code that we did not originally write and claim that copyright for ourselves.

If you can provide a workaround that replaces this code with your own original work (or, include the project as a dependency in go.mod), then we can work around those terms. But merging this PR as-is would open us up to legal issues.

internal/test/net.go Outdated Show resolved Hide resolved
cmd/helm/dependency.go Outdated Show resolved Hide resolved
Signed-off-by: Leo Sjöberg <leo.sjoberg@gmail.com>
@phroggyy
Copy link
Author

@bacongobbler fixed for your requests!

pkg/downloader/chart_downloader.go Show resolved Hide resolved
return "", nil, errors.New("no version or tag provided")
}

if len(parts) != 2 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading this right, the following will append a version to the path if there are multiple colons in the URL path, like so:

repository: oci://localhost:5000/testrepo/testchart:1.2.3:1.2.3
version: 1.0.0

Which will result in the following URL path: /testrepo/testchart:1.2.3:1.2.3:1.0.0

Is that right? That doesn't seem like valid input.

Copy link
Author

@phroggyy phroggyy Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple colons in the path isn't allowed; you can't tag an OCI image with localhost:5000/testrepo/testchart:1.2.3:1.2.3;

➜ docker tag nginx:1.16.0 localhost:5000/nginx:1.16.0:1.2.3
Error parsing reference: "localhost:5000/nginx:1.16.0:1.2.3" is not a valid repository/tag: invalid reference format

So that repository string is invalid to begin with. I could add additional validation to ensure you don't specify multiple tags, but I don't see that being too valuable.

Copy link

@pmalic pmalic Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for barging in, but isn't this shorter & more correct:

		if len(parts) == 1 {
			if version == "" {
				return "", nil, errors.New("no version or tag provided")
			}

			parts = append(parts, version)
			u.Path = fmt.Sprintf("%s:%s", u.Path, version)
		}

You wouldn't be doing len() twice and would only be appending version if : is not present in the path at all.

This would also prevent "corrupting" the URL even further by adding another : when two or more are already present. The URL could be invalid in the first place (the user provided it as such), but it doesn't seem Helm is doing any validation for the repository key, try using repository: mailto:foo@bar.com or repository: https:://..., you'll only get an error message that the chart wasn't found, that's it.

}

downloadURL = u.String()
name = fmt.Sprintf("%s-%s.tgz", parts[0], parts[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving the same example:

repository: oci://localhost:5000/testrepo/testchart:1.0.1:1.2.3
version: 1.0.0

How is this considered valid input? How is the chart downloader able to determine that parts[1] is the right version? How is the user supposed to know which version Helm will choose?

This makes me go back to the original question: why are we accepting tags in the URL path in the first place?

Copy link
Author

@phroggyy phroggyy Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that shouldn't be valid, as that repository tag is invalid. As mentioned in the other comment – I can add validation for that if desired.

As to why we're accepting them: supporting the same format as helm chart save (as you mentioned in #8332 (comment)). I think it's perfectly fine to drop support for arbitrary tags, but then we should do that for helm chart operations as well.

I can do that cleanup as a separate task if desired so it's done all at the same time.

@pmalic
Copy link

pmalic commented Jun 30, 2020

I've been following this PR for months and have now finally tried storing charts in a container registry. Want to share my experience, could be useful. It's an enterprise app suite that consists of eight charts, one parent chart that has no templates and seven subcharts that represent different apps/components that comprise this app suite. Each of these subcharts comes from a different team, so the parent chart has all of them listed as dependencies in Chart.yaml.

In short, it works! :) Thanks @phroggyy for all the hard work and patience 👍

The reason why we were interested in this PR in the first place is, of course, the ability to store the charts together with container images. The limitations with chart versioning & tags are not a problem in our case. We basically have no need for anything beyond limited version matching that already works. Yes, retrieving the list of tags via /v2/<name>/tags/list and doing version matching (e.g. ^1.2.3, '>= 1.2.3 < 1.5.0', …) would be cool, but I think you can live without that unless you want to run a public repo with charts from different unrelated parties that are not in sync (i.e. communicating with each other) or you’re a big company with a large number of teams. I mean, who’s most likely to start using container registries to store charts, I think people/teams who want to simplify their infrastructure, and I’d say these tend to be smaller teams.

Having said that, if you do have plans to later add support for retrieving tags and doing version matching on them, then IMHO you should not allow tags to be specified in the repository key, because a tag, although mutable, always points to a single image, and that makes the version key somewhat useless. For example:

repository: oci://localhost:5000/testrepo/testchart:1.0.1
version: 1.0.0

This will always retrieve version 1.0.1, the version field in this case could only serve as some “sanity check”, i.e. verify that the chart “behind” the 1.0.1 tag is actually of version 1.0.0. This doesn’t make sense in this particular example, but would make sense here:

repository: oci://localhost:5000/testrepo/testchart:latest
version: '^1.0.0'

This should fail if the chart “behind” the latest tag is of e.g. version 2.0.0.

I’ve read the arguments that disabling tags would make everything less flexible, but you wouldn’t be restricting anything because no one is using this yet :) The flexibility of tags is of great importance with container images, but we’re not talking about container images here. Is there a use case for tagging Helm charts with something other than the version, e.g. stable-alpine? :))) I agree, latest is useful, but you can support it without allowing tags: if you don’t specify version key, then use latest tag (I mean, this is how image referencing generally works, if you don’t specify tag, latest is implied).

So, that's my two cents regarding the version/tag discussion.

For the end, I have a specific question for @phroggyy: When doing helm dep update I’ve noticed that the local cache isn’t really used. By that I mean that if you have a chart in the local cache, but not yet pushed to the container registry, dep update will fail. Also, dep update will always pull the chart from the registry and always overwrite the version in the local cache. This is analogous to the --pull switch of docker client, but this switch is not turned on by default. This is problematic if you’re changing something, you have to push it first before you can use it. This means you have to temporarily use a different tag not used by anyone if you don’t want to accidentally break something for someone. Is there a way to control this behavior? Thanks!

@bacongobbler
Copy link
Member

Given v3.3.0-rc.1 is scheduled for release tomorrow, I'm going to bump this to 3.4 as there is still some open questions with regards to the overall design of this feature.

@bacongobbler bacongobbler modified the milestones: 3.3.0, 3.4.0 Jul 6, 2020
@phroggyy
Copy link
Author

phroggyy commented Jul 7, 2020

@bacongobbler do you have a summary of the current concerns? The ones I've seen are

  • arbitrary tags: happy to just drop this from the PR (but we'll need some kind of cleanup task to remove it from helm chart save)
  • SemVer matching: imo this should be done as a separate improvement. This feature is still experimental which allows us to iteratively improve it rather than have one monster PR
  • using local cache: I can look at this as part of the PR.

It'd be good to have a rough idea of what's needed to get it through (or otherwise to close the PR if it's not close enough to what it "should be"), as I feel I personally have been too deep in the code of this to reasonably determine what makes sense anymore.

@pmalic
Copy link

pmalic commented Jul 7, 2020

This SemVer matching on tags is "challenging". When matching, are you always going to retrieve all the tags for each image from the registry? There could be thousands of them for each dependency, you have to use pagination if the registry has some limit of returned tags per request (e.g. Azure CR has 999 as limit). If you aren't going to retrieve them every time, you have to cache them, but how and where? For "classic" repositories you have a local cache that you can update with repo update, but with OCI, each image would have to be treated like a repository and have its own cache.

@TBBle
Copy link
Contributor

TBBle commented Jul 9, 2020

I agree that SemVer matching is a different challenge and shouldn't block getting the basic dependency-pulling support in.

We'll need to resolve it (even if that resolution is "you can't do that...") before this feature leaves experimental, but if we limit the supported tags to SemVer now, then we aren't blocking off a route to SemVer Matching via tag listing. If a different approach is found, the restriction on tag format can be relaxed less disruptively than it can be introduced if tag listing is the best way forward.

@yurrriq
Copy link
Contributor

yurrriq commented Jul 28, 2020

any update on plans/timeline here? we would love to see #3 (and helm install support). can I help somehow?

@bacongobbler
Copy link
Member

see helm/community#136

@bsuchorowski
Copy link

bsuchorowski commented Sep 18, 2020

Hi @bacongobbler, any news when this is going to be released? We are using an umbrella chart with Helm2 and we would love to switch onto Helm3 as soon as possible.

@mattfarina
Copy link
Collaborator

An alternative implementation has come in via #8843. Removing this from the milestone as we look at the options.

@jdolitsky
Copy link
Contributor

jdolitsky commented Mar 15, 2021

Closing as #8843 was merged

Thanks @phroggyy for all your efforts here, which helped push the conversation forward and resulted in HIP 6

@jdolitsky jdolitsky closed this Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Helm3: Use registry as dependency URL