Move Helm Downloads #5663

Closed
mattfarina opened this issue Apr 30, 2019 · 42 comments

9 participants
@mattfarina
Collaborator

commented Apr 30, 2019

Helm downloads are currently served out of a bucket in GKE owned by the Kubernetes people at Google. This bucket is not part of the Kubernetes infra in the CNCF, which is where Kubernetes is migrating its things. The bucket is not under the control of the Helm project.

We should migrate downloads/assets to a new location run by the Helm project. Two possibilities immediately come to mind:

  1. GitHub Releases
  2. An object storage bucket owned by the Helm project and behind custom DNS (e.g., downloads.helm.sh). The custom DNS would allow us to move the backend location later without impacting end users.

The pros of using GitHub Releases are that a) they are free to use and b) someone viewing the releases can find all the files. The cons of GitHub Releases are the extra complication in automating the file uploads and in the scripts that need to find the right file to download and install.

The pros of a custom storage bucket + DNS are that a) it's more under our control and b) the scripts we have will be less complicated. The con is cost, but we could do this on the existing Helm Azure account, where we have the room.
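
To illustrate the scripting cost mentioned for GitHub Releases, here is a minimal sketch of the lookup step an install script would need when it starts from a release's asset listing. The function name `pick_asset` and the asset list are hypothetical, and fetching the listing itself is out of scope; this is not the project's actual install script.

```python
def pick_asset(asset_names, os_name, arch):
    """Return the one asset whose name ends with -<os>-<arch>.tar.gz.

    asset_names is assumed to be the list of file names attached to a
    single release (hypothetical input; how it is fetched is not shown).
    """
    suffix = f"-{os_name}-{arch}.tar.gz"
    matches = [name for name in asset_names if name.endswith(suffix)]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one asset for {os_name}/{arch}, found {len(matches)}"
        )
    return matches[0]


# Hypothetical listing for one release, used only for illustration.
assets = [
    "helm-v2.13.1-darwin-amd64.tar.gz",
    "helm-v2.13.1-linux-amd64.tar.gz",
    "helm-v2.13.1-linux-amd64.tar.gz.sha256",
    "helm-v2.13.1-linux-arm64.tar.gz",
]
print(pick_asset(assets, "linux", "amd64"))
```

With a bucket behind a fixed hostname, this lookup is unnecessary because the script can build the file URL directly from the version and platform.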

We've been asked when we can move off the Google-managed infrastructure, so this is something we need to do.

Any thoughts on the alternative path to start using?

@mattfarina

Collaborator Author

commented Apr 30, 2019

@bacongobbler

Member

commented Apr 30, 2019

Thanks for bringing this up, @mattfarina.

I'm personally leaning towards object storage served with a CDN. Having control over the distribution of Helm releases would be great, as it lets us manage availability (like in China) and bandwidth (not hogging GitHub's CDN during load spikes).

Additionally, we would be able to get reliable download metrics for our content, including which regions it's being downloaded from. As far as I know, GitHub does not provide those download metrics.
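
As a rough sketch of what per-region download metrics could look like once access records are available, the snippet below counts downloads by region. The record shape (a region code paired with an asset name) and the sample values are hypothetical and are not taken from any real CDN log format.

```python
from collections import Counter


def downloads_by_region(records):
    """Count downloads per region from (region, asset_name) pairs."""
    return Counter(region for region, _asset in records)


# Hypothetical sample records, for illustration only.
sample = [
    ("CN", "helm-v2.13.1-linux-amd64.tar.gz"),
    ("US", "helm-v2.13.1-darwin-amd64.tar.gz"),
    ("CN", "helm-v2.13.1-linux-amd64.tar.gz"),
]
counts = downloads_by_region(sample)
for region, total in counts.most_common():
    print(region, total)
```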

Using the existing Helm Azure account would be great to burn some of that unused budget. :)

Should we decide to go down this route, we should see what we can do about documenting the architecture of the Helm CDN. That way we can quickly rebuild it in case of emergency (or perform maintenance/updates to the system via PR).

@technosophos

Member

commented Apr 30, 2019

@viglesiasce

Contributor

commented May 1, 2019

I'll break up the Matt-fest a bit :)

I think custom DNS is the most important part and should be prioritized as a first step. From there we can move things around as needed. CDN+Object storage sounds good.

@technosophos

Member

commented May 1, 2019

@bacongobbler

Member

commented May 2, 2019

Any name suggestions?

  • downloads.helm.sh
  • download.helm.sh
  • get.helm.sh

I'm leaning towards get.helm.sh.

@bacongobbler bacongobbler added this to Backlog in Helm 3 via automation May 2, 2019

@bacongobbler bacongobbler moved this from Backlog to In Progress in Helm 3 May 2, 2019

@technosophos

Member

commented May 2, 2019

@viglesiasce

Contributor

commented May 2, 2019

@idvoretskyi

Contributor

commented May 3, 2019

@mattfarina

Collaborator Author

commented May 3, 2019

To update everyone... after talking with @idvoretskyi we're going to do three things.

  1. Get get.helm.sh set up to point to the current bucket. Then update the get script and docs.
  2. Investigate a couple of options for where to move hosting. One of the characteristics I want to look for is analytics. We would like to have good analytics on downloads again. Also, we want something available worldwide.
  3. Switch the back end that get.helm.sh points to over to the new location.
@hickeyma

Contributor

commented May 3, 2019

I like the concept of DNS also but I would lean more towards download.helm.sh. I am however ok with get.helm.sh.

Regarding hosting, here is the feedback I got after asking in the community:
CNCF recommends JFrog Artifactory, and it is available for CNCF OSS projects. It seems that gRPC is using it with success.

@technosophos

Member

commented May 3, 2019

If JFrog provides adequate analytics, I'm fine with that.

@bacongobbler

Member

commented May 7, 2019

I've started work on the following:

Get get.helm.sh set up to point to the current bucket. Then update the get script and docs.

Unfortunately, adding a CNAME record pointing get.helm.sh to the current bucket won't work. There are a few more steps involved. I'll look into this.

In other news, I'm currently experimenting with Azure Blob Storage to see what kind of analytics we can get from it. I'll report back once I find out more.

@mattfarina are you following up with JFrog on experimenting with Artifactory? If you are, perhaps in a few weeks we can share our results on a dev call to report our findings. If not, I'd be happy to experiment. I just need someone to show me where to start :)

@hickeyma

Contributor

commented May 7, 2019

Since I raised JFrog, I can investigate it if that's OK. @mattfarina, what do you think, or have you already started investigating?

@bacongobbler

Member

commented May 7, 2019

After more investigation, it looks like more setup is involved to front our existing cloud storage buckets with a custom domain. From Google's docs:

Note: You can use a CNAME redirect only with HTTP, not with HTTPS. To serve your content through a custom domain over SSL, you can set up a load balancer.

That would mean creating a load balancer in an account to which we already have limited billing access.

With that in mind, it might make more sense to cut over sooner rather than later IMO.

@technosophos

Member

commented May 7, 2019

@hickeyma

Contributor

commented May 8, 2019

There is a free JFrog Artifactory Cloud solution on GKE for OSS projects. This is the recommendation from the CNCF for CNCF projects.

I can submit an application for Helm if people think this is a good idea.

@mattfarina

Collaborator Author

commented May 8, 2019

Yesterday I asked @rimusz for some info on Artifactory to learn more.

@hickeyma I would like to host somewhere other than GKE. Google Cloud is not available in China. I would prefer to use a host with worldwide distribution capabilities if we are picking a new setup.

@hickeyma

Contributor

commented May 8, 2019

I would like to host somewhere other than GKE. Google Cloud is not available in China. I would prefer to use a host with worldwide distribution capabilities if we are picking a new setup.

Good point @mattfarina. Did @rimusz provide any feedback on additional hosting?

@rimusz

Member

commented May 8, 2019

JFrog Artifactory Cloud solution is on GCP not GKE :)
We also have BINTRAY FOR OSS there (not running on GCP), I will check if it is available in China.

@jbaruch


commented May 8, 2019

Head of JFrog Developer Relations here. There should be no problem having your instance on Azure instead of GCP.
Please sign up at https://jfrog.com/open-source/#artifactory2 and I'll make sure it is set up right.

@hickeyma

Contributor

commented May 8, 2019

Head of JFrog Developer Relations here. There should be no problem having your instance on Azure instead of GCP.
Please sign up at https://jfrog.com/open-source/#artifactory2 and I'll make sure it is set up right.

@jbaruch Thanks for contacting us and the feedback.

@rimusz Thanks for helping out and getting info on cloud hosting (and for correcting my mistake! :))

@hickeyma

Contributor

commented May 8, 2019

@helm/helm-core-maintainers how would you like to progress? Should I submit an application for the JFrog Artifactory Cloud solution?

Let me know what you think.

@bacongobbler

Member

commented May 8, 2019

In the same time it's taken to figure out how to provision Artifactory, I've already gone ahead and:

  1. Provisioned a storage account in our existing Helm subscription
  2. Set up https://get.helm.sh to point to the CDN (I'd be happy to change this if others feel strongly about this being me "picking a winner" here)
  3. Copied existing Helm 2 assets over to the storage account
  4. Provisioned an SSL certificate for https://get.helm.sh (currently finishing up provisioning as we speak and should be finished in a few hours)
  5. Tested and verified we are receiving valuable metrics from Azure Monitor and Verizon (the Azure CDN provider)

I'll be more than happy to provide a demo tomorrow during the dev call to show how this is all set up.

#5694 provides a way to publish Helm 3 assets over to this infrastructure moving forward. If we want to publish Helm 2 release assets here as well, I'd be more than happy to provide a PR to master, though I imagine we probably want to continue publishing to GCS for backwards compatibility concerns.

@bacongobbler

Member

commented May 8, 2019

As an example, here's what the final destination looks like: https://get.helm.sh/helm-v2.13.1-linux-amd64.tar.gz
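
Based on the example URL above, a download script can build the file location directly from the version and platform. The following is a minimal sketch, assuming the `helm-<version>-<os>-<arch>.tar.gz` naming seen in that URL holds for other releases; the function name `helm_download_url` and its `base_url` parameter are hypothetical.

```python
def helm_download_url(version, os_name, arch, base_url="https://get.helm.sh"):
    """Build a release tarball URL using the naming pattern from the example.

    base_url is a parameter so the storage backend behind the domain
    (or the domain itself) can change without touching the rest of a script.
    """
    return f"{base_url.rstrip('/')}/helm-{version}-{os_name}-{arch}.tar.gz"


print(helm_download_url("v2.13.1", "linux", "amd64"))
# prints "https://get.helm.sh/helm-v2.13.1-linux-amd64.tar.gz"
```

Because callers only depend on the hostname and the file naming, swapping the storage layer behind get.helm.sh would not require changes on the client side.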

@henrynash

Contributor

commented May 9, 2019

imho, experimentation is always valuable, but if JFrog is indeed the recommended way for CNCF projects, then we need to justify why we are NOT following their guidance. Let's minimise divergence here, unless there is a really good reason.

@bacongobbler

Member

commented May 9, 2019

The goal was to evaluate several systems and come up with a workable solution. Artifactory is one solution, Azure is another.

There are several reasons why we should choose Azure in this case, in my opinion:

  1. I already have a working solution readily available that can be swapped out in the future now that we're fronting it with https://get.helm.sh. There's no reason why this should be blocked given that we're in an alpha stage for Helm 3; if Artifactory proves to be the better product for our use case then it's an easy swap with no code changes in Helm.
  2. It meets the criteria: release assets are hosted in a billing account the org maintainers have control over. We have download metrics (and some other fancy metrics available, too). Finally, we have a custom domain with SSL to front it in case we need to move at some point.
  3. Users in China are able to access those resources. It is unknown whether Artifactory can support this based on @rimusz's last comment. It is known that Azure CDN can distribute assets to China, and then there's Azure CDN China. If we find out through Azure CDN's Geography Reports that there's a significant market in China, we can make further investments there.
  4. We have several core maintainers who have experience running production systems on Azure.

The last point is the most critical IMO. At the end of the day, it is the responsibility of the core maintainers to ensure that the system is running smoothly, and we're the ones on call to support this. This decision should be up to the core maintainers to make the call on what system they feel most comfortable going to production with. Given my personal experience being a Microsoft employee, I'm more comfortable supporting Azure in production.

@bacongobbler

Member

commented May 9, 2019

I should also point out that if we want download metrics for the first alpha release, we should land on a solution soon. We're zeroing in on a final release candidate and KubeCon EU is in 2 weeks. We should have something in place by that time.

@technosophos

Member

commented May 9, 2019

@henrynash I talked to the CNCF directly, and they do not prefer or require JFrog. They merely provide JFrog services. We also happen to have Azure services provided for us, as well as Google Cloud. CNCF leaves it up to projects to choose what technology they select. @mattfarina was in the same meeting and can confirm.

@technosophos

Member

commented May 9, 2019

Quick sampling of CNCF projects:

  • Jaeger: GitHub Releases
  • Notary: GitHub Releases
  • Linkerd: GitHub Releases
  • Envoy: DockerHub
  • Prometheus: GitHub Releases
  • Kubernetes: Google Storage
  • Fluentd: Amazon S3
  • Harbor: Google Storage
@bacongobbler

Member

commented May 9, 2019

I did find one project (thanks @hickeyma for the pointer): gRPC. They use Artifactory for https://packages.grpc.io/.

See https://github.com/grpc/grpc/blob/3a26459edcbbc086d385178f464682c7a4157079/tools/internal_ci/linux/grpc_publish_packages.sh#L243-L245

Regardless, the main motivations for moving were:

  1. Getting out of the Kubernetes billing account
  2. Fronting the storage layer with a custom domain (https://get.helm.sh)

The storage layer is just an implementation detail.

@henrynash

Contributor

commented May 9, 2019

So given @technosophos's comment that this isn't a recommendation from the CNCF, my point is moot :-). My general point was that, since I assume we are looking towards graduation as well as an increasingly diverse maintainer membership as we grow, our technology choices should try to follow the patterns that are out there in the CNCF (and other related foundations).

@mattfarina

Collaborator Author

commented May 9, 2019

@rimusz Thanks for the correction and filling things in.

@jbaruch Would you have some time to talk about Artifactory? I have a bunch of questions.

@henrynash Artifactory is an option rather than something that's recommended. In the meeting @technosophos and I were in, we discussed both Artifactory and Azure storage (we already use Azure for some other things). There was no preference given to any solution. They just wanted us to be aware of the options.

I am curious about Artifactory because I'm thinking more holistically than just tgz downloads and I'm interested in statistics. If we were going to go with the simpler solution that is less work for the immediate problem it would be Azure Storage.

@hickeyma

Contributor

commented May 9, 2019

@henrynash I agree with @mattfarina and @technosophos that it is an option and not the only solution to use. Sorry for any misunderstanding arising from the comments above.

@mattfarina

Collaborator Author

commented May 9, 2019

@henrynash We do look at what the other CNCF projects are doing and talk with CNCF folks about it. I also tend to attend the TOC meeting to follow along with any direction they are discussing.

@henrynash

Contributor

commented May 9, 2019

@mattfarina excellent!

@jbaruch


commented May 10, 2019

A couple of points to add to this:

  1. Bintray is not available in China; Artifactory on AWS and Azure is.
  2. If you pick AWS, we can make sure the files are served via CloudFront (as if you used S3 directly).
  3. Artifactory knows how to automatically calculate the Helm index, so there is no need to calculate and upload it.
@jbaruch


commented May 10, 2019

Re CNCF, we already serve some CNCF projects (e.g. gRPC) on Artifactory and look forward to serving more :)

@bacongobbler

Member

commented May 10, 2019

As of #5709 we are now publishing Helm 3 release assets to https://get.helm.sh, so the original ask for this issue has been resolved. I assume we want to keep this open until we evaluate other systems though?

@mattfarina

Collaborator Author

commented May 14, 2019

@bacongobbler Might I suggest we keep this open until we launch the alpha1. After that, we can create a ticket to move if there is a reason, with the reason detailed in the issue for traceability.

@bacongobbler

Member

commented May 14, 2019

Might I suggest we keep this open until we launch the alpha1.

Sounds good to me.

@bacongobbler bacongobbler moved this from In Progress to Done in Helm 3 May 14, 2019

@bacongobbler

Member

commented Jun 4, 2019

3.0.0-alpha.1 was released, so this can be closed.
