
Umbrella issue: k8s.gcr.io => registry.k8s.io solution #1834

Closed
21 of 25 tasks
BobyMCbobs opened this issue Mar 25, 2021 · 35 comments
Labels
  • area/artifacts: Issues or PRs related to the hosting of release artifacts for subprojects
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • priority/backlog: Higher priority than priority/awaiting-more-evidence.
  • sig/k8s-infra: Categorizes an issue or PR as relevant to SIG K8s Infra.
  • sig/release: Categorizes an issue or PR as relevant to SIG Release.

Comments

@BobyMCbobs
Member

BobyMCbobs commented Mar 25, 2021

Umbrella issue: k8s.gcr.io => registry.k8s.io solution #1834

This markdown is synced from https://hackmd.io/gN-1GeSpSgyNSvmjKSULbg?edit to #1834 (comment) manually by @BobyMCbobs

Scope: https://github.com/kubernetes/k8s.io/wiki/New-Registry-url-for-Kubernetes-(registry.k8s.io)

Design Doc: https://docs.google.com/document/d/1yNQ7DaDE5LbDJf9ku82YtlKZK0tcg5Wpk9L72-x2S2k/edit (shared w/ dev@kubernetes.io and SIG mailing list)

Board: https://github.com/orgs/kubernetes/projects/77

Draft AIs (action items) that need to be turned into tickets:
https://github.com/orgs/kubernetes/projects/77/views/2?filterQuery=is%3Adraft

What exactly are you doing? (and how?)

@ameukam added the area/artifacts and priority/backlog labels on Mar 25, 2021
@stp-ip
Member

stp-ip commented Mar 26, 2021

Correct link as Github parsed wrong I guess: https://hackmd.io/@TKToYPauRJ-u_mNRBOh4HQ/HJBH3QF4_

@thockin
Member

thockin commented Mar 26, 2021

This finally forced me to disassemble the registry protocol a bit. Interesting. I picked a simple image I know:

$ curl -i https://k8s.gcr.io/v2/git-sync/git-sync/manifests/v3.2.2
HTTP/2 200 
docker-distribution-api-version: registry/2.0
content-type: application/vnd.docker.distribution.manifest.list.v2+json
content-length: 1670
docker-content-digest: sha256:6a543fb2d1e92008aad697da2672478dcfac715e3dddd33801d772da6e70cf24
date: Fri, 26 Mar 2021 22:20:30 GMT
server: Docker Registry
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1572,
         "digest": "sha256:85d203d29623d5e7489751812d628e29d0e22075c94a2e99681ecf70be3977ad",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1572,
         "digest": "sha256:31ba6a8e4f1aad8a9c42d97cac8752aaa0e4a92a5b2a3457e597020645fc6a0c",
         "platform": {
            "architecture": "arm",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1572,
         "digest": "sha256:690188a4785caa356d2d98a806524f6f9aa4663a8c43be11fbd9dd5379a01fc9",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1572,
         "digest": "sha256:21487b58352611e67ca033a96f59f1ba47f3e377f5f2e365961c35829bc68ff7",
         "platform": {
            "architecture": "ppc64le",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1572,
         "digest": "sha256:41f3ac440284018ce19b78a8e39a3e99c701a6d7c90fdf7204e180a9715ca7e3",
         "platform": {
            "architecture": "s390x",
            "os": "linux"
         }
      }
   ]
}

I picked the last blob:

$ curl -i https://k8s.gcr.io/v2/git-sync/git-sync/blobs/sha256:41f3ac440284018ce19b78a8e39a3e99c701a6d7c90fdf7204e180a9715ca7e3
HTTP/2 302 
docker-distribution-api-version: registry/2.0
location: https://storage.googleapis.com/us.artifacts.k8s-artifacts-prod.appspot.com/containers/images/sha256:41f3ac440284018ce19b78a8e39a3e99c701a6d7c90fdf7204e180a9715ca7e3
content-type: application/json
date: Fri, 26 Mar 2021 22:21:42 GMT
server: Docker Registry
cache-control: private
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
accept-ranges: none
vary: Accept-Encoding

{"errors":[]}

So maybe this is not as hard as I feared?

If we catch URLs of the form https://reg.k8s.io/v2/<name>/manifests/<tag> we can redirect those to k8s.gcr.io (which has global replicas) or some other "canonical" source for metadata. I don't know if docker clients would trip over a literal redirect or what, so worst case we'd have to proxy that data (yuck).

Then we catch URLs of the form https://reg.k8s.io/v2/<name>/blobs/<digest> and redirect to one of the backends. As you point out, we have to do the split-horizon (geo IP) ourselves (yuck).

What I don't know is what tools or public IP databases or other resources are available for the 2nd part. The more we can outsource, the better. But a proof-of-concept would be cool!

I spent a bit of time trying to coax the Google cloud LB to distinguish /v2/<name>/manifests/<ref> from /v2/<name>/blobs/<digest> so the 1st part could simply be a cloud LB rule. Alas it only matches on prefixes. It might be possible to use Content-Type or Accept headers to tell the difference (suggested: match Accept header with blob mime type). If we could do that, then the only thing we'd have to own would be the 2nd part.

I suspect that a model which requires providers to answer our DNS will be more difficult overall.
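
For illustration only (a fuller prototype appears later in this thread), a minimal Go sketch of that manifests-vs-blobs split might look like the following; nothing here is from an actual implementation, and the regexes are just one way to express the two URL shapes:

package main

import (
	"fmt"
	"regexp"
)

// Manifest requests look like /v2/<name>/manifests/<tag-or-digest>,
// blob requests like /v2/<name>/blobs/sha256:<64 hex chars>.
var (
	reManifest = regexp.MustCompile(`^/v2/.+/manifests/[^/]+$`)
	reBlob     = regexp.MustCompile(`^/v2/.+/blobs/sha256:[0-9a-f]{64}$`)
)

func classify(path string) string {
	switch {
	case reManifest.MatchString(path):
		return "manifest: redirect to the canonical registry (k8s.gcr.io)"
	case reBlob.MatchString(path):
		return "blob: pick a backend (GeoIP) and redirect there"
	default:
		return "unknown: 404"
	}
}

func main() {
	fmt.Println(classify("/v2/git-sync/git-sync/manifests/v3.2.2"))
	fmt.Println(classify("/v2/git-sync/git-sync/blobs/sha256:41f3ac440284018ce19b78a8e39a3e99c701a6d7c90fdf7204e180a9715ca7e3"))
}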

@BobyMCbobs
Member Author

Correct link as Github parsed wrong I guess: https://hackmd.io/@TKToYPauRJ-u_mNRBOh4HQ/HJBH3QF4_

@stp-ip, thank you. I've updated the description

@justaugustus
Member

Great to see this discussion happening!

A few things I'd like to see:

These discussions/decisions impact release delivery, so I'd really love to see them happening in venues where @kubernetes/release-managers are hanging out.

@BobyMCbobs
Member Author

@thockin, thank you for your input!

If we catch URLs of the form https://reg.k8s.io/v2/<name>/manifests/<tag> we can redirect those to k8s.gcr.io (which has global replicas) or some other "canonical" source for metadata. I don't know if docker clients would trip over a literal redirect or what, so worst case we'd have to proxy that data (yuck).

I suspect that clients may be fine with redirects

Then we catch URLs of the form https://reg.k8s.io/v2/<name>/blobs/<digest> and redirect to one of the backends. As you point out, we have to do the split-horizon (geo IP) ourselves (yuck).

I'm unsure whether Google Cloud DNS or load balancers can achieve this; community-hosting it might be the option (I'm still investigating alternatives).

I spent a bit of time trying to coax the Google cloud LB to distinguish /v2/<name>/manifests/<ref> from /v2/<name>/blobs/<digest> so the 1st part could simply be a cloud LB rule. Alas it only matches on prefixes. It might be possible to use Content-Type or Accept headers to tell the difference (suggested: match Accept header with blob mime type). If we could do that, then the only thing we'd have to own would be the 2nd part.

Would this mean declaring a rule that rewrites the URL and redirects to a DNS name using split-horizon DNS, which then resolves to a blob server at the nearest cloud provider?

@BobyMCbobs
Member Author

@justaugustus, appreciate your comments!

Great to see this discussion happening!

A few things I'd like to see:

Thank you, I'll take a read of the KEP.

Totally [epic], I'll check it out as well

Absolutely! I've got the two proposals for either Distribution or Harbor. Both are wonderful pieces of software.

  • an idea of intended assignees from the WG K8s Infra side (I'm on point for SIG Release)
  • feedback from @kubernetes/sig-release-leads @kubernetes/release-engineering

That would be lovely!

These discussions/decisions impact release delivery, so I'd really love to see them happening in venues where @kubernetes/release-managers are hanging out.

I'll get in contact with folks regarding this issue. I look forward to coordinating a solution with y'all 😃

@thockin
Member

thockin commented Mar 29, 2021 via email

@BobyMCbobs
Member Author

Either 302 redirect to blob.k8s.io which uses DNS split horizon (which requires the backends to host certs for that SAN) or 302 to blob.k8s.io which is code we host that does the GeoIP lookup, picks a best backend, and then 302s again to that backend. The advantage of the latter is that the backends don't need special certs.

Would you say that a small webserver to do 302 redirects may be easier or more maintainable than split-horizon?

If we can't coax the cloud LB to do this for us, it starts to look more like:

  1. User pulls foo:bar
  2. Client hits reg.k8s.io/v2/foo/manifests/bar
  3. Receive that at a program we run (nginx or bespoke or ...)
  4. Redirect to k8s.gcr.io/v2/foo/manifests/bar
  5. Metadata fetched
  6. Client hits /v2/foo/blobs/<digest>
  7. Received at same program as step 3
  8. GeoIP lookup, backend select
  9. Redirect to <backend>/v2/foo/blobs/<digest>
  10. Repeat steps 6-9 for each blob
  11. Image is pulled

This is a really clear flow!

@thockin, thank you!
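
(As an editorial aside, not from the thread: a small Go client that refuses to follow redirects can be used to observe each hop in the flow quoted above. The reg.k8s.io hostname is the placeholder used in the flow, the image name is arbitrary, and the digest is left as an explicit placeholder to fill in from a real manifest response.)

package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Disable automatic redirect-following so the Location header from the
	// redirector (steps 4 and 9 above) is visible instead of being chased.
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	digest := "sha256:..." // fill in a digest taken from the manifest response
	urls := []string{
		"https://reg.k8s.io/v2/foo/manifests/bar",    // step 2
		"https://reg.k8s.io/v2/foo/blobs/" + digest,  // step 6
	}
	for _, u := range urls {
		resp, err := client.Get(u)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(u, "->", resp.StatusCode, resp.Header.Get("Location"))
		resp.Body.Close()
	}
}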

@rikatz
Contributor

rikatz commented Mar 31, 2021

Can I take a stab at Cloud Run and check whether, instead of running a machine, a function that does the redirect wouldn't be cheaper (probably not!) and better? :D

Edit: @justinsb fairly pointed out that we could probably run this inside the aaa cluster without much trouble, so yeah, let's see how we can run a redirector inside Kubernetes

@hh
Member

hh commented Apr 8, 2021

I've been searching out a few ASNs for the larger cloud providers that likely hit our existing infra.
Once we use these to understand which providers are costing the CNCF the most, we can approach them about redirecting to a local solution. If anyone from these providers wants to help narrow down which ASNs are part of their cloud offerings, that would help.

@stp-ip
Member

stp-ip commented Apr 8, 2021

There are a few other providers that could generate traffic, but the above felt like a good additional selection of the bigger ones.
Full reference of a list of providers: https://docs.google.com/spreadsheets/d/1LxSqBzjOxfGx3cmtZ4EbB_BGCxT_wlxW_xgHVVa23es/edit#gid=0

Depending on how much traffic comes from those listed above, we can always dig deeper. Let's see what the stats say for the listed providers, and then I'm happy to dig into the smaller ones.

@BobyMCbobs changed the title from "k8s.gcr.io => registry.k8s.io solution" to "Umbrella issue: k8s.gcr.io => registry.k8s.io solution" on Apr 8, 2021
@BobyMCbobs
Member Author

I believe this is the list of ASNs for Equinix Metal:

- 8545 - 9989 - 12085 - 12188 - 14609 - 15734 - 15830 - 15830 - 15830 - 15830 - 15830 - 15830 - 15830 - 16243 - 16397 - 16553 - 17819 - 17941 - 19930 - 21371 - 23637 - 23686 - 24115 - 24121 - 24989 - 24990 - 26592 - 27224 - 27272 - 27330 - 27566 - 29154 - 29884 - 32323 - 32550 - 34209 - 35054 - 43147 - 47886 - 47886 - 54588 - 54825 - 62421 - 64275 - 137840 - 139281 - 264220 - 265376 - 266849 - 270119 - 394749

@BobyMCbobs
Member Author

ASNs in k8s.io repo: #1914

@ameukam added the sig/release and wg/k8s-infra labels on Apr 16, 2021
@thockin
Member

thockin commented Apr 16, 2021

Would you say that a small webserver to do 302 redirects may be easier or more maintainable than split-horizon?

Yes. My thinking is mostly around TLS - if we do split horizon, the real backends have to offer certs for our names. If we 302, they do not. There are a number of GeoIP libs for Go that could be viable. Other than that, the logic seems simple enough to prototype. We could throw it into the aaa cluster as a quick test.
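
For example (an editorial sketch, not something decided in this thread), one such Go library is github.com/oschwald/geoip2-golang with a MaxMind country database. A rough sketch of the backend pick, with a made-up country-to-backend map and local database path, might be:

package main

import (
	"log"
	"net"
	"net/http"

	"github.com/oschwald/geoip2-golang"
)

// Hypothetical country-to-backend map; the real list would come from
// whichever providers end up hosting blob mirrors.
var backends = map[string]string{
	"US": "https://us-mirror.example.com",
	"DE": "https://eu-mirror.example.com",
}

const defaultBackend = "https://k8s.gcr.io"

func pickBackend(db *geoip2.Reader, remoteAddr string) string {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return defaultBackend
	}
	rec, err := db.Country(net.ParseIP(host))
	if err != nil {
		return defaultBackend
	}
	if b, ok := backends[rec.Country.IsoCode]; ok {
		return b
	}
	return defaultBackend
}

func main() {
	// Assumes a GeoLite2 country database file is available locally.
	db, err := geoip2.Open("GeoLite2-Country.mmdb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Blob requests get a 302 to the chosen backend; everything else could
	// keep going to the canonical registry as described in earlier comments.
	http.HandleFunc("/v2/", func(w http.ResponseWriter, r *http.Request) {
		backend := pickBackend(db, r.RemoteAddr)
		http.Redirect(w, r, backend+r.URL.Path, http.StatusFound)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}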

@BobyMCbobs
Member Author

Would you say that a small webserver to do 302 redirects may be easier or more maintainable than split-horizon?

Yes. My thinking is mostly around TLS - if we do split horizon, the real backends have to offer certs for our names. If we 302, they do not. There are a number of GeoIP libs for Go that could be viable. Other than that, the logic seems simple enough to prototype. We could throw it into the aaa cluster as a quick test.

Thank you @thockin for your comments.

Regarding using a service to perform the redirect, the behaviour of something like docker pull registry.k8s.io/{{.Image}}:

  • it will pull and tag the image as registry.k8s.io/{{.Image}}, not as the name it has at the upstream container registry
  • because Envoy is exposed via an Ingress host that has TLS, the TLS setup at the actual container registry doesn't appear to matter

ref: https://ii.coop/blog/rerouting-container-registries-with-envoy/#the-implementation

@justinsb
Member

We could throw it into the aaa cluster as a quick test.

Do you mean deploying https://github.com/kubernetes/k8s.io/tree/main/artifactserver as a test?

@BobyMCbobs
Member Author

BobyMCbobs commented Apr 20, 2021

related: #1758

@BobyMCbobs
Member Author

I deployed Envoy as well as Distribution on a cluster in the k8s-infra-ii-sandbox project from this Org file
https://github.com/cncf-infra/prow-config/blob/master/infra/gcp/README.org#envoy

@justinsb
Member

@BobyMCbobs can we try deploying artifactserver as well?

@BobyMCbobs
Member Author

@BobyMCbobs can we try deploying artifactserver as well?

Yes! I've deployed it to https://artifacts.ii-sandbox.bobymcbobs-oitq.pair.sharing.io at the moment
https://github.com/cncf-infra/prow-config/blob/dc681e5d79d85af47df5f01ebcf281bf193de666/infra/gcp/README.org#artifactserver

I am currently trying to adapt the source to provide the same 302 functionality that Envoy provides.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Aug 18, 2021
@puerco
Member

puerco commented Aug 18, 2021

/remove-lifecycle stale

@thockin
Member

thockin commented Oct 22, 2021

WRT tech stack:

I took this program:

package main

import (
	"log"
	"net/http"
	"os"
	"regexp"
	"strings"
)

func main() {
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}
	log.Printf("listening on port %s", port)
	http.ListenAndServe(":"+port, http.HandlerFunc(handler))
}

func handler(w http.ResponseWriter, r *http.Request) {
	path := r.URL.Path
	switch {
	case strings.HasPrefix(path, "/v2/"):
		doV2(w, r)
	case strings.HasPrefix(path, "/v1/"):
		doV1(w, r)
	default:
		log.Printf("unknown request: %q", path)
		http.NotFound(w, r)
	}
}

var reBlob = regexp.MustCompile("^/v2/.*/blobs/sha256:[0-9a-f]{64}$")

func doV2(w http.ResponseWriter, r *http.Request) {
	path := r.URL.Path

	if reBlob.MatchString(path) {
		// Blob requests are the fun ones.
		log.Printf("v2 blob request: %q", path)
		//FIXME: look up the best backend
		http.Redirect(w, r, "https://k8s.gcr.io"+path, http.StatusTemporaryRedirect)
		return
	}

	// Anything else (manifests in particular) go to the canonical registry.
	log.Printf("v2 request: %q", path)
	http.Redirect(w, r, "https://k8s.gcr.io"+path, http.StatusPermanentRedirect)
}

func doV1(w http.ResponseWriter, r *http.Request) {
	path := r.URL.Path
	log.Printf("v1 request: %q", path)
	//FIXME: look up backend?
	http.Redirect(w, r, "https://k8s.gcr.io"+path, http.StatusPermanentRedirect)
}

...and it acts as a proxy to k8s.gcr.io for docker pull. We can run it in a GKE cluster (or in several around the world). But seeing how trivial this is, there has to be a better way.

So I put it into Cloud Run. Easy. My test project is locked down (org policy, yay), so I can't point you at it, but easy to replicate.

It seems possible to add multiple global backends: https://cloud.google.com/run/docs/multiple-regions

So what are we missing:

  • logic to figure out "best" backends
  • the stuff listed above about productionizing it.

How do we make progress on that?
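
(Editorial aside, not part of the original comment: the handler above can be exercised without any network access via net/http/httptest. A hypothetical test sitting next to the program, in the same package, might look like this; the image name is arbitrary.)

package main

import (
	"net/http/httptest"
	"testing"
)

// A manifest request should be answered with a 308 pointing at the same
// path on k8s.gcr.io, per the doV2 logic above.
func TestManifestRedirect(t *testing.T) {
	req := httptest.NewRequest("GET", "https://example.test/v2/pause/manifests/3.5", nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	if rec.Code != 308 {
		t.Fatalf("expected 308, got %d", rec.Code)
	}
	if got := rec.Header().Get("Location"); got != "https://k8s.gcr.io/v2/pause/manifests/3.5" {
		t.Fatalf("unexpected Location: %q", got)
	}
}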

@aojea
Member

aojea commented Jan 26, 2022

/cc

BenTheElder added a commit to BenTheElder/registry.k8s.io that referenced this issue Jan 27, 2022
αρχείο (archeío) is roughly? Greek for "registry".

Source originally from kubernetes/k8s.io#1834 (comment)

See also: https://docs.google.com/document/d/1yNQ7DaDE5LbDJf9ku82YtlKZK0tcg5Wpk9L72-x2S2k/

Co-authored-by: Tim Hockin <thockin@google.com>
@BenTheElder
Member

see: https://docs.google.com/document/d/1yNQ7DaDE5LbDJf9ku82YtlKZK0tcg5Wpk9L72-x2S2k/ (shared with dev@kubernetes.io mailinglist and the SIG mailing list) for some recent discussion on this topic.

ameukam added a commit to ameukam/test-infra that referenced this issue Feb 7, 2022
Related:
  - kubernetes/k8s.io#1834

Test different AWS AMI with registry-sandbox.k8s.io

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
ameukam added a commit to ameukam/test-infra that referenced this issue Feb 9, 2022
Related:
  - kubernetes/k8s.io#1834

Test different AWS AMI with registry-sandbox.k8s.io

Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
@BobyMCbobs
Member Author

Update 📰 🎉

Redirecting from registry.k8s.io to k8s.gcr.io and to prod-registry-k8s-io-$REGION.s3.dualstack.us-east-2.amazonaws.com is in place, and there is automated replication between the buckets.
There is a registry-sandbox.k8s.io environment for staging, including auto-deploys from main; the staging environment is also used in CI jobs.
The repo for the redirector is available at https://github.com/kubernetes/registry.k8s.io.
It has been a huge effort, with collaboration between many folks in sig-k8s-infra and sig-release.

cc @kubernetes/sig-k8s-infra

@BenTheElder
Member

I think we can close this.

This is at https://registry.k8s.io now and is generally implemented.

What remains is phasing over users, which we're tracking elsewhere.

@BenTheElder
Member

/close

sig-k8s-infra automation moved this from Needs Triage to Done Mar 16, 2023
@k8s-ci-robot
Contributor

@BenTheElder: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
