Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sig-cluster-lifecycle: KEP for communicating a local registry #1757

Merged
merged 1 commit into from Jun 10, 2020
Merged

sig-cluster-lifecycle: KEP for communicating a local registry #1757

merged 1 commit into from Jun 10, 2020

Conversation

nicks
Copy link
Contributor

@nicks nicks commented May 8, 2020

This proposal came out of kubernetes-sigs/kind#1543.

I've integrated some suggestions from the sig-cluster-lifecycle meeting on 5/5/2020 that I really liked.

Issue: #1755

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 8, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @nicks!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @nicks. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 8, 2020
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels May 8, 2020
@nicks
Copy link
Contributor Author

nicks commented May 8, 2020

@neolit123 thank you for your help so far getting this discussion started! Are you the right person to review this, or is there a better candidate?

@neolit123
Copy link
Member

/ok-to-test
@kubernetes/sig-cluster-lifecycle-pr-reviews

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 8, 2020
@neolit123
Copy link
Member

neolit123 commented May 8, 2020

pull-enhancements-verify — Job failed.

usually this requires calling make and committing the extra diff. the tooling contains something that adjusts TOCs on KEPs.

@neolit123 neolit123 removed the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label May 8, 2020
@k8s-ci-robot k8s-ci-robot added the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label May 8, 2020
Copy link
Contributor Author

@nicks nicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, i was confused by something else i read. Thanks!

-->

Many local clusters currently support local registries. But they put the onus of
configuration on the user. First, the user has to follow a shell script to setup

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "shell script" is only accurate in the case of kind. microk8s, minikube, and k3d bake support into the binary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even know minikube supported this! if only there was an automatic discovery mechanism where I could have been notified about it! 😈

I can also file an issue on minikube pointing to this KEP, to get feedback from anyone following that repo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will not be the case in kind either very soon :-)

@tstromberg
Copy link

Sounds like a great proposal. I'd be happy to ensure that whatever the final state that folks agree upon is implemented for minikube's registry addon.

interact with it. They think deleting the ConfigMap would delete the local
registry (which it does not).

- There is no automatic mechanism for keeping the RegistryStatus up-to-date
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not necessarily true?
(though it seems perhaps somewhat unreasonable to run a controller just for this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified this a bit - more that this KEP doesn't specify one.


How will security be reviewed and by whom?

How will UX be reviewed and by whom?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justaugustus what is a response to this line supposed to look like?

@BenTheElder
Copy link
Member

Generally SGTM, happy to implement this. Thanks for putting this together and meeting with everyone!

Copy link
Contributor Author

@nicks nicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BenTheElder ! Added some clarifications

interact with it. They think deleting the ConfigMap would delete the local
registry (which it does not).

- There is no automatic mechanism for keeping the RegistryStatus up-to-date
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified this a bit - more that this KEP doesn't specify one.

@k8s-ci-robot k8s-ci-robot added the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label May 22, 2020
@nicks
Copy link
Contributor Author

nicks commented May 22, 2020

Thanks!

  • 4th of June sounds good to me

  • I'll also send a note to the SIG CL mailing list

  • Maintainers from both the Kind and Microk8s teams have thumbs-up'd the latest version of this proposal. The Minikube and K3d maintainers have seemed generally positive, but still need to confirm they've seen this version.

  • The KEP template talks about a bunch of process things, like marking the status: implementable, removing the template comments, checking off the release checklist. Do I do those in this PR, or a follow-up PR?

@bparees
Copy link

bparees commented May 22, 2020

Openshift has defined some fairly extensive apis around registry configuration for components running on the cluster that need to talk to registries, things like:

  1. where to get CAs for a given registry
  2. which registry hostnames should be whitelisted/blacklisted
  3. which registries should be allowed to be accessed as insecure

It's heavierweight than what is proposed here, but might be more generally useful to adopt some of the apis it includes.

https://github.com/openshift/api/blob/master/config/v1/0000_10_config-operator_01_image.crd.yaml

openshift also defines an imagecontentsourcepolicy which defines valid mirrors for a primary registry, which means components can go to the mirror if the primary is unavailable. Again, a concept potentially worth including in an api like this:

https://github.com/openshift/api/blob/master/operator/v1alpha1/0000_10_config-operator_01_imagecontentsourcepolicy.crd.yaml

of course it is not intended solely for internal/on-cluster registries.

@neolit123
Copy link
Member

neolit123 commented May 22, 2020

@nicks

Maintainers from both the Kind and Microk8s teams have thumbs-up'd the latest version of this proposal. The Minikube and K3d maintainers have seemed generally positive, but still need to confirm they've seen this version.

sounds good.

The KEP template talks about a bunch of process things, like marking the status: implementable, removing the template comments, checking off the release checklist. Do I do those in this PR, or a follow-up PR?

we can amend this KEP PR last minute before the merge to "implementable". as far as checklists, it's fine to mark the items that are covered in that same last amend.

something else that we may want to include in the KEP text body is links to the tracking issue in this repository and the SIG mailing list, so that users can find where to send feedback more easily.

@bparees

thanks for these links!

i have not seen what projects/products out there are already doing on this topic, so openshift's perspective seems quite interesting. this proposal seems to be more in the lines of a minimal-viable option for now and already talks about CRDs as alternatives. mirrors are also interesting, so maybe that is something that should be considered for the v1.

@nicks WDYT?

@nicks
Copy link
Contributor Author

nicks commented May 22, 2020

Thanks @bparees !

Ya, I had seen these a while back. My understanding is that these are for configuring the container runtime inside the cluster, e.g., configuring registry mirrors in the container runtime.
https://docs.openshift.com/container-platform/4.4/openshift_images/image-configuration.html#images-configuration-registry-mirror_image-configuration

That's a complementary problem to what this KEP is trying to solve.

This KEP is about giving tools outside the cluster a way to discover local, insecure registries that have already been configured by other tools. But we're assuming a world where every cluster has a different way to configure it.

I added a link to the OpenShift bits to this part of the KEP:

### Non-Goals

- Any API for configuring a local registry in a cluster. If there was a standard
  CRD that configured a local registry, and all implementations agreed to
  support that CRD, this KEP would become moot. That approach would have
  substantial technical challenges. OpenShift supports [a
  CRD](https://docs.openshift.com/container-platform/4.4/openshift_images/image-configuration.html)
  for this, which they use to configure registries.conf inside the cluster.

does that match your understanding?

@bparees
Copy link

bparees commented May 23, 2020

My understanding is that these are for configuring the container runtime inside the cluster, e.g., configuring registry mirrors in the container runtime.

primarily, but the config can be consumed by any component. In particular it is also consumed by the build component (which does its own pulling of images) and the imagestream import component (which imports metadata about images, from external registries).

This KEP is about giving tools outside the cluster a way to discover local, insecure registries that have already been configured by other tools.

fair enough, that is definitely a different scenario, i was under the impression it was about telling components running on the cluster, how to consume images from certain known registries.

in any case, whatever the outcome from this i think openshift (which also provides its own on-cluster registry) would certainly be interested in using this mechanism to publish information about the on-cluster registry. (Today it does publish information about the on-cluster registry in the status field of that image-config object, but it does not provide CA information or other details). I've pointed our registry component maintainers to this proposal for their awareness as well.


Local clusters like Kind, K3d, Minikube, and Microk8s let users iterate on Kubernetes
quickly in a hermetic environment. To avoid network round-trip latency, these
clusters can be configured to pull from a local, insecure registry.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenShift provides a local secured registry by default. It's secured by a self-signed CA, but all internal components know that the registry uses this CA, and this CA may be exported for external components.

Is there any reason not to add insecure and caBundle to the proposed config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmage Oooh that's a deep question! I don't think there's any active harm in putting it there. But what would you expect to happen differently if we did? i.e., what should we tell the consumer of the ConfigMap to do with this info?

I can think of a few different possibilities!

  1. The cluster configuration script has already set up your local docker config / registries.conf with the caBundle so that pushing to the registry will work out-of-the-box. So this is just an FYI to the IDE / dev env reading this ConfigMap. They don't need to take any action.

  2. When the IDE / dev env reads the ConfigMap, it's responsible for updating the docker config / registries.conf itself with the caBundle (if it can).

  3. When the IDE / dev env reads the ConfigMap, it's responsible for checking that the docker config / registries.conf has this information, and refusing to push to that registry if it's not and/or giving the user instructions on how to set it up.

I think I was assuming that we live in world (1) - that the tooling setting up the local cluster has enough information to set up the configs itself, and doesn't need to push the burden downstream to other tools (or even worse, pushing it onto the human)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

friendly ping in case you missed this comment

the more i think about it, we're probably better off if we omit the fields, rather than add the fields and all consumers ignore them.

maybe a better solution is to add some text in the spec like "if pushing to the registry fails, the consumer should include the help link in the error message to the user". Then cluster providers can use the help link to show users how to configure the registry caBundle

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to add insecure and caBundle to the proposed config?

some questions from me here:

  • when a CA certificate is exposed in a public space (e.g. kube-public NS) this can be used for clients to verify the server they are trying to reach is the one they want. this gives clients the chance to trust this server. but given anyone can get the CA, the server cannot trust all clients. is that the intention here?
  • what is the expected / suggested behavior if insecure (assuming it's a bool) is false but caBundle is empty?
  • would caBundle be base64 encoded?

@nicks
Copy link
Contributor Author

nicks commented May 26, 2020

@bparees Sorry! I should have pinged openshift or crc teams about this proposal. Let me also file an issue against CRC for them to weigh in on this, since they also have a local registry solution
https://github.com/code-ready/crc/wiki/Adding-an-insecure-registry

The motivation behind this KEP is that I maintain a decision tree in Tilt that tries to do cap detection against different types of local dev clusters and the registries they support. Adding new cluster types is kind of a pain. People have asked us to add openshift/crc detection. So I'm hoping that having a shared standard will make it easier to support on both sides

@medyagh
Copy link
Member

medyagh commented Jun 3, 2020

CC: I think we should include @afbjorklund in this discussion

@nicks
Copy link
Contributor Author

nicks commented Jun 3, 2020

CC: I think we should include @afbjorklund in this discussion

Agreed! We had some good discussion on the minikube repo: kubernetes/minikube#8101
and I've folded some of the feedback there into this proposal

@nicks
Copy link
Contributor Author

nicks commented Jun 4, 2020

I updated this proposal with details on how to handle secure registries with self-signed CAs. For now, the proposal recommends that we push this burden on the user and document it at the help url.

That's at least better than the status quo, and allows us to move forward without having to agree on a common standard for how to load certificates into docker or buildah or whatever.

@neolit123
Copy link
Member

the deadline we appointed for merging this passed, but let's have a final wait until Wednesday next week.

@medyagh @afbjorklund @dmage
please review and comment further if needed.

@afbjorklund
Copy link

I don't see a problem with the minikube "registry" addon implementing this ConfigMap (or even CRD if it comes to that), the main feedback was that most users would not be deploying such a local registry...

@neolit123
Copy link
Member

please squash the commits to 1, then LGTM.

@nicks
Copy link
Contributor Author

nicks commented Jun 10, 2020

squashed!

@neolit123
Copy link
Member

ok, i'm merging this document as something that multiple parties agreed on as the initial specification for local cluster registries. we are considering this as something that multiple parties gave their +1 about and saw no major objections.

please note this is a living document and if you wish to send PRs for it, please feel free to do so.
for general discussion and comments either use #1755 or the SIG Cluster Lifecycle mailing list.

things that can be easily amended:

  • context about what projects are currently doing.
  • outdated links.

things that can change over time, but will have pending discussion:

  • addition of new specification and structures.
  • overall scope of the document.

things that should no change:

  • implementation details for specification that is already merged - e.g. localRegistryHosting.v1

thank you very much @nicks for writing the proposal and enduring the discussion and review.
well done.

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 10, 2020
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2020
@k8s-ci-robot k8s-ci-robot merged commit f2af06b into kubernetes:master Jun 10, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Jun 10, 2020
@nicks nicks deleted the nicks/local-registry branch June 10, 2020 14:55
@neolit123 neolit123 removed the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants