Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First addon operator integration: CoreDNS #9374

Merged
merged 1 commit into from
May 25, 2021

Conversation

justinsb
Copy link
Member

@justinsb justinsb commented Jun 16, 2020

Hidden behind a feature-flag, but when the UseAddonOperators flag is set,
we now use the cluster-addons CoreDNS operator instead of our built-in
manifests.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 16, 2020
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 16, 2020
@justinsb
Copy link
Member Author

Builds on #8119

@@ -0,0 +1,96 @@
/*
Copyright 2019 The Kubernetes Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Copyright 2019 The Kubernetes Authors.
Copyright 2020 The Kubernetes Authors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly not ;-)

name: coredns-operator-leader-election-role
subjects:
- kind: ServiceAccount
name: default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default service account? Those are some really powerful rights being granted here and below. The effort of separating roles is wasted by not separating out service accounts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - I'll take it up upstream along with the idea of precreating RBAC

spec:
ports:
- name: https
port: 8443
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not 443?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done upstream - moved to 443 with seemingly no consequence so I'll take it up there along with other changes!

@johngmyers
Copy link
Member

I am alarmed by the sheer amount of insecurity and risk that this approach appears to be leading to. Code running with almost unbounded privileges and not a whole lot of segregation. And the point appears to be to download and run code directly from external sources, which is diametrically opposed to what my employer wants.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2020
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. area/channels and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 3, 2020
@justinsb
Copy link
Member Author

So I think we've advanced part of the security posture here. I've hacked up the coredns operator a bit, but if we agree then I think we can upstream those changes.

The idea is that we can (and should) pre-create the RBAC objects that the addon is going to use: the ServiceAccount, the ClusterRole, the ClusterRoleBinding. Because ClusterRoles have to be inherited by the operator if they're going to be created, this doesn't actually cost us anything in terms of updates - updates that need new permissions would always require an operator update; that operator update now includes the new permissions.

This makes the operator now into an application-aware health-checker and configurer for the addon itself (primarily the deployment), which is really the goal.

I personally think this makes the addon operator story good from the point of view of "is every operator running with cluster-admin permissions" - the permissions look pretty reasonable to me... but please take a look @johngmyers !

Now I think there are two things that are not solved, but I think they might be better solved more generally (i.e. I think we should solve them, I'm not sure we should solve them for cluster-addons as they might apply everywhere):

  • How do we restrict usage of the ServiceAccount? i.e. if I can create a pod in kube-system, then can I use any ServiceAccount I find there, and act as it. I suspect I'm just missing something here. In general, it's an argument for more namespaces, we might have problems with DNS specifically because I imagine lots of people expect it to be called kube-dns in kube-system.

  • How do we prevent "non approved" images from running. Here I think kubernetes has a strong answer in ImagePolicyWebhook, but I don't know to what extent this is actually used.

It would be intriguing to actually combine the two, I think: I probably want much stricter controls on things running in kube-system than I do in some myapp application namespace.

I would propose that we proceed with the RBAC pre-creation, with this behind a feature-flag. I think if we treat the ability to modify CoreDNS objects as equivalent to being able to create Pods in kube-system then I don't think we're changing the security stance (?). CoreDNS objects are only mutable by cluster-admin, by default.

To try this out:

export KOPS_FEATURE_FLAGS="ClusterAddons,UseOperators"

kops create cluster addons.k8s.local --zones us-east-2a --channel file://channels/stable

kops edit cluster addons.k8s.local and add to spec:

  kubeDNS:
    provider: CoreDNS
kops update cluster addons.k8s.local --yes
kops export kubecfg addons.k8s.local --admin
kops validate cluster addons.k8s.local --wait=5m

@justinsb justinsb force-pushed the coredns_via_operator branch 2 times, most recently from 3db335d to 5371e5e Compare August 16, 2020 15:27
Copy link
Member

@johngmyers johngmyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The direction seems promising.

I've seen Slack messages from two sites that have moved a bunch of stuff from kube-system to their own namespaces. Kubernetes has fortunately made kube-system a lot less special than it used to be.

The need for the kube-dns Service in kube-system does make CoreDNS particularly troublesome. To site it in a different namespace would require the operator to manually sync the Endpoints from the other namespace (but it would only need rights to Service and Endpoints in kube-system).

Running an admission controller to limit the image registries is pretty easy. Would the image references in these manifests go through remapping in the assets phase or is this an area where lack of templating would cause a problem?

resources:
- '*'
verbs:
- list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it's only list, but the apiGroups and resources are remarkably broad.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concur... I'll track it down upstream

- get
- list
- patch
- update
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is patch/update needed on coredns or is it sufficient to only have it on coredns/status? Does the operator really need to be able to write its own instructions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question - that would be a nice restriction

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I will take this up in the addon operator after this merges)

@justinsb
Copy link
Member Author

justinsb commented Sep 4, 2020

The need for the kube-dns Service in kube-system does make CoreDNS particularly troublesome. To site it in a different namespace would require the operator to manually sync the Endpoints from the other namespace (but it would only need rights to Service and Endpoints in kube-system).

I agree, and maybe we could overcome that (including as you described). I definitely think it's worth trying to really lock down the permissions as much as possible here, both because it'll be a bigger project to move it out of kube-system (not that I'm ruling it out!) and because we're likely to copy these permissions to other operators.

Running an admission controller to limit the image registries is pretty easy. Would the image references in these manifests go through remapping in the assets phase or is this an area where lack of templating would cause a problem?

So we can make sure that the operator manifests go through remapping (if they're not doing so already) but for the manifests the operators deploy, by default I'd imagine they are not going to be sourced from something kops-controlled. We could make them kops-controlled (indeed, that might not be a bad thing from a reliability and security point of view). I'm proposing here that we would mirror upstream manifests, not that we would ourselves take on the definitions of those manifests. I'm not sure if we would do that mirroring in the kops repo (or some other similar place) or whether we would have the CLI mirror on the fly. I'm leaning towards the mirror-on-the-fly approach.

The other option for image mirroring is that we use the image mirroring functionality that is in containerd. Last I checked this would still go to the primary location on a mirror "cache miss", which was not really what I wanted.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 4, 2020
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 4, 2020
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 28, 2020
@justinsb justinsb force-pushed the coredns_via_operator branch 3 times, most recently from 83c1790 to d3e8d0a Compare December 7, 2020 14:14
@justinsb
Copy link
Member Author

justinsb commented Dec 7, 2020

cc @johngmyers I've updated this so that the operator has much fewer permissions.... using a combination of name filtering and pre-creationg of RBAC. (Sadly RBAC name filtering doesn't work on create).

There's also #10381 which proposes a mode where we offer client-side expansion of these operators (i.e. there's no operator running on the server, but we're still able to pull manifests out of the kops binary)

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 14, 2020
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 20, 2021
@justinsb justinsb changed the title WIP: First operator integration: CoreDNS First addon operator integration: CoreDNS Mar 20, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 20, 2021
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Mar 20, 2021

@justinsb: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kops-verify-cloudformation addaeac link /test pull-kops-verify-cloudformation
pull-kops-verify-hashes d3e8d0a link /test pull-kops-verify-hashes

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 21, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 21, 2021
@justinsb
Copy link
Member Author

So I'd like to suggest that we merge this and continue to improve it behind the feature-flag. It's particularly difficult to make changes here because we're working across multiple repos. As it stands, everything is behind the UseAddonOperators feature flag

Before removing it from the feature-flag, I/we will:

  • Split out the RBAC roles for the operator so that the operator has vastly fewer permissions
  • For other operators, try to run in dedicated namespaces (coredns is tricky because of the kube-dns service)
  • Support an optional no-operator mode, using the client-side operator evaluation technique in WIP: Demonstrate client-side mode for CoreDNS operator #10381.
  • Look into whether it's possible to produce a sandbox that doesn't require docker in the WIP: Demonstrate client-side mode for CoreDNS operator #10381 approach. Likely Linux only, but in particular would address the "run in a container" problem. (Maybe not a blocker for removing the feature-flag?)

What do you think @johngmyers ? (I think others are on board...)

@justinsb
Copy link
Member Author

In office hours on Friday, @johngmyers kindly agreed that we should move forward.

That said, I think i'm going to target this to the 1.22 branch. I don't think that's as far away as it sounds, with 1.20 likely going out in the next few days (#11021 being the last known blocker)

If anyone disagrees though, that gives people a bit more time!

/milestone v1.22

@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone Mar 27, 2021
Hidden behind a feature-flag, but when the UseAddonOperators feature
flag is set, we now use the cluster-addons CoreDNS operator instead of
our built-in manifests.
Copy link
Member

@hakman hakman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 25, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 25, 2021
@k8s-ci-robot k8s-ci-robot merged commit 6d65087 into kubernetes:master May 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api area/channels cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants