
KEP: [sig-cluster-lifecycle] addons via operators #746

Merged
merged 3 commits on Mar 26, 2019

Conversation

@justinsb (Member) commented Jan 28, 2019:

No description provided.

@justinsb (Member, Author) commented Jan 28, 2019:

cc @luxas @roberthbailey @timothysc

Got it live before the sig-cluster-lifecycle meeting... just!

@kubernetes/sig-cluster-lifecycle-pr-reviews

@justinsb justinsb force-pushed the justinsb:kep_addon_operators branch from 011b9f5 to 1c1dbae Jan 28, 2019

@neolit123 (Member) left a comment:

this is great, thanks @justinsb

added some comments for discussion.
nothing blocking.

/assign @timothysc @fabriziopandini @luxas


## Implementation History

Addon Operator session given by jrjohnson & justinsb at Kubecon NA - Dec 2018

@neolit123 (Member) · Jan 29, 2019:

link is here for folks that haven't seen it already:
https://www.youtube.com/watch?v=LPejvfBR5_w

@atoato88 · Jan 29, 2019:

Should we add a hyperlink to this string? What do you think?

@justinsb (Author, Member) · Jan 29, 2019:

Will do - great call, thank you!

we would hope these would migrate to the various addon components, so we could
also just store these under e.g. cluster-api.

Unclear whether this should be a subproject?

@neolit123 (Member) · Jan 29, 2019:

i guess it depends if code will be owned in a new repo and by which SIG.
if all the work can happen in k/k and kubebuilder then a subproject is not needed?

but i think this effort needs a WG and a meeting schedule.

@roberthbailey (Member) · Jan 29, 2019:

Is it a WG or subproject? Maybe up for discussion tomorrow?

@errordeveloper (Member) · Mar 7, 2019:

Was this discussed yet?

@justinsb (Author, Member) · Mar 11, 2019:

I don't believe we have discussed - as these operators are code, ideally that code will be owned by the SIGs that own the thing being installed - the X-addon-operator should be developed alongside X. It's not clear if we end up with anything at the cluster-lifecycle scope long-term (particularly if we punt as much as possible to kubebuilder). As the intention is not to be stuck holding the code bag, maybe a working-group makes more sense?

But on the flip-side we likely want a repo to start developing the operators, until we can hand off ownership.

Can a working-group get a repo temporarily?

@timothysc (Member) · Mar 12, 2019:

You can probably nix this statement now.

@justinsb (Author, Member) · Mar 18, 2019:

Thanks - nixed!

least-privilege. But it is not clear how to side-step this, nor that any of
the alternatives would be better or more secure.
* Investigate use of patching mechanisms (as seen in `kubectl patch` and
`kustomize`) to support advanced tailoring of addons. The goal here is to

@neolit123 (Member) · Jan 29, 2019:

patching might make upgrades tricky.

@errordeveloper (Member) · Mar 7, 2019:

Could you please clarify why?

@justinsb (Author, Member) · Mar 11, 2019:

If the underlying structure changes dramatically the patches won't apply. Added a comment about this, and that we'll need some form of error state, and that a good addon won't change structure just for fun!

@jwforres (Member) · Mar 19, 2019:

Is this referring to patching the addons that the operator is managing (the operand), like patching a Deployment?

In this context, patching opens the door to changes that could make the safety of upgrades unpredictable for the operator. Can they patch env vars, attached volumes, or command line arguments for the binary being run in the container? Will you limit what they can / can't patch?

The CRD is the safe API contract. Once the admin strays outside of that and starts patching the generated resources, the operator can only offer a best effort attempt to upgrade. What if a command line flag on the binary needs to be deprecated or changed? This isn't out of the question and has happened before with kube binaries. If the operator is in control and this feature is defined at the API level, then rollout of this flag change can be done smoothly by the operator. If an admin instead has patched the Deployment to use flags that the operator isn't managing, when the new version tries to roll out and the patch is reapplied with an invalid flag, the new pods are going to crashloop. This wouldn't have been caught at the patching step, the patch itself was not invalid, it wouldn't be caught until the middle of the upgrade when things start to fail.
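To make this failure mode concrete, here is a hypothetical strategic-merge patch (all names invented for illustration) of the kind an admin might apply to an operator-managed Deployment. The patch stays syntactically valid across versions, which is exactly why the breakage only surfaces mid-upgrade:

```yaml
# Hypothetical admin patch against an operator-managed Deployment.
# "--legacy-flag" is outside the operator's CRD contract; if a future
# addon version removes the flag, reapplying this (still valid) patch
# yields pods that crashloop during the rollout.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: coredns
        args:
        - "--legacy-flag=true"
```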

authors:
- "@justinsb"
owning-sig: sig-cluster-lifecycle
reviewers:

@roberthbailey (Member) · Jan 29, 2019:

We need some reviewers / approvers before we commit this. Maybe something to discuss tomorrow?

@mattfarina (Member) · Jan 29, 2019:

If this has not been discussed in the SIG already it should be. The SIG should look at who the approvers are (the chairs?)

The related sigs field should also be filled in. I would suggest SIG Architecture as one of them.

@BenTheElder (Member) · Jan 29, 2019:

[This was raised to SIG Cluster Lifecycle in today's meeting]

@jdumars (Member) · Mar 7, 2019:

can someone post a summary of where the SIG is at on this? There's no harm in merging and iterating.

@timothysc (Member) · Mar 7, 2019:

We can chat about it during our next call.
/cc @justinsb

@justinsb (Author, Member) · Mar 11, 2019:

Good point @mattfarina - I'd like to bring it to sig-architecture when we have something concrete to show them.

@bgrant0607 (Member) · Mar 18, 2019:

SIG Arch is working to decentralize decision-making:
https://docs.google.com/document/d/1BlmHq5uPyBUDlppYqAAzslVbAO8hilgjqZUTaNXUhKM/edit#bookmark=id.7lj9uyhho0mu

I suggest discussing in SIG Cluster Lifecycle for now, and invite interested parties to attend.


@mattfarina (Member) left a comment:

The writeup here appears to be more like a scope of work than a targeted enhancement. For example, the goals state:

Explore the use of operators for managing addons

Could this be phrased more around the enhancement direction rather than the scope of the group?


coordination each piece of tooling must reinvent the wheel; we expect more
mistakes (even measured per cluster) in that scenario.

## Graduation Criteria

@mattfarina (Member) · Jan 29, 2019:

These graduation criteria are a bit squishy. If you're going to have them, could you look at the in-progress template updates for some more guidance?

@timothysc (Member) · Feb 1, 2019:

We talked on the sig meeting to help refine this.

@justinsb (Author, Member) · Mar 11, 2019:

Thanks! I realized that graduation criteria are the alpha->beta->GA criteria. So I added some alpha criteria that are adoption-based, and renamed the fuzzy graduation criteria to "success criteria".

@@ -0,0 +1,181 @@
KEP: Addons via Operators

@philoserf · Jan 29, 2019:

All content has to follow the YAML front matter. See https://jekyllrb.com/docs/front-matter/

@timothysc (Member) left a comment:

So I need to ruminate on this, because it seems a bit heavy-weight.

In air-gapped, regulated environments, operators would want tight, explicit control.

/cc @ncdc - thoughts?

reviewers:
- TBD
approvers:
- TBD

@timothysc (Member) · Feb 1, 2019:

Assign me as both reviewer and approver.
also
/assign @luxas @roberthbailey @timothysc


## Summary

We propose to use operators for managing cluster addons. Each addon will have

@timothysc (Member) · Feb 1, 2019:

One thing I struggle with is: do we need a whole CRD for every addon, or is kustomize + a default bundle of YAML sufficient? I'm guessing this would depend on the addon.

@justinsb (Author, Member) · Mar 11, 2019:

CRDs are supposed to be cheap, so I'd say we should have one for every addon - we do expect there will be per-addon-settings, and so a CRD-per-addon lets us actually use all this machinery we've invested so much in.

The hope is that we can make developing the controllers as easy as kustomize + default bundle of YAML though!

I'll add a comment to this effect

@ncdc (Member) · Mar 12, 2019:

Does this mean we'll have e.g. /apis/addons.k8s.io/v1/kubeproxies and /apis/addons.k8s.io/v1/corednses and so on?

@justinsb (Author, Member) · Mar 18, 2019:

Yes (modulo naming and pluralization rules)

@ncdc (Member) · Mar 18, 2019:

Does it seem a bit weird to do that for singleton addons?

@justinsb (Author, Member) · Mar 18, 2019:

Not really to my mind - can you clarify why it's weird?

I do think we should establish patterns for singletons (e.g. "name the instance default"). But I love the progress made on CRDs and think we should make use of them!
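A sketch of that singleton convention, with a placeholder API group and kind (nothing here is a settled API): the CR is named `default` in `kube-system`, and the operator reconciles only that instance.

```yaml
# Illustrative singleton addon CR; the group "addons.x-k8s.io" and the
# CoreDNS kind/fields are placeholders, not a ratified API.
apiVersion: addons.x-k8s.io/v1alpha1
kind: CoreDNS
metadata:
  name: default          # the operator ignores instances with other names
  namespace: kube-system
spec:
  version: 1.3.1
```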

@derekwaynecarr (Member) · Mar 18, 2019:

we have followed a convention similar to what @justinsb described. the operator basically only respects a single instance of a named cluster scoped CR for certain types of operators (not all).

@ncdc (Member) · Mar 19, 2019:

A naming pattern like default works for me.

@pablochacin · Apr 11, 2019:

Maybe I'm missing something here, but doesn't having a different CRD type for each addon make it more difficult to create tools for generic operations, like listing all addons with a given status, or watching for the installation of addons? What approach should be used in those cases?


* Extend kubebuilder & controller-runtime to make it easy to build operators for
addons
* Build addons for the primary addons currently in the cluster/ directory

@timothysc (Member) · Feb 1, 2019:

Could we get more concrete here and enumerate the exact ones?

@justinsb (Author, Member) · Mar 11, 2019:

I put as a proposed list CoreDNS & kube-proxy (needed for conformance), dashboard (demos well), metrics-server (non-trivial), and LocalDNS-Agent (useful). I'm not sure whether you would rather we were more or less ambitious here @timothysc - happy to go either way.


We expect the following functionality to be common to all operators for addons:

* A CRD per addon

@timothysc (Member) · Feb 1, 2019:

I'd really like to see example(s) for common core ones, e.g. proxy.

@errordeveloper (Member) · Mar 7, 2019:

Yes, some concrete examples would help a lot. I think there are a few classes of add-ons, and each class has certain characteristics that are shared within the class.

Here is what I've been thinking about the classification:

One class of add-ons comprises components of Kubernetes that are strongly coupled to Kubernetes releases, such as kube-dns/coredns and kube-proxy. Besides version coupling, you normally wouldn't want to run multiple instances of these at the same time.

A networking component is usually less strongly coupled to the Kubernetes version, but it is more critical to cluster functionality. There is also no standard way of running multiple different networks.

Secondly, there are add-ons that provide functionality that is critical to a user who runs workloads on the given cluster; ingress controllers and storage drivers fall into this category. Some of these things can be seen as operators.

There is also a class of add-ons that extends functionality even further; these are often a combination of components which may or may not also be defined as standalone add-ons from one of the above categories. Istio and knative fall into this category.

Additionally, there is a class of workloads that depends on the Kubernetes version but doesn't provide additional functionality to the cluster directly; observability products, dashboards, deployment tools, and things like security scanners fall into this category.

@justinsb (Author, Member) · Mar 11, 2019:

Added an example.

@errordeveloper you raise some good categorizations - some of these are going to be "apps". For example the nginx-ingress-controller isn't really tied to the k8s cluster lifecycle. We would expect those to be deployed using tools like helm or kustomize or kubectl.

I do think we should figure out what to do about singleton objects. I've been calling them "default" in "kube-system", but there are many approaches and even I only passionately hate about half of them ;-)

I don't really know how to handle "pick one of a set" type operators (i.e. CNI). Operators do let us have better cleanup procedures, but realistically it's highly disruptive to switch. kops started labelling the addons so that we could prune the other ones, but most CNI providers leave some networking plumbing behind (e.g. iptables). We can't easily do a rolling-update in most cases because the network partitions between old-CNI and new-CNI. There are usually supporting infrastructure things e.g. opening firewall rules for IPIP for calico. My preference would be that we can set up the operators to do all they can, but realistically if we get a nice migration path for any CNI providers it'll be because the tooling put in special code to move between a particular transition. We at least have somewhere to put that code now!

@errordeveloper (Member) · Mar 18, 2019:

For example the nginx-ingress-controller isn't really tied to the k8s cluster lifecycle.

AWS ALB ingress controller would only ever work in AWS and needs IAM permissions, so it's tied to a few properties. When it comes to lifecycle and Cluster API, provisioning of such IAM permissions becomes a lifecycle concern.

GKE has built-in ingress controller, and that's currently something GKE user selects as an add-on and it is managed by the add-on manager. This one is actually tied into lifecycle as it gets included during cluster creation, and cannot be removed easily as far as I know.

Also, Heptio Contour ships a CRD, and that has versions dependencies.

I agree that neither of these is tied into cluster lifecycle, but they have dependencies on cluster APIs and provide vital functionality for many users.

We would expect those to be deployed using tools like helm or kustomize or kubectl.

Even if we are to say that some things like ingress controllers are seen as "apps with deep Kubernetes integration", in some cases with dependencies on cloud resources associated with the cluster, how do we expect users to make these distinctions and find the right way of managing such things?

@errordeveloper (Member) · Mar 18, 2019:

network partitions between old-CNI and new-CNI

Not the case with Weave Net, and I don't know when it is practically the case. I would expect any well-designed CNI addon to behave well.
If it is the case, I'd say it's addon-specific, and certainly justifies the need for an addon-specific operator.

There are usually supporting infrastructure things

Seems similar to IAM dependencies problem.

@justinsb (Author, Member) · Mar 18, 2019:

You're right that IAM provisioning likely brings it into cluster-scope; I don't think an ingress controller or something like kube2iam are necessarily deal-breakers there.

I would like to avoid "deploying a CRD => cluster-lifecycle". I know there are versioning concerns, but I would hope we could come up with some reasonable patterns such that in practice helm or something could manage an app that includes CRDs. i.e. although technically there are problems with incompatible CRDs, by being careful about versions we can avoid them.

I think for users, we would expect their installation tooling (kubeadm, cluster-api, etc.) to include or reference a small set of cluster-addons, and for users to follow that lead. I expect kubernetes component developers will naturally identify whether they are really a cluster-addon or an app, and there will be natural pushback against inclusion from the installation tooling if they choose cluster-addon when they are really an app.

If the CNI providers are in fact capable of being switched, that's much easier! We can probably start with simple docs for how to turn on Weave and turn off Calico, and I'm sure Weave will write those docs, just as I'm sure Calico will write the doc in the other direction!

@smarterclayton (Contributor) · Mar 18, 2019:

Who reviews and approves the CRD APIs for these operators? I.e. if kube owns "official DNS installation API", do the core Kubernetes API approvers have veto power over what features DNS can support? How do we determine what features are allowed to be added to the "official DNS installation API"?

We have a lot of experience with the shortcomings of domain-specific APIs like ingress, service load balancer, and storage, where multiple different implementations can have wildly different feature sets. What happens when the N+1th feature addition to the DNS API gets NACKed by the maintainer of the "official DNS installation API"? Does someone create an "unofficial DNS installation API"? Do we support random extensions? Do we take a very hard line and encourage people to fork and write their own operators?

I like the idea of a simple DNS config API. I don't like the idea of the DNS config API becoming just another ingress API problem. I don't see how these APIs won't fall into that trap.

@bgrant0607

@smarterclayton (Contributor) · Mar 18, 2019:

An alternate approach would be:

Someone goes and writes a damn good coredns operator that accepts lots of features. It's not part of the core project, but we make it really easy to install that operator. No API review in core is needed. Scope of Kube doesn't increase.

Since operators are controllers, the best operators will be a simple deployment, RBAC rules, and maybe a secret. Do we need to go beyond that?

@bgrant0607 (Member) · Mar 18, 2019:

Do we need to distinguish the types of addons?

Several of the examples, such as kube-proxy and metrics server (https://github.com/kubernetes-incubator/metrics-server), are components developed as part of the Kubernetes project. Is there a reason why it wouldn't make sense for the owners of those components to also own the operator CRDs for those components?

I'd assume SIG Network would own the DNS-related operators.

On the simple DNS API: Offhand, I'd split the common parameters into the Ingress-like implementation-independent API (or just additional fields on some object[s]), and leave implementation-specific details in the operator CRDs. I haven't seriously looked at what such a split would look like, but candidate fields include those used to generate the default /etc/resolv.conf for containers and those documented as part of kube-dns and CoreDNS configuration:
https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#configure-stub-domain-and-upstream-dns-servers

@bgrant0607 (Member) · Mar 18, 2019:

Do we think we have an "official DNS installation API" today?

* Operators are declaratively driven, and can source manifests via https
(including mirrors), or from data stored in the cluster itself
(e.g. configmaps or cluster-bundle CRD, useful for airgapped)
* Operators are able to expose different update behaviours: automatic immediate

@timothysc (Member) · Feb 1, 2019:

Ahh automatic updates...

@justinsb (Author, Member) · Mar 11, 2019:

I can't say I would recommend it for prod clusters, but totally automated updates of e.g. CoreDNS would be a cool demo!


@k8s-ci-robot (Contributor) commented Mar 7, 2019:

@timothysc: GitHub didn't allow me to request PR reviews from the following users: justinsb.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

We can chat about it during our next call.
/cc @justinsb

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@timothysc (Member) left a comment:

/approve
/hold
Minor comments then fine to merge as provisional
/cc @luxas

We propose to use operators for managing cluster addons. Each addon will have
its own CRD, and users will be able to perform limited tailoring of the addon
(install/don’t install, choose version, primary feature selection) by modifying
the CR. The operator encodes any special logic (e.g. dependencies) needed to

@timothysc (Member) · Mar 12, 2019:

the CRD.

@justinsb (Author, Member) · Mar 18, 2019:

Changing to "the instance of the CRD"

* Create patterns, libraries & tooling so that addons are of high quality,
consistent in their API surface (common fields on CRDs, use of Application
CRD, consistent labeling of created resources), yet are easy to build.
* Build addons for the basic set of components, acting as a quality reference

@timothysc (Member) · Mar 12, 2019:

How do bundles fit under these goals?

@justinsb (Author, Member) · Mar 18, 2019:

So ideally each addon operator references a set of manifests somewhere, rather than baking them into the image, so we don't always have to update the operator (just when the underlying change is non-trivial).

Those manifests could be in any format really. They could be simple yaml files. I do think we should recommend a format.

If we actually pick bundles as our format, this gives us a bit of metadata when pulling from https, but because bundles are CRDs, this also gives us the ability to fetch those manifests from the cluster rather than relying on some https server, i.e. it's "half" of airgap support.
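As a sketch of that split (the field names here are speculative, not part of the KEP), the operator image stays stable while the CR points at where manifests are resolved from, either an https mirror or an in-cluster bundle for airgapped installs:

```yaml
# Speculative sketch only: "manifestSource" is an invented field
# illustrating how an operator might resolve manifests without baking
# them into its image.
apiVersion: addons.x-k8s.io/v1alpha1
kind: KubeProxy
metadata:
  name: default
  namespace: kube-system
spec:
  version: 1.14.4
  # either an https mirror...
  manifestSource: https://example.com/manifests/kube-proxy/
  # ...or a reference to a cluster-bundle CRD stored in the cluster,
  # which is the "half" of airgap support mentioned above.
```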

@smarterclayton (Contributor) · Mar 18, 2019:

Doesn't that mean that the manifests themselves are the artifacts and the image should be embedded in them?

@bgrant0607 (Member) · Mar 18, 2019:

Please not files on a disk that have to be rsync'ed.

@justinsb (Author, Member) · Mar 18, 2019:

Not sure I follow @smarterclayton - manifests do specify an image name, but I'm missing where you're going with that...

@bgrant0607 I don't think we would recommend files on a disk for any production configuration :-)

name: default
namespace: kube-system
spec:
  clusterCidr: 100.64.0.0/10

@timothysc (Member) · Mar 12, 2019:

And I'm guessing use of bundling allows you to customize this.

@justinsb (Author, Member) · Mar 18, 2019:

Yes; other approaches include the operator discovering it by looking for a cluster-api Cluster object, or by having cluster-api create this KubeProxy instance with clusterCidr set. The latter feels more straightforward, and I do like that it is explicit vs magic defaulting, but this might actually vary for different use-cases.

@derekwaynecarr (Member) · Mar 21, 2019:

The KEP is proposing a subproject, it would be good to understand what the dependency orderings are for the project.

Is this orthogonal to the Cluster API?
Is it additive to any conformant Kubernetes cluster?
Can we get a clear statement on intended layering across the various subprojects?

@derekwaynecarr (Member) · Mar 21, 2019:

The KEP implies it’s open for adoption by a plurality of tooling, but the comments make that less clear. Ideally the operator can discover its required context information independent of the installation tooling choice?

@justinsb (Author, Member) · Mar 21, 2019:

@derekwaynecarr the above is really just me speculatively enumerating some of the approaches an operator could take to get a value (clusterCidr). We do a good job of covering all the tooling in sig-cluster-lifecycle (mostly due to the eternal vigilance of @timothysc), so if you're able to attend the meeting, I think we can address your questions there - I'm not entirely sure what you mean by some of them. Even better, it would be great to have your participation in building the right thing!

@derekwaynecarr (Member) · Mar 21, 2019:

@justinsb thanks for clarifying. I asked the question because the Cluster API subproject meeting yesterday was also discussing composition of its APIs moving forward. I think it’s right to handle this as a separate subproject, and I look forward to collaborating.

This particular manifest is pinned to `version: 1.14.4`. We could also
subscribe to a stream of updates with a field like `channel: stable`.

### Risks and Mitigations

@timothysc (Member) · Mar 12, 2019:

I think test and verification for a release is also super important.

@justinsb (Author, Member) · Mar 18, 2019:

Good point - added a paragraph specifically for that. I think if we get to parity with the "embedded manifest" approach that we know and love, that'll help initially. And I think we unlock approaches that are a little more flexible than embedding manifests, but I don't think we need to dictate anything there.


owning-sig: sig-cluster-lifecycle
reviewers:
- @luxas
- @roberthbailey

@detiber (Member) · Mar 15, 2019:

Is @roberthbailey still going to be a reviewer for this considering his role change?

## Proposal

This is the current plan of action; it is based on experience gathered and work
done for Google’s GKE-on-prem product. However we don’t expect this will

@derekwaynecarr (Member) · Mar 18, 2019:

OpenShift has been exploring this space as well.

@bgrant0607 (Member) · Mar 18, 2019:

Is there a summary of your learnings written down?

addons
* Build addons for the primary addons currently in the cluster/ directory, at
least including those required to bring up a conformant cluster. Proposed
list: CoreDNS, kube-proxy, dashboard, metrics-server, localdns-agent.

@derekwaynecarr (Member) · Mar 18, 2019:

there are a number of existing operators in the world for these items.

https://github.com/openshift/cluster-dns-operator
https://github.com/openshift/cluster-monitoring-operator
https://github.com/openshift/cluster-ingress-operator

Can we look at the set of operators that have already been developed and enable integration of those pieces? Plenty of folks are successfully building operators in this space, and we should take that into account for any design approach.

@justinsb (Author, Member) · Mar 18, 2019:

We'll absolutely look at what other people have done in OSS, and it would be great to have you collaborate with us.

@kow3ns (Member) · Mar 21, 2019:

I actually don't see why we couldn't incorporate those (with some SWE effort) as part of the switch over from bash to CRDs.

@derekwaynecarr (Member) commented Mar 18, 2019:

cc @jwforres interested in your perspective here as well.

applications are generally recognized. We aim to bring the benefits of
operators to addons also.

### Goals

@ecordell · Mar 18, 2019:

We've been working on operator lifecycle manager for a couple of years, and it shares a lot of these same goals:

* Define extensions to Kubernetes as operators managing CRDs (or extension apiservers)
* Provide a consistent interface for describing operators and their interactions with the cluster and each other
* Constrain the scope of where an operator is allowed to operate (e.g. cluster-scoped vs. namespace-scoped, or something in between)
* Prevent end-user privilege escalation via operators
* Express dependencies between operators and their provided APIs
* Provide a packaging and installation model that supports offline, air-gapped installs
* Provide optional, automated updates across operators

There might be some places where OLM isn't perfectly aligned with what is needed for managing addons - but it seems pretty close, and we'd love to jumpstart this work by contributing what we've already done and iterating on it to meet the community's needs.

As a first step, we could try to prototype installing today's addons with OLM and see if there are any gaps?

@justinsb (Author, Member), Mar 18, 2019:

Of course - we'd welcome your collaboration!

We expect the following functionality to be common to all operators for addons:

* A CRD per addon
* Common fields in spec that define the version and/or channel

@smarterclayton (Contributor), Mar 18, 2019:

What is a channel? Maintained and supported by the community?

@justinsb (Author, Member), Mar 18, 2019:

A channel is what we're calling a stream of versions. Better names welcome, but I think channel is a good name that wise people wearing red hats use...

I think it would be up to a project e.g. CoreDNS whether to support a channel for their operator and how they would define it. But again, suggestions & recommendations are welcome.
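To make the "common fields" idea above concrete, a sketch of what they might look like in a per-addon CRD follows. This is purely illustrative: the field names (`version`, `channel`) come from the discussion here, and the exact shape (e.g. `spec.version` vs. `spec.version.lock`) is explicitly an open question in this KEP.

```yaml
# Hypothetical sketch of common addon spec fields; not a settled API.
apiVersion: addons.sigs.k8s.io/v1alpha1
kind: CoreDNS
metadata:
  name: coredns
  namespace: kube-system
spec:
  version: 1.3.1    # pin an exact version, or:
  channel: stable   # follow a stream of versions defined by the addon project
```

In practice a user would likely set either `version` or `channel`, not both; reconciling the two is one of the details left to the operator tooling.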

* Build addons for the basic set of components, acting as a quality reference
implementation suitable for production use. We aim also to demonstrate the
utility and explore any challenges, and to verify that the tooling does make
addon-development easy.

@smarterclayton (Contributor), Mar 18, 2019:

I'd add another goal here:

API review, feature addition, and scope of the project should not drastically increase as a result of this KEP.

Or said another way:

We want the minimal solution which doesn't grow the scope of Kube BUT allows Kube-like things to be trivial to add.

The possible APIs for Kube-like components is unbounded and cannot be scoped, so the more API surface we have here the greater the burden on the project.

@smarterclayton (Contributor), Mar 18, 2019:

@liggitt, because we discussed this recently

@justinsb (Author, Member), Mar 18, 2019:

These are CRDs though - they aren't all going to be k8s APIs. We would prefer homogeneity, so we would welcome any input from you and @liggitt on what a "recommended" API should look like. We have a prototype where we defined a few common fields, which feels right - better to have version always at spec.version (I think we can agree). But whether it should be spec.version or spec.version.lock or something else entirely we could definitely use help on!

@bgrant0607 (Member), Mar 18, 2019:

These are good points.

I think they also apply elsewhere, such as the component config effort, exported metrics, Event contents, component log messages, and kubectl commands. We're accruing operational surface area over time.

I think the only way that can be addressed is by enabling the technical oversight of these areas to be distributed, such as by mentoring/shadowing, documenting policies and conventions, writing linters and compatibility tests, etc.

@bgrant0607 (Member) commented Mar 18, 2019:

Some high-level background. (I haven't read the KEP yet.)

Add-on management is a long-standing issue:

It has been exacerbated by 2 developments:

  1. More cluster-lifecycle tools under the project. kube-up, kops, minikube, and kubeadm need and have add-on managers. (Well, we want to delete kube-up.) Whatever we use for end-to-end tests needs a mechanism also.
  2. An increasing amount of functionality expected by default not being built-in to the main components (apiserver, controller manager, scheduler, kubelet, kube-proxy), both due to adding new functionality via extensions and extracting existing functionality using new extension mechanisms (e.g., cloud providers, volume sources).

If we want Kubernetes releases to continue to be usable OOB, we at least need a reference implementation to run DNS, for instance.

I think the Operator approach is interesting because we do want Operators to run reliably and to be reconfigurable.

We do need to be careful about how/where the APIs are used. I'd expect managed services to not necessarily expose add-on management, for instance. And some essential functionality, notably DNS, has multiple implementations in use, so we may want to think about implementation-independent APIs for configuring commonly customized user-visible behaviors.

cc @thockin, since he's been interested in the distro vs. kernel question


tooling
* Useful: users are generally satisfied with the functionality of addon
operators and are not trying to work around them, or making lots of proposals /
PRs to extend them

@smarterclayton (Contributor), Mar 18, 2019:

I think this is a great success criteria and the one I'm skeptical about in the current scope.

Would we do an ingress controller operator? Which ingress controllers would it support? What options on those ingress controllers?

@justinsb (Author, Member), Mar 18, 2019:

If these all have operators, I was imagining there would be an nginx-ingress operator, and a distinct gclb-ingress operator, aws-alb operator etc. Ideally component projects end up owning their operators.

I was not imagining we would try here to define a generic ingress operator that can install arbitrary ingress controllers - it doesn't feel very compatible with the idea of federating responsibility & of self-determination. It also feels like that would be a sig-network project.


```yaml
apiVersion: addons.sigs.k8s.io/v1alpha1
kind: KubeProxy
metadata:
  namespace: kube-system
spec:
  clusterCidr: 100.64.0.0/10
  version: 1.14.4
status:
  healthy: true
```

@smarterclayton (Contributor), Mar 18, 2019:

If I'm a network plugin, am I required to respect this if installed (even though calico / others might provide kube-proxy function natively)? How would a third party network plugin indicate they are ignoring this operator?

@justinsb (Author, Member), Mar 18, 2019:

I don't believe the operator should be required to, no. These are declarations of user intent, and so if the user intent is unsatisfiable I believe the common pattern is to expose it under status if we know it won't work, or - often - simply not work correctly. kube-router might choose to check for kube-proxy and surface a status, for a better UX, but I don't think we would require that.

```yaml
  namespace: kube-system
spec:
  clusterCidr: 100.64.0.0/10
  version: 1.14.4
```

@smarterclayton (Contributor), Mar 18, 2019:

Who decides what versions can be used here? Can this drift from Kube versions, or is it required to be aligned?

@smarterclayton (Contributor), Mar 18, 2019:

Is something like a Calico network plugin able to dictate that they only work with certain versions?

@justinsb (Author, Member), Mar 18, 2019:

Typically the addon itself will define its own numbering scheme. For the particular example of kube-proxy, today it is locked to k8s versions, but that isn't always going to be the case. CoreDNS & calico define their own versioning scheme - kube-proxy is more of a special case here.

I don't think we want to dictate that a calico operator define that it only work with some e.g. k8s versions. I suspect we'll come up with some recommendations for what an operator should do if it is running outside of the tested versions (e.g. surface a warning in status?) or running with a known-incompatible version (e.g. refuse to update & surface an error condition in status?).

What do you think we should do @smarterclayton ? What did OpenShift end up doing here?

```yaml
  clusterCidr: 100.64.0.0/10
  version: 1.14.4
status:
  healthy: true
```

@smarterclayton (Contributor), Mar 18, 2019:

Some things that status would have to represent that aren't covered here:

  1. dimensions of health - conditions, message, structured information for a consumer like Deployment
  2. whether the current version has been rolled out completely and if not, how long before it is

@justinsb (Author, Member), Mar 18, 2019:

Agreed - we plan on developing these together as a community project. It would be great to have your help / input from people that worked on operators in OpenShift!
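A richer status of the kind discussed above might look like the following. This is a hypothetical sketch: the condition types and field names are illustrative, loosely following the conventions used by Deployment status, and are exactly the sort of thing the community project would need to settle.

```yaml
# Hypothetical status shape covering health dimensions and rollout progress,
# rather than a single boolean.
status:
  observedGeneration: 2
  currentVersion: 1.14.3   # version actually rolled out
  targetVersion: 1.14.4    # version requested in spec
  conditions:
  - type: Available
    status: "True"
  - type: Progressing
    status: "True"
    message: "rolling out 1.14.4: 3 of 5 nodes updated"
```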

* Extend kubebuilder & controller-runtime to make it easy to build operators for
addons
* Build addons for the primary addons currently in the cluster/ directory, at
least including those required to bring up a conformant cluster. Proposed

@bgrant0607 (Member), Mar 18, 2019:

I think we need principles regarding which add-ons belong as part of the Kubernetes project. In the past, the project accepted many contributions, which then later became maintenance burdens. I'd suggest we only support add-ons for components developed as part of the project, and that the owners of those components should also own the reference add-ons.

@justinsb (Author, Member), Mar 18, 2019:

Agree with the suggestion. Ideally the "cluster-addons" subproject ends up not "owning" any operators, but rather each addon operator is maintained by the team that writes the addon (on the theory that they know best how to operate their addon).

If we keep the operators sufficiently declarative, we can also make the use of the operator a tooling decision.

affect most/all kubernetes clusters (even if we don’t mandate adoption, if we
succeed we hope for widespread adoption). We can continue with our strategies
that we use for core components such as kube-apiserver: primarily we must keep
the notion of stable vs less-stable releases, to stagger the risk of a bad

@bgrant0607 (Member), Mar 18, 2019:

I don't think we have the notion of stable vs less stable releases today. Individual components, like Core DNS, have become the default once they are considered GA.

@justinsb (Author, Member), Mar 18, 2019:

Right - I'm saying that we would expect that addon X might continue to support 1.1 for a while after introducing 1.2 to avoid the monoculture risk. They might choose to leverage channels for this, for example choosing to define a "stable" channel that stays on 1.1 for a while, while the "beta" channel moves to 1.2 until they feel it has been derisked.

Today this work is all on the kubernetes test & release teams, which feels very centralized.

these patterns.

We hope that components will choose to maintain their own operators, encoding
their knowledge of how best to operate their addon.

@krmayankk (Contributor), Mar 18, 2019:

Is the scope also to have the managed k8s providers like GKE/EKS move to this addon model? It's currently a pain not to be able to easily modify the addons provided by default on the various providers. But at the same time, will a CRD provide enough configurability, or do we just need a way to expose all these addons so they are modifiable by consumers? Everyone has their own unique requirements on which part of each addon they want to modify.

@justinsb (Author, Member), Mar 21, 2019:

It's not a requirement for anyone to move to the model, but we would like to create something that tooling chooses to move to. Figuring out ways to support a balance of modification and control to enable that is in scope.

necessarily be directly applicable in the OSS world and we are open to change as
we discover new requirements.

* Extend kubebuilder & controller-runtime to make it easy to build operators for

@jwforres (Member), Mar 19, 2019:

This is one of the project goals for the operator SDK, bootstrapping new operator projects so that they will be set up to follow best practices. The SDK is based on the controller-runtime project.

least-privilege. But it is not clear how to side-step this, nor that any of
the alternatives would be better or more secure.
* Investigate use of patching mechanisms (as seen in `kubectl patch` and
`kustomize`) to support advanced tailoring of addons. The goal here is to

@jwforres (Member), Mar 19, 2019:

Is this referring to patching the addons that the operator is managing (the operand), like patching a Deployment?

In this context, patching opens the door to changes that could make the safety of upgrades unpredictable for the operator. Can they patch env vars, attached volumes, or command line arguments for the binary being run in the container? Will you limit what they can / can't patch?

The CRD is the safe API contract. Once the admin strays outside of that and starts patching the generated resources, the operator can only offer a best effort attempt to upgrade. What if a command line flag on the binary needs to be deprecated or changed? This isn't out of the question and has happened before with kube binaries. If the operator is in control and this feature is defined at the API level, then rollout of this flag change can be done smoothly by the operator. If an admin instead has patched the Deployment to use flags that the operator isn't managing, when the new version tries to roll out and the patch is reapplied with an invalid flag, the new pods are going to crashloop. This wouldn't have been caught at the patching step, the patch itself was not invalid, it wouldn't be caught until the middle of the upgrade when things start to fail.

### Risks and Mitigations

This will involve running a large number of new controllers. This will require
more resources; we can mitigate this by combining them into a single binary

@jwforres (Member), Mar 19, 2019:

I expect operator owners will want to lifecycle and release new versions of their operators independently from each other. Especially for operators that aren't tied to a particular kubernetes release.

Combining into a single operator binary for all addons also makes your RBAC concerns worse, now there is one operator with access to do everything all of the operands can do.

@justinsb (Author, Member), Mar 21, 2019:

A fair point. I'm also unhappy about the way RBAC permissions "trickle down", where an operator needs to have all the permissions of the addon - I hope we can investigate ways to address that. Is there anything we can learn from OpenShift's work here?

Initial development of the tooling can probably take place as part of
kubebuilder

We should likely create a repo for holding the operators themselves. Eventually

@jwforres (Member), Mar 19, 2019:

I agree with @detiber A single repo for all operators doesn't allow the owners of the addons to build their operators where and how they want. If we know of addon operators that don't need to cycle exactly with a particular kubernetes release, a single repo doesn't seem like the right approach to enabling that scenario.

Define the API contract an operator must fulfill (status/condition reporting, how the operator says if it has rolled out the expected version, etc). Have tooling to make it easy for an operator to bootstrap a project to conform to that API contract.

@justinsb (Author, Member) commented Mar 21, 2019:

@jwforres for some reason I can't reply to some of your comments inline.

For allowing patching, yes it's clear this is the "break glass" option and carries an element of "use at your own risk". The reason for it is that we don't want CRDs that redefine every field in the manifest, lest we end up defining a meta-language in the CRD. The hope is that supporting something like patching allows the CRD to be smaller and still cover the vast majority of users, with the remaining users able to use patching and not be blocked. But also it's much more realistic to support a relatively small surface area API. A large surface area API over an addon with breaking changes will also have breaking changes.

For flags specifically, you're absolutely right that they are brittle - and there's complementary work to establish componentconfig as the recommended approach instead (the component work being done by @luxas @mtaufen @sttts et al)

For the one repo vs multiple repos question, if there are projects that want to adopt the patterns right away, that's great. I was suspecting that we would have to build a few in a single repo, before persuading the projects to move the code into their repo/repos. But I could be wrong, and I welcome the optimism :-)
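As a sketch of the "break glass" patching idea described above (purely illustrative; whether patches would be referenced from the addon CR or applied via a kustomization is undecided), a user might attach a strategic-merge-style patch that the operator lays over its generated manifests:

```yaml
# Hypothetical patch: raise the memory limit on the operator-managed
# Deployment without the addon CRD needing a dedicated field for it.
# Resource names here are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: coredns
        resources:
          limits:
            memory: 256Mi
```

This keeps the CRD surface small while leaving an escape hatch, at the cost of the upgrade-safety caveats raised in the review.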

By creating a CRD per addon, we are able to make use of the Kubernetes API
machinery for those per-addon options.
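For illustration, that API machinery comes almost for free once a CRD like the following is registered (a sketch using the apiextensions v1beta1 API that was current at the time; names are illustrative):

```yaml
# Hypothetical registration of the per-addon KubeProxy type.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: kubeproxies.addons.sigs.k8s.io
spec:
  group: addons.sigs.k8s.io
  version: v1alpha1
  scope: Namespaced
  names:
    plural: kubeproxies
    singular: kubeproxy
    kind: KubeProxy
```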

We will create tooling to make it easy to build addon operators that follow the

@kow3ns (Member), Mar 21, 2019:

Should we just upstream this work into controller-runtime so that both kubebuilder and OperatorKit can leverage it? Or do you have more specific work in mind, i.e. something like AddOn(builder/kit)?

Addons are components that are managed alongside the lifecycle of the cluster.
They are often tied to or dependent on the configuration of other cluster
components. Management of these components has proved complicated. Our
existing solution in the form of the bash addon-manager has many known

@kow3ns (Member), Mar 21, 2019:

So I like the idea of having an add-on manager replace the bash scripts. I think it generally improves our cluster turn-up story for users of OSS Kubernetes who do not have the benefit of leveraging a cloud provider service for their cluster management.

addons
* Build addons for the primary addons currently in the cluster/ directory, at
least including those required to bring up a conformant cluster. Proposed
list: CoreDNS, kube-proxy, dashboard, metrics-server, localdns-agent.

@kow3ns (Member), Mar 21, 2019:

I actually don't see why we couldn't incorporate those (with some SWE effort) as part of the switch over from bash to CRDs.

@k8s-ci-robot k8s-ci-robot added the lgtm label Mar 26, 2019

@luxas (Member) approved these changes Mar 26, 2019:

/lgtm

As discussed and approved in today's SIG Cluster Lifecycle meeting, we're merging this KEP as provisional now, and will follow up in the new cluster-addons subproject of SIG Cluster Lifecycle that was formed to discuss and own the direction of this KEP.

The repo has been set up: https://github.com/kubernetes-sigs/addon-operators
The upcoming meetings have been announced: https://groups.google.com/d/msg/kubernetes-sig-cluster-lifecycle/GA2q8BYXkTo/lg9NWWyAAwAJ
If you want to discuss the implementation details that are still outstanding from this KEP as-is, join us for the meetings starting: Friday, 5th April 10:00 PDT
Meeting notes: https://docs.google.com/document/d/10_tl_SXcFGb-2109QpcFVrdrfnVEuQ05MBrXtasB0vk/edit#
Slack channel: #cluster-addons

Thanks @dholbach and @justinsb for doing the logistics to get this effort started!
See you on next Friday!

@k8s-ci-robot (Contributor) commented Mar 26, 2019:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb, luxas, timothysc


@luxas (Member) commented Mar 26, 2019:

/hold cancel

@k8s-ci-robot k8s-ci-robot merged commit e9f9f97 into kubernetes:master Mar 26, 2019

3 checks passed: cla/linuxfoundation (justinsb authorized), pull-enhancements-verify (job succeeded), tide (in merge pool).

@kalbir referenced this pull request Jul 9, 2019: Second minor release (0.2.0) #992 (merged)
