doc/proposals: update OLM integration proposal with OperatorGroup logic #2324
Changes from all commits
0fd52c2
4be4309
cec81e4
d43c346
417e531
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,15 +3,13 @@ title: Neat-Enhancement-Idea | |
authors: | ||
- "@estroz" | ||
reviewers: | ||
- TBD | ||
- "@joelanford" | ||
- "@dmesser" | ||
approvers: | ||
- TBD | ||
- "@joelanford" | ||
- "@dmesser" | ||
creation-date: 2019-09-12 | ||
last-updated: 2019-09-12 | ||
last-updated: 2019-12-11 | ||
status: implementable | ||
see-also: | ||
- "./cli-ux-phase1.md" | ||
|
@@ -39,34 +37,49 @@ OLM is an incredibly useful cluster management tool. There is currently no integ | |
|
||
#### General | ||
|
||
* Operator developers can use `operator-sdk` to quickly deploy OLM on a given Kubernetes cluster | ||
* Operator developers can use `operator-sdk` to run their Operator under OLM | ||
* Operator developers can use `operator-sdk` to build a catalog/bundle containing their Operator for use with OLM | ||
- Operator developers can use `operator-sdk` to quickly deploy OLM on a given Kubernetes cluster | ||
- Operator developers can use `operator-sdk` to run their Operator under OLM | ||
- Operator developers can use `operator-sdk` to build a catalog/bundle containing their Operator for use with OLM | ||
|
||
#### Specific | ||
|
||
* `operator-sdk` creates a [bundle][bundle] from an Operator project to deploy with OLM | ||
* `operator-sdk` has a CLI interface to interact with OLM | ||
* `operator-sdk` installs a specific version of OLM onto Kubernetes cluster | ||
* `operator-sdk` uninstalls a specific version of OLM onto Kubernetes cluster | ||
* `operator-sdk` accepts a bundle and deploys that operator onto an OLM-enabled Kubernetes cluster | ||
* `operator-sdk` accepts a bundle and removes that operator onto an OLM-enabled Kubernetes cluster | ||
- `operator-sdk` creates a [bundle][bundle] from an Operator project to deploy with OLM | ||
- `operator-sdk` has a CLI interface to interact with OLM | ||
- `operator-sdk` installs a specific version of OLM onto a Kubernetes cluster | |
- `operator-sdk` uninstalls a specific version of OLM from a Kubernetes cluster | |
- `operator-sdk` accepts a bundle and deploys that operator onto an OLM-enabled Kubernetes cluster | ||
- `operator-sdk` accepts a bundle and removes that operator from an OLM-enabled Kubernetes cluster | ||
|
||
### Non-Goals | ||
|
||
- Replicate mechanisms and abilities of OLM in `operator-sdk`. | ||
|
||
## Proposal | ||
|
||
### User Stories | ||
|
||
**TODO** | ||
|
||
Detail the things that people will be able to do if this is implemented. | ||
Include as much detail as possible so that people can understand the "how" of | ||
the system. The goal here is to make this feel real for users without getting | ||
bogged down. | ||
The following stories pertain to both upstream Kubernetes and OpenShift cluster types. | ||
|
||
#### Story 1 | ||
|
||
I should be able to install a specific version of OLM onto a cluster | ||
|
||
#### Story 2 | ||
|
||
I should be able to uninstall a specific version of OLM from a cluster | ||
|
||
#### Story 3 | ||
|
||
I should be able to deploy a specific version of an Operator using OLM and a bundle directory. | ||
|
||
#### Story 4 | ||
|
||
I should be able to remove a specific version of an Operator deployed using `operator-sdk` via OLM from a cluster. | ||
|
||
#### Story 5 | ||
|
||
I should be able to specify one or more [required manifests](#olm-resources) saved locally or have `operator-sdk` generate them from bundled data during deployment. | ||
|
||
### Implementation Details/Notes/Constraints | ||
|
||
Initial PR: https://github.com/operator-framework/operator-sdk/pull/1912 | ||
|
@@ -75,14 +88,126 @@ Initial PR: https://github.com/operator-framework/operator-sdk/pull/1912 | |
|
||
The SDK's approach to deployment should be as general and reliant on existing mechanisms as possible. To that end, [`operator-registry`][registry] should be used since it defines what a bundle contains and how one is structured. `operator-registry` libraries should be used to create and serve bundles, and interact with package manifests. | ||
|
||
The idea is to create a `Deployment` containing the latest `operator-registry` [image][registry-image] to initialize a bundle database and run a registry server serving that database using binaries contained in the image. The `Deployment` will contain volume mounts from a `ConfigMap` containing bundle files and a package manifest for an operator. Using manifest data in the `ConfigMap` volume source, the registry initializer can build a local database and serve that database through the `Service`. OLM-specific resources created by the SDK or supplied by a user, described below, will establish communication between this registry server and OLM. | ||
The idea is to create a `Deployment` containing the latest `operator-registry` [image][registry-image] to initialize a bundle database and run a registry server serving that database using binaries contained in the image. The `Deployment` will contain volume mounts from a `ConfigMap` containing bundle files and a package manifest for an Operator. Using manifest data in the `ConfigMap` volume source, the registry initializer can build a local database and serve that database through the `Service`. OLM-specific resources created by the SDK or supplied by a user, described below, will establish communication between this registry server and OLM. | ||
|
||
#### OLM resources | ||
|
||
OLM understands `operator-registry` servers and served data through several objects. A [`CatalogSource`][olm-catalogsource] specifies how to communicate with a registry server. A [`Subscription`][olm-subscription] links a particular CSV channel to a `CatalogSource`, indicating from which `CatalogSource` OLM should pull an Operator. Another OLM resource that _may_ be required is an [`OperatorGroup`][olm-operatorgroup], which provides Operator namespacing information to OLM; OLM creates two `OperatorGroup`'s by default, one of which can be used for globally scoped Operators. | ||
OLM understands `operator-registry` servers and served data through several objects. A [`CatalogSource`][olm-catalogsource] specifies how to communicate with a registry server. A [`Subscription`][olm-subscription] links a particular CSV channel to a `CatalogSource`, indicating from which `CatalogSource` OLM should pull an Operator. Another OLM resource that _may_ be required is an [`OperatorGroup`][olm-operatorgroup], which provides Operator namespacing information to OLM. OLM creates a globally-scoped `OperatorGroup` by default, which can be used for globally-scoped Operators. | ||
|
||
These resources can be created from bundle data with minimal user input. They can also be created from manifests defined by the user; however, the SDK cannot make guarantees that user-defined manifests will work as expected. | ||
|
||
#### OperatorGroups and tenancy requirements | ||
|
||
[`OperatorGroup`][olm-operatorgroup] resources configure CSV tenancy across | |
namespaces in a cluster. Each Operator must be a | |
[member][olm-operatorgroup-membership] of one `OperatorGroup` resource in | |
the cluster, which defines a set of namespaces the CSV can exist in. A CSV's | |
`installModes` determine what [type][olm-operatorgroup-installmodes] of | |
`OperatorGroup` it can be a member of. | |
|
||
No two `OperatorGroup`s can exist in the same namespace, and a CSV with | |
membership in an `OperatorGroup` of a type it does not support (determined | ||
by `installModes`) will transition to a failure state. | ||
|
||
Given these rules and constraints, Operator developers may have a tough time | ||
Review comment: The document from here on is not wrong, but I think it should probably do more to indicate that `installModes` should describe the way the operator works and are not things that can be changed "after the fact" without rewriting, etc. If the operator starts up and watches all namespaces, it should be `AllNamespaces`, and should go into an `OperatorGroup` that watches all namespaces. If the operator starts up and watches its own namespace, it should be `OwnNamespace` and go into an `OperatorGroup` that watches its own namespace only. If the operator starts up and watches a single namespace based on an env var, and that env var is wired up to project the [...] Likewise, `MultiNamespace` mode and its configuration are fundamental to the way the operator starts up, and generating an `OperatorGroup` with multiple namespaces will do nothing if the operator doesn't support watching n namespaces. An operator can also support one or all of these, depending on how it is written. A lot of this can be determined based on the properties of the operator itself:
The current proposal is fine and makes sense! But I'm wondering if there's a way that the SDK can "just know" what `OperatorGroup` to make and what `installModes` to make more directly, because of knowledge of what namespaces can be watched by the operator.
Review comment: What about establishing a convention that is based on the order of priority of the following conditions: If `AllNamespaces` is supported, create an `OperatorGroup` that watches all namespaces. |
||
writing an `OperatorGroup` for their Operator initially. To assist them, | ||
`operator-sdk` should automate `OperatorGroup` "compilation" if one is not | ||
supplied. | ||
|
||
To perform compilation, the user can optionally supply the desired install | ||
Review comment: What command is this option added to? I would imagine it's on the command that actually installs/runs? |
||
mode type by which the CSV is installed through an `--install-mode` flag, and | ||
the set of namespaces (may be all namespaces, `""`) in which the CSV will be | ||
installed. For example, `--install-mode=MultiNamespace=[ns1,ns2]` will create | ||
this `OperatorGroup`: | ||
```yaml | ||
apiVersion: operators.coreos.com/v1 | ||
kind: OperatorGroup | ||
metadata: | ||
name: my-group | ||
namespace: my-namespace | ||
labels: | ||
operator-sdk: "true" | |
spec: | ||
targetNamespaces: | ||
- ns1 | ||
- ns2 | ||
``` | ||
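The flag value shown above can be split into an install mode type and a namespace list. The following is a minimal sketch, not the SDK's implementation: the helper name `parse_install_mode` is hypothetical, and the assumption that a bare type (with no `=[...]` part) is accepted is ours. The four type names are OLM's install mode types.

```python
import re

# OLM's four install mode types.
INSTALL_MODE_TYPES = {"OwnNamespace", "SingleNamespace", "MultiNamespace", "AllNamespaces"}


def parse_install_mode(value):
    """Parse 'Type' or 'Type=[ns1,ns2]' into (type, [namespaces]).

    Hypothetical helper: the proposal only shows the
    'MultiNamespace=[ns1,ns2]' form of the flag value.
    """
    mode, sep, rest = value.partition("=")
    if mode not in INSTALL_MODE_TYPES:
        raise ValueError(f"unknown install mode type: {mode!r}")
    if not sep:
        # Assumption: a bare type means no namespaces were supplied.
        return mode, []
    match = re.fullmatch(r"\[(.*)\]", rest.strip())
    if match is None:
        raise ValueError(f"namespaces must be a bracketed list, got: {rest!r}")
    namespaces = [ns.strip() for ns in match.group(1).split(",") if ns.strip()]
    return mode, namespaces
```

For example, `parse_install_mode("MultiNamespace=[ns1,ns2]")` yields the type `MultiNamespace` and the two namespaces used as `targetNamespaces` in the manifest above.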
|
||
The compilation algorithm is as follows: | ||
|
||
``` | ||
1. If an OperatorGroup manifest is supplied: | ||
1. Use the one supplied and return. | ||
Comment on lines +139 to +140:
Are there use cases that will require an operator group to be supplied, or is it possible to always generate one with flags (or defaults)? I'm wondering if we can keep it simple and cover 90% of the use cases so that the CLI flag set doesn't explode. Can we wait to see if there's demand for supplying the operator group directly? EDIT: I kept reading. It sounds like there may be cases where this is needed for running an operator in an existing namespace that already has an operator group? If we say that that scenario is out of scope, would that simplify things? Would it kill a bunch of common use cases? @shawn-hurley @robszumski thoughts?
Review comment: I'm not sure how common it is to roll your own [...] If I understand correctly, if [...] |
||
2. Else if an OperatorGroup manifest is not supplied, compile an OperatorGroup g: | ||
1. If no installMode and set of namespaces is supplied: | ||
1. Initialize g as type OwnNamespace by setting g's targetNamespaces to the Operator's namespace, and return. | ||
2. Else if an installMode and set of namespaces is supplied: | ||
1. Validate the set of namespaces against the install mode's constraints and the Operator's namespace. | ||
2. Initialize g as the desired type with the set of namespaces and return. | ||
``` | ||
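The compilation steps above can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: the function names, the default name `my-group`, and the per-type namespace checks are ours. The checks encode one reading of OLM's install mode constraints. Rejecting a lone install mode or a lone namespace set is also an assumption, since the algorithm does not cover that case.

```python
def validate_namespaces(mode, namespaces, operator_namespace):
    """Check a namespace set against an install mode type (assumed rules)."""
    targets = set(namespaces)
    if mode == "OwnNamespace":
        ok = targets == {operator_namespace}
    elif mode == "SingleNamespace":
        ok = len(targets) == 1 and operator_namespace not in targets
    elif mode == "MultiNamespace":
        ok = len(targets) > 1
    elif mode == "AllNamespaces":
        ok = targets in (set(), {""})
    else:
        ok = False
    if not ok:
        raise ValueError(f"namespaces {sorted(targets)} are invalid for {mode}")


def compile_operator_group(operator_namespace, supplied=None, mode=None,
                           namespaces=None, name="my-group"):
    """Return the supplied OperatorGroup manifest, or compile one as a dict."""
    # Step 1: a supplied manifest is used as-is.
    if supplied is not None:
        return supplied
    spec = {}
    if mode is None and namespaces is None:
        # Step 2.1: default to OwnNamespace.
        spec["targetNamespaces"] = [operator_namespace]
    elif mode is not None and namespaces is not None:
        # Step 2.2: validate, then use the requested namespaces.
        validate_namespaces(mode, namespaces, operator_namespace)
        if mode != "AllNamespaces":
            spec["targetNamespaces"] = list(namespaces)
        # Assumption: a global OperatorGroup omits targetNamespaces.
    else:
        # Not covered by the algorithm; rejecting is an assumption.
        raise ValueError("install mode and namespaces must be supplied together")
    return {
        "apiVersion": "operators.coreos.com/v1",
        "kind": "OperatorGroup",
        "metadata": {
            "name": name,
            "namespace": operator_namespace,
            "labels": {"operator-sdk": "true"},
        },
        "spec": spec,
    }
```

Calling `compile_operator_group("my-namespace", mode="MultiNamespace", namespaces=["ns1", "ns2"])` produces a dict equivalent to the YAML manifest shown earlier.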
|
||
Managing `OperatorGroup` resources for multiple Operators _before_ deployment | ||
is attempted is a more complex problem, but prevents annoying-to-debug | ||
deployment issues that will occur in the following scenarios: | ||
|
||
- A user wants to deploy two or more Operators with CSV install modes | ||
incompatible for one `OperatorGroup` to handle in the same namespace. | ||
- A user wants to create an `OperatorGroup` in a namespace that already has | ||
an `OperatorGroup`. | ||
- The new and existing `OperatorGroup` namespace intersection is: | ||
- Equivalent to the set of new and existing namespaces (they have the | ||
same set). | ||
- The empty set (not intersecting). | ||
- A strict subset of either namespace set. | ||
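The three intersection cases listed above can be told apart with plain set operations. This is a small sketch; the function name and the returned labels are hypothetical. Note that whenever the two sets differ, their intersection is a strict subset of at least one of them, so the three cases cover every pair of namespace sets.

```python
def classify_intersection(new_namespaces, existing_namespaces):
    """Classify how two OperatorGroup namespace sets intersect."""
    new, existing = set(new_namespaces), set(existing_namespaces)
    intersection = new & existing
    if new == existing:
        # The intersection equals both sets.
        return "equivalent"
    if not intersection:
        return "empty"
    # Non-empty, and strictly smaller than at least one of the two sets.
    return "strict_subset"
```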
|
||
A solution to these types of conflicts is the following two algorithms: | ||
|
||
Algorithm for creating an `OperatorGroup`: | ||
``` | ||
1. Follow the compilation algorithm above to create an OperatorGroup g. | ||
2. Determine whether an OperatorGroup exists in a given namespace n. | ||
3. If no OperatorGroup exists in n: | ||
1. If g was not compiled by operator-sdk: | ||
1. Label g with a static label to signify g was not created by operator-sdk. | ||
2. Else if g was created by operator-sdk: | ||
1. Label g with a static label to signify g was created by operator-sdk. | ||
3. Create g in n and return. | ||
4. Else if an OperatorGroup h exists in n: | ||
1. If h was not compiled by operator-sdk, return an error. | ||
2. Else if h was compiled by operator-sdk: | ||
1. Determine which CSVs are members of h, h's targetNamespaces hn, and g's targetNamespaces gn. | |
2. If gn is equivalent to hn, return. | ||
3. Else if the intersection of gn and hn is the empty set or a subset of either: | ||
1. Label g with a static label to signify g was created by operator-sdk. | ||
2. Create g in another namespace m and return. | ||
``` | ||
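The creation algorithm can be expressed as a pure decision function that returns what to do rather than calling the cluster. This is a sketch under assumptions: the label key `operator-sdk` with values `"true"` and `"false"`, the action names, and the function name are all hypothetical. How the alternative namespace m is chosen is left to the caller because the algorithm does not specify it.

```python
import copy

SDK_LABEL = "operator-sdk"  # assumed label key


def plan_operator_group_creation(g, h, g_compiled_by_sdk):
    """Decide how to create OperatorGroup g given existing group h (or None) in n.

    Returns (action, group), where action is one of the hypothetical names
    'create_in_namespace', 'reuse_existing', 'create_in_other_namespace'.
    """
    g = copy.deepcopy(g)  # do not mutate the caller's manifest
    labels = g.setdefault("metadata", {}).setdefault("labels", {})
    if h is None:
        # Step 3: label g according to its origin, then create it in n.
        labels[SDK_LABEL] = "true" if g_compiled_by_sdk else "false"
        return "create_in_namespace", g
    h_labels = h.get("metadata", {}).get("labels", {})
    if h_labels.get(SDK_LABEL) != "true":
        # Step 4.1: h is user-managed, so the user must resolve the conflict.
        raise RuntimeError("namespace already has a user-managed OperatorGroup")
    gn = set(g.get("spec", {}).get("targetNamespaces", []))
    hn = set(h.get("spec", {}).get("targetNamespaces", []))
    if gn == hn:
        # Step 4.2.2: the existing group already covers the same namespaces.
        return "reuse_existing", h
    # Step 4.2.3: label g as created by operator-sdk and create it elsewhere.
    labels[SDK_LABEL] = "true"
    return "create_in_other_namespace", g
```

Keeping the decision separate from the API calls makes each branch of the algorithm easy to exercise without a cluster.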
|
||
Algorithm for deleting an `OperatorGroup`: | ||
``` | ||
1. Determine whether an OperatorGroup exists in a given namespace n. | ||
2. If no OperatorGroup exists in n, return. | ||
3. Else if an OperatorGroup g exists in n: | ||
1. If g is not labeled with an operator-sdk static label, return. | ||
2. Else if g is labeled with an operator-sdk static label: | ||
1. Determine the set of CSVs cs that are members of g. | |
2. If cs is the empty set: | ||
1. Delete g and return. | ||
3. Else if cs is not the empty set, return. | ||
``` | ||
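The deletion algorithm reduces to a single predicate. The sketch below assumes the same hypothetical label convention (key `operator-sdk`, value `"true"` for groups compiled by the SDK) and takes the member CSVs as an input, since how membership is looked up is outside the algorithm.

```python
SDK_LABEL = "operator-sdk"  # assumed label key


def should_delete_operator_group(g, member_csvs):
    """Return True only if g exists, is SDK-labeled, and has no member CSVs."""
    if g is None:
        # Step 2: nothing exists in the namespace.
        return False
    labels = g.get("metadata", {}).get("labels", {})
    if labels.get(SDK_LABEL) != "true":
        # Step 3.1: user-managed groups are never deleted.
        return False
    # Steps 3.2.2 and 3.2.3: delete only when no CSV depends on g.
    return len(member_csvs) == 0
```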
|
||
Notes on these algorithms: | ||
- Labeling allows `operator-sdk` to determine whether an `OperatorGroup` can | ||
be deleted; `OperatorGroup`s not compiled by `operator-sdk` should not be | |
deleted in any case. | ||
- An `OperatorGroup` not compiled by `operator-sdk` is considered a | |
user-managed resource. All conflicts must be resolved by the user, so an error | |
is returned if a non-compiled `OperatorGroup` is already present in a namespace. | ||
- Deleting an `OperatorGroup` associated with 1..N CSVs will cause those CSVs | ||
to transition to a failure state, so we should not delete if this is the case. | ||
|
||
[olm-operatorgroup-membership]: https://github.com/operator-framework/operator-lifecycle-manager/blob/1cb0681/doc/design/operatorgroups.md | ||
[olm-operatorgroup-installmodes]: https://github.com/operator-framework/operator-lifecycle-manager/blob/1cb0681/doc/design/operatorgroups.md | ||
|
||
#### Use of operator-framework/api validation | ||
|
||
Static validation is necessary for users to determine problems before deploying their Operator. As we all know, static bugs are usually more tractable than runtime bugs, especially those discovered in a live cluster. The [`operator-framework/api`][of-api] repo intends to house a validation library for static, and potentially runtime, validation. The SDK should use this library as the source of truth for the qualities of a valid OLM manifest. This repo is a work-in-progress, and should be used as soon as it is ready. | ||
|
Review comment: To be a little more specific than "namespaces the CSV can exist in": it defines the set of namespaces the operator defined in the CSV is permitted to operate over; OperatorGroups are largely about RBAC and cluster visibility.