Skip to content

Conversation

estroz
Copy link
Member

@estroz estroz commented Dec 12, 2019

Description of the change:

  • doc/proposals/sdk-integration-with-olm.md: document OperatorGroup handling logic

Motivation for the change: #1912 implements some of this logic, but a few corner cases need to be discussed. The algorithms added here should cover these corner cases.

/ping @ecordell @njhale @kevinrizza

@estroz estroz added olm-integration Issue relates to the OLM integration kind/documentation Categorizes issue or PR as related to documentation. labels Dec 12, 2019
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 12, 2019
Comment on lines +123 to +124
1. If an OperatorGroup manifest is supplied:
1. Use the one supplied and return.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there use cases that will require an operator group to be supplied, or is it possible to always generate one with flags (or defaults)?

I'm wondering if we can keep it simple and cover 90% of the use cases so that the CLI flag set doesn't explode. Can we wait to see if there's demand for supplying the operator group directly?

EDIT: I kept reading. It sounds like there may be cases where this is needed for running an operator in an existing namespace that already has an operator group? If we say that that scenario is out of scope, would that simplify things? Would it kill a bunch of common use cases?

@shawn-hurley @robszumski thoughts?

Copy link
Member Author

@estroz estroz Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how common it is to roll your own OperatorGroup. Given info in a CSV and --install-mode we can create a valid OperatorGroup 100% of the time.

If I understand correctly, if h exists in namespace n, then we can create g in another namespace m. The CSVs will still be deployed according to targetNamespaces and no OperatorGroup conflict errors will occur. I will test this theory to make sure.

[`OperatorGroup`][olm-operatorgroup]'s configure CSV tenancy in multiple
namespaces in a cluster. Each Operator must be a
[member][olm-operatorgroup-membership] with one `OperatorGroup` resource in
the cluster, which defines a set of namespaces the CSV can exist in. A CSV's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be a little more specific than "namespaces the CSV can exist in" - it defines the set of namespaces the operator defined in the CSV is permitted to operate over; OperatorGroups are largely about RBAC and cluster visibility.

`operator-sdk` should automate `OperatorGroup` "compilation" if one is not
supplied.

To perform compilation, the user can optionally supply the desired install
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What command is this option added to? I would imagine it's on the command that actually installs/runs?

membership in an `OperatorGroup` of a type it does not support (determined
by `installModes`) will transition to a failure state.

Given these rules and constraints, Operator developers may have a tough time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The document from here on is not wrong, but I think it should probably do more to indicate that installmodes should describe the way the operator works and are not things that can be changed "after the fact" without rewriting/etc.

If the operator starts up and watches all namespaces, it should be AllNamespace, and should go into an OperatorGroup that watches all namespaces.

If the operator starts up and watches its own namespace, it should be OwnNamespace and go into an OperatorGroup that watches its own namespace only.

If the operator starts up and watches a single namespace based on an env var, and that env var is wired up to project the olm.targetNamespaces annotation from the deployment, then it should go into a SingleNamespace OperatorGroup.

Likewise, multinamespace mode and the configuration thereof is fundamental to the way the operator starts up and generating an operatorgroup with multiple namespaces will do nothing if the operator doesn't support watching n namespaces.

Operators can also support one or all of these, depending on how it is written.

A lot of this can be determined based on the properties of the operator itself:

  1. If it uses the downward API to read the namespace annotations, then it can potentially support Single and Multi namespace modes (the inverse might make more sense - if a deployment for an operator does not read from the annotations, then it cannot possible satisfy single or multi namespace installmode requirements).
  2. If the operator can only start up control loops per a configured namespace value (be that a specific namespace or AllNamespaces ""), then it cannot support MultiNamespace

The current proposal is fine and makes sense! But I'm wondering if there's a way that the sdk can "just know" what operatorgroup to make and what installmodes to make more directly, because of knowledge of what namespaces can be watched by the operator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about establishing a convention that is based on the order of priority of the following conditions:

If AllNamespace is supported, create an OperatorGroup that watches all namespaces.
If SingleNamespace is supported, create an OperatorGroup that watches the namespace the Operator is deployed in.
If OwnNamespace is supported, same as above.
If MultiNamespace is supported, same as above.

Copy link
Member

@joelanford joelanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@estroz Where we did arrive on this? Code is merged correct? If so, let's make any necessary updates here and get it merged.

@estroz
Copy link
Member Author

estroz commented Jan 20, 2020

@joelanford right now the OperatorGroup logic describe here is not implmemented. A simple "create if not exists, delete if exists" algorithm is in master. I think we can hold off on adding this to the run --olm CLI until more important work is done since run --olm works fine without it, although in a limited fashion.

@ecordell's comment needs to be addressed, this proposal reorganized slightly, and implementation refactored to accommodate his suggestions.

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 20, 2020
@dmesser
Copy link
Contributor

dmesser commented Jan 21, 2020

How does the current command implementation work with OLM if it does not manage/test for OperatorGroups?

@estroz
Copy link
Member Author

estroz commented Jan 21, 2020

@dmesser
On operator-sdk run --olm, an OperatorGroup is created if one does not exist.
On operator-sdk cleanup --olm, the OperatorGroup created by ^ is deleted.

The implementation is very simple for now so we can add smarter logic after discussing it here.

@estroz
Copy link
Member Author

estroz commented Feb 18, 2020

More complex OperatorGroup handling logic is not currently a priority, as current logic (with better error handling/messages) is good enough for now. Closing this for now and will re-open when we bump priority.

/priority important-longterm

@openshift-ci-robot openshift-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Feb 18, 2020
@estroz estroz closed this Feb 18, 2020
@estroz estroz deleted the docs/update-olm-integration-proposal branch April 1, 2020 22:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/documentation Categorizes issue or PR as related to documentation. olm-integration Issue relates to the OLM integration priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants