
Pod configuration #950

Merged

Conversation

@tkashem (Collaborator) commented Jul 11, 2019

As an OLM user, I would like to set configuration for operator pods so that it is persisted between operator upgrades.
https://jira.coreos.com/browse/OLM-1163

Pending issue:

  • Proxy.Status is not yet populated on the cluster. We read the proxy env variable(s) off of Proxy.Status, as instructed.

TODO(s):

  • Pin to k8s 1.14 libs. Once chore(modules): pin k8s deps to 1.14 #954 is merged then do a rebase.
  • From the resolver we get the owner (the CSV that owns the deployment). I need to get to the Subscription that owns this CSV to read the pod configuration. Right now we use a subscription lister to list all subscription(s) in the same namespace and then find the owner subscription by matching subscription.Status.InstalledCSV. We could use annotations/labels to get to the subscription instead; this can be done in a separate PR.
  • Subscription status is not updated with the result of the env injection operation. We can address this in a follow-up PR.
  • Add an e2e test to validate that env variable(s) in the 'cluster' Proxy object propagate to the deployment.
  • OpenAPI validation for pod configuration
  • Add support for EnvFrom

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 11, 2019
@openshift-ci-robot openshift-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jul 11, 2019
@tkashem tkashem force-pushed the pod-configuration branch 2 times, most recently from da3c683 to b2c0c2c Compare July 12, 2019 19:10
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 12, 2019
@tkashem tkashem force-pushed the pod-configuration branch 2 times, most recently from e95bb78 to 3ce1e3e Compare July 12, 2019 20:46
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 12, 2019
@tkashem tkashem force-pushed the pod-configuration branch 4 times, most recently from 116c13d to 3599114 Compare July 18, 2019 19:58
@tkashem (Collaborator Author) commented Jul 18, 2019

/retest

2 similar comments

@tkashem commented Jul 19, 2019

/retest

@tkashem commented Jul 19, 2019

/retest

if r.ProxyInjectorBuilder != nil {
initializers = append(initializers, r.ProxyInjectorBuilder(owner))
}

@tkashem (Collaborator Author):

This is a bit of a hack for now; I am leaving it as a TODO.

return
}

err = fmt.Errorf("global Proxy configuration not defined in '%s' - %v", globalProxyName, getErr)
@tkashem (Collaborator Author), Jul 19, 2019:

For discussion: if an OpenShift cluster does not have the cluster Proxy object, then deployment initialization will fail and the operator install will fail as well.

Member:

I don't think it's required that an OpenShift cluster have a cluster Proxy object, right? Just that if it is there, we should respect it?

@tkashem (Collaborator Author):

Yes, this is much safer. And for the scenario where there was a cluster Proxy object that gets removed later: we will not find it and will generate a pod spec without the proxy env vars. This will cause a mismatch between desired and observed state and eventually update the deployment(s) (originally created with the proxy env variable(s)) to no longer have them, correct?

Member:

Yep, that sounds right

if sub.Status.InstalledCSV == ownerCSV.GetName() {
return sub
}
}
@tkashem (Collaborator Author):

It's a little hacky too! There should be a more straightforward way to get to the Subscription from the CSV. I am tracking it as a TODO.

Member:

This is fine for now; we will eventually start storing the CSV data on the Subscription, so this will get fixed.


func merge(containerEnvVars []corev1.EnvVar, newEnvVars []corev1.EnvVar) (merged []corev1.EnvVar) {
merged = containerEnvVars

@tkashem (Collaborator Author):

Thoughts?

  • If an env variable that is being injected is already defined in the CSV deployment spec, it will be overwritten.
  • For the proxy env vars, empty means unset and will not result in an env var. Can we say the same for all other env variable(s) in the pod configuration?

Member:
I think we definitely need to support the OperatorConfig overwriting a "set" env var to "". If you're explicitly setting it in the operator config, it should take precedence.

For the proxy vars, the issue is that if any of them is set manually on the operator config, we ignore the global config, right?

@tkashem (Collaborator Author), Jul 19, 2019:

Yes, I agree on allowing the user to overwrite a "set" env var to "".
The proxy env vars are an exception: if they are empty they are not injected. And if the user specifies any of them in the subscription pod config (empty or not), it overrides the proxy; we won't even query the cluster Proxy object at that point.

@tkashem (Collaborator Author):

Given this, we need to treat the proxy env differently. I will add that change (maybe in the follow-up PR I am planning, if I don't have enough time).

@tkashem (Collaborator Author) commented Jul 19, 2019

/retest

@tkashem commented Jul 19, 2019

/test e2e-aws-olm

@tkashem commented Jul 19, 2019

/retest

1 similar comment

@tkashem commented Jul 19, 2019

/retest


initializers := []DeploymentInitializerFunc{}
if r.ProxyInjectorBuilder != nil {
initializers = append(initializers, r.ProxyInjectorBuilder(owner))
@ecordell (Member), Jul 19, 2019:

nit: elsewhere we use the convention of Func suffixes, e.g. ProxyInjectorFunc(owner)

@@ -28,7 +28,9 @@ type StrategyResolverInterface interface {
InstallerForStrategy(strategyName string, opClient operatorclient.ClientInterface, opLister operatorlister.OperatorLister, owner ownerutil.Owner, annotations map[string]string, previousStrategy Strategy) StrategyInstaller
}

type StrategyResolver struct{}
type StrategyResolver struct {
ProxyInjectorBuilder DeploymentInitializerFuncBuilder
Member:

Do we expect to vary this between StrategyResolver instances?

@tkashem (Collaborator Author):

It might, depending on the pod configuration. We could also decide on a static set of initializers, some of which could be noops; today there is only one, for env.
Ideally we should have a layer that returns a chain of deployment initializer funcs based on the context. What we have today is a quick fix; we can improve the strategy resolver and deployment initializer. I have a task that tracks it.

merged = append(podConfigEnvVar, proxyEnvVar...)

if len(merged) == 0 {
d.logger.Debugf("no env var to inject into csv=%s", ownerCSV.GetName())
Member:

nit: we try to use WithLabel("csv", ownerCSV.GetName()).Debug() style when possible

@@ -0,0 +1,61 @@
package envvar
Member:

nit: we try to put helper packages that are not OLM-specific (like this) into pkg/lib

@tkashem (Collaborator Author):

I thought the envvar package was specific to OLM only. I have a proxy package that deals with the OpenShift Proxy type; that one is in pkg/lib.



proxyAPIExists, err := proxy.IsAPIAvailable(discovery)
if err != nil {
op.logger.Errorf("error happened while probing for Proxy API support - %v", err)
return nil, err
Member:

this shouldn't error out if we're not on an openshift cluster, right?

@tkashem (Collaborator Author):

It should not; err is set only when there is a connectivity error. Check 35a159e#diff-eb26d569f984271521cd371191e0b72bR30


// setup proxy env var injection policies
discovery := config.operatorClient.KubernetesInterface().Discovery()
proxyAPIExists, err := proxy.IsAPIAvailable(discovery)
Member:

For the cluster operator API, we poll for a while to see if the API is ready. We should think about doing that here too.

@tkashem (Collaborator Author):

Yeah, in the monitor we wait indefinitely since it's an independently running goroutine. For the operator we can retry: are 6 tries, every 10 seconds, and then giving up, a good default?

Member:

Seems reasonable to me - we probably want to go Degraded if this fails (and we're on OpenShift), but I think that can be a follow-up in the future.

)

// DefaultQuerier does...
func DefaultQuerier() Querier {
Member:

Default is fine; I think we use Noop for this type of object elsewhere.


newConfigClient := func(t *testing.T) configv1client.ConfigV1Interface {
config, err := clientcmd.BuildConfigFromFlags("", *kubeConfigPath)
// require.NoErrorf(t, err, "could not create rest config: %s", err.Error())
Member:

nit: commented code

expected := podEnv
expected = append(expected, proxyEnv...)

checkDeploymentWithPodConfiguration(t, c, csv, podConfig.Env)
Member:

👍

@tkashem (Collaborator Author) commented Jul 19, 2019

/retest

Use case: The env variables defined in cluster scoped global proxy
configuration object should be injected into deployment spec if not
overridden in subscription pod configuration.

Add a new proxy package that provides the following support:
 - Add support to check if Proxy api is available on cluster.
 - Add support to query the cluster for the 'cluster' Proxy object and
   return the env variable(s).
 - Provide logic to determine whether proxy env variable has been
   overridden in pod configuration.
Add a deployment initializer function that can inject environment
variable(s) specified in pod configuration of a subscription into
deployment object.

The following rules apply:
- Proxy env variable(s) defined in the 'cluster' Proxy object are
  injected into deployment object(s).
- Proxy env variable(s) defined in the pod configuration of a
  Subscription override the 'cluster' Proxy object.

More here - https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/contributors/design-proposals/operator-config.md
@tkashem tkashem changed the title [WIP] Pod configuration Pod configuration Jul 19, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 19, 2019
@ecordell (Member)
/lgtm

None of my comments above should block :)

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ecordell, tkashem

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 19, 2019
@tkashem (Collaborator Author) commented Jul 19, 2019

/test e2e-aws-olm

@ecordell (Member)

/retest

1 similar comment

@tkashem commented Jul 19, 2019

/retest
