New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WRKLDS-728: Capabilities: drop build/apps APIService when capabilities are not enabled #532
WRKLDS-728: Capabilities: drop build/apps APIService when capabilities are not enabled #532
Conversation
@ingvagabund: This pull request references WRKLDS-728 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
d54a619
to
e0b42d7
Compare
e0b42d7
to
fe768dd
Compare
fe768dd
to
f5218fd
Compare
f5218fd
to
09d986a
Compare
/retest-required |
1 similar comment
/retest-required |
70ebfd5
to
96e3280
Compare
c1b2527
to
dc8a55f
Compare
/retest-required |
} | ||
|
||
func apiServicesReferences() []configv1.ObjectReference { | ||
ret := []configv1.ObjectReference{} | ||
for _, apiService := range apiServices() { | ||
ret = append(ret, configv1.ObjectReference{Group: "apiregistration.k8s.io", Resource: "apiservices", Name: apiService.Spec.Version + "." + apiService.Spec.Group}) | ||
for _, apiService := range apiServiceGroupVersions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this doesn't have to operate only on the enabled services?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apiServicesReferences()
is consumed by .WithClusterOperatorStatusController
which uses the references to build a list of related objects. In case a reference has no relevant object, it gets ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, it looks like this is purely informational. It doesn't have any impact on the core logic.
pkg/operator/starter.go
Outdated
{Group: "security.openshift.io", Version: "v1"}, | ||
{Group: "template.openshift.io", Version: "v1"}, | ||
func apiServices(clusterVersionInformer configinformersconfigv1.ClusterVersionInformer) ([]*apiregistrationv1.APIService, []*apiregistrationv1.APIService, error) { | ||
clusterVersion, err := clusterVersionInformer.Lister().Get("version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should either change the NewAPIServiceController
controller or this function. Otherwise we cannot guarantee the clusterVersionInformer
will be synced. Am I right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to that I think that we should add a reactor to the NewAPIServiceController
to catch changes to clusterVersionInformer
. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apiServices
can have various implementations so any change needs to happen inside apiService
.
Are you suggesting to invoke WaitForCacheSync
? Checking https://github.com/openshift/cluster-openshift-apiserver-operator/blob/master/pkg/operator/starter.go#L368-L374 noone of the informers is waiting for caches to be fully synced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should add clusterVersionInformer
to WithInformers
here https://github.com/openshift/library-go/blob/75c7d51fc4155264ba71551d22d0a1968b7bf989/pkg/operator/apiserver/controller/apiservice/apiservice_controller.go#L66 because
// WithInformers is used to register event handlers and get the caches synchronized functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and maybe we should change GetAPIServicesToMangeFunc
to accept clusterVersionInformer configinformersconfigv1.ClusterVersionInformer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternative would be to add WaitForCacheSync
to the apiServices
function and somehow trigger the controller to call apiServices
when clusterVersionInformer
changes/get an event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should add clusterVersionInformer to WithInformers here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for passing informers as the list of managed api services may have many different implementation which do not need clusterVersionInformer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thanks.
configMap := resourceread.ReadConfigMapV1OrDie(v311_00_assets.MustAsset("v3.11.0/openshift-apiserver/cm.yaml")) | ||
defaultConfig := v311_00_assets.MustAsset("v3.11.0/config/defaultconfig.yaml") | ||
|
||
clusterVersion, err := clusterVersionInformer.Lister().Get("version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same question here. Should we guarantee that the clusterVersionInformer is synced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And whether we need to add a reactor to the controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's where informer passed in dc8a55f#diff-0d623dfd885adb20f991bda4c2453aebd732ca6dbb4d1d4be6e79805c3b48de6R243 takes affect. Everytime a cluster version changes, OpenShiftAPIServerWorkload.Sync triggers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, so it looks like we don't need an informer, we need a Lister here :) Please change it to a Lister.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
requiredConfigMap, _, err := resourcemerge.MergePrunedConfigMap( | ||
&openshiftcontrolplanev1.OpenShiftAPIServerConfig{}, | ||
configMap, | ||
"config.yaml", | ||
nil, | ||
defaultConfig, | ||
bytes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have a unit test to cover merging ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PerGroupOptions: []openshiftcontrolplanev1.PerGroupOptions{}, | ||
}, | ||
} | ||
if knownCaps.Has(configv1.ClusterVersionCapabilityBuild) && !capsEnabled.Has(configv1.ClusterVersionCapabilityBuild) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you have a test that will test the disabling of this APIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should have an e2e test for it, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating/adding e2es is the next step once this PR and openshift/cluster-openshift-controller-manager-operator#291 are merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where are you planing to add that test ? will it be this repo or in the origin ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am planning origin as we need to check for disabled controllers as well. E.g. read the config or parse the controllers logs. On the other hand we first need to update the origin tests to react on missing API so other tests (e.g. running in parallel) for Builds/DCs do not fail. Ultimately, each operator repo might have its own version of the e2e. The e2e (going here or into origin) will need a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally if we could add a test here and then also run it from the origin repo. Are you planing to add the test before we ship 4.14 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. All tests need to land before 4.14 ships.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I would like to avoid shipping a feature without the tests. Thanks.
} | ||
|
||
if knownCaps.Has(configv1.ClusterVersionCapabilityDeploymentConfig) && !capsEnabled.Has(configv1.ClusterVersionCapabilityDeploymentConfig) { | ||
klog.Infof("Capability %q not enabled, disabling 'openshift.io/apps' controller", configv1.ClusterVersionCapabilityDeploymentConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't this spam logs on every Sync
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will increase the log level to 4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
/retest-required |
The last two e2e-aws-own failed due to empty openshift-apiserver audit logs All OA instances are reporting bunch of:
|
… capabilities are not enabled Unless a capability is unknown, install APIService object and run apiserver for corresponding API only when enabled.
dc8a55f
to
7113150
Compare
configInformers.Config().V1().ClusterVersions().Informer() | ||
|
||
ctx := context.Background() | ||
configInformers.Start(ctx.Done()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and having a lister instead of an informer will simplify this unit tests :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
apiServersConfig.APIServers.PerGroupOptions = append(apiServersConfig.APIServers.PerGroupOptions, openshiftcontrolplanev1.PerGroupOptions{Name: openshiftcontrolplanev1.OpenShiftAppsAPIserver, DisabledVersions: []string{"v1"}}) | ||
} | ||
|
||
bytes, err := json.Marshal(apiServersConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mhm, should we be passing a yaml ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
t.Fatal(err) | ||
} | ||
|
||
cm, err := kubeClient.CoreV1().ConfigMaps("openshift-apiserver").Get(ctx, "config", metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we check kubeClient.Actions()
instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually do this to check what resource would be sent to the server and how many times. The Get method reads the state stored in memory, which may not necessarily correspond to the resource that would be sent to the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like https://github.com/openshift/openshift-apiserver/blob/064c2d0ef0ecaeda2bcc4387eaaa7258cee5adcf/pkg/project/apiserver/registry/project/proxy/proxy_test.go#L96-L101? Is the goal here to add just a test for Matches("update", "configmaps")
? Or, to more inspect the corresponding action? Do you have an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually we verify the verbs (https://github.com/openshift/library-go/blob/abaa55e36ec42bcade37586fbeabfb5bf48cd5b8/pkg/operator/encryption/controllers/key_controller_test.go#L389)
And the content of the resource as well (https://github.com/openshift/library-go/blob/abaa55e36ec42bcade37586fbeabfb5bf48cd5b8/pkg/operator/encryption/controllers/key_controller_test.go#L302)
I think we could check both here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
for _, tc := range testCases { | ||
t.Run(tc.name, func(t *testing.T) { | ||
|
||
kubeClient := fake.NewSimpleClientset( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please rename it to fakeKubeClient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestOperatorConfigProgressingCondition uses kubeClient well. Do you wanna rename it on other test(s) as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, if this is not a problem for you that would be fine :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
|
||
config := openshiftcontrolplanev1.OpenShiftAPIServerConfig{} | ||
if err := json.NewDecoder(bytes.NewBuffer([]byte(cm.Data["config.yaml"]))).Decode(&config); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oas
uses a versioned decoder. Could we do the same here ?
scheme := runtime.NewScheme()
utilruntime.Must(openshiftcontrolplanev1.Install(scheme))
codecs := serializer.NewCodecFactory(scheme)
obj, err := runtime.Decode(codecs.UniversalDecoder(openshiftcontrolplanev1.GroupVersion, configv1.GroupVersion), configContent)
if err != nil {
return err
}
config := obj.(*openshiftcontrolplanev1.OpenShiftAPIServerConfig)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
name string | ||
knownCapabilities []configv1.ClusterVersionCapability | ||
enabledCapabilities []configv1.ClusterVersionCapability | ||
expectedPerGroupOptions []openshiftcontrolplanev1.PerGroupOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to validate the whole config and not only PerGroupOptions
?
assuming that other fields remain unchanged, this shouldn't require much more effort
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
5361ad6
to
2b7de92
Compare
} | ||
|
||
if knownCaps.Has(configv1.ClusterVersionCapabilityDeploymentConfig) && !capsEnabled.Has(configv1.ClusterVersionCapabilityDeploymentConfig) { | ||
klog.V(4).Infof("Capability %q not enabled, disabling 'openshift.io/apps' controller", configv1.ClusterVersionCapabilityDeploymentConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mhm, I still think it will be logged on every sync. What is the value of spamming the log file with this information ? Ideally if we could log it just once, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need a new condition ? Does the service controller set a condition ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With log 4 it is good to know the operator properly reads the capabilities. Otherwise, I'd need to get the CM, decode it, read the list of disabled apiservers and compare it. Too much additional code for just a single log line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API service controller sets the new Degraded condition for API service endpoints until the disabled APIService objects are deleted. The test is reported through the already existing conditions for operands.
@ingvagabund: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
2b7de92
to
6bfe2e1
Compare
6bfe2e1
to
2cc5428
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ingvagabund, tkashem The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Unless a capability is unknown, install APIService object for corresponding API only when enabled.