Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1965283: Static Resources Controller for Sync #216

Merged

Conversation

adambkaplan
Copy link
Contributor

@adambkaplan adambkaplan commented Jun 1, 2021

  • Refactor ocm-o to use the static resources controller to sync static
    YAML assets.
  • Add infomers for cluster-scoped objects, openshift-infra, and
    kube-system as a namespace to use an informer to fetch core k8s
    objects.
  • Use cached ConfigMapsGetter in the main workload sync controller
  • Don't set unmanaged/removed messages. The status controller is now
    responsible for setting the cluster operator states for managed/
    unmanaged. The existing code now collides with the new static
    resource sync controller.
  • Increase rate limit for k8s API calls. This ensures the static
    resource sync controller is not throttled when reconciling RBAC
    objects.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 1, 2021
@openshift-ci openshift-ci bot requested review from deads2k and soltysh June 1, 2021 18:56
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 1, 2021
- Update openshift/library-go to latest on master (b9c317a319e0)
- go mod vendor, add static resource sync controller
@adambkaplan adambkaplan changed the title WIP - fix Upgradeable = "Unknown" condition Bug 1965283: Static Resources Controller for Sync Jun 2, 2021
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 2, 2021

@adambkaplan: This pull request references Bugzilla bug 1965283, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.0) matches configured target release for branch (4.8.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @wewang58

In response to this:

Bug 1965283: Static Resources Controller for Sync

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jun 2, 2021
@openshift-ci openshift-ci bot requested a review from wewang58 June 2, 2021 17:23
Copy link
Contributor Author

@adambkaplan adambkaplan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/assign @gabemontero

I believe you had last touched the bits around the status reporting - this was changed a bit with the addition of the new controller.

@@ -51,7 +50,7 @@ func NewOpenShiftControllerManagerOperator(
targetImagePullSpec string,
operatorConfigInformer operatorinformersv1.OpenShiftControllerManagerInformer,
proxyInformer configinformerv1.ProxyInformer,
kubeInformersForOpenshiftControllerManager informers.SharedInformerFactory,
kubeInformers v1helpers.KubeInformersForNamespaces,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use this struct with the static resource sync controller and the resource sync controller, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adambkaplan - would it be possible to specify the repos(s) / packages(s) where the static resource sync controller and resource sync controller are located? I'm guessing they are in library-go, but some quick scans are not proving fruitful to me.

I'd like to do some cross referencing as part of the review.

Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, that is it

if _, err := c.operatorConfigClient.OpenShiftControllerManagers().UpdateStatus(context.TODO(), operatorConfig, metav1.UpdateOptions{}); err != nil {
return err
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The status sync controller takes care of the CVO status reporting for us if the operator moves to Unamanged or Removed.

if highRateLimitProtoKubeConfig.Burst < 100 {
highRateLimitProtoKubeConfig.Burst = 100
}
kubeClient, err := kubernetes.NewForConfig(highRateLimitProtoKubeConfig)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The static resource sync controller unfortunately doesn't use listers when reconciling RBAC objects.
This is problematic for ocm because we reconcile a lot of roles/clusterroles and their role bindings.
Long term we should be the ones to improve this in library-go, but for the interim we can increase the QPS and reduce the instances where the sync controller is throttled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, a commit specific link to the file in question might help us to quickly remember where to go if we ever get to this TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if I update the comment to use StaticResourceController then we should be ok (it is initialized later in this method).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I'm not following how updating the comment to use StaticResourceController pertains to my question around confirming where the shortcoming in library-go is and ask to somehow point to the location in that code for future reference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comments - hopefully things are a little bit clearer.

Comment on lines +57 to +64
"",
util.TargetNamespace,
util.OperatorNamespace,
util.UserSpecifiedGlobalConfigNamespace,
util.InfraNamespace,
metav1.NamespaceSystem,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New additions to this list:

  • Cluster scoped ("")
  • openshift-infra
  • kube-system

@@ -105,10 +126,40 @@ func RunOperator(ctx context.Context, controllerConfig *controllercmd.Controller
controllerConfig.EventRecorder,
)

staticResourceController := staticresourcecontroller.NewStaticResourceController(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a simple controller that takes a bag of YAMLs and applies them, and reports status to the operator config object's status. That gets rolled up to the CVO object via the status sync controller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the TODO here to make it clear where in library-go we need to add a future enhancement.

Copy link
Contributor

@gabemontero gabemontero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice pulling in a relatively newer library-go function @adambkaplan

Minor comment / constant suggestions around the parameter for cluster scoped and the future TODO wrt rbac listers in the static resource controller. Otherwise just a clarifying question or two.

And per you comment that I was last in this code, I certainly did not see anything that disrupted my recent Upgradeable change.

@@ -39,15 +53,22 @@ func RunOperator(ctx context.Context, controllerConfig *controllercmd.Controller
return err
}

kubeInformers := v1helpers.NewKubeInformersForNamespaces(kubeClient, util.TargetNamespace, util.OperatorNamespace, util.UserSpecifiedGlobalConfigNamespace)
kubeInformers := v1helpers.NewKubeInformersForNamespaces(kubeClient,
"",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put a comment in code to indicate this means cluster scoped, as you clarified in the PR comment 09eb355#r644180187

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or go with #216 (comment) instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a comment to say "empty means cluster-scoped"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -51,7 +50,7 @@ func NewOpenShiftControllerManagerOperator(
targetImagePullSpec string,
operatorConfigInformer operatorinformersv1.OpenShiftControllerManagerInformer,
proxyInformer configinformerv1.ProxyInformer,
kubeInformersForOpenshiftControllerManager informers.SharedInformerFactory,
kubeInformers v1helpers.KubeInformersForNamespaces,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if highRateLimitProtoKubeConfig.Burst < 100 {
highRateLimitProtoKubeConfig.Burst = 100
}
kubeClient, err := kubernetes.NewForConfig(highRateLimitProtoKubeConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forceRollout = true
}
}
// TODO - use labels/annotations to force a daemonset rollout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this is a different TODO comment from what you had in your last PR in the same spot, and this is a TODO comment you want to keep, correct @adambkaplan ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - we are passing forceRollout to a method that has been deprecated. We should use labels or annotations to drive rollouts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the confirmation

@@ -6,6 +6,7 @@ const (
MachineSpecifiedGlobalConfigNamespace = "openshift-config-managed"
TargetNamespace = "openshift-controller-manager"
OperatorNamespace = "openshift-controller-manager-operator"
InfraNamespace = "openshift-infra"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative to my comment at 09eb355#diff-0d623dfd885adb20f991bda4c2453aebd732ca6dbb4d1d4be6e79805c3b48de6R57 we could add a constant here named something like ClusterScoped that is set to the emtpy string ""

if highRateLimitProtoKubeConfig.Burst < 100 {
highRateLimitProtoKubeConfig.Burst = 100
}
kubeClient, err := kubernetes.NewForConfig(highRateLimitProtoKubeConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, a commit specific link to the file in question might help us to quickly remember where to go if we ever get to this TODO

- Refactor ocm-o to use the static resources controller to sync static
  YAML assets.
- Add infomers for cluster-scoped objects, openshift-infra, and
  kube-system as a namespace to use an informer to fetch core k8s
  objects.
- Use cached ConfigMapsGetter in the main workload sync controller
- Don't set unmanaged/removed messages. The status controller is now
  responsible for setting the cluster operator states for managed/
  unmanaged. The existing code now collides with the new static
  resource sync controller.
- Increase rate limit for k8s API calls. This ensures the static
  resource sync controller is not throttled when reconciling RBAC
  objects.
Copy link
Contributor Author

@adambkaplan adambkaplan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional doc comments to clarify what I added (and why)

if highRateLimitProtoKubeConfig.Burst < 100 {
highRateLimitProtoKubeConfig.Burst = 100
}
kubeClient, err := kubernetes.NewForConfig(highRateLimitProtoKubeConfig)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comments - hopefully things are a little bit clearer.

@@ -39,15 +53,22 @@ func RunOperator(ctx context.Context, controllerConfig *controllercmd.Controller
return err
}

kubeInformers := v1helpers.NewKubeInformersForNamespaces(kubeClient, util.TargetNamespace, util.OperatorNamespace, util.UserSpecifiedGlobalConfigNamespace)
kubeInformers := v1helpers.NewKubeInformersForNamespaces(kubeClient,
"",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -105,10 +126,40 @@ func RunOperator(ctx context.Context, controllerConfig *controllercmd.Controller
controllerConfig.EventRecorder,
)

staticResourceController := staticresourcecontroller.NewStaticResourceController(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the TODO here to make it clear where in library-go we need to add a future enhancement.

@gabemontero
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 3, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 3, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan, gabemontero

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 286c157 into openshift:master Jun 3, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 3, 2021

@adambkaplan: All pull requests linked via external trackers have merged:

Bugzilla bug 1965283 has been moved to the MODIFIED state.

In response to this:

Bug 1965283: Static Resources Controller for Sync

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-cluster-openshift-controller-manager-operator-container-v4.8.0-202311261141.p0.g286c157.assembly.stream for distgit ose-cluster-openshift-controller-manager-operator.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants