
Duplicate owner reference in packageserver service #1237

Closed
bryanl opened this issue Jan 16, 2020 · 9 comments
Assignees
exdx
Labels
kind/bug: Categorizes issue or PR as related to a bug.
triage/unresolved: Indicates an issue that can not or will not be resolved.

Comments


bryanl commented Jan 16, 2020

Bug Report

What did you do?

Installed OLM using kubectl

What did you expect to see?

The service created should have a single owner reference to the CSV.

What did you see instead? Under which circumstances?

The service has multiple owner references to the same CSV.

---
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2020-01-15T13:49:15Z"
  name: v1-packages-operators-coreos-com
  namespace: olm
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: ClusterServiceVersion
    name: packageserver
    uid: e2bdc207-8f58-4ef8-afe4-212704dbe060
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: ClusterServiceVersion
    name: packageserver
    uid: e2bdc207-8f58-4ef8-afe4-212704dbe060
  resourceVersion: "201176"
  selfLink: /api/v1/namespaces/olm/services/v1-packages-operators-coreos-com
  uid: 7a1f2c82-e2ae-4adb-a10d-dee40c8a76ae
spec:
  clusterIP: 10.96.68.225
  ports:
  - port: 443
    protocol: TCP
    targetPort: 5443
  selector:
    app: packageserver
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

Environment

  • operator-lifecycle-manager version:

https://github.com/operator-framework/operator-lifecycle-manager/tree/976f2b981c326ee51748045f6df20317d6d95bd5

  • Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:12:17Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:

Minikube

exdx (Member) commented Jan 17, 2020

Hey @bryanl, thanks for filing this bug. We think it stems from the way we set owner references without checking the existence of previous references. Will take a look at this one.

exdx added the kind/bug label on Jan 17, 2020
exdx self-assigned this on Jan 21, 2020
exdx (Member) commented Jan 21, 2020

Looking into this a bit, I see we do check for existing owner references when assigning a new owner to an object, and return early instead of assigning the same reference again. https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/lib/ownerutil/util.go#L158
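
A minimal sketch of that check-then-append pattern is below (illustrative names, not the actual ownerutil helpers), assuming the standard k8s.io/apimachinery types:

package ownerutil // illustrative package name

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// hasOwnerRef reports whether obj already carries an owner reference with the given UID.
func hasOwnerRef(obj metav1.Object, uid types.UID) bool {
	for _, ref := range obj.GetOwnerReferences() {
		if ref.UID == uid {
			return true
		}
	}
	return false
}

// addOwnerOnce appends ref only when no reference with the same UID is present,
// so repeated calls on the same in-memory object stay idempotent.
func addOwnerOnce(obj metav1.Object, ref metav1.OwnerReference) {
	if hasOwnerRef(obj, ref.UID) {
		return
	}
	obj.SetOwnerReferences(append(obj.GetOwnerReferences(), ref))
}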

That being said, this bug shows that a duplicate ownerref is being set anyway, which suggests a race condition. The call that sets the ownerref for the packageserver service is ownerutil.AddNonBlockingOwner(service, csv). It runs inside the OLM operator and is triggered on every sync.

One thing I noticed is that the function that actually sets the ownerref has the following signature.

func AddNonBlockingOwner(object metav1.Object, owner Owner)

It takes copies of the objects rather than pointers to the objects themselves. So if one sync calls this function, sees no references, and appends one, a concurrent sync calling it with the same object will not see the updated references yet and will append the reference again. When the syncs then update the object on the API server, the result is duplicate references. I played around with refactoring the signature to take pointers to the object and owner args instead, but that causes errors like Cannot use type *Service as type *metav1.Object, since the pointer type fulfills a different interface.
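
For context, a small self-contained example of why that refactor fails to compile (illustrative names, not OLM code): metav1.Object is an interface, and a *corev1.Service already satisfies it, so *metav1.Object (a pointer to the interface) is a different type that no concrete object implements.

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// tagObject mutates obj through the metav1.Object interface; the concrete
// value passed in here is a *corev1.Service, which satisfies that interface.
func tagObject(obj metav1.Object) {
	labels := obj.GetLabels()
	if labels == nil {
		labels = map[string]string{}
	}
	labels["example"] = "seen"
	obj.SetLabels(labels)
}

func main() {
	svc := &corev1.Service{}
	tagObject(svc) // compiles: *corev1.Service implements metav1.Object
	fmt.Println(svc.GetLabels())
}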

All that being said, if this was on minikube then the sync should run really fast, and assuming no errors were present I'm a little doubtful it's a race. Any thoughts @ecordell @njhale?


stale bot commented Mar 21, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label on Mar 21, 2020
openshift-ci-robot added the triage/unresolved label and removed the wontfix label on Mar 22, 2020
The stale bot posted the same message again on May 21, 2020, Jul 20, 2020, Sep 20, 2020, Nov 22, 2020, and Jan 22, 2021.

timflannagan (Contributor) commented Jul 28, 2021

Closing as this appears to have already been resolved, but automation didn't close this out immediately. Try using a more up-to-date OLM release that contains the fixes linked in this issue. Feel free to re-open if those fixes don't completely address your issue.
