
Selector on watches.yaml not honoured #31

Open
anupchandak opened this issue Feb 14, 2023 · 9 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. triage/support Indicates an issue that is a support question.

Comments

@anupchandak

anupchandak commented Feb 14, 2023

To control the scope of the operator in a multi-development environment, I have defined a selector at the watches.yaml level, referring to the documentation here.

The selector is defined roughly as below (shown with equivalent dummy values):

- version: v1
  group: mytest.com
  kind: MyKind
  snakeCaseParameters: False
  playbook: playbooks/create.yml
  finalizer:
    name: myTest.com/finalizer
    playbook: playbooks/purge.yml
  selector:
    matchExpressions:
      - key: mytest.com/controller-namespace
        operator: In
        values: 
          - "my-test-na"

When I start my ansible runner, I see the following log at startup, as expected:

{"level":"info","ts":1676367873.856818,"logger":"cmd","msg":"Watch namespaces not configured by environment variable WATCH_NAMESPACE or file. Watching all namespaces.","Namespace":""}

I expect that my operator will still not watch (reconcile) CRs defined with the label mytest.com/controller-namespace=your-test-na. But it does, and reconciles them.

It is an Ansible-based operator; environment details are below:

% ansible --version
/usr/local/lib/python3.9/site-packages/paramiko/transport.py:236: CryptographyDeprecationWarning: Blowfish has been deprecated
  "class": algorithms.Blowfish,
ansible [core 2.13.5]
  config file = /Users/anupchandak/ansible-profiler.cfg
  configured module search path = ['/Users/anupchandak/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /Users/anupchandak/Library/Python/3.9/lib/python/site-packages/ansible
  ansible collection location = /Users/anupchandak/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.9.14 (main, Sep  6 2022, 23:29:09) [Clang 13.1.6 (clang-1316.0.21.2.5)]
  jinja version = 3.1.2
  libyaml = True
@jberkhahn jberkhahn added the triage/support Indicates an issue that is a support question. label Feb 20, 2023
@varshaprasad96
Member

@jberkhahn The thread regarding this issue: https://mail.google.com/mail/u/0/#search/ansible/FMfcgzGrcXtllNJlqSFwVfvsJVrwzQjw

@anupchandak Could you please share your controller pod logs or the project, so that we are able to run it locally and check the issue? The selectors should be working as expected via the predicates created from them; looking at the logs may help us dig into it more.
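For reference, a minimal standalone sketch (not the operator-sdk code itself, just the controller-runtime API it builds on) of how a selector like the one in watches.yaml becomes a predicate and how that predicate filters events; the key and values below are the dummy ones from the report:

package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

func main() {
	// Build a predicate from the same selector shape as in watches.yaml.
	pred, err := predicate.LabelSelectorPredicate(metav1.LabelSelector{
		MatchExpressions: []metav1.LabelSelectorRequirement{{
			Key:      "mytest.com/controller-namespace",
			Operator: metav1.LabelSelectorOpIn,
			Values:   []string{"my-test-na"},
		}},
	})
	if err != nil {
		panic(err)
	}

	// Simulate Create events for a matching and a non-matching CR.
	matching := &unstructured.Unstructured{}
	matching.SetLabels(map[string]string{"mytest.com/controller-namespace": "my-test-na"})

	other := &unstructured.Unstructured{}
	other.SetLabels(map[string]string{"mytest.com/controller-namespace": "your-test-na"})

	fmt.Println(pred.Create(event.CreateEvent{Object: matching})) // true: event passes the filter
	fmt.Println(pred.Create(event.CreateEvent{Object: other}))    // false: event should be dropped
}

If the same kind of check inside the operator drops the non-matching object, the selector itself is fine and the question becomes where the predicate is (or is not) attached.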

@anupchandak
Author

anupchandak commented Feb 23, 2023

@varshaprasad96 - I tried creating a sample project using the Memcached example but was not able to reproduce the above issue.

I cannot share my work project due to copyright restrictions.

Any pointers on how I can check what ends up on the operator's watch list when it starts, and which selector it is applying?

@anupchandak
Author

Is there any way to know which dependent resource change triggered the operator's reconciliation loop?

@varshaprasad96
Member

varshaprasad96 commented Feb 27, 2023

The other option is to add additional logging to the ansible-operator binary and try it out locally to see what is happening. Some pointers:

  1. I'd start by checking whether watches.yaml is being parsed as expected, i.e. whether the selectors are being parsed and loaded from the watches file, which happens here: https://github.com/operator-framework/operator-sdk/blob/5cbdad9209332043b7c730856b6302edc8996faf/internal/ansible/watches/watches.go#L313
  2. This is where predicates are set up based on labels: https://github.com/operator-framework/operator-sdk/blob/d828db26e4c0377e8423bfbdafa36449a971f05a/internal/ansible/controller/controller.go#L115. Checking there whether predicates are being created successfully would be helpful.
  3. The above two steps should help in digging into the issue. If not, I would go a step further and try to replicate this method (https://github.com/kubernetes-sigs/controller-runtime/blob/b9940edaaafe3f0292d6be43b362852aab079369/pkg/predicate/predicate.go#L375), which is where predicates are created from labels. That would help in checking whether the labels are in the right format and whether the predicate function behaves as expected.
  4. This is where the ansible controller's logic lives (https://github.com/operator-framework/operator-sdk/blob/d828db26e4c0377e8423bfbdafa36449a971f05a/internal/cmd/ansible-operator/run/cmd.go#L89); digging into the logs to check the events being received and the requests triggering the reconciler would be helpful (a sketch of a simple logging predicate that could help with this follows this list).
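As one concrete way to do the extra logging from points 1 and 4, here is a minimal sketch of a hypothetical logging predicate (not part of operator-sdk; the package name and function are placeholders) that never filters anything but records every incoming event, so you can see exactly which object triggered a reconcile and which labels it carried:

package debugutil

import (
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// LoggingPredicate logs kind, namespace/name and labels for every event and
// always returns true, so it observes without changing filtering behaviour.
func LoggingPredicate() predicate.Funcs {
	logObj := func(verb string, obj client.Object) bool {
		log.Log.Info("event received",
			"verb", verb,
			"gvk", obj.GetObjectKind().GroupVersionKind().String(),
			"namespace", obj.GetNamespace(),
			"name", obj.GetName(),
			"labels", obj.GetLabels())
		return true
	}
	return predicate.Funcs{
		CreateFunc:  func(e event.CreateEvent) bool { return logObj("create", e.Object) },
		UpdateFunc:  func(e event.UpdateEvent) bool { return logObj("update", e.ObjectNew) },
		DeleteFunc:  func(e event.DeleteEvent) bool { return logObj("delete", e.Object) },
		GenericFunc: func(e event.GenericEvent) bool { return logObj("generic", e.Object) },
	}
}

Chaining something like this ahead of the label-selector predicate in a locally built binary would show whether the suspicious events come from the primary CR or from a dependent resource.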

You may have to build the binary locally to test it out. The steps are here: https://sdk.operatorframework.io/docs/contribution-guidelines/developer-guide/.

Before all this, I would suggest increasing the log verbosity and checking whether there is anything suspicious indicating that the labels haven't been set up as expected. Hope this helps!

@anupchandak
Author

@varshaprasad96 - Thank you so much for your detailed reply above.

Sorry for the late reply, but I think I am able to reproduce the issue. I believe it is caused by the dependent CronJob resource created by the CR.

Please use the attached project and follow the below steps to reproduce the issue.

  1. Copy the project locally.
  2. Install the CRDs with make install.
  3. Start the operator locally with ansible-operator run local --zap-devel=true.
  4. Create the first CR in the apple namespace. This will create a deployment object and a CronJob in the suspended state.
    kubectl create namespace apple
    kubectl config set-context --current --namespace=apple
    kubectl --namespace apple create -f config/samples/apple_sample.yaml
    
  5. Create the second CR in the banana namespace. This will create a deployment object and a CronJob in the non-suspended state.
    kubectl create namespace banana
    kubectl config set-context --current --namespace=banana
    kubectl --namespace banana create -f config/samples/banana_sample.yaml
    
  6. Now, stop the operator and modify the watches.yaml to only select resources from the apple namespace.
    selector:
      matchExpressions:
        - key: cache.example.com/controller-namespace
          operator: In
          values: [apple]
    
  7. Restart the operator with ansible-operator run local --zap-devel=true.
  8. You should see that the operator reconciles whenever the CronJob in the banana namespace is triggered, even though the operator's watch selector is configured to select only resources from the apple namespace (see the sketch after this list).
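For what it's worth, here is a minimal controller-runtime sketch of one way this behaviour can arise; this is not the ansible-operator's actual wiring, and the GVK and types are placeholders. A label-selector predicate attached to the watch for the primary CR does not automatically apply to owner-based watches on dependent resources such as the CronJob, so their events still enqueue reconciles:

package controllers

import (
	batchv1 "k8s.io/api/batch/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

func setupController(mgr ctrl.Manager, r reconcile.Reconciler) error {
	// Predicate built from the selector used in step 6.
	pred, err := predicate.LabelSelectorPredicate(metav1.LabelSelector{
		MatchExpressions: []metav1.LabelSelectorRequirement{{
			Key:      "cache.example.com/controller-namespace",
			Operator: metav1.LabelSelectorOpIn,
			Values:   []string{"apple"},
		}},
	})
	if err != nil {
		return err
	}

	// Placeholder GVK standing in for the sample CR.
	primary := &unstructured.Unstructured{}
	primary.SetGroupVersionKind(schema.GroupVersionKind{
		Group: "cache.example.com", Version: "v1alpha1", Kind: "Memcached",
	})

	return ctrl.NewControllerManagedBy(mgr).
		// The selector predicate filters events for the primary CR only.
		For(primary, builder.WithPredicates(pred)).
		// Owner-based watch on the dependent CronJob: without its own
		// predicate here, every owned CronJob event is enqueued, regardless
		// of the labels on the owning CR.
		Owns(&batchv1.CronJob{}).
		Complete(r)
}

If the operator's dynamic watches for dependents behave like the unfiltered Owns() watch above, that would explain why the banana CronJob still triggers reconciles despite the selector.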

I have also attached logs from my local execution. Please note that, to restrict the logs to only the testing namespaces, I had set export WATCH_NAMESPACE=apple,banana.

memcached-operator.zip
reconcile_log.txt

Thank you!

@anupchandak
Author

Team - Any comment/update on this issue?

@anupchandak
Author

Hi Team - Have you had a chance to look at this issue?

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 16, 2023
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 16, 2023
@everettraven everettraven transferred this issue from operator-framework/operator-sdk Oct 5, 2023