Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix clusterrolebinding update #353

Merged
merged 1 commit into from
Aug 11, 2021

Conversation

clamoriniere
Copy link
Collaborator

@clamoriniere clamoriniere commented Aug 10, 2021

What does this PR do?

ClusterRoleBinding can't be update if we change its RoleRef because this field
is immutable.

To fix the issue, we now do Delete and Create.

This PR also refactor the code base to define only one function responsible of updating a ClusterRoleBinding.

Motivation

In operator v0.7, the clusterchecksrunner has its owner clusterrole (not shared with the agent ds).
So to migrate from Operator v0.6, the ClusterRoleBinding need to be updated. but a simple Update()
doesn't work due to the RoleRef immutability

Additional Notes

N/A

Describe your test plan

  1. Deploy the operator v0.6.0
  2. Deploy the DatadogAgent with clusterchecksrunner enabled.
  3. Chek the status of the DatadogAgent, it should be Active=true like here
    NAME            ACTIVE   AGENT                CLUSTER-AGENT     CLUSTER-CHECKS-RUNNER   AGE
    datadog-agent   True     Running (15/15/15)   Running (2/2/2)   Running (3/3/3)          3m
  4. Update the operator to use v0.7.0(-rc.x)
  5. Wait that the new operator instance get the leadership
  6. The status of the DatadogAgent should still be Active=true
  7. You can also check that the ClusterRoleBinding has been recreated. (timestamp)

@clamoriniere clamoriniere requested a review from a team as a code owner August 10, 2021 13:21
@clamoriniere clamoriniere added this to the v0.7 milestone Aug 10, 2021
@clamoriniere clamoriniere added the bug Something isn't working label Aug 10, 2021
if err := r.client.Delete(context.TODO(), clusterRoleBinding); err != nil {
return reconcile.Result{}, err
}
if err := r.client.Create(context.TODO(), newClusterRoleBinding); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we ignore the "not exists" error in the Delete above?

Not sure if the following could happen:

  • Delete is successful
  • Create fails because an error connection
  • When the job is retried it will fail because the Delete will return "not found"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it was delete, we should not go again in the update function, but instead in the create function. So I think it should be ok. Worth case it will take 2 reconcile
Loop

Copy link
Member

@davidor davidor Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right 👍

@clamoriniere clamoriniere merged commit 18dac27 into main Aug 11, 2021
@clamoriniere clamoriniere deleted the clamoriniere/fix-rolebinding-update branch August 11, 2021 08:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants