Bug 1776079: Azure: check existing role assignments before creating a new one #137

Merged

joelddiaz
Contributor

Currently, when re-reconciling an already-provisioned CredentialsRequest in Azure, the actuator always attempts to create a role assignment, even if it already exists. Error handling catches the "RoleAssignmentExists" error, and the actuator moves along to the next task.

As a result, the Activity Log (in the Azure console) of the Resource Group where the cluster is installed periodically records these (non-critical) errors:

{
    "authorization": {
        "action": "Microsoft.Authorization/roleAssignments/write",
        "scope": "/subscriptions/SUBSCRIPTION_ID/resourceGroups/jdiaz-az-gpqgx-rg/providers/Microsoft.Authorization/roleAssignments/94025186-5e7b-4e18-88de-4625cac3ed19"
    },
    "level": "Error",
    "operationName": {
        "value": "Microsoft.Authorization/roleAssignments/write",
        "localizedValue": "Create role assignment"
    },
    "resourceGroupName": "jdiaz-az-gpqgx-rg",
    "subStatus": {
        "value": "Conflict",
        "localizedValue": "Conflict (HTTP Status Code: 409)"
    },
    "properties": {
        "statusCode": "Conflict",
        "serviceRequestId": "931445cc-fbc6-469a-a6f4-2f7750788255",
        "statusMessage": "{\"error\":{\"code\":\"RoleAssignmentExists\",\"message\":\"The role assignment already exists.\"}}"
    }
}

Change the logic to pull the current list of Role Assignments so that we can avoid making unnecessary CreateRoleAssignment calls.
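
The check-before-create flow described above can be sketched roughly as follows. This is only an illustration under assumptions: `roleAssignment`, `assignmentExists`, and `ensureAssignment` are hypothetical names, the struct is a simplified stand-in for the Azure SDK type, and which fields the real actuator compares is not shown in this conversation.

```go
package main

import "fmt"

// roleAssignment is a hypothetical, simplified stand-in for the Azure SDK
// role assignment type; the field set is an assumption for illustration.
type roleAssignment struct {
	Scope            string
	RoleDefinitionID string
	PrincipalID      string
}

// assignmentExists reports whether an identical assignment is already
// present in the list fetched from the cloud.
func assignmentExists(current []roleAssignment, want roleAssignment) bool {
	for _, r := range current {
		if r == want {
			return true
		}
	}
	return false
}

// ensureAssignment calls create only when no matching assignment exists.
// It returns true when create was actually invoked.
func ensureAssignment(current []roleAssignment, want roleAssignment, create func(roleAssignment) error) (bool, error) {
	if assignmentExists(current, want) {
		// Nothing to do, so no write request (and no 409 Conflict entry) is generated.
		return false, nil
	}
	if err := create(want); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	// Placeholder values, not real Azure identifiers.
	existing := []roleAssignment{{
		Scope:            "/subscriptions/SUBSCRIPTION_ID/resourceGroups/example-rg",
		RoleDefinitionID: "example-role",
		PrincipalID:      "example-principal",
	}}
	create := func(r roleAssignment) error {
		fmt.Println("creating assignment for scope", r.Scope)
		return nil
	}

	created, err := ensureAssignment(existing, existing[0], create)
	fmt.Println(created, err)
}
```

The trade-off is one extra list call per reconcile in exchange for not issuing a write that is expected to fail.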

@openshift-ci-robot
Contributor

@joelddiaz: This pull request references Bugzilla bug 1776079, which is invalid:

  • expected the bug to target the "4.4.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1776079: Azure: check existing role assignments before creating a new one

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot added the bugzilla/invalid-bug, size/M, and approved labels Nov 25, 2019
@joelddiaz
Contributor Author

/bugzilla refresh

@openshift-ci-robot added the bugzilla/valid-bug label Nov 25, 2019
@openshift-ci-robot
Contributor

@joelddiaz: This pull request references Bugzilla bug 1776079, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot removed the bugzilla/invalid-bug label Nov 25, 2019
Contributor

@dgoodwin left a comment

Were you able to reproduce what the customer found?

// check whether assignment already exists
alreadyExists := false
for _, r := range currentRoleAssignments {
	if strings.Contains(*r.Properties.Scope, scope) {
Contributor

Can you add a comment here clarifying why we can't do an eq and what the data strings look like? I see scope above but what does r.Properties.Scope come out looking like? The test data looks awfully similar.

Contributor Author

The Contains() comparison is there because our generated scope string doesn't have a leading /, while the returned r.Properties.Scope does.

I reworked the generated string a few lines above to include the leading /, so we can now do a simple string equality comparison. Adding the leading / still allows all the minting to work.

@joelddiaz
Contributor Author

Were you able to reproduce what the customer found?

Yes. I stood up an Azure cluster and by forcing CCO to re-reconcile a CredentialsRequest, you could then see the log messages on the Azure console (that's how I got that JSON example ;) )

@joelddiaz
Contributor Author

/test e2e-azure

@joelddiaz
Contributor Author

/test e2e-azure

@dgoodwin
Contributor

/lgtm

@openshift-ci-robot added the lgtm label Nov 26, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin, joelddiaz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

6 similar comments

@openshift-merge-robot merged commit 0eab472 into openshift:master Nov 27, 2019
@openshift-ci-robot
Contributor

@joelddiaz: All pull requests linked via external trackers have merged. Bugzilla bug 1776079 has been moved to the MODIFIED state.

In response to this:

Bug 1776079: Azure: check existing role assignments before creating a new one

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pearj

pearj commented Nov 27, 2019

Were you able to reproduce what the customer found?

Yes. I stood up an Azure cluster and by forcing CCO to re-reconcile a CredentialsRequest, you could then see the log messages on the Azure console (that's how I got that JSON example ;) )

Thanks for fixing this so quickly!

Out of curiosity, how do you force CCO to re-reconcile? Just delete the container and then it runs as soon as it starts?

@dgoodwin
Contributor

That will work, as will just adding a meaningless annotation to a credentials request.

@joelddiaz
Contributor Author

Out of curiosity, how do you force CCO to re-reconcile? Just delete the container and then it runs as soon as it starts?

As long as you can get the code to pass through this throttling/sanity check https://github.com/openshift/cloud-credential-operator/blob/master/pkg/controller/credentialsrequest/credentialsrequest_controller.go#L433-L439 , then that will cause the code to go through a full re-reconcile.

@joelddiaz deleted the skip-role-assignmentt branch June 16, 2020 17:55
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
6 participants