Avoid copying aggregated admin/edit/view roles during bootstrap #63761

Merged
k8s-github-robot merged 1 commit into kubernetes:master from liggitt:aggregated-bootstrap-race on May 14, 2018


liggitt commented May 13, 2018

Fixes #63760

At apiserver startup, prior to reconciling cluster roles, the following roles (if they exist) are copied:

  • admin -> system:aggregate-to-admin
  • edit -> system:aggregate-to-edit
  • view -> system:aggregate-to-view

This copy was added in 1.9 as part of role aggregation. It ensures that custom permissions added to the admin/edit/view roles are preserved before those roles become aggregated, since the permissions of an aggregated role are controller-managed.

When starting multiple members of a new HA cluster simultaneously, the following race can occur:

  • t=0, servers 1, 2, and 3 start up
  • t=1, server 1 finds no admin/edit/view roles exist, begins role reconciliation and creates the aggregated admin role
  • t=2, server 2 finds and copies the admin role created by server 1 to system:aggregate-to-admin

If this race is encountered, system:aggregate-to-admin ends up as an aggregated role, and its permissions are subject to being overwritten by the aggregating controller. To prevent this, the permission-preserving copy should only copy roles that are not yet aggregated.
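
A minimal sketch of that guard, using simplified stand-in types rather than the real rbac/v1 API objects and an in-memory map instead of an API client. The helper name `primeAggregatedRoles`, the string-based `Rules` field, and the skip-if-target-exists check are assumptions made for illustration, not the code in this PR:

```go
package main

import "fmt"

// AggregationRule and ClusterRole are simplified stand-ins for the rbac/v1
// types; only the fields this sketch needs are modeled (assumption).
type AggregationRule struct {
	MatchLabels map[string]string
}

type ClusterRole struct {
	Name            string
	Rules           []string // simplified: one string per policy rule
	AggregationRule *AggregationRule
}

// copyTargets mirrors the source -> target mapping from the description.
var copyTargets = map[string]string{
	"admin": "system:aggregate-to-admin",
	"edit":  "system:aggregate-to-edit",
	"view":  "system:aggregate-to-view",
}

// primeAggregatedRoles (hypothetical helper) copies each existing source
// role to its target name, skipping any source that is already aggregated.
// It returns the names of the roles it created.
func primeAggregatedRoles(store map[string]ClusterRole) []string {
	var created []string
	for _, src := range []string{"admin", "edit", "view"} {
		role, ok := store[src]
		if !ok {
			continue // source role does not exist, nothing to preserve
		}
		if role.AggregationRule != nil {
			continue // already aggregated, e.g. created by another server
		}
		dst := copyTargets[src]
		if _, exists := store[dst]; exists {
			continue // assumption: never overwrite an existing target
		}
		store[dst] = ClusterRole{Name: dst, Rules: append([]string(nil), role.Rules...)}
		created = append(created, dst)
	}
	return created
}

func main() {
	// State seen by server 2 at t=2: server 1 already created an aggregated
	// admin role, while edit still carries custom, non-aggregated rules.
	store := map[string]ClusterRole{
		"admin": {Name: "admin", AggregationRule: &AggregationRule{
			MatchLabels: map[string]string{"rbac.authorization.k8s.io/aggregate-to-admin": "true"},
		}},
		"edit": {Name: "edit", Rules: []string{"get pods", "get widgets"}},
	}
	fmt.Println(primeAggregatedRoles(store)) // prints [system:aggregate-to-edit]
}
```

With the guard in place, the aggregated admin role created by server 1 is never copied, so system:aggregate-to-admin is left for normal reconciliation to create.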

To correct this in clusters that have already encountered it, role reconciliation should remove aggregation from a role that is not expected to be aggregated at all.
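
As a rough illustration of that corrective step, the sketch below clears an unexpected aggregation rule during reconciliation. The types are simplified stand-ins and `reconcileAggregation` is a hypothetical name; rule reconciliation itself is out of scope here and assumed to happen elsewhere:

```go
package main

import "fmt"

// Simplified stand-ins for the rbac/v1 types (assumption, not the real API).
type AggregationRule struct {
	MatchLabels map[string]string
}

type ClusterRole struct {
	Name            string
	Rules           []string
	AggregationRule *AggregationRule
}

// reconcileAggregation (hypothetical helper) returns the role to persist and
// whether an update is needed. If the expected role is not aggregated but the
// existing one is, the aggregation rule is dropped so the aggregating
// controller stops managing the role's permissions.
func reconcileAggregation(expected, existing ClusterRole) (ClusterRole, bool) {
	if expected.AggregationRule == nil && existing.AggregationRule != nil {
		fixed := existing // struct copy; the caller's value is not modified
		fixed.AggregationRule = nil
		return fixed, true
	}
	return existing, false
}

func main() {
	expected := ClusterRole{Name: "system:aggregate-to-admin", Rules: []string{"get pods"}}
	// A role left behind by the race: it wrongly carries an aggregation rule.
	existing := ClusterRole{
		Name: "system:aggregate-to-admin",
		AggregationRule: &AggregationRule{
			MatchLabels: map[string]string{"rbac.authorization.k8s.io/aggregate-to-admin": "true"},
		},
	}
	fixed, changed := reconcileAggregation(expected, existing)
	fmt.Println(changed, fixed.AggregationRule == nil) // prints true true
}
```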

corrects a race condition in bootstrapping aggregated cluster roles in new HA clusters

deads2k commented May 14, 2018

This pull reminds me: what did we ever do about irreconcilable differences on rolebindings?

/lgtm

k8s-ci-robot commented May 14, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-github-robot commented May 14, 2018

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@deads2k @liggitt

Pull Request Labels
  • sig/auth: Pull Request will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the pull request owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/bug: Fixes a bug discovered during the current release.

liggitt commented May 14, 2018

What did we ever do about irreconcilable differences on rolebindings?

We've always deleted/recreated if needed:

// Reset the binding completely if the roleRef is different
if expected.GetRoleRef() != existing.GetRoleRef() {
	result.RoleBinding = expected
	result.Operation = ReconcileRecreate
	return result, nil
}

case ReconcileRecreate:
	// Try deleting
	err := o.Client.Delete(existingBinding.GetNamespace(), existingBinding.GetName(), existingBinding.GetUID())
	switch {
	case err == nil, errors.IsNotFound(err):
		// object no longer exists, as desired
	case errors.IsConflict(err):
		// delete failed because our UID precondition conflicted
		// this could mean another object exists with a different UID, re-run
		return o.run(attempts + 1)
	default:
		// return other errors
		return nil, err
	}
	// continue to create
	fallthrough
case ReconcileCreate:

k8s-github-robot commented May 14, 2018

/test all [submit-queue is verifying that this PR is safe to merge]

k8s-github-robot commented May 14, 2018

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

k8s-github-robot merged commit d5a930b into kubernetes:master on May 14, 2018

15 of 16 checks passed

  • Submit Queue: Required Github CI test is not green: pull-kubernetes-e2e-gce
  • cla/linuxfoundation: liggitt authorized
  • pull-kubernetes-bazel-build: Job succeeded.
  • pull-kubernetes-bazel-test: Job succeeded.
  • pull-kubernetes-cross: Skipped
  • pull-kubernetes-e2e-gce: Job succeeded.
  • pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
  • pull-kubernetes-e2e-gke: Skipped
  • pull-kubernetes-e2e-kops-aws: Job succeeded.
  • pull-kubernetes-integration: Job succeeded.
  • pull-kubernetes-kubemark-e2e-gce: Job succeeded.
  • pull-kubernetes-local-e2e: Skipped
  • pull-kubernetes-local-e2e-containerized: Skipped
  • pull-kubernetes-node-e2e: Job succeeded.
  • pull-kubernetes-typecheck: Job succeeded.
  • pull-kubernetes-verify: Job succeeded.

liggitt deleted the liggitt:aggregated-bootstrap-race branch on May 14, 2018

k8s-github-robot pushed a commit that referenced this pull request May 15, 2018

Kubernetes Submit Queue
Merge pull request #63762 from liggitt/automated-cherry-pick-of-#6376…
…1-upstream-release-1.10

Automatic merge from submit-queue.

Automated cherry pick of #63761: Avoid copying aggregated admin/edit/view roles during

Cherry pick of #63761 on release-1.10.

#63761: Avoid copying aggregated admin/edit/view roles during

k8s-github-robot pushed a commit that referenced this pull request May 15, 2018

Kubernetes Submit Queue
Merge pull request #63763 from liggitt/automated-cherry-pick-of-#6376…
…1-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #63761: Avoid copying aggregated admin/edit/view roles during

Cherry pick of #63761 on release-1.9.

#63761: Avoid copying aggregated admin/edit/view roles during