Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use service accounts as clients for controllers #33310

Merged

Conversation

deads2k
Copy link
Contributor

@deads2k deads2k commented Sep 22, 2016

Makes it possible for the controller-manager to use service accounts to run individual controllers. To start, this only enables this feature if this particular controller manager has the power to create service account tokens. Otherwise, the full-powered client is used instead. This is a necessary step on the way to subdividing the authority of controllers.

@kubernetes/sig-auth
@erictune I know you care about this
@ncdc fyi


This change is Reviewable

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Sep 22, 2016
//
// CAUTION: If you update code in this file, you may need to also update code
// in contrib/mesos/pkg/controllermanager/controllermanager.go
package app
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to a sub-package if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to a sub-package if possible.

Sure.

@lavalamp lavalamp assigned erictune and unassigned lavalamp Sep 22, 2016
@erictune
Copy link
Member

I love that you are making progress on this issue.

I want to get feedback from @cjcullen and @roberthbailey

In follow ons to this PR, David is planning to have each controller of the controller manager talk to the apiserver with a different service account for each sub-process of the controller manager.

That seems good to me because it allows for fault isolation of the many pieces of code in the controller manager.

The thing David is doing this in this PR is he is making it possible to mint several new service account tokens at kube-controller-manager startup time.

CJ, Robert and I had been thinking about having some kind of mapping between Kubernetes service accounts and GCP service accounts. I had thought maybe one way to establish this mapping is to have a controller that sees service accounts made in the kubernetes API, and then does some stuff with the GCP API and provides a GCP-managed credential. It would be harder to do this if you do this PR, because there is no service account object to watch.

Is it possible to require that the service accounts for the controller manager exist ahead of time as addon objects, and that the kube-controller-manager mounts all the tokens it needs? That would make upgrades crazy tricky though.


watcher, err := b.CoreClient.Secrets(b.Namespace).Watch(
api.ListOptions{
ResourceVersion: "0",
Copy link
Member

@liggitt liggitt Sep 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing this pop up more and more… why is it needed? Isn't it implicitly 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing this pop up more and more… why is it needed? Isn't it implicitly 0?

I didn't look it up. I would have expected an implicit "", which I would have expected to be "start from now".

@liggitt
Copy link
Member

liggitt commented Sep 23, 2016

It would be harder to do this if you do this PR, because there is no service account object to watch.

Not sure I understand this comment… this PR is creating service account objects, then watching for their API credentials

@erictune
Copy link
Member

Oh, sorry. I skimmed it first and made some assumptions.

Okay, my next question is how is the process going to work when someone adds a new control loop to controller manager.

If RBAC were the only form of Authz, then in a single PR, they can:

  • add a new controller
  • make the SA for the new controller
  • make the Role for the new controller
  • add the SA to the Role

But, if a different authorizer is in use, then it is going to be hard to do that in a single PR.

@deads2k
Copy link
Contributor Author

deads2k commented Sep 23, 2016

But, if a different authorizer is in use, then it is going to be hard to do that in a single PR.

I'd like to get to the point of "merge won't pass without RBAC permissions set properly for controller". That would likely be setting an e2e run with RBAC on and the insecure port disabled.

Other authorizers would be able to add the permissions at their leisure. Because of the way the controllers work, the forbiddens would simply cause the controller to requeue items until they were able to update them.

@deads2k deads2k added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Sep 23, 2016
@lavalamp
Copy link
Member

It's a string, so "" is the default, not "0". It is correct to specify "0".

On Thu, Sep 22, 2016 at 8:32 PM, David Eads notifications@github.com
wrote:

@deads2k commented on this pull request.

In cmd/kube-controller-manager/app/client_builder.go
#33310:

  • // we need the SA UID to find a matching SA token
  • sa, err := b.CoreClient.ServiceAccounts(b.Namespace).Get(name)
  • if err != nil && !apierrors.IsNotFound(err) {
  • return nil, err
    
  • } else if apierrors.IsNotFound(err) {
  • sa, err = b.CoreClient.ServiceAccounts(b.Namespace).Create(
    
  •     &api.ServiceAccount{ObjectMeta: api.ObjectMeta{Namespace: b.Namespace, Name: name}})
    
  • if err != nil {
    
  •     return nil, err
    
  • }
    
  • }
    +
  • watcher, err := b.CoreClient.Secrets(b.Namespace).Watch(
  • api.ListOptions{
    
  •     ResourceVersion: "0",
    

I'm seeing this pop up more and more… why is it needed? Isn't it
implicitly 0?

I didn't look it up. I would have expected an implicit "", which I would
have expected to be "start from now".


You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
#33310, or mute the thread
https://github.com/notifications/unsubscribe-auth/AAngllzQIUM2cifjbIstHS5tYFHoivDRks5qs0hYgaJpZM4KES9Q
.

@liggitt
Copy link
Member

liggitt commented Sep 23, 2016

What is the distinction between "" and "0"?

@lavalamp
Copy link
Member

"" gets interpreted as unset, which I believe will be interpreted as "give
me everything starting from 'now', whatever you think 'now' happens to be".
"0" means "give me everything starting from 0"

On Fri, Sep 23, 2016 at 2:00 PM, Jordan Liggitt notifications@github.com
wrote:

What is the distinction between "" and "0"?


You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
#33310 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglnYVUDCEdQpezHwQdM_7TH2sPTh0ks5qtD3dgaJpZM4KES9Q
.

@liggitt
Copy link
Member

liggitt commented Sep 23, 2016

If that's the case, won't "0" cause a 410 error once resourceVersion 0 is outside the watch window?

@lavalamp
Copy link
Member

Yes, it will. I thought this was test code, though?

On Fri, Sep 23, 2016 at 2:19 PM, Jordan Liggitt notifications@github.com
wrote:

Won't "0" cause a 410 error once resourceVersion 0 is outside the watch
window?


You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
#33310 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglml8DxLsyl3oQ_kXip4kj2o4qC_Gks5qtEJggaJpZM4KES9Q
.

@liggitt
Copy link
Member

liggitt commented Sep 23, 2016

No, not test code. I really thought omitted resourceVersion and resourceVersion "0" were equivalent, and that 0 was special cased in etcd storage and in the watch cache, but I'm not at a computer to verify

@ncdc
Copy link
Member

ncdc commented Sep 26, 2016

See also

// watching from resourceVersion 0, starts the watch at ~now and
// will return an initial watch event. Starting form ~now, rather
// the rv of the object will insure that we start the watch from
// inside the watch window, which the rv of the object might not be.
rv := "0"

@erictune
Copy link
Member

Other authorizers would be able to add the permissions at their leisure. Because of the way the controllers work, the forbiddens would simply cause the controller to requeue items until they were able to update them.

If you add a new service accounts called "deployment-controller" in namespace "kube-system", and the deployment controller loop uses that service account, then all the e2es which use non-RBAC authorizers will not pass. So you won't be able to merge that PR. So, is your plan:

  1. change other cluster setup methods in the same PR, so that e2es will pass
  2. make use of the new service accounts conditional, and off by default, and only tested by unit tests.
  3. something I haven't thought of.

@deads2k
Copy link
Contributor Author

deads2k commented Sep 26, 2016

If you add a new service accounts called "deployment-controller" in namespace "kube-system", and the deployment controller loop uses that service account, then all the e2es which use non-RBAC authorizers will not pass. So you won't be able to merge that PR. So, is your plan:

Pulling the list from the current CI:

  1. gce node e2e
  2. gce e2e
  3. gke smoke e2e
  4. kubemark gce e2e

Do you know which ones (if any) we'd be able to get to switch to using RBAC? We could also consider things like having the other authorizer whitelist SAs from kube-system (there's already an auto-group) to do anything.

@deads2k
Copy link
Contributor Author

deads2k commented Sep 29, 2016

@erictune Are you ok with this in principle? Even if we end up having to make it an option, the structure I've provided makes that pretty easy to do.

If you're ok with it in principle, I can get @liggitt to review the details.

@deads2k deads2k force-pushed the controller-04-auto-create-SA branch from 1f0f79e to f1e9766 Compare October 4, 2016 14:03
@deads2k
Copy link
Contributor Author

deads2k commented Oct 4, 2016

Comments addressed.

var clientBuilder controller.ControllerClientBuilder
if len(s.ServiceAccountKeyFile) > 0 {
clientBuilder = controller.SAControllerClientBuilder{
ClientConfig: kubeconfig,
Copy link
Member

@liggitt liggitt Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to remove the credentials from the client config (upstream AnonymousClientConfig?)

// config returns a complete clientConfig for constructing clients. This is separate in anticipation of composition
// which means that not all clientsets are known here
func (b SAControllerClientBuilder) Config(name string) (*restclient.Config, error) {
clientConfig := *b.ClientConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or remove credentials from the copy here

if err != nil && !apierrors.IsNotFound(err) {
return nil, err
} else if apierrors.IsNotFound(err) {
sa, err = b.CoreClient.ServiceAccounts(b.Namespace).Create(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to ensure the namespace exists as well

case watch.Deleted:
return false, nil
case watch.Error:
return false, fmt.Errorf("error watching")
Copy link
Member

@liggitt liggitt Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with high write loads, we could exceed the watch window before we get a token. add a TODO to ListWatchUntil to handle re-establishment, and include the watch error we encountered in the returned error message

case watch.Error:
return false, fmt.Errorf("error watching")
case watch.Added, watch.Modified:
secret := event.Object.(*api.Secret)
Copy link
Member

@liggitt liggitt Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serviceaccount.IsServiceAccountToken(secret, sa) will do the type/name/uid checks for you

len(secret.Data[api.ServiceAccountTokenKey]) == 0 {
return false, nil
}
clientConfig.BearerToken = string(secret.Data[api.ServiceAccountTokenKey])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restclient.AddUserAgent(&clientConfig, serviceaccount.MakeUsername(b.Namespace, name))

Copy link
Member

@liggitt liggitt Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO or follow-up issue to verify the token is valid? c.f. openshift/origin#3498

@@ -206,20 +220,50 @@ func Run(s *options.CMServer) error {
panic("unreachable")
}

func StartControllers(s *options.CMServer, kubeconfig *restclient.Config, stop <-chan struct{}, recorder record.EventRecorder) error {
func StartControllers(s *options.CMServer, kubeconfig *restclient.Config, rootClientBuilder, clientBuilder controller.ControllerClientBuilder, stop <-chan struct{}, recorder record.EventRecorder) error {
client := func(userAgent string) clientset.Interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/userAgent/serviceAccountName/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and sweep what we're passing in as userAgent to make sure we're happy with the names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and sweep what we're passing in as userAgent to make sure we're happy with the names

those haven't changed.

@deads2k deads2k force-pushed the controller-04-auto-create-SA branch from f1e9766 to 3f8927b Compare October 5, 2016 14:27
@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 5, 2016
@deads2k deads2k force-pushed the controller-04-auto-create-SA branch from 3f8927b to 59156de Compare October 5, 2016 14:48
@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GKE smoke e2e failed for commit 59156deae9e02b96cb781605b339026e6891e706. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 5, 2016
@deads2k
Copy link
Contributor Author

deads2k commented Oct 5, 2016

@liggitt before I rebase, looks ok?

@liggitt
Copy link
Member

liggitt commented Oct 5, 2016

LGTM

@liggitt
Copy link
Member

liggitt commented Oct 5, 2016

even assuming rbac, will need a way to coordinate the names used for the service accounts (here in the controllermanager) and the auto-assigned rolebindings giving those permissions (in the apiserver bootstrap)

@deads2k deads2k force-pushed the controller-04-auto-create-SA branch from 59156de to 8ea2acc Compare October 5, 2016 17:15
@deads2k deads2k added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 5, 2016
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 5, 2016
@deads2k deads2k added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 5, 2016
@liggitt
Copy link
Member

liggitt commented Oct 5, 2016

also needs to give deployers a way to continue using the kubeconfig credentials for all the controllers, for those who have secured setups and authz already in place, and aren't interested in subdividing

@deads2k
Copy link
Contributor Author

deads2k commented Oct 5, 2016

also needs to give deployers a way to continue using the kubeconfig credentials for all the controllers, for those who have secured setups and authz already in place, and aren't interested in subdividing

opened #34133

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 234c562 into kubernetes:master Oct 5, 2016
@deads2k deads2k deleted the controller-04-auto-create-SA branch February 1, 2017 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants