Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use shared informers in gc controller if possible #45427

Merged
merged 1 commit into from May 23, 2017

Conversation

ncdc
Copy link
Member

@ncdc ncdc commented May 5, 2017

Modify the garbage collector controller to try to use shared informers for resources, if possible, to reduce the number of unique reflectors listing and watching the same thing.

cc @kubernetes/sig-api-machinery-pr-reviews @caesarxuchao @deads2k @liggitt @sttts @smarterclayton @timothysc @soltysh @Kargakis @kubernetes/rh-cluster-infra @derekwaynecarr @wojtek-t @gmarek

@ncdc ncdc added release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels May 5, 2017
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 5, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 5, 2017
@ncdc
Copy link
Member Author

ncdc commented May 5, 2017

@k8s-bot bazel test this

// add the event to the dependencyGraphBuilder's graphChanges.
AddFunc: func(obj interface{}) {
setObjectTypeMeta(obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this was originally needed. Can runtime objects exist w/o a Kind?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if you don't have this, kind and apiVersion are empty

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, we needed it because TypeMeta is reset when unmarshalled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// +k8s:conversion-fn=drop
func Convert_v1_TypeMeta_To_v1_TypeMeta(in, out *TypeMeta, s conversion.Scope) error {
// These values are explicitly not copied
//out.APIVersion = in.APIVersion
//out.Kind = in.Kind
return nil
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing this? Does it mean that any controller that cares about TypeMeta needs to reset the fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kargakis I don't remember the history. @smarterclayton or @deads2k or @liggitt or @lavalamp probably do?

return nil, fmt.Errorf("expected runtime.Object, got %#v", obj)
}
if clone {
copy, err := api.Scheme.DeepCopy(runtimeObject)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this DeepCopy affect the performance?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's presumably going to do memory allocations and take some cpu time that it wouldn't otherwise be doing. I haven't measured it yet. I wanted to get this up for review first.

I just looked a bit more, and it only uses the object so it can get access to the meta.Accessor and meta.TypeAccessor. Maybe we could be more efficient and not use the full object?

Copy link
Member Author

@ncdc ncdc May 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could not clone and instead include the GVK as part of the event object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible change that removes the need to clone: https://gist.github.com/ncdc/24b2716d448cd820eac897c4e5ba5521

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the changes in my gist pass unit & integration tests locally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed this as the 2nd commit in this PR. PTAL

@ncdc
Copy link
Member Author

ncdc commented May 5, 2017

@k8s-bot bazel test this

1 similar comment
@ncdc
Copy link
Member Author

ncdc commented May 5, 2017

@k8s-bot bazel test this

rbac.NewRule("list", "watch").Groups(certificatesGroup).Resources("certificatesigningrequests").RuleOrDie(),
rbac.NewRule("list", "watch").Groups(storageGroup).Resources("storageclasses").RuleOrDie(),
// Needed for all shared informers
rbac.NewRule("list", "watch").Groups("*").Resources("*").RuleOrDie(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of scary. It means the core controllers can see anything in the entire system, whereas before it had to be enumerated. Not terribly different from the loopback controller, but means we don't have a resource that you can't add to garbage collector (like can you use this to fish for token names by doing timing channel attacks on the gc controller?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Loopback client

Copy link
Member Author

@ncdc ncdc May 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debated doing this vs enumerating. The problem with enumerating is that as new types are added, people likely will forget to add them to the policy, and then you'll see RBAC denials in the logs that may or may not be meaningful (depending on whether or not the new types need to participate in GC).

Also the current GC controller rule is exactly this, but it's scoped to the GC controller and not the controller-manager's client. So this does expand what the controller-manager client can list/watch, but it doesn't broaden the GC controller's powers, as they're already this broad.

We don't need to rush this PR in, so let's discuss how we want to proceed and I'll make the changes once we reach a consensus.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton have you had any more time to think about this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Game theory:

  1. We add a new resource which is super powerful and the garbage collection controller shouldn't have.
  2. We close the hole where garbage collector seeing secrets means it's root on the cluster anyway
  3. Would everyone have to migrate from * to a maintained list because 1 now shouldn't be given to GC?

Is there anything more powerful than the GC controller already is (can delete everything in the cluster)? Should the GC controller not control PodSecurityPolicy by default? Should the GC controller control RBAC by default?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton I would expect 1. the encryption in the storage to be configurable so I can encrypt more than secrets and 2. I could see the etcd write key being stored in its own TPR that I would probably not want the GC to have access to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would everyone have to migrate from * to a maintained list because 1 now shouldn't be given to GC?

So we would expect anyone contributing an aggregated API server or a customresource will create another role and bind the garbage collector to it? It could be done, but we should set the expectation now so we can start working through examples.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We add a new resource which is super powerful and the garbage collection controller shouldn't have.

Add it where - to the core? As a TPR? Via an extension apiserver? Or maybe it doesn't matter and the answer is just "yes" 😄.

To recap where we are today, in the master branch, without my PR:

  1. The GC controller client is allowed to get, list, watch, patch, update, and delete / (
    rbac.NewRule("get", "list", "watch", "patch", "update", "delete").Groups("*").Resources("*").RuleOrDie(),
    )
  2. The GC controller only sets up monitoring resources when it is constructed, and it doesn't refresh (e.g. to pick up new TPRs) unless I'm missing seeing it (
    if err := gb.monitorsForResources(deletableResources); err != nil {
    )

My PR grants the controller-manager the ability to list/watch /, so all current and future shared informers can function without having to remember to modify the policy.

Maybe we need to consider looking into a way to restrict which portions of the code (controllers) are allowed to use specific shared informers? Otherwise I don't have a good idea for how to use shared informers in the GC controller without granting list/watch of / to the controller-manager client.

@ncdc
Copy link
Member Author

ncdc commented May 5, 2017

@k8s-bot bazel test this

},
}

shared, err := gb.sharedInformers.ForResource(resource)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where you're restarting the shared informers. This could suddenly start watching a previously unwatched resource, so you'll need to be sure that you start the informers again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the informer needs to restart when gc restarts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current flow with this PR looks like this:

  1. Create GC controller
    1. Create "monitors" aka controllers and set up event handlers
  2. Run GC controller
    1. Call Run() on all the monitors. If the monitor is a dummy controller from a shared informer, this will be a no-op. Otherwise, it will start the lister/watcher/reflector from the non-shared controller.
  3. Instantiate and run various other controllers
  4. Start the shared informers

Given that the GC controller doesn't currently reread discovery and create new monitors for newly seen resources (or does it???), can we punt on this for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the GC controller doesn't currently reread discovery and create new monitors for newly seen resources (or does it???), can we punt on this for now?

It will happens in 1.7. I guess it doesn't have to be this pull, but definitely place a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all controllers (gc, replicaset controller, etc.) always restart at the same time? I think it's true, so i think the current PR works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caesarxuchao what are you referring to re controllers restarting? AFAIK all controllers start when the kube-controller-manager starts, and they all stop when that process terminates.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ignore the "restarting" part. What i was concerned with is the following situation: the shared informer had started by other controllers, then GC starts, and it misses the earlier events. This case seems to be impossible since GC calls Run() on all monitors when it starts.

But will other controllers call Run() on a shared informer? Will that cause GC see duplicate events (though GC should operate correctly if there are duplicate events)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 3 similar functions that can start these things:

  1. shared informer factory Start() - this starts any initialized but unstarted shared informers in the factory. It's a no-op for ones that are already started.
  2. shared informer Run() - this starts running the given shared informer, and is what (1) calls. This is never invoked anywhere in the GC controller code.
  3. controller Run() - this runs a controller, and is what the GC controller calls when it's starting monitors. For shared informers, they return a dummy controller whose Run() implementation is a no-op, so we're safe here.

There is a potential danger if multiple callers invoke a shared informer's Run() function, and we should probably guard against that happening, but it's not in the code right now.

Does this answer your question?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, @ncdc .

this runs a controller, and is what the GC controller calls when it's starting monitors. For shared informers, they return a dummy controller whose Run() implementation is a no-op, so we're safe here.

Does this mean the gc will miss the ADD events for the objects that are already in the sharedInformer before gc invokes the dummyController.Run()?

// TODO: consider store in one storage.
glog.V(5).Infof("create storage for resource %s", resource)
client, err := gb.metaOnlyClientPool.ClientForGroupVersionKind(kind)
if err != nil {
return nil, err
}
gb.registeredRateLimiterForControllers.registerIfNotPresent(resource.GroupVersion(), client, "garbage_collector_monitoring")
setObjectTypeMeta := func(obj interface{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this mutation become unnecessary in the non-shared informer case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I pass down the gvk, which has all the info we need, in the event

// Check if the node already exsits
existingNode, found := gb.uidToNode.Read(accessor.GetUID())
switch {
case (event.eventType == addEvent || event.eventType == updateEvent) && !found:
newNode := &node{
identity: objectReference{
OwnerReference: metav1.OwnerReference{
APIVersion: typeAccessor.GetAPIVersion(),
Kind: typeAccessor.GetKind(),
APIVersion: event.gvk.GroupVersion().String(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build up the string, don't rely on the stringification for display purposes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC typeAccessor.GetAPIVersion() is doing exactly this, but I can build it up instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted with @deads2k on irc. typeAccessor.GetAPIVersion() just does this under the covers:

obj.GetObjectKind().GroupVersionKind().GroupVersion().String()

which is a multi-line function that has some special casing for "v1" vs everything else. David said he won't block the PR on this (given that it's not a change in functionality).

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 12, 2017
@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels May 16, 2017
@ncdc
Copy link
Member Author

ncdc commented May 16, 2017

@k8s-bot unit test this #44846
@k8s-bot kops aws e2e test this

@ncdc
Copy link
Member Author

ncdc commented May 16, 2017

All tests are green. Ready for final review, I think. We also still need to resolve @smarterclayton's question about granting the controller manager client permission to list/watch everything. I'll squash once we've resolved the outstanding policy question.

@ncdc
Copy link
Member Author

ncdc commented May 17, 2017 via email

@ncdc
Copy link
Member Author

ncdc commented May 18, 2017 via email

@caesarxuchao
Copy link
Member

@ncdc you are right. They are handled by the listener and processor logic. Sorry i should had read the sharedInformer code more patiently.

@ncdc
Copy link
Member Author

ncdc commented May 22, 2017

Bump, anyone have any other comments?

@smarterclayton
Copy link
Contributor

smarterclayton commented May 22, 2017

I'm fine with */*. If no other comments, this is lgtm as well

@ncdc
Copy link
Member Author

ncdc commented May 22, 2017

Squashed

@ncdc ncdc added this to the v1.7 milestone May 22, 2017
@ncdc
Copy link
Member Author

ncdc commented May 22, 2017

@k8s-bot pull-kubernetes-federation-e2e-gce test this

@k8s-ci-robot
Copy link
Contributor

@ncdc: The following test(s) failed:

Test name Commit Details Rerun command
pull-kubernetes-federation-e2e-gce 2480f2c link @k8s-bot pull-kubernetes-federation-e2e-gce test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@ncdc
Copy link
Member Author

ncdc commented May 22, 2017

Green tests (federation is non-blocking due to flakiness for now)

@smarterclayton
Copy link
Contributor

I'd like one or two reviewers to chime in

@ncdc
Copy link
Member Author

ncdc commented May 22, 2017

Definitely

@deads2k
Copy link
Contributor

deads2k commented May 22, 2017

lgtm

@caesarxuchao
Copy link
Member

The gc changes lgtm.

@smarterclayton
Copy link
Contributor

/lgtm

as per reviewers

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 22, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ncdc, smarterclayton

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 22, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 46201, 45952, 45427, 46247, 46062)

@k8s-github-robot k8s-github-robot merged commit cc6e51c into kubernetes:master May 23, 2017
@ncdc ncdc deleted the gc-shared-informers branch October 22, 2018 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants