Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Informer examples #30

Closed
wants to merge 7 commits into from
Closed

Conversation

wfarr
Copy link
Contributor

@wfarr wfarr commented Nov 4, 2016

Demonstrates a basic controller watching Pod events in a cluster.

Tested against kubernetes 1.4.5 on GKE.

attn @caesarxuchao
cc @mkobetic @ibawt @lxfontes


This change is Reviewable

Copy link
Member

@caesarxuchao caesarxuchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. Generally looks good. Made a few suggestions. Thank you, @wfarr!


## Use Cases

* Capturing resource events for logging to external systems (eg. monitor non-Normal Events and publish metrics to a time series database)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a native speaker, why do you use "non-Normal" instead of "abnormal" here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Folks may want to report events that currently have the Warning type (eg. pod create failures) to external error aggregation services or time-series databases. "Normal" is one of the Types possible on an Event (the other being Warning right now, but the typedef stipulates more may be added later). I'll try and think of some better wording for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. How about putting quotes around "Normal"?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wfarr are you still on it , its been a couple months old now

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm addressing some of the remaining code-review comments in a new branch against kubernetes/kubernetes, and fixed this in kubernetes/kubernetes@9645bbf

## Recommendations

The design of `cache.NewInformer` lends itself towards writing simple, idempotent operations.
It includes its own reconciliation loop which refreshes resources every resync period in addition to on-the-fly event handlers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"refreshes resources" is a little bit vague. Every resync period, the informer just sends all the locally cached resources to the ResourceEventHandlerFuncs. So perhaps change to "loops through resources in the local cache..."?

Also, the list/watch mechanism guarantees the controller won't miss any event, so in most cases, resync is not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any sort of retry behavior in the list/watch if a handler on the Informer should fail?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any sort of retry behavior in the list/watch if a handler on the Informer should fail?

Yes. The informer is backed by a controller which keeps a known cache for its backing reflector, which allow the reflector to retry list/watch operations indefinitely while keeping the notifications coherent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL 👍


> If you write a controller that for a ThirdPartyResource that manages the lifecycle of Database objects, you might use `{{.Metadata.Name}}-{{.Metadata.UID}}` as the name of the remote database instance.

Some resources may require more than UID to guarantee uniqueness (for example `v1.Event`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get why event is special?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I called out v1.Event as during the course of writing some tools, I came across https://github.com/kubernetes/kubernetes/blob/master/docs/design/event_compression.md, which covers some interesting uniqueness keys in use for that type specifically.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UID still guarantees the uniqueness of a v1.Event, because you can only get v1.Event object for each UID.

Though one v1.Event could be a compression of a bunch of similar events that exist in the memory by a controller.

panic(err)
}

stopchan := make(chan struct{}, 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider just call it stop. stopchan looks like Hungarian notation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

DeleteFunc: delete,
})

// store can be used to List and Get, never to write directly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth mentioning one should not modify the object acquired from the store.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

fmt.Println("listing pods from store:")
for _, obj := range store.List() {
pod := obj.(*v1.Pod)
fmt.Printf("%#v\n", pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this will be empty?


func update(old, new interface{}) {
oldpod := old.(*v1.Pod)
// newpod := new.(*v1.Pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not printing the newpod as well?

nit: s/oldPod/oldpod

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest set the resync period to 0 in this example, otherwise you will get duplicate updates every resync period.

Tested against kubernetes 1.4.5 on GKE.

Example:

> If you write a controller that for a ThirdPartyResource that manages the lifecycle of Database objects, you might use `{{.Metadata.Name}}-{{.Metadata.UID}}` as the name of the remote database instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, when reading this again, I don't follow the example anymore. Do you mind elaborating it or chatting on slack (I might not be able to reply immediately)?


> If you write a controller that for a ThirdPartyResource that manages the lifecycle of Database objects, you might use `{{.Metadata.Name}}-{{.Metadata.UID}}` as the name of the remote database instance.

Some resources may require more than UID to guarantee uniqueness (for example `v1.Event`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UID still guarantees the uniqueness of a v1.Event, because you can only get v1.Event object for each UID.

Though one v1.Event could be a compression of a bunch of similar events that exist in the memory by a controller.

@wfarr
Copy link
Contributor Author

wfarr commented Dec 3, 2016

Sorry for the delay on revisiting this. All the comments/review should be addressed at this point I think.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 3, 2016
pod := obj.(*v1.Pod)

// This will likely be empty the first run, but may not
fmt.Printf("%#v\n", pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will it not be empty? I guess I don't see why the example tries to list the store at this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the loop should be moved below

go controller.Run(stop)

and consider adding this:

# This way you can wait until a store finished its initial synchonization
cache.WaitForCacheSync(stop, store.HasSynced)

That is very nice to know, when you want to populate secondary caches.
After that the store.List() makes more sense in my eyes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. This doc has more details.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caesarxuchao
Copy link
Member

Thanks @wfarr. Sorry, I didn't get time reviewing it earlier.
I only two nits left. Could you squash though? I'll merge it then. Thanks again.

signal.Notify(signals, syscall.SIGINT, syscall.SIGTERM)
s := <-signals
fmt.Printf("received signal %#v, exiting...\n", s)
close(stop)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would highlight here that the controller will likely not get to actually react to the closed channel and get preempted by the os.Exit on the next line. Alternatively include your trick exploiting the fact that controller.Run is blocking and passing a stopped channel into the controller goroutine that is closed when Run returns.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mkobetic I think I understand the problem you're describing here, but I don't understand the trick you suggested. I'm new to Golang. Please can you provide some further explanation.

@ideahitme
Copy link

ideahitme commented Feb 15, 2017

sorry for the intrusion but is there any plan on getting this merged ? IMO it will be useful to have these examples in the repo

@ideahitme
Copy link

ideahitme commented Feb 16, 2017

example is no longer correct as per the latest code. clientset.CoreClient to be replaced by clientset.Core().RestClient() iirc

@caesarxuchao caesarxuchao mentioned this pull request Feb 22, 2017
11 tasks
## Use Cases

* Capturing resource events for logging to external systems (eg. monitor non-Normal Events and publish metrics to a time series database)
* Creating lifecycle controllers for ThirdPartyResources (eg. coordinate create/update/delete of an external datastore represented via a ThirdPartyResource type)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i have been looking for such examples. Thanks a ton for initiating this. Couple questions, mostly naive :-)

  • i am now seeing shared_informer stuff and changes related to that. Is this example already old now ? It might be helpful to at least put a comment in the doc saying how this differs from it
  • I have seen simple controller examples which dont use the framework. A word or too about that would be helpful as well or how this is better example

})

// store can be used to List and Get
// NEVER modify objects from the store. It's a read-only, local cache.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put some context of this wrt reflector, and cache. My understanding is that reflector uses the source to populate the store and then uses fifo queue for notifications, which are acted by the above add/delete/update functions , is that right ?

@krmayankk
Copy link

@wfarr @caesarxuchao are you guys still on it ? I will try this example to see if it works. Would love if you can revive this and get this in and some more examples like this on usage of controller framework cc @smarterclayton

@wfarr
Copy link
Contributor Author

wfarr commented Feb 27, 2017

I don't really have the time to keep polishing this example up. If someone else wants to pick this up and push it across the finish line, feel free to do so.

Internally at Shopify, we're still making use of more or less the same constructs with good success using v2.0.0 of the client.

@caesarxuchao
Copy link
Member

@mbohlool would you have time to polishing this example? It's almost there. Thanks.

@mbohlool mbohlool self-assigned this Mar 2, 2017
@mbohlool
Copy link
Contributor

mbohlool commented Mar 2, 2017

@caesarxuchao Sure, assigned it to myself to follow up on it. Will send another PR with polished example.

@lavalamp
Copy link
Member

lavalamp commented Mar 2, 2017 via email

@timothysc
Copy link
Member

I'll be able to publish mine early next week, and I'll link back and once it's available.

@mbohlool
Copy link
Contributor

mbohlool commented Mar 6, 2017

@timothysc can you clarify your comment? Are you going to send a PR for this example?

@timothysc
Copy link
Member

@mbohlool I can link to a repo once I open it up. Still doing some vetting atm.

@mbohlool
Copy link
Contributor

mbohlool commented Mar 7, 2017

@timothysc the PR should be send to main kubernetes repo. I am still confused about your comments but assuming you are writing re-writing this example to address comments.

"k8s.io/client-go/kubernetes"
"k8s.io/client-go/pkg/api"
"k8s.io/client-go/pkg/api/v1"
"k8s.io/client-go/pkg/fields"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be k8s.io/apimachinery ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wallrj
Copy link

wallrj commented Apr 13, 2017

In #30 (comment) @ideahitme wrote:

example is no longer correct as per the latest code. clientset.CoreClient to be replaced by clientset.Core().RestClient() iirc

Fixed in kubernetes/kubernetes@c89391b

@wallrj
Copy link

wallrj commented Apr 13, 2017

In #30 (comment) @lavalamp wrote:

I think the canonical location for these examples is currently the main repo...

I've opened a new PR kubernetes/kubernetes#44446 and attempted to address the remaining code review comments.

wallrj added a commit to wallrj/kubernetes that referenced this pull request Apr 13, 2017
@mbohlool
Copy link
Contributor

closing this in favor of kubernetes/kubernetes#44446

@mbohlool mbohlool closed this Apr 13, 2017
wallrj added a commit to wallrj/kubernetes that referenced this pull request Apr 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet