Implement workqueue #63
Conversation
@lilic are your E2E tests going to cover the functionality of the peer IP handling, e.g. that we correctly replace the IP of a dead pod?
@asymmetric My idea was that we just need to test that a CM was updated after our Pod died; not sure we need to care whether it was the correct IP. I think that might be more of a unit test concern, wdyt?
Before we can merge this I would say we should take care of:
- Watching Deployments to avoid unnecessary calls.
- Saving a store/cache of all the resources we do a Get/List call to. I think that only needs to be SG, Pods and Deployments for now.
pkg/habitat/controller/controller.go (outdated)

queue workqueue.RateLimitingInterface
store cache.Store
Maybe we can describe this better and call it something like SGStore?
For now it's the only store though :)
Maybe, but it still only stores the SG cache, so we could still rename it to describe what it stores. It's up to you, I guess.
I think I'll leave it like this, if you don't mind. I'll add some field comments to explain what each field does instead.
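For context, a minimal sketch of what those field comments might look like (only the two fields shown in the diff are taken from this PR; the package name, imports, and any other fields are assumptions):

package controller

import (
	"github.com/go-kit/kit/log"

	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

// HabitatController reconciles ServiceGroups with their Deployments and ConfigMaps.
// (Other fields, such as the client configuration, are omitted in this sketch.)
type HabitatController struct {
	// queue holds the keys of ServiceGroups that still need to be synced.
	// Event handlers only add keys; workers do the actual syncing.
	queue workqueue.RateLimitingInterface

	// store is the informer-maintained local cache of ServiceGroup objects;
	// reads should go through it instead of hitting the API.
	store cache.Store

	// logger is the structured logger used throughout the controller.
	logger log.Logger
}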
pkg/habitat/controller/controller.go (outdated)

UpdateFunc: hc.onUpdate,
DeleteFunc: hc.onDelete,
AddFunc: hc.enqueueSG,
UpdateFunc: func(old, new interface{}) {
I would suggest we move this to a separate func?
pkg/habitat/controller/controller.go (outdated)

DeleteFunc: hc.onDelete,
AddFunc: hc.enqueueSG,
UpdateFunc: func(old, new interface{}) {
oldSG, ok1 := old.(*crv1.ServiceGroup)
If we separate this into a func, we can then also return early here and log the error. So:

if !ok1 {
...
}
We can do that even without extracting to a named function though. I quite like this small anonymous function (and I stole it from here ;))
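For illustration, a minimal sketch of keeping the anonymous function while still returning early on a failed type assertion (the ResourceVersion comparison and the log messages are assumptions, not taken from this PR):

UpdateFunc: func(oldObj, newObj interface{}) {
	oldSG, ok := oldObj.(*crv1.ServiceGroup)
	if !ok {
		level.Error(hc.logger).Log("msg", "unexpected object type", "obj", oldObj)
		return
	}

	newSG, ok := newObj.(*crv1.ServiceGroup)
	if !ok {
		level.Error(hc.logger).Log("msg", "unexpected object type", "obj", newObj)
		return
	}

	// Only enqueue when the object actually changed.
	if oldSG.ResourceVersion != newSG.ResourceVersion {
		hc.enqueueSG(newSG)
	}
},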
pkg/habitat/controller/controller.go (outdated)

return true
}

func (hc *HabitatController) syncServiceGroup(key string) error {
Maybe we can just call this sync? As you mention in the comment, we react to pod events as well.
This follows the example of other controllers, and I guess the idea is that this function syncs the resource this controller cares about, including all dependencies.
pkg/habitat/controller/controller.go (outdated)

}

func (hc *HabitatController) syncServiceGroup(key string) error {
// we got woken up, so either:
We can shorten this comment to something like (feel free to adjust the wording):

// We react to every watched resource event.
That slipped in, wasn't planning on having it committed :)
But I guess if it's reworded it can be useful, as you say.
return err
}
if !exists {
// The SG was deleted.
Wouldn't this be true if a Pod was deleted as well?
Ignore the comment, the store naming confused me.
pkg/habitat/controller/controller.go (outdated)

return hc.handleServiceGroupDeletion(key)
}

// it was either created or updated
Maybe something like:
// Resource was either created or updated.
pkg/habitat/controller/controller.go (outdated)

level.Error(hc.logger).Log("msg", err)
return
// Create Deployment, if it doesn't already exist.
d, dErr := hc.config.KubernetesClientset.AppsV1beta1Client.Deployments(apiv1.NamespaceDefault).Create(deployment)
We should introduce watching for Deployments as well. After that we can just do a Get here against our cache/store instead of a Create against the API. This way we save calls to the API and use our cache better.
Right now, from what I can see, a Create call is made every time we update/create our Pod or SG.
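A rough sketch of the suggested flow, assuming a Deployment informer is added; hc.deploymentStore and the appsv1beta1 import path are hypothetical here, and d would be used further down exactly as in the current code:

// Check the local Deployment cache first; only create via the API when missing.
obj, exists, err := hc.deploymentStore.GetByKey(sg.Namespace + "/" + deployment.Name)
if err != nil {
	return err
}

var d *appsv1beta1.Deployment
if exists {
	d = obj.(*appsv1beta1.Deployment)
} else {
	d, err = hc.config.KubernetesClientset.AppsV1beta1Client.Deployments(sg.Namespace).Create(deployment)
	if err != nil {
		return err
	}
}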
pkg/habitat/controller/controller.go (outdated)

FieldSelector: fs.String(),
}

// TODO We can use the store for this query.
I would suggest we do this now. From what I can see we already have access to the pod store/cache, we just need to save it? Or is there anything else stopping us?
I was thinking this and other things (like increasing the number of workers or using watchers for other types of resources) can be done as additional PRs later on, as there is already enough going on in this PR.
I see those changes as basically performance optimizations, and I think we can split them up from this PR, which is about core functionality.
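For reference, a sketch of what querying the local cache instead of the API might look like; hc.podStore is hypothetical, and the filter only mirrors the running-phase field selector used in the snippet above:

// List Pods from the local cache and filter in memory, instead of a
// field-selector List call against the API.
var runningPods []*apiv1.Pod
for _, obj := range hc.podStore.List() {
	pod, ok := obj.(*apiv1.Pod)
	if !ok {
		continue
	}
	if pod.Status.Phase == apiv1.PodRunning {
		runningPods = append(runningPods, pod)
	}
}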
I am fine with that (although I would prefer we do it in this PR), but then we shouldn't close the issue it relates to, but rather add a list of AC (acceptance criteria) to the issue. Otherwise we might forget about it.
Something like:
- Introduce WorkQueue
- Add Deployment watching
- Store Pod cache
etc.
This way we can just pick up those tasks and keep track of them.
If you mean #51, this PR does close that issue AFAICT. The features you mentioned are additional nice-to-haves IMO.
Or did you mean another issue?
I'll add separate issues for the items you mentioned above.
pkg/habitat/controller/controller.go (outdated)

return
}

new, ok2 := newObj.(*apiv1.Pod)
new may not be a reserved word, but it is a Go builtin, and I would still like to avoid using it. WDYT?
Didn't know about this.
I tested this branch and it seems like writing the IP is not working the same as on master. When I create two SGs one after another, the ConfigMaps contain the same IP.
pkg/habitat/controller/controller.go (outdated)

oldSG, ok1 := oldObj.(*crv1.ServiceGroup)
newSG, ok2 := newObj.(*crv1.ServiceGroup)

if !ok1 {
Maybe move it before newSG, ok2? That way we can return early and not create newSG.
Force-pushed from e603102 to 6fc4f71.
Good catch! PTALA.
Force-pushed from 8a437f0 to 84b08ac.
Tested again and it didn't seem to fix the problem I was having earlier. When creating two SGs, example1 and example2, both CMs contain the IP from the same SG (in my case example1).
Ignore me, I forgot to pull in the latest changes. :)
pkg/habitat/controller/controller.go (outdated)

newCM := newConfigMap(sg.Name, deploymentUID, leaderIP)

var cm *apiv1.ConfigMap
Why is this declaration needed?
pkg/habitat/controller/controller.go (outdated)

}))

running := metav1.ListOptions{
FieldSelector: fs.String(),
I would just create fs and ls inline, as it's more readable.
I find this style more readable, and unless there are guidelines going against it, I'd like to keep it.
It's also how we do it elsewhere.
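For comparison, the two styles under discussion, using a hypothetical running-phase selector (the actual selector contents in the PR may differ; the two assignments are alternatives, not meant to coexist):

// Separate declaration (current style):
fs := fields.SelectorFromSet(fields.Set{"status.phase": string(apiv1.PodRunning)})

running := metav1.ListOptions{
	FieldSelector: fs.String(),
}

// Inlined (suggested style):
running := metav1.ListOptions{
	FieldSelector: fields.SelectorFromSet(fields.Set{"status.phase": string(apiv1.PodRunning)}).String(),
}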
pkg/habitat/controller/controller.go (outdated)

return
// Create Deployment, if it doesn't already exist.
d, dErr := hc.config.KubernetesClientset.AppsV1beta1Client.Deployments(apiv1.NamespaceDefault).Create(deployment)
if dErr != nil {
Why not the standard naming of error?
pkg/habitat/controller/controller.go (outdated)

return dErr
}

d, dErr = hc.config.KubernetesClientset.AppsV1beta1Client.Deployments(sg.Namespace).Get(deployment.Name, metav1.GetOptions{})
I am confused why we need to Get, as our Create already returns a Deployment.
Because if we're here, the Create has failed, and the Deployment already exists.
Hmm, I would suggest we restructure this, as right now at first glance it's not really clear what's happening.
I would actually reverse this and start by first checking whether our Deployment already exists, rather than trying to create it. If it doesn't exist, then create it.
This way we save one call, though. If the resource doesn't exist, we just create it, with no need to perform the check first.
I've seen this pattern in other controllers as well, so I'm more inclined to leave this as is.
The performance argument will probably become irrelevant if/when we move to using a Deployments cache, so I think we can refactor this at that point.
But for now it seems like a good enough reason to me. I can add a comment to make it clearer, though.
WDYT?
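A sketch of the Create-first flow with a clarifying comment, roughly as described above; the apierrors.IsAlreadyExists check is an assumption about how the two cases are distinguished:

// Try to create the Deployment first; this saves a Get call in the common case
// where it does not exist yet.
d, err := hc.config.KubernetesClientset.AppsV1beta1Client.Deployments(sg.Namespace).Create(deployment)
if apierrors.IsAlreadyExists(err) {
	// The Deployment already exists, so fetch the existing object instead.
	d, err = hc.config.KubernetesClientset.AppsV1beta1Client.Deployments(sg.Namespace).Get(deployment.Name, metav1.GetOptions{})
}
if err != nil {
	return err
}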
Are there any more blockers? If not, can this get a green light?
LGTM. We just shouldn't forget about the follow-up issues to this PR. Other than that, let's just clean up the commits and then 👍
Force-pushed from f9915f9 to 9082f29.
This PR adds a ServiceGroup workqueue.
Following the guidelines, all the event handlers (SG and Pod for now) do is retrieve the appropriate SG key and add a job to the queue.
The worker then consumes these jobs, and performs the necessary operations.
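For readers unfamiliar with the pattern, a minimal sketch of the enqueue/worker split described above (the function names follow the diff; the exact bodies are assumptions):

// enqueueSG derives the ServiceGroup's cache key and adds it to the workqueue.
func (hc *HabitatController) enqueueSG(obj interface{}) {
	key, err := cache.MetaNamespaceKeyFunc(obj)
	if err != nil {
		level.Error(hc.logger).Log("msg", err)
		return
	}
	hc.queue.Add(key)
}

// worker keeps processing items until the queue is shut down.
func (hc *HabitatController) worker() {
	for hc.processNextItem() {
	}
}

func (hc *HabitatController) processNextItem() bool {
	key, quit := hc.queue.Get()
	if quit {
		return false
	}
	defer hc.queue.Done(key)

	if err := hc.syncServiceGroup(key.(string)); err != nil {
		// Requeue with rate limiting so transient errors are retried.
		hc.queue.AddRateLimited(key)
		return true
	}

	hc.queue.Forget(key)
	return true
}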
Closes #26, #51 and obsoletes #56.