-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
servicecontroller requeues bad changes at the end of the queue, ignoring subsequent changes #21952
Comments
cc @thockin because I think you were having a look at this area |
Actually, I'm confused as to what is happening here. I think AddIfNotPresent is supposed to prevent this. It doesn't appear to be doing so in practice. |
I think if you have a reliable repro you should send a surgical fix for 1.2, and post 1.2 we should just rewrite it per #21625 (comment) Without having dug through the code yet, I'm going to wave my hands a bit and say it should be requeuing a key, not the actual object. The key that's requeued should be used to lookup the object, which should always mirror the object in etcd. |
Yes I agree 100%. I was thinking about this and think it might actually be good news:
|
I don't mean a defer... I mean a goroutine which sleeps before re-enqueuing. A deferred re-enqueue. |
Should be resolved by #22069 |
I found a concrete problem with the servicecontroller (more concrete than some of my locking complaints - I saw this one in the real world). When we hit an error applying an update, we requeue the changes at the end of the queue: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/service/servicecontroller.go#L191
So suppose the user makes a "bad" change to a service (e.g. support we try to enable UDP on an AWS LoadBalancer). The service controller loop will observe that change, try to make the change, fail and keep re-enqueuing it for retry indefinitely. Not great, but OK.
Now suppose the user notices the bad change, and makes a corrective "good" change. The good change will join the queue and be processed. But the bad change will still be being requeued, and so will keep being retried. We are now trying to apply an older version of the change than the one we have already successfully applied.
In my particular case, I had a path dependency bug, and so I did Good1, Bad, Good2. Bad could not be applied to Good1, but could be applied to Good2 (a bug, but that's an issue in my code). So the sequence I got was Good1 -> Good2 -> Bad. i.e. I ended up on Bad, not Good2.
The text was updated successfully, but these errors were encountered: