Controller manager restart default initialization failure. #10030
Comments
|
I get the feeling this can happen if the rc manager runs its first loop after the restart before the watch has populated the store (but it will correct itself, like you noted). We've been meaning to audit controller framework use cases for things like restarts (#9026), but if we really need a fix we'd need to communicate to the rc manager that the watch goroutines have run. |
|
@bprashanth or gate the run loops on a watch sync. |
|
The problem is that RC manager holds state that prevents the overshoot. That state isn't shared, so the standby doesn't know how many creations/deletions are queued and is likely to double-create/delete. |
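The state referred to above can be pictured as a per-RC expectations counter held only in process memory. A minimal sketch, assuming an expectations-style bookkeeping type (the names `Expectations`, `ExpectCreations`, `CreationObserved`, and `SatisfiedExpectations` are illustrative, not the actual kube-controller-manager API):

```go
package main

import "fmt"

// Expectations tracks creations a controller has issued but not yet
// observed via its watch. It lives only in this process's memory, so a
// standby manager that takes over starts with an empty map and cannot
// know how many requests are still in flight.
type Expectations struct {
	pendingCreates map[string]int
}

func NewExpectations() *Expectations {
	return &Expectations{pendingCreates: map[string]int{}}
}

// ExpectCreations records that n create requests were just sent for rc.
func (e *Expectations) ExpectCreations(rc string, n int) {
	e.pendingCreates[rc] += n
}

// CreationObserved is called when the watch delivers a new pod for rc.
func (e *Expectations) CreationObserved(rc string) {
	if e.pendingCreates[rc] > 0 {
		e.pendingCreates[rc]--
	}
}

// SatisfiedExpectations reports whether it is safe to issue new creates:
// every previously issued request has been observed through the watch.
func (e *Expectations) SatisfiedExpectations(rc string) bool {
	return e.pendingCreates[rc] == 0
}

func main() {
	e := NewExpectations()
	e.ExpectCreations("frontend", 100) // primary sends 100 creates
	e.CreationObserved("frontend")     // watch sees one land
	// 99 creates still in flight, so the primary holds off;
	// a freshly started standby has no such record and would not.
	fmt.Println(e.SatisfiedExpectations("frontend"))
}
```

Because this map dies with the process, a standby sees only what has already reached etcd, which is exactly the double-create/delete hazard described above.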
|
It should only be a problem if a fail-over is triggered while a large operation is in progress. Hm, but if as @bprashanth says, it doesn't gate actions on the store being filled, that's a different and more serious problem. |
|
yeah, we should either prepopulate that store or communicate that it has finished a first sync otherwise a restart of the binary could cause overshooting and correction |
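The "communicate that it has finished a first sync" option can be sketched roughly as follows; this is a simplified model, not the real controller framework API (the `startReflector`/`syncRC` names and the atomic flag are stand-ins for the informer machinery and its HasSynced-style signal):

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// hasSynced is the "finished a first sync" signal: the list/watch
// goroutine flips it after the initial LIST has populated the store.
var hasSynced atomic.Bool

// store stands in for the watch-populated pod count for one RC.
var store atomic.Int64

// startReflector simulates the list/watch goroutine: it populates the
// store with the actual cluster state, then marks the first sync done.
func startReflector(actualPods int64) {
	go func() {
		time.Sleep(50 * time.Millisecond) // simulated initial LIST latency
		store.Store(actualPods)
		hasSynced.Store(true)
	}()
}

// syncRC refuses to act until the first full sync, so after a restart it
// never computes a diff against a half-empty store.
func syncRC(desired int64) int64 {
	for !hasSynced.Load() {
		time.Sleep(10 * time.Millisecond)
	}
	return desired - store.Load() // replicas still to create; 0 in steady state
}

func main() {
	startReflector(1000)      // the cluster really has 1000 pods
	fmt.Println(syncRC(1000)) // 0: gated on the sync, so no overshoot
}
```

Without the gate, `syncRC` would read the store before the LIST lands, see 0 pods, and try to create the full replica count again.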
|
If @bprashanth is right, then restarting the rc manager with big but stable RCs in the system should be a big problem. We've added a method that will tell you if it's had at least one full sync, btw. |
|
Didn't see that method go in, but now that we have it #10147 fixes the issue i mentioned |
|
Replying to @davidopp's comment on the PR, since the comment will be lost once the PR is closed.
There's still the case of the standby not knowing how many requests are in flight if the primary goes down during a resize (e.g. manager 1 creates 1000 replicas, it dies, manager 2 takes over, starts up a watch, sees 900 replicas because 100 are still in flight, and decides to create 100 more). It will end up overshooting and correcting by a maximum of 500 replicas (more like 20, since that's the qps cap on the connection). Do we want to consider a more complete fix for 1.0? |
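The fail-over race above reduces to simple arithmetic: whatever was in flight when the primary died gets created twice, once by the landing requests and once by the standby's "correction". A back-of-the-envelope model (numbers mirror the example; all values are illustrative):

```go
package main

import "fmt"

// overshootAfterFailover models the race: the standby's fresh watch shows
// desired - inFlight pods, so it creates the apparent shortfall; once the
// in-flight requests land, the cluster overshoots by exactly inFlight.
func overshootAfterFailover(desired, inFlight int) int {
	seenByStandby := desired - inFlight        // 900 in the example
	extraCreates := desired - seenByStandby    // standby "fixes" the gap: 100
	afterLanding := seenByStandby + inFlight + extraCreates
	return afterLanding - desired // pods created then deleted during correction
}

func main() {
	fmt.Println(overshootAfterFailover(1000, 100)) // overshoot equals the in-flight count
	fmt.Println(overshootAfterFailover(1000, 0))   // steady state: nothing in flight, no overshoot
}
```

This is why the practical overshoot is bounded by the qps cap: the in-flight window can never hold more requests than the connection admits per second.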
|
@bprashanth, why can't we gather the details of the pending items as well? It appears to only see (x) running; why not also gather and += the pending count? |
|
Not sure I understand what you mean. The case I'm talking about the requests are in flight (sent by the old manager, not yet written to etcd) |
|
@timothysc It does see pending pods; it just won't see pods that have been created but not yet propagated. There should really not be very many pods in this category, and for this reason I think it is probably not a problem at the moment. #10147 therefore at least knocks this issue down to p3 and post-1.0. |
|
Moving to 1.0-post now that #10147 has been merged. Thanks for the report, @timothysc, and @bprashanth and @lavalamp for working on this. |
|
This bug, I assume, only surfaces if you restart kube-controller-manager at the end of the test. |
|
Restart in the middle of density: density checks the total (pending + running) count, the total far exceeds the expected number, and density fails. |
|
Yeah, restart in the middle to test. But this should be 98% fixed with @bprashanth's change. Is there still a problem? |
|
If you time it right (eg while it's creating replicas), you can get it to overshoot, this is known. It should always correct itself and it should not overshoot if restarted in the steady state. |
|
cool! so... maybe it's fixed in 98% of the scenarios, as per @bprashanth's change? I tried killing/restarting several times, couldn't get an overshoot. |
|
Let's try on our beasty cluster in the a.m. |
|
yes |
|
Yes. |
|
@jayunit100 to be more clear on the case where you should expect an overshoot:
|
|
@bprashanth I assume this issue is still open because we don't have fixes for the two issues in your last comment? How hard would it be to fix? |
|
Last I remember they still existed but were lower priority b/c of the caps we had in place. If we ever have a high throughput it could be a real concern. |
|
I'm going to close this one but we should be mindful of the legacy in the reflector re-design. /cc @hongchaodeng |
|
Is there a list of legacy issues? Are they all covered by tests? If not, could we add test coverage for them? |
|
Oct 03 19:08:40 openstack-master.novalocal systemd[1]: Failed to start Atomic OpenShift Master API. Any solution for this? |
In validating HA fail-over on density, we observed that the controller manager acts on uninitialized state, i.e. it does not gather complete cluster state before acting. It over-corrects, which causes a fair amount of unnecessary churn on the system.
It eventually recovers and converges, but the overshoot fails density tests.
Reproduction:
This can also be seen by watching the pending count spike on a `systemctl restart kube-controller-manager` in the middle of a ramp-up on a wide replication controller.
/cc @brendandburns, @rrati, @davidopp, @jayunit100, @eparis, @bprashanth