Controller manager restart default initialization failure. #10030
Comments
bprashanth (Member) commented Jun 18, 2015
I get the feeling this can happen if the rc manager runs its first loop after the restart before the watch has populated the store (but it will correct itself, like you noted). We've been meaning to audit controller framework use cases for things like restarts (#9026), but if we really need a fix we'd need to communicate to the rc manager that the watch goroutines have run.
@lavalamp
bprashanth added the team/master label Jun 18, 2015
@bprashanth or gate the run loops on a watch sync.
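As a rough illustration of that suggestion, here is a minimal sketch of gating a controller's work loop on the watch having synced, written against today's client-go shared informers rather than the 2015 controller framework; the in-cluster config and the runWorkers hook are assumptions for the sketch.

```go
package main

import (
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	// In-cluster config is assumed here purely for illustration.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	stopCh := make(chan struct{})
	defer close(stopCh)

	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	podInformer := factory.Core().V1().Pods().Informer()
	factory.Start(stopCh)

	// Gate the controller's work loop on the watch cache having completed an
	// initial list, so the first sync doesn't act on an empty store.
	if !cache.WaitForCacheSync(stopCh, podInformer.HasSynced) {
		panic("timed out waiting for pod cache to sync")
	}

	// runWorkers(stopCh) // hypothetical: only start syncing RCs after this point
}
```

Only once the initial list has landed in the store does the manager start computing creation/deletion diffs; #10147, referenced later in this thread, appears to take this general approach for the rc manager.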
satnam6502 added the priority/backlog label Jun 18, 2015
satnam6502 added this to the v1.0-candidate milestone Jun 18, 2015
lavalamp (Member) commented Jun 19, 2015
The problem is that the RC manager holds state that prevents the overshoot. That state isn't shared, so the standby doesn't know how many creations/deletions are queued and is likely to double-create/delete.
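A hypothetical sketch of the kind of unshared, in-memory state being described (the names are invented for illustration, not the actual controller code): the active manager counts creates it has issued but not yet seen come back on the watch, and skips acting on an RC until that count drains.

```go
package main

import (
	"fmt"
	"sync"
)

// expectations sketches per-RC bookkeeping: creations the manager has issued
// but not yet observed via the watch. It lives only in this process's memory.
type expectations struct {
	mu      sync.Mutex
	pending map[string]int // rcKey -> creations issued but not yet observed
}

func newExpectations() *expectations {
	return &expectations{pending: map[string]int{}}
}

// expectCreations records that n create requests were just sent for rcKey.
func (e *expectations) expectCreations(rcKey string, n int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.pending[rcKey] += n
}

// creationObserved is called from the watch handler when a new pod for rcKey shows up.
func (e *expectations) creationObserved(rcKey string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.pending[rcKey] > 0 {
		e.pending[rcKey]--
	}
}

// satisfied reports whether it is safe to recompute the diff for rcKey;
// while false, the sync loop should not create or delete pods for this RC.
func (e *expectations) satisfied(rcKey string) bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.pending[rcKey] == 0
}

func main() {
	e := newExpectations()
	e.expectCreations("default/frontend", 3)     // manager issued 3 POSTs
	e.creationObserved("default/frontend")       // watch saw one of them
	fmt.Println(e.satisfied("default/frontend")) // false: 2 still in flight
}
```

Because this map exists only in the active manager's memory, a restarted or standby manager starts with it empty and satisfied() returns true immediately, even though requests from the previous process may still be in flight, which is the gap described above.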
lavalamp (Member) commented Jun 19, 2015
It should only be a problem if a fail-over is triggered while a large operation is in progress.
Hm, but if as @bprashanth says, it doesn't gate actions on the store being filled, that's a different and more serious problem.
bprashanth (Member) commented Jun 19, 2015
Yeah, we should either prepopulate that store or communicate that it has finished a first sync; otherwise a restart of the binary could cause overshooting and correction.
lavalamp (Member) commented Jun 19, 2015
If @bprashanth is right, then restarting the rc manager with big but stable RCs in the system should be a big problem.
We've added a method that will tell you if it's had at least one full sync, btw.
davidopp added priority/important-soon and removed priority/backlog labels Jun 20, 2015
davidopp modified the milestones: v1.0, v1.0-candidate Jun 20, 2015
davidopp assigned kprobst and bprashanth and unassigned kprobst Jun 20, 2015
bprashanth (Member) commented Jun 20, 2015
Didn't see that method go in, but now that we have it, #10147 fixes the issue I mentioned.
davidopp referenced this issue Jun 20, 2015
Merged: Prevent restarts of kube-controller from overshooting and correcting replicas #10147
bprashanth (Member) commented Jun 22, 2015
Replying to @davidopp's comment on the PR, since the comment will be lost once the PR is closed:
"Does this fully fix #10030?"
There's still the case of the standby not knowing how many requests are in flight if the primary goes down during a resize (e.g. manager 1 creates 1000 replicas, it dies, manager 2 takes over and starts up a watch, sees 900 replicas because 100 are still in flight, and decides to create 100 more). It will end up overshooting and correcting a maximum of 500 replicas (more like 20, since that's the qps on the connection).
Do we want to consider a more complete fix for 1.0?
timothysc (Member) commented Jun 22, 2015
@bprashanth, why can't we gather the details of the pending items as well? It appears to only see (x) running, so why not gather and += the pending count too?
bprashanth (Member) commented Jun 22, 2015
Not sure I understand what you mean. In the case I'm talking about, the requests are in flight (sent by the old manager, not yet written to etcd).
lavalamp (Member) commented Jun 22, 2015
@timothysc It does see pending pods; it just won't see pods that have been created but not yet propagated. There should really not be very many pods in this category, and for this reason I think it is probably not a problem at the moment. #10147 therefore at least knocks this issue down to p3 and post-1.0.
davidopp (Member) commented Jun 24, 2015
Moving to 1.0-post now that #10147 has been merged. Thanks for the report, @timothysc, and @bprashanth and @lavalamp for working on this.
davidopp modified the milestones: v1.0-post, v1.0 Jun 24, 2015
jayunit100 (Member) commented Jul 14, 2015
This bug, I assume, only surfaces if you restart kube-controller-manager at the end of the test; otherwise I guess the overshoot might be corrected before it occurs. Correct? I am working on a local reproducer and trying to make it robust.
timothysc (Member) commented Jul 14, 2015
Restart in the middle of density... density has a check on the total (pending + running); the total count far exceeds what's expected, and density will fail.
lavalamp (Member) commented Jul 14, 2015
Yeah, restart in the middle to test. But this should be 98% fixed with @bprashanth's change. Is there still a problem?
bprashanth (Member) commented Jul 14, 2015
If you time it right (e.g. while it's creating replicas), you can get it to overshoot; this is known. It should always correct itself, and it should not overshoot if restarted in the steady state.
jayunit100 (Member) commented Jul 14, 2015
Cool! So maybe it's fixed in 98% of the scenarios, as per the bprashanth change?
Is this now virtually impossible to reproduce?
INFO: 2015-07-14 16:56:07.679790472 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 0 running, 30 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:56:17.680058695 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 0 running, 30 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:56:27.680314095 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 0 running, 30 pending, 0 waiting, 0 inactive, 0 unknown
KILL KCM,
INFO: 2015-07-14 16:56:37.680653983 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 5 running, 25 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:56:47.680985771 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 5 running, 25 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:56:57.681283125 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 5 running, 25 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:57:07.681461051 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 5 running, 25 pending, 0 waiting, 0 inactive, 0 unknown
RESTART KCM
INFO: 2015-07-14 16:57:17.681642482 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 9 running, 21 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:57:27.681929083 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 10 running, 20 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:57:37.682268394 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 13 running, 17 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:57:47.682679824 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 13 running, 17 pending, 0 waiting, 0 inactive, 0 unknown
KILL KCM
INFO: 2015-07-14 16:57:57.683031949 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 21 running, 9 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:58:07.683384456 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 21 running, 9 pending, 0 waiting, 0 inactive, 0 unknown
RESTART KCM
INFO: 2015-07-14 16:58:17.68379428 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 26 running, 4 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:58:27.684272592 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 26 running, 4 pending, 0 waiting, 0 inactive, 0 unknown
INFO: 2015-07-14 16:58:37.684671371 -0400 EDT density30-b3c38835-2a6a-11e5-91f7-3c970e4b8bbd Pods: 30 out of 30 created, 30 running, 0 pending, 0 waiting, 0 inactive, 0 unknown
INFO: E2E startup time for 30 pods: 2m50.010079055s
I tried killing/restarting several times, couldn't get an overshoot.
Let's try on our beasty cluster in the a.m.
yes
Yes.
jayunit100 referenced this issue Jul 16, 2015
Closed: Tolerate overshooting during density tests #11378
bprashanth (Member) commented Jul 16, 2015
@jayunit100 to be clearer about the cases where you should expect overshoot (see the sketch after this list):
- With a single rc manager: you restart the manager right after it sends a bunch of POST requests for pod creates, but before the POSTs can complete. A realistic scenario: if the apiserver is loaded and these POSTs take a second each to reach etcd, and the manager restarts within that second, there could be overshoot.
- With 2 rc managers in HA: you kill the active one after the POSTs have been submitted and/or completed but before the passive rc manager's watch has observed them. The passive doesn't know to wait on the watch; it has inconsistent local state and will overshoot.
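For the HA case above, a sketch of how a standby takes over, using today's client-go leader election API rather than the 2015 HA mechanism. The lease name and the waitForCacheSyncAndRun hook are hypothetical; the point is that even a standby that waits for its own watch cache to sync cannot see POSTs still in flight from the old leader.

```go
package main

import (
	"context"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	id, _ := os.Hostname()

	// Hypothetical lease name/namespace, purely for illustration.
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "rc-manager-lock", Namespace: "kube-system"},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: id},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// The newly active manager should wait for its own watch cache
				// to finish an initial sync before creating/deleting pods.
				// Even then, POSTs still in flight from the previous leader are
				// invisible here, which is the residual overshoot window
				// described in the list above.
				// waitForCacheSyncAndRun(ctx, client) // hypothetical hook
			},
			OnStoppedLeading: func() {
				// A real manager would stop its workers here.
			},
		},
	})
}
```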
bgrant0607 removed this from the v1.0-post milestone Jul 24, 2015
erictune added team/control-plane and removed team/master labels Aug 19, 2015
davidopp (Member) commented Dec 15, 2015
@bprashanth I assume this issue is still open because we don't have fixes for the two issues in your last comment? How hard would it be to fix?
timothysc (Member) commented Dec 16, 2015
Last I remember they still existed, but they were lower priority b/c of the caps we had in place. If we ever have high throughput it could be a real concern.
timothysc (Member) commented Jun 17, 2016
I'm going to close this one but we should be mindful of the legacy in the reflector re-design.
/cc @hongchaodeng
timothysc closed this Jun 17, 2016
hongchaodeng (Member) commented Jun 17, 2016
Is there a list of legacy issues? Are they all covered by tests? If not, could we add testing for them?
sureshpalemoni commented Oct 3, 2017
Oct 03 19:08:40 openstack-master.novalocal systemd[1]: Failed to start Atomic OpenShift Master API.
Oct 03 19:08:40 openstack-master.novalocal systemd[1]: Unit origin-master-api.service entered failed state.
Oct 03 19:08:40 openstack-master.novalocal systemd[1]: origin-master-api.service failed.
Any solution for this?
timothysc commented Jun 18, 2015 (original report)
In validating HA fail-over on density, we verified that the controller manager appears to act in an uninitialized fashion, e.g. it doesn't gather complete state before acting. It over-corrects, which causes a fair amount of unnecessary churn on the system.
It eventually recovers and converges, but it fails density tests by overshooting.
Reproduction:
This can also be seen by watching the pending count spike on a systemctl restart kube-controller-manager in the middle of a ramp-up on a wide replication controller.
/cc @brendandburns, @rrati, @davidopp, @jayunit100, @eparis, @bprashanth