This repository has been archived by the owner on Dec 5, 2017. It is now read-only.

investigate: node registrator doesn't play well with slaves that die and come back #778

Closed
jdef opened this issue Feb 10, 2016 · 7 comments

jdef commented Feb 10, 2016

reported here: #768 (comment)

/cc @ravilr

jdef commented Feb 10, 2016

@ravilr do you happen to know the conditions under which the slave came back? For example, did it come back up with a different slaveID or the same slaveID as before?

ravilr commented Feb 12, 2016

The previously reported scenario involved the mesos slave registering back with a different slaveID:

I0129 02:30:33.423883 2985 slave.cpp:859] Registered with master master@1.1.1.1:5050; given slave ID 20160129-022011-3340029194-5050-31106-S5
.....
I0205 02:10:24.347128 21347 slave.cpp:606] Slave asked to shut down by master@1.1.1.1:5050 because 'Slave attempted to re-register after removal'
.....
I0205 02:10:45.308372 21408 slave.cpp:859] Registered with master master@1.1.1.1:5050; given slave ID 20160201-215024-3340029194-5050-15361-S0

ravilr commented Mar 3, 2016

some more details:

I0302 22:11:20.818402 1 nodecontroller.go:450] Deleting node (no longer present in cloud provider): xyz.com

https://github.com/mesosphere/kubernetes/blob/v0.7.2-v1.1.5/pkg/controller/node/nodecontroller.go#L449
This seems to happen when the mesos-slave agent on the node is brought down: the k8s nodecontroller deletes the node from the API because the listSlaves call in the mesos cloud provider no longer returns the slave host.
Once the slave agent is started back up, the scheduler fails to add the node back, because offers from that host don't pass the Compat check, which requires the node to already be registered in the k8s API:
https://github.com/mesosphere/kubernetes/blob/v0.7.2-v1.1.5/contrib/mesos/pkg/scheduler/components/framework/framework.go#L122
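A minimal sketch of that deletion path (not the linked nodecontroller code; the function and variable names here are hypothetical stand-ins):

```go
package main

import "fmt"

// listSlavesFromCloudProvider stands in for the mesos cloud provider's
// slave listing. While the mesos-slave agent on a host is down, that
// host is absent from the result.
func listSlavesFromCloudProvider() map[string]bool {
	return map[string]bool{
		"slave-a.example.com": true,
		// "xyz.com" is missing: its mesos-slave agent is stopped.
	}
}

func main() {
	// Nodes currently registered in the k8s API.
	apiNodes := []string{"slave-a.example.com", "xyz.com"}

	present := listSlavesFromCloudProvider()
	for _, node := range apiNodes {
		if !present[node] {
			// Mirrors the log line quoted above:
			// "Deleting node (no longer present in cloud provider)".
			fmt.Printf("Deleting node (no longer present in cloud provider): %s\n", node)
		}
	}
}
```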

jdef commented Mar 3, 2016

[EDIT] Thanks for investigating this further. What should happen in this situation is that when the slave comes back up, the mesos cloud provider should see it and begin to report it in the list of nodes that it knows about. Presumably k8s would see this change and create an api.Node in the apiserver. Are we sure that this isn't happening at all, or is it just a matter of timing (as in, if we wait long enough, the right thing happens)?
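
To make that expected path concrete, a hypothetical sketch (none of these names are from the real code) of the reconcile direction described above, where a slave reappearing in the cloud provider's list leads to a fresh api.Node:

```go
package main

import "fmt"

func main() {
	// The slave has re-registered, so the cloud provider lists it again.
	cloudNodes := map[string]bool{"xyz.com": true}
	// ...but the nodecontroller deleted its api.Node earlier.
	apiNodes := map[string]bool{}

	for host := range cloudNodes {
		if !apiNodes[host] {
			// Expected behavior: k8s notices a node known to the cloud
			// provider but missing from the API and re-creates it.
			fmt.Printf("creating api.Node for %s\n", host)
			apiNodes[host] = true
		}
	}
}
```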

ravilr commented Mar 3, 2016

Yes, the mesos slave node that comes back up with a different slaveID never seems to be re-registered as a k8s api.Node object unless the k8sm scheduler is restarted.
My understanding is that the k8sm scheduler registers slave nodes in the k8s API registry, with slave attributes converted to k8s labels, based on the offers it sees from the mesos master. For some reason this doesn't seem to happen after the slave re-registers with the mesos master (a rough sketch of the gating follows the repro steps below). I do see the re-added slave offering its resources, and the mesos master recovering all resources from the k8sm framework:
I0303 20:22:42.809595 1765 hierarchical.hpp:814] Recovered ports():[31000-32000]; cpus():24; mem():62791; disk():208307 (total: ports():[31000-32000]; cpus():24; mem():62791; disk():208307, allocated: ) on slave 20160212-004300-3609200202-5050-1762-S5 from framework 20160211-225703-3609200202-5050-14672-0000

But once the scheduler is restarted, the node appears in the k8s node registry.

Sequence of steps to repro:

  1. The mesos slave is stopped.
  2. After the node grace period (default 40s), the k8s nodecontroller sees there has been no heartbeat and asks the cloud provider about the node. Since the node is not in the master's slave list, the nodecontroller deletes the node from the k8s node API registry.
  3. The mesos slave is brought up after the slave ping timeout; the master asks the slave, which tries to re-register with the same slave ID, to shut down. The slave is brought up again by the systemd manager and registers with a new slaveID.
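
As flagged above, a rough sketch of the gating; the Offer struct, registry map, and compatible function are hypothetical stand-ins, not the real framework.go types, but they illustrate why offers from the re-added slave are declined once its node has been deleted from the API:

```go
package main

import "fmt"

// Offer mimics the relevant parts of a mesos offer: the host it comes from
// and the slave attributes the scheduler would convert into node labels.
type Offer struct {
	Hostname   string
	Attributes map[string]string
}

// registry stands in for the k8s node API registry.
var registry = map[string]bool{}

// compatible mirrors the described Compat check: it only accepts offers
// from hosts already registered as nodes, which is exactly what fails
// after the nodecontroller has deleted the node.
func compatible(o Offer) bool {
	return registry[o.Hostname]
}

func main() {
	offer := Offer{Hostname: "xyz.com", Attributes: map[string]string{"rack": "r1"}}
	if !compatible(offer) {
		// The node was deleted in step 2 of the repro, so every offer from
		// the re-registered slave is declined and the node is never re-added.
		fmt.Printf("declining offer from %s: node not registered in k8s api\n", offer.Hostname)
		return
	}
	fmt.Printf("registering node %s with labels %v\n", offer.Hostname, offer.Attributes)
}
```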

jdef commented Mar 4, 2016

I found the problem: a bug in the queue/ package. Will push a fix shortly.
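
The thread doesn't spell out the exact defect (see the xref below for the fix), so purely as an illustration of this class of bug, here is a hypothetical keyed queue whose delete leaves a tombstone that silently swallows later re-adds; nothing in it is taken from the real queue/ package, but the symptom matches what was observed (the node only reappears after a scheduler restart):

```go
package main

import "fmt"

type queue struct {
	deleted map[string]bool   // tombstones for removed keys
	items   map[string]string // live entries keyed by hostname
}

func newQueue() *queue {
	return &queue{deleted: map[string]bool{}, items: map[string]string{}}
}

func (q *queue) Delete(key string) {
	delete(q.items, key)
	q.deleted[key] = true // bug: the tombstone is never cleared
}

func (q *queue) Add(key, val string) {
	if q.deleted[key] {
		// Bug: a key that was ever deleted is silently ignored, so a host
		// that comes back (even with a new slaveID) never re-enters the
		// queue until the process restarts and the tombstones are gone.
		return
	}
	q.items[key] = val
}

func main() {
	q := newQueue()
	q.Add("xyz.com", "slave S5")
	q.Delete("xyz.com")          // node removed after the agent went down
	q.Add("xyz.com", "slave S0") // agent back with a new slaveID
	fmt.Println(len(q.items))    // 0 -- the re-add was dropped
}
```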

jdef commented Mar 4, 2016

xref kubernetes/kubernetes#22500


@jdef jdef added this to the v0.7.3 milestone Mar 4, 2016
@jdef jdef closed this as completed Mar 4, 2016
@jdef jdef removed the LGTM label Mar 4, 2016