continuously delete pods on nodes that don't exist #22667

Merged
merged 1 commit into from
Mar 8, 2016
37 changes: 25 additions & 12 deletions pkg/controller/node/nodecontroller.go
@@ -37,6 +37,7 @@ import (
"k8s.io/kubernetes/pkg/controller/framework"
"k8s.io/kubernetes/pkg/fields"
"k8s.io/kubernetes/pkg/kubelet/util/format"
"k8s.io/kubernetes/pkg/labels"
"k8s.io/kubernetes/pkg/runtime"
"k8s.io/kubernetes/pkg/types"
"k8s.io/kubernetes/pkg/util"
@@ -284,6 +285,8 @@ func (nc *NodeController) Run(period time.Duration) {
return false, remaining
})
}, nodeEvictionPeriod, wait.NeverStop)

go wait.Until(nc.cleanupOrphanedPods, 30*time.Second, wait.NeverStop)
}
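
For readers unfamiliar with the pattern, the `go wait.Until(...)` line added above starts a background loop that calls `nc.cleanupOrphanedPods` every 30 seconds, and `wait.NeverStop` means it is never told to stop. The sketch below approximates that behavior using only the standard library. `until` is a hypothetical helper written for illustration; it is not the actual `wait.Until` implementation, and details such as panic handling are left out.

```go
package main

import (
	"fmt"
	"time"
)

// until is a hypothetical approximation of a periodic-run helper: it calls f,
// waits for period, and repeats until stop is closed.
func until(f func(), period time.Duration, stop <-chan struct{}) {
	for {
		// Check for a stop request before each run.
		select {
		case <-stop:
			return
		default:
		}

		f()

		// Wait for the next tick, or exit early if stopped.
		select {
		case <-stop:
			return
		case <-time.After(period):
		}
	}
}

func main() {
	stop := make(chan struct{})
	runs := 0

	// Stop after three runs so the example terminates; the controller in the
	// PR passes a channel that is never closed.
	until(func() {
		runs++
		if runs == 3 {
			close(stop)
		}
	}, 10*time.Millisecond, stop)

	fmt.Println("ran", runs, "times")
}
```

In the PR the call is wrapped in `go`, so the loop runs in its own goroutine and `Run` returns immediately.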

// Generates num pod CIDRs that could be assigned to nodes.
@@ -368,6 +371,28 @@ func (nc *NodeController) maybeDeleteTerminatingPod(obj interface{}) {
}
}

// cleanupOrphanedPods deletes pods that are bound to nodes that don't
// exist.
func (nc *NodeController) cleanupOrphanedPods() {
	pods, err := nc.podStore.List(labels.Everything())
	if err != nil {
		utilruntime.HandleError(err)
		return
	}

	for _, pod := range pods {
		if pod.Spec.NodeName == "" {
			continue
		}
		if _, exists, _ := nc.nodeStore.Store.GetByKey(pod.Spec.NodeName); exists {
			continue
		}
		if err := nc.forcefullyDeletePod(pod); err != nil {
Contributor

You will get here if GetByKey returns an error. Also, is it possible for the cache to be stale?
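
To make the first concern concrete: the loop discards the third return value of `GetByKey`, so a failed lookup would read as "node missing" and fall through to the delete. A minimal sketch of handling the three values separately follows. `nodeLookup`, `mapStore` and `shouldDelete` are hypothetical names invented for this illustration; only the `(item, exists, err)` return shape is taken from the code under review.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeLookup is a hypothetical stand-in for the node store. It only mirrors
// the (item, exists, err) return shape of GetByKey.
type nodeLookup interface {
	GetByKey(key string) (item interface{}, exists bool, err error)
}

// mapStore is a toy in-memory implementation used for illustration.
type mapStore struct {
	nodes map[string]struct{}
	fail  bool // when true, every lookup returns an error
}

func (s mapStore) GetByKey(key string) (interface{}, bool, error) {
	if s.fail {
		return nil, false, errors.New("lookup failed")
	}
	if _, ok := s.nodes[key]; !ok {
		return nil, false, nil
	}
	return key, true, nil
}

// shouldDelete reports whether a pod bound to nodeName looks orphaned.
// A lookup error is returned to the caller instead of being treated as
// "node does not exist".
func shouldDelete(store nodeLookup, nodeName string) (bool, error) {
	if nodeName == "" {
		return false, nil // unscheduled pod, nothing to check
	}
	_, exists, err := store.GetByKey(nodeName)
	if err != nil {
		return false, err
	}
	return !exists, nil
}

func main() {
	healthy := mapStore{nodes: map[string]struct{}{"foo": {}}}
	broken := mapStore{fail: true}

	fmt.Println(shouldDelete(healthy, "foo"))  // node present
	fmt.Println(shouldDelete(healthy, "gone")) // node absent
	fmt.Println(shouldDelete(broken, "foo"))   // lookup error, no delete
}
```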

Member Author

GetByKey can't actually return an error, even though its signature includes one, which is silly. That should be cleaned up:

https://github.com/kubernetes/kubernetes/blob/master/pkg/client/cache/store.go#L192

I'll do this in a follow-up.

These caches can be stale. There's a race here if (node gets created -> scheduler gets update of node -> new pod gets assigned to the node -> controller gets update of pod) happens before (node gets created -> controller gets update of node). It seems less likely, and of smaller consequence, than the bug this fixes. Any suggestions on how to work around this?

Member Author

Also, the node has to become ready in the scheduler flow but doesn't have to become ready in the controller manager flow, which makes the race even less likely. Still a race, though.
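
The ordering described in this thread can be reproduced with a toy model. The sketch below is not Kubernetes code: `cache` and `orphans` are hypothetical names, and the model only captures one component's cached view receiving a pod update before the matching node update.

```go
package main

import "fmt"

// cache is a toy model of one component's locally cached view of the cluster.
type cache struct {
	nodes map[string]bool
	pods  map[string]string // pod name -> node name
}

func newCache() *cache {
	return &cache{nodes: map[string]bool{}, pods: map[string]string{}}
}

// orphans returns the pods whose node is absent from this cached view.
func (c *cache) orphans() []string {
	var out []string
	for pod, node := range c.pods {
		if node != "" && !c.nodes[node] {
			out = append(out, pod)
		}
	}
	return out
}

func main() {
	controller := newCache()

	// Unlucky ordering: the pod update reaches the controller's cache
	// before the update for the node the pod was scheduled onto.
	controller.pods["web"] = "node-1"
	fmt.Println("before node update:", controller.orphans())

	// Once the node update arrives, the pod no longer looks orphaned.
	controller.nodes["node-1"] = true
	fmt.Println("after node update:", controller.orphans())
}
```

If the cleanup loop happens to run between the two updates, it sees `web` as orphaned even though `node-1` exists.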

Contributor

My intuition is that this race is pretty minor and less significant than the bug this PR fixes, but our intuition is often wrong about this stuff, and on an antagonistically loaded machine, you could imagine being in some kind of steady state forever where the controllers fight each other.

But there's a non-controller behavior that's also new. If I manually specify a host in my single pod (no controller), but make a typo, my pod disappears and never runs. That'll seem weird. It's also possible for the user to submit the pod in the same race window (hard to imagine this really happening) and the same thing happens.

Suggestions for fixing the race... I'll think about it. Concurrency + global state + caching lends itself to this.
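
One possible direction, offered only as a sketch and not something proposed in this PR: require a pod to look orphaned on several consecutive cleanup passes before deleting it, which gives a lagging node update time to arrive. `orphanTracker`, `observe` and the threshold value are hypothetical names and parameters chosen for illustration.

```go
package main

import "fmt"

// orphanTracker counts how many consecutive cleanup passes have seen a pod
// bound to a node that is missing from the cache. Hypothetical helper.
type orphanTracker struct {
	misses    map[string]int
	threshold int
}

func newOrphanTracker(threshold int) *orphanTracker {
	return &orphanTracker{misses: map[string]int{}, threshold: threshold}
}

// observe records one pass for a pod and reports whether the pod has now been
// seen orphaned on at least threshold consecutive passes.
func (t *orphanTracker) observe(pod string, nodeExists bool) bool {
	if nodeExists {
		delete(t.misses, pod) // node is known, so reset the streak
		return false
	}
	t.misses[pod]++
	return t.misses[pod] >= t.threshold
}

func main() {
	tracker := newOrphanTracker(2)

	fmt.Println(tracker.observe("web", false)) // first miss, keep the pod
	fmt.Println(tracker.observe("web", true))  // node update arrived, streak reset
	fmt.Println(tracker.observe("web", false)) // streak starts over
	fmt.Println(tracker.observe("web", false)) // second consecutive miss
}
```

This only narrows the window. It does not help the typo case raised above, where the named node never appears and the pod is still deleted, just later. A real version would also need to drop tracker entries for pods that no longer exist.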

			utilruntime.HandleError(err)
		}
	}
}

func forcefullyDeletePod(c clientset.Interface, pod *api.Pod) error {
var zero int64
err := c.Core().Pods(pod.Namespace).Delete(pod.Name, &api.DeleteOptions{GracePeriodSeconds: &zero})
@@ -759,18 +784,6 @@ func (nc *NodeController) tryUpdateNodeStatus(node *api.Node) (time.Duration, ap
return gracePeriod, lastReadyCondition, readyCondition, err
}

// returns true if the provided node still has pods scheduled to it, or an error if
// the server could not be contacted.
func (nc *NodeController) hasPods(nodeName string) (bool, error) {
selector := fields.OneTermEqualSelector(api.PodHostField, nodeName)
options := api.ListOptions{FieldSelector: selector}
pods, err := nc.kubeClient.Core().Pods(api.NamespaceAll).List(options)
if err != nil {
return false, err
}
return len(pods.Items) > 0, nil
}

// evictPods queues an eviction for the provided node name, and returns false if the node is already
// queued for eviction.
func (nc *NodeController) evictPods(nodeName string) bool {
42 changes: 42 additions & 0 deletions pkg/controller/node/nodecontroller_test.go
@@ -1116,6 +1116,48 @@ func TestCheckPod(t *testing.T) {
}
}

func TestCleanupOrphanedPods(t *testing.T) {
	newPod := func(name, node string) api.Pod {
		return api.Pod{
			ObjectMeta: api.ObjectMeta{
				Name: name,
			},
			Spec: api.PodSpec{
				NodeName: node,
			},
		}
	}
	pods := []api.Pod{
		newPod("a", "foo"),
		newPod("b", "bar"),
		newPod("c", "gone"),
	}
	nc := NewNodeController(nil, nil, 0, nil, nil, 0, 0, 0, nil, false)

	nc.nodeStore.Store.Add(newNode("foo"))
	nc.nodeStore.Store.Add(newNode("bar"))
	for _, pod := range pods {
		p := pod
		nc.podStore.Store.Add(&p)
	}

	var deleteCalls int
	var deletedPodName string
	nc.forcefullyDeletePod = func(p *api.Pod) error {
		deleteCalls++
		deletedPodName = p.ObjectMeta.Name
		return nil
	}
	nc.cleanupOrphanedPods()

	if deleteCalls != 1 {
		t.Fatalf("expected one delete, got: %v", deleteCalls)
	}
	if deletedPodName != "c" {
		t.Fatalf("expected deleted pod name to be 'c', but got: %q", deletedPodName)
	}
}
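
The test above works because `forcefullyDeletePod` is an assignable function field on the controller, so a closure can stand in for the real API call. A self-contained sketch of that pattern follows. `controller`, `pod` and `deletePod` are simplified, hypothetical stand-ins and not the types from `nodecontroller.go`.

```go
package main

import "fmt"

// pod is a simplified stand-in carrying only the fields the loop needs.
type pod struct{ name, node string }

// controller is a hypothetical, stripped-down stand-in for NodeController.
type controller struct {
	nodes map[string]bool
	pods  []pod
	// deletePod is a field rather than a method so tests can swap it out.
	deletePod func(p pod) error
}

// cleanupOrphanedPods deletes pods bound to nodes the controller doesn't know.
func (c *controller) cleanupOrphanedPods() {
	for _, p := range c.pods {
		if p.node == "" || c.nodes[p.node] {
			continue
		}
		if err := c.deletePod(p); err != nil {
			fmt.Println("delete failed:", err)
		}
	}
}

func main() {
	c := &controller{
		nodes: map[string]bool{"foo": true, "bar": true},
		pods:  []pod{{"a", "foo"}, {"b", "bar"}, {"c", "gone"}},
	}

	// Replace the delete function with a recorder, as the test does.
	var deleted []string
	c.deletePod = func(p pod) error {
		deleted = append(deleted, p.name)
		return nil
	}

	c.cleanupOrphanedPods()
	fmt.Println(deleted)
}
```

Injecting the function keeps the test free of any API server or fake client; the trade-off is that the real delete path is not exercised.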

func newNode(name string) *api.Node {
return &api.Node{
ObjectMeta: api.ObjectMeta{Name: name},