
Delete node attachments when node is removed #2409

Merged

dperny merged 1 commit into moby:master on Oct 18, 2017

Conversation

dperny
Collaborator

@dperny dperny commented Oct 12, 2017

When a node is removed, delete all of its attachment tasks, so that any networks being used by those tasks can be successfully removed.

Provides a workaround to the state where a node with attachments is somehow removed from the cluster while attached to a network, preventing the network from being removed. Does not fix many other related bugs.

Signed-off-by: Drew Erny drew.erny@docker.com
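The cleanup the PR describes — on node removal, drop the node's attachment tasks so the allocator can free their networks — can be sketched roughly as follows. The types and the standalone `removeNodeAttachments` helper here are simplified stand-ins for illustration, not the real swarmkit `api` package or store transaction:

```go
package main

import "fmt"

// Simplified stand-ins for swarmkit's api.Task / api.TaskSpec.
type Attachment struct{ NetworkID string }

type TaskSpec struct{ Attachment *Attachment } // nil for non-attachment runtimes

type Task struct {
	ID     string
	NodeID string
	Spec   TaskSpec
}

// removeNodeAttachments returns the task list with the removed node's
// attachment tasks deleted; regular tasks and other nodes are untouched.
func removeNodeAttachments(tasks []Task, nodeID string) []Task {
	var kept []Task
	for _, task := range tasks {
		// Delete only attachment tasks belonging to the removed node;
		// the allocator then frees their network resources.
		if task.NodeID == nodeID && task.Spec.Attachment != nil {
			continue
		}
		kept = append(kept, task)
	}
	return kept
}

func main() {
	tasks := []Task{
		{ID: "t1", NodeID: "node-1", Spec: TaskSpec{Attachment: &Attachment{NetworkID: "net-1"}}},
		{ID: "t2", NodeID: "node-1"}, // regular task, kept
		{ID: "t3", NodeID: "node-2", Spec: TaskSpec{Attachment: &Attachment{NetworkID: "net-1"}}},
	}
	for _, t := range removeNodeAttachments(tasks, "node-1") {
		fmt.Println(t.ID)
	}
}
```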

for _, task := range tasks {
// if the task is an attachment, then we just delete it. the
// allocator will do the heavy lifting.
// basically, GetAttachment will return the an attachment if that's
Contributor

remove an/the

@@ -313,6 +333,8 @@ func (s *Server) RemoveNode(ctx context.Context, request *api.RemoveNodeRequest)
return err
}

removeNodeAttachments(t, request.NodeID)
Contributor

We should check error and bubble it up. Probably fail the node remove.

@marcusmartins

How are the changes tested?

@dperny dperny force-pushed the workaround-attachments branch from 72ba4d7 to 8b9ef83 Compare October 13, 2017 16:55
@dperny
Collaborator Author

dperny commented Oct 13, 2017

They're not yet. I need to write a unit test for the removeNodeAttachments function. That's why I broke it out. It's just a pain because it touches the store so much.

@dperny dperny force-pushed the workaround-attachments branch from 8b9ef83 to a806a0a Compare October 13, 2017 18:38
@dperny
Collaborator Author

dperny commented Oct 13, 2017

Added unit test

@dperny dperny force-pushed the workaround-attachments branch from a806a0a to a2fd247 Compare October 13, 2017 20:24
@codecov

codecov bot commented Oct 13, 2017

Codecov Report

Merging #2409 into master will increase coverage by 0.24%.
The diff coverage is 59.09%.

@@            Coverage Diff             @@
##           master    #2409      +/-   ##
==========================================
+ Coverage   60.51%   60.76%   +0.24%     
==========================================
  Files         128      128              
  Lines       26339    26361      +22     
==========================================
+ Hits        15940    16018      +78     
+ Misses       9004     8938      -66     
- Partials     1395     1405      +10

// if the task is an attachment, then we just delete it. the allocator
// will do the heavy lifting. basically, GetAttachment will return the
// attachment if that's the kind of runtime, or nil if it's not.
if task.Spec.GetAttachment() != nil {
Contributor

Spoke with @dperny offline but capturing here for everyone else: I guess it makes sense to enforce a model where the task reaper is the only place where tasks are deleted. We could mark these tasks as orphaned and let them be cleaned out by the reaper, with the caveat that the network attachments are removable for orphaned tasks.

Collaborator

I agree with marking the tasks Orphaned when the associated node is deleted. I think that's exactly the right behavior.

This should be done in the orchestrators, which is what controls task lifecycle. For example, in the replicated orchestrator:

func (r *Orchestrator) handleTaskEvent(ctx context.Context, event events.Event) {
        switch v := event.(type) {
        case api.EventDeleteNode:
                r.restartTasksByNodeID(ctx, v.Node.ID)

We would want to modify this code to set the old task's state (not desired state) to Orphaned. The code here might be a little hard to follow, because the tasks get added to a map that's eventually processed as a batch, with those tasks getting passed to Restart. We could potentially have a separate map for tasks that need to become orphaned, and pass them to the restart manager in the same way, but then set the state to Orphaned after calling Restart.

The global orchestrator would need similar changes.

For network attachment tasks, I'm not entirely sure what the best way is. We could create a simple orchestrator for those that just watches for node deletion events. I think that's the cleanest way, but as a simpler kludge we could handle this in the network allocator.

I don't think any changes in the task reaper or network allocator are necessary, because once a task's state is set to Orphaned, its network resources are supposed to be freed. But it's definitely worth confirming that this works as expected after the node has been deleted.
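The orchestrator change described above — on a node-delete event, set the old tasks' observed state (not desired state) to Orphaned so their network resources can be freed — might look roughly like this. All names here (`Orchestrator`, `EventDeleteNode`, the state constants) are simplified stand-ins, not the real swarmkit types, and the real code batches these updates through a store transaction:

```go
package main

import "fmt"

type TaskState int

const (
	TaskStateRunning TaskState = iota
	TaskStateOrphaned
)

type Task struct {
	ID           string
	NodeID       string
	State        TaskState // observed state: this is what gets set to Orphaned
	DesiredState TaskState // left untouched, per the suggestion above
}

type EventDeleteNode struct{ NodeID string }

type Orchestrator struct {
	tasks []*Task
}

// handleEvent mirrors the shape of the handleTaskEvent switch quoted above:
// on node deletion, mark that node's tasks Orphaned so the allocator can
// free their network resources. A real orchestrator would also queue each
// task for the restart manager here.
func (o *Orchestrator) handleEvent(ev interface{}) {
	switch v := ev.(type) {
	case EventDeleteNode:
		for _, t := range o.tasks {
			if t.NodeID == v.NodeID {
				t.State = TaskStateOrphaned
			}
		}
	}
}

func main() {
	o := &Orchestrator{tasks: []*Task{
		{ID: "t1", NodeID: "node-1"},
		{ID: "t2", NodeID: "node-2"},
	}}
	o.handleEvent(EventDeleteNode{NodeID: "node-1"})
	for _, t := range o.tasks {
		fmt.Println(t.ID, t.State == TaskStateOrphaned)
	}
}
```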

Contributor

@anshulpundir anshulpundir Oct 16, 2017

I see that when we remove a node, we delete all tasks for that node from the store. What is the reason not to keep the history around in that case? Since the service may still be around, doesn't it make sense to keep the task history from that node? @aaronlehmann

Collaborator

It was done for global service tasks because otherwise these tasks would stay in the store forever. The task reaper keeps a certain number per node, so there's no provision for removing all the tasks from dead nodes. I'm not sure I see a better way than deleting them immediately, because with the node no longer in the system, we'll never know when a task has truly shut down and is finally safe to delete. Possibly we could use the orphaned state, as discussed above, though if we wanted to be really careful, we would set that state only after a delay, like we do for unresponsive nodes.

For reference, here's the commit that made the change: 56463e4

@dperny dperny force-pushed the workaround-attachments branch from a2fd247 to 4e11431 Compare October 13, 2017 22:25
@dperny
Collaborator Author

dperny commented Oct 17, 2017

I think the correct approach is to have the node delete operation also mark all of the node's attachments as ORPHANED. Done this way, a node cannot be removed without its attachments being orphaned, and an orphaned attachment cannot exist as long as its node is present. This prevents the issue where we might fail in between deleting a node and orphaning its tasks, which means we don't have to reconcile the state of attachments (yet, at least; we will still need a permanent fix to cover all of the other situations in which an attachment can become "stuck").
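The appeal of doing both operations in one store transaction is that either everything happens or nothing does, leaving no window where the node is gone but its attachments survive. A toy sketch of that shape, with an in-memory store and illustrative names (`Store`, `RemoveNode`) that are not the real swarmkit API:

```go
package main

import (
	"errors"
	"fmt"
)

type Task struct {
	ID         string
	NodeID     string
	Attachment bool // true for network-attachment tasks
	Orphaned   bool
}

// Store is a toy in-memory stand-in for swarmkit's store.
type Store struct {
	nodes map[string]bool
	tasks map[string]*Task
}

// Update runs fn as a transaction. (Real swarmkit transactions roll back
// on error; this toy simply validates before mutating.)
func (s *Store) Update(fn func(*Store) error) error { return fn(s) }

// RemoveNode deletes the node and orphans its attachment tasks within the
// same transaction, so a node cannot be removed without its attachments
// being orphaned.
func RemoveNode(s *Store, nodeID string) error {
	return s.Update(func(tx *Store) error {
		if !tx.nodes[nodeID] {
			return errors.New("node not found: " + nodeID)
		}
		delete(tx.nodes, nodeID)
		for _, t := range tx.tasks {
			if t.NodeID == nodeID && t.Attachment {
				t.Orphaned = true
			}
		}
		return nil
	})
}

func main() {
	s := &Store{
		nodes: map[string]bool{"node-1": true},
		tasks: map[string]*Task{
			"t1": {ID: "t1", NodeID: "node-1", Attachment: true},
			"t2": {ID: "t2", NodeID: "node-1"},
		},
	}
	if err := RemoveNode(s, "node-1"); err != nil {
		panic(err)
	}
	fmt.Println(s.tasks["t1"].Orphaned, s.tasks["t2"].Orphaned)
}
```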

When a node is removed, delete all of its attachment tasks, so that any
networks being used by those tasks can be successfully removed.

Provides a workaround to the state where a node with attachments is
somehow removed from the cluster while attached to a network, preventing
the network from being removed. Does not fix many other related bugs.

Includes a unit test for the function that removes node attachment
tasks.

Signed-off-by: Drew Erny <drew.erny@docker.com>
@dperny dperny force-pushed the workaround-attachments branch from 4e11431 to 0c7b2fc Compare October 17, 2017 22:46
@nishanttotla nishanttotla self-requested a review October 18, 2017 18:32
@dperny
Collaborator Author

dperny commented Oct 18, 2017

Applied this patch to a dev build of the Docker daemon and I can confirm it works: when a node is removed, its tasks are removed with it and the network can then be successfully removed.

Contributor

@anshulpundir anshulpundir left a comment

Keeping the task lifecycle within the orchestrator does make sense to me, although it creates a crash-consistency problem: when a node is removed, the orchestrator changes the desired state of the deleted node's tasks outside the node-remove transaction.

We can either live with that small window where it's possible to lose task deletion for a removed node, or we could allow task state transactions outside the orchestrator. Maybe the first approach is better, supplemented by some other way to clean up the store in the background.

Since we're pressed for time, I am OK with treating the current approach as a short-term fix and working on the longer-term fix, possibly a separate lightweight orchestrator for the network attachments.

Thoughts? @dperny @aaronlehmann @nishanttotla

@@ -313,6 +336,10 @@ func (s *Server) RemoveNode(ctx context.Context, request *api.RemoveNodeRequest)
return err
}

if err := removeNodeAttachments(tx, request.NodeID); err != nil {
Contributor

It probably makes sense to add a comment here to say why we're doing this.

Contributor

@anshulpundir anshulpundir left a comment

After a quick discussion between @dperny, @andrewhsu, and me, we decided to merge this. Speaking with @dperny: this should serve as the workaround for the most common scenario, where the network is not removable because a serviceless task is running on a node that is gone. @dperny will provide a writeup.

Does it make sense to run the e2e suite before merging this?

I'd also suggest opening another issue in swarmkit for the long term fix capturing some of the discussion from this PR.

@dperny dperny merged commit da5ee2a into moby:master Oct 18, 2017
dperny added a commit to dperny/swarmkit-1 that referenced this pull request Oct 23, 2017
When a node is removed, delete all of its attachment tasks, so that any
networks being used by those tasks can be successfully removed.

Provides a workaround to the state where a node with attachments is
somehow removed from the cluster while attached to a network, preventing
the network from being removed. Does not fix many other related bugs.

Includes a unit test for the function that removes node attachment
tasks.

Cherry picks moby#2409 to 17.03. Cherry-pick applies cleanly.

(cherry picked from commit 0c7b2fc)

Signed-off-by: Drew Erny <drew.erny@docker.com>
dperny added a commit to dperny/swarmkit-1 that referenced this pull request Oct 23, 2017 (same cherry-pick message as above)
marcusmartins added a commit to marcusmartins/docker that referenced this pull request Nov 3, 2017
Upgrade swarmkit dependency.

Changes:
moby/swarmkit@ce5f7b8a (HEAD -> master, origin/master, origin/HEAD) Merge pull request moby/swarmkit#2411 from crunchywelch/2401-arm64_support
moby/swarmkit@b0856099 Merge pull request moby/swarmkit#2423 from thaJeztah/new-misty-handle
moby/swarmkit@2bd294fc Update Misty's GitHub handle
moby/swarmkit@0769c605 Comments for orphaned state/task reaper. (moby/swarmkit#2421)
moby/swarmkit@de950a7e Generic resource cli (moby/swarmkit#2347)
moby/swarmkit@312be598 Provide custom gRPC dialer to override default proxy dialer (moby/swarmkit#2419)
moby/swarmkit@4f12bf79 Merge pull request moby/swarmkit#2415 from cheyang/master
moby/swarmkit@8f9f7dc1 add pid limits
moby/swarmkit@da5ee2a6 Merge pull request moby/swarmkit#2409 from dperny/workaround-attachments
moby/swarmkit@0c7b2fc2 Delete node attachments when node is removed
moby/swarmkit@9d702763 normalize "aarch64" architectures to "arm64"

moby/swarmkit@28f91d8...ce5f7b8

Signed-off-by: Marcus Martins <marcus@docker.com>