allocator: Avoid assigning duplicate IPs during initialization #2237

Merged
aaronlehmann merged 1 commit into docker:master from aaronlehmann:duplicate-ips on Jun 12, 2017

Conversation

@aaronlehmann
Collaborator

aaronlehmann commented Jun 10, 2017

When the allocator starts up, there is a pass that "allocates" the
existing tasks/nodes/services in the store. In fact, existing tasks are
typically already allocated, and this is mostly populating local state
to reflect which IPs are taken. However, if there are any tasks in the
store which are brand new, or which previously failed to allocate, these will
actually receive new allocations.

The problem is that allocation of new IPs is interspersed with updating
local state with existing IPs. If a task, node, or service that needs an
IP is processed before one that claims a specific IP, the IP claimed by
the latter task may end up assigned to the former, producing a duplicate.

This change makes the allocator do two passes on initialization. First
it handles objects that claim a specific IP, then it handles all other
objects.
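
Roughly, the initialization ends up structured as in the sketch below. This is a minimal illustration of the two-pass idea, not the actual swarmkit code: the Allocator stub, the existingOnly flag, and the placeholder helper bodies are assumptions for clarity.

```go
package main

import (
	"context"
	"fmt"
)

type Allocator struct{}

// Each helper either records addresses that stored objects already claim
// (existingOnly == true) or hands out new ones (existingOnly == false).
// The bodies here are placeholders for illustration.
func (a *Allocator) allocateNodes(ctx context.Context, existingOnly bool) error {
	fmt.Println("nodes, existing only:", existingOnly)
	return nil
}

func (a *Allocator) allocateServices(ctx context.Context, existingOnly bool) error {
	fmt.Println("services, existing only:", existingOnly)
	return nil
}

func (a *Allocator) allocateTasks(ctx context.Context, existingOnly bool) error {
	fmt.Println("tasks, existing only:", existingOnly)
	return nil
}

func (a *Allocator) initNetworkState(ctx context.Context) error {
	// Pass 1: mark every IP that an existing object already claims as taken.
	// Pass 2: allocate fresh IPs for objects that still need them, so a new
	// allocation can never grab an address reserved in pass 1.
	for _, existingOnly := range []bool{true, false} {
		if err := a.allocateNodes(ctx, existingOnly); err != nil {
			return err
		}
		if err := a.allocateServices(ctx, existingOnly); err != nil {
			return err
		}
		if err := a.allocateTasks(ctx, existingOnly); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	if err := (&Allocator{}).initNetworkState(context.Background()); err != nil {
		fmt.Println("init failed:", err)
	}
}
```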

cc @abhinandanpb @mavenugo @sanimej @ijc

@aaronlehmann

Collaborator

aaronlehmann commented Jun 10, 2017

One open question: Do we need to do something similar for networks? I notice we store a DriverState field on networks, but I'm not sure if this includes IP addresses.

allocator: Avoid assigning duplicate IPs during initialization

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>

@aaronlehmann aaronlehmann referenced this pull request Jun 10, 2017

Closed

17.06.0 RC3 tracker #8

40 of 40 tasks complete
@aluzzardi

Contributor

aluzzardi commented Jun 10, 2017

LGTM

Unrelated to this PR, but the complexity of this part is frightening; I had a hard time understanding the implications :/

@@ -212,6 +213,7 @@ func TestAllocator(t *testing.T) {
	go func() {
		assert.NoError(t, a.Run(context.Background()))
	}()
	defer a.Stop()

@ijc

ijc Jun 12, 2017

Contributor

This is an unrelated fix I think?

@aaronlehmann

aaronlehmann Jun 12, 2017

Collaborator

Not a fix, just a cleanup to move a.Stop to a defer statement.
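
For illustration, the general pattern being applied here is sketched below with a generic, stand-in component type rather than the real allocator test: registering Stop via defer right after startup means the cleanup runs on every return path, even if a later assertion fails and returns early.

```go
package main

import "fmt"

type component struct{ running bool }

func (c *component) Run()  { c.running = true }
func (c *component) Stop() { c.running = false; fmt.Println("stopped") }

func run() error {
	c := &component{}
	c.Run()
	// The deferred Stop executes on every return path below, including
	// early returns caused by failed assertions or errors.
	defer c.Stop()

	// ... work or test assertions that may fail and return early ...
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("done")
}
```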

@ijc

Contributor

ijc commented Jun 12, 2017

I 100% agree WRT the complexity.

In this case, although the diff looks pretty large and confusing, the bulk of it is just factoring out allocateTasks and allocateServices and then arranging to call those and allocateNodes twice. IMO the refactoring is a good thing in its own right, since that function is pretty big and tricky to follow.

So, LGTM. (I'm a little worried about the apparent non-determinism in the test case, implied by the looping 100 times, but I guess that is unavoidable for an issue like this).

@aaronlehmann

Collaborator

aaronlehmann commented Jun 12, 2017

I didn't observe any non-determinism, but wanted to make sure the test case is aggressive enough to catch regressions. The problem triggers much more quickly when creating tasks in reverse numerical order, which is why the test does this. If I remember correctly, it only takes 3 tasks to trigger the problem. But in forward numerical order, it took about 52 tasks (since the allocator iterates over tasks in lexical order, which I suppose is close to numerical order). To avoid encoding assumptions about the iteration order, I decided to do 100 iterations. The test case runs instantly so I don't think there's any downside.

We could consider randomizing the task IDs, but I generally prefer to avoid nondeterminism in tests.
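
For illustration, a rough, self-contained sketch of that test strategy is below. The task-ID scheme and allocateAll are stand-ins so the sketch runs on its own; the real test uses the allocator and store directly.

```go
package main

import "fmt"

// allocateAll stands in for running the allocator's initialization over the
// created tasks; here it just hands out unique placeholder addresses.
func allocateAll(ids []string) map[string]string {
	assigned := make(map[string]string)
	for i, id := range ids {
		assigned[id] = fmt.Sprintf("10.0.0.%d", i+1)
	}
	return assigned
}

func main() {
	const numTasks = 100

	// Create task IDs in reverse numerical order, which reproduced the bug
	// with far fewer tasks than forward order did.
	ids := make([]string, 0, numTasks)
	for i := numTasks - 1; i >= 0; i-- {
		ids = append(ids, fmt.Sprintf("task%d", i))
	}

	assigned := allocateAll(ids)

	// Verify that no two tasks were assigned the same IP.
	seen := make(map[string]string) // IP -> task ID
	for id, ip := range assigned {
		if other, dup := seen[ip]; dup {
			panic(fmt.Sprintf("duplicate IP %s assigned to %s and %s", ip, id, other))
		}
		seen[ip] = id
	}
	fmt.Println("no duplicate IPs among", numTasks, "tasks")
}
```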

@mavenugo

Contributor

mavenugo commented Jun 12, 2017

Thanks @aaronlehmann @aluzzardi @ijc. I'm not super familiar with this area of the code, but it does feel complicated.

BTW, should this be backported to 17.03 as well? I have seen multiple reports of duplicate IP usage in swarm mode.

@aaronlehmann

Collaborator

aaronlehmann commented Jun 12, 2017

BTW, should this be backported to 17.03 as well?

Yes, if there are plans for another 17.03 release, I think it should be. There are also other high-priority fixes that would make sense to backport. Please let me know if I should work on these backports.

@abhi

Member

abhi commented Jun 12, 2017

LGTM

@aaronlehmann aaronlehmann merged commit a4bf013 into docker:master Jun 12, 2017

2 checks passed

ci/circleci Your tests passed on CircleCI!
dco-signed All commits are signed

@aaronlehmann aaronlehmann deleted the aaronlehmann:duplicate-ips branch Jun 12, 2017

@aaronlehmann aaronlehmann added this to the 17.06 milestone Jun 12, 2017

andrewhsu added a commit to andrewhsu/docker-ce that referenced this pull request Jun 12, 2017

revendor github.com/docker/swarmkit to ef3c57a
To get the changes:
* docker/swarmkit#2234
* docker/swarmkit#2237

Signed-off-by: Andrew Hsu <andrewhsu@docker.com>

andrewhsu added a commit to andrewhsu/docker-ce that referenced this pull request Jun 13, 2017

revendor github.com/docker/swarmkit to 6083c76
To get the changes:
* docker/swarmkit#2234
* docker/swarmkit#2237
* docker/swarmkit#2238

Signed-off-by: Andrew Hsu <andrewhsu@docker.com>