
allocator: Avoid assigning duplicate IPs during initialization #2237

Merged (1 commit) on Jun 12, 2017

Conversation

@aaronlehmann (Collaborator)

When the allocator starts up, there is a pass that "allocates" the
existing tasks/nodes/services in the store. In fact, existing tasks are
typically already allocated, and this pass mostly populates local state
to reflect which IPs are taken. However, any tasks in the store that are
brand new, or that previously failed to allocate, will actually receive
new allocations.

The problem is that allocation of new IPs is interspersed with updating
local state with existing IPs. If a task, node, or service that needs an
IP is processed before one that claims a specific IP, the IP claimed by
the latter may already have been handed out to the former.

This change makes the allocator do two passes on initialization. First
it handles objects that claim a specific IP, then it handles all other
objects.

cc @abhinandanpb @mavenugo @sanimej @ijc

@aaronlehmann (Collaborator, Author)

One open question: Do we need to do something similar for networks? I notice we store a DriverState field on networks, but I'm not sure if this includes IP addresses.

@aluzzardi (Member)

LGTM

Unrelated to this PR, but the complexity of this part is frightening; I had a hard time understanding the implications :/

@@ -212,6 +213,7 @@ func TestAllocator(t *testing.T) {
 	go func() {
 		assert.NoError(t, a.Run(context.Background()))
 	}()
+	defer a.Stop()
Contributor

This is an unrelated fix I think?

aaronlehmann (Collaborator, Author)

Not a fix, just a cleanup to move a.Stop to a defer statement.

@ijc (Contributor)

ijc commented Jun 12, 2017

I 100% agree WRT the complexity.

In this case, although the diff looks pretty large and confusing, the bulk of it is just factoring out allocateTasks and allocateServices and then arranging to call those and allocateNodes twice. IMO the refactoring is a good thing in its own right, since that function is pretty big and tricky to follow.

So, LGTM. (I'm a little worried about the apparent non-determinism in the test case, implied by the looping 100 times, but I guess that is unavoidable for an issue like this).

@aaronlehmann (Collaborator, Author)

I didn't observe any non-determinism, but wanted to make sure the test case is aggressive enough to catch regressions. The problem triggers much more quickly when creating tasks in reverse numerical order, which is why the test does this. If I remember correctly, it only takes 3 tasks to trigger the problem. But in forward numerical order, it took about 52 tasks (since the allocator iterates over tasks in lexical order, which I suppose is close to numerical order). To avoid encoding assumptions about the iteration order, I decided to do 100 iterations. The test case runs instantly so I don't think there's any downside.

We could consider randomizing the task IDs, but I generally prefer to avoid nondeterminism in tests.

@mavenugo (Contributor)

Thanks @aaronlehmann @aluzzardi @ijc. I am not super familiar with this area of the code, but it does feel complicated.

BTW, should this be backported to 17.03 as well? I have seen multiple reports of duplicate IP usage in swarm mode.

@aaronlehmann (Collaborator, Author)

BTW, should this be backported to 17.03 as well?

Yes, if there are plans for another 17.03 release, I think it should be. There are also other high-priority fixes that would make sense to backport. Please let me know if I should work on these backports.

@abhi (Contributor) commented Jun 12, 2017

LGTM

@aaronlehmann aaronlehmann merged commit a4bf013 into moby:master Jun 12, 2017
@aaronlehmann aaronlehmann deleted the duplicate-ips branch June 12, 2017 16:44
@aaronlehmann aaronlehmann added this to the 17.06 milestone Jun 12, 2017
silvin-lubecki pushed a commit to silvin-lubecki/docker-ce that referenced this pull request Feb 3, 2020
To get the changes:
* moby/swarmkit#2234
* moby/swarmkit#2237
* moby/swarmkit#2238

Signed-off-by: Andrew Hsu <andrewhsu@docker.com>
silvin-lubecki pushed a commit to silvin-lubecki/engine-extract that referenced this pull request Feb 3, 2020
To get the changes:
* moby/swarmkit#2234
* moby/swarmkit#2237
* moby/swarmkit#2238

Signed-off-by: Andrew Hsu <andrewhsu@docker.com>