-
Notifications
You must be signed in to change notification settings - Fork 882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Predictive interface order / default gateway #2093
Comments
In my usecase, I need these networks:
Result: When only having two networks, the macvlan becomes eth0 and default gw. When having more networks, it's totally unpredictable. |
Brilliant solution @cziebuhr , your patch worked for me. Do you think it's likely to get merged soon (or at all)? |
Created PR #2103 which at least solves point a) |
Hey. Sorry for the delayed response. Needed to do some background research as I wasn't familiar with this section of the code. Overall, as stated, I think that the change in #2103 takes us in the right direction. From my query of those who know the history of this part of the code a bit better, the intent of the priority was definitely to allow the user to order the interfaces. Agree that keeping them in a heap works against that goal and so putting them in an ordered array as in the PR will fix that. Now regarding part b) my reading of the code is that the sorting was stable, but perhaps not what we might want. The endpoints were effectively sorted into categories. Non-internal endpoints in the gateway networks come last preceeded by internal endpoints in gateway networks. Internal endpoints in non-gateway networks come precede those and finally the first category is non-internal endpoints in non-gateway networks. Within those categories, if both endpoints have join info, then interfaces with v6 info always come before those without. Finally, they are sorted by priority with ties being resolved by network name lexicographic order. So the question becomes whether this is the correct ordering. Pictorally, this looks something like:
Now we could make it so that priority is considered first so instead what we have is a set of lists like the following ordered by priority:
This would allow the user (when we get to part c)) to specify ordering but still have a set of criteria for breaking ties when priority is the same. Also, if priority isn't specified, the behavior should match what we have today. W.r.t. c) Yes, we need to expose the priority in some way for the endpoints in order to allow the users to specify the ordering. This would require more investigation and obviously would require a separate PR in https://github.com/moby/moby. Suggestions / patches are by all means welcome. |
Am I right that there can only be one gateway network? If yes, then
|
Yep, that's what I was trying to get at. Sorry if unclear. There is one difference though: priority comparison currently puts higher priority as lower in the comparison so higher priority interfaces come first: func (eh epHeap) Less(i, j int) bool {
...
return cip > cjp // endpoint i < endpoint j if prio i > prio j
} (annotation mine) I'd think we'd want to keep this. Although it's less numerically intuitive in the code, having higher priority endpoint interfaces come earlier in the listing would match my intuition. |
Regarding c) The PRs docker/cli#2529 and moby/moby#40974 seem related to this one, the latter requiring #2550 to compile. I do wonder, whether this implementation would expose users to "auto-magic" of epPriority, mentioned in @cziebuhr's initial post. |
Any updates on this issue? |
I tried to setup a container with several networks, but didn't succeed in successfully configuring them. Especially when using ipv6 and trying to get a stable interface name of a macvlan network, I got mad. After looking at the docker sources, I stumbled across the following items:
a) I wonder about the usage of container/heap for sb.endpoints. IMHO heap is useful when using heap.Pop() for removing the respective minimum element. Instead, sb.Endpoints() and sb.getConnectedEndpoints() iterate directly over the sb.endpoints slice. In this case, only the first element is guaranteed to be the minimum element of the slice. All other elements have random order.
In some setups, this leads to interface names inside the container having unpredictive order (moby/moby#25181) and an unpredictive default-gateway. I created a patch to change from heap to an ordered array instead.
b) The eh.Less() function also produces unstable results, when e.g. several networks are internal. I'm unsure if checking for endpointInGWNetwork(), getNetwork().Internal() and joinInfo.gw is essential or if it was just convenience for most setups. I think we should skip processing these checks in eh.Less() and instead move them to sb.getGatewayEndpoint(). Also see #1609, #1443 and #1141.
c) There is already some code to set epPriority, unfortunately it's not yet exported through the api and it's still subject to additional auto-magic. I'd favor a solution where I can change the interface order as a user instead of having different results in each docker version.
The text was updated successfully, but these errors were encountered: