kubelet/kubenet: split hostport handling into separate module #26934
Conversation
2016/06/07 07:20:44 e2e.go:212: Running: build-release
Sigh... a couple instances of:
which seems to be flake #26431
// Open any hostports the pod's containers want
runningPods, err := plugin.getRunningPods()
if err == nil {
	err = plugin.hostportHandler.OpenPodHostportsAndSync(pod, BridgeName, runningPods)
Say the CNI ADD succeeded, but the port is in use. SetUpPod will end up setting up the veth but fail at opening the hostport. IIUC, kubelet will just kill the infra container without calling TearDownPod (it assumes SetUpPod won't be partially successful). Then this pod will be stuck, because CNI ADD will always return failure since eth0 has already been set up.
Needs some restructuring. Other parts look good overall.
You are correct that this PR will exacerbate the situation a bit, so I'll try to fix that. However, it looks like all the network plugins have problems with this, since they don't necessarily clean up after themselves on errors in SetUpPod(). And since (at least) dockertools/manager.go never calls TearDownPod() when setup fails, I think there are already cases where failures will leave the pod network configured... I'll try to figure something out here.
I don't think the pod will actually get stuck, since eth0 will get destroyed along with the infra container's network namespace. But what won't be cleaned up is any IPAM leases or shaper stuff.
oh, that is even worse, because kubelet will keep retrying and use up all the IPs. :(
Researching this more indicates that the docker runtime will eventually tear down the infra container. I'm not sure about the rkt runtime. But depending on the runtime to call TearDownPod() on a SetUpPod() failure seems pretty fragile, and I think the plugins should do it themselves if they can.
Agreed. That is mostly because the network plugin is event-triggered.
I was thinking of adding a Reconcile interface to the network plugin. It could be triggered periodically and do whatever it wants to ensure/clean up network configuration.
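To make the Reconcile idea above concrete, here is a minimal sketch of what such a periodic hook could look like. All names here (Reconciler, fakePlugin, the lease map) are illustrative stand-ins, not the actual kubelet network plugin API: the point is just that the plugin compares its own recorded state against the set of pods the kubelet believes are running, and releases anything orphaned.

```go
package main

import "fmt"

// Reconciler is a hypothetical interface a network plugin could
// implement so the kubelet can trigger periodic cleanup, instead of
// relying only on event-driven SetUpPod/TearDownPod calls.
type Reconciler interface {
	// Reconcile receives the IDs of pods believed to be running and
	// releases any plugin-side state (IPAM leases, hostports, shaper
	// config) that belongs to pods no longer in the set.
	Reconcile(runningPodIDs map[string]bool) error
}

// fakePlugin tracks IPAM leases per pod ID, standing in for real state.
type fakePlugin struct {
	leases map[string]string // pod ID -> leased IP
}

func (p *fakePlugin) Reconcile(running map[string]bool) error {
	for id, ip := range p.leases {
		if !running[id] {
			fmt.Printf("releasing leaked lease %s for pod %s\n", ip, id)
			delete(p.leases, id)
		}
	}
	return nil
}

func main() {
	p := &fakePlugin{leases: map[string]string{
		"pod-a": "10.1.0.2",
		"pod-b": "10.1.0.3",
	}}
	// pod-b is gone; a periodic reconcile reclaims its lease so a
	// failed SetUpPod can no longer leak IPs forever.
	p.Reconcile(map[string]bool{"pod-a": true})
	fmt.Println(len(p.leases))
}
```

This would address the IP-exhaustion concern raised earlier: even if a SetUpPod failure leaks a lease, the next reconcile pass reclaims it.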
Still flaking on #26431...
	return false, nil
}

func normalizeRule(rule string) (string, error) {
do we have a shell-command-line-parsing utility somewhere?
maybe this function should panic/fail if it hits something that it's not going to parse correctly (eg, '\"' or '\ ' [that's backslash space])
oh, or just fix it to normalize the []string form rather than the string form
It's not easy to do []string normalization, because the rules coming from Save()/SaveAll() are already in string-form, and we'd need to split them apart into []string and handle quoted comments.
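For context on why string-form normalization is needed at all: iptables-save quotes `--comment` arguments, so a rule the code wrote as `--comment foo` comes back as `--comment "foo"` and naive string comparison fails. Below is a minimal sketch of quote-stripping normalization under that assumption; it is illustrative, not the PR's exact normalizeRule, and it errors on unbalanced quotes rather than guessing (per the review suggestion above).

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeRule strips double quotes around rule arguments so a rule
// written without quotes compares equal to the quoted form that
// iptables-save emits. It returns an error for unterminated quotes
// instead of silently mis-parsing.
func normalizeRule(rule string) (string, error) {
	normalized := ""
	remaining := strings.TrimSpace(rule)
	for len(remaining) > 0 {
		if strings.HasPrefix(remaining, `"`) {
			// Quoted block: copy its contents, dropping both quotes.
			end := strings.Index(remaining[1:], `"`)
			if end < 0 {
				return "", fmt.Errorf("unterminated quoted string in %q", rule)
			}
			normalized += remaining[1 : end+1]
			remaining = remaining[end+2:]
		} else {
			// Unquoted run: copy up to the next quote (or the end).
			end := strings.Index(remaining, `"`)
			if end < 0 {
				normalized += remaining
				break
			}
			normalized += remaining[:end]
			remaining = remaining[end:]
		}
	}
	return normalized, nil
}

func main() {
	out, err := normalizeRule(`-A CHAIN -m comment --comment "pod hostport" -j ACCEPT`)
	fmt.Println(out, err)
}
```

Note this sketch does not handle backslash escapes inside quotes, which is exactly the edge case the reviewer flagged.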
@freehan @danwinship PTAL, thanks!
Force-pushed from 200f879 to 841e320
@k8s-bot test this issue: #IGNORE
	}
}

func NewTestHostportHandler(iptables utiliptables.Interface, portOpener hostportOpener) HostportHandler {
Maybe put this in hostport_test?
@freehan @danwinship PTAL, thanks!
runningPods := make([]*hostport.RunningPod, 0)
for _, p := range pods {
	for _, c := range p.Containers {
		if c.Name != dockertools.PodInfraContainerName {
Add a comment here, something like: "Assuming pod specs yielded by runtime.GetPods include the name and ID of the containers in each pod"
if err := plugin.setup(namespace, name, id, pod); err != nil {
	// Make sure everything gets cleaned up on errors
	podIP, _ := plugin.podIPs[id]
	plugin.teardown(namespace, name, id, podIP)
Maybe log the err from teardown?
I just have a bunch of nits.
Relying on the runtime to later call cleanup is fragile, so make sure that everything gets nicely cleaned up when setup errors occur.
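The clean-up-on-error pattern described here can be sketched as follows. The shape mirrors the setup/teardown snippet in the diff, but the bodies and names below are stand-ins for illustration, not the PR's actual implementation: on a partial setup failure the plugin tears down its own state immediately, logging (not masking) any teardown error, rather than hoping the runtime calls TearDownPod later.

```go
package main

import (
	"fmt"
	"log"
)

// plugin stands in for the kubenet plugin; podIPs records per-pod
// state created during setup that must be undone on failure.
type plugin struct {
	podIPs map[string]string
}

// setup simulates a partial failure: the veth/IP got configured, but
// opening the hostport failed afterward.
func (p *plugin) setup(id string) error {
	p.podIPs[id] = "10.1.0.9"
	return fmt.Errorf("hostport 8080 already in use")
}

func (p *plugin) teardown(id, ip string) error {
	delete(p.podIPs, id)
	return nil
}

func (p *plugin) SetUpPod(id string) error {
	if err := p.setup(id); err != nil {
		// Clean up our own partial state instead of relying on the
		// runtime, and log the teardown error rather than dropping it
		// or masking the original setup error.
		if tErr := p.teardown(id, p.podIPs[id]); tErr != nil {
			log.Printf("teardown after failed setup also failed: %v", tErr)
		}
		return err
	}
	return nil
}

func main() {
	p := &plugin{podIPs: map[string]string{}}
	err := p.SetUpPod("pod-a")
	fmt.Println(err, len(p.podIPs))
}
```

Returning the original setup error (not the teardown error) keeps the failure the caller sees meaningful.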
@freehan updated for your comments, PTAL
LGTM. This is great! We should get it in. Maybe for 1.3.1
GCE e2e build/test passed for commit a519e8a.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e build/test passed for commit a519e8a.
Automatic merge from submit-queue
This pulls the hostport functionality of kubenet out into a separate module so that it can be more easily tested and potentially used from other code (maybe CNI, maybe downstream consumers like OpenShift, etc). Couldn't find a mock iptables so I wrote one, but I didn't look very hard.
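A mock iptables for tests typically just records the operations it receives so assertions can be made without touching the host. A minimal illustration of the idea follows; the one-method interface and its signature here are hypothetical simplifications, since the real utiliptables.Interface is much larger.

```go
package main

import "fmt"

// ipTables is a deliberately tiny stand-in interface so the fake can
// be demonstrated self-contained; it is not the real
// utiliptables.Interface.
type ipTables interface {
	EnsureRule(chain string, args ...string) error
}

// fakeIPTables records every call instead of mutating kernel state,
// letting tests assert on exactly which rules were requested.
type fakeIPTables struct {
	calls []string
}

func (f *fakeIPTables) EnsureRule(chain string, args ...string) error {
	f.calls = append(f.calls, fmt.Sprintf("%s %v", chain, args))
	return nil
}

func main() {
	f := &fakeIPTables{}
	var ipt ipTables = f
	ipt.EnsureRule("KUBE-HOSTPORTS", "-p", "tcp", "--dport", "8080", "-j", "DNAT")
	fmt.Println(len(f.calls))
}
```

A test then compares `f.calls` against the expected rules, which is what makes the split-out hostport module independently testable.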
@freehan @thockin @bprashanth