Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sdn: try cleaning up OVS rules even if sandbox is gone #18166

Merged

Conversation

dcbw
Copy link
Contributor

@dcbw dcbw commented Jan 18, 2018

If a sandbox is deleted underneath kubernetes its netns will
be gone and its veth interface will be deleted by the kernel.
That means we can't inspect the veth for its IP address and
other details, which are used to remove OVS flows for the
interface.

But we've already got code to find out the IP using the
sandbox ID which kubelet passes down to us. Let's use
that code to at least delete the stale OVS flows.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1518684

@danwinship @openshift/networking @knobunc

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 18, 2018
@dcbw dcbw force-pushed the ovs-cleanup-when-sandbox-is-gone branch from b71decb to db7b59d Compare January 18, 2018 20:23
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 18, 2018
@danwinship
Copy link
Contributor

Hm... we could find the veth name in this case too; given the OVS flow, we can extract the port number and use ovs-vsctl to get the corresponding port name (ovs-vsctl --no-heading --data=bare --columns=name find Interface ofport=${PORT})...

@dcbw
Copy link
Contributor Author

dcbw commented Jan 18, 2018

/test unit

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 19, 2018
@dcbw
Copy link
Contributor Author

dcbw commented Jan 19, 2018

Hm... we could find the veth name in this case too; given the OVS flow, we can extract the port number and use ovs-vsctl to get the corresponding port name (ovs-vsctl --no-heading --data=bare --columns=name find Interface ofport=${PORT})...

@danwinship ok, how about now? Let's just stuff the sandbox ID into external-ids and use that to clean things up when we can. The one case where we need the hostVeth name is deleting the port, and we can get that if we have the sandbox ID. (open question, can we just delete the port & interface records from the OVSDB directly, or does 'del-port' do something special?)

As a further optimization, we could stuff the IP address into external-ids too and kill the flow note stuff perhaps? Much nicer parsing OVSDB output than dump-flows.

Copy link
Contributor

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@knobunc knobunc requested review from danwinship, rajatchopra and pravisankar and removed request for smarterclayton January 19, 2018 14:24
@knobunc knobunc self-assigned this Jan 19, 2018
@knobunc
Copy link
Contributor

knobunc commented Jan 19, 2018

/retest

Copy link
Contributor

@danwinship danwinship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LATM ("looks awesome to me").

(open question, can we just delete the port & interface records from the OVSDB directly, or does 'del-port' do something special?)

That doesn't appear to be documented. (Note that you can't just ovs-vsctl destroy most objects anyway; you have to remove references to them (from br0's ports property in this case) and let OVS GC them.)

As a further optimization, we could stuff the IP address into external-ids too and kill the flow note stuff perhaps? Much nicer parsing OVSDB output than dump-flows.

Yes

func (oc *ovsController) ensureOvsPort(hostVeth string) (int, error) {
return oc.ovs.AddPort(hostVeth, -1)
func (oc *ovsController) ensureOvsPort(hostVeth, sandboxID string) (int, error) {
return oc.ovs.AddPort(hostVeth, -1, "external-ids=sandbox="+sandboxID, "--", "set", "Port", hostVeth, "external-ids=sandbox="+sandboxID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh.

Fortunately, I don't think there's any need to bother with Port properties. Commands that act on Ports will allow a Port name rather than a UUID (like we used to do in ClearPodBandwidth()), and the Port name is the same as the Interface name, so if you changed ovsif.Find() to let you specify the column you wanted returned, then you could have getPortsForSandbox() return a list of Interface names rather than a list of Port UUIDs, and that would work everywhere. In fact, it would simplify TearDownPod().

return err
}

portUUIDs, err := oc.getPortsForSandbox(sandboxID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we now absolutely depend on every port having a sandbox ID in its external-ids then we need to make AlreadySetUp() fail for bridges with non-external-id'ed ports attached. So change ruleVersion at the top of the file. (In which case you'll also want danwinship@dbbf89ac.)

UpdatePod() will also need to be updated to call ensureOvsPort().

uuids := make([]string, 0)
for _, line := range strings.Split(output, "\n") {
parts := strings.Split(line, ":")
if len(parts) >= 2 && strings.TrimSpace(parts[0]) == "_uuid" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can force ovs-vsctl to do all this for you. --no-heading --columns=_uuid find ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then use strings.Fields to get the trimmed array.

func (oc *ovsController) SetPodBandwidth(hostVeth string, ingressBPS, egressBPS int64) error {
// note pod ingress == OVS egress and vice versa
func (oc *ovsController) getPortsForSandbox(sandboxID string) ([]string, error) {
return oc.ovs.Find("port", "external-ids=sandbox="+sandboxID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does a full match on external-ids (ie, it will fail if external-ids also contains other keys). Changing the = between external-ids and sandbox to a : will change it to a subkey match (checking only that external-ids contains a matching sandbox key)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

func (oc *ovsController) SetPodBandwidth(hostVeth, sandboxID string, ingressBPS, egressBPS int64) error {
// note pod ingress == OVS egress and vice versa

if err := oc.ClearPodBandwidth(nil, sandboxID); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could remove the hostVeth arg to SetPodBandwidth() and just call getPortsForSandbox() and then pass that list to ClearPodBandwidth() (which then no longer needs its own call-getPortsForSandbox() special case), and change the remaining hostVeth references in SetPodBandwidth() to use the portsList instead. (I'm not totally sure if this would be a net improvement.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danwinship I'd rather still pass it in so we know what the most-recently-created veth will be for actually setting the bandwidth. Otherwise we'd have to assume it's the first item in the ports array, or do it for all ports. We already have the info in the veth name here.

But I agree we don't need ClearPodBandwidth() to do its own getPortsForSandboxID().

@dcbw dcbw force-pushed the ovs-cleanup-when-sandbox-is-gone branch from c17cd2b to 22f3e02 Compare January 23, 2018 02:33
@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

@danwinship @rajatchopra PTAL thanks! Addressed all your comments.

@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

/test extended_conformance_install issue #17882

@danwinship
Copy link
Contributor

lgtm except I think you still need to update UpdatePod? Right now the changed ruleVersion means it will run UpdatePod on all the pods, but UpdatePod won't actually set the external-id

@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

lgtm except I think you still need to update UpdatePod? Right now the changed ruleVersion means it will run UpdatePod on all the pods, but UpdatePod won't actually set the external-id

@danwinship do we? If we bump the flow version, on restart that out to blows the entire bridge away on upgrade including all the flows, ports, interfaces, qos, etc. That makes Update() fail and the pod's sandbox gets killed and then completely restarted.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 23, 2018
@danwinship
Copy link
Contributor

duh, right. (not bothering to /lgtm yet since you have to rebase)

@dcbw dcbw force-pushed the ovs-cleanup-when-sandbox-is-gone branch from 22f3e02 to db19d04 Compare January 23, 2018 17:27
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 23, 2018
@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

@danwinship rebased

@danwinship
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 23, 2018
@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

/test cmd

@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

/test extended_conformance_install issue #17556

@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

/approve

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 23, 2018
dcbw and others added 3 commits January 23, 2018 16:21
If a sandbox is deleted underneath kubernetes its netns will
be gone and its veth interface will be deleted by the kernel.
That means we can't inspect the veth for its IP address and
other details, which are used to remove OVS flows for the
interface.

But we've already got code to find out the IP using the
sandbox ID which kubelet passes down to us.  Let's use
that code to at least delete the stale OVS flows.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1518684
Store the sandbox ID in the OVSDB as an external-id and use that
to find things we need to clean up.  Previously these were leaked
if the network namespace was removed underneath kubernetes.
@dcbw dcbw force-pushed the ovs-cleanup-when-sandbox-is-gone branch from db19d04 to 87fdfb0 Compare January 23, 2018 22:21
@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 23, 2018
@dcbw
Copy link
Contributor Author

dcbw commented Jan 23, 2018

/test unit

@dcbw
Copy link
Contributor Author

dcbw commented Jan 24, 2018

@danwinship @knobunc can I get another LGTM? nothing has changed in the code, but I mistakenly repushed because I thought that might trigger re-evaluation of the pkg/util/ovs/OWNERS changes

// note pod ingress == OVS egress and vice versa
// Returned list can also be used for port names
func (oc *ovsController) getInterfacesForSandbox(sandboxID string) ([]string, error) {
return oc.ovs.Find("interface", "name", "external-ids:sandbox="+sandboxID)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ovs AddPort() and Create() has 'external-ids=' and Find() has 'external-ids:', initially thought ':' is a typo but looking at the man page, column[:key]=value is the correct syntax.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pravisankar yeah we want a substring match in Find, while Add/Create set it.

@pravisankar
Copy link

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 24, 2018
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, dcbw, pravisankar

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@dcbw
Copy link
Contributor Author

dcbw commented Jan 24, 2018

/test extended_conformance_install issue #17556

@pravisankar
Copy link

/retest

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue.

@openshift-merge-robot openshift-merge-robot merged commit ae43279 into openshift:master Jan 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. component/networking lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants