Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cni/linux: ensure only the latest sandbox has external-ids:iface-id set #869

Merged
merged 1 commit into from
Oct 22, 2019

Conversation

dcbw
Copy link
Contributor

@dcbw dcbw commented Oct 18, 2019

Kubelet may keep multiple sandboxes running for a given pod if some are
waiting for garbage collection. We need to make sure that only the
latest sandbox has iface-id set to the container's namespace/name or
ovn-controller may get confused about which OVS port to associate with
the Pod's logical switch port.

@squeed @girishmg @danwinship

@dcbw dcbw force-pushed the pod-iface-id branch 3 times, most recently from fbbe186 to dd58f9e Compare October 18, 2019 18:40
@danwinship
Copy link
Contributor

Shouldn't the old interface have been deleted by this point? Is this a sign that some previous cleanup failed?

Do you have logs showing this happening?

@squeed
Copy link
Contributor

squeed commented Oct 21, 2019

In the case where the sandbox dies, there's no guarantee that the previous one will have a CNI DEL issued before the replacement has a CNI ADD. I've personally seen this, and it's pretty easy to reproduce: just kill the pause container.

1 similar comment
@squeed
Copy link
Contributor

squeed commented Oct 21, 2019

In the case where the sandbox dies, there's no guarantee that the previous one will have a CNI DEL issued before the replacement has a CNI ADD. I've personally seen this, and it's pretty easy to reproduce: just kill the pause container.

Kubelet may keep multiple sandboxes running for a given pod if some are
waiting for garbage collection. We need to make sure that only the
latest sandbox has iface-id set to the container's namespace/name or
ovn-controller may get confused about which OVS port to associate with
the Pod's logical switch port.

Signed-off-by: Dan Williams <dcbw@redhat.com>
@dcbw
Copy link
Contributor Author

dcbw commented Oct 21, 2019

Shouldn't the old interface have been deleted by this point? Is this a sign that some previous cleanup failed?

Do you have logs showing this happening?

@danwinship to echo what Casey said, Kubelet may keep a couple sandboxes for a given pod around up until the GC interval. Then it will delete them as they expire. Only one of the sandboxes will be "live" (eg SANDBOX_READY) but for OVN only that sandbox should have the iface-id set.

@squeed
Copy link
Contributor

squeed commented Oct 22, 2019

Without knowing the OVN data model very well, is it possible to use the containerid, which will be unique, as the externalid?

@girishmg
Copy link
Member

@squeed

The iface-id is already unique across multiple instances of these sandboxes for a a given pod. I think the issue is we cannot have multiple physical OVS internal ports that are all tagged with the same 'iface-id'. The ovn-controller will be confused as to which of the physical port it needs to bind the logical flows to.

// but only the latest one should have the iface-id set.
uuids, _ := ovsFind("Interface", "_uuid", "external-ids:iface-id="+ifaceID)
for _, uuid := range uuids {
if out, err := ovsExec("remove", "Interface", uuid, "external-ids", "iface-id"); err != nil {
Copy link
Member

@girishmg girishmg Oct 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcbw if you want to remove the interface itself, then you will need "destroy".

# ovs-vsctl --help |grep 'destroy TBL'
  destroy TBL REC             delete RECord from TBL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@girishmg I wasn't actually intending to destroy the Interface object, just to clear its iface-id from its Interface record. The actual Interface would be destroyed when kubelet/CRI finally called CNI DEL for the sandbox.

@dcbw
Copy link
Contributor Author

dcbw commented Oct 22, 2019

Without knowing the OVN data model very well, is it possible to use the containerid, which will be unique, as the externalid?

@squeed the master doesn't have any access to sandbox-level stuff since that's all done on the node and not recorded in the Kube API AFAIK...

The iface-id is already unique across multiple instances of these sandboxes for a a given pod. I think the issue is we cannot have multiple physical OVS internal ports that are all tagged with the same 'iface-id'. The ovn-controller will be confused as to which of the physical port it needs to bind the logical flows to.

@girishmg I think what Casey is saying is that we can avoid this problem entirely if we name the OVN port after the CNI sandbox ID (which will be unique) rather than the pod namespace+name (which is shared between sandboxes). However, we can't do that because the master has no idea what the sandbox is.

@girishmg
Copy link
Member

/lgtm

@dcbw dcbw merged commit b771d38 into ovn-org:master Oct 22, 2019
oilbeater added a commit to kubeovn/kube-ovn that referenced this pull request Oct 28, 2019
andreaskaris pushed a commit to andreaskaris/ovn-kubernetes that referenced this pull request Dec 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants