New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exit if the plugin is taking too long #581

Merged
merged 3 commits into from Sep 10, 2018

Conversation

Projects
None yet
2 participants
@tomdee
Contributor

tomdee commented Aug 27, 2018

  • Tests
  • Documentation
  • Release note
Shut down CNI plugin if execution takes over 30 seconds

Fixes projectcalico/calico#2098

@tomdee tomdee changed the title from <!-- A few sentences describing the overall goals of the pull request's commits. Please include - the type of fix - (e.g. bug fix, new feature, documentation) - some details on _why_ this PR should be merged - the details of the testing you've done on it (both manual and automated) - which components are affected by this PR - links to issues that this PR addresses --> to Exit if the plugin is taking too long Aug 27, 2018

@caseydavenport caseydavenport reopened this Sep 4, 2018

@caseydavenport caseydavenport added this to the Calico v3.3.0 milestone Sep 5, 2018

@tomdee

This comment has been minimized.

Contributor

tomdee commented Sep 6, 2018

Based on some more digging, the offending line is

_, err := ip.DelLinkByNameAddr(args.IfName, netlink.FAMILY_V4)
(or possibly the lien above).

This code is being called when the CNI delete method is called (when the container is torn down).

It's the last things that happens and it only occurs after the IP address is released.

@tomdee tomdee force-pushed the tomdee:self-destruct branch from d124035 to aad12ff Sep 7, 2018

tomdee added some commits Sep 6, 2018

@tomdee tomdee force-pushed the tomdee:self-destruct branch from aad12ff to 070ba5a Sep 7, 2018

@tomdee tomdee requested a review from caseydavenport Sep 7, 2018

return err
})

if err != nil {

This comment has been minimized.

@caseydavenport

caseydavenport Sep 7, 2018

Member

This could be simplified to be ch <- err

fmt.Fprintf(os.Stderr, "Calico CNI deleted device in netns %s\n", args.Netns)
}
case <-time.After(5 * time.Second):
fmt.Fprintf(os.Stderr, "Calico CNI timed out deleting device in netns %s\n", args.Netns)

This comment has been minimized.

@caseydavenport

caseydavenport Sep 7, 2018

Member

Did you consider returning an error in this scenario?

I suppose it's probably OK, but might we end up leaking devices this way?

@tomdee tomdee merged commit 3df9af8 into projectcalico:master Sep 10, 2018

2 checks passed

license/cla Contributor License Agreement is signed.
Details
semaphoreci The build passed on Semaphore.
Details

@tomdee tomdee deleted the tomdee:self-destruct branch Sep 10, 2018

@caseydavenport caseydavenport referenced this pull request Sep 19, 2018

Merged

cherry-pick various fixes to release-v3.2 #599

0 of 3 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment