Skip to content

Conversation

@ninzavivek
Copy link
Contributor

@ninzavivek ninzavivek commented Oct 1, 2020

CNI is some cases is unable to talk to CNS. This can happen, CNS crashing or race between start/CNI getting invoked.

Currently if this communication errrs out on DEL calls, CNI will silently ignore the error. Container runtime assumes everything went fine and deletes the container. But the HNS endpoints are still there on the host result into them being orphaned. If a container with similar configuration is placed on the host , then ADD call will fail with object already exists error.

Please note, CNI DEL are supposed to be idempotent, and should generally handle the resource not found/exist error elegantely without any error.

@ninzavivek ninzavivek closed this Oct 2, 2020
@ninzavivek ninzavivek reopened this Oct 2, 2020
@ninzavivek ninzavivek changed the title fix cni error Throw CNI error - On CNI DEL call, if communication with CNS cannot be established. Oct 2, 2020
@ninzavivek ninzavivek marked this pull request as ready for review October 2, 2020 02:58
@codecov
Copy link

codecov bot commented Oct 2, 2020

Codecov Report

Merging #683 into master will decrease coverage by 0.06%.
The diff coverage is 4.54%.

@@            Coverage Diff             @@
##           master     #683      +/-   ##
==========================================
- Coverage   38.63%   38.56%   -0.07%     
==========================================
  Files          78       79       +1     
  Lines       10430    10446      +16     
==========================================
- Hits         4030     4029       -1     
- Misses       5909     5925      +16     
- Partials      491      492       +1     

@ashvindeodhar
Copy link
Member

Ideally we shouldn't need to go to CNS to get the network name for the endpoint that is already created. Regardless of the CNS state we should be able to delete the endpoint in CNI.

Copy link
Contributor

@ramiro-gamarra ramiro-gamarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple comments

@ashvindeodhar
Copy link
Member

looks good to me but lets wait for the pipeline issue to get resolved so that this gets validated with e2e tests

@matmerr
Copy link
Member

matmerr commented Oct 14, 2020

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ninzavivek ninzavivek merged commit eba4207 into Azure:master Oct 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants