-
Notifications
You must be signed in to change notification settings - Fork 260
Add pending program status for IPs in CNS. #690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add logic structure of how to update program status.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Codecov Report
@@ Coverage Diff @@
## master #690 +/- ##
==========================================
- Coverage 38.60% 38.54% -0.07%
==========================================
Files 78 79 +1
Lines 10433 10471 +38
==========================================
+ Hits 4028 4036 +8
- Misses 5913 5943 +30
Partials 492 492 |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| ncRequest.NetworkContainerid = nc.ID | ||
| ncRequest.NetworkContainerType = cns.Docker | ||
| ncRequest.Version = nc.Version | ||
| ncRequest.NCVersion = nc.Version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this changed? It should be set to ncRequest.Version only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change it back to ncRequest.Version.
I didn't reuse the Version since the usage of Version is not clear in CNS previously. There are Version, which compared with VMVersion and HostVersion which record NMA version. After double checking code, resusing Version should be fine.
cns/restserver/util.go
Outdated
| func (service *HTTPRestService) addIPConfigStateUntransacted(ncVersion, ncId string, ipconfigs map[string]cns.SecondaryIPConfig) { | ||
|
|
||
| existingNCStatus, _ := service.state.ContainerStatus[ncId] | ||
| existingNCVersion, _ := strconv.Atoi(existingNCStatus.CreateNetworkContainerRequest.NCVersion) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wont return the expected value, NCRequest is already overwritten in the caller API (saveNetworkContainerGoalState, line 120). You need to pass existing Version (not NCVersion) from the called like we are passing existingIpconfigs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update.
I think we should not overwritten existing NC status unless CNS finish all the updating works in this call. Imagine if call crash in the middle, update is not finished but existing NC status got overwritten. Another plus is we don't need to create new object to carry existing NC status temporarily. I can follow the previous style for now but thinking open another PR which do all the works needed before updating existing NC status.
cns/restserver/util.go
Outdated
|
|
||
| existingNCStatus, _ := service.state.ContainerStatus[ncId] | ||
| existingNCVersion, _ := strconv.Atoi(existingNCStatus.CreateNetworkContainerRequest.NCVersion) | ||
| nmAgentNCVersion := getNCVersionFromNMAgent(ncId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually recommend to move it to background or make a call in a go routing and wait for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we SHOULD not make an outbound call while acquiring a lock to the Service. it must be outside the lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep it as it for this PR since getNCVersionFromNMAgent didn't have any logic now. This PR is a structure and I'll fill in logic in next PR. It can be a background thread or separate go routine.
No matter whether we query NMAgent in background or not, we should compare NMAgent's version every time when NC is updated. I don't think NC update will be too frequent to bug NMAgent.
cns/restserver/util.go
Outdated
| } | ||
| func (service *HTTPRestService) addIPConfigStateUntransacted(ncVersion, ncId string, ipconfigs map[string]cns.SecondaryIPConfig) { | ||
|
|
||
| existingNCStatus, _ := service.state.ContainerStatus[ncId] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this is already updated in the caller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update.
cns/restserver/util.go
Outdated
| } | ||
|
|
||
| // To do, complete this logic according to getNetworkContainerStatus | ||
| func getNCVersionFromNMAgent(ncid string) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the logic to get NMAgent ver in ImdsClientInterface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getNetworkContainerStatus is using imdsClient.GetNetworkContainerInfoFromHost. I think GetNetworkContainerInfoFromHost covers most of this logic. I can double check. Will update there.
cns/restserver/util.go
Outdated
| if _, exists := service.PodIPConfigState[ipId]; exists { | ||
| continue | ||
| } | ||
| func (service *HTTPRestService) addIPConfigStateUntransacted(ncVersion, ncId string, ipconfigs map[string]cns.SecondaryIPConfig) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in this function you also need to pass existingIpconfigs so that you can keep those IPs as available and mark the new ones in PendingProgramming State
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, existingSecondaryIPConfigs is needed.
cns/NetworkContainerContract.go
Outdated
| Available = "Available" | ||
| Allocated = "Allocated" | ||
| PendingRelease = "PendingRelease" | ||
| PendingProgram = "PendingProgram" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we rename it to PendingProgramming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
cns/restserver/util.go
Outdated
|
|
||
| // Add the newIpConfigs, ignore if ip state is already in the map | ||
| service.addIPConfigStateUntransacted(req.NetworkContainerid, newIPConfigs) | ||
| service.addIPConfigStateUntransacted(req.NCVersion, req.NetworkContainerid, newIPConfigs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some logs to updateIpConfigsStateUntransacted function, it is missing logs overall. Specially when we are updating the state or if this api is returning error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double checked, error msg is already logged, reference
https://github.com/Azure/azure-container-networking/blob/master/cns/restserver/api.go#L831
…working into programmedIp
cns/restserver/util.go
Outdated
|
|
||
| existingNCStatus, _ := service.state.ContainerStatus[ncId] | ||
| existingNCVersion, _ := strconv.Atoi(existingNCStatus.CreateNetworkContainerRequest.NCVersion) | ||
| nmAgentNCVersion := getNCVersionFromNMAgent(ncId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we SHOULD not make an outbound call while acquiring a lock to the Service. it must be outside the lock
cns/restserver/util.go
Outdated
| if nmAgentNCVersion >= existingNCVersion { | ||
| service.addIPConfigStateUntransacted(cns.Available, cns.PendingProgramming, req.NetworkContainerid, newIPConfigs) | ||
| } else { | ||
| service.addIPConfigStateUntransacted(cns.PendingProgramming, cns.PendingProgramming, req.NetworkContainerid, newIPConfigs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need to move existing IPs back to PendingState if they are already available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We are NOT make an outboud call. getNCVersionFromNMAgent didn't have implementation yet. It could be the function which fetch NMAgent version locally from what background thread got and saved.
- We not not moving existing IPs back to Pending State.
…to a background thread.
neaggarwMS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one minor comment.
cns/fakes/imdsclientfake.go
Outdated
| // GetNMagentVersion - Mock implementation to return host NMAgent NC version | ||
| // Set it as 0 which is the same as default initial NC version for testing purpose | ||
| func (imdsClient *ImdsClientTest) GetNMagentVersion() int { | ||
| logger.Printf("[Azure CNS] GetNMagentVersionFromNMAgent") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the log to get the NC version from NMAgent
cns/imdsclient/api.go
Outdated
| GetNetworkContainerInfoFromHost(networkContainerID string, primaryAddress string, authToken string, apiVersion string) (*ContainerVersion, error) | ||
| GetPrimaryInterfaceInfoFromHost() (*InterfaceInfo, error) | ||
| GetPrimaryInterfaceInfoFromMemory() (*InterfaceInfo, error) | ||
| GetNMagentVersion() int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename the function name GetNetworkContainerInfoFromHost and pass all the parameters networkContainerID string, primaryAddress string, apiVersion string)
*Add pending program status for IPs in CNS.
*Add logic structure of how to update program status.
feat:
Add logic structure for CNS udpate NC secondary IP to pending program before available
Reason for Change:
Pending program is a status to track that IP is available in NC but not in NMAgent side yet. IP is available when NMAgent is also ready.
Issue Fixed:
Prevent CNS allocate pending program IP to pod.