@jaer-tsun
Contributor

@jaer-tsun jaer-tsun commented Jun 6, 2020

What this PR does / why we need it:

  1. CNS talks to DNC to get node goalstate
  2. CNS checks with NMA if it has goalstate for NC

Special notes for your reviewer:
re-review after pretty fat rebase

@jaer-tsun jaer-tsun requested a review from ashvindeodhar June 6, 2020 00:03
@codecov

codecov bot commented Jun 6, 2020

Codecov Report

Merging #574 into master will decrease coverage by 0.46%.
The diff coverage is 10.11%.

@@            Coverage Diff             @@
##           master     #574      +/-   ##
==========================================
- Coverage   42.43%   41.97%   -0.47%     
==========================================
  Files          72       72              
  Lines        9993    10151     +158     
==========================================
+ Hits         4241     4261      +20     
- Misses       5280     5421     +141     
+ Partials      472      469       -3     

@jaer-tsun jaer-tsun force-pushed the cnsToDnc branch 2 times, most recently from 7a4c854 to 603f703 Compare June 9, 2020 16:46
@jaer-tsun jaer-tsun force-pushed the cnsToDnc branch 2 times, most recently from d4f235a to c7d9479 Compare June 18, 2020 22:50
@jaer-tsun jaer-tsun changed the title [WIP] Cns to dnc Cns to dnc Jun 24, 2020
@jaer-tsun jaer-tsun force-pushed the cnsToDnc branch 6 times, most recently from 3c80a80 to 510ce73 Compare July 13, 2020 18:50
@jaer-tsun jaer-tsun force-pushed the cnsToDnc branch 2 times, most recently from a8609ac to db7726b Compare July 15, 2020 17:52
@tamilmani1989 tamilmani1989 changed the title Cns to dnc CNS to DNC communication in Managed DNC Jul 15, 2020
Comment on lines 1266 to 1289

var (
	context      = podInfo.PodName + podInfo.PodNamespace
	dncEP        = service.GetOption(acn.OptPrivateEndpoint).(string)
	infraVnet    = service.GetOption(acn.OptInfrastructureNetwork).(string)
	nodeID       = service.GetOption(acn.OptNodeID).(string)
	isManagedDnc = dncEP != "" && infraVnet != "" && nodeID != ""
)

containerID, exists = service.state.ContainerIDByOrchestratorContext[context]
if !exists {
	if isManagedDnc {
		service.lock.Unlock()
		getNetworkContainerResponse.Response.ReturnCode, getNetworkContainerResponse.Response.Message = service.SyncNodeStatus(dncEP, infraVnet, nodeID, req.OrchestratorContext)
		service.lock.Lock()
		if getNetworkContainerResponse.Response.ReturnCode == NotFound {
			return getNetworkContainerResponse
		}

		containerID = service.state.ContainerIDByOrchestratorContext[context]
	}
} else if isManagedDnc {
	_, getNetworkContainerResponse.Response.ReturnCode, getNetworkContainerResponse.Response.Message = service.isNCWaitingForUpdate(service.state.ContainerStatus[containerID].CreateNetworkContainerRequest.Version, containerID)
}
Member

move to separate function

Contributor Author

move what to separate function?

Member

GitHub doesn't have a good GUI...

Comment on lines 1266 to 1289

containerID, exists = service.state.ContainerIDByOrchestratorContext[context]
if !exists {
	if isManagedDnc {
		service.lock.Unlock()
Member

Why do you unlock? No lock is acquired before this code.

Contributor Author

it's actually supposed to be RUnlock and RLock
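
The exchange above is about dropping the lock while CNS makes a blocking call and then taking it again. A minimal sketch of that pattern, assuming a `sync.RWMutex` guards the state; `store`, `lookup`, `set`, and the `refresh` callback are hypothetical names for illustration and are not taken from the PR:

```go
package main

import "sync"

// store is a hypothetical stand-in for the CNS service state.
type store struct {
	mu    sync.RWMutex
	byCtx map[string]string // orchestrator context -> container ID
}

// lookup returns the container ID for ctx. On a miss it calls refresh with
// no lock held (refresh may take the write lock itself), then re-acquires
// the read lock and checks the map again.
func (s *store) lookup(ctx string, refresh func()) (string, bool) {
	s.mu.RLock()
	id, ok := s.byCtx[ctx]
	s.mu.RUnlock()
	if ok {
		return id, true
	}

	refresh() // slow call, e.g. a sync against a remote service

	s.mu.RLock()
	id, ok = s.byCtx[ctx]
	s.mu.RUnlock()
	return id, ok
}

// set stores a mapping under the write lock.
func (s *store) set(ctx, id string) {
	s.mu.Lock()
	s.byCtx[ctx] = id
	s.mu.Unlock()
}
```

Because the state can change while no lock is held, the second read has to re-check the map rather than reuse the earlier result.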

@jaer-tsun jaer-tsun force-pushed the cnsToDnc branch 2 times, most recently from 4a63970 to cef191e Compare July 15, 2020 18:47
@jaer-tsun jaer-tsun requested a review from tamilmani1989 July 15, 2020 18:51
go func(ep, vnet, node string) {
	// Periodically poll (30s) DNC for node updates
	for {
		<-time.NewTicker(time.Second * 30).C
Member

move 30 sec to a const

Contributor Author

it's only used here

Member

My comment in general is to not hardcode values like 5 seconds or 30 seconds. Instead, declare them as named variables such as syncNodeInterval.

Member

I just finished reviewing the DNC PR. We are retrieving all the NCs every 30 seconds. Let's expose this as a configurable parameter via the config file. If we see that we are hitting the Cosmos DB RU limit due to NC processing in DNC, we can increase the interval without needing to update the code.

@ashvindeodhar
Member

Can you add UTs to test the register, sync, and NMA APIs? You will need to inject the dependencies or mock those components.

@jaer-tsun jaer-tsun requested a review from ashvindeodhar July 15, 2020 22:45
@jaer-tsun jaer-tsun force-pushed the cnsToDnc branch 2 times, most recently from 5037681 to 3530953 Compare July 15, 2020 23:13
getNetworkContainerVersionURL string) (*http.Response, error) {
logger.Printf("[NMAgentClient] GetNetworkContainerVersion NC: %s", networkContainerID)

response, err := common.GetHttpClient().Get(getNetworkContainerVersionURL)
Member

common.GetHttpClient(): can this function return nil?

Contributor Author

It's initialized at startup.


// SetNodeOrchestrator :- Set node orchestrator after registering with mDNC
func (service *HTTPRestService) SetNodeOrchestrator(r *cns.SetOrchestratorTypeRequest) {
body, _ := json.Marshal(r)
Member

Can we catch the marshal error and log it?

// SetNodeOrchestrator :- Set node orchestrator after registering with mDNC
func (service *HTTPRestService) SetNodeOrchestrator(r *cns.SetOrchestratorTypeRequest) {
body, _ := json.Marshal(r)
req, _ := http.NewRequest(http.MethodPost, "", bytes.NewBuffer(body))
Member

Same here: catch the error and log it.

)

// try to retrieve NodeInfoResponse from mDNC
response, err = httpc.Get(fmt.Sprintf(common.SyncNodeNetworkContainersURLFmt, dncEP, infraVnet, nodeID, dncApiVersion))
Member

It would be good to move this to a separate statement and log the full URL.
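
A small sketch of what this comment suggests: build the URL in its own statement, log it, then pass it to the HTTP call. The format string below is made up for illustration, since the value of `common.SyncNodeNetworkContainersURLFmt` does not appear in this conversation, and `buildSyncURL` is a hypothetical name.

```go
package main

import (
	"fmt"
	"log"
)

// syncURLFmt is a placeholder format string, NOT the real value of
// common.SyncNodeNetworkContainersURLFmt.
const syncURLFmt = "%s/example/%s/node/%s?api-version=%s"

// buildSyncURL formats the request URL as a separate step and logs it,
// so a failed request can be traced back to the exact URL used.
func buildSyncURL(dncEP, infraVnet, nodeID, apiVersion string) string {
	url := fmt.Sprintf(syncURLFmt, dncEP, infraVnet, nodeID, apiVersion)
	log.Printf("[Azure-CNS] syncing node state from %s", url)
	return url
}
```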

waitingForUpdate = true
returnCode = NetworkContainerPendingStatePropagation
message = fmt.Sprintf("[Azure-CNS] Network container %s v%d had not propagated to respective NMA w/ v%d", ncid, programmedVersion, nmaVersion)
}
Member

Do we need to log if programmedVersion < nmaVersion?

Contributor Author

no, CNS should be receiving latest
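
For readers following the version question, here is a sketch of the three possible outcomes of the comparison. The condition used in the PR is not shown in this conversation, so the branch logic is an assumption, and `checkPropagation` is a hypothetical name; the second branch only illustrates the case the reviewer asked about.

```go
package main

import "fmt"

// checkPropagation compares the NC version programmed in CNS with the
// version reported by NMA. It returns whether CNS is still waiting for
// NMA to catch up, plus a message suitable for logging.
func checkPropagation(ncid string, programmedVersion, nmaVersion int) (bool, string) {
	switch {
	case programmedVersion > nmaVersion:
		// NMA has not yet received the version CNS knows about.
		return true, fmt.Sprintf("network container %s v%d has not propagated to NMA (at v%d)",
			ncid, programmedVersion, nmaVersion)
	case programmedVersion < nmaVersion:
		// Assumed to be unexpected; reported but not treated as waiting.
		return false, fmt.Sprintf("network container %s: NMA v%d is ahead of CNS v%d",
			ncid, nmaVersion, programmedVersion)
	default:
		return false, ""
	}
}
```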

@jaer-tsun jaer-tsun requested a review from ashvindeodhar July 21, 2020 18:29
ashvindeodhar
ashvindeodhar previously approved these changes Jul 21, 2020
neaggarwMS
neaggarwMS previously approved these changes Jul 22, 2020

type ManagedSettings struct {
PrivateEndpoint string
InfrastructureNetwork string
Member

@tamilmani1989 tamilmani1989 Jul 22, 2020

Is this the VNet ID? If it is an ID, can we name it InfrastructureNetworkID?
Also, can you add a comment on what value PrivateEndpoint takes (is it a DNS name or an IP)?

Contributor Author

Added ID. It takes either an IP or a DNS name; that's why it's just Endpoint.

Member

Yes, but you can add a comment there.
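
A sketch of how the field comments requested in this thread might read. Only `PrivateEndpoint` and the renamed `InfrastructureNetworkID` come from the conversation; the `NodeID` field and the `isManagedDNC` helper are assumptions, modelled on the `isManagedDnc` check quoted earlier in the review.

```go
package main

// ManagedSettings holds the values CNS needs to reach a managed DNC.
// The full field set in the PR may differ; this is an illustration.
type ManagedSettings struct {
	// PrivateEndpoint is the DNC endpoint. Per the review thread it may
	// be either an IP address or a DNS name.
	PrivateEndpoint string
	// InfrastructureNetworkID is the ID of the infrastructure VNet.
	InfrastructureNetworkID string
	// NodeID identifies this node to DNC (assumed field).
	NodeID string
}

// isManagedDNC reports whether all three settings are present, which is
// the condition the diff uses to enable managed-DNC behaviour.
func (m ManagedSettings) isManagedDNC() bool {
	return m.PrivateEndpoint != "" && m.InfrastructureNetworkID != "" && m.NodeID != ""
}
```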


if tmpReturnCode == UnexpectedError {
	continue
}
Member

I think I asked this earlier, but I'm not sure what you replied. What if tmpReturnCode != Success && tmpReturnCode != UnexpectedError && bytes.Compare(nc.OrchestratorContext, contextFromCNI) != 0?

Contributor Author

then that means it's NetworkContainerPendingStatePropagation; in this case, the err gets returned to CNI and CNS will still save the state

Comment on lines 220 to 222
dncEP = service.GetOption(acn.OptPrivateEndpoint).(string)
infraVnet = service.GetOption(acn.OptInfrastructureNetwork).(string)
nodeID = service.GetOption(acn.OptNodeID).(string)
Member

are we not using config?

Contributor Author

we do use config, those values are passed to the service

nodeID = service.GetOption(acn.OptNodeID).(string)
)

returnCode, msg := service.SyncNodeStatus(dncEP, infraVnet, nodeID, json.RawMessage{})
Member

Can you add a comment explaining why we are passing an empty pod context?

Contributor Author

only CNI passes non-empty, there's a func comment about this already

Comment on lines +287 to +289
privateEndpoint := acn.GetArg(acn.OptPrivateEndpoint).(string)
infravnet := acn.GetArg(acn.OptInfrastructureNetworkID).(string)
nodeID := acn.GetArg(acn.OptNodeID).(string)
Member

Why do we need this? We are going to get these from config anyway.

Member

Checked with Jaeryn. We should not advise customers to use command-line args; they are just for internal testing and should be removed if not needed.

Member

@tamilmani1989 tamilmani1989 left a comment

lgtm

@jaer-tsun jaer-tsun merged commit dd7abd1 into Azure:master Jul 23, 2020
@jaer-tsun jaer-tsun deleted the cnsToDnc branch July 23, 2020 20:03
neaggarwMS pushed a commit to neaggarwMS/azure-container-networking that referenced this pull request Nov 13, 2020
* initial changes for CNS->DNC support

* Adding changes for CNS to be compatible with managed DNC (reverse communication channel)

* adding NC version validation with respective NMA

* return errors for respective NC based on orchestrator context from CNI

* add nc version check via NMA

* adding logic to SyncNodeStatus and check if NCWaitingForUpdate for CniADD and CnsAttach calls

* addressing most of ashvin's comments

* adding managed config

* fat rebase

* addressing some comments

* slight optimizations...

* adding channel mode instead of managed bool

* set err in register node so that we keep looping

* addressing ashvin's comments

* fix test

* removing swift prefix mods for mdnc

* addressing tamanoha's comments

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>