Skip to content

Conversation

@neaggarwMS
Copy link
Member

@neaggarwMS neaggarwMS commented Mar 2, 2021

Reason for Change:
Handle the race scenario where HTTP Listener gets started before initializing CNS state. During Initialization/Reconciliation, CNS gets the CRD and recreates the NC request with SecondaryIPs, then it gets list of pods and set those ips as Allocated. If HTTP listener is started then CNI will start requesting ips to CNS and CNS will start allocating right after they are set to Available, thus double allocations. Thus reordered the calls and ensured HTTP Listener is started after initializing CNS.

Also fixed

  • some logging
  • avoid double release during failure in IPAM pool monitor

Issue Fixed:

Requirements:

Notes:

@codecov
Copy link

codecov bot commented Mar 2, 2021

Codecov Report

Merging #809 (0cc36b3) into master (08f0006) will increase coverage by 0.17%.
The diff coverage is 43.96%.

@@            Coverage Diff             @@
##           master     #809      +/-   ##
==========================================
+ Coverage   42.17%   42.34%   +0.17%     
==========================================
  Files         143      144       +1     
  Lines       13936    14191     +255     
==========================================
+ Hits         5877     6009     +132     
- Misses       7356     7456     +100     
- Partials      703      726      +23     

@neaggarwMS neaggarwMS force-pushed the cns_fixReconcileBug branch from c7eaf64 to 438acda Compare March 2, 2021 02:43
logger.Errorf("[cns-rc] Error initializing cns state: %v", err)
return err
crdRC.lock.Lock()
if crdRC.initialized != true {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could simplify with if !crdRC.initialized {}


// UpdateCRDSpec updates the CRD spec
func (crdRC *crdRequestController) UpdateCRDSpec(cntxt context.Context, crdSpec nnc.NodeNetworkConfigSpec) error {
func (crdRC *crdRequestController) UpdateCRDSpec(cntxt context.Context, crdSpec nnc.NodeNetworkConfigSpec) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think anything changed here, just an extra space snuck in before the return type

func (pm *CNSIPAMPoolMonitor) increasePoolSize() error {
pm.mu.Lock()
defer pm.mu.Unlock()
pm.mu.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, why the flip?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason in particular, just coding practice to add a defer unlock API before Lock.


In reply to: 585721003 [](ancestors = 585721003)

time.Sleep(time.Millisecond * 500)
}

ctx := context.Background()
Copy link
Collaborator

@thatmattlong thatmattlong Mar 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point we need to use contexts properly, such as adding a cancel func or timeout. However, this PR doesn't need to address this, since there are context.Background()s and context.TODO()s all throughout the CNS code, so we probably need one PR just to address context. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed


In reply to: 585724212 [](ancestors = 585724212)

Reconciler *CrdReconciler
initialized bool
Started bool
lock sync.Mutex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at this and its usage for a bit, I understand the intent, but I think this is a sign that this struct needs to be split into different abstractions. Ideally, this shouldn't need to carry state about initialization, we should just have ready-to-go dependencies injected, or document proper usage. I would suggest adding a todo to clean up further because the intention of these is hard to follow. Also, since there is a getter, Started need not be exported.

Copy link
Member Author

@neaggarwMS neaggarwMS Mar 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This piece of code requires adding more tests. Created a PBI to incorporate feedback and add more tests
PBI: https://msazure.visualstudio.com/One/_workitems/edit/9284845


In reply to: 585726500 [](ancestors = 585726500)

return err
}

return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just

return service.StartListener(config)

@neaggarwMS neaggarwMS merged commit cab6615 into Azure:master Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants