-
Notifications
You must be signed in to change notification settings - Fork 260
Azure CNS fix reconcile bug #809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…zure-container-networking into cns_fixReconcileBug
Codecov Report
@@ Coverage Diff @@
## master #809 +/- ##
==========================================
+ Coverage 42.17% 42.34% +0.17%
==========================================
Files 143 144 +1
Lines 13936 14191 +255
==========================================
+ Hits 5877 6009 +132
- Misses 7356 7456 +100
- Partials 703 726 +23 |
c7eaf64 to
438acda
Compare
…zure-container-networking into cns_fixReconcileBug
| logger.Errorf("[cns-rc] Error initializing cns state: %v", err) | ||
| return err | ||
| crdRC.lock.Lock() | ||
| if crdRC.initialized != true { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could simplify with if !crdRC.initialized {}
|
|
||
| // UpdateCRDSpec updates the CRD spec | ||
| func (crdRC *crdRequestController) UpdateCRDSpec(cntxt context.Context, crdSpec nnc.NodeNetworkConfigSpec) error { | ||
| func (crdRC *crdRequestController) UpdateCRDSpec(cntxt context.Context, crdSpec nnc.NodeNetworkConfigSpec) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think anything changed here, just an extra space snuck in before the return type
| func (pm *CNSIPAMPoolMonitor) increasePoolSize() error { | ||
| pm.mu.Lock() | ||
| defer pm.mu.Unlock() | ||
| pm.mu.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, why the flip?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason in particular, just coding practice to add a defer unlock API before Lock.
In reply to: 585721003 [](ancestors = 585721003)
| time.Sleep(time.Millisecond * 500) | ||
| } | ||
|
|
||
| ctx := context.Background() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point we need to use contexts properly, such as adding a cancel func or timeout. However, this PR doesn't need to address this, since there are context.Background()s and context.TODO()s all throughout the CNS code, so we probably need one PR just to address context. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Reconciler *CrdReconciler | ||
| initialized bool | ||
| Started bool | ||
| lock sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking at this and its usage for a bit, I understand the intent, but I think this is a sign that this struct needs to be split into different abstractions. Ideally, this shouldn't need to carry state about initialization, we should just have ready-to-go dependencies injected, or document proper usage. I would suggest adding a todo to clean up further because the intention of these is hard to follow. Also, since there is a getter, Started need not be exported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This piece of code requires adding more tests. Created a PBI to incorporate feedback and add more tests
PBI: https://msazure.visualstudio.com/One/_workitems/edit/9284845
In reply to: 585726500 [](ancestors = 585726500)
| return err | ||
| } | ||
|
|
||
| return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can just
return service.StartListener(config)
Reason for Change:
Handle the race scenario where HTTP Listener gets started before initializing CNS state. During Initialization/Reconciliation, CNS gets the CRD and recreates the NC request with SecondaryIPs, then it gets list of pods and set those ips as Allocated. If HTTP listener is started then CNI will start requesting ips to CNS and CNS will start allocating right after they are set to Available, thus double allocations. Thus reordered the calls and ensured HTTP Listener is started after initializing CNS.
Also fixed
Issue Fixed:
Requirements:
Notes: