-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved NIC creation and deletion logic #594
Improved NIC creation and deletion logic #594
Conversation
pkg/driver/driver_azure.go
Outdated
// Change to something like below, however I am unable to find the helper method for this. | ||
// if err != nil && management.IsResourceNotFoundError(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am having some trouble with validating the error code when NIC doesn't exist. Any suggestions?
cc: @AxiomSamarth , @amshuman-kr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on your suggestion i have updated the logic here to this - https://github.com/gardener/machine-controller-manager/pull/594/files#diff-36365e28193d28c39f36ac4d9172fbf8eac69553361f6179fa3184f9cad9eb0cR1329-R1342. I hope this is more appropriate now.
/invite @MSSedusch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @prashanth26! The PR itself looks good. Some optional nit-picks and design pattern suggestions below.
pkg/driver/driver_azure.go
Outdated
@@ -1035,20 +1047,64 @@ func (clients *azureDriverClients) checkOrphanDisks(ctx context.Context, resourc | |||
} | |||
|
|||
func (clients *azureDriverClients) deleteNIC(ctx context.Context, resourceGroupName string, nicName string) error { | |||
var ( | |||
nicDeletionTimeout = 10 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pkg/driver/driver_azure.go
Outdated
sleepInterval = 10 * time.Second | ||
maxSleepInterval = 3 * time.Minute | ||
currentSleepTime = 0 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constants? Or even better, why not use client-go/backoff? https://github.com/kubernetes/client-go/blob/master/util/flowcontrol/backoff.go#L32-L38
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually thinking of using a backoff client, didn't know which one to use. Yes, this makes sense. Let me check it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked through using the client-go/backoff mechanism. However, it looks like it's usecase is a little different compared to ours. It seems like it's keeping a bucket for objects with different names that are being backed off. Refer below
- https://github.com/kubernetes/client-go/blob/master/util/flowcontrol/backoff.go#L37
- https://github.com/kubernetes/kubernetes/blob/019080fc4d32bc7acc994605afbc2fac416188d7/pkg/controller/daemon/daemon_controller.go#L820-L838
So looked up more appropriate libraries for our use-case. I found this backoff library that seems to be used by a lot of projects. And usage also seems simple and works well in our case. The Licence however is MIT, I hope that should be fine as I see other vendors using similar licenses - https://github.com/cenkalti/backoff/blob/v4/LICENSE.
e461713
to
69a2814
Compare
c3e3b06
to
4ff6c55
Compare
/ok-to-test |
4ff6c55
to
99eb67d
Compare
99eb67d
to
398d725
Compare
/ok-to-test |
/needs review |
@AxiomSamarth - We will probably need to incorporate this change prior to the MCM azure provider release. |
/* | ||
NIC creation | ||
*/ | ||
NIC, err := clients.nic.Get(ctx, resourceGroupName, nicName, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Prashanth,
If this Get()
call returns an error apart from 404 Not Found, it will result in nil pointer error at line 634 VMParameters := d.getVMParameters(vmName, vmImageRef, *NIC.ID)
as it will have no reference to NIC.
Found this issue while I was incorporating this in OOT and wrote a UT for the same. Please let me know your thoughts about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. This is why we need tests. :)
I have tried to fix this scenario with this commit. Please let me know if it makes sense. The commit looks a little confusing, however in reality it is only one extra if condition. Not sure why GitHub shows the diff changes differently. Locally it looks much less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect! This works like a charm! Thank you :)
pkg/driver/driver_azure.go
Outdated
return nil | ||
} | ||
|
||
klog.V(4).Infof("NIC doesn't existance for %q", nicName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the log as simple as NIC not found? The current one does not seem to be so good. Just a thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Here is the change - 3c0e156.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the change @prashanth26! and sorry for the delay in approving the PR. LGTM
Thanks @amshuman-kr . Will squash all changes, run the changes once more and then merge it. |
- NIC creation now, adopts any existing NICs with matching name - NIC deletion confirms deletion by performing GET on deletion - Improved creation logs to give more details - Revendored additional libraries
3c0e156
to
570486e
Compare
Made a final round of test. Looks good to me. Merging now. |
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes https://github.com/gardener/machine-controller-manager/issues/544
Special notes for your reviewer:
Release note: