-
Notifications
You must be signed in to change notification settings - Fork 38.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SetNodeUpdateStatusNeeded whenever nodeAdd event is received #37727
Conversation
93fccb9
to
3418cbd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple small comments
@@ -457,6 +460,7 @@ func (asw *actualStateOfWorld) updateNodeStatusUpdateNeeded(nodeName types.NodeN | |||
"Failed to set statusUpdateNeeded to needed %t because nodeName=%q does not exist", | |||
needed, | |||
nodeName) | |||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Errors should not be silently swallowed, consumers should decide if they want to swallow the error. So I'd suggest modifying this chain to bubble the error up the caller and let the caller decide to ignore it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree and I wanted to change it back to return fmt.Errorf()
as it previously was. However that would involve changing the return type of the function and everywhere where it is being called (twice in this case) which might also affect some tests.
It might make sense to not touch production code that has been tested before. Also this would involve an additional overhead on the intended fix of this PR.
I can definitely create a separate issue and PR to refactor this code and spend some time with the changes that it would potentially affect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. We can do this during normal 1.6 development.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #39056
@@ -1094,6 +1094,17 @@ func Test_OneVolumeTwoNodes_TwoDevicePaths(t *testing.T) { | |||
verifyAttachedVolume(t, attachedVolumes, generatedVolumeName2, string(volumeName), node2Name, devicePath2, true /* expectedMountedByNode */, false /* expectNonZeroDetachRequestedTime */) | |||
} | |||
|
|||
func Test_SetNodeStatusUpdateNeededError(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Write a comment before the test summarizing what it is testing (expected behavior).
Then inside the test I like to break it down in to three sections:
// Arrange
// Act
// Assert
This makes it easier to follow the test. Basically you setup, do something, then verify it. Up to you if you want to follow this pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
After stubbing out code that deletes pods, I was able to verify that the upgrade works and I'm able to see the data that was written before the upgrade. kubectl describe pods |
3418cbd
to
b25a426
Compare
Jenkins GCI GKE smoke e2e failed for commit b25a426. Full PR test history. The magic incantation to run this job again is |
Awesome. |
@k8s-bot gci gke e2e test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more nit
} | ||
nodeName := types.NodeName(node.Name) | ||
adc.nodeUpdate(nil, obj) | ||
adc.actualStateOfWorld.SetNodeStatusUpdateNeeded(nodeName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a comment why we are doing this since it is not obvious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
b25a426
to
638ef1b
Compare
/lgtm |
Removing LGTM while @rkouj investigates another upgrade failure. |
I retested it and it's working fine. It looks like first time around I didn't have the fix in my branch when I was doing the test. I'll retest this again tomorrow morning to be absolutely sure. |
Retested. The fix works. This is when the node object got recreated E1201 This is when the nodeStatusUpdateNeeded got called. I1201 |
Frequency of updates on the nodeAdd https://gist.github.com/anonymous/bf6ea5f3e465d261f918a7e062966515 |
@saad-ali PTAL |
Reviewed 1 of 4 files at r1, 2 of 3 files at r2, 1 of 1 files at r3. Comments from Reviewable |
Thanks for revalidating! /lgtm Review status: all files reviewed at latest revision, 3 unresolved discussions. Comments from Reviewable |
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
Jenkins kops AWS e2e failed for commit 638ef1b. Full PR test history. The magic incantation to run this job again is |
Jenkins GCE e2e failed for commit 638ef1b. Full PR test history. The magic incantation to run this job again is |
@k8s-bot cvm gce e2e test this |
@k8s-bot kops aws e2e test this |
Automatic merge from submit-queue |
Commit found in the "release-1.5" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked. |
What this PR does / why we need it:
Bug fix and SetNodeStatusUpdateNeeded for a node whenever its api object is added. This is to ensure that we don't lose the attached list of volumes in the node when its api object is deleted and recreated.
fixes #37586
#37585
Special notes for your reviewer: