Fix StatefulSet e2e flake #42367

Merged (1 commit) on Mar 4, 2017
8 changes: 6 additions & 2 deletions test/e2e/framework/statefulset_utils.go
@@ -355,7 +355,8 @@ func (s *StatefulSetTester) SetHealthy(ss *apps.StatefulSet) {
}
}

-func (s *StatefulSetTester) waitForStatus(ss *apps.StatefulSet, expectedReplicas int32) {
+// WaitForStatus waits for the ss.Status.Replicas to be equal to expectedReplicas
+func (s *StatefulSetTester) WaitForStatus(ss *apps.StatefulSet, expectedReplicas int32) {
Contributor:

Update this to also check that ss.Status.ObservedGeneration >= ss.Generation.

Contributor:

I just realized that observedGeneration is not working for StatefulSets and sent a PR: #42429

Member Author:

@Kargakis Using this code:

if ss.Status.ObservedGeneration == nil {
        Logf("ss observed generation is nil")
} else {
        Logf("ss observed generation %d", *ss.Status.ObservedGeneration)
}
Logf("ss generation %d", ss.Generation)
if ssGet.Status.ObservedGeneration == nil {
        Logf("ssGet observed generation is nil")
} else {
        Logf("ssGet observed generation %d", *ssGet.Status.ObservedGeneration)
}
Logf("ssGet generation %d", ssGet.Generation)

The observed generation of both ss and ssGet is always nil, while the generation appears to increment consistently.

What about the following?

if ssGet.Generation < ss.Generation {
        return false, nil
}

Member Author:

@Kargakis Just saw the above. Do you want to wait until #42429 merges before submitting this?

Contributor:

Yes, I just tagged it with a manual approval since the hack changes were minimal, and it should merge by the time you get back online :) Once you update this helper with the new check, feel free to apply lgtm.

Logf("Waiting for statefulset status.replicas updated to %d", expectedReplicas)

ns, name := ss.Namespace, ss.Name
@@ -365,6 +366,9 @@ func (s *StatefulSetTester) waitForStatus(ss *apps.StatefulSet, expectedReplicas
if err != nil {
return false, err
}
+if *ssGet.Status.ObservedGeneration < ss.Generation {
+	return false, nil
+}
if ssGet.Status.Replicas != expectedReplicas {
Logf("Waiting for stateful set status to become %d, currently %d", expectedReplicas, ssGet.Status.Replicas)
return false, nil
@@ -402,7 +406,7 @@ func DeleteAllStatefulSets(c clientset.Interface, ns string) {
if err := sst.Scale(&ss, 0); err != nil {
errList = append(errList, fmt.Sprintf("%v", err))
}
-sst.waitForStatus(&ss, 0)
+sst.WaitForStatus(&ss, 0)
Logf("Deleting statefulset %v", ss.Name)
if err := c.Apps().StatefulSets(ss.Namespace).Delete(ss.Name, nil); err != nil {
errList = append(errList, fmt.Sprintf("%v", err))
3 changes: 3 additions & 0 deletions test/e2e/statefulset.go
@@ -217,6 +217,7 @@ var _ = framework.KubeDescribe("StatefulSet", func() {

By("Before scale up finished setting 2nd pod to be not ready by breaking readiness probe")
sst.BreakProbe(ss, testProbe)
+sst.WaitForStatus(ss, 0)
Contributor:

Shouldn't this be WaitForStatus(ss, 2)?

Contributor:

status.Replicas should always be 1 here regardless of whether the pod is ready or not. It turns out that is not actually the case: the field is treated the way ReadyReplicas is for ReplicaSets/Deployments (DaemonSets use different field names, but there is still a separation between created and ready in the status). So the test seems fine, but API-wise I think there is an unnecessary inconsistency between the workload APIs.

sst.WaitForRunningAndNotReady(2, ss)

By("Continue scale operation after the 2nd pod, and scaling down to 1 replica")
@@ -280,6 +281,7 @@ var _ = framework.KubeDescribe("StatefulSet", func() {
By("Confirming that stateful set scale up will halt with unhealthy stateful pod")
sst.BreakProbe(ss, testProbe)
sst.WaitForRunningAndNotReady(*ss.Spec.Replicas, ss)
+sst.WaitForStatus(ss, 0)
sst.UpdateReplicas(ss, 3)
sst.ConfirmStatefulPodCount(1, ss, 10*time.Second)

@@ -309,6 +311,7 @@ var _ = framework.KubeDescribe("StatefulSet", func() {
Expect(err).NotTo(HaveOccurred())

sst.BreakProbe(ss, testProbe)
+sst.WaitForStatus(ss, 0)
sst.WaitForRunningAndNotReady(3, ss)
sst.UpdateReplicas(ss, 0)
sst.ConfirmStatefulPodCount(3, ss, 10*time.Second)