This repository has been archived by the owner on Jan 12, 2022. It is now read-only.

Statefulset deemed ready but pods were still rolling out #1294

Closed
sreis opened this issue Aug 17, 2021 · 0 comments · Fixed by #1296
Labels
area: cli type: bug user-facing problem (describe the problem in the title!)

Comments

Contributor

sreis commented Aug 17, 2021

Describe the bug

During an upgrade to version v2010.08.13, the CLI marked the Cortex ingester StatefulSet as ready while its rollout was still in progress. The CLI exited successfully, and the rollout continued in the background.

Here are the messages the CLI printed while waiting for the deployments to finish:

2021-08-17T15:11:33.168Z info: waiting for 1 StatefulSets
2021-08-17T15:11:33.168Z debug:     Waiting for 1 pods to be ready for StatefulSet cortex/ingester
2021-08-17T15:11:33.169Z info: waiting for 0 Certificates
2021-08-17T15:11:38.170Z info: waiting for 0 Deployments
2021-08-17T15:11:38.170Z info: waiting for 0 DaemonSets
2021-08-17T15:11:38.171Z info: waiting for 1 StatefulSets
2021-08-17T15:11:38.171Z debug:     Waiting for 1 pods to be ready for StatefulSet cortex/ingester
2021-08-17T15:11:38.171Z info: waiting for 0 Certificates
2021-08-17T15:11:43.172Z info: waiting for 0 Deployments
2021-08-17T15:11:43.172Z info: waiting for 0 DaemonSets
2021-08-17T15:11:43.172Z info: waiting for 0 StatefulSets
2021-08-17T15:11:43.172Z info: waiting for 0 Certificates
2021-08-17T15:11:43.173Z info: waiting for expected HTTP responses at these URLs: {

This is the ingester StatefulSet spec. Note the 5 replicas:

> k -n cortex get sts/ingester -o yaml  
apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: "2021-08-17T14:53:15Z"
  generation: 1
  name: ingester
  namespace: cortex
  ownerReferences:
  - apiVersion: cortex.opstrace.io/v1alpha1
    kind: Cortex
    name: opstrace-cortex
    uid: 9cd17ef2-b268-4340-8d5f-594f1343efa6
  resourceVersion: "56009873"
  selfLink: /apis/apps/v1/namespaces/cortex/statefulsets/ingester
  uid: 6501310f-dd0a-4f55-99e5-d13c790accb0
spec:
  podManagementPolicy: Parallel
  replicas: 5

To Reproduce

This bug occurred in an Opstrace instance with 5 ingesters. The StatefulSet was marked ready as soon as ingester-2 finished its rollout. At that point ingester-4 and ingester-3 had already completed theirs, but ingester-1 and ingester-0 had not been rolled out yet.

Expected behavior

The CLI StatefulSet readiness check should wait until all pods have been rolled out and are ready.
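
A minimal sketch of what such a check could look like, in TypeScript. This is not the CLI's actual code: the function name `isStatefulSetRolledOut` and the `StatefulSetLike` type are made up for illustration, and the sketch assumes the default `RollingUpdate` strategy without a partition. It relies only on standard `apps/v1` StatefulSet fields. The idea is that `readyReplicas` alone can equal `spec.replicas` in between two pod restarts, so the check also compares `updatedReplicas` and the two revision fields.

```typescript
// Hypothetical subset of the apps/v1 StatefulSet object used by the check.
interface StatefulSetLike {
  metadata: { generation?: number };
  spec: { replicas?: number };
  status?: {
    observedGeneration?: number;
    readyReplicas?: number;
    updatedReplicas?: number;
    currentRevision?: string;
    updateRevision?: string;
  };
}

// Returns true only when every desired pod is ready AND runs the latest revision.
// Assumption: RollingUpdate strategy with no partition configured.
function isStatefulSetRolledOut(sts: StatefulSetLike): boolean {
  const desired = sts.spec.replicas ?? 1; // Kubernetes defaults replicas to 1
  const status = sts.status;
  if (status === undefined) return false;

  // The controller has not processed the latest spec change yet.
  if ((status.observedGeneration ?? 0) < (sts.metadata.generation ?? 0)) return false;

  // Not all pods are ready.
  if ((status.readyReplicas ?? 0) < desired) return false;

  // Some pods still run the previous revision.
  if ((status.updatedReplicas ?? 0) < desired) return false;
  if (status.updateRevision !== status.currentRevision) return false;

  return true;
}
```

With a status resembling the situation described above (5 ready pods, 3 of them updated, two different revisions) this returns `false`, whereas a check based on `readyReplicas` alone would report the StatefulSet as ready.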

@sreis sreis self-assigned this Aug 17, 2021
@sreis sreis added area: cli type: bug user-facing problem (describe the problem in the title!) backlog state (used by codetree) labels Aug 17, 2021
@opstracy opstracy removed the backlog state (used by codetree) label Aug 18, 2021