Migration failure ended in data corruption #82

Closed
shubham14bajpai opened this issue Jan 12, 2021 · 0 comments · Fixed by #85

shubham14bajpai (Contributor) commented Jan 12, 2021

Describe the bug: Migration failure ended in data corruption.

Expected behaviour: Even if the migration failed, the pool should have been renamed and the data should not have been corrupted.

Steps to reproduce the bug:

Created an SPC in 1.7.0, then upgraded it to 2.4.0.

mayadata:upgrade$ kubectl get spc,csp
NAME                                     AGE
storagepoolclaim.openebs.io/cstor-pool   82m

NAME                                   ALLOCATED   FREE    CAPACITY   STATUS    READONLY   TYPE      AGE
cstorpool.openebs.io/cstor-pool-3w1a   334K        39.7G   39.8G      Healthy   false      striped   82m
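
For reference, the SPC here was a sparse striped pool claim created on 1.7.0, along the lines of the sketch below (the exact manifest is an assumption; only the name, the sparse disks, and the striped layout are taken from the outputs above):

kubectl apply -f - <<EOF
apiVersion: openebs.io/v1alpha1
kind: StoragePoolClaim
metadata:
  name: cstor-pool
spec:
  name: cstor-pool
  type: sparse
  maxPools: 1
  poolSpec:
    poolType: striped
EOF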


Started the migration and made it fail just after the CSPI came online.
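
The migration itself was run with the cstor-spc migration job from openebs/upgrade, roughly like this sketch (job name, image registry/tag and flags are assumed for this setup):

kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: migrate-cstor-spc-cstor-pool
  namespace: openebs
spec:
  backoffLimit: 4
  template:
    spec:
      serviceAccountName: openebs-maya-operator
      containers:
      - name: migrate
        image: openebs/migrate:2.4.0
        args:
        - "cstor-spc"
        - "--spc-name=cstor-pool"
        - "--v=4"
      restartPolicy: OnFailure
EOF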

mayadata:upgrade$ k get cspc,cspi
NAME                                           HEALTHYINSTANCES   PROVISIONEDINSTANCES   DESIREDINSTANCES   AGE
cstorpoolcluster.cstor.openebs.io/cstor-pool   1                  1                      1                  40m

NAME                                                 HOSTNAME    FREE     CAPACITY    READONLY   PROVISIONEDREPLICAS   HEALTHYREPLICAS   STATUS   AGE
cstorpoolinstance.cstor.openebs.io/cstor-pool-972g   127.0.0.1   38500M   38500378k   false      1                     0                 ONLINE   40m

Checked the zpool status on the CSPI pod
 
mayadata:upgrade$ kubectl -n openebs exec -it cstor-pool-972g-7f4cfdd794-z598d -c cstor-pool-mgmt -- bash
root@cstor-pool-972g-7f4cfdd794-z598d:/# zpool status
  pool: cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774
 state: ONLINE
  scan: none requested
config:

	NAME                                        STATE     READ WRITE CKSUM
	cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774  ONLINE       0     0     0
	  /var/openebs/sparse/3-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/0-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/1-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/2-ndm-sparse.img      ONLINE       0     0     0

errors: No known data errors

Then scaled the old CSP deployment back up and checked the pool status (unexpected behaviour: the old pool comes up ONLINE on the same sparse files that the renamed pool on the CSPI is already using).
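
The scale-up itself was a plain kubectl scale, e.g.:

kubectl -n openebs scale deploy cstor-pool-3w1a --replicas=1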

mayadata:openebs$ kubectl -n openebs exec -it cstor-pool-3w1a-56695f78b7-x957h -c cstor-pool-mgmt -- bash
root@cstor-pool-3w1a-56695f78b7-x957h:/# zpool status
  pool: cstor-76aad699-4e5f-4bd5-9a1b-16008d0d5c54
 state: ONLINE
  scan: none requested
config:

	NAME                                        STATE     READ WRITE CKSUM
	cstor-76aad699-4e5f-4bd5-9a1b-16008d0d5c54  ONLINE       0     0     0
	  /var/openebs/sparse/3-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/0-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/1-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/2-ndm-sparse.img      ONLINE       0     0     0

errors: No known data errors

Was still able to write data using the application pod.
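
(The writes were done onto the volume mounted in the application pod; the pod name and mount path below are placeholders:)

kubectl exec -it <app-pod> -- sh -c 'dd if=/dev/urandom of=/mnt/store/testfile bs=4k count=1000 && sync'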

Restarted the CSPI pod; the pool got imported, but zpool status now reports data corruption

mayadata:migrate$ k logs -f cstor-pool-972g-7f4cfdd794-2g8l2 -c cstor-pool-mgmt
+ rm /usr/local/bin/zrepl
+ pool_manager_pid=7
+ /usr/local/bin/pool-manager start
+ trap _sigint INT
+ trap _sigterm SIGTERM
+ wait 7
E0112 10:35:27.740140       7 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 10:35:27.740345       7 pool.go:123] Waiting for pool container to start...
E0112 10:35:30.751974       7 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 10:35:30.752010       7 pool.go:123] Waiting for pool container to start...
E0112 10:35:33.755995       7 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 10:35:33.756057       7 pool.go:123] Waiting for pool container to start...
E0112 10:35:36.770793       7 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 10:35:36.770879       7 pool.go:123] Waiting for pool container to start...
E0112 10:35:39.783352       7 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 10:35:39.783374       7 pool.go:123] Waiting for pool container to start...
E0112 10:35:42.787035       7 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 10:35:42.787113       7 pool.go:123] Waiting for pool container to start...
E0112 10:35:45.797694       7 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 10:35:45.797771       7 pool.go:123] Waiting for pool container to start...
I0112 10:35:45.809247       7 controller.go:109] Setting up event handlers for CSPI
I0112 10:35:45.809704       7 controller.go:115] will set up informer event handlers for cvr
I0112 10:35:45.810120       7 new_restore_controller.go:105] Setting up event handlers for restore
I0112 10:35:45.886391       7 controller.go:110] Setting up event handlers for backup
I0112 10:35:45.893357       7 runner.go:38] Starting CStorPoolInstance controller
I0112 10:35:45.893409       7 runner.go:41] Waiting for informer caches to sync
I0112 10:35:45.909280       7 common.go:262] CStorPool found: [cannot open 'name': no such pool ]
I0112 10:35:45.909483       7 run_restore_controller.go:38] Starting CStorRestore controller
I0112 10:35:45.909525       7 run_restore_controller.go:41] Waiting for informer caches to sync
I0112 10:35:45.909556       7 run_restore_controller.go:53] Started CStorRestore workers
I0112 10:35:45.909674       7 runner.go:39] Starting CStorVolumeReplica controller
I0112 10:35:45.909706       7 runner.go:42] Waiting for informer caches to sync
I0112 10:35:45.909727       7 runner.go:47] Starting CStorVolumeReplica workers
I0112 10:35:45.909749       7 runner.go:54] Started CStorVolumeReplica workers
I0112 10:35:45.909893       7 runner.go:38] Starting CStorBackup controller
I0112 10:35:45.909926       7 runner.go:41] Waiting for informer caches to sync
I0112 10:35:45.993629       7 runner.go:45] Starting CStorPoolInstance workers
I0112 10:35:45.993667       7 runner.go:51] Started CStorPoolInstance workers
I0112 10:35:46.010362       7 runner.go:53] Started CStorBackup workers
I0112 10:35:46.017415       7 import.go:73] Importing pool 764d0038-cb8d-4b34-8ef8-5fb1efa80081 cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774
I0112 10:35:51.166697       7 event.go:281] Event(v1.ObjectReference{Kind:"CStorPoolInstance", Namespace:"openebs", Name:"cstor-pool-972g", UID:"764d0038-cb8d-4b34-8ef8-5fb1efa80081", APIVersion:"cstor.openebs.io/v1", ResourceVersion:"9230", FieldPath:""}): type: 'Normal' reason: 'Pool Imported' Pool Import successful: cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774
^C
mayadata:migrate$ k exec -it cstor-pool-972g-7f4cfdd794-2g8l2 -c cstor-pool-mgmt -- bash
root@cstor-pool-972g-7f4cfdd794-2g8l2:/# zpool status
  pool: cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: none requested
config:

	NAME                                        STATE     READ WRITE CKSUM
	cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774  ONLINE       0     0     7
	  /var/openebs/sparse/3-ndm-sparse.img      ONLINE       0     0    14
	  /var/openebs/sparse/0-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/1-ndm-sparse.img      ONLINE       0     0     0
	  /var/openebs/sparse/2-ndm-sparse.img      ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list 

Still able to write data using the application pod.

Restarted the CSP pool pod; this time the import failed (expected behaviour)

mayadata:upgrade$ k logs -f cstor-pool-3w1a-56695f78b7-nb2zp -c cstor-pool-mgmt
+ rm /usr/local/bin/zrepl
+ exec /usr/local/bin/cstor-pool-mgmt start
E0112 11:09:32.888080       7 pool.go:501] zpool status returned error in zrepl startup : exit status 1
I0112 11:09:32.888334       7 pool.go:502] Waiting for zpool replication container to start...
E0112 11:09:35.896036       7 pool.go:501] zpool status returned error in zrepl startup : exit status 1
I0112 11:09:35.896298       7 pool.go:502] Waiting for zpool replication container to start...
E0112 11:09:38.903751       7 pool.go:501] zpool status returned error in zrepl startup : exit status 1
I0112 11:09:38.903805       7 pool.go:502] Waiting for zpool replication container to start...
E0112 11:09:41.912888       7 pool.go:501] zpool status returned error in zrepl startup : exit status 1
I0112 11:09:41.912968       7 pool.go:502] Waiting for zpool replication container to start...
E0112 11:09:44.920051       7 pool.go:501] zpool status returned error in zrepl startup : exit status 1
I0112 11:09:44.920155       7 pool.go:502] Waiting for zpool replication container to start...
E0112 11:09:47.928038       7 pool.go:501] zpool status returned error in zrepl startup : exit status 1
I0112 11:09:47.928138       7 pool.go:502] Waiting for zpool replication container to start...
I0112 11:09:47.983445       7 common.go:218] CStorPool CRD found
I0112 11:09:47.987162       7 common.go:236] CStorVolumeReplica CRD found
I0112 11:09:47.987794       7 new_pool_controller.go:103] Setting up event handlers
I0112 11:09:47.988014       7 new_replica_controller.go:118] will set up informer event handlers for cvr
I0112 11:09:47.988181       7 new_backup_controller.go:104] Setting up event handlers for backup
I0112 11:09:47.990730       7 new_restore_controller.go:103] Setting up event handlers for restore
I0112 11:09:47.993062       7 run_pool_controller.go:43] Starting CStorPool controller
I0112 11:09:47.993095       7 run_pool_controller.go:46] Waiting for informer caches to sync
I0112 11:09:47.996167       7 new_pool_controller.go:125] cStorPool Added event : cstor-pool-3w1a, 76aad699-4e5f-4bd5-9a1b-16008d0d5c54
I0112 11:09:47.997357       7 event.go:281] Event(v1.ObjectReference{Kind:"CStorPool", Namespace:"", Name:"cstor-pool-3w1a", UID:"76aad699-4e5f-4bd5-9a1b-16008d0d5c54", APIVersion:"openebs.io/v1alpha1", ResourceVersion:"13474", FieldPath:""}): type: 'Normal' reason: 'Synced' Received Resource create event
W0112 11:09:47.997459       7 common.go:271] CStorPool not found. Retrying after 5s, err: <nil>
I0112 11:09:47.997871       7 handler.go:598] cVR 'pvc-cb2f311d-b114-4927-bf1b-ab30738a270d-cstor-pool-3w1a': uid '6109daf9-a239-4049-b255-1aaf9671a7e0': phase 'Healthy': is_empty_status: false
I0112 11:09:47.998211       7 event.go:281] Event(v1.ObjectReference{Kind:"CStorVolumeReplica", Namespace:"openebs", Name:"pvc-cb2f311d-b114-4927-bf1b-ab30738a270d-cstor-pool-3w1a", UID:"6109daf9-a239-4049-b255-1aaf9671a7e0", APIVersion:"openebs.io/v1alpha1", ResourceVersion:"13475", FieldPath:""}): type: 'Normal' reason: 'Synced' Received Resource create event
I0112 11:09:48.093300       7 run_pool_controller.go:50] Starting CStorPool workers
I0112 11:09:48.093360       7 run_pool_controller.go:56] Started CStorPool workers
I0112 11:09:48.236208       7 new_pool_controller.go:167] cStorPool Modify event : cstor-pool-3w1a, 76aad699-4e5f-4bd5-9a1b-16008d0d5c54
I0112 11:09:48.237655       7 event.go:281] Event(v1.ObjectReference{Kind:"CStorPool", Namespace:"", Name:"cstor-pool-3w1a", UID:"76aad699-4e5f-4bd5-9a1b-16008d0d5c54", APIVersion:"openebs.io/v1alpha1", ResourceVersion:"13490", FieldPath:""}): type: 'Normal' reason: 'Synced' Received Resource modify event
E0112 11:09:48.574618       7 run_pool_controller.go:117] error syncing 'cstor-pool-3w1a': expected csp object but got 
cstorpool {null
}
W0112 11:09:53.005226       7 common.go:271] CStorPool not found. Retrying after 5s, err: <nil>
W0112 11:09:58.013215       7 common.go:271] CStorPool not found. Retrying after 5s, err: <nil>
W0112 11:10:03.021787       7 common.go:271] CStorPool not found. Retrying after 5s, err: <nil>
^C
mayadata:upgrade$ k exec -it cstor-pool-3w1a-56695f78b7-nb2zp -- bash
Defaulting container name to cstor-pool.
Use 'kubectl describe pod/cstor-pool-3w1a-56695f78b7-nb2zp -n openebs' to see all of the containers in this pod.
root@cstor-pool-3w1a-56695f78b7-nb2zp:/# zpool status
no pools available
root@cstor-pool-3w1a-56695f78b7-nb2zp:/# zpool import
2021-01-12/11:10:45.346 Iterating over all the devices to find zfs devices using blkid
2021-01-12/11:10:45.377 Iterated over cache devices to find zfs devices
no pools available to import
root@cstor-pool-3w1a-56695f78b7-nb2zp:/# 


Restarted the CSPI pool pod again and ended up with the issue the user reported

mayadata:upgrade$ k logs -f cstor-pool-972g-7f4cfdd794-f2lsm -c cstor-pool-mgmt
+ rm /usr/local/bin/zrepl
+ pool_manager_pid=8
+ trap _sigint INT
+ /usr/local/bin/pool-manager start
+ trap _sigterm SIGTERM
+ wait 8
E0112 11:13:02.634184       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:02.634240       8 pool.go:123] Waiting for pool container to start...
E0112 11:13:05.637713       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:05.637805       8 pool.go:123] Waiting for pool container to start...
E0112 11:13:08.653611       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:08.653714       8 pool.go:123] Waiting for pool container to start...
E0112 11:13:11.668001       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:11.668128       8 pool.go:123] Waiting for pool container to start...
E0112 11:13:14.680239       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:14.680294       8 pool.go:123] Waiting for pool container to start...
E0112 11:13:17.690164       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:17.690218       8 pool.go:123] Waiting for pool container to start...
E0112 11:13:20.702640       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:20.702696       8 pool.go:123] Waiting for pool container to start...
E0112 11:13:23.717248       8 pool.go:122] zpool status returned error in zrepl startup : exit status 1
I0112 11:13:23.717277       8 pool.go:123] Waiting for pool container to start...
I0112 11:13:23.723416       8 controller.go:109] Setting up event handlers for CSPI
I0112 11:13:23.723781       8 controller.go:115] will set up informer event handlers for cvr
I0112 11:13:23.724125       8 new_restore_controller.go:105] Setting up event handlers for restore
I0112 11:13:23.733100       8 controller.go:110] Setting up event handlers for backup
I0112 11:13:23.737086       8 runner.go:38] Starting CStorPoolInstance controller
I0112 11:13:23.737111       8 runner.go:41] Waiting for informer caches to sync
I0112 11:13:23.743502       8 common.go:262] CStorPool found: [cannot open 'name': no such pool ]
I0112 11:13:23.743575       8 run_restore_controller.go:38] Starting CStorRestore controller
I0112 11:13:23.743584       8 run_restore_controller.go:41] Waiting for informer caches to sync
I0112 11:13:23.743595       8 run_restore_controller.go:53] Started CStorRestore workers
I0112 11:13:23.743643       8 runner.go:39] Starting CStorVolumeReplica controller
I0112 11:13:23.743655       8 runner.go:42] Waiting for informer caches to sync
I0112 11:13:23.743662       8 runner.go:47] Starting CStorVolumeReplica workers
I0112 11:13:23.743670       8 runner.go:54] Started CStorVolumeReplica workers
I0112 11:13:23.743719       8 runner.go:38] Starting CStorBackup controller
I0112 11:13:23.743732       8 runner.go:41] Waiting for informer caches to sync
I0112 11:13:23.743742       8 runner.go:53] Started CStorBackup workers
I0112 11:13:23.837328       8 runner.go:45] Starting CStorPoolInstance workers
I0112 11:13:23.837409       8 runner.go:51] Started CStorPoolInstance workers
I0112 11:13:23.891344       8 import.go:73] Importing pool 764d0038-cb8d-4b34-8ef8-5fb1efa80081 cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774
E0112 11:13:24.039603       8 import.go:94] Failed to import pool by reading cache file: cannot import 'cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774': I/O error
	Recovery is possible, but will result in some data loss.
	Returning the pool to its state as of Tue Jan 12 11:13:10 2021
	should correct the problem.  Approximately 5 seconds of data
	must be discarded, irreversibly.  Recovery can be attempted
	by executing 'zpool import -F cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774'.  A scrub of the pool
	is strongly recommended after recovery.
 : exit status 1
E0112 11:13:25.375807       8 import.go:114] Failed to import pool by scanning directory: 2021-01-12/11:13:24.042 Verifying pool existence on the device /var/openebs/sparse/0-ndm-sparse.img
2021-01-12/11:13:24.042 Verifying pool existence on the device /var/openebs/sparse/3-ndm-sparse.img
2021-01-12/11:13:24.042 Verifying pool existence on the device /var/openebs/sparse/4-ndm-sparse.img
2021-01-12/11:13:24.043 Verifying pool existence on the device /var/openebs/sparse/2-ndm-sparse.img
2021-01-12/11:13:24.043 Skipping /var/openebs/sparse/4-ndm-sparse.img device due to no labels on device
2021-01-12/11:13:24.043 Verifying pool existence on the device /var/openebs/sparse/shared-cstor-pool
2021-01-12/11:13:24.043 ERROR Skipping /var/openebs/sparse/shared-cstor-pool device due to failure in read stats or it is not a regular file/block device
2021-01-12/11:13:24.042 Verifying pool existence on the device /var/openebs/sparse/1-ndm-sparse.img
2021-01-12/11:13:25.069 Verified the device /var/openebs/sparse/1-ndm-sparse.img for pool existence
2021-01-12/11:13:25.081 Verified the device /var/openebs/sparse/3-ndm-sparse.img for pool existence
2021-01-12/11:13:25.092 Verified the device /var/openebs/sparse/2-ndm-sparse.img for pool existence
2021-01-12/11:13:25.107 Verified the device /var/openebs/sparse/0-ndm-sparse.img for pool existence
cannot import 'cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774': I/O error
	Recovery is possible, but will result in some data loss.
	Returning the pool to its state as of Tue Jan 12 11:13:10 2021
	should correct the problem.  Approximately 5 seconds of data
	must be discarded, irreversibly.  Recovery can be attempted
	by executing 'zpool import -F cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774'.  A scrub of the pool
	is strongly recommended after recovery.
 : exit status 1
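
The recovery path the error message points at is the rewind import, sketched below; per the message it irreversibly discards roughly 5 seconds of data, and a scrub is strongly recommended afterwards:

kubectl -n openebs exec -it cstor-pool-972g-7f4cfdd794-f2lsm -c cstor-pool-mgmt -- bash
zpool import -F cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774
zpool scrub cstor-7d9da0d6-904b-4310-8d90-3da1aacf4774
zpool status -v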

The user had hundreds of restarts on his pods, and his node went down a couple of times.

The suspected reason for the lock not working is that the lock path is not the same for the CSP and CSPI deployments (a quick check is sketched after the deployment specs below):

mayadata:upgrade$ k get deploy cstor-pool-3w1a cstor-pool-972g -oyaml
apiVersion: v1
items:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    annotations:
      cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
      deployment.kubernetes.io/revision: "2"
      openebs.io/monitoring: pool_exporter_prometheus
    creationTimestamp: "2021-01-12T09:33:27Z"
    generation: 4
    labels:
      app: cstor-pool
      openebs.io/cas-template-name: cstor-pool-create-default-1.7.0
      openebs.io/cstor-pool: cstor-pool-3w1a
      openebs.io/storage-pool-claim: cstor-pool
      openebs.io/version: 2.4.0
    name: cstor-pool-3w1a
    namespace: openebs
    ownerReferences:
    - apiVersion: openebs.io/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: CStorPool
      name: cstor-pool-3w1a
      uid: 76aad699-4e5f-4bd5-9a1b-16008d0d5c54
    resourceVersion: "14526"
    selfLink: /apis/apps/v1/namespaces/openebs/deployments/cstor-pool-3w1a
    uid: d4a15972-8ede-4f07-abd8-fc51e9061f19
  spec:
    progressDeadlineSeconds: 600
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        app: cstor-pool
    strategy:
      type: Recreate
    template:
      metadata:
        annotations:
          cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
          openebs.io/monitoring: pool_exporter_prometheus
          prometheus.io/path: /metrics
          prometheus.io/port: "9500"
          prometheus.io/scrape: "true"
        creationTimestamp: null
        labels:
          app: cstor-pool
          openebs.io/cstor-pool: cstor-pool-3w1a
          openebs.io/storage-pool-claim: cstor-pool
          openebs.io/version: 2.4.0
      spec:
        containers:
        - env:
          - name: OPENEBS_IO_CSTOR_ID
            value: 76aad699-4e5f-4bd5-9a1b-16008d0d5c54
          image: quay.io/openebs/cstor-pool:2.4.0
          imagePullPolicy: IfNotPresent
          lifecycle:
            postStart:
              exec:
                command:
                - /bin/sh
                - -c
                - sleep 2
          livenessProbe:
            exec:
              command:
              - /bin/sh
              - -c
              - timeout 120 zfs set io.openebs:livenesstimestamp="$(date +%s)" cstor-$OPENEBS_IO_CSTOR_ID
            failureThreshold: 3
            initialDelaySeconds: 300
            periodSeconds: 60
            successThreshold: 1
            timeoutSeconds: 150
          name: cstor-pool
          ports:
          - containerPort: 12000
            protocol: TCP
          - containerPort: 3233
            protocol: TCP
          - containerPort: 3232
            protocol: TCP
          resources:
            limits:
              memory: 4Gi
            requests:
              memory: 2Gi
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /dev
            name: device
          - mountPath: /var/openebs/cstor-pool
            name: storagepath
          - mountPath: /tmp
            name: tmp
          - mountPath: /var/openebs/sparse
            name: sparse
          - mountPath: /run/udev
            name: udev
          - mountPath: /var/tmp/sock
            name: sockfile
        - env:
          - name: OPENEBS_IO_CSTOR_ID
            value: 76aad699-4e5f-4bd5-9a1b-16008d0d5c54
          - name: POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: RESYNC_INTERVAL
            value: "30"
          image: quay.io/openebs/cstor-pool-mgmt:2.4.0
          imagePullPolicy: IfNotPresent
          name: cstor-pool-mgmt
          resources: {}
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /dev
            name: device
          - mountPath: /tmp
            name: tmp
          - mountPath: /var/openebs/cstor-pool
            name: storagepath
          - mountPath: /var/openebs/sparse
            name: sparse
          - mountPath: /run/udev
            name: udev
          - mountPath: /var/tmp/sock
            name: sockfile
        - args:
          - -e=pool
          command:
          - maya-exporter
          image: quay.io/openebs/m-exporter:2.4.0
          imagePullPolicy: IfNotPresent
          name: maya-exporter
          ports:
          - containerPort: 9500
            protocol: TCP
          resources: {}
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /dev
            name: device
          - mountPath: /tmp
            name: tmp
          - mountPath: /var/openebs/cstor-pool
            name: storagepath
          - mountPath: /var/openebs/sparse
            name: sparse
          - mountPath: /run/udev
            name: udev
          - mountPath: /var/tmp/sock
            name: sockfile
        dnsPolicy: ClusterFirst
        nodeSelector:
          kubernetes.io/hostname: 127.0.0.1
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        serviceAccount: openebs-maya-operator
        serviceAccountName: openebs-maya-operator
        terminationGracePeriodSeconds: 30
        volumes:
        - hostPath:
            path: /dev
            type: Directory
          name: device
        - hostPath:
            path: /var/openebs/cstor-pool/cstor-pool
            type: DirectoryOrCreate
          name: storagepath
        - emptyDir: {}
          name: sockfile
        - hostPath:
            path: /var/openebs/sparse/shared-cstor-pool
            type: DirectoryOrCreate
          name: tmp
        - hostPath:
            path: /var/openebs/sparse
            type: DirectoryOrCreate
          name: sparse
        - hostPath:
            path: /run/udev
            type: Directory
          name: udev
  status:
    conditions:
    - lastTransitionTime: "2021-01-12T09:33:28Z"
      lastUpdateTime: "2021-01-12T09:50:42Z"
      message: ReplicaSet "cstor-pool-3w1a-56695f78b7" has successfully progressed.
      reason: NewReplicaSetAvailable
      status: "True"
      type: Progressing
    - lastTransitionTime: "2021-01-12T11:17:10Z"
      lastUpdateTime: "2021-01-12T11:17:10Z"
      message: Deployment does not have minimum availability.
      reason: MinimumReplicasUnavailable
      status: "False"
      type: Available
    observedGeneration: 4
    replicas: 1
    unavailableReplicas: 1
    updatedReplicas: 1
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    annotations:
      deployment.kubernetes.io/revision: "1"
      openebs.io/monitoring: pool_exporter_prometheus
    creationTimestamp: "2021-01-12T10:16:15Z"
    generation: 1
    labels:
      app: cstor-pool
      openebs.io/cstor-pool-cluster: cstor-pool
      openebs.io/cstor-pool-instance: cstor-pool-972g
      openebs.io/version: 2.4.0
    name: cstor-pool-972g
    namespace: openebs
    ownerReferences:
    - apiVersion: cstor.openebs.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: CStorPoolInstance
      name: cstor-pool-972g
      uid: 764d0038-cb8d-4b34-8ef8-5fb1efa80081
    resourceVersion: "13982"
    selfLink: /apis/apps/v1/namespaces/openebs/deployments/cstor-pool-972g
    uid: d0c27829-288b-4300-9301-b5da91841ebc
  spec:
    progressDeadlineSeconds: 600
    replicas: 1
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        app: cstor-pool
    strategy:
      type: Recreate
    template:
      metadata:
        annotations:
          cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
          openebs.io/monitoring: pool_exporter_prometheus
          prometheus.io/path: /metrics
          prometheus.io/port: "9500"
          prometheus.io/scrape: "true"
        creationTimestamp: null
        labels:
          app: cstor-pool
          openebs.io/cstor-pool-cluster: cstor-pool
          openebs.io/cstor-pool-instance: cstor-pool-972g
          openebs.io/version: 2.4.0
      spec:
        containers:
        - env:
          - name: OPENEBS_IO_CSPI_ID
            value: 764d0038-cb8d-4b34-8ef8-5fb1efa80081
          - name: RESYNC_INTERVAL
            value: "30"
          - name: POD_NAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.name
          - name: NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: OPENEBS_IO_POOL_NAME
            value: 7d9da0d6-904b-4310-8d90-3da1aacf4774
          image: openebs/cstor-pool-manager:2.4.0
          imagePullPolicy: IfNotPresent
          name: cstor-pool-mgmt
          resources: {}
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /dev
            name: device
          - mountPath: /tmp
            name: tmp
          - mountPath: /run/udev
            name: udev
          - mountPath: /var/openebs/cstor-pool
            name: storagepath
          - mountPath: /var/tmp/sock
            name: sockfile
          - mountPath: /var/openebs/sparse
            name: sparse
        - env:
          - name: OPENEBS_IO_CSTOR_ID
            value: 764d0038-cb8d-4b34-8ef8-5fb1efa80081
          - name: OPENEBS_IO_POOL_NAME
            value: 7d9da0d6-904b-4310-8d90-3da1aacf4774
          image: openebs/cstor-pool:2.4.0
          imagePullPolicy: IfNotPresent
          lifecycle:
            postStart:
              exec:
                command:
                - /bin/sh
                - -c
                - sleep 2
          livenessProbe:
            exec:
              command:
              - /bin/sh
              - -c
              - timeout 120 zfs set io.openebs:livenesstimestamp="$(date +%s)" cstor-$OPENEBS_IO_POOL_NAME
            failureThreshold: 3
            initialDelaySeconds: 300
            periodSeconds: 60
            successThreshold: 1
            timeoutSeconds: 150
          name: cstor-pool
          ports:
          - containerPort: 12000
            protocol: TCP
          - containerPort: 3232
            protocol: TCP
          - containerPort: 3233
            protocol: TCP
          resources:
            limits:
              memory: 4Gi
            requests:
              memory: 2Gi
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /dev
            name: device
          - mountPath: /tmp
            name: tmp
          - mountPath: /run/udev
            name: udev
          - mountPath: /var/openebs/cstor-pool
            name: storagepath
          - mountPath: /var/tmp/sock
            name: sockfile
          - mountPath: /var/openebs/sparse
            name: sparse
        - args:
          - -e=pool
          command:
          - maya-exporter
          image: openebs/m-exporter:2.4.0
          imagePullPolicy: IfNotPresent
          name: maya-exporter
          ports:
          - containerPort: 9500
            protocol: TCP
          resources: {}
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /dev
            name: device
          - mountPath: /tmp
            name: tmp
          - mountPath: /run/udev
            name: udev
          - mountPath: /var/openebs/cstor-pool
            name: storagepath
          - mountPath: /var/tmp/sock
            name: sockfile
          - mountPath: /var/openebs/sparse
            name: sparse
        dnsPolicy: ClusterFirst
        nodeSelector:
          kubernetes.io/hostname: 127.0.0.1
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        serviceAccount: openebs-maya-operator
        serviceAccountName: openebs-maya-operator
        terminationGracePeriodSeconds: 30
        volumes:
        - hostPath:
            path: /dev
            type: Directory
          name: device
        - hostPath:
            path: /run/udev
            type: Directory
          name: udev
        - hostPath:
            path: /var/openebs/cstor-pool/cstor-pool
            type: DirectoryOrCreate
          name: tmp
        - hostPath:
            path: /var/openebs/sparse
            type: DirectoryOrCreate
          name: sparse
        - hostPath:
            path: /var/openebs/cstor-pool/cstor-pool
            type: DirectoryOrCreate
          name: storagepath
        - emptyDir: {}
          name: sockfile
  status:
    availableReplicas: 1
    conditions:
    - lastTransitionTime: "2021-01-12T10:16:15Z"
      lastUpdateTime: "2021-01-12T10:16:39Z"
      message: ReplicaSet "cstor-pool-972g-7f4cfdd794" has successfully progressed.
      reason: NewReplicaSetAvailable
      status: "True"
      type: Progressing
    - lastTransitionTime: "2021-01-12T11:13:07Z"
      lastUpdateTime: "2021-01-12T11:13:07Z"
      message: Deployment has minimum availability.
      reason: MinimumReplicasAvailable
      status: "True"
      type: Available
    observedGeneration: 1
    readyReplicas: 1
    replicas: 1
    updatedReplicas: 1
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

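A quick way to see the mismatch is to compare the hostPath backing the tmp volume (the /tmp mount, where the import lock presumably lives) in the two deployments; per the specs above they point to different directories:

kubectl -n openebs get deploy cstor-pool-3w1a cstor-pool-972g \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.template.spec.volumes[?(@.name=="tmp")].hostPath.path}{"\n"}{end}'
# cstor-pool-3w1a: /var/openebs/sparse/shared-cstor-pool
# cstor-pool-972g: /var/openebs/cstor-pool/cstor-pool
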
To track this, we can follow up on this thread: https://kubernetes.slack.com/archives/CUAKPFU78/p1608665319368100

Environment details:

  • OpenEBS version (use kubectl get po -n openebs --show-labels): 2.4.0
  • Kubernetes version (use kubectl version): 1.18
  • Cloud provider or hardware configuration: Rancher
  • OS (e.g: cat /etc/os-release): CentOS
  • kernel (e.g: uname -a):
  • others: VMware Virtual disks