Pre-release

@kmova released this Feb 23, 2019 · 1 commit to v0.8.x since this release

Getting Started

Prerequisites to install

  • Kubernetes 1.9.7+ is installed
  • Make sure that you run the installation steps below with a cluster admin context. The installation will involve creating a new Service Account and assigning it to the OpenEBS components.
  • Make sure iSCSI Initiator is installed on the Kubernetes nodes.
  • NDM helps in discovering the devices attached to Kubernetes nodes, which can be used to create storage pools. If you would like to exclude some of the disks from getting discovered, update the filters on NDM to exclude those paths before installing OpenEBS (see the sketch after this list).
  • NDM runs as a privileged pod since it needs to access device information. Please make the necessary changes to grant access to run in privileged mode. For example, when running on RHEL/CentOS, you may need to set the security context appropriately.
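
For illustration, excluding device paths from NDM discovery is done in the openebs-ndm-config ConfigMap that is part of the operator YAML. The snippet below is a minimal sketch of what the path filter entry might look like; the exact keys and values can vary between releases, so confirm them against the ConfigMap in your copy of the manifest before applying.

    filterconfigs:
      - key: path-filter
        name: path filter
        state: true
        include: ""
        # devices matching these path patterns are skipped during discovery (illustrative values)
        exclude: "loop,/dev/fd0,/dev/sr0,/dev/ram,/dev/dm-"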

Using kubectl

kubectl apply -f https://openebs.github.io/charts/openebs-operator-0.8.1.yaml
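
Once the operator YAML is applied, the OpenEBS control plane pods come up in the openebs namespace created by the operator YAML above. A quick way to verify the installation:

kubectl get pods -n openebs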

Using helm stable charts

helm repo update
helm install  --namespace openebs --name openebs stable/openebs

Sample Storage Pool Claims, Storage Class and PVC configurations to make use of new features can be found here: Sample YAMLs
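
For a quick illustration, a PVC that consumes an OpenEBS StorageClass might look like the sketch below. The StorageClass name openebs-jiva-default is an assumption here; substitute a StorageClass from the sample YAMLs above.

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: demo-vol-claim
    spec:
      storageClassName: openebs-jiva-default   # assumed StorageClass name; replace with your own
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5G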

For more details refer to the documentation at: https://docs.openebs.io/

Change Summary

New Capabilities

  • Support for new Storage Policies:
    • TargetTolerations (Applicable to both cStor and Jiva volumes) (openebs/maya#921). The TargetTolerations policy can be used to allow scheduling of cStor or Jiva Target Pods on nodes with taints. The TargetTolerations can be specified in the StorageClass as follows, where t1 and t2 represent the taints and conditions as expected by Kubernetes:
      annotations:
        cas.openebs.io/config: |
          - name: TargetTolerations
            value: |
              t1:
                key: "key1"
                operator: "Equal"
                value: "value1"
                effect: "NoSchedule"
              t2:
                key: "key1"
                operator: "Equal"
                value: "value1"
                effect: "NoExecute"    
      
    • ReplicaTolerations (Applicable to Jiva volumes) (openebs/maya#921). The ReplicaTolerations policy can be used to allow scheduling of Jiva Replica Pods on nodes with taints. The ReplicaTolerations can be specified in the StorageClass as follows, where t1 and t2 represent the taints and conditions as expected by Kubernetes:
      annotations:
        cas.openebs.io/config: |
          - name: ReplicaTolerations
            value: |
              t1:
                key: "key1"
                operator: "Equal"
                value: "value1"
                effect: "NoSchedule"
              t2:
                key: "key1"
                operator: "Equal"
                value: "value1"
                effect: "NoExecute"    
      
    • TargetNodeSelector policy can be used with cStor volumes as well (openebs/maya#914). The TargetNodeSelector policy can be specified in the StorageClass as follows, to pin the targets to a certain set of Kubernetes nodes labelled as node: storage-node:
      annotations:
        openebs.io/cas-type: cstor
        cas.openebs.io/config: |
          - name: TargetNodeSelector
            value: |-
              node: storage-node
      
    • ScrubImage (Applicable to Jiva Volumes) (openebs/maya#936). After a Jiva volume is deleted, a scrub job is scheduled to clear the data. The container image used to run the scrub job is available at quay.io/openebs/openebs-tools:3.8. For deployments where images can't be downloaded from the Internet, this image can be hosted locally and its location specified using the ScrubImage policy in the StorageClass as follows:
      annotations:
        cas.openebs.io/config: |
          - name: ScrubImage
            value: localrepo/openebs-tools:latest
      

Enhancements

  • Enhanced the documentation for better readability and revamped the guides for cStor Pool and Volume provisioning.
  • Enhanced the quorum handling logic in cStor volumes to reach quorum more quickly by optimizing the retries and timeouts required to establish quorum. (openebs/zfs#182)
  • Enhanced the cStor Pool Pods to include a Liveness check to fail fast if the underlying disks have been detached. (openebs/maya#894)
  • Enhanced the cStor Volume Target Pods to get rescheduled faster in the event of a node failure. (openebs/maya#894)
  • Enhanced the cStor Volume Replica placement to distribute replicas randomly among the available Pools. Prior to this fix, replicas were always placed on the first available Pool; when volumes were launched with a replica count of 1, all such replicas were scheduled onto the first Pool only. (openebs/maya#910)
  • Enhanced the Jiva and cStor provisioner to set the upgrade strategy in the respective deployments to Recreate. (openebs/maya#923)
  • Enhanced the node-disk-manager to fetch additional details about the underlying disks via openSeaChest. The details will be fetched for devices that support them. (openebs/node-disk-manager#185)
    A new section called “Stats” is added that will include information like:
    Stats:
      Device Utilization Rate:  0
      Disk Temperature:
        Current Temperature:   0
        Highest Temperature:   0
        Lowest Temperature:    0
      Percent Endurance Used:  0
      Total Bytes Read:        0
      Total Bytes Written:     0
    
  • Enhanced the node-disk-manager to add additional information to the Disk CRs, such as whether the disk is partitioned or has a filesystem on it. (openebs/node-disk-manager#197)
    FileSystem and Partition details will be included in the Disk CR as follows:
    partitionDetails:
      - fileSystemType: None
        partitionType: "0x83"
      - fileSystemType: None
        partitionType: "0x8e"
    
    If the disk is formatted as a whole with a filesystem, it will be included as:
    fileSystem: ext4
    
  • Enhanced the Disk CRs to include a property called managed, which indicates whether node-disk-manager should modify the Disk CRs. When managed is set to false, node-disk-manager will not update the status of the Disk CRs. This is helpful in cases where administrators would like to create Disk CRs for devices that are not yet supported by node-disk-manager, like partitioned disks or NVMe devices. A sample YAML for specifying a custom disk can be found here. (openebs/node-disk-manager#192)
  • Enhanced the debuggability of cStor and Jiva volumes by adding additional details about the IOs in the CLI commands and logs. For more details check:
  • Enhanced uZFS to include the zpool clear command (openebs/zfs#186) (openebs/zfs#187)
  • Enhanced the OpenEBS CRDs to include custom columns to be displayed in the kubectl get output of the CRs. This feature requires K8s 1.11 or higher. (openebs/maya#925)
    The sample output looks as follows:
    $ kubectl get csp -n openebs
    NAME                     ALLOCATED   FREE    CAPACITY    STATUS    TYPE       AGE
    sparse-claim-auto-lja7   125K        9.94G   9.94G       Healthy   striped    1h
    
    $ kubectl get cvr -n openebs
    NAME                                                              USED  ALLOCATED  STATUS    AGE
    pvc-9ca83170-01e3-11e9-812f-54e1ad0c1ccc-sparse-claim-auto-lja7   6K    6K         Healthy   1h
    
    $ kubectl get cstorvolume -n openebs
    NAME                                        STATUS    AGE
    pvc-9ca83170-01e3-11e9-812f-54e1ad0c1ccc    Healthy   4h
    
    $ kubectl get disk
    NAME                                      SIZE          STATUS   AGE
    sparse-5a92ced3e2ee21eac7b930f670b5eab5   10737418240   Active   10m
    

Major Bugs Fixed

  • Fixed an issue where the cStor target was not rebuilding the data onto replicas after the underlying cStor Pool was recreated with new disks. This scenario occurs in Cloud Provider or Private Cloud with Virtual Machine deployments, where ephemeral disks are used to create cStor pools and, after a node reboot, the node comes up with new ephemeral disks. (openebs/zfs#164) (openebs/maya#899)
  • Fixed an issue where a cStor volume caused a timeout for the iSCSI discovery command, which could potentially trigger a K8s vulnerability that can bring down a node with high RAM usage. (openebs/istgt#231)
  • Fixed an issue where the cStor iSCSI target was not conforming to the iSCSI protocol w.r.t. the immediate bit during the discovery phase. (openebs/istgt#231)
  • Fixed an issue where some of the internal snapshots and clones created for cStor Volume rebuild purpose were not cleaned up. (openebs/zfs#200)
  • Fixed an issue where Jiva Replica Pods continued to show as running even when the underlying disks were detached. The error is now handled and the Pod is restarted. (#1387)
  • Fixed an issue where Jiva Replicas that have a large number of sparse extent files would time out while connecting to their Jiva Targets. This could cause data unavailability if the Target cannot establish quorum with already connected replicas. (openebs/jiva#184)
  • Fixed an issue where node-disk-manager pods were not getting upgraded to the latest version, even after the image version was changed. (openebs/node-disk-manager#200)
  • Fixed an issue where NDM would get into CrashLoopBackOff when run in unprivileged mode. (openebs/node-disk-manager#198)
  • Fixed an issue with mayactl in displaying cStor volume details, when cStor target is deployed in its PVC namespace. (openebs/maya#891)

Backwards Incompatibilities

  • From 0.8.0:
    • mayactl snapshot commands are deprecated in favor of the kubectl approach of taking snapshots.
  • For previous releases, please refer to the respective release notes and upgrade steps.

Upgrade

Upgrade to 0.8.1 is supported only from 0.8.0 and follows an approach similar to earlier releases.

  • Upgrade OpenEBS Control Plane components
  • Upgrade Jiva PVs to 0.8.1, one at a time
  • Upgrade cStor Pools to 0.8.1 and their associated Volumes, one at a time.

Note that the upgrade uses node labels to pin the Jiva replicas to the nodes where they are present. On node restart, these labels will disappear and can cause the replica to not be scheduled.

The scripts and detailed instructions for upgrade are available here.

Uninstall

  • The recommended steps to uninstall are (a command sketch follows this list):
    • delete all the OpenEBS PVCs that were created
    • delete all the SPCs (in case of cStor)
    • ensure that no volume or pool pods are left in a terminating state: kubectl get pods -n <openebs namespace>
    • ensure that no OpenEBS custom resources are present: kubectl get cvr -n <openebs namespace>
    • delete OpenEBS either via helm purge or kubectl delete
  • Uninstalling OpenEBS doesn't automatically delete the CRDs that were created. If you would like to completely remove the CRDs and the associated objects, run the following commands:
    kubectl delete crd castemplates.openebs.io
    kubectl delete crd cstorpools.openebs.io
    kubectl delete crd cstorvolumereplicas.openebs.io
    kubectl delete crd cstorvolumes.openebs.io
    kubectl delete crd runtasks.openebs.io
    kubectl delete crd storagepoolclaims.openebs.io
    kubectl delete crd storagepools.openebs.io
    kubectl delete crd volumesnapshotdatas.volumesnapshot.external-storage.k8s.io
    kubectl delete crd volumesnapshots.volumesnapshot.external-storage.k8s.io
    
  • As part of deleting Jiva Volumes, OpenEBS launches scrub jobs for clearing the data from the nodes. The completed jobs need to be cleared using the following command:
    kubectl delete jobs -l openebs.io/cas-type=jiva -n <namespace>
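
For illustration, the uninstall checklist above could be run as follows on a default installation. The namespace openebs and the helm release name openebs are assumptions based on the install commands in this page; adjust them to your setup.

    # confirm that no volume or pool pods are left in a terminating state
    kubectl get pods -n openebs
    # confirm that no OpenEBS custom resources remain
    kubectl get cvr -n openebs
    # remove the release installed via helm (helm v2 syntax) ...
    helm delete openebs --purge
    # ... or remove the operator installed via kubectl
    kubectl delete -f https://openebs.github.io/charts/openebs-operator-0.8.1.yaml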

Limitations / Known Issues

  • The current version of OpenEBS volumes is not optimized for performance-sensitive applications
  • cStor Target or Pool pods can at times be stuck in a Terminating state. They will need to be manually cleaned up using kubectl delete with a 0-second grace period. Example: kubectl delete deploy <volume-target-deploy> -n openebs --force --grace-period=0
  • cStor Pool pods can consume more memory when there is continuous load. This can cross the memory limit and cause pod evictions. It is recommended that you create the cStor pools by setting memory limits and requests.
  • Jiva Volumes are not recommended if your use case requires snapshots and clone capabilities.
  • Jiva Replicas use a sparse file to store the data. When the application causes too many fragments (extents) to be created on the sparse file, a replica restart can take a longer time to get attached to the target. This issue was seen when there were 31K fragments created.
  • Volume Snapshots are dependent on the functionality provided by Kubernetes. The support is currently alpha. The only operations supported are Create Snapshot, Delete Snapshot and Clone from a Snapshot.
    Creation of a Snapshot uses a reconciliation loop, which means that a Create Snapshot operation will be retried on failure until the Snapshot has been successfully created. This may not be a desirable option in cases where point-in-time snapshots are expected.
  • If you are using a K8s version earlier than 1.12, in certain cases it may be observed that when the node hosting the target pod is offline, the target pod can take more than 120 seconds to get rescheduled. This is because target pods are configured with Tolerations based on the Node Condition, and TaintNodesByCondition is available only from K8s 1.12. If running an earlier version, you may have to enable the alpha gate for TaintNodesByCondition. If there is active load on the volume when the target pod goes offline, the volume will be marked as read-only.
  • If you are using K8s version 1.13 or later, which includes checks on ephemeral storage limits on the Pods, there is a chance that OpenEBS cStor and Jiva pods can get evicted because no ephemeral storage requests are specified. To avoid this issue, you can specify the ephemeral storage requests in the storage class or storage pool claim. (#2294)
  • When disks used by a cStor Pool are detached and reattached, the cStor Pool may fail to detect this event in certain scenarios. A manual intervention may be required to bring the cStor Pool online. (#2363)
  • When the underlying disks used by cStor or Jiva volumes are under disk pressure due to heavy IO load, and the Replicas take longer than 60 seconds to process the IO, the Volumes will get into a Read-Only state. In 0.8.1, logs have been added to the cStor and Jiva replicas to indicate if IO is experiencing higher latency. (#2337)

For a more comprehensive list of open issues uncovered through e2e, please refer to open issues.

Additional details and notes on upgrade and uninstall are available on the Project Tracker Wiki.
