helm: allow defining multiple cephEcStorageClass #13405

Conversation

@Javlopez (Contributor) commented Dec 15, 2023

This pull request enables defining a list of ecBlockPools, each with its own storageClass. Previously, the erasure-coded storage classes had to be defined separately from their pools; now all the settings for an ecBlockPool, including its storage class, can be specified directly under each ecBlockPool definition.
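As an illustration only, the nested layout described above could look roughly like the following in values.yaml. The storageClass and isDefault fields come from the discussion below; the remaining field names and values are assumptions based on the existing chart defaults, and the exact merged schema may differ:

cephECBlockPools:
  - name: ec-pool
    spec:
      metadataPool:
        replicated:
          size: 2
      dataPool:
        failureDomain: osd
        erasureCoded:
          dataChunks: 2
          codingChunks: 1
    parameters:
      # clusterID is the namespace where the rook cluster is running
      clusterID: rook-ceph # namespace:cluster
      imageFormat: "2"
      imageFeatures: layering
    storageClass:
      name: rook-ceph-block
      isDefault: false
      provisioner: rook-ceph.rbd.csi.ceph.com # driver:namespace:operator
      allowVolumeExpansion: true
      reclaimPolicy: Delete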

Testing

Since this scenario is not covered by CI, you can test it yourself using minikube with at least 5 nodes:

minikube start --disk-size=20g --extra-disks=1 --driver kvm2 --nodes 5

Then use this configuration in values.yaml:

diff --git a/deploy/charts/rook-ceph-cluster/values.yaml b/deploy/charts/rook-ceph-cluster/values.yaml
index 3b3015113..a6a27367f 100644
--- a/deploy/charts/rook-ceph-cluster/values.yaml
+++ b/deploy/charts/rook-ceph-cluster/values.yaml
@@ -279,25 +279,25 @@ cephClusterSpec:
   resources:
     mgr:
       limits:
-        cpu: "1000m"
-        memory: "1Gi"
-      requests:
         cpu: "500m"
         memory: "512Mi"
+      requests:
+        cpu: "250m"
+        memory: "256Mi"
     mon:
       limits:
-        cpu: "2000m"
-        memory: "2Gi"
-      requests:
-        cpu: "1000m"
+        cpu: "500m"
         memory: "1Gi"
+      requests:
+        cpu: "250m"
+        memory: "512Mi"
     osd:
       limits:
-        cpu: "2000m"
-        memory: "4Gi"
-      requests:
         cpu: "1000m"
-        memory: "4Gi"
+        memory: "2Gi"
+      requests:
+        cpu: "500m"
+        memory: "1Gi"
     prepareosd:
       # limits: It is not recommended to set limits on the OSD prepare job
       #         since it's a one-time burst for memory that must be allowed to
@@ -520,11 +520,11 @@ cephFileSystems:
         activeStandby: true
         resources:
           limits:
-            cpu: "2000m"
-            memory: "4Gi"
-          requests:
             cpu: "1000m"
-            memory: "4Gi"
+            memory: "1Gi"
+          requests:
+            cpu: "500m"
+            memory: "1Gi"
         priorityClassName: system-cluster-critical
     storageClass:
       enabled: true
@@ -594,11 +594,11 @@ cephObjectStores:
         port: 80
         resources:
           limits:
-            cpu: "2000m"
-            memory: "2Gi"
-          requests:
             cpu: "1000m"
             memory: "1Gi"
+          requests:
+            cpu: "500m"
+            memory: "500Mi"
         # securePort: 443
         # sslCertificateRef:
         instances: 1
@@ -625,63 +625,63 @@ cephObjectStores:
       #   secretName: ceph-objectstore-tls
       # ingressClassName: nginx
 # cephECBlockPools are disabled by default, please remove the comments and set desired values to enable it
-#cephECBlockPools:
-#  # For erasure coded a replicated metadata pool is required.
-#  # https://rook.io/docs/rook/latest/CRDs/Shared-Filesystem/ceph-filesystem-crd/#erasure-coded
-#  - name: ec-metadata-pool
-#    # see https://github.com/rook/rook/blob/master/Documentation/CRDs/Block-Storage/ceph-block-pool-crd.md#spec for available configuration
-#    spec:
-#      replicated:
-#        size: 2
-#  - name: ec-data-pool
-#    spec:
-#      failureDomain: osd
-#      erasureCoded:
-#        dataChunks: 2
-#        codingChunks: 1
-#      deviceClass: hdd
+cephECBlockPools:
+  # For erasure coded a replicated metadata pool is required.
+  # https://rook.io/docs/rook/latest/CRDs/Shared-Filesystem/ceph-filesystem-crd/#erasure-coded
+  - name: ec-metadata-pool
+    # see https://github.com/rook/rook/blob/master/Documentation/CRDs/Block-Storage/ceph-block-pool-crd.md#spec for available configuration
+    spec:
+      replicated:
+        size: 2
+  - name: ec-data-pool
+    spec:
+      failureDomain: osd
+      erasureCoded:
+        dataChunks: 2
+        codingChunks: 1
+      deviceClass: hdd
 
 # cephECStorageClass also is disabled by default, please remove the comments and set desired values to enable it
 # if cephECBlockPools are uncommented you must remove the comments of cephEcStorageClass as well
-#cephECStorageClasses:
-#  - name: rook-ceph-block
-#    # Change "rook-ceph" provisioner prefix to match the operator namespace if needed
-#    provisioner: rook-ceph.rbd.csi.ceph.com # driver:namespace:operator
-#    parameters:
-#      # clusterID is the namespace where the rook cluster is running
-#      # If you change this namespace, also change the namespace below where the secret namespaces are defined
-#      clusterID: rook-ceph # namespace:cluster
-#
-#      # If you want to use erasure coded pool with RBD, you need to create
-#      # two pools. one erasure coded and one replicated.
-#      # You need to specify the replicated pool here in the `pool` parameter, it is
-#      # used for the metadata of the images.
-#      # The erasure coded pool must be set as the `dataPool` parameter below.
-#      dataPool: ec-data-pool
-#      pool: ec-metadata-pool
-#
-#      # (optional) mapOptions is a comma-separated list of map options.
-#      # For krbd options refer
-#      # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
-#      # For nbd options refer
-#      # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
-#      # mapOptions: lock_on_read,queue_depth=1024
-#
-#      # (optional) unmapOptions is a comma-separated list of unmap options.
-#      # For krbd options refer
-#      # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
-#      # For nbd options refer
-#      # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
-#      # unmapOptions: force
-#
-#      # RBD image format. Defaults to "2".
-#      imageFormat: "2"
-#
-#      # RBD image features, equivalent to OR'd bitfield value: 63
-#      # Available for imageFormat: "2". Older releases of CSI RBD
-#      # support only the `layering` feature. The Linux kernel (KRBD) supports the
-#      # full feature complement as of 5.4
-#      # imageFeatures: layering,fast-diff,object-map,deep-flatten,exclusive-lock
-#      imageFeatures: layering
-#    allowVolumeExpansion: true
-#    reclaimPolicy: Delete
\ No newline at end of file
+cephECStorageClasses:
+  - name: rook-ceph-block
+    # Change "rook-ceph" provisioner prefix to match the operator namespace if needed
+    provisioner: rook-ceph.rbd.csi.ceph.com # driver:namespace:operator
+    parameters:
+      # clusterID is the namespace where the rook cluster is running
+      # If you change this namespace, also change the namespace below where the secret namespaces are defined
+      clusterID: rook-ceph # namespace:cluster
+
+      # If you want to use erasure coded pool with RBD, you need to create
+      # two pools. one erasure coded and one replicated.
+      # You need to specify the replicated pool here in the `pool` parameter, it is
+      # used for the metadata of the images.
+      # The erasure coded pool must be set as the `dataPool` parameter below.
+      dataPool: ec-data-pool
+      pool: ec-metadata-pool
+
+      # (optional) mapOptions is a comma-separated list of map options.
+      # For krbd options refer
+      # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
+      # For nbd options refer
+      # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
+      # mapOptions: lock_on_read,queue_depth=1024
+
+      # (optional) unmapOptions is a comma-separated list of unmap options.
+      # For krbd options refer
+      # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
+      # For nbd options refer
+      # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
+      # unmapOptions: force
+
+      # RBD image format. Defaults to "2".
+      imageFormat: "2"
+
+      # RBD image features, equivalent to OR'd bitfield value: 63
+      # Available for imageFormat: "2". Older releases of CSI RBD
+      # support only the `layering` feature. The Linux kernel (KRBD) supports the
+      # full feature complement as of 5.4
+      # imageFeatures: layering,fast-diff,object-map,deep-flatten,exclusive-lock
+      imageFeatures: layering
+    allowVolumeExpansion: true
+    reclaimPolicy: Delete
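
With that values.yaml in place, installing the cluster chart looks roughly like this (a sketch, assuming the rook-ceph operator chart is already running in the rook-ceph namespace and the commands are run from the root of a checkout of this branch):

helm upgrade --install rook-ceph-cluster deploy/charts/rook-ceph-cluster \
  --namespace rook-ceph --create-namespace \
  -f deploy/charts/rook-ceph-cluster/values.yaml

# verify the EC pools and the generated StorageClass
kubectl -n rook-ceph get cephblockpools
kubectl get storageclass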

Which issue is resolved by this Pull Request:
Resolves #12772

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@Javlopez self-assigned this Dec 15, 2023
@Javlopez added the helm label Dec 15, 2023
@travisn (Member) left a comment

Please add the following:

  • PR and commit description to explain why the change is needed
  • If someone had defined the EC storage class previously, will upgrades be affected?
  • Since this scenario is not tested in the CI, please comment on the scenarios you could test.

@Javlopez force-pushed the feature/Allow-defining-multiple-cephECStorageClass branch from 947ee26 to b048e09 on January 8, 2024 05:48
@Javlopez requested a review from travisn on January 8, 2024 05:54
@Javlopez force-pushed the feature/Allow-defining-multiple-cephECStorageClass branch from b048e09 to fde3309 on January 8, 2024 06:07
@Javlopez force-pushed the feature/Allow-defining-multiple-cephECStorageClass branch 2 times, most recently from 0908c23 to 2762224 on January 9, 2024 01:26
@Javlopez requested a review from travisn on January 9, 2024 01:33
@folliehiyuki commented:
Sorry, I completely forgot about the issue I opened for a long time! This PR got lost in my GitHub notifications, and I just noticed it today.

Some questions from me (taken from my issue description):

  • Can you add an isDefault option to allow setting one of the defined StorageClasses as the default one?
  • Can you consider my proposal (making cephECStorageClass a sub-option of cephECBlockPools)?

@Javlopez force-pushed the feature/Allow-defining-multiple-cephECStorageClass branch 2 times, most recently from 4718883 to 6305730 on January 12, 2024 06:42
@travisn (Member) commented Jan 12, 2024

Can you add an isDefault option to allow setting one of the defined StorageClasses as the default one?

I see @Javlopez has added this now.

Can you consider my proposal (making cephECStorageClass a sub-option of cephECBlockPools)?

This change would also require other changes to the structure to define the metadataPool and the dataPool. While I do like that schema better, I wonder how much it will adversely affect anyone who is currently creating the EC block pools. There is already a small breaking change in this PR (changing the EC storage classes to a list), so perhaps it is acceptable to make that change at the same time.
@folliehiyuki @anthonyeleven As Helm users, would this be an easy update to manage, or is it painful whenever Helm settings are restructured like this?

@anthonyeleven (Contributor) commented:
Honestly I don't really grok Helm myself, but others on my team do. If this means formatting the config file / chart slightly differently when updating to a Rook release, I think that's manageable, especially if it adds functionality we don't have today. What I'm most wary of is anything that could result in Rook perturbing existing Ceph resources - we're currently testing to ensure that the recent change that lets Rook modify CRUSH rules won't wreak any havoc on our clusters.

And of course I look forward to Rook configuring multiple RGW storageclasses / placement targets so I don't have to do it by hand - and then worry that Rook will scribble over it.

@Javlopez force-pushed the feature/Allow-defining-multiple-cephECStorageClass branch from 6305730 to 7c07497 on January 16, 2024 02:34
@folliehiyuki commented:
My Helm upgrade workflow always involves rendering the actual manifests and then diffing them against the current objects. It's usually a daunting task when refactoring is required (in this case, changing the values to keep the rendered manifests the same), but not unmanageable; this is just a typical Helm problem.

I'm expecting this change and aware of it, so we already have a plan to handle the upgrade process.
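
For reference, a minimal sketch of that render-and-diff workflow (the release name, chart path, and values file name are assumptions):

helm template rook-ceph-cluster deploy/charts/rook-ceph-cluster \
  --namespace rook-ceph -f my-values.yaml > rendered.yaml
# show how the rendered manifests differ from the live objects
kubectl diff -f rendered.yaml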

@travisn (Member) commented Jan 16, 2024

Sounds like the benefit of the change is worth the update. @Javlopez Want to go ahead with those additional changes?

@Javlopez force-pushed the feature/Allow-defining-multiple-cephECStorageClass branch 3 times, most recently from 837e51d to 1960a98 on January 22, 2024 19:52
Two review threads on deploy/charts/rook-ceph-cluster/values.yaml (outdated, resolved)
mergify bot commented Jan 29, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @Javlopez please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

This update modifies the Helm chart rook-ceph-cluster to allow defining
cephStorageClass as part of the block pool; the documentation for the
cephECBlockPool definition was also updated.

Signed-off-by: Javier <sjavierlopez@gmail.com>
@Javlopez force-pushed the feature/Allow-defining-multiple-cephECStorageClass branch from 1960a98 to 95ae42c on January 30, 2024 01:23
@Javlopez requested a review from travisn on January 30, 2024 04:30
@travisn (Member) left a comment

@folliehiyuki @anthonyeleven Might you be able to test this to see if there is any additional feedback before we get it into a release?

@anthonyeleven (Contributor) commented:

@travisn Potentially. We have a lab cluster and my protege has been working with storageclasses to ensure that the manual work we did to correct the .mgr pool and to retrofit deviceclass-constraints into CRUSH rules won't be trampled by the more recent Rook release that changes CRUSH rules.

We'd need to know where to find the necessary Rook artifacts.

@travisn (Member) commented Feb 1, 2024

@travisn Potentially. We have a lab cluster and my protege has been working with storageclasses to ensure that the manual work we did to correct the .mgr pool and to retrofit deviceclass-constraints into CRUSH rules won't be trampled by the more recent Rook release that changes CRUSH rules.

We'd need to know where to find the necessary Rook artifacts.

It may be simplest to check out this branch (or the master branch after this is merged) and then install the chart from the local directory, as we do during development. Thanks!
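
For example, one way to get the chart from this branch before it is merged (the local branch name here is arbitrary):

git clone https://github.com/rook/rook.git && cd rook
git fetch origin pull/13405/head:multiple-ec-storageclass
git checkout multiple-ec-storageclass
# then install deploy/charts/rook-ceph-cluster from this checkout, as in the testing section above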

@mohitrajain commented:
@travisn, I work with @anthonyeleven, and we are currently testing pool changes from v1.12.8 in our staging cluster. Once confirmed, I will proceed with upgrading to the latest version of Rook, v1.13.3, and then test with the current fork for defining multiple cephEcStorageClasses. This might happen towards the end of this week.

@travisn merged commit 2d74528 into rook:master on Feb 7, 2024
48 of 50 checks passed
@travisn (Member) commented Feb 7, 2024

@travisn, I work with @anthonyeleven, and we are currently testing pool changes from v1.12.8 in our staging cluster. Once confirmed, I will proceed with upgrading to the latest version of Rook, v1.13.3, and then test with the current fork for defining multiple cephEcStorageClasses. This might happen towards the end of this week.

Thanks, we will wait for confirmation before backporting this to the 1.13 release branch.

@Javlopez deleted the feature/Allow-defining-multiple-cephECStorageClass branch on February 9, 2024 17:14
Successfully merging this pull request may close issue #12772: Allow defining multiple cephECStorageClass and set any of them as default storageclass of the cluster.