
Fix storageclass selection #150

Merged
merged 5 commits into master on Nov 25, 2020

Conversation

csibbitt (Collaborator)

While testing compatibility with OpenShift Container Storage, I found that the existing settings for picking a storage class did not work. I manually tested the storageClassName element in its new location in the Alertmanager and Prometheus templates and it worked as expected, so I wrote this up to adjust.

* selector and class were in the wrong part of the Prometheus and Alertmanager templates (see the sketch after this list)
* They were missing from the Elasticsearch template
* The existing storage_resources var was doing nothing, so I'm removing it
  * No need to deprecate or migrate; if anyone used it, it had no effect
* Ran `operator-sdk generate bundle --channels stable,latest --default-channel stable --version 1.1.1`
* Adjusted CSV UI hints
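
For reference, a minimal sketch of where the two fields belong on the rendered Prometheus object (the Alertmanager object takes the same `spec.storage.volumeClaimTemplate` block); this assumes the prometheus-operator API, and the values shown are purely illustrative:

```yaml
# Sketch only: illustrative values, not the actual template output.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: default
spec:
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: ocs-storagecluster-ceph-rbd   # the "class" setting lands here
        selector:                                        # the "selector" setting lands here (only rendered when set)
          matchLabels:
            tier: fast                                   # hypothetical label
        resources:
          requests:
            storage: 20Gi
```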
@csibbitt (Collaborator Author) commented Nov 24, 2020

Not yet tested

@@ -55,9 +55,6 @@ spec:
        storageClass:
          description: Storage class name used for Alertmanager PVC
          type: string
        storageResources:
csibbitt (Collaborator Author):

This param was being used in the wrong part of the template, so it could never have had any effect. I think it's safe to simply remove it.

@@ -482,6 +481,8 @@ spec:
              value: service-telemetry-operator
            - name: ANSIBLE_GATHERING
              value: explicit
            - name: PROMETHEUS_WEBHOOK_SNMP_IMAGE
csibbitt (Collaborator Author):

This is not my change, but it came in as part of the bundle update. The bundle must not have been regenerated after this was added to build/operator.yaml.

Member:

Yeah, I noticed that I don't get the change unless I run it locally in the branch. I must have added it post-1.1 tag :(

It also didn't show up when I used `--output-dir`, but it worked when the output was in the local repo. Thanks for adding that!

@@ -1,6 +1,6 @@
 annotations:
   operators.operatorframework.io.bundle.channel.default.v1: stable
-  operators.operatorframework.io.bundle.channels.v1: latest
+  operators.operatorframework.io.bundle.channels.v1: stable,latest
csibbitt (Collaborator Author):

@leifmadsen I used the bundle update command you showed me. Is this the right channels setting? (I haven't looked this up yet, so I'm not 100% sure what it means.)

Member:

Oh right, sorry; actually you want `--channels latest` and `--default-channel stable`.

  • `--channels` is a list of channels this bundle can be a part of; we only want HEAD to be in the latest channel
  • `--default-channel` defines the bundle's default preferred channel; we want this to be stable so that we don't auto-install the latest channel
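
If I'm reading that right, the regenerate command from the description would then become something like this (same version flag as before; a sketch only, not verified against the repo's tooling):

```sh
operator-sdk generate bundle \
  --channels latest \
  --default-channel stable \
  --version 1.1.1
```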

@csibbitt (Collaborator Author) commented Nov 24, 2020

Removed the 20Gi to 20G change because it resulted in the following when used against an existing deployment:

# persistentvolumeclaims "elasticsearch-data-elasticsearch-es-default-0" was not valid:
# * spec.resources.requests.storage: Forbidden: field can not be less than previous value
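
For context, the field in that error is `spec.resources.requests.storage` on the claim itself; Kubernetes only lets it grow on an existing PVC (and only when the StorageClass supports expansion), and since 20G (10^9 bytes) is smaller than 20Gi (2^30 bytes), the change reads as a shrink and is rejected. A minimal sketch of the claim shape involved (name and size taken from the error above, access mode assumed):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-data-elasticsearch-es-default-0
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi   # may only be increased on an existing claim, never decreased
```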

@@ -4,7 +4,7 @@ LABEL operators.operatorframework.io.bundle.mediatype.v1=registry+v1
 LABEL operators.operatorframework.io.bundle.manifests.v1=manifests/
 LABEL operators.operatorframework.io.bundle.metadata.v1=metadata/
 LABEL operators.operatorframework.io.bundle.package.v1=service-telemetry-operator
-LABEL operators.operatorframework.io.bundle.channels.v1=latest
+LABEL operators.operatorframework.io.bundle.channels.v1=stable,latest
Member:

Should probably revert this back to just `latest`, since that's what the HEAD bundle should target.

* Not allowed to supply a blank storageClassName, and there is no natural default
* Even empty selectors cause failures on provisioners that do not support them (see the sketch after this list):
  * 'Failed to provision volume with StorageClass "standard": claim.Spec.Selector is not supported for dynamic provisioning on Cinder'
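
Since this appears to be an Ansible-based operator, the templates are Jinja2, so the guard presumably looks something like the sketch below; the variable names are hypothetical, not the actual template vars:

```yaml
# Hypothetical sketch: only emit the optional fields when they are actually set.
      volumeClaimTemplate:
        spec:
{% if storage_class is defined and storage_class | length > 0 %}
          storageClassName: {{ storage_class }}
{% endif %}
{% if storage_selector is defined and storage_selector %}
          selector: {{ storage_selector | to_json }}
{% endif %}
          resources:
            requests:
              storage: 20Gi
```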
@csibbitt (Collaborator Author)

Testing

  1. Started from Leif's already manually deployed STF (no infrawatch subscriptions to mess with things)

  2. Deleted the existing stf object and all PVCs: `oc delete stf default; oc delete pvc --all`

  3. Built my operator with build.sh

  4. Edited deploy/olm-catalog/service-telemetry-operator/manifests/service-telemetry-operator.clusterserviceversion.yaml to point to my new image and ran `oc apply -f <..sto-csv.yaml>`

  5. Deployed a default STF from the UI, all defaults except that I turned Elasticsearch on

     1. Verified that the "resources" param is removed and that all three had the "selector" and "class" params
  6. Verified the pvcs were created with the default storage class ("standard", Cinder-backed, non-OCS):

    $ oc get pvc
    NAME                                             STATUS       VOLUME                                     CAPACITY   ACCESS MODES       STORAGECLASS   AGE
    alertmanager-default-db-alertmanager-default-0   Bound        pvc-6b74e598-ff3a-4d1d-948f-94d193ae75f7   19Gi       RWO                standard       55s
    elasticsearch-data-elasticsearch-es-default-0    Bound        pvc-db3610dd-4c8a-426d-ac7b-08de2254035f   20Gi       RWO                standard       58s
    prometheus-default-db-prometheus-default-0       Bound        pvc-7054ce9b-4cc4-4368-9b7e-abc5e16d890d   19Gi       RWO                standard       68s
  7. Wiped it out: `oc delete stf default; oc delete pvc --all`

  8. Redeployed from the UI, this time selecting ocs-storagecluster-ceph-rbd from the pre-populated storageClass dropdown

  9. Verified the pvcs were created with the OCS storageclass:

    $ oc get pvc
    NAME                                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
    alertmanager-default-db-alertmanager-default-0   Bound    pvc-d568268e-1864-426d-9daa-4a69ade58862   19Gi       RWO            ocs-storagecluster-ceph-rbd   2m18s
    elasticsearch-data-elasticsearch-es-default-0    Bound    pvc-bba7664e-2e6a-4509-a292-c170d2041546   20Gi       RWO            ocs-storagecluster-ceph-rbd   45s
    prometheus-default-db-prometheus-default-0       Bound    pvc-d132e038-7279-4adc-81f9-cfe835d849fa   19Gi       RWO            ocs-storagecluster-ceph-rbd   56s

As of this patch, I'd say that STF supports OCS for persistent storage of metrics and events.

I did not test any scenarios involving changing the class of already provisioned storage.

I did not (really) test the selectors, but they appear to be in the right place now, since they fail when the provisioner does not support them.

@csibbitt csibbitt force-pushed the csibbitt-1176-fix-storageclass-selection branch from a4fad94 to b37660c on November 24, 2020 21:51

@csibbitt csibbitt merged commit 124b0a5 into master Nov 25, 2020
@csibbitt csibbitt deleted the csibbitt-1176-fix-storageclass-selection branch November 25, 2020 14:33