fix(helm): update rook-ceph group ( v1.14.2 → v1.14.7 ) (patch) #27

Open
renovate[bot] wants to merge 1 commit into main from renovate/patch-rook-ceph

Conversation

renovate[bot]
Contributor

@renovate renovate bot commented May 4, 2024

Mend Renovate

This PR contains the following updates:

Package            Update  Change
rook-ceph          patch   v1.14.2 -> v1.14.7
rook-ceph-cluster  patch   v1.14.2 -> v1.14.7

Release Notes

rook/rook (rook-ceph)

v1.14.7

Compare Source

v1.14.6

Compare Source

What's Changed

v1.14.5

Compare Source

Improvements

Rook v1.14.5 is a patch release limited in scope and focusing on feature additions and bug fixes to the Ceph operator.

v1.14.4

Compare Source

Improvements

Rook v1.14.4 is a patch release limited in scope and focusing on feature additions and bug fixes to the Ceph operator.

v1.14.3

Compare Source

Improvements

Rook v1.14.3 is a patch release limited in scope and focusing on feature additions and bug fixes to the Ceph operator.


Configuration

📅 Schedule: Branch creation - "on saturday" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Mend Renovate. View repository job log here.


github-actions bot commented May 4, 2024

--- kubernetes/apps/rook-ceph/rook-ceph/cluster Kustomization: flux-system/rook-ceph-cluster HelmRelease: rook-ceph/rook-ceph-cluster

+++ kubernetes/apps/rook-ceph/rook-ceph/cluster Kustomization: flux-system/rook-ceph-cluster HelmRelease: rook-ceph/rook-ceph-cluster

@@ -13,13 +13,13 @@

     spec:
       chart: rook-ceph-cluster
       sourceRef:
         kind: HelmRepository
         name: rook-ceph
         namespace: flux-system
-      version: v1.14.2
+      version: v1.14.7
   dependsOn:
   - name: rook-ceph-operator
     namespace: rook-ceph
   install:
     remediation:
       retries: 3
--- kubernetes/apps/rook-ceph/rook-ceph/app Kustomization: flux-system/rook-ceph HelmRelease: rook-ceph/rook-ceph-operator

+++ kubernetes/apps/rook-ceph/rook-ceph/app Kustomization: flux-system/rook-ceph HelmRelease: rook-ceph/rook-ceph-operator

@@ -13,13 +13,13 @@

     spec:
       chart: rook-ceph
       sourceRef:
         kind: HelmRepository
         name: rook-ceph
         namespace: flux-system
-      version: v1.14.2
+      version: v1.14.7
   install:
     remediation:
       retries: 3
   interval: 30m
   timeout: 15m
   upgrade:
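
For context, a minimal sketch of what the rook-ceph-operator HelmRelease could look like once this bump lands, reconstructed from the operator hunk directly above; the apiVersion, metadata, and the contents of the truncated upgrade block are assumptions rather than part of the diff:

    apiVersion: helm.toolkit.fluxcd.io/v2   # assumption: the helm-controller API version is not shown in the diff
    kind: HelmRelease
    metadata:
      name: rook-ceph-operator              # assumption: inferred from the "HelmRelease: rook-ceph/rook-ceph-operator" header
      namespace: rook-ceph
    spec:
      interval: 30m
      timeout: 15m
      chart:
        spec:
          chart: rook-ceph
          version: v1.14.7                  # the only field this PR actually changes
          sourceRef:
            kind: HelmRepository
            name: rook-ceph
            namespace: flux-system
      install:
        remediation:
          retries: 3
      upgrade:
        remediation:
          retries: 3                        # assumption: the upgrade block is cut off in the diff above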


github-actions bot commented May 4, 2024

--- HelmRelease: rook-ceph/rook-ceph-cluster PrometheusRule: rook-ceph/prometheus-ceph-rules

+++ HelmRelease: rook-ceph/rook-ceph-cluster PrometheusRule: rook-ceph/prometheus-ceph-rules

@@ -261,13 +261,13 @@

         severity: warning
         type: ceph_default
     - alert: CephDeviceFailurePredictionTooHigh
       annotations:
         description: The device health module has determined that devices predicted
           to fail can not be remediated automatically, since too many OSDs would be
-          removed from the cluster to ensure performance and availabililty. Prevent
+          removed from the cluster to ensure performance and availability. Prevent
           data integrity issues by adding new OSDs so that data may be relocated.
         documentation: https://docs.ceph.com/en/latest/rados/operations/health-checks#device-health-toomany
         summary: Too many devices are predicted to fail, unable to resolve
       expr: ceph_health_detail{name="DEVICE_HEALTH_TOOMANY"} == 1
       for: 1m
       labels:
@@ -504,13 +504,13 @@

       expr: ceph_health_detail{name="PG_RECOVERY_FULL"} == 1
       for: 1m
       labels:
         oid: 1.3.6.1.4.1.50495.1.2.1.7.5
         severity: critical
         type: ceph_default
-    - alert: CephPGUnavilableBlockingIO
+    - alert: CephPGUnavailableBlockingIO
       annotations:
         description: Data availability is reduced, impacting the cluster's ability
           to service I/O. One or more placement groups (PGs) are in a state that blocks
           I/O.
         documentation: https://docs.ceph.com/en/latest/rados/operations/health-checks#pg-availability
         summary: PG is unavailable, blocking I/O
@@ -626,15 +626,15 @@

       labels:
         oid: 1.3.6.1.4.1.50495.1.2.1.8.3
         severity: warning
         type: ceph_default
     - alert: CephNodeNetworkBondDegraded
       annotations:
-        summary: Degraded Bond on Node {{ $labels.instance }}
         description: Bond {{ $labels.master }} is degraded on Node {{ $labels.instance
           }}.
+        summary: Degraded Bond on Node {{ $labels.instance }}
       expr: |
         node_bonding_slaves - node_bonding_active != 0
       labels:
         severity: warning
         type: ceph_default
     - alert: CephNodeDiskspaceWarning
@@ -662,12 +662,23 @@

         > 0))  )
       labels:
         severity: warning
         type: ceph_default
   - name: pools
     rules:
+    - alert: CephPoolGrowthWarning
+      annotations:
+        description: Pool '{{ $labels.name }}' will be full in less than 5 days assuming
+          the average fill-up rate of the past 48 hours.
+        summary: Pool growth rate may soon exceed capacity
+      expr: (predict_linear(ceph_pool_percent_used[2d], 3600 * 24 * 5) * on(pool_id,
+        instance, pod) group_right() ceph_pool_metadata) >= 95
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.9.2
+        severity: warning
+        type: ceph_default
     - alert: CephPoolBackfillFull
       annotations:
         description: A pool is approaching the near full threshold, which will prevent
           recovery/backfill operations from completing. Consider adding more capacity.
         summary: Free space in a pool is too low for recovery/backfill
       expr: ceph_health_detail{name="POOL_BACKFILLFULL"} > 0
@@ -718,22 +729,113 @@

       expr: ceph_healthcheck_slow_ops > 0
       for: 30s
       labels:
         severity: warning
         type: ceph_default
     - alert: CephDaemonSlowOps
-      for: 30s
-      expr: ceph_daemon_health_metrics{type="SLOW_OPS"} > 0
-      labels:
-        severity: warning
-        type: ceph_default
-      annotations:
-        summary: '{{ $labels.ceph_daemon }} operations are slow to complete'
+      annotations:
         description: '{{ $labels.ceph_daemon }} operations are taking too long to
           process (complaint time exceeded)'
         documentation: https://docs.ceph.com/en/latest/rados/operations/health-checks#slow-ops
+        summary: '{{ $labels.ceph_daemon }} operations are slow to complete'
+      expr: ceph_daemon_health_metrics{type="SLOW_OPS"} > 0
+      for: 30s
+      labels:
+        severity: warning
+        type: ceph_default
+  - name: hardware
+    rules:
+    - alert: HardwareStorageError
+      annotations:
+        description: Some storage devices are in error. Check `ceph health detail`.
+        summary: Storage devices error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_STORAGE"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.1
+        severity: critical
+        type: ceph_default
+    - alert: HardwareMemoryError
+      annotations:
+        description: DIMM error(s) detected. Check `ceph health detail`.
+        summary: DIMM error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_MEMORY"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.2
+        severity: critical
+        type: ceph_default
+    - alert: HardwareProcessorError
+      annotations:
+        description: Processor error(s) detected. Check `ceph health detail`.
+        summary: Processor error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_PROCESSOR"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.3
+        severity: critical
+        type: ceph_default
+    - alert: HardwareNetworkError
+      annotations:
+        description: Network error(s) detected. Check `ceph health detail`.
+        summary: Network error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_NETWORK"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.4
+        severity: critical
+        type: ceph_default
+    - alert: HardwarePowerError
+      annotations:
+        description: Power supply error(s) detected. Check `ceph health detail`.
+        summary: Power supply error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_POWER"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.5
+        severity: critical
+        type: ceph_default
+    - alert: HardwareFanError
+      annotations:
+        description: Fan error(s) detected. Check `ceph health detail`.
+        summary: Fan error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_FANS"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.6
+        severity: critical
+        type: ceph_default
+  - name: PrometheusServer
+    rules:
+    - alert: PrometheusJobMissing
+      annotations:
+        description: The prometheus job that scrapes from Ceph MGR is no longer defined,
+          this will effectively mean you'll have no metrics or alerts for the cluster.  Please
+          review the job definitions in the prometheus.yml file of the prometheus
+          instance.
+        summary: The scrape job for Ceph MGR is missing from Prometheus
+      expr: absent(up{job="rook-ceph-mgr"})
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.12.1
+        severity: critical
+        type: ceph_default
+    - alert: PrometheusJobExporterMissing
+      annotations:
+        description: The prometheus job that scrapes from Ceph Exporter is no longer
+          defined, this will effectively mean you'll have no metrics or alerts for
+          the cluster.  Please review the job definitions in the prometheus.yml file
+          of the prometheus instance.
+        summary: The scrape job for Ceph Exporter is missing from Prometheus
+      expr: sum(absent(up{job="rook-ceph-exporter"})) and sum(ceph_osd_metadata{ceph_version=~"^ceph
+        version (1[89]|[2-9][0-9]).*"}) > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.12.1
+        severity: critical
+        type: ceph_default
   - name: rados
     rules:
     - alert: CephObjectMissing
       annotations:
         description: The latest version of a RADOS object can not be found, even though
           all OSDs are up. I/O requests for this object from clients will block (hang).
@@ -760,7 +862,218 @@

       expr: ceph_health_detail{name="RECENT_CRASH"} == 1
       for: 1m
       labels:
         oid: 1.3.6.1.4.1.50495.1.2.1.1.2
         severity: critical
         type: ceph_default
+  - name: rbdmirror
+    rules:
+    - alert: CephRBDMirrorImagesPerDaemonHigh
+      annotations:
+        description: Number of image replications per daemon is not supposed to go
+          beyond threshold 100
+        summary: Number of image replications are now above 100
+      expr: sum by (ceph_daemon, namespace) (ceph_rbd_mirror_snapshot_image_snapshots)
+        > 100
+      for: 1m
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.10.2
+        severity: critical
+        type: ceph_default
+    - alert: CephRBDMirrorImagesNotInSync
+      annotations:
+        description: Both local and remote RBD mirror images should be in sync.
+        summary: Some of the RBD mirror images are not in sync with the remote counter
+          parts.
+      expr: sum by (ceph_daemon, image, namespace, pool) (topk by (ceph_daemon, image,
+        namespace, pool) (1, ceph_rbd_mirror_snapshot_image_local_timestamp) - topk
+        by (ceph_daemon, image, namespace, pool) (1, ceph_rbd_mirror_snapshot_image_remote_timestamp))
+        != 0
+      for: 1m
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.10.3
+        severity: critical
+        type: ceph_default
+    - alert: CephRBDMirrorImagesNotInSyncVeryHigh
+      annotations:
+        description: More than 10% of the images have synchronization problems
+        summary: Number of unsynchronized images are very high.
+      expr: count by (ceph_daemon) ((topk by (ceph_daemon, image, namespace, pool)
+        (1, ceph_rbd_mirror_snapshot_image_local_timestamp) - topk by (ceph_daemon,
+        image, namespace, pool) (1, ceph_rbd_mirror_snapshot_image_remote_timestamp))
+        != 0) > (sum by (ceph_daemon) (ceph_rbd_mirror_snapshot_snapshots)*.1)
+      for: 1m
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.10.4
+        severity: critical
+        type: ceph_default
+    - alert: CephRBDMirrorImageTransferBandwidthHigh
+      annotations:
[Diff truncated by flux-local]
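
The PrometheusRule above ships with the rook-ceph-cluster chart and is only rendered when monitoring is enabled in the release values. A hedged sketch of the relevant values, assuming the v1.14 chart exposes them under a top-level monitoring key (verify against the chart's values.yaml):

    # Sketch only: key names assumed from the rook-ceph-cluster chart's monitoring section
    monitoring:
      enabled: true                  # enable Prometheus monitoring (requires the Prometheus Operator CRDs)
      createPrometheusRules: true    # render the prometheus-ceph-rules PrometheusRule diffed above
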
--- HelmRelease: rook-ceph/rook-ceph-operator ConfigMap: rook-ceph/rook-ceph-operator-config

+++ HelmRelease: rook-ceph/rook-ceph-operator ConfigMap: rook-ceph/rook-ceph-operator-config

@@ -27,17 +27,17 @@

   CSI_PROVISIONER_PRIORITY_CLASSNAME: system-cluster-critical
   CSI_RBD_FSGROUPPOLICY: File
   CSI_CEPHFS_FSGROUPPOLICY: File
   CSI_NFS_FSGROUPPOLICY: File
   CSI_CEPHFS_KERNEL_MOUNT_OPTIONS: ms_mode=prefer-crc
   ROOK_CSI_CEPH_IMAGE: quay.io/cephcsi/cephcsi:v3.11.0
-  ROOK_CSI_REGISTRAR_IMAGE: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0
-  ROOK_CSI_PROVISIONER_IMAGE: registry.k8s.io/sig-storage/csi-provisioner:v4.0.0
-  ROOK_CSI_SNAPSHOTTER_IMAGE: registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1
-  ROOK_CSI_ATTACHER_IMAGE: registry.k8s.io/sig-storage/csi-attacher:v4.5.0
-  ROOK_CSI_RESIZER_IMAGE: registry.k8s.io/sig-storage/csi-resizer:v1.10.0
+  ROOK_CSI_REGISTRAR_IMAGE: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.1
+  ROOK_CSI_PROVISIONER_IMAGE: registry.k8s.io/sig-storage/csi-provisioner:v4.0.1
+  ROOK_CSI_SNAPSHOTTER_IMAGE: registry.k8s.io/sig-storage/csi-snapshotter:v7.0.2
+  ROOK_CSI_ATTACHER_IMAGE: registry.k8s.io/sig-storage/csi-attacher:v4.5.1
+  ROOK_CSI_RESIZER_IMAGE: registry.k8s.io/sig-storage/csi-resizer:v1.10.1
   ROOK_CSI_IMAGE_PULL_POLICY: IfNotPresent
   CSI_ENABLE_CSIADDONS: 'false'
   ROOK_CSIADDONS_IMAGE: quay.io/csiaddons/k8s-sidecar:v0.8.0
   CSI_ENABLE_TOPOLOGY: 'false'
   ROOK_CSI_ENABLE_NFS: 'false'
   CSI_ENABLE_LIVENESS: 'true'
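
The ROOK_CSI_*_IMAGE entries above are the operator chart's built-in defaults for the CSI sidecars, which is why they move together with the chart version. If a cluster needs to hold them at specific tags instead, they could be pinned through the operator HelmRelease values; a sketch under the assumption that the v1.14 chart accepts full image references under csi.<sidecar>.image (verify against the chart's values.yaml):

    values:
      csi:
        # Assumed value keys; the chart maps these into ROOK_CSI_*_IMAGE in rook-ceph-operator-config
        registrar:
          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.1
        provisioner:
          image: registry.k8s.io/sig-storage/csi-provisioner:v4.0.1
        snapshotter:
          image: registry.k8s.io/sig-storage/csi-snapshotter:v7.0.2
        attacher:
          image: registry.k8s.io/sig-storage/csi-attacher:v4.5.1
        resizer:
          image: registry.k8s.io/sig-storage/csi-resizer:v1.10.1
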
--- HelmRelease: rook-ceph/rook-ceph-operator Deployment: rook-ceph/rook-ceph-operator

+++ HelmRelease: rook-ceph/rook-ceph-operator Deployment: rook-ceph/rook-ceph-operator

@@ -26,13 +26,13 @@

       - effect: NoExecute
         key: node.kubernetes.io/unreachable
         operator: Exists
         tolerationSeconds: 5
       containers:
       - name: rook-ceph-operator
-        image: rook/ceph:v1.14.2
+        image: rook/ceph:v1.14.7
         imagePullPolicy: IfNotPresent
         args:
         - ceph
         - operator
         securityContext:
           capabilities:

renovate bot changed the title from "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.3 ) (patch)" to "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.4 ) (patch)" on May 17, 2024
renovate bot force-pushed the renovate/patch-rook-ceph branch from d8c147b to aac2279 on May 17, 2024 at 05:11
renovate bot changed the title from "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.4 ) (patch)" to "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.5 ) (patch)" on May 31, 2024
renovate bot force-pushed the renovate/patch-rook-ceph branch from aac2279 to edda4bf on May 31, 2024 at 01:31
renovate bot force-pushed the renovate/patch-rook-ceph branch from edda4bf to 68b9c45 on June 13, 2024 at 23:31
renovate bot changed the title from "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.5 ) (patch)" to "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.6 ) (patch)" on Jun 13, 2024
renovate bot force-pushed the renovate/patch-rook-ceph branch from 68b9c45 to 33e0462 on June 21, 2024 at 18:56
renovate bot changed the title from "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.6 ) (patch)" to "fix(helm): update rook-ceph group ( v1.14.2 → v1.14.7 ) (patch)" on Jun 21, 2024