fix(helm): update rook-ceph group ( v1.14.5 → v1.14.8 ) (patch) #304

Open
wants to merge 1 commit into main
Conversation

parsec-renovate[bot]
Contributor

parsec-renovate bot commented Jun 13, 2024

This PR contains the following updates:

Package             Update   Change
rook-ceph           patch    v1.14.5 -> v1.14.8
rook-ceph-cluster   patch    v1.14.5 -> v1.14.8

Release Notes

rook/rook (rook-ceph)

v1.14.8

Compare Source

Improvements

Rook v1.14.8 is a patch release limited in scope, focusing on feature additions and bug fixes to the Ceph operator.

v1.14.7

Compare Source

What's Changed

monitoring: fix CephPoolGrowthWarning expression (#​14346, @​matofeder)
monitoring: Set honor labels on the service monitor (#​14339, @​travisn)

Full Changelog: rook/rook@v1.14.6...v1.14.7
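
For context on the honor-labels change listed above (#14339): setting honorLabels tells Prometheus to keep the labels exposed by the scraped target when they collide with labels Prometheus would otherwise attach. A minimal sketch of a ServiceMonitor with that setting enabled is shown below; the selector labels and port name are assumptions for illustration, not values taken from this PR.

# Sketch only: illustrates the honorLabels setting referenced in #14339.
# The selector labels and port name are assumptions, not values from this PR.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rook-ceph-mgr
  namespace: rook-ceph
spec:
  namespaceSelector:
    matchNames:
      - rook-ceph
  selector:
    matchLabels:
      app: rook-ceph-mgr
  endpoints:
    - port: http-metrics
      path: /metrics
      interval: 30s
      honorLabels: true  # keep the target's labels (e.g. ceph_daemon) on label conflicts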

v1.14.6

Compare Source

What's Changed

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.


github-actions bot commented Jun 13, 2024

--- kubernetes/apps/rook-ceph/rook-ceph/cluster Kustomization: flux-system/rook-ceph-cluster HelmRelease: rook-ceph/rook-ceph-cluster

+++ kubernetes/apps/rook-ceph/rook-ceph/cluster Kustomization: flux-system/rook-ceph-cluster HelmRelease: rook-ceph/rook-ceph-cluster

@@ -13,13 +13,13 @@

     spec:
       chart: rook-ceph-cluster
       sourceRef:
         kind: HelmRepository
         name: rook-ceph
         namespace: flux-system
-      version: v1.14.5
+      version: v1.14.8
   dependsOn:
   - name: rook-ceph-operator
     namespace: rook-ceph
   - name: snapshot-controller
     namespace: storage
   install:
--- kubernetes/apps/rook-ceph/rook-ceph/app Kustomization: flux-system/rook-ceph HelmRelease: rook-ceph/rook-ceph-operator

+++ kubernetes/apps/rook-ceph/rook-ceph/app Kustomization: flux-system/rook-ceph HelmRelease: rook-ceph/rook-ceph-operator

@@ -13,13 +13,13 @@

     spec:
       chart: rook-ceph
       sourceRef:
         kind: HelmRepository
         name: rook-ceph
         namespace: flux-system
-      version: v1.14.5
+      version: v1.14.8
   dependsOn:
   - name: snapshot-controller
     namespace: storage
   install:
     remediation:
       retries: 3
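
Both HelmRelease diffs above bump only the chart version; the chart itself is resolved through the sourceRef (kind: HelmRepository, name: rook-ceph, namespace: flux-system). For reference, a minimal sketch of such a HelmRepository is shown below; the url and interval are assumptions based on the upstream Rook chart repository, and the repository's actual manifest is not part of this diff.

# Sketch only: the HelmRepository that the sourceRef above points at.
# The url and interval are assumptions; the actual manifest is not shown in this PR.
apiVersion: source.toolkit.fluxcd.io/v1beta2  # promoted to v1 on newer Flux releases
kind: HelmRepository
metadata:
  name: rook-ceph
  namespace: flux-system
spec:
  interval: 1h
  url: https://charts.rook.io/release  # upstream repo publishing the rook-ceph v1.14.x charts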


github-actions bot commented Jun 13, 2024

--- HelmRelease: rook-ceph/rook-ceph-cluster PrometheusRule: rook-ceph/prometheus-ceph-rules

+++ HelmRelease: rook-ceph/rook-ceph-cluster PrometheusRule: rook-ceph/prometheus-ceph-rules

@@ -261,13 +261,13 @@

         severity: warning
         type: ceph_default
     - alert: CephDeviceFailurePredictionTooHigh
       annotations:
         description: The device health module has determined that devices predicted
           to fail can not be remediated automatically, since too many OSDs would be
-          removed from the cluster to ensure performance and availabililty. Prevent
+          removed from the cluster to ensure performance and availability. Prevent
           data integrity issues by adding new OSDs so that data may be relocated.
         documentation: https://docs.ceph.com/en/latest/rados/operations/health-checks#device-health-toomany
         summary: Too many devices are predicted to fail, unable to resolve
       expr: ceph_health_detail{name="DEVICE_HEALTH_TOOMANY"} == 1
       for: 1m
       labels:
@@ -504,13 +504,13 @@

       expr: ceph_health_detail{name="PG_RECOVERY_FULL"} == 1
       for: 1m
       labels:
         oid: 1.3.6.1.4.1.50495.1.2.1.7.5
         severity: critical
         type: ceph_default
-    - alert: CephPGUnavilableBlockingIO
+    - alert: CephPGUnavailableBlockingIO
       annotations:
         description: Data availability is reduced, impacting the cluster's ability
           to service I/O. One or more placement groups (PGs) are in a state that blocks
           I/O.
         documentation: https://docs.ceph.com/en/latest/rados/operations/health-checks#pg-availability
         summary: PG is unavailable, blocking I/O
@@ -626,15 +626,15 @@

       labels:
         oid: 1.3.6.1.4.1.50495.1.2.1.8.3
         severity: warning
         type: ceph_default
     - alert: CephNodeNetworkBondDegraded
       annotations:
-        summary: Degraded Bond on Node {{ $labels.instance }}
         description: Bond {{ $labels.master }} is degraded on Node {{ $labels.instance
           }}.
+        summary: Degraded Bond on Node {{ $labels.instance }}
       expr: |
         node_bonding_slaves - node_bonding_active != 0
       labels:
         severity: warning
         type: ceph_default
     - alert: CephNodeDiskspaceWarning
@@ -662,12 +662,23 @@

         > 0))  )
       labels:
         severity: warning
         type: ceph_default
   - name: pools
     rules:
+    - alert: CephPoolGrowthWarning
+      annotations:
+        description: Pool '{{ $labels.name }}' will be full in less than 5 days assuming
+          the average fill-up rate of the past 48 hours.
+        summary: Pool growth rate may soon exceed capacity
+      expr: (predict_linear(ceph_pool_percent_used[2d], 3600 * 24 * 5) * on(pool_id,
+        instance, pod) group_right() ceph_pool_metadata) >= 95
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.9.2
+        severity: warning
+        type: ceph_default
     - alert: CephPoolBackfillFull
       annotations:
         description: A pool is approaching the near full threshold, which will prevent
           recovery/backfill operations from completing. Consider adding more capacity.
         summary: Free space in a pool is too low for recovery/backfill
       expr: ceph_health_detail{name="POOL_BACKFILLFULL"} > 0
@@ -718,22 +729,113 @@

       expr: ceph_healthcheck_slow_ops > 0
       for: 30s
       labels:
         severity: warning
         type: ceph_default
     - alert: CephDaemonSlowOps
-      for: 30s
-      expr: ceph_daemon_health_metrics{type="SLOW_OPS"} > 0
-      labels:
-        severity: warning
-        type: ceph_default
-      annotations:
-        summary: '{{ $labels.ceph_daemon }} operations are slow to complete'
+      annotations:
         description: '{{ $labels.ceph_daemon }} operations are taking too long to
           process (complaint time exceeded)'
         documentation: https://docs.ceph.com/en/latest/rados/operations/health-checks#slow-ops
+        summary: '{{ $labels.ceph_daemon }} operations are slow to complete'
+      expr: ceph_daemon_health_metrics{type="SLOW_OPS"} > 0
+      for: 30s
+      labels:
+        severity: warning
+        type: ceph_default
+  - name: hardware
+    rules:
+    - alert: HardwareStorageError
+      annotations:
+        description: Some storage devices are in error. Check `ceph health detail`.
+        summary: Storage devices error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_STORAGE"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.1
+        severity: critical
+        type: ceph_default
+    - alert: HardwareMemoryError
+      annotations:
+        description: DIMM error(s) detected. Check `ceph health detail`.
+        summary: DIMM error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_MEMORY"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.2
+        severity: critical
+        type: ceph_default
+    - alert: HardwareProcessorError
+      annotations:
+        description: Processor error(s) detected. Check `ceph health detail`.
+        summary: Processor error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_PROCESSOR"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.3
+        severity: critical
+        type: ceph_default
+    - alert: HardwareNetworkError
+      annotations:
+        description: Network error(s) detected. Check `ceph health detail`.
+        summary: Network error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_NETWORK"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.4
+        severity: critical
+        type: ceph_default
+    - alert: HardwarePowerError
+      annotations:
+        description: Power supply error(s) detected. Check `ceph health detail`.
+        summary: Power supply error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_POWER"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.5
+        severity: critical
+        type: ceph_default
+    - alert: HardwareFanError
+      annotations:
+        description: Fan error(s) detected. Check `ceph health detail`.
+        summary: Fan error(s) detected
+      expr: ceph_health_detail{name="HARDWARE_FANS"} > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.13.6
+        severity: critical
+        type: ceph_default
+  - name: PrometheusServer
+    rules:
+    - alert: PrometheusJobMissing
+      annotations:
+        description: The prometheus job that scrapes from Ceph MGR is no longer defined,
+          this will effectively mean you'll have no metrics or alerts for the cluster.  Please
+          review the job definitions in the prometheus.yml file of the prometheus
+          instance.
+        summary: The scrape job for Ceph MGR is missing from Prometheus
+      expr: absent(up{job="rook-ceph-mgr"})
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.12.1
+        severity: critical
+        type: ceph_default
+    - alert: PrometheusJobExporterMissing
+      annotations:
+        description: The prometheus job that scrapes from Ceph Exporter is no longer
+          defined, this will effectively mean you'll have no metrics or alerts for
+          the cluster.  Please review the job definitions in the prometheus.yml file
+          of the prometheus instance.
+        summary: The scrape job for Ceph Exporter is missing from Prometheus
+      expr: sum(absent(up{job="rook-ceph-exporter"})) and sum(ceph_osd_metadata{ceph_version=~"^ceph
+        version (1[89]|[2-9][0-9]).*"}) > 0
+      for: 30s
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.12.1
+        severity: critical
+        type: ceph_default
   - name: rados
     rules:
     - alert: CephObjectMissing
       annotations:
         description: The latest version of a RADOS object can not be found, even though
           all OSDs are up. I/O requests for this object from clients will block (hang).
@@ -760,7 +862,218 @@

       expr: ceph_health_detail{name="RECENT_CRASH"} == 1
       for: 1m
       labels:
         oid: 1.3.6.1.4.1.50495.1.2.1.1.2
         severity: critical
         type: ceph_default
+  - name: rbdmirror
+    rules:
+    - alert: CephRBDMirrorImagesPerDaemonHigh
+      annotations:
+        description: Number of image replications per daemon is not supposed to go
+          beyond threshold 100
+        summary: Number of image replications are now above 100
+      expr: sum by (ceph_daemon, namespace) (ceph_rbd_mirror_snapshot_image_snapshots)
+        > 100
+      for: 1m
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.10.2
+        severity: critical
+        type: ceph_default
+    - alert: CephRBDMirrorImagesNotInSync
+      annotations:
+        description: Both local and remote RBD mirror images should be in sync.
+        summary: Some of the RBD mirror images are not in sync with the remote counter
+          parts.
+      expr: sum by (ceph_daemon, image, namespace, pool) (topk by (ceph_daemon, image,
+        namespace, pool) (1, ceph_rbd_mirror_snapshot_image_local_timestamp) - topk
+        by (ceph_daemon, image, namespace, pool) (1, ceph_rbd_mirror_snapshot_image_remote_timestamp))
+        != 0
+      for: 1m
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.10.3
+        severity: critical
+        type: ceph_default
+    - alert: CephRBDMirrorImagesNotInSyncVeryHigh
+      annotations:
+        description: More than 10% of the images have synchronization problems
+        summary: Number of unsynchronized images are very high.
+      expr: count by (ceph_daemon) ((topk by (ceph_daemon, image, namespace, pool)
+        (1, ceph_rbd_mirror_snapshot_image_local_timestamp) - topk by (ceph_daemon,
+        image, namespace, pool) (1, ceph_rbd_mirror_snapshot_image_remote_timestamp))
+        != 0) > (sum by (ceph_daemon) (ceph_rbd_mirror_snapshot_snapshots)*.1)
+      for: 1m
+      labels:
+        oid: 1.3.6.1.4.1.50495.1.2.1.10.4
+        severity: critical
+        type: ceph_default
+    - alert: CephRBDMirrorImageTransferBandwidthHigh
+      annotations:
[Diff truncated by flux-local]
--- HelmRelease: rook-ceph/rook-ceph-operator Deployment: rook-ceph/rook-ceph-operator

+++ HelmRelease: rook-ceph/rook-ceph-operator Deployment: rook-ceph/rook-ceph-operator

@@ -26,13 +26,13 @@

       - effect: NoExecute
         key: node.kubernetes.io/unreachable
         operator: Exists
         tolerationSeconds: 5
       containers:
       - name: rook-ceph-operator
-        image: rook/ceph:v1.14.5
+        image: rook/ceph:v1.14.8
         imagePullPolicy: IfNotPresent
         args:
         - ceph
         - operator
         securityContext:
           capabilities:

parsec-renovate bot changed the title fix(helm): update rook-ceph group ( v1.14.5 → v1.14.6 ) (patch) fix(helm): update rook-ceph group ( v1.14.5 → v1.14.7 ) (patch) Jun 21, 2024
parsec-renovate bot changed the title fix(helm): update rook-ceph group ( v1.14.5 → v1.14.7 ) (patch) fix(helm): update rook-ceph group ( v1.14.5 → v1.14.8 ) (patch) Jul 3, 2024