-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement volume health monitoring feature to detect abnormal volumes #819
base: master
Are you sure you want to change the base?
Conversation
For this PR, we need |
d0697c7
to
ce587bc
Compare
@@ -47,30 +47,34 @@ const ( | |||
// csiResizerImage = csi-resizer:v1.8.0 | |||
csiResizerImage = "csi-resizer@sha256:819f68a4daf75acec336302843f303cf360d4941249f9f5019ffbb690c8ac7c0" | |||
openshiftCSIResizerImage = "registry.redhat.io/openshift4/ose-csi-external-resizer-rhel8@sha256:837b32a0c432123e2605ad6d029e7f3b4489d9c52a9d272c7a133d41ad10db87" | |||
|
|||
// csiHealthMonitorImage = csi-external-health-monitor-controller:v0.10.0 | |||
csiHealthMonitorImage = "registry.k8s.io/sig-storage/csi-external-health-monitor-controller:v0.10.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Can we keep this image in
quay.io/minio
? - Use sha256 hash of the image
- Use the image from
registry.redhat.io
for openshift
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will use the quay.io/minio/
image after pushing to the registry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have any equivalent image for this available in redhat openshift
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have any equivalent image for this available in redhat openshift
Then we are stuck to certify DirectPV for Openshift certification. We would need to build csi-external-health-monitor-controller
our own to certify it for openshift
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have sent a PR https://github.com/minio/directpv/pull/859/files to use a script to download and release the image which can comply with RedHat image certification testcases.
} | ||
|
||
// ResetConditionIfMatches resets the condition values to default if the type, reason and message matches. | ||
func ResetConditionIfMatches(conditions []metav1.Condition, ctype string, reason, message, newReason string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not needed. Use UpdateCondition()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is required because we need to reset the condition to default only if all the condition value matches. Unlike UpdateCondition()
where only condition type is checked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't make any difference to use UpdateCondition()
. Only difference is it appends the type
if not found.
For private registries, please note that the following image is required for enabling volume health monitoring | ||
|
||
``` | ||
quay.io/minio/csi-external-health-monitor-controller:v0.10.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- You would need to update https://github.com/minio/directpv/blob/master/docs/tools/push-images.sh instead of documenting here
- You would need to update https://github.com/minio/directpv/blob/master/docs/command-reference.md and https://github.com/minio/directpv/blob/master/docs/installation.md#directpv-csi-driver-installation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Volume health monitoring is not enabled by default, so, for a fresh installation we don't need this image. Users need this image only when they enable volume health monitoring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users make common mistakes to miss pulling this image when enabling it. There is no problem to have unused image of csi-external-health-monitor-controller
. When health monitoring is enabled, it works seamlessly than failures.
docs/monitoring.md
Outdated
@@ -79,3 +79,21 @@ scrape_configs: | |||
action: replace | |||
target_label: kubernetes_name | |||
``` | |||
|
|||
# Volume health monitoring |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could have the doc like
# Monitoring volume health
DirectPV monitors volume health and reports abnormalities as event to PVCs and Pods. Currently volume mounts are monitored and more checks will be added later. This feature uses [Volume health monitoring CSI feature](https://kubernetes.io/docs/concepts/storage/volume-health-monitoring/).
To enable volume health monitoring, pass `--enable-volume-health-monitoring` flag to installation and `CSIVolumeHealth` feature gate must be enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have also added the following because this image is required only when volume health monitor is enabled.
For private registries, please note that the following image is required for enabling volume health monitoring
quay.io/minio/csi-external-health-monitor-controller:v0.10.0
062342b
to
2702abe
Compare
2702abe
to
190493f
Compare
This is a CSI feature which monitors the volume health, identifies and reports the abnormal volumes as events to the respective pods and PVCs.
Fixes #455
Steps to test :
NOTE: This feature requites CSIVolumeHealth feature gate to be enabled. For example, in minikube you can enable it like the following
However, the the external controller monitoring doesn't requite this feature gate. ie, you will still see "abnormal" events in your PVCs.
STEP 1: Enable CSIVolumeHealth feature gate
STEP 2: Install DirectPV with
--enable-volume-health-monitoring
flag.STEP 3: Discover and initialize drives
STEP 4: Deploy any workload
STEP 5: After the pods are up, umount a volume's container path or staging path.
STEP 6: Wait for few mins and describe the pods and PVCs to see the "abnormal" volume events.