Smart percentage_used alert should be clear that nvme device is reaching its end of life #60
Comments
We don't have an alert for NVMe in hw-observer right now. Did you submit this in the wrong place?
This is more of a feature request. I know this alert doesn't exist (yet) and I wanted to capture this requirement for when it is reimplemented in hw-observer.
Got it. This is a missing piece in hardware-observer.
Example to get the metrics:
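As a minimal sketch of one way to read the value, assuming smartmontools with JSON output (`smartctl -j`) is installed: the NVMe health log in that output carries a `percentage_used` field. The helper names and the device path below are illustrative and not part of hardware-observer.

```python
import json
import subprocess
from typing import Optional


def parse_percentage_used(smartctl_json: str) -> Optional[int]:
    """Extract percentage_used from `smartctl -j -a <device>` output.

    Returns None when the NVMe health log (or the field) is absent,
    e.g. for a non-NVMe device.
    """
    data = json.loads(smartctl_json)
    health_log = data.get("nvme_smart_health_information_log", {})
    return health_log.get("percentage_used")


def read_percentage_used(device: str) -> Optional[int]:
    """Run smartctl against a device (hypothetical path such as /dev/nvme0)."""
    # smartctl's exit status is a bitmask and can be non-zero even when
    # the JSON is valid, so the return code is deliberately not checked.
    result = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True,
        text=True,
    )
    return parse_percentage_used(result.stdout)
```

Calling `read_percentage_used("/dev/nvme0")` usually needs root privileges; `parse_percentage_used` can be exercised on saved output without touching a device.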
This is currently missing from node exporter.
We need to check whether https://github.com/prometheus-community/smartctl_exporter has this.
It's easy to misunderstand the percentage_used nvme alert. The hw-health charm config has a pretty good description:
The current alert reads as if the issue were a filesystem getting full, rather than the NVMe drive being close to the end of its life. E.g. Intel Optane drives start to throttle and report read-only mode after hitting 105% percentage_used, which means they've exceeded their expected lifetime.
Something like: "nvme drive is close to reaching its estimated lifetime" would help.
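To make the suggested wording concrete, here is a rough sketch of how an alert message could be chosen from the `percentage_used` value. The function name and the `warn_at` threshold of 80 are hypothetical, picked only for illustration; they are not taken from hw-health or hardware-observer.

```python
from typing import Optional


def nvme_wear_alert(device: str, percentage_used: int, warn_at: int = 80) -> Optional[str]:
    """Return an alert message about drive wear, or None below the threshold.

    percentage_used is the vendor's estimate of consumed drive life and may
    exceed 100. warn_at is an assumed threshold, not a recommended value.
    """
    if percentage_used < warn_at:
        return None
    if percentage_used >= 100:
        return (
            f"NVMe drive {device} has exceeded its estimated lifetime "
            f"(percentage_used={percentage_used}%)"
        )
    return (
        f"NVMe drive {device} is close to reaching its estimated lifetime "
        f"(percentage_used={percentage_used}%)"
    )
```

Naming the drive and the word "lifetime" in the message avoids any reading of the percentage as disk-space usage.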