Add NVMe support #32
Comments
Good idea. Could you please share a |
Of course.
|
Here's the
And another one:
|
The SMART attributes of an NVMe drive can be read from log identifier 02h on the device.
Attributes worth checking for NVMe devices
Critical Warning
Bit 0: If set to ‘1’, then the available spare capacity has fallen below the threshold
So, to my current understanding, a value of
Any hint in the right direction to understand how the bits are actually set and how they make up the final value would be much appreciated!
Update: Yes! It seems I found it in the smartmontools source code: https://github.com/smartmontools/smartmontools/blob/e3fdde7aff4cd069e629ee987bf33ac8ccd621ad/smartmontools/nvmeprint.cpp#L300
These are the possible values for attribute
But what I still don't understand is what happens if multiple errors occur at the same time, e.g. available spare (0x01) and temperature threshold (0x02). Would that result in 0x03? I have not seen an example like this anywhere. According to the source code, smartctl itself will already report a failure in the self-assessment check (step 1 in check_smart). In that case we could skip this attribute and focus on the other ones with performance data.
Available Spare
This means that as soon as the value is less than 100%, the device is slowly wearing out. This is an important indicator of when a device will likely be "too old/too used" and need to be replaced.
Percentage Used
Not sure yet if this should be counted in.
Media and Data Integrity Errors
Probably the most important attribute to check. Similar to "bad sectors" on a hard drive.
Error Information Log Entries
Not sure yet, but this could be a helpful hint for spotting increasing issues on a device.
Performance data to be collected
All attributes except "Critical Warning"
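On the question of several warnings at the same time: Critical Warning is a bit field, so each condition sets its own bit and the reported value is the bitwise OR of all active bits. Available spare (0x01) plus temperature threshold (0x02) would therefore read 0x03. Below is a minimal sketch of how the value could be decoded. The function name and the message texts are my own assumptions for illustration, not code from check_smart.pl or smartmontools; the bit positions 0 to 4 follow the NVMe SMART / Health Information log (02h).

```python
# Hypothetical helper, not taken from check_smart.pl or smartmontools.
# Bits 0-4 of the Critical Warning byte in the NVMe SMART / Health log (02h).
CRITICAL_WARNING_BITS = {
    0x01: "available spare capacity below threshold",
    0x02: "temperature outside threshold",
    0x04: "NVM subsystem reliability degraded",
    0x08: "media placed in read-only mode",
    0x10: "volatile memory backup device failed",
}


def decode_critical_warning(value):
    """Return the message for every bit that is set in the given value."""
    return [text for bit, text in CRITICAL_WARNING_BITS.items() if value & bit]


if __name__ == "__main__":
    # 0x03 has bit 0 and bit 1 set, so two messages are printed.
    for message in decode_critical_warning(0x03):
        print(message)
```

A value of 0x00 returns an empty list, which matches the "no warning" case.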
|
Hi, is there any news on this? Can I help with something? |
@roben I have the code "in my mind" already, but I need a system with an NVMe drive to test on. Would anyone be willing to give me remote access to a system that has one? Contact me on https://www.claudiokuenzler.com/about/. |
Sorry, I only have company servers available, and I can't provide access to those. I stumbled upon this, though: https://github.com/thomas-krenn/check_smart_attributes#NVMedevices |
Working on it. Someone gave me remote access to a server with NVMe. |
@Rohlik @roben |
Thanks! It looks good:
with
It's hard to test for the faulty drive case, though, because they are all working fine. |
@roben Thanks for testing. I just pushed another important change (regex adjusted). Can you please test again with the newest version from the nvme branch: https://raw.githubusercontent.com/Napsty/check_smart/nvme/check_smart.pl Please also run a check on a single NVMe drive if you can, to see whether the performance data appears correctly (it worked on the server I got access to). |
Here's the single device check:
The output for the multi device check was the same as above. |
NVMe support officially released with 6.7.0. |
Hello,
The NVMe interface type is not currently supported, but NVMe drives are becoming popular.
My suggestion is to add an nvme option to the -i parameter.