Adjust Lifetime / wear levelling trigger #2
Comments
I realised today that your template already has a trigger for wear levelling, but it's called Lifetime and it's configured like so: So I think all I need to do is change this to:
or maybe
to make it alert sooner. Even 90% wear levelling is too late in my experience, so waiting for it to get below 10% would be much too late for a timely alert.
I was testing the wear levelling / Lifetime monitoring today with Zabbix 6.0 and a known bad (high wear levelling) Samsung SSD, with no luck. Two questions: (1) Does a disk need to be mounted for this template to work? I did mount my faulty disk but it still didn't trigger the Lifetime alert. (2) Do I need to "Unlink and clear" a template every time I change a macro or trigger?
I think this might be working under Zabbix 6 actually but I got the trigger prototype expression for Lifetime wrong. I think I should be using this:
The problem is that I now need to wait 12 hours to find out whether that is correct, because I don't know how to change the interval or manually trigger a re-check. I have inserted a Samsung SSD with a wear level value of 85, so it should cause this template to trigger even though it's not mounted, I presume. I have changed the item prototype for [{#SSDDISK} Wear Leveling Count] to use a 1hr interval, but that doesn't update it every hour. I suspect that's because the Discovery rule for this template is still set to a 12hr interval, and I've not yet worked out how to adjust that interval.
Good news! This template does indeed work fine for Zabbix 6.0 when using Samsung SSDs attached to a RAID controller running in HBA mode. This is what I've added to my Zabbix notes about this template:

This template is no use for monitoring Samsung SATA SSD based ZFS pools in its default configuration, because it doesn't alert until the wear levelling gets as low as 9%. A new disk starts at 100% levelling. In my experience, if just one disk in a ZFS pool gets to about 91% or 90% wear levelling, it can tank the performance of the whole pool, so we want to alert when a disk has used 8% of its levelling, i.e. 92% is the alert point.

This is how you configure the Samsung SSD Zabbix template to alert much sooner. Go to Configuration -> Templates and set the Lifetime trigger expression to:

last(/S.M.A.R.T. SSD Samsung/ssd.v177[{#SSDDISK}])<93

Then click Update. The trigger will now alert when the SSD's wear levelling count gets to 92% instead of the template's default of 9% (<10).

You may want to add something like that to the README if you don't want to change the default Lifetime trigger expression to something closer to mine?
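For anyone who wants to sanity check the two thresholds without waiting for Zabbix to poll, here is a rough Python sketch. It assumes that ssd.v177 reports the normalised VALUE column of SMART attribute 177 (Wear_Leveling_Count) as printed by `smartctl -A`, which I have not confirmed against the template. The sample line is made up to match the disk with a value of 85 mentioned earlier, and the function names are hypothetical.

```python
# Illustrative only: this row is invented to resemble `smartctl -A` output
# for a disk whose normalised wear levelling value is 85.
SAMPLE_SMARTCTL_LINE = (
    "177 Wear_Leveling_Count     0x0013   085   085   000    "
    "Pre-fail  Always       -       312"
)


def parse_wear_level(line):
    """Return the normalised VALUE column (4th field) of an attribute 177 row."""
    fields = line.split()
    if len(fields) < 4 or fields[0] != "177":
        raise ValueError("not a SMART attribute 177 row")
    return int(fields[3])


def lifetime_trigger_fires(value, threshold):
    """Mimic a trigger of the form last(...)<threshold (assumed semantics)."""
    return value < threshold


wear = parse_wear_level(SAMPLE_SMARTCTL_LINE)
for threshold in (10, 93):  # template default vs. the earlier alert point
    fires = lifetime_trigger_fires(wear, threshold)
    print(f"value={wear} expression=<{threshold} fires={fires}")
```

Under those assumptions, a disk at 85 stays silent with the default <10 expression and only alerts once the expression is changed to <93, which would explain why my known bad disk did not trigger anything at first.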
In my experience of using Samsung SSDs as members of ZFS pools, when one disk's wear levelling count gets to about 9 or 10%, that disk can bring the IO performance of the whole pool to its knees, so I'm going to create a trigger for my Samsung SSDs that fires when they exceed 7% wear levelling. It seems that this template doesn't include a trigger for wear levelling by default, so I'd like to see one added.
Thanks