Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

smartctl JSON keys are all wrong #42

Open
prologic opened this issue Apr 7, 2022 · 11 comments
Open

smartctl JSON keys are all wrong #42

prologic opened this issue Apr 7, 2022 · 11 comments

Comments

@prologic
Copy link

prologic commented Apr 7, 2022

I've fixed this exporter so that at least it runs without crashing in #41 -- But now I'm seeing that the output of the JSOn from smartctl --json --xall /dev/sdX is different to the keys the code is using in all of it's JSON queries.

What version of smartctl was this tested with?!

root@vz1:~# dpkg -l | grep smart
ii  libsmartcols1:amd64                  2.33.1-0.1                      amd64        smart column output alignment library
ii  smartmontools                        7.1-pve2                        amd64        control and monitor storage systems using S.M.A.R.T.
root@vz1:~# smartctl --version
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.3.18-2-pve] (local build)
@prologic
Copy link
Author

prologic commented Apr 7, 2022

For example I'm getting 0 values for all of smartctl_device_power_on_seconds

This is what's being logged:

smartctl_exporter[25705]: pot: gjson.Result{Type:0, Raw:"", Str:"", Num:0, Index:0}

@Enrico204
Copy link

Hi! I tested your #41 and it's working for me. I have smartctl 7.2 2020-12-30 r5155, and I can see some data in Prometheus. However I can't find the key smartctl_device_power_on_seconds in the JSON for my disk

@prologic
Copy link
Author

prologic commented Apr 8, 2022

Hi! I tested your #41 and it's working for me. I have smartctl 7.2 2020-12-30 r5155, and I can see some data in Prometheus. However I can't find the key smartctl_device_power_on_seconds in the JSON for my disk

That's because it doesn't exist.

I'm kind of tempted to rewrite this exporter :/ It makes far too many assumptions about the data it's expecting.

I'm getting some data now and I have a working Dashbaord, but all of the counters and gauges the exporter exposes (presumably trying to populate it from smartctl --json ... data) are basically all zero.

I really had to have a Dashboard gogether at all by grokking the actual attribute table (which is exposed directly by the exporter).

Screen Shot 2022-04-08 at 15 47 21

@lahwaacz
Copy link
Contributor

lahwaacz commented Apr 8, 2022

The smartctl_device_power_on_seconds is being logged here. It parses this field from smartctl --xall --json:

{
  ...
  "power_on_time": {
    "hours": 3521
  },
  ...
}

You can't rewrite the exporter without making any assumptions on what smartctl returns, since that is not standardized and differs per each disk and/or vendor. Which disk are you testing this with? 😉

@lahwaacz
Copy link
Contributor

lahwaacz commented Apr 8, 2022

Btw. the SMART attributes table is also parsed from JSON:

func (smart *SMARTctl) mineDeviceAttribute() {
for _, attribute := range smart.json.Get("ata_smart_attributes.table").Array() {
name := strings.TrimSpace(attribute.Get("name").String())
flagsShort := strings.TrimSpace(attribute.Get("flags.string").String())
flagsLong := smart.mineLongFlags(attribute.Get("flags"), []string{
"prefailure",
"updated_online",
"performance",
"error_rate",
"event_count",
"auto_keep",
})
id := attribute.Get("id").String()
for key, path := range map[string]string{
"value": "value",
"worst": "worst",
"thresh": "thresh",
"raw": "raw.value",
} {
smart.ch <- prometheus.MustNewConstMetric(
metricDeviceAttribute,
prometheus.GaugeValue,
attribute.Get(path).Float(),
smart.device.device,
smart.device.family,
smart.device.model,
smart.device.serial,
name,
flagsShort,
flagsLong,
key,
id,
)
}
}
}

@prologic
Copy link
Author

prologic commented Apr 8, 2022

As I said, the problem right now is that the top-level exported counters/gaguages are not being populated.

I believe this is because as you rightly pointed out, smartctl and disks are a little different on every system.

For example the following metric is zero (so are a bunch of other ones):

smartctl_device_bytes_read```

@lahwaacz
Copy link
Contributor

lahwaacz commented Apr 8, 2022

So what? It's zero for your disk, but arguably not for other disks. So far this seems like a complaint that this exporter does not behave well for your disk (and we still have no idea which model/vendor it actually is). Do you actually have a general improvement suggestion?

@prologic
Copy link
Author

prologic commented Apr 9, 2022

So what? It's zero for your disk, but arguably not for other disks. So far this seems like a complaint that this exporter does not behave well for your disk (and we still have no idea which model/vendor it actually is). Do you actually have a general improvement suggestion?

Da fuq?! Why the hostility? 😳 I'm actually trying to improve this exporter 🤦‍♂️

@lahwaacz
Copy link
Contributor

lahwaacz commented Apr 9, 2022

So again - what's your improvement suggestion?

@prologic
Copy link
Author

prologic commented Apr 9, 2022

@lahwaacz Jakub, dude, what is your problem? Why are you being so hostile?

I've literally spnet 100+hrs this week working in my day job, so I'm quite tired and exhausted. Right now let's focus on getting some outstanding PRs merged. That's probably the most impactful thing we can do here.

Who has the ability to review and merge PRs?

@NiceGuyIT
Copy link
Member

Hi @prologic. A lot has changed since this issue was opened. If there is a problem with the data, I would appreciate feedback on the troubleshooting document in #157. It (hopefully) explains how to map the JSON names to what you see in Prometheus and Grafana. Looking to hear back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants