Add disk UUIDs as labels. #304
Comments
To answer my own question, UUIDs seem to be persistent as they are generated from device metadata. Ref: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/persistent_naming-uuid_and_others.html |
👍 I think this would be useful. The difficulty is getting the UUIDs based on the name of the device read from /proc/diskstats. |
Since the UUIDs are generated from metadata, I guess it's just a matter of figuring out how they are generated and doing the same? |
Sorta, it's complicated. Many of the UUIDs come from reading the filesystem metadata, not from generating them programmatically. For example:
|
Investigating this further shows that you can in fact manipulate the UUIDs. This would break any logic in node_exporter if it were to auto-generate them. I guess the easiest thing to do would be to read /dev/disk/by-uuid, but that is not optimal. For example, my zfs array is not visible there on my storage server. blkid, however, manages to fetch everything. Perhaps there is something in the blkid source worth looking at? |
It wouldn't be appropriate to have both labels on the disk metrics, as each uniquely identifies a disk. This may be something best handled by the textfile collector. |
/dev/sdx does not uniquely identify a disk. If you, for example, plug in random USB devices, you're going to get burned if you're using node_exporter to monitor these disks. The device label alone is not a good solution. UUIDs, on the other hand, do in fact uniquely identify a disk. |
That depends entirely on your use case. |
From my point of view, I could argue that it makes more sense to have the UUID as a label than the /dev/ name. It offers superior identification, as it can even identify disks that are moved across servers/instances. I'm just sharing a suggestion that will make node_exporter perform better in storage environments. If it's more hassle than it's worth, then I'll just close this issue and do my own workaround. |
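For reference, reading /dev/disk/by-uuid as suggested above could look roughly like the following. This is only a sketch: the function name `uuid_to_device` is made up for illustration, and it inherits the limitation mentioned in the comment (anything udev does not link there, such as the zfs array, will be missing).

```python
import os


def uuid_to_device(by_uuid_dir="/dev/disk/by-uuid"):
    """Map each UUID symlink name to the kernel device name it points at."""
    mapping = {}
    try:
        names = sorted(os.listdir(by_uuid_dir))
    except FileNotFoundError:
        # The directory may not exist at all, for example inside a container.
        return mapping
    for name in names:
        link = os.path.join(by_uuid_dir, name)
        if not os.path.islink(link):
            continue
        # Links look like "../../sdj1"; resolve and keep only the basename.
        mapping[name] = os.path.basename(os.path.realpath(link))
    return mapping


if __name__ == "__main__":
    for uuid, device in uuid_to_device().items():
        print(f"{device}\t{uuid}")
```

The directory is passed in as a parameter so the lookup can be pointed at a test directory instead of the live system.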
I agree that device label is insufficient to uniquely identify devices. I'd prefer both device and UUID. |
Device label is sufficient, and labels should be minimal. You either get UUID or device as a label. |
Respectfully disagree that device label is sufficient. It's not a unique identifier for a drive or a partition. |
You can't have two |
@brian-brazil Sorry, but that's not how hardware works. Device label is an indication of where it is connected, and UUID is an indication of what is connected. We want both. |
I am well aware of how hardware works. I consider the UUID to be an annotation, so it doesn't belong on these metrics. I'd expect the vast majority of our users not to care about UUIDs, and device name order is pretty consistent these days, particularly in cloud environments. If you're looking for this, then it should come as other annotations do: via another metric, taking the machine roles approach. |
And that's the difference: I don't think they're annotations. They exist as separate, unique identifying dimensions of the block device. "Where" and "Which". |
Generally when this happens you choose one to avoid having more labels to work with, and have the other via the machine roles approach. Of the two I believe the device name is what most users would want. |
@raypettersen So here's some data we found on how to get UUID information from udev. Get the udev info from /sys
Then you can get the current udev data from /run.
|
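The commands from the comment above are not preserved here. As a rough sketch of the same idea (not necessarily what was originally posted), the device's major:minor pair can be read from /sys and used to locate the udev database entry under /run. The function name `udev_properties` is hypothetical, and `ID_FS_UUID` is only present when udev has probed a filesystem on the device.

```python
import os


def udev_properties(device, sys_root="/sys", run_root="/run"):
    """Return the udev "E:" properties recorded for a block device such as "sdj"."""
    # /sys/class/block/<device>/dev holds the "major:minor" number pair.
    with open(os.path.join(sys_root, "class", "block", device, "dev")) as f:
        major_minor = f.read().strip()
    # udev keeps its database entry for a block device under b<major>:<minor>.
    data_path = os.path.join(run_root, "udev", "data", "b" + major_minor)
    props = {}
    with open(data_path) as f:
        for line in f:
            # Property lines look like "E:ID_FS_UUID=e7821b62-...".
            if line.startswith("E:") and "=" in line:
                key, value = line[2:].rstrip("\n").split("=", 1)
                props[key] = value
    return props


if __name__ == "__main__":
    # "sda" is just an example device name; it may not exist on every host.
    print(udev_properties("sda").get("ID_FS_UUID"))
```

Both roots are parameters so the parsing can be exercised against a fake directory tree.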
I completely agree with @raypettersen: the UUID is the only way of identifying a volume or partition. The /dev point is actually irrelevant for metrics. One is monitoring a disk or partition, not a mount point. Mount points can change for various reasons. |
I'd like to remind ye that these are per-block device stats we're talking about in this issue, not volumes, filesystems or mount points. |
We could probably discuss this for hours. I do not agree with your logic, Brian. Perhaps the solution is to create a new metric instead of messing with labels. I get that you want to keep labels to a minimum, but you should recognize that without a true identification of disks and partitions, metrics are at risk of becoming corrupt. Let's say you're monitoring backup storage that is mounted each night, and throughput performance is what you're after, along with a couple of alarms. Somehow the backup media is mounted with a new label and the data you're getting is false. This would never happen if we had a way of pinpointing the disk with the help of its UUID. This is a real-life scenario from my environment. As SuperQ nailed it:
That is, as I see it, the key to this argument. |
What @brian-brazil is suggesting is that this is solved by having a metric that contains the UUID and device labels and using PromQL to join them.
I consider the block device UUID to be more important than the device name, and I think the device and UUID are separate metric dimensions that should always be included, but I understand where Brian is coming from. |
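To make the join idea concrete: with an info-style metric, the query would typically take a form like `node_disk_sectors_written * on (instance, device) group_left (uuid) node_disk_uuid_info`. The metric name `node_disk_uuid_info` is hypothetical (nothing in this thread names it), and its value is assumed to always be 1. The small Python sketch below only emulates what that join does for a single instance; it is not how Prometheus evaluates it.

```python
def join_uuid(samples, info):
    """Emulate `samples * on(device) group_left(uuid) info` for one instance.

    samples: list of (labels, value) pairs for the per-device metric.
    info:    list of label dicts for the info metric, whose value is always 1.
    """
    uuid_by_device = {}
    for labels in info:
        if labels["device"] in uuid_by_device:
            # The "one" side of a many-to-one match must be unique per device.
            raise ValueError("duplicate info series for " + labels["device"])
        uuid_by_device[labels["device"]] = labels["uuid"]

    joined = []
    for labels, value in samples:
        uuid = uuid_by_device.get(labels["device"])
        if uuid is None:
            continue  # series without a match on the info side are dropped
        # Multiplying by the info value (1) leaves the sample value unchanged.
        joined.append(({**labels, "uuid": uuid}, value * 1))
    return joined


if __name__ == "__main__":
    samples = [({"device": "sdj"}, 1024.0)]
    info = [{"device": "sdj", "uuid": "e7821b62-64a0-4f24-a19a-85ed74da0c14"}]
    print(join_uuid(samples, info))
```

The per-device metric keeps a single `device` label, and the UUID only appears on the result of the query, which is the trade-off being discussed here.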
The one plus side to the |
Sounds like a good solution/workaround. |
I agree that technically UUID shouldn't be a label because the metric is about a device, not a volume. A UUID is also unbounded. Not sure if this has practical implications, but I could imagine systems building/mounting images, which would always cause a new time series to be created. I think the join as @SuperQ describes is the right way. But I think it would be nice if we could provide |
If we can agree on this, I'd close the issue and create a new one for adding such metric. |
No complaints from me. |
Yes, let's make this a text file collector for now. It could possibly be triggered/managed by udev infrastructure. |
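A textfile collector script for this could be quite small. The sketch below assumes a hypothetical metric name `node_disk_uuid_info` and a device-to-UUID mapping gathered elsewhere (for example from udev); neither comes from this thread. It also assumes device names and UUIDs contain no characters that need escaping in label values.

```python
import os
import tempfile


def render_uuid_info(uuid_by_device):
    """Render one info-style sample per device in the Prometheus text format."""
    lines = [
        "# HELP node_disk_uuid_info UUID associated with a block device.",
        "# TYPE node_disk_uuid_info gauge",
    ]
    for device, uuid in sorted(uuid_by_device.items()):
        lines.append(f'node_disk_uuid_info{{device="{device}",uuid="{uuid}"}} 1')
    return "\n".join(lines) + "\n"


def write_prom_file(path, text):
    """Write via a temporary file and rename, so a scrape never sees a partial file."""
    directory = os.path.dirname(path) or "."
    # The temporary name does not end in ".prom", so the collector ignores it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(text)
    # mkstemp creates the file as 0600; relax it so node_exporter can read it.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    example = {"sdj": "e7821b62-64a0-4f24-a19a-85ed74da0c14"}
    print(render_uuid_info(example), end="")
```

Calling `write_prom_file` with a path inside the directory the textfile collector watches, from cron or a udev rule, would keep the mapping current.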
Thank you @SuperQ, your idea helped me. I had lots of such metrics: Then I created a script which generated a dictionary file for the textfile collector:
And then I got the sum for asmdev values starting with DATA with this PromQL:
Cron refreshes resolve-ora-disks.prom every hour. |
In short:
Today:
node_disk_sectors_written{device="sdj"}
Suggestion:
node_disk_sectors_written{device="sdj", uuid="e7821b62-64a0-4f24-a19a-85ed74da0c14"}
The reason for this request is that we have external USB devices that we want monitored, but dashboards and so forth break when devices are occasionally mapped to new device names. My understanding is that UUIDs are persistent, or maybe not?