-
Notifications
You must be signed in to change notification settings - Fork 672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First implementation of physical_disk collector #803
First implementation of physical_disk collector #803
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Some feedback below. Also the CI is failing because you added it as a default collector, which means the end-to-end tests need to be updated (see tools/e2e-output.txt)
return nil, err | ||
} | ||
|
||
for _, disk := range dst { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The disk
label currently looks like this: "0 C:"
. Since I remember your use-case was in non-mounted disks, what does it look like under those circumstances @rob-scheepens?
I'm not super happy about having both disk number and mountpoint, to be honest. Depending on what the label is for non-mounted, perhaps we can do something that'll look reasonable for both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For non-mounted disks you will only see the disk number. I agree not to have the label and number, I was actually surprised to see C: show up. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, not yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like this to be resolved before we merge, since once this is released it's breaking to change the labels. I can see three different cases that can occur, maybe missing some?
- Not mounted
- Mounted as disk root
- Mounted as a directory inside another (eg D:\Foo\Bar)
Some possible resolutions:
- Always use disk number, skip the mountpoint
- Use mountpoint if available (I dislike this, since it'd mean you'd get a label change by mounting the disk)
- Label with disk number as
disk
, create a separate_info
metric which has the mountpoint
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd vote for a fix that shows Raw as label for disks that do not have a partition, and thus are also not mounted. We could combine this with the option of always using disk number, so that in case of a raw disk we would see "1 Raw", and in the case of a mounted partition, we would see the behavior as it currently is. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing Labels
Test vm's disks:
DISKPART> lis dis
Disk ### Status Size Free Dyn Gpt
-------- ------------- ------- ------- --- ---
Disk 0 Online 100 GB 0 B *
Disk 1 Offline 20 GB 20 GB
Disk 2 Offline 20 GB 20 GB
Disk 3 Offline 20 GB 20 GB
Disk 4 Offline 20 GB 20 GB
Disk 5 Offline 20 GB 20 GB
Disk 6 Offline 20 GB 20 GB
Disk 7 Online 50 GB 50 GB
Disk 8 Online 50 GB 50 GB
windows_physical_disk_idle_seconds_total{disk="1"} 1742.6185977999999
windows_physical_disk_idle_seconds_total{disk="2"} 1742.6185988
windows_physical_disk_idle_seconds_total{disk="3"} 1742.6185944
windows_physical_disk_idle_seconds_total{disk="4"} 1742.6185948999998
windows_physical_disk_idle_seconds_total{disk="5"} 1742.6185957999999
windows_physical_disk_idle_seconds_total{disk="6"} 1742.6185965
windows_physical_disk_idle_seconds_total{disk="7"} 40.4170576
windows_physical_disk_idle_seconds_total{disk="8"} 27.7063651
Note: disk 0 is excluded by blacklisting C:. None of disks 1-8 contain any partition.
Created partition on disk 7 using "crea par pri". No change in labels:
windows_physical_disk_idle_seconds_total{disk="1"} 1871.906501
windows_physical_disk_idle_seconds_total{disk="2"} 1871.9065021
windows_physical_disk_idle_seconds_total{disk="3"} 1871.9064982
windows_physical_disk_idle_seconds_total{disk="4"} 1871.9064985999998
windows_physical_disk_idle_seconds_total{disk="5"} 1871.9064994999999
windows_physical_disk_idle_seconds_total{disk="6"} 1871.9065008
windows_physical_disk_idle_seconds_total{disk="7"} 169.68967189999998
windows_physical_disk_idle_seconds_total{disk="8"} 156.99427
Assigned letter X using "assign letter=x", no change in labels still.
Also after formatting no change in labels. Only after reboot the X: volume shows up in the metrics list:
windows_physical_disk_idle_seconds_total{disk="1"} 7.8161184
windows_physical_disk_idle_seconds_total{disk="2"} 7.8161192999999995
windows_physical_disk_idle_seconds_total{disk="3"} 7.8161203
windows_physical_disk_idle_seconds_total{disk="4"} 7.8161213
windows_physical_disk_idle_seconds_total{disk="5"} 7.8161219
windows_physical_disk_idle_seconds_total{disk="6"} 7.8161236999999995
windows_physical_disk_idle_seconds_total{disk="7 X:"} 7.816124599999999
windows_physical_disk_idle_seconds_total{disk="8"} 7.8161261
Is this an issue with the exporter? The interesting thing is that when I did the same workflow for Z: on disk 8 (crea par pri, format ntfs quick, assign letter z), it did show up immediately:
windows_physical_disk_idle_seconds_total{disk="1"} 0.00039109999999999997
windows_physical_disk_idle_seconds_total{disk="2"} 0.00039139999999999997
windows_physical_disk_idle_seconds_total{disk="3"} 0.0003917
windows_physical_disk_idle_seconds_total{disk="4"} 0.0003915
windows_physical_disk_idle_seconds_total{disk="5"} 0.00039139999999999997
windows_physical_disk_idle_seconds_total{disk="6"} 0.0003915
windows_physical_disk_idle_seconds_total{disk="7 X:"} 0.00039139999999999997
windows_physical_disk_idle_seconds_total{disk="8 Z:"} 0.0003917
Deleting the partion was also immediately reflected:
windows_physical_disk_idle_seconds_total{disk="1"} 90.2290264
windows_physical_disk_idle_seconds_total{disk="2"} 90.2290273
windows_physical_disk_idle_seconds_total{disk="3"} 90.2290281
windows_physical_disk_idle_seconds_total{disk="4"} 90.22902839999999
windows_physical_disk_idle_seconds_total{disk="5"} 90.229029
windows_physical_disk_idle_seconds_total{disk="6"} 90.2290299
windows_physical_disk_idle_seconds_total{disk="7 X:"} 90.2290305
windows_physical_disk_idle_seconds_total{disk="8"} 90.1772701
Creating two partitions on disk 8 also was immediately reflected:
windows_physical_disk_idle_seconds_total{disk="1"} 157.879834
windows_physical_disk_idle_seconds_total{disk="2"} 157.8798352
windows_physical_disk_idle_seconds_total{disk="3"} 157.8798363
windows_physical_disk_idle_seconds_total{disk="4"} 157.8798369
windows_physical_disk_idle_seconds_total{disk="5"} 157.87983739999999
windows_physical_disk_idle_seconds_total{disk="6"} 157.8798382
windows_physical_disk_idle_seconds_total{disk="7 X:"} 157.8798388
windows_physical_disk_idle_seconds_total{disk="8 Y: Z:"} 157.77886379999998
To me, this labeling looks consistent and in line with what one would expact. You agree @carlpett? If so, we can put this conversation to rest and merge the PR, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a brief review of this conversation I assume option 2 has been chosen?
Use mountpoint if available (I dislike this, since it'd mean you'd get a label change by mounting the disk)
If so I'd recommend documenting this labeling behaviour, in order to avoid user surprise. Could you add an entry for this collector in the docs/
directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in #803 (comment), I'd like the behavior where the disk number is always shown, whether or not in combination with a partition letter or mount point, in case there is a mounted partition. If you agree, let me know and I'll create the doc.
Edit: thinking this through a bit more, let's just use the disk number (like node_exporter only lists the Linux device name, e.g. sda). In the future we could have a discussion on what data we would like to see in an info struct. Makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think either the disk number only or having the disk number and drive letter as separate labels would be fine.
E.G. windows_physical_disk_idle_seconds_total{disk="7", drive_letter="X:"} 157.8798388
Keeping the two pieces of information in a single label would have users resorting to regexes for label matching: windows_physical_disk_idle_seconds_total{disk=".*X:"}
or windows_physical_disk_idle_seconds_total{disk="7.*"}
What has your preference: remove it from the defaults, or shall I update the end-to-end tests? |
@carlpett : I've modified e2e-output and included it in this PR. Can you check if all is ok now? |
Thanks @rob-scheepens, apart from clarification on if one of the old review comments has been addressed, this looks good! Can you make sure to sign the DCO, though? |
3fc4cc6
to
1372152
Compare
Thanks! The e2e test appear to fail now though, although it's not clear to me why. Do they work locally for you? |
When I execute "C:\git\rob-scheepens\windows_exporter>powershell -NonInteractive -ExecutionPolicy Bypass -File .\tools\end-to-end-test.ps1" that works fine, but when I trigger the script through make, it fails:
I've attached a make -d log but that didn't show anything obvious to me. Also running ProcMon during make did not show an obvious problem, log also attached. Is the exit code a standard Windows code? If so it means "file not found", but as said, I didn't see that yet in the ProcMon file or make debug log. Thoughts @carlpett ? |
@carlpett Looking at the various logs generated I noticed that the physical_disk collector is not started during the e2e-test. Looks like I missed a few places where it should be included, working on that. Could that be the reason of the error code 2? if so, then why would things work ok when running the PS script directly, without make? |
@carlpett : looks like the three additional changes and a rebuild fixed it - can you verify? I am now able to run make e2e-test:
|
d2dd0aa
to
ed8fb36
Compare
@carlpett : build and test succeeded now. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Peak season delays in reviews right now... Thanks for fixing the E2E!
Let's get the final label question resolved and then we can merge 🍰
return nil, err | ||
} | ||
|
||
for _, disk := range dst { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like this to be resolved before we merge, since once this is released it's breaking to change the labels. I can see three different cases that can occur, maybe missing some?
- Not mounted
- Mounted as disk root
- Mounted as a directory inside another (eg D:\Foo\Bar)
Some possible resolutions:
- Always use disk number, skip the mountpoint
- Use mountpoint if available (I dislike this, since it'd mean you'd get a label change by mounting the disk)
- Label with disk number as
disk
, create a separate_info
metric which has the mountpoint
What do you think?
Sorry for the delay @carlpett, for some reason I was sure I had already replied... :( Regarding the labeling: I suggest we use the PhysicalDisk number in combination with a mountpoint if appropriate. Technically speaking a disk (PhysicalDisk) is never mounted, only partitions (LogicalDisks) are. An example label for C: would then look like:
An example label for a raw disk without any partitions would be:
This seems to match your third possible resolution mentioned above. Would you agree? |
@carlpett : friendly nudge. ;) |
@carlpett Just had another look at the labels on a test machine, and to me this looks good. Are you ok with this too? |
@rob-scheepens I don't think Calle's available at the moment. Could you rebase the feature branch on |
cfa7305
to
c3ebeb8
Compare
c3ebeb8
to
1f72ed4
Compare
Hi @breed808 , I cleaned up the git commit history and backed out the unused import, so CI should be able to finish ok now. |
3fbfb95
to
a5812b7
Compare
Closing this since we now have a working version of physical_disk collector using PDH instead of Perflib, as mentioned in #799 (comment). A PR for that will follow shortly. |
bef0b14
to
a69b4b3
Compare
fd72eb2
to
3c3e40e
Compare
Looks like the CI isn't happy due to the logging dependency in the collector. Have a look at #1192 for context on the logging change. |
Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Brantley West <brantley@nutanix.com>
Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Brantley West <brantley@nutanix.com>
Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Brantley West <brantley@nutanix.com>
Signed-off-by: Brantley West <brantley@nutanix.com>
Signed-off-by: Brantley West <brantley@nutanix.com>
Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com>
…ter label Signed-off-by: Brantley West <brantley@nutanix.com>
Signed-off-by: Brantley West <brantley@nutanix.com>
153656f
to
a4a2f8b
Compare
Signed-off-by: Brantley West <brantley@nutanix.com>
a4a2f8b
to
fe12d29
Compare
Signed-off-by: Ben Reedy <breed808@breed808.com>
Signed-off-by: Ben Reedy <breed808@breed808.com>
4aec302
to
1acad24
Compare
Apologies for the delay in getting back to this. I've re-run the CI and resolved the issues it's found. Are we happy with using the disk number in the |
@breed808 thanks for checking, and yes, we're happy with the disk number. |
…#803) * Added physical_disk collector. Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Brantley West <brantley@nutanix.com> * exporter.go: Added physical_disk to defaultCollectors Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Brantley West <brantley@nutanix.com> * Fix test cases for physicaldisk metrics Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Brantley West <brantley@nutanix.com> * physical_disk.go: cleanup gofmt error Signed-off-by: Brantley West <brantley@nutanix.com> * physical_disk.go: populate physical disk 'number' label Signed-off-by: Brantley West <brantley@nutanix.com> * Added docs/collector.physical_disk.md. Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> * physical_disk.go: change 'number' label to 'disk' to match node_exporter label Signed-off-by: Brantley West <brantley@nutanix.com> * physical_disk.go: adopt github.com/go-kit/log Signed-off-by: Brantley West <brantley@nutanix.com> * physical_disk.go: adopt include/exclude disk list Signed-off-by: Brantley West <brantley@nutanix.com> * fix: Add init config for physical_disk collector Signed-off-by: Ben Reedy <breed808@breed808.com> * chore: gofmt physical_disk collector Signed-off-by: Ben Reedy <breed808@breed808.com> --------- Signed-off-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Brantley West <brantley@nutanix.com> Signed-off-by: Ben Reedy <breed808@breed808.com> Co-authored-by: Brantley West <brantley@nutanix.com> Co-authored-by: Ben Reedy <breed808@breed808.com>
This PR is to implement a first incarnation of the physical_disk collector, as derived from the logical_disk collector. This as a result from the request in #795 (comment).