Add VMI filesystem usage metrics #7814

Merged

machadovilaca
Member

Currently, in the dashboard, under Virtualization -> Overview, the filesystem
report uses a metric about the pod filesystem usage. This PR adds new metrics
with values corresponding to VM filesystem usage so they can be used in the
dashboard.

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Release note:

Add VMI filesystem usage metrics

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/L area/monitoring labels May 27, 2022
@machadovilaca
Member Author

/retest

@enp0s3
Contributor

enp0s3 commented Jun 6, 2022

/cc

@kubevirt-bot kubevirt-bot requested a review from enp0s3 June 6, 2022 16:15
Contributor

@enp0s3 enp0s3 left a comment

Hi @machadovilaca, thank you for the PR.
I have a small question below.

metrics.pushCustomMetric(
"kubevirt_vmi_filesystem_total_bytes",
"Total VM filesystem capacity in bytes.",
prometheus.GaugeValue,
Contributor

Why did you decide to use a Gauge metric instead of a Counter?

Member Author

As I understand it, some filesystem types allow the total capacity to be changed, so this value can go both up and down. I think we should use a Gauge here.

Contributor

+1

metrics.pushCustomMetric(
"kubevirt_vmi_filesystem_used_bytes",
"Used VM filesystem capacity in bytes.",
prometheus.GaugeValue,
Contributor

Same here

Member Author

Here, the value can go up and down arbitrarily, so I think we should use a Gauge.

Contributor

+1
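
To illustrate the reasoning in the two threads above: a Prometheus counter is expected to only ever increase (apart from resets), while a gauge may move in either direction. The following is a minimal, standard-library-only sketch with made-up sample values; `canBeCounter` is a hypothetical helper, not part of this PR or of the Prometheus client.

```go
package main

import "fmt"

// canBeCounter reports whether a series of samples never decreases.
// A counter must be non-decreasing; a series that can drop needs a gauge.
func canBeCounter(samples []int) bool {
	for i := 1; i < len(samples); i++ {
		if samples[i] < samples[i-1] {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical used-bytes samples: files are written, then deleted.
	usedBytes := []int{100, 250, 180}
	// Hypothetical total-bytes samples: the filesystem is grown, then shrunk.
	totalBytes := []int{1000, 2000, 1500}

	fmt.Println(canBeCounter(usedBytes))  // false
	fmt.Println(canBeCounter(totalBytes)) // false
}
```

Both series decrease at some point, which is why a gauge fits both metrics.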

Contributor

@enp0s3 enp0s3 left a comment

@machadovilaca Thank you! Overall this looks good; please see my comments below.

fsLabels := []string{"disk"}

for _, fsStat := range vmFSStats.Items {
fsLabelValues := []string{fsStat.DiskName}
Contributor

@machadovilaca Following is the struct:

type VirtualMachineInstanceFileSystem struct {
	DiskName       string `json:"diskName"`
	MountPoint     string `json:"mountPoint"`
	FileSystemType string `json:"fileSystemType"`
	UsedBytes      int    `json:"usedBytes"`
	TotalBytes     int    `json:"totalBytes"`
}

I see that you don't scrape the MountPoint and FileSystemType values. Is that on purpose?

Member Author

not really, will add
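
One way the extra fields could be exposed is as additional labels next to the disk name. This is only a sketch: the struct mirrors the one quoted above, while the label names and the helpers `fsLabelNames` and `fsLabelValues` are assumptions for illustration, not necessarily what was merged.

```go
package main

import "fmt"

// VirtualMachineInstanceFileSystem mirrors the struct quoted in the review.
type VirtualMachineInstanceFileSystem struct {
	DiskName       string `json:"diskName"`
	MountPoint     string `json:"mountPoint"`
	FileSystemType string `json:"fileSystemType"`
	UsedBytes      int    `json:"usedBytes"`
	TotalBytes     int    `json:"totalBytes"`
}

// fsLabelNames returns the label names; these names are assumptions.
func fsLabelNames() []string {
	return []string{"disk_name", "mount_point", "file_system_type"}
}

// fsLabelValues returns the label values in the same order as fsLabelNames.
func fsLabelValues(fs VirtualMachineInstanceFileSystem) []string {
	return []string{fs.DiskName, fs.MountPoint, fs.FileSystemType}
}

func main() {
	fs := VirtualMachineInstanceFileSystem{
		DiskName:       "vda1",
		MountPoint:     "/",
		FileSystemType: "ext4",
		UsedBytes:      10,
		TotalBytes:     20,
	}
	values := fsLabelValues(fs)
	for i, name := range fsLabelNames() {
		fmt.Printf("%s=%q\n", name, values[i])
	}
}
```

Keeping names and values in two parallel slices matches the fsLabels / fsLabelValues pattern already used in the diff.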

return
}

// GetDomainStats() or may hang for a long time.
Contributor

@machadovilaca Sorry, I didn't understand this comment. Can you please elaborate?

Member Author

The "or " was added by mistake.

@@ -536,17 +562,23 @@ func (ps *prometheusScraper) Scrape(socketFile string, vmi *k6tv1.VirtualMachine
}
defer cli.Close()

-vmStats, exists, err := cli.GetDomainStats()
+vmDomainStats, exists, err := cli.GetDomainStats()
Contributor

@machadovilaca Can you please create a separate commit for the vmStats rename? Please explain in the commit message why the new name is better.

Member Author

}

func (ps *prometheusScraper) Report(socketFile string, vmi *k6tv1.VirtualMachineInstance, vmStats *stats.DomainStats) {
// statsMaxAge is an estimation - and there is not better way to do that. So it is possible that
func (ps *prometheusScraper) Report(socketFile string, vmi *k6tv1.VirtualMachineInstance, vmDomainStats *stats.DomainStats, vmFSStats k6tv1.VirtualMachineInstanceFileSystemList) {
Contributor

@machadovilaca I think it would be better to create a vmStats struct type that encapsulates both vmDomainStats and vmFSStats.

Contributor

👍 great idea.
This way even the vmStats name can stay, only the type would change.
In the long term, it would keep our function signatures small, and therefore more readable and easier to change.

Member Author

I like it, changed
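
A rough sketch of the wrapper struct discussed in this thread follows. `DomainStats` and `FileSystemList` are simplified stand-ins for stats.DomainStats and k6tv1.VirtualMachineInstanceFileSystemList, and the field names and the `report` function are assumptions made for illustration only.

```go
package main

import "fmt"

// DomainStats is a simplified stand-in for stats.DomainStats.
type DomainStats struct {
	Name string
}

// FileSystem and FileSystemList are simplified stand-ins for the
// k6tv1 filesystem types.
type FileSystem struct {
	DiskName string
}

type FileSystemList struct {
	Items []FileSystem
}

// vmStats bundles every kind of stats a report needs.
// The field names here are hypothetical.
type vmStats struct {
	DomainStats *DomainStats
	FsStats     FileSystemList
}

// report takes a single stats argument, so adding another kind of
// stats later means adding a field, not changing the signature.
func report(socketFile string, s *vmStats) string {
	return fmt.Sprintf("%s: domain=%s filesystems=%d",
		socketFile, s.DomainStats.Name, len(s.FsStats.Items))
}

func main() {
	s := &vmStats{
		DomainStats: &DomainStats{Name: "testvm"},
		FsStats:     FileSystemList{Items: []FileSystem{{DiskName: "vda1"}}},
	}
	fmt.Println(report("test", s)) // test: domain=testvm filesystems=1
}
```

Callers and tests that do not care about filesystem stats can simply leave that field at its zero value.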

@@ -69,7 +73,7 @@ var _ = Describe("Prometheus", func() {

ps := prometheusScraper{ch: ch}

-vmStats := &stats.DomainStats{
+vmDomainStats := &stats.DomainStats{
Contributor

@machadovilaca same here about a dedicated commit for refactoring

Member Author

@@ -99,7 +103,7 @@ var _ = Describe("Prometheus", func() {
},
}
vmi := k6tv1.VirtualMachineInstance{}
-ps.Report("test", &vmi, vmStats)
+ps.Report("test", &vmi, vmStats, emptyVmFSStats)
Contributor

@machadovilaca Here you've left vmStats unchanged, although you renamed it elsewhere.

Member Author

changed when using the suggested struct

Contributor

@iholder101 iholder101 left a comment

Thanks @machadovilaca! Good work!

Comment on lines 464 to 490
fsLabels := []string{"disk"}

for _, fsStat := range vmFSStats.Items {
Contributor

nit: Maybe return if len(vmFSStats.Items) == 0? This would prevent allocating fsLabels := []string{"disk"} for these scenarios.

Member Author

added
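
The early return suggested above could look roughly like this. It is a sketch under assumptions: `FileSystem` is a simplified stand-in for the k6tv1 type, and `pushFunc` is a hypothetical stand-in for metrics.pushCustomMetric, whose real signature is not shown in this conversation.

```go
package main

import "fmt"

// FileSystem is a simplified stand-in for k6tv1.VirtualMachineInstanceFileSystem.
type FileSystem struct {
	DiskName   string
	UsedBytes  int
	TotalBytes int
}

// pushFunc is a hypothetical stand-in for metrics.pushCustomMetric.
type pushFunc func(name string, labels, values []string, value int)

// pushFSMetrics emits two metrics per filesystem and does nothing,
// not even allocating the label slice, when there are no items.
func pushFSMetrics(items []FileSystem, push pushFunc) {
	if len(items) == 0 {
		return
	}

	fsLabels := []string{"disk"}
	for _, fs := range items {
		fsLabelValues := []string{fs.DiskName}
		push("kubevirt_vmi_filesystem_total_bytes", fsLabels, fsLabelValues, fs.TotalBytes)
		push("kubevirt_vmi_filesystem_used_bytes", fsLabels, fsLabelValues, fs.UsedBytes)
	}
}

func main() {
	pushed := 0
	push := func(name string, labels, values []string, value int) {
		pushed++
		fmt.Printf("%s{%s=%q} %d\n", name, labels[0], values[0], value)
	}

	pushFSMetrics(nil, push) // empty input: returns before any allocation
	pushFSMetrics([]FileSystem{{DiskName: "vda1", UsedBytes: 10, TotalBytes: 20}}, push)
	fmt.Println("metrics pushed:", pushed)
}
```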

for _, fsStat := range vmFSStats.Items {
fsLabelValues := []string{fsStat.DiskName}

metrics.pushCustomMetric(
Contributor

I'm wondering - why not publish all data we have? e.g. MountPoint and FileSystemType? Maybe as labels?

Member Author

Contributor

Sorry, but this link just leads me to the "Files" tab :)

Contributor

Oh you meant #7814 (comment). Got it. Thanks

Comment on lines 38 to 41
var emptyVmFSStats = k6tv1.VirtualMachineInstanceFileSystemList{
Items: []k6tv1.VirtualMachineInstanceFileSystem{},
}

Contributor

Having global mutable variables in tests is bad practice. I think you should choose one of these two options:

  1. Declare the variable globally but initialize it in a BeforeEach clause
  2. Replace the variable with a function that returns an empty VMFSStats

Option 2 is safer and better imho: it isolates tests from one another and is more readable.

Contributor

Also, this relates to @enp0s3's suggestion about a vmStats struct. If that approach were adopted, all you would need to change here is to initialize another field in the struct.

Member Author

updated
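
Option 2 from the comment above, sketched with simplified stand-in types (`FileSystem`, `FileSystemList`) and a hypothetical helper name, `newEmptyFSStats`. Each call returns a fresh value, so a test that mutates its copy cannot affect another test.

```go
package main

import "fmt"

// FileSystem and FileSystemList are simplified stand-ins for the
// k6tv1 filesystem types used in the test file.
type FileSystem struct {
	DiskName string
}

type FileSystemList struct {
	Items []FileSystem
}

// newEmptyFSStats returns a fresh, empty list on every call instead of
// exposing a shared package-level variable.
func newEmptyFSStats() FileSystemList {
	return FileSystemList{Items: []FileSystem{}}
}

func main() {
	// One test mutates its own copy.
	a := newEmptyFSStats()
	a.Items = append(a.Items, FileSystem{DiskName: "vda1"})

	// Another test still starts from an empty list.
	b := newEmptyFSStats()

	fmt.Println(len(a.Items), len(b.Items)) // 1 0
}
```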

Expect(result.Desc().String()).To(ContainSubstring("kubevirt_vmi_filesystem_total_bytes"))
result = <-ch
Expect(result).ToNot(BeNil())
Expect(result.Desc().String()).To(ContainSubstring("kubevirt_vmi_filesystem_used_bytes"))
Contributor

Maybe expect the channel to be empty here?

Member Author

added
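
The idea behind the extra assertion is that, after both expected metrics have been read, nothing else should be waiting in the channel. In the Ginkgo/Gomega test this could probably be expressed with Gomega's BeEmpty matcher, which accepts channels; the sketch below shows the same check with the standard library only, assuming a buffered channel. `drained` is a hypothetical helper.

```go
package main

import "fmt"

// drained reports whether a buffered channel currently holds no pending values.
func drained(ch chan string) bool {
	return len(ch) == 0
}

func main() {
	ch := make(chan string, 4)
	ch <- "kubevirt_vmi_filesystem_total_bytes"
	ch <- "kubevirt_vmi_filesystem_used_bytes"

	fmt.Println(<-ch)
	fmt.Println(drained(ch)) // false: one metric is still pending
	fmt.Println(<-ch)
	fmt.Println(drained(ch)) // true: no unexpected extra metrics
}
```

The check catches a regression where the scraper starts emitting additional, unasserted metrics.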

@machadovilaca machadovilaca force-pushed the add-kubevirt-filesystem-usage-metric branch 3 times, most recently from 30844a0 to c23fb96 Compare July 20, 2022 10:50
@machadovilaca
Member Author

/test pull-kubevirt-e2e-k8s-1.22-sig-compute

Contributor

@iholder101 iholder101 left a comment

Thanks @machadovilaca! Great work!
/lgtm

left small nit

@@ -543,6 +573,11 @@ type prometheusScraper struct {
ch chan<- prometheus.Metric
}

type VirtualMachineStats struct {
Contributor

nit: Maybe VirtualMachineInstanceStats is more accurate?

Member Author

updated

Contributor

thanks

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 20, 2022
Signed-off-by: João Vilaça <jvilaca@redhat.com>
Signed-off-by: João Vilaça <jvilaca@redhat.com>
Signed-off-by: João Vilaça <jvilaca@redhat.com>
@machadovilaca machadovilaca force-pushed the add-kubevirt-filesystem-usage-metric branch from c23fb96 to 5621389 Compare July 20, 2022 14:10
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 20, 2022
@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 20, 2022
@machadovilaca
Member Author

/retest

@machadovilaca
Member Author

/test pull-kubevirt-e2e-kind-1.22-sriov

@machadovilaca
Member Author

/test pull-kubevirt-e2e-kind-1.22-sriov

Member

@stu-gott stu-gott left a comment

/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: stu-gott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 25, 2022
@machadovilaca
Member Author

/retest

@kubevirt-bot
Contributor

kubevirt-bot commented Jul 26, 2022

@machadovilaca: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubevirt-fossa | 5621389 | link | false | /test pull-kubevirt-fossa |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@machadovilaca
Member Author

/test pull-kubevirt-e2e-k8s-1.22-operator

@machadovilaca
Member Author

/test pull-kubevirt-unit-test

@kubevirt-bot kubevirt-bot merged commit cdb4032 into kubevirt:main Jul 27, 2022
@machadovilaca machadovilaca deleted the add-kubevirt-filesystem-usage-metric branch August 11, 2022 12:43
@machadovilaca
Member Author

/cherry-pick release-0.53

@kubevirt-bot
Contributor

@machadovilaca: new pull request created: #8621

In response to this:

/cherry-pick release-0.53

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/monitoring dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L