Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

network, netstat: Detect SR-IOV interfaces from the domain #7091

Merged
merged 3 commits into from Feb 14, 2022

Conversation

EdDev
Copy link
Member

@EdDev EdDev commented Jan 16, 2022

What this PR does / why we need it:

SR-IOV devices have been reported to the VMI status based on the
guest-agent report only.

With this change, the domain hostdevices are examined to include the
SR-IOV interfaces in the VMI status report.
Even if the guest-agent is not active, SR-IOV devices will be reported.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #7050

Special notes for your reviewer:
Depends on #7074 , #7092
Only the last 2 commits are relevant.

Release note:

SR-IOV interfaces are now reported in the VMI status even without an active guest-agent.

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Jan 16, 2022
@EdDev
Copy link
Member Author

EdDev commented Jan 16, 2022

/uncc stu-gott
/cc @ormergi @oshoval @orelmisan @maiqueb @dteplits

@kubevirt-bot kubevirt-bot requested review from oshoval, orelmisan, maiqueb and ormergi and removed request for stu-gott January 16, 2022 09:20
@kubevirt-bot
Copy link
Contributor

@EdDev: GitHub didn't allow me to request PR reviews from the following users: dteplits.

Note that only kubevirt members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/uncc stu-gott
/cc @ormergi @oshoval @orelmisan @maiqueb @dteplits

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@EdDev
Copy link
Member Author

EdDev commented Jan 16, 2022

/sig network

@EdDev
Copy link
Member Author

EdDev commented Jan 16, 2022

change: Rebase

@EdDev
Copy link
Member Author

EdDev commented Jan 17, 2022

/retest

@EdDev
Copy link
Member Author

EdDev commented Jan 17, 2022

/hold virt-handler is crashing on an attempt to access a dynamic library.

This may be due to the added dependency to the virt-launcher hostdevice package.

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 17, 2022
@EdDev EdDev force-pushed the netstat-sriov-from-domain branch 2 times, most recently from cefe4a0 to 3997f52 Compare January 17, 2022 14:04
@EdDev
Copy link
Member Author

EdDev commented Jan 17, 2022

changes:

  • virt-handler should not depend on virt-launcher code (as much as possible), as that code depends on dynamic libraries (e.g. libvirt) that are not present in the virt-handler image.
  • SR-IOV interfaces are now reported with an info source based on the domain and the guest-agent, therefore, the relevant e2e test expectations was updated to reflect that.

/unhold

@kubevirt-bot kubevirt-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jan 17, 2022
@kubevirt-bot kubevirt-bot added size/L and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL labels Jan 24, 2022
@EdDev
Copy link
Member Author

EdDev commented Jan 24, 2022

change: Rebase

Copy link
Contributor

@oshoval oshoval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@EdDev
Copy link
Member Author

EdDev commented Feb 3, 2022

change: Rebase

@EdDev
Copy link
Member Author

EdDev commented Feb 3, 2022

/retest

@EdDev
Copy link
Member Author

EdDev commented Feb 3, 2022

change: Even the unit tests should not depend on the virt-launcher sriov package. From some reasons, it expects a library to exist which is not included in the container that the tests are executed in.

Copy link
Contributor

@ormergi ormergi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!
Please see my inline comments.

Expect(setup.Vmi.Status.Interfaces).To(BeEmpty(), "the SR-IOV interface should not be reported in the status.")
Expect(setup.Vmi.Status.Interfaces).To(Equal([]v1.VirtualMachineInstanceNetworkInterface{
newVMIStatusIface(networkName, nil, ifaceMAC, "", netvmispec.InfoSourceDomain),
}), "the SR-IOV interface should not be reported in the status.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/should not/should

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -303,6 +305,7 @@ var _ = Describe("netstat", func() {

Expect(setup.Vmi.Status.Interfaces).To(Equal([]v1.VirtualMachineInstanceNetworkInterface{
newVMIStatusIface(primaryNetworkName, []string{primaryPodIPv4}, "", "", netvmispec.InfoSourceDomain),
newVMIStatusIface(networkName, nil, "", "", netvmispec.InfoSourceDomain),
}), "the SR-IOV interface should not be reported in the status.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/should not/should

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +223 to +231
func filterHostDevicesByAlias(hostDevices []api.HostDevice, prefix string) []api.HostDevice {
var filteredHostDevices []api.HostDevice

for _, hostDevice := range hostDevices {
if hostDevice.Alias != nil && strings.HasPrefix(hostDevice.Alias.GetName(), prefix) {
filteredHostDevices = append(filteredHostDevices, hostDevice)
}
}
return filteredHostDevices
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as before, we cannot depend on virt-launcher.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the other way is possible (using pkg/network/setup/netstate.go filterHostDevicesByAlias), right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about filterHostDevicesByAlias can we move it to a common place?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it fits under something equivalent to network/vmispec, unfortunately, someone already used domainspec for other stuff. I prefer not to just place it there without understanding well if the current content there fits.

If you have other suggestions, please mention it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I saw domainspec handle the network interfaces part of the domain XML,
does it mean that it cant have other stuff related to the domain spec?
Since filterHostDevicesByAlias generic form, it can also fit its own package outside pkg/network something like pkg/device.
Deciding to which package it should go shouldn't distract us from the fact that either way its better than have it in two places.

@@ -32,6 +32,8 @@ import (
"kubevirt.io/kubevirt/pkg/virt-launcher/virtwrap/api"
)

const SRIOVAliasPrefix = "sriov-"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have similar const at https://github.com/kubevirt/kubevirt/blob/release-0.50/pkg/virt-launcher/virtwrap/device/hostdevice/sriov/hostdev.go#L33
Do you think we can reuse it?
In general I think there should be single source for it,
it will improve the visibility and may save some time for next dev who would like to figure out
who add the SRIOVAliasPrefix to the host-devices and where is it read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot reuse it. The soft reason is that we generally should never depend inwards into the kubevirt components from libraries (e.g. depend on virt-launcher code). The hard reasons is that the dependency on virt-launcher hostdevice package brings in dependencies on libvirt dynamic libraries, which do not exist in virt-handler (where this code is used).

I have found no good place to centralize this alias, but if such a place is found, the code can be easily adjusted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do the other way around, use SRIOVAliasPrefix from pkg/network/.. on virt-launcher, WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So your suggestion is to place this constant in the network/setup? It makes no sense to me to have the hostdev/sriov import it from here.
It needs to make sense, and this one does not.

I am also unsure if it fits the network package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think of this: it makes no sense for hostdev/sriov package to import the whole setup package, when it needs one little constant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't meant that it must be on the setup package, actually I don't mind to which package it will go.
I just prefer to have single place for it like any other constant that is used if few places.
Moreover the SRIOVAliasPrefix is our "magic" word that enables finding SR-IOV interfaces on the domain (until we will stop using Hostdev..) thus I think there should be single place for it.
If it means that it should have its own package under pkg/network or somewhere else I think its still better then having it all over.
Having said all that I guess it may bring some difficulties.. then we could do it on a follow up PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a network/sriov package.
Done

Comment on lines 282 to 283
// The reporting of the SR-IOV interface when no guest-agent exists is missing.
// See https://github.com/kubevirt/kubevirt/issues/7050 for more information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed now, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 259 to 260
// The reporting of the SR-IOV interface when no guest-agent exists is missing.
// See https://github.com/kubevirt/kubevirt/issues/7050 for more information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed now, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thanks.
Done

@EdDev
Copy link
Member Author

EdDev commented Feb 6, 2022

change: Removed/fixed tests comments and assertion messages.

The SR-IOV alias prefix used on the domain spec hostdevice is extracted
to a separate package (`network/sriov`) in order to allow using the same
by other callers (that are not in the virt-launcher).

Signed-off-by: Edward Haas <edwardh@redhat.com>
@EdDev
Copy link
Member Author

EdDev commented Feb 7, 2022

change: Extracted the SR-IOV alias prefix to a package that can be used by both the virt-handler and virt-launcher (without depending on each other).

@rmohr
Copy link
Member

rmohr commented Feb 7, 2022

/retest

introduced a small mistake in the migrations job definition when making them mandatory. Should be fixed now.

@EdDev
Copy link
Member Author

EdDev commented Feb 8, 2022

/retest

@@ -144,7 +150,7 @@ func vmiInterfaceKey(vmiUID types.UID, interfaceName string) string {
return fmt.Sprintf("%s/%s", vmiUID, interfaceName)
}

func ifacesStatusFromDomain(domainSpecIfaces []api.Interface) []v1.VirtualMachineInstanceNetworkInterface {
func regularIfacesStatusFromDomain(domainSpecIfaces []api.Interface) []v1.VirtualMachineInstanceNetworkInterface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "regular"? I would call it "nonSriov`.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nonSriov is not less confusing then "regular". But I find it ugly, telling me what it is not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Erm, I'm forced to agree that (at least) I don't have a clear expectation of what regular is when speaking about interfaces.

Non-SRIOV may be a bad name, but ... at least its clear (what it is not :) )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will call it “deviceInterfaceStatus..” and sriovHostDeviceInterfaceStatus..”.
(reflects the domain xml)

are you ok with that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -326,7 +326,7 @@ var _ = Describe("netstat", func() {
setup.NetStat.UpdateStatus(setup.Vmi, setup.Domain)

Expect(setup.Vmi.Status.Interfaces).To(Equal([]v1.VirtualMachineInstanceNetworkInterface{
newVMIStatusIface(networkName, nil, ifaceMAC, guestIfaceName, netvmispec.InfoSourceGuestAgent),
newVMIStatusIface(networkName, nil, ifaceMAC, guestIfaceName, netvmispec.InfoSourceDomainAndGA),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test for sriov net device reported by the GA only.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not possible.
If we see a rogue interface that does not appear in the domain, we cannot detect if it is SR-IOV or not.
There are tests for such interfaces, in the general sense.

SR-IOV devices have been reported to the VMI status based on the
guest-agent report only.

With this change, the domain hostdevices are examined to include the
SR-IOV interfaces in the VMI status report.
Even if the guest-agent is not active, SR-IOV devices will be reported.

Fix: kubevirt#7050

Signed-off-by: Edward Haas <edwardh@redhat.com>
@EdDev
Copy link
Member Author

EdDev commented Feb 13, 2022

change: Renamed the two functions that retrieve information from the domain. Per the review comments.

@AlonaKaplan
Copy link
Member

/approve

Thanks!

@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AlonaKaplan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 13, 2022
@EdDev
Copy link
Member Author

EdDev commented Feb 13, 2022

/retest

Copy link
Contributor

@oshoval oshoval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

The info under "Special notes for your reviewer"
can be deleted

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Feb 14, 2022
@EdDev
Copy link
Member Author

EdDev commented Feb 14, 2022

/retest

@kubevirt-bot kubevirt-bot merged commit 467ea8b into kubevirt:main Feb 14, 2022
@EdDev EdDev deleted the netstat-sriov-from-domain branch February 15, 2022 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SR-IOV interfaces are not reported correctly when the guest has no guest-agent.
7 participants