-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Passthrough NVIDIA Jetson AGX Orin WiFi card to NetVM #93
Conversation
Signed-off-by: Mika Tammi <mika.tammi@unikie.com>
Signed-off-by: Panu Finnila <panu.finnila@unikie.com>
Finally got my local I get the services running:
with
Services have started with no issues:
and
Still the netvm can't access WIFI. Do you see something that I missed? |
Ok. Got it working. I think this is good as it is to enable further Ghaf development on Orin without ethernet. Needs some guides and development mode changes to enable tools but those can be added in additional commits. |
# Passthrough Jetson Orin WiFi card | ||
boot.kernelParams = [ | ||
"vfio-pci" "ids=10ec:c82f" | ||
"vfio_iommu_type1.allow_unsafe_interrupts=1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allow_unsafe_interrupts
enables threat vector via MSI (message signaled interrupts). This is possible with PCI devices if there's access to device configuration space. See https://invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf - pages 6-7.
I'll add this note to acknowledge the threat from within the netvm and make a ghaf issue out of this to document it - at least until we have iommu group interrupt remapping support or other documented VMM mitigations. It may be that there's other mitigations these days that I'm not aware of. If that turns out to be the case, let's document those and close the issue after the fact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already had some discussion about this, and in the discussion the conclusion was that this allow_unsafe_interrupts is not needed. But I'm not sure did anyone test with this line removed
+ interconnect-names = "dma-mem", "write"; | ||
+ /delete-property/ iommus; | ||
+ /delete-property/ msi-parent; | ||
+ /delete-property/ msi-map; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous concern with MSI seems irrelevant as we bypass the IOMMU protections completely for now. As this may be temporary but enables functional passthrough, I'll leave the earlier gh issue open and link this there.
@@ -6,6 +6,12 @@ | |||
firewall.allowedUDPPorts = [67]; # DHCP | |||
useNetworkd = true; | |||
}; | |||
|
|||
networking.nat = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Address host NAT in the minimal host documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enables NAT from host to the VMs, i.e. you can share the host's network with VMs. This is not needed at all other than debugging purposes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. This should go only via nix development-modules then. Even if I'd like to see "no networking on host" in development mode as well, it may not be very practical at this phase in development when tools from nix caches are often quite helpful to try/debug something in the development. Making the host no-networking/read-only (like netvm) would require reflashing for any small change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works to enable further development of netvm with wifi passthrough.
# Passthrough Jetson Orin WiFi card | ||
boot.kernelParams = [ | ||
"vfio-pci" "ids=10ec:c82f" | ||
"vfio_iommu_type1.allow_unsafe_interrupts=1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already had some discussion about this, and in the discussion the conclusion was that this allow_unsafe_interrupts is not needed. But I'm not sure did anyone test with this line removed
@@ -16,4 +16,7 @@ | |||
|
|||
networking.hostName = "ghaf-host"; | |||
system.stateVersion = "22.11"; | |||
|
|||
# PCI passthrough needs larger locked-in-memory space than default | |||
systemd.services."microvm@".serviceConfig.LimitMEMLOCK = 999999999; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line breaks the build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was already fixed in microvm.nix astro/microvm.nix@68a0242
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll wait for @panufi 's comment on allow_unsafe_interrupts
before testing removing it.
@@ -6,6 +6,12 @@ | |||
firewall.allowedUDPPorts = [67]; # DHCP | |||
useNetworkd = true; | |||
}; | |||
|
|||
networking.nat = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enables NAT from host to the VMs, i.e. you can share the host's network with VMs. This is not needed at all other than debugging purposes
new file mode 100644 | ||
index 000000000000..e4656287da82 | ||
--- /dev/null | ||
+++ b/nvidia/platform/t23x/concord/kernel-dts/tegra234-p3701-host-passthrough.dts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like this to be an overlay, instead of defining completely new devicetree that replaces the old
# TODO: Maybe add store path or some unique identifier to the filename | ||
extraFiles."dtbs/${config.hardware.deviceTree.name}" = "${config.hardware.deviceTree.package}/${config.hardware.deviceTree.name}"; | ||
extraInstallCommands = '' | ||
default_cfg=$(cat /boot/loader/loader.conf | grep default | awk '{print $2}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This stuff is now available in module boot/systemd-boot-dtb.nix. This whole WIP commit can be removed from this branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to push @panufi 's PR branch from his personal fork to under ghaf? We could then work many of these issues together in the same branch while he is out. Including rebasing with main to get some of the microvm.nix upstream changes merged to main earlier.
|
||
# Passthrough Jetson Orin WiFi card | ||
boot.kernelParams = [ | ||
"vfio-pci" "ids=10ec:c82f" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to have some toggle for this, and not enable this always by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the motivation on Orin reference device? Wifi is basically the only networking interface we can easily and consistently pass through to netvm. I'd rather have it enabled by default even when we can't/shouldn't make it connect any WIFI automatically. And to support wifi, bring nmcli
to netvm
to easily scan and connect wifis over debug console.
# path = "0001:01:00.0"; | ||
# } | ||
# ]; | ||
microvm.devices = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to have some toggle for this, and not enable this always by default. Also this change is NVIDIA specific but affects all other targets as well. Of course this is a bigger refactoring question that might not be addressed within the context of this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's valid toggle.
@@ -23,7 +23,8 @@ nixpkgs.lib.nixosSystem { | |||
# For WLAN firmwares | |||
hardware.enableRedistributableFirmware = true; | |||
|
|||
microvm.hypervisor = "crosvm"; | |||
# TODO: change back to crosvm after tested working | |||
microvm.hypervisor = "qemu"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the situation with the crosvm currently?
|
||
hardware.deviceTree = { | ||
enable = true; | ||
name = "tegra234-p3701-host-passthrough.dtb"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like this to be an overlay, instead of defining completely new devicetree that replaces the old
I wasn't able to change branch in this PR, so let's continue this in new PR #107 |
Using qemu as hypervisor instead of crosvm because current version of crosvm didn't support passthrough.