VGA/GPU PCI Passthrough not working. #3

Open
EpiJunkie opened this issue Aug 27, 2016 · 26 comments

@EpiJunkie
Member

commented Aug 27, 2016

On Wednesday night I attempted to get VGA PCI Passthrough to work with an ATI Radeon 5450 in a Dell R610 and was unsuccessful. While this is not a chyves issue, I have not found a thread/resource with attempts to get this working with bhyve.

My issue was the GPU would not load the ppt drivers. Here is my full dmesg after modifications to /boot/loader.conf. In the past I have been successful in passing NICs to virtualized guests on this hardware using FreeBSD 10.3 and VMware ESXi 6.

root@bhost:~ # sysctl -n kern.osreldate
1200003
root@bhost:~ # pciconf -lv vgapci0
vgapci0@pci0:4:0:0: class=0x030000 card=0xe164174b chip=0x68f91002 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Cedar [Radeon HD 5000/6000/7350/8350 Series]'
    class      = display
    subclass   = VGA
root@bhost:~ # pciconf -lv hdac0
hdac0@pci0:4:0:1:   class=0x040300 card=0xaa68174b chip=0xaa681002 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Cedar HDMI Audio [Radeon HD 5400/6300 Series]'
    class      = multimedia
    subclass   = HDA
root@bhost:~ # dmesg -a | grep -E '^ppt|^vgapci|^hda'
vgapci0: <VGA-compatible display> port 0xec00-0xecff mem 0xc0000000-0xcfffffff,0xdf1e0000-0xdf1fffff irq 38 at device 0.0 on pci3
hdac0: <ATI RV810 HDA Controller> mem 0xdf1dc000-0xdf1dffff irq 45 at device 0.1 on pci3
vgapci1: <VGA-compatible display> mem 0xd2000000-0xd27fffff,0xde7fc000-0xde7fffff,0xde800000-0xdeffffff irq 19 at device 3.0 on pci6
vgapci1: Boot video device
hdacc0: <ATI R6xx HDA CODEC> at cad 0 on hdac0
hdaa0: <ATI R6xx Audio Function Group> at nid 1 on hdacc0

/boot/loader.conf

...
pptdevs="4/0/0 4/0/1"
root@bhost:~ # reboot
<waiting>
root@bhost:~ # dmesg -a | grep -E '^ppt|^vgapci|^hda'
vgapci0: <VGA-compatible display> port 0xec00-0xecff mem 0xc0000000-0xcfffffff,0xdf1e0000-0xdf1fffff irq 38 at device 0.0 on pci3
hdac0: <ATI RV810 HDA Controller> mem 0xdf1dc000-0xdf1dffff irq 45 at device 0.1 on pci3
vgapci1: <VGA-compatible display> mem 0xd2000000-0xd27fffff,0xde7fc000-0xde7fffff,0xde800000-0xdeffffff irq 19 at device 3.0 on pci6
vgapci1: Boot video device
hdacc0: <ATI R6xx HDA CODEC> at cad 0 on hdac0
hdaa0: <ATI R6xx Audio Function Group> at nid 1 on hdacc0
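
The pptdevs value above is built from the bus, slot, and function numbers in the pciconf selectors (vgapci0@pci0:4:0:0 becomes 4/0/0). A minimal sketch of that conversion follows. pciconf_to_pptdevs is a hypothetical helper name, not part of FreeBSD or chyves, and it assumes the selector layout shown in the pciconf output above.

```sh
#!/bin/sh
# pciconf_to_pptdevs (hypothetical helper): read `pciconf -l` style lines on
# stdin and print one bus/slot/function triple per device line.
# Assumes selectors of the form name@pciDOMAIN:BUS:SLOT:FUNCTION: as seen above.
# Indented detail lines (vendor, device, class) do not match and are skipped.
pciconf_to_pptdevs() {
    sed -n 's|^[^@]*@pci[0-9]*:\([0-9]*\):\([0-9]*\):\([0-9]*\):.*|\1/\2/\3|p'
}

# Example with the two selectors from this host, fed in as literal text:
printf '%s\n' \
    'vgapci0@pci0:4:0:0: class=0x030000' \
    'hdac0@pci0:4:0:1:   class=0x040300' | pciconf_to_pptdevs
```

On a real host the input would come from pciconf itself, and the printed triples would be joined with spaces to form the pptdevs value.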

I plan to install ESXi on this hardware configuration to rule out the hardware as a possibility. I also attempted to get the motherboard VGA device to passthrough but was unsuccessful with that as well.

On Reddit, /u/sirdond was successful in getting their video card to load the ppt drivers but was unsuccessful in getting the guest to start. Hopefully they will report their findings and error message here.

If anyone else could report their success and failures of VGA/GPU passthrough on bhyve here, it would be appreciated. Important things to post include formatted output from:

  • sysctl -n kern.osreldate
  • dmesg -a
  • pciconf -lv
  • grep "pptdevs" /boot/loader.conf
  • chyves <guest> get all
  • The actual bhyve command used to start the guest. This is recorded by chyves after the line "20YY-MM-DDTHH:MM:SS+0000 - [3] - bhyve command:" in the log file at "/chyves//guests//logs/201xxx.log".
  • Any error messages with context.
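
The commands in the list above could be gathered into one file with something like the following sketch. collect_report and run_section are hypothetical helper names, and the default output file name is an assumption. The bhyve command from the chyves log is not included because the log path depends on the pool and guest.

```sh
#!/bin/sh
# Sketch: gather the diagnostics requested above into one text file.
# collect_report and run_section are hypothetical helpers, not chyves commands.

# Print a header, run the command, and note a failure instead of aborting.
run_section() {
    printf '===== %s =====\n' "$*"
    "$@" 2>&1 || printf '(command failed or is not available on this host)\n'
    printf '\n'
}

# $1: guest name, $2: output file (the default name is an assumption).
collect_report() {
    guest="$1"
    out="${2:-passthru-report.txt}"
    {
        run_section sysctl -n kern.osreldate
        run_section dmesg -a
        run_section pciconf -lv
        run_section grep pptdevs /boot/loader.conf
        run_section chyves "$guest" get all
    } > "$out"
    printf 'Report written to %s\n' "$out"
}
```

For example, `collect_report windowsguest report.txt` would write all five sections to report.txt, marking any command that fails rather than stopping.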
@ghost


commented Aug 27, 2016

Hey there,

Thanks for opening this issue! (u/sirdond here)

Here are the outputs of the aforementioned commands:

root@desktop:/home/dani # sysctl -n kern.osreldate
1100122
root@desktop:/home/dani #
root@desktop:/home/dani # pciconf -lv | grep -A4 '^ppt'
ppt0@pci0:1:0:0:    class=0x030000 card=0x27513842 chip=0x138110de rev=0xa2 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    device     = 'GM107 [GeForce GTX 750]'
    class      = display
    subclass   = VGA
ppt1@pci0:1:0:1:    class=0x040300 card=0x27513842 chip=0x0fbc10de rev=0xa1 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    class      = multimedia
    subclass   = HDA
root@desktop:/home/dani #
root@desktop:/home/dani # grep "pptdevs" /boot/loader.conf
pptdevs="1/0/0 1/0/1"
root@desktop:/home/dani #
root@desktop:/home/dani # chyves windowsguest get all
Getting all windowsguest's properties...
bargs                                -A -H -P -S
bhyve_net_type                       virtio-net
chyves_guest_version                 0200
cpu                                  2
creation                             Created on 2016 aug. 27 Szo 09:29:55 CEST by chyves v0.1.0 2016/08/21 using __create()
description                          -
loader                               uefi
net_ifaces                           tap50
notes                                -
os                                   default
pcidev_0                             passthru,1/0/0
pcidev_1                             passthru,1/0/1
ram                                  4G
rcboot                               0
revert_to_snapshot_method            off
revert_to_snapshot
serial                               nmdm50
template                             no
uefi_console_output                  vnc
uefi_firmware                        BHYVE_UEFI.fd
uefi_vnc_client_custom_cmd
uefi_vnc_client                      print
uefi_vnc_ip                          0.0.0.0
uefi_vnc_mouse_type                  ps2
uefi_vnc_pause_until_client_connect  no
uefi_vnc_port                        5900
uefi_vnc_res                         800x600
uuid                                 0d0ef768-6c28-11e6-af35-d0509927f3c3

Starting the VM:

root@desktop:/home/dani # chyves windowsguest start
Preparing to start guest: windowsguest
Loading guest parameters... done.
Checking if guest is running... not running.
Checking if VMM resources allocated... not allocated.
Checking which type of loader to use... UEFI.
Generating bhyve string for UEFI firmware... done.
Generating bhyve PCI string for 1 disks... done.
Generating bhyve PCI string for VNC console... done.
Generating bhyve PCI string for mouse... PS/2 mouse... done.
Generating bhyve string for network devices... 
tap50 does not exist on system... creating interface... done.
bridge0 does not exist on system... creating interface... done.
Adding 'tap50' as a member of 'bridge0'... done.
Adding 'em0' as a member of 'bridge0'... done.
   ...done.
Generating bhyve string for custom PCI devices (if any)... More devices than available PCI Slots and PCI functions can provide, some devices will not be attached.
done.

root@desktop:/home/dani # Starting windowsguest
fbuf frame buffer base: 0x942800000 [sz 16777216]
Assertion failed: (error == 0), function modify_bar_registration, file /usr/src/usr.sbin/bhyve/pci_emul.c, line 491.
Abort trap (core dumped)
Powering off windowsguest... 
Reclaiming windowsguest VMM resources...

I should mention here, the error occurs after the boot, when the Windows loading screen tries to load.

The actual bhyve command

2016-08-27T14:43:08+0000 - [3] - bhyve command:
2016-08-27T14:43:08+0000 - [3] - bhyve -A -H -P -S -c 2 -U 0d0ef768-6c28-11e6-af35-d0509927f3c3 -m 4G -s 0,hostbridge -s 3,ahci-cd,/chyves/zroot/ISO/null.iso/null.iso  -s 4,ahci-hd,/dev/zvol/zroot/chyves/guests/windowsguest/disk0  -s 5,virtio-net,tap50   -s 6,passthru,1/0/0 -s 6:1,passthru,1/0/1 -s 2,fbuf,tcp=0.0.0.0:5900,w=800,h=600  -l bootrom,/chyves/zroot/Firmware/BHYVE_UEFI.fd/BHYVE_UEFI.fd -s 31,lpc chy-windowsguest
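
As a rough sketch, the command can be pulled out of a chyves log by printing the line that follows the "bhyve command:" marker and stripping the timestamp prefix. extract_bhyve_cmd is a hypothetical helper name, and the log layout is assumed to match the two lines quoted above.

```sh
#!/bin/sh
# extract_bhyve_cmd (hypothetical helper): print the bhyve command line(s)
# recorded in a chyves log file given as $1.
# Assumes each command follows a line ending in "bhyve command:" and carries
# the same "TIMESTAMP - [N] - " prefix as the marker line.
extract_bhyve_cmd() {
    awk '
        found { sub(/^.* - \[[0-9]+\] - /, ""); print; found = 0 }
        /bhyve command:[ \t]*$/ { found = 1 }
    ' "$1"
}
```

Called as `extract_bhyve_cmd <logfile>`, it prints one bare bhyve command for every start recorded in that log.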

#dmesg -a
http://pastebin.com/ZydJmMPm

As a supplement: I get good overall results with this same hardware on an Arch Linux host, with QEMU doing PCI pass-through to a Windows 7 UEFI guest.

@EpiJunkie

Member Author

commented Aug 27, 2016

@nydn thanks for posting here.

I would try turning the VNC console "off" (chyves windowsguest set uefi_console_output=serial) and see if that makes a difference. I have only seen the console message "fbuf frame buffer base: *" after the bhyve process has started and the guest is starting to boot, or is ready to boot when the wait directive is used (aka uefi_vnc_pause_until_client_connect=yes). It seems that the bhyve process has started but maybe there is a conflict with the VNC fbuf. I also have a hunch that you may need to reboot the host after each guest boot when using VGA passthrough, though this is highly speculative on my part.

Also do you have RDP enabled on the Windows guest? You may need to use that until you can pass an entire USB controller to the guest. bhyve does not support passing an individual USB device to a guest so the entire controller must be passed. Most systems have at least a couple of USB controllers fortunately.

@ghost


commented Aug 27, 2016

Yes, RDP is configured on the guest. Turned off VNC as you advised, without a pass-through device I can connect through RDP and everything is good and stable. With pass-through set, I'm getting the same error as in the previous post ("Assertion failed: ..."), even after reboots.

Also fixed that missing swap that I just noticed from my 'dmesg' output, unfortunately, that wasn't the problem :)

@EpiJunkie

Member Author

commented Aug 27, 2016

I have been able to get the ppt drivers to load for my ATI Radeon 5450. I overlooked a critical step: I did not have vmm_load="YES" set in /boot/loader.conf. I feel a bit foolish for that mistake; I was relying on chyves to load the module.
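
A small check along these lines might catch that omission before a reboot. check_loader_conf is a hypothetical helper, not part of chyves or FreeBSD, and it only looks for the two settings mentioned in this thread, spelled as they appear here.

```sh
#!/bin/sh
# check_loader_conf (hypothetical helper): report whether a loader.conf sets
# both vmm_load="YES" and a non-empty pptdevs list.
# $1: file to check (defaults to /boot/loader.conf).
# Returns 0 when both settings are present, 1 otherwise.
check_loader_conf() {
    conf="${1:-/boot/loader.conf}"
    status=0
    if ! grep -Eq '^[[:space:]]*vmm_load="YES"' "$conf"; then
        echo 'missing: vmm_load="YES"'
        status=1
    fi
    if ! grep -Eq '^[[:space:]]*pptdevs="[^"]+"' "$conf"; then
        echo 'missing: pptdevs="bus/slot/function ..."'
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo 'ok: vmm_load and pptdevs are both set'
    fi
    return "$status"
}
```

Running `check_loader_conf` with no argument checks /boot/loader.conf; passing a path checks a copy instead.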

I get the same error message when starting a Windows guest.

[1] Starting pcipassgst
Assertion failed: (error == 0), function modify_bar_registration, file /usr/src/usr.sbin/bhyve/pci_emul.c, line 491.
Abort trap (core dumped)
[3] Exit code for bhyve guest 'pcipassgst' was '134'

However, I tried CentOS (uefi) and FreeBSD (bhyveload) and both seem to boot correctly and run without a core dump. The monitor never received a signal over the HDMI port; I have not yet tested the DVI or VGA port, nor have I opened the serial console. I am currently loading an Ubuntu ISO as my baseline, since I know it works with this card out of the box. Will keep you posted.

@EpiJunkie

Member Author

commented Aug 28, 2016

On the Ubuntu guest, the PCI card is being passed through; however, the card is not being initialized correctly. See below.

root@ubuntu:/home/justin# dmesg | egrep 'drm|radeon|hda'
[    5.098850] [drm] Initialized drm 1.1.0 20060810
[    5.144332] [drm] radeon kernel modesetting enabled.
[    5.589194] radeon 0000:00:07.0: can't derive routing for PCI INT A
[    5.589196] radeon 0000:00:07.0: PCI INT A: no GSI
[    5.589945] [drm] initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164).
[    5.589972] [drm] register mmio base: 0xC2060000
[    5.589974] [drm] register mmio size: 131072
[    5.590042] radeon 0000:00:07.0: Invalid ROM contents
[    5.590067] radeon 0000:00:07.0: Invalid ROM contents
[    5.590070] [drm:radeon_get_bios] *ERROR* Unable to locate a BIOS ROM
[    5.590072] radeon 0000:00:07.0: Fatal error during GPU init
[    5.590074] [drm] radeon: finishing device.
[    5.597510] radeon 0000:00:07.0: can't derive routing for PCI INT A
[    5.597518] radeon: probe of 0000:00:07.0 failed with error -22
[   14.469744] snd_hda_intel 0000:00:07.1: can't derive routing for PCI INT B
[   14.469749] snd_hda_intel 0000:00:07.1: PCI INT B: no GSI
[   14.469782] hda-intel 0000:00:07.1: Using LPIB position fix
[   14.469924] snd_hda_intel 0000:00:07.1: irq 54 for MSI/MSI-X
[   14.517048] hda-intel 0000:00:07.1: Enable sync_write for stable communication
root@ubuntu:/home/justin# 
root@ubuntu:/home/justin# lspci
00:00.0 Host bridge: Network Appliance Corporation Device 1275
00:02.0 VGA compatible controller: Device fb5d:40fb
00:03.0 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode]
00:04.0 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode]
00:05.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller
00:06.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper)
00:07.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series]
00:07.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300 Series]
00:08.0 Ethernet controller: Red Hat, Inc Virtio network device
00:1f.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
root@ubuntu:/home/justin#
root@ubuntu:/home/justin# lspci -v
...
00:07.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cedar [Radeon HD 5000/6000/7350/8350 Series] (prog-if 00 [VGA controller])
    Subsystem: PC Partner Limited / Sapphire Technology Device e164
    Flags: fast devsel, IRQ 38
    Memory at <ignored> (64-bit, prefetchable)
    Memory at c2060000 (64-bit, prefetchable) [size=128K]
    I/O ports at 2100 [size=256]
    Expansion ROM at df100000 [disabled] [size=128K]
    Capabilities: [50] Power Management version 3
    Capabilities: [58] Express Legacy Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+

00:07.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cedar HDMI Audio [Radeon HD 5400/6300 Series]
    Subsystem: PC Partner Limited / Sapphire Technology Device aa68
    Flags: bus master, fast devsel, latency 0, IRQ 54
    Memory at c2080000 (64-bit, prefetchable) [size=16K]
    Capabilities: [50] Power Management version 3
    Capabilities: [58] Express Legacy Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Kernel driver in use: snd_hda_intel
...

Full dmesg.
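
To pick the relevant failure lines out of a long guest dmesg, a filter like this sketch may help. gpu_init_errors is a hypothetical helper name, and the patterns are simply the radeon messages quoted above, so other drivers will likely need different ones.

```sh
#!/bin/sh
# gpu_init_errors (hypothetical helper): read guest dmesg text on stdin and
# print only the lines matching the failure messages quoted in this thread.
# Exit status is 0 when at least one such line is found, non-zero otherwise.
gpu_init_errors() {
    grep -E 'Invalid ROM contents|Unable to locate a BIOS ROM|Fatal error during GPU init|probe of .* failed with error'
}
```

Inside the guest this could be used as `dmesg | gpu_init_errors`.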

@ghost


commented Aug 28, 2016

Your GPU ROM needs to support UEFI too, maybe that's why.
You can check your card here
However that list is not always correct, from what I have experienced.

Will also try out an Ubuntu in the next few days.

@ghost


commented Sep 1, 2016

I tried to install Ubuntu 16.04 Desktop with the UEFI loader and I can get to the very end without any errors; however, I got stuck on the reboot screen of the installer. If I stop the VM via chyves or bhyvectl and then start it again, the OS does not load. Could you briefly explain how you installed Ubuntu with UEFI?

@EpiJunkie

Member Author

commented Sep 2, 2016

That makes sense about the GPU ROM needing UEFI. I tried flashing the ATI 5450 with a UEFI ROM but it failed. I have another card, but I do not think it has a UEFI ROM either; unfortunately, that system does not have VT-d. I am waiting on a cheap CPU replacement to show up on eBay that is VT-d capable.

root@bhost:~ # chyves pcipass get all
Getting all pcipass's properties...
bargs                                -A -H -P -S
bhyve_net_type                       e1000
chyves_guest_version                 0200
cpu                                  2
creation                             Created on Thu Aug 25 20:55:04 MDT 2016 by chyves v0.1.0 2016/08/21 using __create()
description                          -
loader                               uefi
net_ifaces                           tap64
notes                                -
os                                   debian
pcidev_0                             passthru,4/0/0
pcidev_1                             passthru,4/0/1
pcidev_2                             virtio-net,tap10
ram                                  4G
rcboot                               0
revert_to_snapshot_method            off
revert_to_snapshot
serial                               nmdm64
template                             no
uefi_console_output                  vnc
uefi_firmware                        BHYVE_UEFI_20160526.fd
uefi_vnc_client_custom_cmd
uefi_vnc_client                      print
uefi_vnc_ip                          0.0.0.0
uefi_vnc_mouse_type                  usb3
uefi_vnc_pause_until_client_connect  yes
uefi_vnc_port                        5902
uefi_vnc_res                         1280x1024
uuid                                 f6350f61-6cb8-11e6-9dfb-00219ba30e02

I used ubuntu-16.04.1-desktop-amd64.iso on a non-LVM installation after selecting the "Try Ubuntu" boot option.

@EpiJunkie

Member Author

commented Sep 2, 2016

As for the ~"Press ENTER to eject the optical media" message on the reboot: I had to do a chyves pcipass stop force to get it to fully shut down, as the keyboard input was not taking and most of the OS is down at that point, so ACPI shutdown interrupts are not handled.

@ghost


commented Sep 3, 2016

Thank you! I've tried with the exact same configuration as yours before and now again, but it always fails to boot the installed system after reboot. And I've finally found out why: the boot option does not get saved in the UEFI firmware and it just loads the default firmware. So, if you add a new boot option in the EFI menu (btw, I can't even use the shell, because there are no special characters like backslash or colon, so I needed the menu) to load the installed system, Ubuntu in this case, and save it, then you can select it as a boot option and the guest system starts perfectly. However, after stopping the VM, the EFI is reset and you have to repeat the same steps.

As for PCI passthru, sadly it's not working; this time I'm getting:

Assertion failed: (pi->pi_bar[baridx].type == PCIBAR_IO), function passthru_write, file /usr/src/usr.sbin/bhyve/pci_passthru.c, line 850.
@ghost


commented Sep 3, 2016

@EpiJunkie

Member Author

commented Sep 3, 2016

I realized what I did. I was primarily testing with ubuntu-14.04-desktop-amd64.iso which (for me) does respond to ~"Press ENTER to eject the optical media" after installation. I guess I never booted the guest again after powering down and did a fresh install each time I started the guest. It seems the guest will not boot once the VMM resources are reclaimed. My guess is that the UEFI firmware file is treated as a read-only file, so rebooting effectively discards the modifications made by the installer. When I tested "ubuntu-16.04.1-desktop-amd64.iso" I guess I did not verify the guest would boot from the hard drive after installation. My apologies.

As a workaround, you can download the rEFInd CD-R image from here and then replace the null.iso file with it. With the CD-R image of rEFInd, the default boot selection is the hard drive. Because null.iso is replaced with rEFInd, every UEFI guest on the system will boot through rEFInd whenever no ISO image is specified.

unzip refind-cd-0.10.3.zip 
chyves iso delete null.iso
mv refind-cd-0.10.3.iso null.iso
chyves iso import null.iso

I have not tested this against many UEFI guests (only Windows and Ubuntu) but I don't think it will cause a conflict with other OSes. Note that using rEFInd will increase boot times by 20 seconds due to the timeout. If the above method does cause a conflict, an alternative is to first undo it:

chyves iso delete null.iso
rm null.iso
touch null.iso
chyves iso import null.iso

Then for each desired guest, attach the rEFInd CD-R image.

unzip refind-cd-0.10.3.zip 
chyves iso import refind-cd-0.10.3.iso
chyves <guest> set pcidev_{n}=ahci-cd,/chyves/<primary_pool>/ISO/refind-cd-0.10.3.iso/refind-cd-0.10.3.iso

I actually use this latter method to boot a debian grub-bhyve guest which would not boot after I upgraded to FreeBSD 11, even though there were no other changes to the host or guest.

Also, can you try if this works for you?
https://lists.freebsd.org/pipermail/freebsd-virtualization/2016-September/004691.html

Using UEFI loader, I ran ubuntu-14.04-desktop-amd64.iso as a live cd and added the 'pci=nocrs' kernel option but the card failed to initialize. Sort of expected due to the lack of UEFI support on the VGA card.

I changed the guest to boot using grub-bhyve, then installed from ubuntu-14.04-server-amd64.iso, and added the 'pci=nocrs' kernel option but the card failed to initialize. Here is the full boot output, below is just an excerpt.

justin@pcipass:~$ dmesg | grep -E 'drm|radeon|hda'
[    2.533392] [drm] Initialized drm 1.1.0 20060810
[    2.995605] snd_hda_intel 0000:00:06.1: can't derive routing for PCI INT B
[    2.995608] snd_hda_intel 0000:00:06.1: PCI INT B: no GSI
[    2.995697] hda-intel 0000:00:06.1: Using LPIB position fix
[    2.995842] snd_hda_intel 0000:00:06.1: irq 52 for MSI/MSI-X
[   28.314235] Modules linked in: radeon(+) ghash_clmulni_intel(+) snd_hda_intel(+) snd_hda_codec aesni_intel snd_hwdep aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_pcm ttm snd_page_alloc snd_timer drm_kms_helper snd psmouse drm serio_raw soundcore i2c_algo_bit mac_hid lp parport e1000 ahci libahci
[   28.494079] [drm] radeon kernel modesetting enabled.
[   28.512427] hda-intel 0000:00:06.1: Enable sync_write for stable communication
[   28.723602] radeon 0000:00:06.0: can't derive routing for PCI INT A
[   28.723606] radeon 0000:00:06.0: PCI INT A: no GSI
[   28.726377] [drm] initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164).
[   28.731186] [drm] register mmio base: 0xC0060000
[   28.731188] [drm] register mmio size: 131072
[   28.731266] radeon 0000:00:06.0: Invalid ROM contents
[   28.732341] radeon 0000:00:06.0: Invalid ROM contents
[   28.733363] [drm:radeon_get_bios] *ERROR* Unable to locate a BIOS ROM
[   28.734686] radeon 0000:00:06.0: Fatal error during GPU init
[   28.736127] [drm] radeon: finishing device.
[   28.762837] radeon 0000:00:06.0: can't derive routing for PCI INT A
[   28.762847] radeon: probe of 0000:00:06.0 failed with error -22

I personally think having a VGA card initialize via UEFI is more likely to work than via grub-bhyve.
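
For anyone repeating the 'pci=nocrs' test on an installed Ubuntu guest, one way to make the option persistent is sketched below. This is an assumption about the guest rather than what was done in the tests above: it presumes the stock Ubuntu layout with GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, and add_kernel_option is a hypothetical helper name.

```sh
#!/bin/sh
# add_kernel_option (hypothetical helper): append a kernel option to the
# GRUB_CMDLINE_LINUX_DEFAULT="..." line of a GRUB defaults file.
# $1: option such as pci=nocrs (must not contain | or & because it is
#     pasted into a sed expression), $2: file, default /etc/default/grub.
# Assumes the value is wrapped in double quotes, as on stock Ubuntu.
add_kernel_option() {
    opt="$1"
    file="${2:-/etc/default/grub}"
    # Do nothing if the option is already present on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$opt"; then
        return 0
    fi
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $opt\"|" \
        "$file" > "$file.new" && mv "$file.new" "$file"
    # On Ubuntu, `update-grub` then has to be run (as root) to regenerate grub.cfg.
}
```

Inside the guest this would be run as root, for example `add_kernel_option pci=nocrs`, followed by `update-grub` and a reboot.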

@EpiJunkie

Member Author

commented Sep 7, 2016

There have been some interesting developments recently:

@nydn do you have a 12-CURRENT machine running that you can test with? I am still waiting on a deal on eBay for a third generation Intel Core i5/i7 processor.

@ghost


commented Sep 8, 2016

I did not have much time last weekend, so could not try the rEFInd method, but will keep that in mind, thank you!

Also, I have no machine with 12-CURRENT, but there is a spare disk I can use for testing.

@Dolpa70


commented Sep 11, 2016

Hi Folks,

Today I tested 12-CURRENT r305679. My hardware is an EP2C602-4L/D16 with dual Xeon E5-2680 CPUs and 64GB of ECC DDR3. For testing, I would like to pass a VGA card, an HBA, a WiFi adapter, and a USB port to the guest. First of all I played with the VGA card, an AMD Radeon RX-480 (the card has an EFI ROM).
I installed Ubuntu 16.04 in VNC mode without the VGA card, upgraded to the latest packages, installed the latest amdgpu driver from amd.com, and added "pci=nocrs" to the Linux guest's kernel command line (otherwise the guest crashed). I then rebooted with VNC turned off.

The guest booted successfully and recognized the VGA card, but there was no output. I'm not expert enough to fully interpret the logs, but I hope this will help. By the way, the VGA card and the amdgpu driver work fine in a non-virtualized environment. Unfortunately, the rest of my hardware was not working.

# sysctl -n kern.osreldate
1200007

# grep "pptdevs" /boot/loader.conf
pptdevs="0/26/0 2/0/0 3/0/0 129/0/0 129/0/1"

# pciconf -lv | grep -A4 '^ppt'
ppt2@pci0:0:26:0:       class=0x0c0320 card=0x1d2d1849 chip=0x1d2d8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'C600/X79 series chipset USB2 Enhanced Host Controller'
    class      = serial bus
    subclass   = USB
ppt0@pci0:2:0:0:        class=0x028000 card=0x21001a3b chip=0x0037168c rev=0x01 hdr=0x00
    vendor     = 'Qualcomm Atheros'
    device     = 'AR9485 Wireless Network Adapter'
    class      = network
ppt1@pci0:3:0:0:        class=0x010700 card=0x30201000 chip=0x00721000 rev=0x03 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]'
    class      = mass storage
    subclass   = SAS
ppt3@pci0:129:0:0:      class=0x030000 card=0xe347174b chip=0x67df1002 rev=0xc7 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Ellesmere [Polaris10]'
    class      = display
    subclass   = VGA
ppt4@pci0:129:0:1:      class=0x040300 card=0xaaf0174b chip=0xaaf01002 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    class      = multimedia
    subclass   = HDA
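
Since this thread started with devices that did not attach to ppt, a cross-check between the pptdevs list and the pciconf output may be useful. The sketch below prints every pptdevs entry that has no matching pptN line. missing_ppt is a hypothetical helper name, and it assumes the selector layout shown above.

```sh
#!/bin/sh
# missing_ppt (hypothetical helper): list pptdevs entries with no ppt driver.
# $1: the pptdevs value, e.g. "4/0/0 4/0/1"
# stdin: `pciconf -l` style output
# Prints one bus/slot/function triple per line for each unattached device.
missing_ppt() {
    wanted="$1"
    # Collect the triples of all devices currently claimed by ppt.
    attached=$(sed -n 's|^ppt[0-9]*@pci[0-9]*:\([0-9]*\):\([0-9]*\):\([0-9]*\):.*|\1/\2/\3|p' | tr '\n' ' ')
    for dev in $wanted; do
        case " $attached " in
            *" $dev "*) ;;          # attached to ppt, nothing to report
            *) echo "$dev" ;;       # listed in pptdevs but not claimed
        esac
    done
}
```

With the output above, `pciconf -l | missing_ppt "0/26/0 2/0/0 3/0/0 129/0/0 129/0/1"` would be expected to print nothing, because all five devices show up as ppt0 through ppt4.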

# chyves ubi1604 get all
Getting all ubi1604's properties...
bargs                      -A -H -P -S
bhyve_net_type             virtio-net
chyves_guest_version       0200
cpu                        8
creation                   Created on Sat Sep 10 21:11:29 CEST 2016 by chyves v0.1.1 2016/08/27 using __create()
description                -
loader                     uefi
net_ifaces                 tap53
notes                      -
os                         default
pcidev_6                   passthru,129/0/0
pcidev_7                   passthru,129/0/1
ram                        8G
rcboot                     0
revert_to_snapshot
revert_to_snapshot_method  off
serial                     nmdm52
tap53_mac                  00:16:3E:20:00:20
template                   no
uefi_console_output        console
uefi_firmware              BHYVE_UEFI_20160526.fd
uefi_vnc_ip                0.0.0.0
uefi_vnc_mouse_type        usb3
uefi_vnc_port              5901
uefi_vnc_res               1024x768
uuid                       52b852e1-778a-11e6-8a31-d05099799e99

On guest

# lspci
00:00.0 Host bridge: Network Appliance Corporation Device 1275
00:03.0 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode]
00:04.0 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port SATA Controller [AHCI mode]
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device
00:06.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7)
00:06.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aaf0
00:1f.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:06.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 67df (rev c7) (prog-if 00 [VGA controller])
  Subsystem: PC Partner Limited / Sapphire Technology Device e347
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
  Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Latency: 0, Cache Line Size: 64 bytes
  Interrupt: pin A routed to IRQ 37
  Region 0: Memory at d000000000 (64-bit, prefetchable) [size=256M]
  Region 2: Memory at c0200000 (64-bit, prefetchable) [size=2M]
  Region 4: I/O ports at 2100 [size=256]
  Region 5: Memory at c0400000 (32-bit, non-prefetchable) [size=256K]
  Expansion ROM at c0020000 [disabled] [size=128K]
  Capabilities: [48] Vendor Specific Information: Len=08 <?>
  Capabilities: [50] Power Management version 3
    Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
    Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
  Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
    DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
      ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
    DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
      RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
      MaxPayload 256 bytes, MaxReadReq 512 bytes
    DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
    LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L1, Exit Latency L0s <64ns, L1 <1us
      ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
    LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
      ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
    LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Not Supported
    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
    LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
       EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
  Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Address: 00000000feeff00c  Data: 41d1
  Kernel driver in use: amdgpu
  Kernel modules: amdgpu

00:06.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aaf0
  Subsystem: PC Partner Limited / Sapphire Technology Device aaf0
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
  Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Latency: 0, Cache Line Size: 64 bytes
  Interrupt: pin B routed to IRQ 17
  Region 0: Memory at c0440000 (64-bit, prefetchable) [size=16K]
  Capabilities: [48] Vendor Specific Information: Len=08 <?>
  Capabilities: [50] Power Management version 3
    Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
    Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
  Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
    DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
      ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
    DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
      RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
      MaxPayload 256 bytes, MaxReadReq 512 bytes
    DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
    LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L1, Exit Latency L0s <64ns, L1 <1us
      ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
    LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
      ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
    LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR+, OBFF Not Supported
    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
    LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
       EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
  Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
    Address: 00000000fee08000  Data: 0032
  Kernel modules: snd_hda_intel
# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:         46          0          0          0          0          0          0          0   IO-APIC   2-edge      timer
  1:          0          0          2          0          0          0          0          2   IO-APIC   1-edge      i8042
  8:          0          0          0          0          0          0          0          0   IO-APIC   8-edge      rtc0
  9:          0          0          0          0          0          0          0          0   IO-APIC   9-fasteoi   acpi
 12:         44          3          5          0          2          0          0          3   IO-APIC  12-edge      i8042
 32:          0          0          0          0          0          0          0          0   PCI-MSI 81920-edge      virtio0-config
 33:         84         78        495         69         91         65         55         57   PCI-MSI 81921-edge      virtio0-input.0
 34:          1          1          1          1          1          1          1          1   PCI-MSI 81922-edge      virtio0-output.0
 35:         66         19         62         13          6         20          4          8   PCI-MSI 49152-edge      0000:00:03.0
 36:       1470       1076        417        195        139         64         70         23   PCI-MSI 65536-edge      0000:00:04.0
 37:          1          0          1          0          0          1          0          1   PCI-MSI 98304-edge      amdgpu
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC:       4068      10713      12373      12450      17272      10169      17346      18187   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
IWI:          0          0          0          0          0          0          0          0   IRQ work interrupts
RTR:          0          0          0          0          0          0          0          0   APIC ICR read retries
RES:      11048       1206       1756       1294       1092       1255       1064       1224   Rescheduling interrupts
CAL:       1064       1405       1048       1608       1430       1211       1340       1306   Function call interrupts
TLB:         40         33         83         56         38         49         77         67   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
DFR:          0          0          0          0          0          0          0          0   Deferred Error APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:          1          1          1          1          1          1          1          1   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0          0          0          0          0          0          0   Posted-interrupt notification event
PIW:          0          0          0          0          0          0          0          0   Posted-interrupt wakeup event

Guest full dmesg

Guest full Xorg.log

@EpiJunkie

Member Author

commented Sep 11, 2016

Hello @Dolpa70, welcome to the mix. Thanks for posting your findings.

I was really hoping that after those recent commits mentioned above, we would have our first successful system. It seems you are really close.

This is promising:

 [    2.602406] amdgpu 0000:00:06.0: amdgpu: using MSI.
 [    2.603204] [drm] amdgpu: irq initialized.

However, this is not:

[    2.420207] [drm] amdgpu kernel modesetting enabled.
[    2.428299] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    2.428301] AMD IOMMUv2 functionality not available on this system
[    2.432284] CRAT table not found
[    2.432286] Finished initializing topology ret=0
[    2.432688] kfd kfd: Initialized module
[    2.434249] [drm] initializing kernel modesetting (POLARIS10 0x1002:0x67DF 0x174B:0xE347 0xC7).
[    2.434259] [drm] register mmio base: 0xC0400000
[    2.434260] [drm] register mmio size: 262144
[    2.434263] [drm] doorbell mmio base: 0xC0200000
[    2.434264] [drm] doorbell mmio size: 2097152
[    2.434363] amdgpu 0000:00:06.0: Invalid ROM contents

It does see you have two monitors connected:

[   35.926948] [drm] [Conn_Detect]  [HDMI-A-1] Panasonic-TV: [Block 0] 00 FF FF FF FF FF FF 00 34 A9 96 A2 01 01 01 01 00 19 01 03 80 80 48 78 0A DA FF A3 58 4A A2 29 17 49 4B 20 08 00 31 40 61 40 01 01 01 01 01 01 01 01 01 01 01 01 08 E8 00 30 F2 70 5A 80 B0 58 8A 00 BA 88 21 00 00 1E 02 3A 80 18 71 38 2D 40 58 2C 45 00 BA 88 21 00 00 1E 00 00 00 FC 00 50 61 6E 61 73 6F 6E 69 63 2D 54 56 0A 00 00 00 FD 00 17 3D 0F 88 3C 00 0A 20 20 20 20 20 20 01 21 ^
[   35.952026] [drm] [Conn_Detect]  [HDMI-A-2] D2343: [Block 0] 00 FF FF FF FF FF FF 00 1E 6D 2A 59 01 01 01 01 01 16 01 03 80 32 1C 78 EA E2 95 A2 55 4F 9F 26 11 50 54 21 08 00 71 40 81 C0 81 00 81 80 95 00 90 40 A9 C0 B3 00 02 3A 80 18 71 38 2D 40 58 2C 45 00 FD 1E 11 00 00 1E 00 00 00 FD 00 38 3D 1E 53 0F 00 0A 20 20 20 20 20 20 00 00 00 FC 00 44 32 33 34 33 0A 20 20 20 20 20 20 20 00 00 00 FF 00 0A 20 20 20 20 20 20 20 20 20 20 20 20 01 1B ^

To me, the biggest issues in the Xorg log are here:

[    49.112] (II) AMDGPU: Driver for AMD Radeon chipsets: BONAIRE, BONAIRE, BONAIRE,
    BONAIRE, BONAIRE, BONAIRE, BONAIRE, BONAIRE, BONAIRE, BONAIRE,
    BONAIRE, KABINI, KABINI, KABINI, KABINI, KABINI, KABINI, KABINI,
    KABINI, KABINI, KABINI, KABINI, KABINI, KABINI, KABINI, KABINI,
    KABINI, KAVERI, KAVERI, KAVERI, KAVERI, KAVERI, KAVERI, KAVERI,
    KAVERI, KAVERI, KAVERI, KAVERI, KAVERI, KAVERI, KAVERI, KAVERI,
    KAVERI, KAVERI, KAVERI, KAVERI, KAVERI, KAVERI, HAWAII, HAWAII,
    HAWAII, HAWAII, HAWAII, HAWAII, HAWAII, HAWAII, HAWAII, HAWAII,
    HAWAII, HAWAII, TOPAZ, TOPAZ, TOPAZ, TOPAZ, TOPAZ, TONGA, TONGA,
    TONGA, TONGA, TONGA, TONGA, TONGA, TONGA, TONGA, CARRIZO, CARRIZO,
    CARRIZO, CARRIZO, CARRIZO, FIJI, STONEY, POLARIS11, POLARIS11,
    POLARIS11, POLARIS11, POLARIS11, POLARIS11, POLARIS11, POLARIS11,
    POLARIS11, POLARIS10, POLARIS10, POLARIS10, POLARIS10, POLARIS10,
    POLARIS10, POLARIS10, POLARIS10, POLARIS10, POLARIS10, POLARIS10
[    49.113] (II) modesetting: Driver for Modesetting Kernel Drivers: kms
[    49.113] (II) FBDEV: driver for framebuffer: fbdev
[    49.113] (II) VESA: driver for VESA chipsets: vesa
[    49.113] (WW) xf86OpenConsole: setpgid failed: Operation not permitted
[    49.113] (WW) xf86OpenConsole: setsid failed: Operation not permitted
[    49.113] (EE) open /dev/dri/card0: No such file or directory
[    49.113] (WW) Falling back to old probe method for modesetting
[    49.113] (EE) open /dev/dri/card0: No such file or directory

To me, this looks like a driver issue in the guest ("AMD IOMMUv2 functionality not available on this system"), specifically one caused by the guest being virtualized. Can you test this guest with another OS, such as Windows? Also, have you tried passing this video card through on another hypervisor, such as KVM or VMware?
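
A quick way to pull those markers out of a saved guest dmesg is a small filter like the one below. This is only a sketch: the function name triage_amdgpu_dmesg and the OK/WARN labels are made up here, and the patterns are just the message fragments quoted in this comment, not an authoritative list of amdgpu health indicators.

```shell
#!/bin/sh
# Sketch: classify amdgpu-related lines from a guest kernel log on stdin.
# The function name and the OK/WARN labels are illustrative only.
triage_amdgpu_dmesg() {
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            *"amdgpu: using MSI"*|*"amdgpu: irq initialized"*)
                # Interrupt setup succeeded inside the guest.
                printf 'OK   %s\n' "$line" ;;
            *"Invalid ROM contents"*|*"IOMMUv2 functionality not available"*)
                # Messages flagged as suspicious in this thread.
                printf 'WARN %s\n' "$line" ;;
        esac
    done
}
```

Typical use would be `dmesg | triage_amdgpu_dmesg` inside the guest, or feeding it a saved log file on stdin.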

If it were my system, I would try changing the bargs to '-H -P -S' and removing 'pci=nocrs', but I would not expect it to work. The '-A' bhyve flag that this drops is documented as "Generate ACPI tables. Required for FreeBSD/amd64 guests.", so removing it might change how the virtual machine is created.

As a last resort, I would join the FreeBSD virtualization mailing list and see what they think. Many of the bhyve developers monitor that list.

VGA AMD Radeon RX-480 (the card has EFI ROM).

One last thing: how did you determine that the card has an EFI ROM? Thank you.

Hopefully we can get this figured out.

@Dolpa70

commented Sep 11, 2016

Hello @EpiJunkie, thank you for your response!

Normally I use QEMU/KVM on Ubuntu 16.04, and all my guests (currently Linux, FreeBSD, and Windows 10) work fine. My desktop is also virtualized: an Ubuntu 16.04 guest, and sometimes Windows 10 or Windows Server 2016, all running in UEFI mode. Everything works well and I'm happy with the performance. My IBM M1015 HBA works with FreeNAS, and the Qualcomm Atheros wireless card works with pfSense.
I have used Linux for 20 years, but I recently became interested in FreeBSD and, of course, bhyve, so I test it in my spare time. I like virtualization.

To answer your questions: GPU-Z in Windows reports that the card has an EFI ROM, and it works fine in UEFI mode. Removing the -A flag doesn't help. If I leave out 'pci=nocrs', the guest crashes and I get this:

Assertion failed: (pi->pi_bar[baridx].type == PCIBAR_IO), function passthru_read, file /usr/src/usr.sbin/bhyve/pci_passthru.c, line 874.

Something is not clear to me. My Xorg.log says:

[    49.113] (EE) open /dev/dri/card0: No such file or directory
[    49.113] (WW) Falling back to old probe method for modesetting
[    49.113] (EE) open /dev/dri/card0: No such file or directory

But the device does exist:

# ll /dev/dri/
total 0
drwxr-xr-x  2 root root       100 Sep 11  2016 ./
drwxr-xr-x 18 root root      4260 Sep 11  2016 ../
crw-rw----  1 root video 226,   0 Sep 11 11:51 card0
crw-rw----  1 root video 226,  64 Sep 11 11:51 controlD64
crw-rw----  1 root video 226, 128 Sep 11 11:51 renderD128

The Windows 10 guest crashed with the following error (with and without the -A bhyve flag):

Assertion failed: (error == 0), function modify_bar_registration, file /usr/src/usr.sbin/bhyve/pci_emul.c, line 491.

And as you said, I have many AMD IOMMUv2 kernel ops in dmesg.
I hope we soon see a working guest with successful VGA PCI passthrough.

By the way, I'm on the FreeBSD virtualization mailing list. I will send a mail about this.

@mattmacy

commented Sep 10, 2017

Where do we stand on GPU passthrough now?

@archfan

commented Sep 10, 2017

+1 ^ Any progress?

@mattmacy

commented Sep 10, 2017

pciconf -lv on the host:

ppt0@pci0:2:0:0:        class=0x030000 card=0xe331174b chip=0x73001002 rev=0xcb hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Fiji [Radeon R9 FURY / NANO Series]'
    class      = display

lspci -kv in the guest:


00:06.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] (rev cb) (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Fiji [Radeon R9 FURY / NANO Series]
        Flags: bus master, fast devsel, latency 0, IRQ 36
        Memory at d000000000 (64-bit, prefetchable) [size=256M]
        Memory at c0200000 (64-bit, prefetchable) [size=2M]
        I/O ports at 2100 [size=256]
        Memory at c0400000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at c0020000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu

/var/log/syslog:

Sep 10 13:11:48 Duh0 kernel: [    1.420332] [drm] amdgpu kernel modesetting enabled.
Sep 10 13:11:48 Duh0 kernel: [    1.422255] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
Sep 10 13:11:48 Duh0 kernel: [    1.422948] AMD IOMMUv2 functionality not available on this system
Sep 10 13:11:48 Duh0 kernel: [    1.425669] CRAT table not found
Sep 10 13:11:48 Duh0 kernel: [    1.426027] Finished initializing topology ret=0
Sep 10 13:11:48 Duh0 kernel: [    1.426618] kfd kfd: Initialized module
Sep 10 13:11:48 Duh0 kernel: [    1.427298] amdgpu 0000:00:06.0: can't derive routing for PCI INT A
Sep 10 13:11:48 Duh0 kernel: [    1.427957] amdgpu 0000:00:06.0: PCI INT A: no GSI
Sep 10 13:11:48 Duh0 kernel: [    1.428668] [drm] initializing kernel modesetting (FIJI 0x1002:0x7300 0x174B:0xE331 0xCB).
Sep 10 13:11:48 Duh0 kernel: [    1.429514] [drm] register mmio base: 0xC0400000
Sep 10 13:11:48 Duh0 kernel: [    1.429980] [drm] register mmio size: 262144
Sep 10 13:11:48 Duh0 kernel: [    1.430444] [drm] doorbell mmio base: 0xC0200000
Sep 10 13:11:48 Duh0 kernel: [    1.430909] [drm] doorbell mmio size: 2097152
Sep 10 13:11:48 Duh0 kernel: [    1.431384] amdgpu 0000:00:06.0: Invalid ROM contents
Sep 10 13:11:48 Duh0 kernel: [    1.555694] ATOM BIOS: 113
Sep 10 13:11:48 Duh0 kernel: [    1.555983] [drm] GPU not posted. posting now...
Sep 10 13:11:48 Duh0 kernel: [    1.658393] [drm] Changing default dispclk from 500Mhz to 600Mhz
Sep 10 13:11:48 Duh0 kernel: [    1.659481] amdgpu 0000:00:06.0: VRAM: 4096M 0x0000000000000000 - 0x00000000FFFFFFFF (4096M used)
Sep 10 13:11:48 Duh0 kernel: [    1.660379] amdgpu 0000:00:06.0: GTT: 4096M 0x0000000100000000 - 0x00000001FFFFFFFF
Sep 10 13:11:48 Duh0 kernel: [    1.661141] [drm] Detected VRAM RAM=4096M, BAR=256M
Sep 10 13:11:48 Duh0 kernel: [    1.661637] [drm] RAM width 512bits DDR
Sep 10 13:11:48 Duh0 kernel: [    1.662151] [TTM] Zone  kernel: Available graphics memory: 8216184 kiB
Sep 10 13:11:48 Duh0 kernel: [    1.662814] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
Sep 10 13:11:48 Duh0 kernel: [    1.663467] [TTM] Initializing pool allocator
Sep 10 13:11:48 Duh0 kernel: [    1.663913] [TTM] Initializing DMA pool allocator
Sep 10 13:11:48 Duh0 kernel: [    1.664403] [drm] amdgpu: 4096M of VRAM memory ready
Sep 10 13:11:48 Duh0 kernel: [    1.664902] [drm] amdgpu: 4096M of GTT memory ready.
Sep 10 13:11:48 Duh0 kernel: [    1.665409] [drm] GART: num cpu pages 1048576, num gpu pages 1048576
Sep 10 13:11:48 Duh0 kernel: [    1.669773] [drm] PCIE GART of 4096M enabled (table at 0x0000000000040000).
Sep 10 13:11:48 Duh0 kernel: [    1.670527] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Sep 10 13:11:48 Duh0 kernel: [    1.671189] [drm] Driver supports precise vblank timestamp query.
Sep 10 13:11:48 Duh0 kernel: [    1.671965] amdgpu 0000:00:06.0: amdgpu: using MSI.
Sep 10 13:11:48 Duh0 kernel: [    1.672837] [drm] amdgpu: irq initialized.
Sep 10 13:11:48 Duh0 kernel: [    1.673272] Can't find requested voltage id in vdd_dep_on_sclk table!
Sep 10 13:11:48 Duh0 kernel: [    1.678314] amdgpu: powerplay initialized
Sep 10 13:11:48 Duh0 kernel: [    1.679641] [drm] AMDGPU Display Connectors
Sep 10 13:11:48 Duh0 kernel: [    1.680082] [drm] Connector 0:
Sep 10 13:11:48 Duh0 kernel: [    1.680398] [drm]   DP-1
Sep 10 13:11:48 Duh0 kernel: [    1.680660] [drm]   HPD2
Sep 10 13:11:48 Duh0 kernel: [    1.680921] [drm]   DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
Sep 10 13:11:48 Duh0 kernel: [    1.681662] [drm]   Encoders:
Sep 10 13:11:48 Duh0 kernel: [    1.681985] [drm]     DFP1: INTERNAL_UNIPHY1
Sep 10 13:11:48 Duh0 kernel: [    1.682421] [drm] Connector 1:
Sep 10 13:11:48 Duh0 kernel: [    1.682732] [drm]   DP-2
Sep 10 13:11:48 Duh0 kernel: [    1.682995] [drm]   HPD5
Sep 10 13:11:48 Duh0 kernel: [    1.683257] [drm]   DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
Sep 10 13:11:48 Duh0 kernel: [    1.683991] [drm]   Encoders:
Sep 10 13:11:48 Duh0 kernel: [    1.684292] [drm]     DFP2: INTERNAL_UNIPHY2
Sep 10 13:11:48 Duh0 kernel: [    1.684720] [drm] Connector 2:
Sep 10 13:11:48 Duh0 kernel: [    1.685031] [drm]   DP-3
Sep 10 13:11:48 Duh0 kernel: [    1.685291] [drm]   HPD6
Sep 10 13:11:48 Duh0 kernel: [    1.685551] [drm]   DDC: 0x487c 0x487c 0x487d 0x487d 0x487e 0x487e 0x487f 0x487f
Sep 10 13:11:48 Duh0 kernel: [    1.686330] [drm]   Encoders:
Sep 10 13:11:48 Duh0 kernel: [    1.686643] [drm]     DFP3: INTERNAL_UNIPHY2
Sep 10 13:11:48 Duh0 kernel: [    1.687083] [drm] Connector 3:
Sep 10 13:11:48 Duh0 kernel: [    1.687400] [drm]   HDMI-A-1
Sep 10 13:11:48 Duh0 kernel: [    1.687700] [drm]   HPD3
Sep 10 13:11:48 Duh0 kernel: [    1.687967] [drm]   DDC: 0x4870 0x4870 0x4871 0x4871 0x4872 0x4872 0x4873 0x4873
Sep 10 13:11:48 Duh0 kernel: [    1.688714] [drm]   Encoders:
Sep 10 13:11:48 Duh0 kernel: [    1.689022] [drm]     DFP4: INTERNAL_UNIPHY1
Sep 10 13:11:48 Duh0 kernel: [    1.689458] [drm] Connector 4:
Sep 10 13:11:48 Duh0 kernel: [    1.689776] [drm]   DVI-D-1
Sep 10 13:11:48 Duh0 kernel: [    1.690068] [drm]   HPD1
Sep 10 13:11:48 Duh0 kernel: [    1.690348] [drm]   DDC: 0x4878 0x4878 0x4879 0x4879 0x487a 0x487a 0x487b 0x487b
Sep 10 13:11:48 Duh0 kernel: [    1.691097] [drm]   Encoders:
Sep 10 13:11:48 Duh0 kernel: [    1.691406] [drm]     DFP5: INTERNAL_UNIPHY
Sep 10 13:11:48 Duh0 kernel: [    1.692131] amdgpu 0000:00:06.0: fence driver on ring 0 use gpu addr 0x0000000100000008, cpu addr 0xffff8804258a0008
Sep 10 13:11:48 Duh0 kernel: [    1.693434] amdgpu 0000:00:06.0: fence driver on ring 1 use gpu addr 0x0000000100000018, cpu addr 0xffff8804258a0018
Sep 10 13:11:48 Duh0 kernel: [    1.694741] amdgpu 0000:00:06.0: fence driver on ring 2 use gpu addr 0x0000000100000028, cpu addr 0xffff8804258a0028
Sep 10 13:11:48 Duh0 kernel: [    1.696027] amdgpu 0000:00:06.0: fence driver on ring 3 use gpu addr 0x0000000100000038, cpu addr 0xffff8804258a0038
Sep 10 13:11:48 Duh0 kernel: [    1.697334] amdgpu 0000:00:06.0: fence driver on ring 4 use gpu addr 0x0000000100000048, cpu addr 0xffff8804258a0048
Sep 10 13:11:48 Duh0 kernel: [    1.698652] amdgpu 0000:00:06.0: fence driver on ring 5 use gpu addr 0x0000000100000058, cpu addr 0xffff8804258a0058
Sep 10 13:11:48 Duh0 kernel: [    1.699913] amdgpu 0000:00:06.0: fence driver on ring 6 use gpu addr 0x0000000100000068, cpu addr 0xffff8804258a0068
Sep 10 13:11:48 Duh0 kernel: [    1.701219] amdgpu 0000:00:06.0: fence driver on ring 7 use gpu addr 0x0000000100000078, cpu addr 0xffff8804258a0078
Sep 10 13:11:48 Duh0 kernel: [    1.702535] amdgpu 0000:00:06.0: fence driver on ring 8 use gpu addr 0x0000000100000088, cpu addr 0xffff8804258a0088
Sep 10 13:11:48 Duh0 kernel: [    1.703875] amdgpu 0000:00:06.0: fence driver on ring 9 use gpu addr 0x0000000100000098, cpu addr 0xffff8804258a0098
Sep 10 13:11:48 Duh0 kernel: [    2.760342] [drm] ib test on ring 10 succeeded in 0 usecs
Sep 10 13:11:48 Duh0 kernel: [    2.761996] [drm] ib test on ring 11 succeeded
Sep 10 13:11:48 Duh0 kernel: [    2.762738] [drm] ib test on ring 12 succeeded
Sep 10 13:11:48 Duh0 kernel: [    2.763354] [drm] Initialized amdgpu 3.1.0 20150101 for 0000:00:06.0 on minor 0


I'm going to install the ROCm bits and see if they work. In my experience with amdgpu on FreeBSD, if the ring setup works, the driver mostly works.
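
As a rough check of that rule of thumb, the successful ring tests can be counted from a saved kernel log. A minimal sketch, assuming the log uses the same "ib test on ring N succeeded" wording as the excerpt above; count_ring_tests is a name invented for this example.

```shell
#!/bin/sh
# Sketch: count successful amdgpu IB ring tests in kernel log text on stdin.
# count_ring_tests is an illustrative name, not part of any tool.
count_ring_tests() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status at 0.
    grep -c 'ib test on ring [0-9][0-9]* succeeded' || true
}
```

For example, `grep amdgpu /var/log/syslog | count_ring_tests` would print the number of rings that passed.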

@mattmacy

commented Sep 10, 2017

Further updates: If I start the guest again the system locks up hard. If I pass boot_verbose=1 to the loader, bhyve breaks altogether:

root@VogonPoetry:/home/mmacy # chyves UBI console
Critical error detected. Exiting for following reason:
Your CPU lacks the basic feature to run bhyve. For AMD CPUs this means RVI. For Intel CPUs this means EPT.

@EpiJunkie

Member Author

commented Sep 11, 2017

This is still a limitation of the bhyve code. From what I understand, this is being worked on.

See here:
https://wiki.freebsd.org/bhyve > "bhyve Features Under Development" > "VGA Pass-through"

It was also discussed with Rod Grimes on the BSD Now podcast number 172:
http://www.bsdnow.tv/episodes/2016_12_14_a_tale_of_bsd_from_yore

Further updates: If I start the guest again the system locks up hard. If I pass boot_verbose=1 to the loader, bhyve breaks altogether:

Whoops, I've been meaning to write a bypass property for the CPU check. Basically, setting boot_verbose in /boot/loader.conf changes the boot output that chyves parses to check which CPU is in use.

If you want to bypass that check, comment out the if statement near line 287 in /usr/local/sbin/chyves and add if [ 0 = 1 ]; then in its place.
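
To illustrate the idea (this is not the actual chyves source): swapping the real condition for [ 0 = 1 ] makes the test always false, so the error branch is never taken. The function name cpu_check and the placeholder for the original condition are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for the chyves CPU check. The real condition near
# line 287 of /usr/local/sbin/chyves is not reproduced here.
cpu_check() {
    # if <original CPU feature test>; then   # original line, commented out
    if [ 0 = 1 ]; then                       # always false: body is skipped
        echo "Your CPU lacks the basic feature to run bhyve."
        return 1
    fi
    echo "CPU check bypassed"
    return 0
}
```

Keeping the original line as a comment makes it easy to restore the check later.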

EDIT: Clarified that VGA passthrough was discussed with Rod Grimes during the BSD Now podcast.

@mattmacy

commented Sep 11, 2017

It looks like bhyvectl --destroy --vm=chy- may reset the device in a way that chyves stop doesn't. After doing that I was able to restart the VM.

@EpiJunkie

Member Author

commented Sep 11, 2017

It looks like bhyvectl --destroy --vm=chy- may reset the device in a way that chyves stop doesn't. After doing that I was able to restart the VM.

chyves <guest-name> stop

If bhyve for the guest does not exit cleanly, the command above runs bhyvectl --destroy to clean up the vmm. If bhyve has hung and not exited on its own, you can run chyves <guest-name> stop force to force it to exit.

You may also want to try reloading the PCI device's driver with devctl clear driver -f ppt0 and then attaching the ppt driver again with devctl set driver pci0:2:0:0 ppt.
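
The two devctl steps could be wrapped as below. A sketch only: rebind_ppt and the DRY_RUN switch are names invented for this example, and ppt0 and pci0:2:0:0 are the device names from this thread, so substitute your own.

```shell
#!/bin/sh
# Sketch: detach and re-attach the ppt driver for a passthrough device.
# rebind_ppt and DRY_RUN are illustrative; the devctl commands are the
# ones quoted in this comment. Run as root on the FreeBSD host.
rebind_ppt() {
    ppt_dev=$1              # attached device name, e.g. ppt0
    pci_sel=$2              # PCI selector, e.g. pci0:2:0:0
    run=${DRY_RUN:+echo}    # if DRY_RUN is set, print instead of executing
    $run devctl clear driver -f "$ppt_dev" || return 1
    $run devctl set driver "$pci_sel" ppt
}
```

Setting DRY_RUN=1 before calling rebind_ppt ppt0 pci0:2:0:0 prints the two commands without touching the device, which is a safe way to confirm the arguments first.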

@mattmacy

commented Sep 11, 2017

You may also want to try reloading the PCI device's driver with devctl clear driver -f ppt0 and then attaching the ppt driver again with devctl set driver pci0:2:0:0 ppt.

Is this likely to restore the device to the power on reset state that the guest driver expects? And what if there is no FreeBSD driver for the device / one that doesn't detach cleanly?

@EpiJunkie

Member Author

commented Sep 11, 2017

Is this likely to restore the device to the power on reset state that the guest driver expects? And what if there is no FreeBSD driver for the device / one that doesn't detach cleanly?

I (now) believe that the PCI device is supposed to receive a Function Level Reset when the bhyve guest starts. Last night/early this morning, I mistakenly thought it happened when the driver was loaded. See this commit for details: https://svnweb.freebsd.org/base?view=revision&revision=305502.

My understanding is that the code was written with SR-IOV in mind, but IIRC it is one of the components needed for eventual VGA passthrough support.
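
Whether a PCI function advertises Function Level Reset can be read from the DevCap line of lspci -vv output, where FLReset+ means supported and FLReset- means not supported (the lspci dump earlier in this thread contains an FLReset- entry). A small sketch that checks a saved dump for one device; flr_supported is an invented name, and it assumes the first FLReset flag in the input is the DevCap one.

```shell
#!/bin/sh
# Sketch: report whether `lspci -vv` output (for one device, on stdin)
# advertises Function Level Reset. flr_supported is an illustrative name.
# Returns 0 for FLReset+, 1 for FLReset-, 2 if no FLReset flag is present.
flr_supported() {
    # Take the first FLReset flag; DevCap is printed before DevCtl.
    flag=$(grep -o 'FLReset[+-]' | head -n 1) || true
    case "$flag" in
        FLReset+) return 0 ;;
        FLReset-) return 1 ;;
        *)        return 2 ;;
    esac
}
```

For example, `lspci -vv -s 00:06.0 | flr_supported && echo "FLR advertised"` in a Linux guest, using the slot address shown in this thread.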

This is the MFC into 10-stable and 11-stable from the above commit. What version of FreeBSD are you running?
