gim error:(set_new_adapter:617) asic does not support SRIOV #3

Closed
falcon78921 opened this issue Apr 25, 2018 · 24 comments

Comments

@falcon78921

Hello!

We are trying to use an AMD FirePro S7150x2 in an Ubuntu 16.04.2 system. Here is some information regarding the setup:

OS: Ubuntu 16.04.2
Hypervisor: qemu-kvm 2.5.0 with libvirt 1.3.1
Kernel: 4.4.0-75-generic

I followed the README.md on the gim repo. I was able to load the kernel module:

[image: screenshot showing the loaded kernel module]

I was able to get one of the patches installed (0001), but not the other (0002). We're trying to utilize the FirePro for VDI virtualization, exactly as stated on the AMD website (https://pro.radeon.com/en/solutions/vdi/).

However, we are having trouble following the documentation. Could anyone explain how to use this hardware for a VDI infrastructure? I am certain that our hardware supports SR-IOV.

I am more than happy to provide information, if requested.

There were some errors that I found in our dmesg:

[image: screenshot of the dmesg errors]

Any help is much appreciated!

@salco23

salco23 commented Apr 25, 2018

Hi James,

Would you be able to share the output of:

lspci -vvv -d 1002:6829

Thanks,

Emil

@falcon78921
Author

Hi Emil,

Thanks for reaching out!

Here is the output for lspci -vvv -d 1002:6829:

root@vdi:~# lspci -vvv -d 1002:6829
root@vdi:~# 

There was no output. Our AMD FirePro is 1002:6929.

Here is the output for lspci -vvv -d 1002:6929:

root@vdi:~# lspci -vvv -d 1002:6929
83:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT GL [FirePro S7150] (prog-if 00 [VGA controller])
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT GL [FirePro S7150]
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at 383fe0000000 (64-bit, prefetchable) [disabled] [size=256M]
	Region 2: Memory at 383ff4000000 (64-bit, prefetchable) [disabled] [size=2M]
	Region 4: I/O ports at f000 [disabled] [size=256]
	Region 5: Memory at fbc00000 (32-bit, non-prefetchable) [disabled] [size=256K]
	Expansion ROM at fbc40000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #8, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable- Count=1/4 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [200 v1] #15
	Capabilities: [270 v1] #19
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable-, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] #13
	Capabilities: [2d0 v1] #1b
	Capabilities: [328 v1] Alternative Routing-ID Interpretation (ARI)
		ARICap:	MFVC- ACS-, Next Function: 0
		ARICtl:	MFVC- ACS-, Function Group: 0
	Capabilities: [330 v1] Single Root I/O Virtualization (SR-IOV)
		IOVCap:	Migration-, Interrupt Message Number: 000
		IOVCtl:	Enable- Migration- Interrupt- MSE- ARIHierarchy-
		IOVSta:	Migration-
		Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
		VF offset: 256, stride: 1, Device ID: 692f
		Supported Page Size: 00000553, System Page Size: 00000001
		Region 0: Memory at 0000383ee0000000 (64-bit, prefetchable)
		Region 2: Memory at 0000383ff0000000 (64-bit, prefetchable)
		Region 5: Memory at f7c00000 (32-bit, non-prefetchable)
		VF Migration: offset: 00000000, BIR: 0
	Capabilities: [400 v1] Vendor Specific Information: ID=0002 Rev=1 Len=070 <?>
	Kernel modules: amdgpu

85:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT GL [FirePro S7150] (prog-if 00 [VGA controller])
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT GL [FirePro S7150]
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at 383ec0000000 (64-bit, prefetchable) [disabled] [size=256M]
	Region 2: Memory at 383ed4000000 (64-bit, prefetchable) [disabled] [size=2M]
	Region 4: I/O ports at e000 [disabled] [size=256]
	Region 5: Memory at f7800000 (32-bit, non-prefetchable) [disabled] [size=256K]
	Expansion ROM at f7840000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #16, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable- Count=1/4 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [200 v1] #15
	Capabilities: [270 v1] #19
	Capabilities: [2b0 v1] Address Translation Service (ATS)
		ATSCap:	Invalidate Queue Depth: 00
		ATSCtl:	Enable-, Smallest Translation Unit: 00
	Capabilities: [2c0 v1] #13
	Capabilities: [2d0 v1] #1b
	Capabilities: [328 v1] Alternative Routing-ID Interpretation (ARI)
		ARICap:	MFVC- ACS-, Next Function: 0
		ARICtl:	MFVC- ACS-, Function Group: 0
	Capabilities: [330 v1] Single Root I/O Virtualization (SR-IOV)
		IOVCap:	Migration-, Interrupt Message Number: 000
		IOVCtl:	Enable- Migration- Interrupt- MSE- ARIHierarchy-
		IOVSta:	Migration-
		Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
		VF offset: 256, stride: 1, Device ID: 692f
		Supported Page Size: 00000553, System Page Size: 00000001
		Region 0: Memory at 0000383dc0000000 (64-bit, prefetchable)
		Region 2: Memory at 0000383ed0000000 (64-bit, prefetchable)
		Region 5: Memory at f3800000 (32-bit, non-prefetchable)
		VF Migration: offset: 00000000, BIR: 0
	Capabilities: [400 v1] Vendor Specific Information: ID=0002 Rev=1 Len=070 <?>
	Kernel modules: amdgpu
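
The SR-IOV capability block in the output above can also be checked from a script. The following is a minimal sketch and not part of the gim tooling: `total_vfs_from_lspci` is a hypothetical helper name that reads `lspci -vvv` text on standard input and prints every `Total VFs` value it finds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "Total VFs" value for every SR-IOV
# capability block found in `lspci -vvv` text read from stdin.
total_vfs_from_lspci() {
    sed -n 's/.*Total VFs: \([0-9][0-9]*\).*/\1/p'
}

# Example (capability details usually require root):
#   lspci -vvv -d 1002:6929 | total_vfs_from_lspci
```

On kernels that expose it, the `sriov_totalvfs` attribute under the device's `/sys/bus/pci/devices/` directory should report the same number without parsing text.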

@kzytaruk

Hi James, can you confirm that you have patch 0001 built and installed?

For the patch, you can remove the check for endpoint type entirely. See the same file in the v4.6 kernel for reference. This fix has been upstreamed as of v4.6.

Thanks,
Kelly
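
Since the fix is described as upstreamed in v4.6, a quick version comparison can hint at whether a given kernel still needs patch 0001. This is only a rough sketch: `kernel_at_least` is a hypothetical helper, it relies on GNU `sort -V`, and a version string alone cannot tell whether a distribution kernel carries a backport.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if kernel version $1 is at least $2.
# Relies on GNU sort's -V (version sort) option.
kernel_at_least() {
    local current=$1 required=$2
    # If the required version sorts first (or is equal), current >= required.
    [ "$(printf '%s\n' "$required" "$current" | sort -V | head -n 1)" = "$required" ]
}

# Example:
#   if kernel_at_least "$(uname -r)" 4.6; then
#       echo "kernel is v4.6 or newer"
#   else
#       echo "kernel is older than v4.6; patch 0001 is likely still needed"
#   fi
```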

@falcon78921
Author

Hi Kelly,

I'll verify right now.

Thanks,
James

@falcon78921
Author

falcon78921 commented Apr 26, 2018

Hi Kelly,

Every time I run apt-get source linux-image-$(uname -r), it ignores my 4.4.0-75-generic kernel and downloads the source for a newer 4.4.0 kernel instead.

root@vdi:/opt/firepro# apt-get source linux-image-$(uname -r)
Reading package lists... Done
Picking 'linux' as source package instead of 'linux-image-4.4.0-75-generic'
NOTICE: 'linux' packaging is maintained in the 'Git' version control system at:
git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/xenial
Please use:
git clone git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/xenial
to retrieve the latest (possibly unreleased) updates to the package.
Need to get 147 MB of source archives.
Get:1 http://us.archive.ubuntu.com/ubuntu xenial-proposed/main linux 4.4.0-122.146 (dsc) [9,712 B]
Get:2 http://us.archive.ubuntu.com/ubuntu xenial-proposed/main linux 4.4.0-122.146 (tar) [133 MB]

This is a step in the docs/environment_setup file. Any advice? Thanks!
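
The "Picking 'linux' as source package" line in the output above shows apt-get mapping the binary package to the `linux` source package and then fetching the newest version it can find. `apt-get source` also accepts a `package=version` argument, so one possible workaround is to pin the version. The sketch below is an assumption-laden illustration: `pinned_source_spec` is a hypothetical helper, and it only helps if the matching source version is still published in a configured deb-src repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a "package=version" argument for apt-get source.
pinned_source_spec() {
    local source_pkg=$1 version=$2
    printf '%s=%s\n' "$source_pkg" "$version"
}

# Example (assumes the running kernel's image package is installed and
# that its source version is still available from a deb-src repository):
#   ver=$(dpkg-query -W -f='${source:Version}' "linux-image-$(uname -r)")
#   apt-get source "$(pinned_source_spec linux "$ver")"
```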

@kzytaruk

Hi James,

I have never used apt-get to get the kernel. I always download the .tar.gz from kernel.org (https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x/).

gunzip the .gz file, then 'tar -xf' the result.
You will need to create a .config file before building. If you need instructions, there are lots on the internet, or I can try to summarize them for you.

How comfortable are you with rebuilding the kernel?

Thanks,
Kelly
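
The download-and-build steps described above might look roughly like the following. This is a sketch under stated assumptions rather than the procedure from the gim docs: `kernel_tarball_url` is a hypothetical helper, the `linux-<version>.tar.gz` file name pattern is an assumption about the mirror layout, and reusing the running kernel's config from `/boot` is just one common way to create a `.config`.

```shell
#!/usr/bin/env bash
# Base URL taken from the comment above; the file name pattern
# linux-<version>.tar.gz is an assumption about the mirror layout.
KERNEL_MIRROR="https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x"

# Hypothetical helper: print the download URL for a given 4.x version.
kernel_tarball_url() {
    printf '%s/linux-%s.tar.gz\n' "$KERNEL_MIRROR" "$1"
}

# Typical sequence (not run here; VERSION is a placeholder you choose):
#   wget "$(kernel_tarball_url "$VERSION")"
#   tar -xzf "linux-$VERSION.tar.gz"        # gunzip and tar -xf in one step
#   cd "linux-$VERSION"
#   # apply the gim patches here, before building
#   cp "/boot/config-$(uname -r)" .config   # start from the running kernel's config
#   make olddefconfig                       # accept defaults for new options
#   make -j"$(nproc)"
#   sudo make modules_install install
```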

@falcon78921
Author

falcon78921 commented Apr 27, 2018

Hi Kelly,

I don't have much experience in building custom kernels. However, I followed an article and learned more about it! :)

I'm running make for the new kernel now. I patched the kernel with 0001 and 0002. I'll keep you posted on the progress!

Thanks again for your assistance! :)

Sincerely,
James

@falcon78921
Author

Hi Kelly,

I think I got the kernel going. I installed the gim module and added it to /etc/modules. There were no kernel panics in dmesg:

[   20.683154] gim: loading out-of-tree module taints kernel.
[   20.684787] gim info:(gim_init:144) Start AMD open source GIM initialization
[   20.684788] gim info:(gim_init:147) GPU IOV MODULE - version 1.0.0-k
[   20.684789] gim info:(gim_init:149) Copyright (c) 2014-2017 AMD Corporation.
[   20.684824] gim info:(init_config:305) INIT CONFIG

When I run lspci -vvv -d 1002:6929, I don't see the FirePro anymore. Is this because blacklist amdgpu is defined in /etc/modprobe.d/blacklist.conf?

I guess I'm wondering where I go from here.

Thanks,
James
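
One way to confirm the blacklist entry mentioned above is to search the modprobe configuration files for it. The snippet below is a minimal sketch: `is_blacklisted` is a hypothetical helper, and it only checks for a literal `blacklist <module>` line in the files it is given, not for other ways a module can be kept from loading.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if module $1 has a "blacklist" line in any
# of the given modprobe configuration files.
is_blacklisted() {
    local module=$1
    shift
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" "$@"
}

# Example:
#   is_blacklisted amdgpu /etc/modprobe.d/*.conf && echo "amdgpu is blacklisted"
```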

@falcon78921
Author

falcon78921 commented Apr 30, 2018

Hi Kelly,

Good news! I think we're making progress :)

Here is my dmesg output after booting with the FirePro, gim, and the updated kernel:

[   20.621078] gim: loading out-of-tree module taints kernel.
[   20.621889] gim info:(gim_init:144) Start AMD open source GIM initialization
[   20.621890] gim info:(gim_init:147) GPU IOV MODULE - version 1.0.0-k
[   20.621890] gim info:(gim_init:149) Copyright (c) 2014-2017 AMD Corporation.
[   20.621902] gim info:(init_config:305) INIT CONFIG
[   20.621929] gim info:(set_new_adapter:614) curr allocated at ffffffffc0383b80
[   20.621929] gim info:(set_new_adapter:621) SRIOV is supported
[   20.621932] gim info:(set_new_adapter:629) found PCI bridge device
[   20.621933] gim info:(set_new_adapter:633) found: 03:8.0
[   20.621951] gim info:(set_new_adapter:650) mmio_base = ffffc90063080000
[   20.621961] gim info:(set_new_adapter:652) doorbell = ffffc90063c00000
[   20.621986] gim info:(set_new_adapter:654) pf.fb_va = ffffc900a0000000
[   20.621986] gim info:(sriov_is_ari_enabled:164) PCI_SRIOV_CAP = 0x00000002
[   20.621987] gim info:(sriov_is_ari_enabled:173) PCI_SRIOV_CTRL = 0x00000010
[   20.621988] gim info:(sriov_is_ari_enabled:176) PCI_SRIOV_CTRL_ARI is set --> ARI is supported
[   20.621997] gim info:(program_ari_mode:483) Read bif_strap8 = 0x00200004
[   20.621998] gim info:(program_ari_mode:488) program_ari_mode - Set ARI_Mode = PF_BUS
[   20.621999] gim info:(program_ari_mode:498) Write bif_strap8 = 0x00000004
[   20.622143] gim info:(gim_read_rom_from_reg:222) Reading VBios from ROM
[   20.622143] gim info:(gim_read_vbios:284) VBIOS starts:  0x55, 0xaa
[   20.622143] gim info:(gim_read_vbios:287) VBios size is 0x10000
[   20.622148] gim info:(gim_read_vbios:290) vbios allocated at ffffc9006301a000
[   20.622148] gim info:(gim_read_rom_from_reg:222) Reading VBios from ROM
[   20.765069] gim info:(gim_read_vbios:298) BIOS Version Major 0xF Minor 0x31
[   20.766437] gim info:(gim_read_vbios:311) Valid video BIOS image, 
[   20.766462] gim info:(gim_read_vbios:313) size = 0x10000, check sum is 0x548300
[   20.767712] gim info:(gim_post_vbios:344) Init Parser passed!, continue
[   20.768326] gim info:(atom_chk_asic_status:335) ATOM_CheckAsicStatus - BIOS_SCRATCH_7 = 0x00000000
[   20.768932] gim info:(atom_chk_asic_status:338) Isolate ATOM_S7_ASIC_INIT_COMPLETE_MASK bit(s) = 0x00000000
[   20.769517] gim info:(atom_chk_asic_status:341) RLC_CNTL = 0x00000000
[   20.770109] gim info:(atom_chk_asic_status:343) Isolate RLC_CNTL__RLC_ENABLE_F32_MASK = 0x00000000
[   20.770705] gim info:(atom_chk_asic_status:350) ATOM_ASIC_NEED_POST
[   20.771307] gim info:(gim_post_vbios:347) Asic needs a VBios post
[   20.771894] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
[   20.772473] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
[   21.115106] gim info:(atom_post_vbios:256) asic_init after
[   21.115106] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
[   21.115107] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
[   21.115108] gim info:(gim_post_vbios:353) Post INIT_ASIC successfully!
[   21.115137] gim info:(firmware_requires_update:512) SMU option ROM version 0x111700
[   21.115137] gim info:(firmware_requires_update:513) versus patch version 0x111a00
[   21.115165] gim info:(firmware_requires_update:523) RLCV option ROM version 113 versus patch version 113
[   21.115166] gim info:(firmware_requires_update:528) TOC found, update it
[   21.115167] gim info:(patch_firmware:579) Update smc_init table
[   21.695553] gim info:(patch_firmware:612) Update smu firmware
[   21.707940] gim info:(patch_firmware:630) Update TOC
[   21.713433] gim info:(gim_post_vbios:359) Asic needs firmware loaded
[   21.714014] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
[   21.714589] gim info:(atom_post_vbios:239) just load uCode
[   21.714606] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
[   23.547318] gim info:(atom_post_vbios:256) asic_init after
[   23.547813] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
[   23.548259] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
[   23.548688] gim info:(gim_post_vbios:365) Post LOAD_FW successfully!
[   23.549107] gim info:(gim_post_vbios:366) Post VBIOS successfully!
[   23.549519] gim info:(set_new_adapter:701) gim_post_vbios done
[   23.549533] gim info:(set_new_adapter:739) Scheduler Time interval is per-vf from XL
[   23.549942] gim info:(set_new_adapter:740) config file
[   23.550757] gim info:(enable_sriov:295) Enable SRIOV
[   23.551212] gim info:(enable_sriov:296) Enable SRIOV vfs count = 16
[   23.685847] gim info:(enumerate_vfs:123) vf found: 04:2.0
[   23.686458] gim info:(enumerate_vfs:123) vf found: 04:2.1
[   23.687057] gim info:(enumerate_vfs:123) vf found: 04:2.2
[   23.687768] gim info:(enumerate_vfs:123) vf found: 04:2.3
[   23.688348] gim info:(enumerate_vfs:123) vf found: 04:2.4
[   23.688916] gim info:(enumerate_vfs:123) vf found: 04:2.5
[   23.689474] gim info:(enumerate_vfs:123) vf found: 04:2.6
[   23.690020] gim info:(enumerate_vfs:123) vf found: 04:2.7
[   23.690558] gim info:(enumerate_vfs:123) vf found: 04:3.0
[   23.691088] gim info:(enumerate_vfs:123) vf found: 04:3.1
[   23.691703] gim info:(enumerate_vfs:123) vf found: 04:3.2
[   23.692207] gim info:(enumerate_vfs:123) vf found: 04:3.3
[   23.692700] gim info:(enumerate_vfs:123) vf found: 04:3.4
[   23.693183] gim info:(enumerate_vfs:123) vf found: 04:3.5
[   23.693654] gim info:(enumerate_vfs:123) vf found: 04:3.6
[   23.694116] gim info:(enumerate_vfs:123) vf found: 04:3.7
[   23.694590] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.0
[   23.695029] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.695590] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.696048] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.1
[   23.696476] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.696908] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.697355] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.2
[   23.697775] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.698197] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.698636] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.3
[   23.699045] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.699576] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.700005] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.4
[   23.700405] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.700809] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.701227] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.5
[   23.701617] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.702011] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.702420] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.6
[   23.702801] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.703231] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.703686] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:2.7
[   23.704057] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.704431] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.704821] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.0
[   23.705183] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.705549] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.705928] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.1
[   23.706278] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.706632] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.707000] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.2
[   23.707459] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.707806] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.708179] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.3
[   23.708536] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.708904] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.709301] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.4
[   23.709681] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.710073] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.710490] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.5
[   23.710893] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.711387] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.711850] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.6
[   23.712273] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.712710] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.713172] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 04:3.7
[   23.713621] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   23.714080] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   23.714545] gim info:(pci_gpu_iov_init:87) total_fb_available = 8190
[   23.714992] gim info:(pci_gpu_iov_init:88) AMD GIM pci_gpu_iov_init pos = 400
[   23.715990] gim info:(pci_gpu_iov_init:90) AMD GIM pci_gpu_iov_init total_fb_available = 1ffe
[   23.716444] gim info:(init_frame_buffer_partition:190) PCI defined PF FB size = 256 MB
[   23.716902] gim info:(init_frame_buffer_partition:194) PCI defined VF FB size = 256 MB
[   23.717355] gim info:(init_frame_buffer_partition:200) Total FB Available = 8190 MB, CSA = 8 MB, Max remaining FBsize =8160 MB
[   23.718265] gim info:(init_frame_buffer_partition:201) max_fb_size = 8160
[   23.718732] gim info:(init_frame_buffer_partition:214) PF FB size after checking limits from config file = 256MB
[   23.719262] gim info:(init_frame_buffer_partition:217) PF rounded down to nearest 16MB boundary = 256
[   23.719787] gim info:(init_pf_fb:60) total framebuffer available = 1ffe
[   23.720261] gim info:(init_pf_fb:61) pf framebuffer = 100
[   23.720729] gim info:(init_pf_fb:63) total framebuffer consumed = 1efe
[   23.721200] gim info:(init_frame_buffer_partition:224) CSA starts at offset 256MB
[   23.721675] gim info:(init_context_save_area:42) AMD GIM init_context_save_area: base =100 size=1.
[   23.722152] gim info:(init_frame_buffer_partition:231) VF FB base = 272MB (256 + 16)
[   23.722632] gim info:(init_frame_buffer_partition:235) VF FB Size = 7904MB (8160 - 256)
[   23.723137] gim info:(init_fb_static:118) AMD GIM init_fb_static: num_vf = 16, base= 272, total_size=7904, mini_size=256
[   23.723701] gim info:(init_fb_static:139) AMD GIM init_fb_static: vf_fb_size = 480, base= 272
[   23.724188] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 0 base =272,size= 480
[   23.724674] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 1 base =752,size= 480
[   23.725152] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 2 base =1232,size= 480
[   23.725626] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 3 base =1712,size= 480
[   23.726094] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 4 base =2192,size= 480
[   23.726551] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 5 base =2672,size= 480
[   23.727003] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 6 base =3152,size= 480
[   23.727553] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 7 base =3632,size= 480
[   23.727989] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 8 base =4112,size= 480
[   23.728420] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 9 base =4592,size= 480
[   23.728845] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 10 base =5072,size= 480
[   23.729262] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 11 base =5552,size= 480
[   23.729676] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 12 base =6032,size= 480
[   23.730083] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 13 base =6512,size= 480
[   23.730481] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 14 base =6992,size= 480
[   23.730874] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 15 base =7472,size= 480
[   23.731362] gim info:(set_new_adapter:789) enable MSI
[   23.731471] gim info:(ih_iv_ring_disable:383) disable iv ring successfully
[   23.732223] gim info:(alloc_iv_ring:99) ih->ivr_num_entries = 256
[   23.732607] gim info:(alloc_iv_ring:102) ih->ivr_size_in_bytes = 4096
[   23.732987] gim info:(alloc_iv_ring:108) ih->ivr_alloc_size_in_bytes = 4100
[   23.733367] gim info:(alloc_iv_ring:110) iv ring page_cnt = 2
[   23.733745] gim info:(alloc_iv_ring:141) ih->ivr_va = ffffc900632c9000
[   23.734165] gim info:(alloc_iv_ring:147) ih->ivr_ma.quad_part = 0xfffff000
[   23.734535] gim info:(alloc_iv_ring:151) ih->ivr_wptr_wb = ffffc900632ca000
[   23.734905] gim info:(alloc_iv_ring:158) ih->ivr_wptr_wa.quad_part = 0xffffe000
[   23.735341] gim info:(alloc_iv_ring:163) update rptr via doorbell
[   23.735734] gim info:(ih_iv_ring_init:291) ih->rptr_doorbell = ffffc90063c007a0
[   23.735750] gim info:(ih_iv_ring_init:292) ih->rptr_doorbell_offset = 0x1e8
[   23.736111] gim info:(ih_iv_ring_hw_init:185) the physical address of ring buffer: 0xfffff0
[   23.736477] gim info:(ih_iv_ring_setup_rptr:451) write mmBIF_DOORBELL_APER_EN: 0x1
[   23.737204] gim info:(ih_iv_ring_enable:350) ih->ivr_wptr_reg = 0x0
[   23.737573] gim info:(ih_iv_ring_enable:352) ih->ivr_wptr = 0
[   23.737943] gim info:(ih_iv_ring_enable:354) ih->ivr_rptr_reg = 0x0
[   23.738306] gim info:(ih_iv_ring_enable:356) ih->ivr_rptr = 0
[   23.738652] gim info:(ih_iv_ring_enable:358) *(ih->rptr_doorbell) = 0x0
[   23.738989] gim info:(ih_iv_ring_init:299) init iv ring successfully
[   23.739003] gim info:(set_new_adapter:802) init work
[   23.739438] gim info:(set_new_adapter:808) register interrupt
[   23.739469] gim info:(ih_irq_source_enable:583) IH: read 0x00000000 from mask_reg 0x14d1
[   23.739810] gim info:(ih_irq_source_enable:589) IH: write 0x00000001 to mask_reg 0x14d1
[   23.740152] gim info:(ih_irq_source_enable:592) irq sourceID 0x89 get enabled
[   23.740857] gim info:(ih_irq_source_enable:583) IH: read 0x00000001 from mask_reg 0x14d1
[   23.740874] gim info:(ih_irq_source_enable:589) IH: write 0x00000003 to mask_reg 0x14d1
[   23.741238] gim info:(ih_irq_source_enable:592) irq sourceID 0x88 get enabled
[   23.742238] gim info:(gim_clear_all_errors:357) PCIE cap pos 58
[   23.742612] gim info:(gim_clear_all_errors:362) AER ext cap pos 150
[   23.742974] gim info:(gim_clear_all_errors:369) DevStatus = 0x9
[   23.743434] gim info:(gim_clear_all_errors:387) PCIE unrecover error = 0x2000
[   23.743799] gim info:(resume_scheduler:131) No functions on the runlist.
[   23.743813] gim info:(resume_scheduler:132) Don't need to restart the scheduler
[   23.744533] gim info:(gim_probe:86) AMD GIM probe: pf_count = 1
[   23.744907] gim info:(set_new_adapter:614) curr allocated at ffffffffc038d6e0
[   23.745291] gim info:(set_new_adapter:621) SRIOV is supported
[   23.745684] gim info:(set_new_adapter:629) found PCI bridge device
[   23.746075] gim info:(set_new_adapter:633) found: 03:10.0
[   23.746474] gim info:(set_new_adapter:650) mmio_base = ffffc90063300000
[   23.746492] gim info:(set_new_adapter:652) doorbell = ffffc90063800000
[   23.746904] gim info:(set_new_adapter:654) pf.fb_va = ffffc900c0000000
[   23.747423] gim info:(sriov_is_ari_enabled:164) PCI_SRIOV_CAP = 0x00000002
[   23.748276] gim info:(sriov_is_ari_enabled:173) PCI_SRIOV_CTRL = 0x00000010
[   23.748716] gim info:(sriov_is_ari_enabled:176) PCI_SRIOV_CTRL_ARI is set --> ARI is supported
[   23.749165] gim info:(program_ari_mode:483) Read bif_strap8 = 0x00200004
[   23.749620] gim info:(program_ari_mode:488) program_ari_mode - Set ARI_Mode = PF_BUS
[   23.750082] gim info:(program_ari_mode:498) Write bif_strap8 = 0x00000004
[   23.750536] gim info:(gim_read_rom_from_reg:222) Reading VBios from ROM
[   23.750667] gim info:(gim_read_vbios:284) VBIOS starts:  0x55, 0xaa
[   23.751661] gim info:(gim_read_vbios:287) VBios size is 0x10000
[   23.752123] gim info:(gim_read_vbios:290) vbios allocated at ffffc900632cc000
[   23.752577] gim info:(gim_read_rom_from_reg:222) Reading VBios from ROM
[   23.892025] gim info:(gim_read_vbios:298) BIOS Version Major 0xF Minor 0x31
[   23.892950] gim info:(gim_read_vbios:311) Valid video BIOS image, 
[   23.892964] gim info:(gim_read_vbios:313) size = 0x10000, check sum is 0x548300
[   23.893880] gim info:(gim_post_vbios:344) Init Parser passed!, continue
[   23.894353] gim info:(atom_chk_asic_status:335) ATOM_CheckAsicStatus - BIOS_SCRATCH_7 = 0x00000000
[   23.894827] gim info:(atom_chk_asic_status:338) Isolate ATOM_S7_ASIC_INIT_COMPLETE_MASK bit(s) = 0x00000000
[   23.895394] gim info:(atom_chk_asic_status:341) RLC_CNTL = 0x00000000
[   23.895880] gim info:(atom_chk_asic_status:343) Isolate RLC_CNTL__RLC_ENABLE_F32_MASK = 0x00000000
[   23.896342] gim info:(atom_chk_asic_status:350) ATOM_ASIC_NEED_POST
[   23.896795] gim info:(gim_post_vbios:347) Asic needs a VBios post
[   23.897246] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
[   23.897710] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
[   24.223080] gim info:(atom_post_vbios:256) asic_init after
[   24.223619] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
[   24.224093] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
[   24.224560] gim info:(gim_post_vbios:353) Post INIT_ASIC successfully!
[   24.225034] gim info:(firmware_requires_update:512) SMU option ROM version 0x111700
[   24.225050] gim info:(firmware_requires_update:513) versus patch version 0x111a00
[   24.225950] gim info:(firmware_requires_update:523) RLCV option ROM version 113 versus patch version 113
[   24.226394] gim info:(firmware_requires_update:528) TOC found, update it
[   24.226836] gim info:(patch_firmware:579) Update smc_init table
[   24.813582] gim info:(patch_firmware:612) Update smu firmware
[   24.825202] gim info:(patch_firmware:630) Update TOC
[   24.830026] gim info:(gim_post_vbios:359) Asic needs firmware loaded
[   24.830446] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
[   24.830870] gim info:(atom_post_vbios:239) just load uCode
[   24.830883] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
[   26.745318] gim info:(atom_post_vbios:256) asic_init after
[   26.745748] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
[   26.746170] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
[   26.746586] gim info:(gim_post_vbios:365) Post LOAD_FW successfully!
[   26.747002] gim info:(gim_post_vbios:366) Post VBIOS successfully!
[   26.747423] gim info:(set_new_adapter:701) gim_post_vbios done
[   26.747436] gim info:(set_new_adapter:739) Scheduler Time interval is per-vf from XL
[   26.747861] gim info:(set_new_adapter:740) config file
[   26.748706] gim info:(enable_sriov:295) Enable SRIOV
[   26.749126] gim info:(enable_sriov:296) Enable SRIOV vfs count = 16
[   26.879719] gim info:(enumerate_vfs:123) vf found: 06:2.0
[   26.880372] gim info:(enumerate_vfs:123) vf found: 06:2.1
[   26.881048] gim info:(enumerate_vfs:123) vf found: 06:2.2
[   26.881736] gim info:(enumerate_vfs:123) vf found: 06:2.3
[   26.882315] gim info:(enumerate_vfs:123) vf found: 06:2.4
[   26.882883] gim info:(enumerate_vfs:123) vf found: 06:2.5
[   26.883519] gim info:(enumerate_vfs:123) vf found: 06:2.6
[   26.884066] gim info:(enumerate_vfs:123) vf found: 06:2.7
[   26.884602] gim info:(enumerate_vfs:123) vf found: 06:3.0
[   26.885128] gim info:(enumerate_vfs:123) vf found: 06:3.1
[   26.885643] gim info:(enumerate_vfs:123) vf found: 06:3.2
[   26.886146] gim info:(enumerate_vfs:123) vf found: 06:3.3
[   26.886640] gim info:(enumerate_vfs:123) vf found: 06:3.4
[   26.887146] gim info:(enumerate_vfs:123) vf found: 06:3.5
[   26.887671] gim info:(enumerate_vfs:123) vf found: 06:3.6
[   26.888133] gim info:(enumerate_vfs:123) vf found: 06:3.7
[   26.888592] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.0
[   26.889031] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.889475] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.889933] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.1
[   26.890362] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.890796] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.891279] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.2
[   26.891741] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.892164] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.892604] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.3
[   26.893014] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.893426] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.893855] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.4
[   26.894255] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.894658] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.895077] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.5
[   26.895545] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.895940] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.896349] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.6
[   26.896730] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.897113] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.897513] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:2.7
[   26.897883] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.898258] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.898647] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.0
[   26.899009] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.899434] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.899814] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.1
[   26.900166] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.900521] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.900891] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.2
[   26.901232] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.901580] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.901953] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.3
[   26.902308] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.902677] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.903075] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.4
[   26.903517] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.903909] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.904330] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.5
[   26.904732] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.905146] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.905587] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.6
[   26.906010] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.906447] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.906914] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 06:3.7
[   26.907422] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
[   26.907884] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
[   26.908345] gim info:(pci_gpu_iov_init:87) total_fb_available = 8190
[   26.908792] gim info:(pci_gpu_iov_init:88) AMD GIM pci_gpu_iov_init pos = 400
[   26.909686] gim info:(pci_gpu_iov_init:90) AMD GIM pci_gpu_iov_init total_fb_available = 1ffe
[   26.910140] gim info:(init_frame_buffer_partition:190) PCI defined PF FB size = 256 MB
[   26.910602] gim info:(init_frame_buffer_partition:194) PCI defined VF FB size = 256 MB
[   26.911052] gim info:(init_frame_buffer_partition:200) Total FB Available = 8190 MB, CSA = 8 MB, Max remaining FBsize =8160 MB
[   26.912040] gim info:(init_frame_buffer_partition:201) max_fb_size = 8160
[   26.912513] gim info:(init_frame_buffer_partition:214) PF FB size after checking limits from config file = 256MB
[   26.912987] gim info:(init_frame_buffer_partition:217) PF rounded down to nearest 16MB boundary = 256
[   26.913467] gim info:(init_pf_fb:60) total framebuffer available = 1ffe
[   26.913940] gim info:(init_pf_fb:61) pf framebuffer = 100
[   26.914409] gim info:(init_pf_fb:63) total framebuffer consumed = 1efe
[   26.914883] gim info:(init_frame_buffer_partition:224) CSA starts at offset 256MB
[   26.915416] gim info:(init_context_save_area:42) AMD GIM init_context_save_area: base =100 size=1.
[   26.915895] gim info:(init_frame_buffer_partition:231) VF FB base = 272MB (256 + 16)
[   26.916375] gim info:(init_frame_buffer_partition:235) VF FB Size = 7904MB (8160 - 256)
[   26.916850] gim info:(init_fb_static:118) AMD GIM init_fb_static: num_vf = 16, base= 272, total_size=7904, mini_size=256
[   26.917339] gim info:(init_fb_static:139) AMD GIM init_fb_static: vf_fb_size = 480, base= 272
[   26.917824] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 0 base =272,size= 480
[   26.918308] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 1 base =752,size= 480
[   26.918784] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 2 base =1232,size= 480
[   26.919299] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 3 base =1712,size= 480
[   26.919785] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 4 base =2192,size= 480
[   26.920243] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 5 base =2672,size= 480
[   26.920697] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 6 base =3152,size= 480
[   26.921141] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 7 base =3632,size= 480
[   26.921576] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 8 base =4112,size= 480
[   26.922009] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 9 base =4592,size= 480
[   26.922432] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 10 base =5072,size= 480
[   26.922850] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 11 base =5552,size= 480
[   26.923303] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 12 base =6032,size= 480
[   26.923751] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 13 base =6512,size= 480
[   26.924150] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 14 base =6992,size= 480
[   26.924542] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 15 base =7472,size= 480
[   26.924939] gim info:(set_new_adapter:789) enable MSI
[   26.924982] gim info:(ih_iv_ring_disable:383) disable iv ring successfully
[   26.925735] gim info:(alloc_iv_ring:99) ih->ivr_num_entries = 256
[   26.926117] gim info:(alloc_iv_ring:102) ih->ivr_size_in_bytes = 4096
[   26.926496] gim info:(alloc_iv_ring:108) ih->ivr_alloc_size_in_bytes = 4100
[   26.926874] gim info:(alloc_iv_ring:110) iv ring page_cnt = 2
[   26.927287] gim info:(alloc_iv_ring:141) ih->ivr_va = ffffc90063343000
[   26.927745] gim info:(alloc_iv_ring:147) ih->ivr_ma.quad_part = 0xfffff000
[   26.928116] gim info:(alloc_iv_ring:151) ih->ivr_wptr_wb = ffffc90063344000
[   26.928486] gim info:(alloc_iv_ring:158) ih->ivr_wptr_wa.quad_part = 0xffffe000
[   26.928850] gim info:(alloc_iv_ring:163) update rptr via doorbell
[   26.929211] gim info:(ih_iv_ring_init:291) ih->rptr_doorbell = ffffc900638007a0
[   26.929227] gim info:(ih_iv_ring_init:292) ih->rptr_doorbell_offset = 0x1e8
[   26.929587] gim info:(ih_iv_ring_hw_init:185) the physical address of ring buffer: 0xfffff0
[   26.929954] gim info:(ih_iv_ring_setup_rptr:451) write mmBIF_DOORBELL_APER_EN: 0x1
[   26.930680] gim info:(ih_iv_ring_enable:350) ih->ivr_wptr_reg = 0x0
[   26.931049] gim info:(ih_iv_ring_enable:352) ih->ivr_wptr = 0
[   26.931481] gim info:(ih_iv_ring_enable:354) ih->ivr_rptr_reg = 0x0
[   26.931845] gim info:(ih_iv_ring_enable:356) ih->ivr_rptr = 0
[   26.932190] gim info:(ih_iv_ring_enable:358) *(ih->rptr_doorbell) = 0x0
[   26.932529] gim info:(ih_iv_ring_init:299) init iv ring successfully
[   26.932543] gim info:(set_new_adapter:802) init work
[   26.932875] gim info:(set_new_adapter:808) register interrupt
[   26.932903] gim info:(ih_irq_source_enable:583) IH: read 0x00000000 from mask_reg 0x14d1
[   26.933244] gim info:(ih_irq_source_enable:589) IH: write 0x00000001 to mask_reg 0x14d1
[   26.933585] gim info:(ih_irq_source_enable:592) irq sourceID 0x89 get enabled
[   26.934290] gim info:(ih_irq_source_enable:583) IH: read 0x00000001 from mask_reg 0x14d1
[   26.934307] gim info:(ih_irq_source_enable:589) IH: write 0x00000003 to mask_reg 0x14d1
[   26.934671] gim info:(ih_irq_source_enable:592) irq sourceID 0x88 get enabled
[   26.935753] gim info:(gim_clear_all_errors:357) PCIE cap pos 58
[   26.936129] gim info:(gim_clear_all_errors:362) AER ext cap pos 150
[   26.936492] gim info:(gim_clear_all_errors:369) DevStatus = 0x9
[   26.936850] gim info:(gim_clear_all_errors:387) PCIE unrecover error = 0x2000
[   26.937215] gim info:(resume_scheduler:131) No functions on the runlist.
[   26.937229] gim info:(resume_scheduler:132) Don't need to restart the scheduler
[   26.937950] gim info:(gim_probe:86) AMD GIM probe: pf_count = 2
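
The frame buffer split logged by init_frame_buffer_partition and init_fb_static above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the per-VF size is the even share of the VF region rounded down to a 16 MB boundary. That assumption matches the logged 480 MB, but it is not taken from the GIM source, and the function name is illustrative only.

```python
ALIGN_MB = 16  # assumed alignment; the log mentions a 16MB boundary for the PF


def vf_partitions(vf_base_mb, vf_total_mb, num_vf, align_mb=ALIGN_MB):
    """Return (index, base_mb, size_mb) for each VF partition."""
    share = vf_total_mb // num_vf      # even share of the VF region
    size = share - share % align_mb    # round down to the alignment boundary
    return [(i, vf_base_mb + i * size, size) for i in range(num_vf)]


# Values taken from the log: VF FB base = 272MB, VF FB size = 7904MB, 16 VFs.
parts = vf_partitions(272, 7904, 16)
for idx, base, size in parts:
    print(f"partition {idx} base ={base},size= {size}")
```

With the logged inputs this yields 480 MB per VF and a base of 7472 MB for partition 15, the same figures init_fb_static prints.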

Now that it's enumerating VFs, how do you define them in a virsh XML file?

Thanks,
James
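
For the libvirt side, a VF is normally attached like any other PCI device, through a `<hostdev>` element in the domain XML. Below is a rough sketch that builds that element from a VF address as printed by GIM (for example 06:2.0). The helper name `hostdev_xml` is hypothetical, PCI domain 0000 is assumed, and none of this has been tested against the S7150x2 specifically.

```python
import xml.etree.ElementTree as ET


def hostdev_xml(vf_addr, domain=0):
    """Build a libvirt <hostdev> element for a PCI address such as '06:2.0'."""
    bus, rest = vf_addr.split(":")
    slot, func = rest.split(".")
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain:04x}",
        bus=f"0x{int(bus, 16):02x}",
        slot=f"0x{int(slot, 16):02x}",
        function=f"0x{int(func, 16):x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


print(hostdev_xml("06:2.0"))
```

The printed element would go inside the `<devices>` section of the guest definition, for example via `virsh edit`.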

@flintcq

flintcq commented May 2, 2018

Hi falcon78921, I am doing the same work as you. Here are my kernel logs; there are some differences.

  1. I don't know whether something is wrong with the S7150x2: the first device, at 0000:04:00.0, fails every time with "Failed to INIT PF for initial register 'init-state'" after wait_cmd_complete times out.

May 2 09:17:35 vgpu kernel: gim info:(set_new_adapter:802) init work
May 2 09:17:35 vgpu kernel: gim info:(set_new_adapter:808) register interrupt
May 2 09:17:35 vgpu kernel: gim info:(ih_irq_source_enable:583) IH: read 0x00000000 from mask_reg 0x14d1
May 2 09:17:35 vgpu kernel: gim info:(ih_irq_source_enable:589) IH: write 0x00000001 to mask_reg 0x14d1
May 2 09:17:35 vgpu kernel: gim info:(ih_irq_source_enable:592) irq sourceID 0x89 get enabled
May 2 09:17:35 vgpu kernel: gim info:(ih_irq_source_enable:583) IH: read 0x00000001 from mask_reg 0x14d1
May 2 09:17:35 vgpu kernel: gim info:(ih_irq_source_enable:589) IH: write 0x00000003 to mask_reg 0x14d1
May 2 09:17:35 vgpu kernel: gim info:(ih_irq_source_enable:592) irq sourceID 0x88 get enabled
May 2 09:17:35 vgpu kernel: gim error:(wait_cmd_complete:1643) wait_cmd_complete -- time out after 0.100009307 sec
May 2 09:17:35 vgpu kernel: gim error:(wait_cmd_complete:1650) Cmd = 0x17, Status = 0x0
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1271) **** dump gpu status begin for struct adapter 4:00.00
May 2 09:17:35 vgpu kernel: gim info:(check_base_addrs:1259) CP_MQD_BASE_ADDR = 0x0:00000000
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1308) mmGRBM_STATUS = 0x3028
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1311) mmGRBM_STATUS2 = 0x8
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1314) mmSRBM_STATUS = 0x20000040
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1317) mmSRBM_STATUS2 = 0x0
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1320) mmSDMA0_STATUS_REG = 0x46dee557
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1323) mmSDMA1_STATUS_REG = 0x46dee557
May 2 09:17:35 vgpu kernel: gim info:(check_me_cntl:1237) CP_ME_CNTL = 0x15000000 GPU dump
May 2 09:17:35 vgpu kernel: gim error:(check_me_cntl:1239) ME HALTED!
May 2 09:17:35 vgpu kernel: gim error:(check_me_cntl:1242) PFP HALTED!
May 2 09:17:35 vgpu kernel: gim error:(check_me_cntl:1245) CE HALTED!
May 2 09:17:35 vgpu kernel: gim error:(dump_gpu_status:1443) **** dump gpu status end
May 2 09:17:35 vgpu kernel: gim error:(init_register_init_state:3149) Failed to INIT PF for initial register 'init-state'
May 2 09:17:35 vgpu kernel: gim info:(gim_clear_all_errors:357) PCIE cap pos 58
May 2 09:17:35 vgpu kernel: gim info:(gim_clear_all_errors:362) AER ext cap pos 150
May 2 09:17:35 vgpu kernel: gim info:(gim_clear_all_errors:369) DevStatus = 0xb
May 2 09:17:35 vgpu kernel: gim info:(gim_clear_all_errors:387) PCIE unrecover error = 0x2000
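
The three HALTED lines in the dump follow directly from the CP_ME_CNTL value. Here is a small sketch of the decode, assuming the halt bits sit at positions 24 (CE), 26 (PFP) and 28 (ME), as in the open-source radeon/amdgpu register headers. The function name is made up for illustration.

```python
# Assumed bit positions of the halt flags in CP_ME_CNTL.
CP_ME_CNTL_HALT_BITS = {"CE": 24, "PFP": 26, "ME": 28}


def halted_engines(cp_me_cntl):
    """Return the names of the command-processor engines whose halt bit is set."""
    return [
        name
        for name, bit in CP_ME_CNTL_HALT_BITS.items()
        if cp_me_cntl & (1 << bit)
    ]


# Value from the dump above: CP_ME_CNTL = 0x15000000
print(halted_engines(0x15000000))
```

For 0x15000000 all three bits are set, which agrees with the ME, PFP and CE HALTED messages in the log.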

  2. The VFs are added to the runlist:

May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF0 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8ef8#011index 0
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF1 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8f18#011index 1
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF2 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8f38#011index 2
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF3 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8f58#011index 3
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF4 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8f78#011index 4
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF5 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8f98#011index 5
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF6 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8fb8#011index 6
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF7 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8fd8#011index 7
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF8 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a8ff8#011index 8
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF9 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a9018#011index 9
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF10 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a9038#011index 10
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF11 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a9058#011index 11
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF12 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a9078#011index 12
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF13 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a9098#011index 13
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF14 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a90b8#011index 14
May 2 09:17:35 vgpu kernel: gim info:(add_func_to_run_list:2216) Add VF15 to the runlist
May 2 09:17:35 vgpu kernel: gim info:(alloc_fn_list_node:3362) New Function List Node allocated at ffffffffa08a90d8#011index 15
May 2 09:17:35 vgpu kernel: gim warning:(resume_scheduler:149) Restart the Scheduler for 7 msec
May 2 09:17:35 vgpu kernel: gim info:(gim_probe:86) AMD GIM probe: pf_count = 1

  3. When GIM enables SR-IOV, in my case each VF is added to an IOMMU group. (I pass 'pci-stub.ids=1002:692f' on the kernel command line, so pci-stub is used as the driver for the VFs.)

May 2 09:17:38 vgpu kernel: gim info:(enable_sriov:295) Enable SRIOV
May 2 09:17:38 vgpu kernel: gim info:(enable_sriov:296) Enable SRIOV vfs count = 16
May 2 09:17:38 vgpu kernel: pci 0000:06:02.0: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.0,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.0 to group 70
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.0: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:02.1: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.1,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.1 to group 71
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.1: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:02.2: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.2,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.2 to group 72
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.2: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:02.3: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.3,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.3 to group 73
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.3: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:02.4: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.4,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.4 to group 74
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.4: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:02.5: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.5,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.5 to group 75
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.5: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:02.6: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.6,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.6 to group 76
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.6: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:02.7: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:02.7,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.7 to group 77
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.7: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.0: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.0,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.0 to group 78
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.0: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.1: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.1,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.1 to group 79
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.1: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.2: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.2,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.2 to group 80
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.2: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.3: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.3,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.3 to group 81
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.3: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.4: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.4,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.4 to group 82
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.4: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.5: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.5,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.5 to group 83
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.5: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.6: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.6,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.6 to group 84
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.6: claimed by stub
May 2 09:17:38 vgpu kernel: pci 0000:06:03.7: Max Payload Size set to 256 (was 128, max 256)
May 2 09:17:38 vgpu kernel: vgaarb: device added: PCI:0000:06:03.7,decodes=io+mem,owns=none,locks=none
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:03.7 to group 85
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:03.7: claimed by stub
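
To confirm that every VF landed in its own IOMMU group and was claimed by pci-stub, the kernel log above can be summarized mechanically. This is a sketch that assumes the message formats shown in this log ("iommu: Adding device ... to group N" and "pci-stub ...: claimed by stub"). The function name is illustrative.

```python
import re

GROUP_RE = re.compile(r"iommu: Adding device (\S+) to group (\d+)")
STUB_RE = re.compile(r"pci-stub (\S+): claimed by stub")


def summarize_vfs(log_text):
    """Map each PCI address to its IOMMU group and whether pci-stub claimed it."""
    vfs = {}
    for line in log_text.splitlines():
        m = GROUP_RE.search(line)
        if m:
            entry = vfs.setdefault(m.group(1), {"group": None, "stub": False})
            entry["group"] = int(m.group(2))
            continue
        m = STUB_RE.search(line)
        if m:
            entry = vfs.setdefault(m.group(1), {"group": None, "stub": False})
            entry["stub"] = True
    return vfs


# Two lines copied from the log above as sample input.
sample = """\
May 2 09:17:38 vgpu kernel: iommu: Adding device 0000:06:02.0 to group 70
May 2 09:17:38 vgpu kernel: pci-stub 0000:06:02.0: claimed by stub
"""
print(summarize_vfs(sample))
```

If two VFs ever reported the same group number, they could not be assigned to different guests independently with VFIO, so distinct groups (70 to 85 here) are what one would hope to see.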

Here is a more complete log:
init_register_init_state_fail.log

@falcon78921
Author

Hi flintcq,

As of this morning, I was able to see AMD MxGPU as a device in Device Manager (Windows guest VM). The AMD Pro guest driver installed successfully (https://support.amd.com/en-us/download/workstation?os=Guest%20OS#pro-driver). I used the Radeon™ Pro Software Adrenalin Edition for Windows® 10 (64-bit) driver, which is listed specifically for KVM. Unfortunately, we are having stability issues with it. Every time we pass a VF to the guest, Windows boots but crashes shortly after reaching the desktop. It might be a misconfiguration on our end; we are still investigating.

Your dmesg is a bit different from mine. Are you running a different Linux distro? I would love to help other people who are trying to use the AMD FirePro GPU for virtualization. Looking at your dmesg, it appears you're at the same place as me: the gim module is loading successfully and enumerating VFs.

I'll keep you posted on my progress.

Thanks!

@flintcq

flintcq commented May 3, 2018

Hi falcon78921,
Thanks for your reply. Here is my first environment:
CPU: E5-2683 v3
motherboard: Supermicro X10DAi
BIOS American Megatrends Inc. 3.0a(02/02/2018)
video card: S7150x2
CentOS 7.4, kernel 4.4.126 (built with the patches), qemu 2.5.0 (self-built), libvirt 3.2.0 (installed via yum)
Everything follows the guide. I run the VM with the following command:
qemu-system-x86_64 -enable-kvm -m 2048 -cpu host,kvm=off -smp 4,sockets=1,cores=4,threads=1 -hda /home/win7.img -net nic,macaddr=00:23:F7:34:69:9D -net tap,script=/etc/qemu-ifup -device vfio-pci,host=06:02.0,rombar=0 -vnc :1

In this environment, all my VMs can see the vGPU, but every one fails with error code 43.

Then I downgraded QEMU to qemu-kvm-1.5.3 (installed via yum). The VMs (Windows 7/10) can now recognize the vGPUs, but some issues remain:

1. Whenever a new VM starts, the VMs that are already running freeze for several seconds. This may be because GIM has to wait for the full-access check on the new VF to time out; it then round-robins through all the VFs and comes back to the original one. (I have posted an issue titled "check full access timeout".) Here is part of the GIM log:

[ 1372.644436] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1372.644443] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1372.744451] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1372.744456] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1372.844465] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1372.844470] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1372.944530] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1372.944536] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.044550] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.044559] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.144597] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.144603] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.244643] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.244650] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.344691] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.344697] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.444782] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.444790] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.544799] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.544805] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.644813] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.644818] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.744829] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.744834] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.844841] gim info:(handle_fullaccess_timeout:1333) begin to check full access timeout for VF7
[ 1373.844846] gim info:(handle_fullaccess_timeout:1348) Need to restart the timer
[ 1373.905899] gim info:(ih_irq_process:712) AMD ISR is being invoked
[ 1373.905904] gim info:(ih_iv_ring_get_pointers:481) ih_iv_ring_get_pointers
[ 1373.905908] gim info:(ih_iv_ring_get_pointers:482) ih->ivr_wptr_wb = 0xffffc90006d86000
[ 1373.905911] gim info:(ih_iv_ring_get_pointers:484) write offset: *(ih->ivr_wptr_wb) = 0x100
[ 1373.905914] gim info:(ih_iv_ring_get_pointers:485) read idx: ih->ivr_rptr = 0xf
[ 1373.905916] gim info:(ih_iv_ring_get_pointers:487) Rx at entry 15 in the ring
[ 1373.905919] gim info:(ih_iv_ring_get_pointers:489) iv_ring_entry.source_id = 137
[ 1373.905921] gim info:(ih_iv_ring_get_pointers:490) iv_ring_entry.source_data = 7
[ 1373.905925] gim info:(ih_irq_process:733) recieved 1 irqs in one ISR
[ 1373.905928] gim info:(ih_irq_process:756) VF_PF_MSGBUF_VALID recieved(Received msg from VF)
[ 1373.905932] gim info:(mailbox_update_index:833) write mmMAILBOX_INDEX: 0x7
[ 1373.905938] gim info:(mailbox_msg_rcv:878) read mmMAILBOX_MSGBUF_RCV_DW0: 0x2
[ 1373.905940] gim info:(mailbox_msg_rcv:879) read mmMAILBOX_MSGBUF_RCV_DW1: 0
[ 1373.905942] gim info:(mailbox_msg_rcv:880) read mmMAILBOX_MSGBUF_RCV_DW2: 0x0
[ 1373.905945] gim info:(mailbox_msg_rcv:881) read mmMAILBOX_MSGBUF_RCV_DW3: 0x0
[ 1373.905949] gim info:(mailbox_ack_reciept:977) write mmMAILBOX_CONTROL: 0x300 to MAILBOX_INDEX 0x7
[ 1373.905952] gim info:(ih_irq_process:777) GPU access flag = 0x87
[ 1373.905955] gim info:(idh_queue:655) new idh: task->event = 2, task->func_id = 7
[ 1373.905959] gim info:(mailbox_update_index:833) write mmMAILBOX_INDEX: 0x7
[ 1373.905965] gim info:(ih_iv_ring_update_rptr:545) update the new rptr: ih->ivr_rptr_reg = 0x100
[ 1373.905968] gim info:(ih_iv_ring_update_rptr:553) update rptr via doorbell: 0x100
[ 1373.905971] gim info:(ih_iv_ring_update_rptr:555) current wptr: 0x100
[ 1373.905974] gim info:(ih_irq_process:820) AMD ISR is complete
[ 1373.905995] gim info:(signal_scheduler:1390) Invoked the task scheduler thread. Process IRQ activity
[ 1373.905999] gim info:(signal_scheduler:1419) Got a release GPU task
[ 1373.906002] gim info:(signal_scheduler:1422) rel_gpu_task --> Event = 2;FunID = 7
[ 1373.906004] gim info:(signal_scheduler:1424) Got a REL_GPU_ACCESS task for VFindex 7
[ 1373.906007] gim info:(signal_scheduler:1447) IDH_REL_GPU_INIT_ACCESS
[ 1373.906011] gim info:(signal_scheduler:1449) GRBM_STATUS = 0x00003028
[ 1373.906015] gim info:(handle_rel_gpu_init_access:1191) restore FLR VF 7 to available
[ 1373.906018] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 8
[ 1373.906139] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.913144] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 9
[ 1373.913149] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.920152] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 10
[ 1373.920155] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.927159] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 11
[ 1373.927162] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.934165] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 12
[ 1373.934169] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.941171] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 13
[ 1373.941174] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.948177] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 14
[ 1373.948180] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.955183] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 15
[ 1373.955185] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.962189] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 0
[ 1373.962192] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.969195] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 1
[ 1373.969198] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.976201] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 2
[ 1373.976204] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.983207] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 3
[ 1373.983209] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.990212] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 4
[ 1373.990215] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1373.997219] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 5
[ 1373.997221] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1374.004225] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 6
[ 1374.004227] gim info:(kcl_thread_sleep:132) wait 7 ms
[ 1374.011230] gim info:(loop_once_for_all_active_VFs:1008) switch to VF 7
[ 1374.011964] gim info:(kcl_thread_sleep:132) wait 6 ms
[ 1374.017967] gim info:(handle_rel_gpu_init_access:1199) VF7 is indicated as the current running vf
[ 1374.017970] gim warning:(resume_scheduler:149) Restart the Scheduler for 6 msec

  1. On Windows 10, the Radeon software runs without errors and works well.

  2. On Windows 7, it is very unstable: a) after startup we first connect over RDP, and then the session gets stuck when we connect to the VM over VNC; b) sometimes the vGPU shows error code 43; c) vfio reports errors like these:
    vfio_cap_init: 0000:06:02.7 hiding cap 0xff
    [ 1929.812478] vfio_ecap_init: 0000:06:02.7 hiding ecap 0xffff@0xffc
    [ 1930.523380] vfio-pci 0000:06:02.7: timed out waiting for pending transaction; performing function level reset anyway

@norwalk350

@flintcq Would you dump the XML of your Windows 10 VM and post it here? I'm working with @falcon78921 on this adventure, and we're experiencing instability with Windows 10, much like what you described for Windows 7. We haven't even tried Windows 7. We were able to boot the Windows 10 VM after installing the Radeon driver, and it seems to run for 5 minutes, then it locks up the server to the point where we have to power it off to recover. Here's what's on the console when this happens:
image

@flintcq

flintcq commented May 4, 2018

Hi all,
This is how we launch a VM (the environment is the same as the one described in this project):

  1. Add "intel_iommu=on kvm.ignore_msrs=1 kvm-intel.nested=1 kvm_intel.ept=1" to the kernel command line in GRUB
  2. Run "modprobe vfio-pci" and "modprobe gim" after boot
  3. virsh nodedev-detach pci_0000_06_02_0 (the vGPU's PCI address is 0000:06:02.0)
  4. Run: qemu-system-x86_64 -enable-kvm -m 4096 -cpu host -smp 4,sockets=1,cores=4,threads=1 -hda /mnt/win10-3.img -net nic,macaddr=00:23:F7:34:69:9A -net tap,script=/etc/qemu-ifup -device vfio-pci,host=06:02.0 -vnc :2
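
For anyone scripting step 4, here is a minimal sketch that assembles the same argument list. The helper name `build_qemu_args` and its parameters are hypothetical and only for illustration; the flags themselves are copied from the command above.

```python
def build_qemu_args(image, vf_addr, mac, vnc_display, mem_mb=4096, cores=4):
    """Return the QEMU argument list from step 4 (hypothetical helper)."""
    return [
        "qemu-system-x86_64",
        "-enable-kvm",
        "-m", str(mem_mb),
        # Plain "-cpu host": this thread reports that adding kvm=off
        # resulted in error code 43 in the guest.
        "-cpu", "host",
        "-smp", f"{cores},sockets=1,cores={cores},threads=1",
        "-hda", image,
        "-net", f"nic,macaddr={mac}",
        "-net", "tap,script=/etc/qemu-ifup",
        # Pass the detached VF through to the guest.
        "-device", f"vfio-pci,host={vf_addr}",
        "-vnc", f":{vnc_display}",
    ]


args = build_qemu_args("/mnt/win10-3.img", "06:02.0", "00:23:F7:34:69:9A", 2)
print(" ".join(args))
```

The list form can be passed directly to `subprocess.run(args)`, which avoids shell quoting issues.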

Some tricks we found:
"kvm=off" should not be used; it causes error code 43.

Some abnormal behavior we also found:
With VNC, the connection gets stuck, but RDP works fine.

The gim error you mentioned didn't show up in our environment. However, one of our S7150x2 cards failed at "init_register_init_state" when loading the gim driver; we think it may be a hardware problem or a bug mentioned in the gim code:


[ 588.503468] gim error:(wait_cmd_complete:1648) wait_cmd_complete -- time out after 0.100013123 sec
[ 588.503493] gim error:(wait_cmd_complete:1655) Cmd = 0x17, Status = 0x0
[ 588.503501] gim error:(dump_gpu_status:1271) **** dump gpu status begin for struct adapter 4:00.00
[ 588.503513] gim info:(check_base_addrs:1259) CP_MQD_BASE_ADDR = 0x0:00000000
[ 588.503542] gim error:(dump_gpu_status:1308) mmGRBM_STATUS = 0x3028
[ 588.503550] gim error:(dump_gpu_status:1311) mmGRBM_STATUS2 = 0x8
[ 588.503558] gim error:(dump_gpu_status:1314) mmSRBM_STATUS = 0x20000040
[ 588.503566] gim error:(dump_gpu_status:1317) mmSRBM_STATUS2 = 0x0
[ 588.503574] gim error:(dump_gpu_status:1320) mmSDMA0_STATUS_REG = 0x46dee557
[ 588.503582] gim error:(dump_gpu_status:1323) mmSDMA1_STATUS_REG = 0x46dee557
[ 588.503596] gim info:(check_me_cntl:1237) CP_ME_CNTL = 0x15000000 GPU dump
[ 588.503597] gim error:(check_me_cntl:1239) ME HALTED!
[ 588.503604] gim error:(check_me_cntl:1242) PFP HALTED!
[ 588.503609] gim error:(check_me_cntl:1245) CE HALTED!
[ 588.503616] gim error:(dump_gpu_status:1443) **** dump gpu status end
[ 588.503623] gim error:(init_register_init_state:3154) Failed to INIT PF for initial register 'init-state'
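
The three "HALTED!" lines follow from the CP_ME_CNTL value in the dump. A small sketch of the decoding is below; the bit masks are an assumption taken from the amdgpu gfx v8 register headers (CE_HALT, PFP_HALT, ME_HALT), so check them against your own kernel sources before relying on them.

```python
# Halt bits of CP_ME_CNTL. ASSUMPTION: values as defined in the amdgpu
# gfx v8 register headers; not taken from the gim sources.
CP_ME_CNTL_HALT_BITS = {
    "CE": 0x01000000,
    "PFP": 0x04000000,
    "ME": 0x10000000,
}


def halted_engines(cp_me_cntl):
    """Return the names of the command-processor engines whose halt bit is set."""
    return [name for name, mask in CP_ME_CNTL_HALT_BITS.items()
            if cp_me_cntl & mask]


# Value reported by check_me_cntl in the log above.
print(halted_engines(0x15000000))
```

With 0x15000000 all three halt bits are set, which is consistent with the ME, PFP and CE messages in the dump.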

We see DMAR faults very often, especially when something goes wrong. We think it may be an incompatibility in the BIOS settings. Here is our DMAR fault:

May 3 11:20:43 vgpu kernel: [ 5771.299572] DMAR: DRHD: handling fault status reg 702
May 3 11:20:43 vgpu kernel: [ 5771.299572] DMAR: DMAR:[DMA Read] Request device [06:00.0] fault addr 0
May 3 11:20:43 vgpu kernel: [ 5771.299572] DMAR:[fault reason 06] PTE Read access is not set
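
To see which device triggers these faults over a longer log, the lines can be grouped with a short script. This is a sketch that assumes the two-line format shown above (request line followed by a reason line); the function name `parse_dmar_faults` is made up for this example.

```python
import re

# Patterns for the two-line DMAR fault format shown above (assumed layout).
REQUEST_RE = re.compile(
    r"\[DMA (Read|Write)\] Request device \[([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"fault addr ([0-9a-f]+)"
)
REASON_RE = re.compile(r"\[fault reason (\d+)\] (.+)")


def parse_dmar_faults(lines):
    """Pair each request line with the fault-reason line that follows it."""
    faults = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            faults.append({
                "access": match.group(1),
                "device": match.group(2),
                "addr": int(match.group(3), 16),
                "reason": None,
            })
            continue
        match = REASON_RE.search(line)
        if match and faults:
            faults[-1]["reason"] = (int(match.group(1)), match.group(2).strip())
    return faults


sample = [
    "May 3 11:20:43 vgpu kernel: [ 5771.299572] DMAR: DRHD: handling fault status reg 702",
    "May 3 11:20:43 vgpu kernel: [ 5771.299572] DMAR: DMAR:[DMA Read] Request device [06:00.0] fault addr 0",
    "May 3 11:20:43 vgpu kernel: [ 5771.299572] DMAR:[fault reason 06] PTE Read access is not set",
]
for fault in parse_dmar_faults(sample):
    print(fault)
```

In the sample the requester is 06:00.0, while the VF passed to the guest is 06:02.0, so the faulting DMA read appears to come from function 0 of the card rather than from the VF.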

@flintcq

flintcq commented May 4, 2018

Additionally, each time we start or reboot a VM, gim does a "check full access" on its vGPU and performs a loop switch each time, which causes all started VMs to halt for a while.
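
A rough estimate of that halt, based on the scheduler log posted earlier in this thread: the loop visits VF 0 to 15 and sleeps about 7 ms per switch. The sketch below only does the arithmetic on values copied from that log; the variable names are chosen for this example.

```python
NUM_VFS = 16          # the switch loop in the log visits VF 8..15, then 0..7
WAIT_PER_VF_MS = 7    # "kcl_thread_sleep ... wait 7 ms"

estimated_pause_ms = NUM_VFS * WAIT_PER_VF_MS

# Timestamps in seconds copied from the log: the first "switch to VF 8" line
# and the "VF7 is indicated as the current running vf" line.
loop_start = 1373.906018
loop_end = 1374.017967
measured_pause_ms = (loop_end - loop_start) * 1000

print(f"estimated: {estimated_pause_ms} ms, measured: {measured_pause_ms:.1f} ms")
```

Both numbers come out at roughly 112 ms per full loop, so every VM start or reboot would stall the other guests for about a tenth of a second, assuming nothing else delays the loop.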

@Tungsten-lawrenceb
Copy link

Tungsten-lawrenceb commented May 6, 2018

Hi @flintcq @falcon78921, I have the same virtual desktop environment using an S7150 with qemu/kvm.
Linux vdi 4.13.0-38-generic #43~16.04.1-Ubuntu.

I'm no expert, but I'd love to share my experience and hopefully learn something along the way.
In my implementation I have a very stable Win10/Win2K16 VM with no uptime issues using the open source driver.

image

image

This is the VIRSH XML output for the Win10 VM.
vm-1.zip

@flintcq

flintcq commented May 7, 2018

@Tungsten-lawrenceb Can you paste the kernel log for the gim module? The errors shown in the logs do not always affect the use of the VM.
We are interested in finding out what caused these errors, and how to fix them if we can.
We don't know how you connect to your VMs. In our case, we tried VNC and SPICE, but both got stuck on a black screen. We then used RDP, but the vGPU features could not be enabled. Finally, we ran VNC inside the VM, which gives good vGPU performance, although the vGPUs still have limitations in supporting DirectX 10 and other features.
So for us, correctly seeing and installing the vGPU in the VM is only a first step.
Anyway, thanks for sharing your experience.

@Tungsten-lawrenceb

Tungsten-lawrenceb commented May 7, 2018

@flintcq kernel logs are as follows:

INITIALIZATION

[ 6061.613321] [drm] amdgpu kernel modesetting enabled.
[ 6061.619673] AMD IOMMUv2 driver by Joerg Roedel jroedel@suse.de
[ 6061.619679] AMD IOMMUv2 functionality not available on this system
[ 6061.627849] CRAT table not found
[ 6061.627854] Finished initializing topology ret=0
[ 6061.628053] kfd kfd: Initialized module
[ 6061.628625] amdgpu 0000:03:00.0: enabling device (0000 -> 0003)
[ 6061.629078] [drm] initializing kernel modesetting (TONGA 0x1002:0x6929 0x1002:0x030C 0x00).
[ 6061.629110] [drm] register mmio base: 0xDE400000
[ 6061.629113] [drm] register mmio size: 262144
[ 6061.629139] [drm] doorbell mmio base: 0xF4000000
[ 6061.629141] [drm] doorbell mmio size: 2097152
[ 6061.629150] [drm] probing gen 2 caps for device 8086:3c04 = 7a7903/e
[ 6061.629154] [drm] probing mlw for device 8086:3c04 = 7a7903
[ 6061.851897] ATOM BIOS: C76720
[ 6061.851913] [drm] GPU not posted. posting now...
[ 6061.939702] [drm] Changing default dispclk from 600Mhz to 625Mhz
[ 6062.401188] amdgpu 0000:03:00.0: VRAM: 8190M 0x0000000000000000 - 0x00000001FFDFFFFF (8190M used)
[ 6062.401198] amdgpu 0000:03:00.0: GTT: 8190M 0x00000001FFE00000 - 0x00000003FFBFFFFF
[ 6062.401201] [drm] Detected VRAM RAM=8190M, BAR=256M
[ 6062.401204] [drm] RAM width 256bits DDR
[ 6062.401235] [drm] amdgpu: 8190M of VRAM memory ready
[ 6062.401238] [drm] amdgpu: 8190M of GTT memory ready.
[ 6062.401336] [drm] GART: num cpu pages 2096640, num gpu pages 2096640
[ 6062.419498] [drm] PCIE GART of 8190M enabled (table at 0x0000000000040000).
[ 6062.419539] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 6062.419542] [drm] Driver supports precise vblank timestamp query.
[ 6062.419675] amdgpu 0000:03:00.0: amdgpu: using MSI.
[ 6062.419715] [drm] amdgpu: irq initialized.
[ 6062.419734] Can't find requested voltage id in vdd_dep_on_sclk table!
[ 6062.436638] amdgpu: powerplay initialized
[ 6062.437912] [drm] AMDGPU Display Connectors
[ 6062.443332] amdgpu 0000:03:00.0: fence driver on ring 0 use gpu addr 0x00000001ffe00008, cpu addr 0xffff88102a311008
[ 6062.443861] amdgpu 0000:03:00.0: fence driver on ring 1 use gpu addr 0x00000001ffe00018, cpu addr 0xffff88102a311018
[ 6062.444348] amdgpu 0000:03:00.0: fence driver on ring 2 use gpu addr 0x00000001ffe00028, cpu addr 0xffff88102a311028
[ 6062.444802] amdgpu 0000:03:00.0: fence driver on ring 3 use gpu addr 0x00000001ffe00038, cpu addr 0xffff88102a311038
[ 6062.445270] amdgpu 0000:03:00.0: fence driver on ring 4 use gpu addr 0x00000001ffe00048, cpu addr 0xffff88102a311048
[ 6062.445706] amdgpu 0000:03:00.0: fence driver on ring 5 use gpu addr 0x00000001ffe00058, cpu addr 0xffff88102a311058
[ 6062.446154] amdgpu 0000:03:00.0: fence driver on ring 6 use gpu addr 0x00000001ffe00068, cpu addr 0xffff88102a311068
[ 6062.446583] amdgpu 0000:03:00.0: fence driver on ring 7 use gpu addr 0x00000001ffe00078, cpu addr 0xffff88102a311078
[ 6062.446967] amdgpu 0000:03:00.0: fence driver on ring 8 use gpu addr 0x00000001ffe00088, cpu addr 0xffff88102a311088
[ 6062.448004] amdgpu 0000:03:00.0: fence driver on ring 9 use gpu addr 0x00000001ffe00098, cpu addr 0xffff88102a311098
[ 6062.448229] amdgpu 0000:03:00.0: fence driver on ring 10 use gpu addr 0x00000001ffe000a8, cpu addr 0xffff88102a3110a8
[ 6062.449552] [drm] Found UVD firmware Version: 1.65 Family ID: 10
[ 6062.450512] amdgpu 0000:03:00.0: fence driver on ring 11 use gpu addr 0x000000000108ef40, cpu addr 0xffffc9000d04ef40
[ 6062.451255] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
[ 6062.451528] amdgpu 0000:03:00.0: fence driver on ring 12 use gpu addr 0x00000001ffe000c8, cpu addr 0xffff88102a3110c8
[ 6062.451728] amdgpu 0000:03:00.0: fence driver on ring 13 use gpu addr 0x00000001ffe000d8, cpu addr 0xffff88102a3110d8
[ 6062.539347] [drm] ring test on 0 succeeded in 12 usecs
[ 6062.539582] [drm] ring test on 1 succeeded in 28 usecs
[ 6062.539614] [drm] ring test on 2 succeeded in 17 usecs
[ 6062.539622] [drm] ring test on 3 succeeded in 3 usecs
[ 6062.539631] [drm] ring test on 4 succeeded in 3 usecs
[ 6062.539639] [drm] ring test on 5 succeeded in 3 usecs
[ 6062.539647] [drm] ring test on 6 succeeded in 3 usecs
[ 6062.539655] [drm] ring test on 7 succeeded in 3 usecs
[ 6062.539663] [drm] ring test on 8 succeeded in 3 usecs
[ 6062.539702] [drm] ring test on 9 succeeded in 6 usecs
[ 6062.539722] [drm] ring test on 10 succeeded in 5 usecs
[ 6062.585417] [drm] ring test on 11 succeeded in 2 usecs
[ 6062.605284] [drm] UVD initialized successfully.
[ 6062.823681] [drm] ring test on 12 succeeded in 21 usecs
[ 6062.823691] [drm] ring test on 13 succeeded in 3 usecs
[ 6062.823693] [drm] VCE initialized successfully.
[ 6062.824026] [drm] ib test on ring 0 succeeded in 0 usecs
[ 6062.824323] [drm] ib test on ring 1 succeeded in 0 usecs
[ 6062.824500] [drm] ib test on ring 2 succeeded in 0 usecs
[ 6062.824630] [drm] ib test on ring 3 succeeded in 0 usecs
[ 6062.824760] [drm] ib test on ring 4 succeeded in 0 usecs
[ 6062.824940] [drm] ib test on ring 5 succeeded in 0 usecs
[ 6062.825106] [drm] ib test on ring 6 succeeded in 0 usecs
[ 6062.825288] [drm] ib test on ring 7 succeeded in 0 usecs
[ 6062.825461] [drm] ib test on ring 8 succeeded in 0 usecs
[ 6062.825624] [drm] ib test on ring 9 succeeded in 0 usecs
[ 6062.825751] [drm] ib test on ring 10 succeeded in 0 usecs
[ 6062.846505] [drm] ib test on ring 11 succeeded
[ 6062.866619] [drm] ib test on ring 12 succeeded
[ 6062.866728] [drm] Initialized amdgpu 3.1.0 20150101 for 0000:03:00.0 on minor 1

BAR SETUP

Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: [1002:6929] type 00 class 0x030000
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x10: [mem 0x380fe0000000-0x380fefffffff 64bit pref]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x18: [mem 0x380ff4000000-0x380ff41fffff 64bit pref]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x20: [io 0xa000-0xa0ff]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x24: [mem 0xde400000-0xde43ffff]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x30: [mem 0xde440000-0xde45ffff pref]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: supports D1 D2
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: PME# supported from D1 D2 D3hot D3cold
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x354: [mem 0x380ee0000000-0x380eefffffff 64bit pref]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: VF(n) BAR0 space: [mem 0x380ee0000000-0x380fdfffffff 64bit pref] (contains BAR0 for 16 VFs)
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x35c: [mem 0x380ff0000000-0x380ff03fffff 64bit pref]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: VF(n) BAR2 space: [mem 0x380ff0000000-0x380ff3ffffff 64bit pref] (contains BAR2 for 16 VFs)
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: reg 0x368: [mem 0xda400000-0xda7fffff]
Mar 26 12:24:14 vdi kernel: pci 0000:03:00.0: VF(n) BAR5 space: [mem 0xda400000-0xde3fffff] (contains BAR5 for 16 VFs)
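
As a sanity check on the BAR setup, each "VF(n) BARx space" window can be divided by the 16 VFs to get the per-VF BAR size. The sketch below uses only the address ranges from the lines above; the names are illustrative.

```python
NUM_VFS = 16
MIB = 1024 * 1024

# (start, end) of each "VF(n) BARx space" window, copied from the log above.
vf_bar_windows = {
    "BAR0": (0x380ee0000000, 0x380fdfffffff),
    "BAR2": (0x380ff0000000, 0x380ff3ffffff),
    "BAR5": (0xda400000, 0xde3fffff),
}


def span_bytes(start, end):
    """Size in bytes of an inclusive address range."""
    return end - start + 1


per_vf_mib = {
    name: span_bytes(start, end) // NUM_VFS // MIB
    for name, (start, end) in vf_bar_windows.items()
}
print(per_vf_mib)
```

This gives 256 MiB for BAR0 and 4 MiB each for BAR2 and BAR5 per VF, which agrees with the single-VF ranges reported at reg 0x354, 0x35c and 0x368.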

Apr 19 21:16:28 vdi kernel: [ 2.878415] gim: loading out-of-tree module taints kernel.
Apr 19 21:16:28 vdi kernel: [ 2.879218] gim: module verification failed: signature and/or required key missing - tainting kernel
Apr 19 21:16:28 vdi kernel: [ 2.880937] gim info:(gim_init:144) Start AMD open source GIM initialization
Apr 19 21:16:28 vdi kernel: [ 2.881519] gim info:(gim_init:147) GPU IOV MODULE - version 1.0.0-k
Apr 19 21:16:28 vdi kernel: [ 2.882096] gim info:(gim_init:149) Copyright (c) 2014-2017 AMD Corporation.
Apr 19 21:16:28 vdi kernel: [ 2.884490] Modules linked in: gim(OE+) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ast i2c_algo_bit ttm aesni_intel drm_kms_helper isci aes_x86_64 syscopyarea e1000e crypto_simd sysfillrect glue_helper sysimgblt fb_sys_fops cryptd libsas ahci ptp drm libahci pps_core scsi_transport_sas wmi
Apr 19 21:16:28 vdi kernel: [ 2.906459] ? gim_probe+0x3e/0x3e [gim]
Apr 19 21:16:28 vdi kernel: [ 2.907123] init_config+0x64/0x318 [gim]
Apr 19 21:16:28 vdi kernel: [ 2.907780] ? gim_probe+0x3e/0x3e [gim]
Apr 19 21:16:28 vdi kernel: [ 2.908446] gim_init+0xa3/0x158 [gim]
Apr 19 21:16:28 vdi kernel: [ 2.909076] ? gim_probe+0x3e/0x3e [gim]
Apr 19 21:16:28 vdi kernel: [ 2.919866] gim info:(init_config:305) INIT CONFIG
Apr 19 21:16:28 vdi kernel: [ 2.920967] gim info:(set_new_adapter:614) curr allocated at ffffffffc076db40
Apr 19 21:16:28 vdi kernel: [ 2.921473] gim info:(set_new_adapter:621) SRIOV is supported
Apr 19 21:16:28 vdi kernel: [ 2.921967] gim info:(set_new_adapter:629) found PCI bridge device
Apr 19 21:16:28 vdi kernel: [ 2.922455] gim info:(set_new_adapter:633) found: 00:2.0
Apr 19 21:16:28 vdi kernel: [ 2.922955] gim info:(set_new_adapter:650) mmio_base = ffff9bb7c81c0000
Apr 19 21:16:28 vdi kernel: [ 2.922960] gim info:(set_new_adapter:652) doorbell = ffff9bb7c8600000
Apr 19 21:16:28 vdi kernel: [ 2.923454] gim info:(set_new_adapter:654) pf.fb_va = ffff9bb7f0000000
Apr 19 21:16:28 vdi kernel: [ 2.923938] gim info:(sriov_is_ari_enabled:164) PCI_SRIOV_CAP = 0x00000002
Apr 19 21:16:28 vdi kernel: [ 2.924913] gim info:(sriov_is_ari_enabled:173) PCI_SRIOV_CTRL = 0x00000010
Apr 19 21:16:28 vdi kernel: [ 2.925378] gim info:(sriov_is_ari_enabled:176) PCI_SRIOV_CTRL_ARI is set --> ARI is supported
Apr 19 21:16:28 vdi kernel: [ 2.925853] gim info:(program_ari_mode:483) Read bif_strap8 = 0x00200004
Apr 19 21:16:28 vdi kernel: [ 2.926329] gim info:(program_ari_mode:488) program_ari_mode - Set ARI_Mode = PF_BUS
Apr 19 21:16:28 vdi kernel: [ 2.926813] gim info:(program_ari_mode:498) Write bif_strap8 = 0x00000004
Apr 19 21:16:28 vdi kernel: [ 2.927295] gim info:(gim_read_rom_from_reg:222) Reading VBios from ROM
Apr 19 21:16:28 vdi kernel: [ 2.927410] gim info:(gim_read_vbios:284) VBIOS starts: 0x55, 0xaa
Apr 19 21:16:28 vdi kernel: [ 2.928381] gim info:(gim_read_vbios:287) VBios size is 0x10000
Apr 19 21:16:28 vdi kernel: [ 2.928849] gim info:(gim_read_vbios:290) vbios allocated at ffff9bb7c7dea000
Apr 19 21:16:28 vdi kernel: [ 2.929310] gim info:(gim_read_rom_from_reg:222) Reading VBios from ROM
Apr 19 21:16:28 vdi kernel: [ 3.063327] gim info:(gim_read_vbios:298) BIOS Version Major 0xF Minor 0x31
Apr 19 21:16:28 vdi kernel: [ 3.064326] gim info:(gim_read_vbios:311) Valid video BIOS image,
Apr 19 21:16:28 vdi kernel: [ 3.064327] gim info:(gim_read_vbios:313) size = 0x10000, check sum is 0x53b100
Apr 19 21:16:28 vdi kernel: [ 3.065266] gim info:(gim_post_vbios:344) Init Parser passed!, continue
Apr 19 21:16:28 vdi kernel: [ 3.065765] gim info:(atom_chk_asic_status:335) ATOM_CheckAsicStatus - BIOS_SCRATCH_7 = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 3.066254] gim info:(atom_chk_asic_status:338) Isolate ATOM_S7_ASIC_INIT_COMPLETE_MASK bit(s) = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 3.066753] gim info:(atom_chk_asic_status:341) RLC_CNTL = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 3.067251] gim info:(atom_chk_asic_status:343) Isolate RLC_CNTL__RLC_ENABLE_F32_MASK = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 3.067759] gim info:(atom_chk_asic_status:350) ATOM_ASIC_NEED_POST
Apr 19 21:16:28 vdi kernel: [ 3.068298] gim info:(gim_post_vbios:347) Asic needs a VBios post
Apr 19 21:16:28 vdi kernel: [ 3.068810] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
Apr 19 21:16:28 vdi kernel: [ 3.069328] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
Apr 19 21:16:28 vdi kernel: [ 3.393000] gim info:(atom_post_vbios:256) asic_init after
Apr 19 21:16:28 vdi kernel: [ 3.393775] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
Apr 19 21:16:28 vdi kernel: [ 3.394481] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
Apr 19 21:16:28 vdi kernel: [ 3.395173] gim info:(gim_post_vbios:353) Post INIT_ASIC successfully!
Apr 19 21:16:28 vdi kernel: [ 3.395859] gim info:(firmware_requires_update:512) SMU option ROM version 0x111700
Apr 19 21:16:28 vdi kernel: [ 3.395860] gim info:(firmware_requires_update:513) versus patch version 0x111a00
Apr 19 21:16:28 vdi kernel: [ 3.397293] gim info:(firmware_requires_update:523) RLCV option ROM version 113 versus patch version 113
Apr 19 21:16:28 vdi kernel: [ 3.397967] gim info:(firmware_requires_update:528) TOC found, update it
Apr 19 21:16:28 vdi kernel: [ 3.398618] gim info:(patch_firmware:579) Update smc_init table
Apr 19 21:16:28 vdi kernel: [ 3.950698] gim info:(patch_firmware:612) Update smu firmware
Apr 19 21:16:28 vdi kernel: [ 3.954023] gim info:(patch_firmware:630) Update TOC
Apr 19 21:16:28 vdi kernel: [ 3.955809] gim info:(gim_post_vbios:359) Asic needs firmware loaded
Apr 19 21:16:28 vdi kernel: [ 3.956556] gim info:(atom_post_vbios:200) ATOM_PostVBIOS: firmware_info passed
Apr 19 21:16:28 vdi kernel: [ 3.957245] gim info:(atom_post_vbios:239) just load uCode
Apr 19 21:16:28 vdi kernel: [ 3.957246] gim info:(atom_post_vbios:253) asic_init before, engine clock = 7530; memory clock =1e848
Apr 19 21:16:28 vdi kernel: [ 5.651871] gim info:(atom_post_vbios:256) asic_init after
Apr 19 21:16:28 vdi kernel: [ 5.652554] gim info:(atom_post_vbios:263) atom_init_fan_cntl before
Apr 19 21:16:28 vdi kernel: [ 5.653200] gim info:(atom_post_vbios:265) atom_init_fan_cntl after
Apr 19 21:16:28 vdi kernel: [ 5.653839] gim info:(gim_post_vbios:365) Post LOAD_FW successfully!
Apr 19 21:16:28 vdi kernel: [ 5.654477] gim info:(gim_post_vbios:366) Post VBIOS successfully!
Apr 19 21:16:28 vdi kernel: [ 5.655110] gim info:(set_new_adapter:701) gim_post_vbios done
Apr 19 21:16:28 vdi kernel: [ 5.655111] gim info:(set_new_adapter:739) Scheduler Time interval is per-vf from XL
Apr 19 21:16:28 vdi kernel: [ 5.655741] gim info:(set_new_adapter:740) config file
Apr 19 21:16:28 vdi kernel: [ 5.656993] gim info:(enable_sriov:295) Enable SRIOV
Apr 19 21:16:28 vdi kernel: [ 5.657574] gim info:(enable_sriov:296) Enable SRIOV vfs count = 16
Apr 19 21:16:28 vdi kernel: [ 5.796414] gim info:(enumerate_vfs:123) vf found: 03:2.0
Apr 19 21:16:28 vdi kernel: [ 5.797151] gim info:(enumerate_vfs:123) vf found: 03:2.1
Apr 19 21:16:28 vdi kernel: [ 5.797871] gim info:(enumerate_vfs:123) vf found: 03:2.2
Apr 19 21:16:28 vdi kernel: [ 5.798577] gim info:(enumerate_vfs:123) vf found: 03:2.3
Apr 19 21:16:28 vdi kernel: [ 5.799272] gim info:(enumerate_vfs:123) vf found: 03:2.4
Apr 19 21:16:28 vdi kernel: [ 5.799950] gim info:(enumerate_vfs:123) vf found: 03:2.5
Apr 19 21:16:28 vdi kernel: [ 5.800649] gim info:(enumerate_vfs:123) vf found: 03:2.6
Apr 19 21:16:28 vdi kernel: [ 5.801305] gim info:(enumerate_vfs:123) vf found: 03:2.7
Apr 19 21:16:28 vdi kernel: [ 5.801945] gim info:(enumerate_vfs:123) vf found: 03:3.0
Apr 19 21:16:28 vdi kernel: [ 5.802573] gim info:(enumerate_vfs:123) vf found: 03:3.1
Apr 19 21:16:28 vdi kernel: [ 5.803188] gim info:(enumerate_vfs:123) vf found: 03:3.2
Apr 19 21:16:28 vdi kernel: [ 5.803788] gim info:(enumerate_vfs:123) vf found: 03:3.3
Apr 19 21:16:28 vdi kernel: [ 5.804405] gim info:(enumerate_vfs:123) vf found: 03:3.4
Apr 19 21:16:28 vdi kernel: [ 5.804980] gim info:(enumerate_vfs:123) vf found: 03:3.5
Apr 19 21:16:28 vdi kernel: [ 5.805540] gim info:(enumerate_vfs:123) vf found: 03:3.6
Apr 19 21:16:28 vdi kernel: [ 5.806086] gim info:(enumerate_vfs:123) vf found: 03:3.7
Apr 19 21:16:28 vdi kernel: [ 5.806643] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.0
Apr 19 21:16:28 vdi kernel: [ 5.807176] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.807709] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.808284] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.1
Apr 19 21:16:28 vdi kernel: [ 5.808806] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.809329] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.809860] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.2
Apr 19 21:16:28 vdi kernel: [ 5.810369] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.810878] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.811397] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.3
Apr 19 21:16:28 vdi kernel: [ 5.811894] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.812424] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.812932] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.4
Apr 19 21:16:28 vdi kernel: [ 5.813418] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.813905] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.814399] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.5
Apr 19 21:16:28 vdi kernel: [ 5.814873] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.815348] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.815832] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.6
Apr 19 21:16:28 vdi kernel: [ 5.816325] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.816789] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.817261] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:2.7
Apr 19 21:16:28 vdi kernel: [ 5.817711] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.818163] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.818622] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.0
Apr 19 21:16:28 vdi kernel: [ 5.819060] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.819499] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.819946] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.1
Apr 19 21:16:28 vdi kernel: [ 5.820405] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.820834] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.821284] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.2
Apr 19 21:16:28 vdi kernel: [ 5.821700] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.822118] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.822557] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.3
Apr 19 21:16:28 vdi kernel: [ 5.822991] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.823439] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.823907] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.4
Apr 19 21:16:28 vdi kernel: [ 5.824416] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.824893] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.825390] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.5
Apr 19 21:16:28 vdi kernel: [ 5.825880] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.826382] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.826907] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.6
Apr 19 21:16:28 vdi kernel: [ 5.827424] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.827957] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.828540] gim info:(pci_disable_error_reporting:764) Disable error reporting for device: 03:3.7
Apr 19 21:16:28 vdi kernel: [ 5.829087] gim info:(pci_disable_error_reporting:769) Mask before -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.829648] gim info:(pci_disable_error_reporting:780) Mask after -> corr = 0x00000000, uncorr = 0x00000000
Apr 19 21:16:28 vdi kernel: [ 5.830206] gim info:(pci_gpu_iov_init:87) total_fb_available = 8190
Apr 19 21:16:28 vdi kernel: [ 5.830207] gim info:(pci_gpu_iov_init:88) AMD GIM pci_gpu_iov_init pos = 400
Apr 19 21:16:28 vdi kernel: [ 5.831854] gim info:(pci_gpu_iov_init:90) AMD GIM pci_gpu_iov_init total_fb_available = 1ffe
Apr 19 21:16:28 vdi kernel: [ 5.832439] gim info:(init_frame_buffer_partition:190) PCI defined PF FB size = 256 MB
Apr 19 21:16:28 vdi kernel: [ 5.833001] gim info:(init_frame_buffer_partition:194) PCI defined VF FB size = 256 MB
Apr 19 21:16:28 vdi kernel: [ 5.833554] gim info:(init_frame_buffer_partition:200) Total FB Available = 8190 MB, CSA = 8 MB, Max remaining FBsize =8160 MB
Apr 19 21:16:28 vdi kernel: [ 5.834666] gim info:(init_frame_buffer_partition:201) max_fb_size = 8160
Apr 19 21:16:28 vdi kernel: [ 5.835238] gim info:(init_frame_buffer_partition:214) PF FB size after checking limits from config file = 256MB
Apr 19 21:16:28 vdi kernel: [ 5.835818] gim info:(init_frame_buffer_partition:217) PF rounded down to nearest 16MB boundary = 256
Apr 19 21:16:28 vdi kernel: [ 5.836430] gim info:(init_pf_fb:60) total framebuffer available = 1ffe
Apr 19 21:16:28 vdi kernel: [ 5.837007] gim info:(init_pf_fb:61) pf framebuffer = 100
Apr 19 21:16:28 vdi kernel: [ 5.837578] gim info:(init_pf_fb:63) total framebuffer consumed = 1efe
Apr 19 21:16:28 vdi kernel: [ 5.838153] gim info:(init_frame_buffer_partition:224) CSA starts at offset 256MB
Apr 19 21:16:28 vdi kernel: [ 5.838729] gim info:(init_context_save_area:42) AMD GIM init_context_save_area: base =100 size=1.
Apr 19 21:16:28 vdi kernel: [ 5.839310] gim info:(init_frame_buffer_partition:231) VF FB base = 272MB (256 + 16)
Apr 19 21:16:28 vdi kernel: [ 5.839896] gim info:(init_frame_buffer_partition:235) VF FB Size = 7904MB (8160 - 256)
Apr 19 21:16:28 vdi kernel: [ 5.840509] gim info:(init_fb_static:118) AMD GIM init_fb_static: num_vf = 16, base= 272, total_size=7904, mini_size=256
Apr 19 21:16:28 vdi kernel: [ 5.841104] gim info:(init_fb_static:139) AMD GIM init_fb_static: vf_fb_size = 480, base= 272
Apr 19 21:16:28 vdi kernel: [ 5.841696] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 0 base =272,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.842285] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 1 base =752,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.842865] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 2 base =1232,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.843439] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 3 base =1712,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.844020] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 4 base =2192,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.844592] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 5 base =2672,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.845141] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 6 base =3152,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.845680] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 7 base =3632,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.846218] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 8 base =4112,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.846740] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 9 base =4592,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.847252] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 10 base =5072,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.847759] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 11 base =5552,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.848290] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 12 base =6032,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.848781] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 13 base =6512,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.849261] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 14 base =6992,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.849734] gim info:(init_fb_static:150) AMD GIM init_fb_static: partition 15 base =7472,size= 480
Apr 19 21:16:28 vdi kernel: [ 5.850209] gim info:(set_new_adapter:789) enable MSI
Apr 19 21:16:28 vdi kernel: [ 5.850261] gim info:(ih_iv_ring_disable:383) disable iv ring successfully
Apr 19 21:16:28 vdi kernel: [ 5.851185] gim info:(alloc_iv_ring:99) ih->ivr_num_entries = 256
Apr 19 21:16:28 vdi kernel: [ 5.851648] gim info:(alloc_iv_ring:102) ih->ivr_size_in_bytes = 4096
Apr 19 21:16:28 vdi kernel: [ 5.852140] gim info:(alloc_iv_ring:108) ih->ivr_alloc_size_in_bytes = 4100
Apr 19 21:16:28 vdi kernel: [ 5.852600] gim info:(alloc_iv_ring:110) iv ring page_cnt = 2
Apr 19 21:16:28 vdi kernel: [ 5.853056] gim info:(alloc_iv_ring:141) ih->ivr_va = ffff9bb7c00e1000
Apr 19 21:16:28 vdi kernel: [ 5.854205] gim info:(alloc_iv_ring:147) ih->ivr_ma.quad_part = 0xfffff000
Apr 19 21:16:28 vdi kernel: [ 5.854664] gim info:(alloc_iv_ring:151) ih->ivr_wptr_wb = ffff9bb7c00e2000
Apr 19 21:16:28 vdi kernel: [ 5.855117] gim info:(alloc_iv_ring:158) ih->ivr_wptr_wa.quad_part = 0xffffe000
Apr 19 21:16:28 vdi kernel: [ 5.855567] gim info:(alloc_iv_ring:163) update rptr via doorbell
Apr 19 21:16:28 vdi kernel: [ 5.856021] gim info:(ih_iv_ring_init:291) ih->rptr_doorbell = ffff9bb7c86007a0
Apr 19 21:16:28 vdi kernel: [ 5.856022] gim info:(ih_iv_ring_init:292) ih->rptr_doorbell_offset = 0x1e8
Apr 19 21:16:28 vdi kernel: [ 5.856481] gim info:(ih_iv_ring_hw_init:185) the physical address of ring buffer: 0xfffff0
Apr 19 21:16:28 vdi kernel: [ 5.856919] gim info:(ih_iv_ring_setup_rptr:451) write mmBIF_DOORBELL_APER_EN: 0x1
Apr 19 21:16:28 vdi kernel: [ 5.857797] gim info:(ih_iv_ring_enable:350) ih->ivr_wptr_reg = 0x0
Apr 19 21:16:28 vdi kernel: [ 5.858240] gim info:(ih_iv_ring_enable:352) ih->ivr_wptr = 0
Apr 19 21:16:28 vdi kernel: [ 5.858670] gim info:(ih_iv_ring_enable:354) ih->ivr_rptr_reg = 0x0
Apr 19 21:16:28 vdi kernel: [ 5.859081] gim info:(ih_iv_ring_enable:356) ih->ivr_rptr = 0
Apr 19 21:16:28 vdi kernel: [ 5.859479] gim info:(ih_iv_ring_enable:358) *(ih->rptr_doorbell) = 0x0
Apr 19 21:16:28 vdi kernel: [ 5.859872] gim info:(ih_iv_ring_init:299) init iv ring successfully
Apr 19 21:16:28 vdi kernel: [ 5.859873] gim info:(set_new_adapter:802) init work
Apr 19 21:16:28 vdi kernel: [ 5.860294] gim info:(set_new_adapter:808) register interrupt
Apr 19 21:16:28 vdi kernel: [ 5.860688] gim info:(ih_irq_source_enable:583) IH: read 0x00000000 from mask_reg 0x14d1
Apr 19 21:16:28 vdi kernel: [ 5.861067] gim info:(ih_irq_source_enable:589) IH: write 0x00000001 to mask_reg 0x14d1
Apr 19 21:16:28 vdi kernel: [ 5.861457] gim info:(ih_irq_source_enable:592) irq sourceID 0x89 get enabled
Apr 19 21:16:28 vdi kernel: [ 5.862218] gim info:(ih_irq_source_enable:583) IH: read 0x00000001 from mask_reg 0x14d1
Apr 19 21:16:28 vdi kernel: [ 5.862219] gim info:(ih_irq_source_enable:589) IH: write 0x00000003 to mask_reg 0x14d1
Apr 19 21:16:28 vdi kernel: [ 5.862593] gim info:(ih_irq_source_enable:592) irq sourceID 0x88 get enabled
Apr 19 21:16:28 vdi kernel: [ 5.863600] gim info:(gim_clear_all_errors:357) PCIE cap pos 58
Apr 19 21:16:28 vdi kernel: [ 5.863960] gim info:(gim_clear_all_errors:362) AER ext cap pos 150
Apr 19 21:16:28 vdi kernel: [ 5.864348] gim info:(gim_clear_all_errors:369) DevStatus = 0x9
Apr 19 21:16:28 vdi kernel: [ 5.864710] gim info:(gim_clear_all_errors:387) PCIE unrecover error = 0x2000
Apr 19 21:16:28 vdi kernel: [ 5.865081] gim info:(resume_scheduler:131) No functions on the runlist.
Apr 19 21:16:28 vdi kernel: [ 5.865082] gim info:(resume_scheduler:132) Don't need to restart the scheduler
Apr 19 21:16:28 vdi kernel: [ 5.865839] gim info:(gim_probe:86) AMD GIM probe: pf_count = 1

I hope you glean something useful from this data dump!
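
For anyone trying to read the framebuffer lines in the log above, the static partition layout can be reproduced with a bit of arithmetic. This is only a sketch inferred from the logged numbers, not taken from the gim source: the function names are made up for illustration, and rounding each VF slice down to a 16 MB boundary is an assumption that happens to match the logged vf_fb_size = 480.

```python
ALIGN_MB = 16  # the log mentions rounding to the "nearest 16MB boundary"


def round_down(value_mb, align_mb=ALIGN_MB):
    """Round a size in MB down to a multiple of align_mb."""
    return (value_mb // align_mb) * align_mb


def static_partitions(max_fb_mb, pf_fb_mb, csa_reserved_mb, num_vf):
    """Return (index, base_mb, size_mb) for each VF partition.

    Hypothetical helper: mirrors the values printed by init_fb_static,
    assuming the VF area is split evenly and each slice is aligned down.
    """
    vf_base = pf_fb_mb + csa_reserved_mb      # 256 + 16 = 272 in the log
    vf_total = max_fb_mb - pf_fb_mb           # 8160 - 256 = 7904 in the log
    vf_size = round_down(vf_total // num_vf)  # 7904 // 16 = 494, aligned to 480
    return [(i, vf_base + i * vf_size, vf_size) for i in range(num_vf)]


if __name__ == "__main__":
    for index, base, size in static_partitions(8160, 256, 16, 16):
        print(f"partition {index} base={base}, size={size}")
```

With the values from the log (max_fb_size = 8160, PF = 256, 16 MB reserved before the VF area, 16 VFs) this gives the same bases as the log, from 272 for partition 0 up to 7472 for partition 15.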

@Tungsten-lawrenceb

Tungsten-lawrenceb commented May 7, 2018

@flintcq I have used RDP/VNC/SPICE to connect to the VM without any issues.

03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT GL [FirePro S7150] (prog-if 00 [VGA controller])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT GL [FirePro S7150]
Flags: fast devsel, IRQ 11
Memory at 380fe0000000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at 380ff4000000 (64-bit, prefetchable) [disabled] [size=2M]
I/O ports at a000 [disabled] [size=256]
Memory at de400000 (32-bit, non-prefetchable) [disabled] [size=256K]
Expansion ROM at de440000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/4 Maskable+ 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
Capabilities: [150] Advanced Error Reporting
Capabilities: [200] #15
Capabilities: [270] #19
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] #13
Capabilities: [2d0] #1b
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Capabilities: [330] Single Root I/O Virtualization (SR-IOV)
Capabilities: [400] Vendor Specific Information: ID=0002 Rev=1 Len=070 <?>
Kernel modules: amdgpu
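
The line to look for in the output above is the SR-IOV capability at offset 330. As a rough sketch (the function name is hypothetical, and the pattern assumes lspci prints the capability exactly as it appears in this thread), a saved copy of the lspci output could be checked like this:

```python
import re

# Matches e.g. "Capabilities: [330] Single Root I/O Virtualization (SR-IOV)"
SRIOV_PATTERN = re.compile(
    r"Capabilities: \[([0-9a-fA-F]+)\] Single Root I/O Virtualization \(SR-IOV\)"
)


def find_sriov_offset(lspci_text):
    """Return the capability offset as a hex string, or None if not listed."""
    match = SRIOV_PATTERN.search(lspci_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    import sys

    # Example: lspci -vvv -d 1002:6929 | python3 check_sriov.py
    offset = find_sriov_offset(sys.stdin.read())
    if offset is None:
        print("SR-IOV capability not listed")
    else:
        print(f"SR-IOV capability found at offset {offset}")
```

If the capability does not show up at all, it may be worth rerunning lspci as root before drawing conclusions, since the capability list is normally hidden from unprivileged users.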

modinfo gim
filename: /lib/modules/4.4.0-116-generic/GIM/gim.ko
license: GPL and additional rights
description: GPU IOV MODULE
author: Advanced Micro Devices, Inc.
version: 1.0.0-k
srcversion: 2DEB65B681134BB6A4208B0
depends:
vermagic: 4.4.0-116-generic SMP mod_unload modversions retpoline
parm: fb_option:Frame Buffer Partition.0:static partition; 1: dynamic partition (uint)
parm: sched_option:GPU scheduler. 0: round robin; 1: predictable perf (uint)
parm: vf_num:number of enabled virtual functions. 0: default number of VF in pci sriov config space; N: enable N VFs (uint)
parm: vf_fb:uint
parm: pf_fb:Frame Buffer Size in MegaBytes for PF (uint)
parm: sched_interval:Scheduling time quanta in milliseconds. 0: default quanta(6ms) (uint)
parm: fb_clear:Clear the VRAM for VF. 0:Skipping; 1: clear FB of VF when VF is free or down (uint)

@flintcq

flintcq commented May 7, 2018

@Tungsten-lawrenceb, thanks for your logs and information. We have checked them, and they are almost identical to ours. However, the logs you provided cover only the initialization phase of gim, not the gim output while the VMs are booting, running, rebooting, or shutting down. Most of the issues we have run into occur during those phases.

We also wonder why VNC/SPICE works for you while we only get a black screen. In addition, when we connect through VNC (server), SPICE, or RDP, the DirectX features are not enabled for the vGPU. How did you work around that? We use GPU-Z to check the vGPU; here is our result. How does yours look?
image

@falcon78921
Author

Thanks for the reference file @Tungsten-lawrenceb!

I've been busy with other projects. I'm hoping to continue soon.

@fanghuafan

Hi @falcon78921
I have this problem, too.
This is my dmesg:

[   44.258616] kvm: zapping shadow pages for mmio generation wraparound
[   44.265952] kvm: zapping shadow pages for mmio generation wraparound
[   53.488250] br0: port 2(vnet0) entered forwarding state
[   57.274859] br0: port 3(vnet1) entered forwarding state
[ 2341.093681] perf interrupt took too long (2519 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 3646.628145] SGI XFS with ACLs, security attributes, realtime, no debug enabled
[ 3646.639813] JFS: nTxBlock = 8192, nTxLock = 65536
[ 3646.671487] ntfs: driver 2.1.32 [Flags: R/O MODULE].
[ 3646.716823] QNX4 filesystem 0.2.3 registered.
[ 3867.023132] gim: loading out-of-tree module taints kernel.
[ 3867.024883] gim info:(gim_init:144) Start AMD open source GIM initialization
[ 3867.024887] gim info:(gim_init:147) GPU IOV MODULE - version 1.0.0-k
[ 3867.024889] gim info:(gim_init:149) Copyright (c) 2014-2017 AMD Corporation.
[ 3867.031170] gim info:(parse_config_file:210) AMD GIM fb_option = 0
[ 3867.031175] gim info:(parse_config_file:210) AMD GIM sched_option = 0
[ 3867.031177] gim info:(parse_config_file:210) AMD GIM vf_num = 0
[ 3867.031180] gim info:(parse_config_file:210) AMD GIM pf_fb = 0
[ 3867.031183] gim info:(parse_config_file:210) AMD GIM vf_fb = 0
[ 3867.031186] gim info:(parse_config_file:210) AMD GIM sched_interval = 0
[ 3867.031189] gim info:(parse_config_file:210) AMD GIM fb_clear = 0
[ 3867.031191] gim info:(init_config:305) INIT CONFIG
[ 3867.051095] gim info:(set_new_adapter:614) curr allocated at ffffffffc073bb80
[ 3867.051099] gim error:(set_new_adapter:617) asic does not support SRIOV
[ 3867.051311] gim info:(gim_probe:86) AMD GIM probe: pf_count = 0

Did you solve it?
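
In case it helps with long dmesg dumps like the one above, here is a small sketch that pulls out only the gim error lines. The regular expression is an assumption based on the line format seen in this thread (`gim <level>:(<function>:<line>) <message>`), and the function name is made up for illustration.

```python
import re

# Format observed in this thread: "gim error:(set_new_adapter:617) asic does not ..."
GIM_LINE = re.compile(r"gim (\w+):\((\w+):(\d+)\) (.*)")


def gim_errors(dmesg_text):
    """Return (function, line_number, message) for every gim error line."""
    errors = []
    for line in dmesg_text.splitlines():
        match = GIM_LINE.search(line)
        if match and match.group(1) == "error":
            errors.append(
                (match.group(2), int(match.group(3)), match.group(4).strip())
            )
    return errors


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 gim_errors.py
    for function, line_number, message in gim_errors(sys.stdin.read()):
        print(f"{function}:{line_number}: {message}")
```

Run against the dmesg above, this should report only the `set_new_adapter:617` line, which is followed in the log by `pf_count = 0`.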

@falcon78921
Author

Hi @fanghuafan,

What kernel are you using? Could you post the output of uname -r?

I had to rebuild a Linux kernel with SRIOV support. The patches supplied in the git repo allow you to do this. According to @kzytaruk, the patches have been upstreamed as of kernel 4.6.

Here is a good reference article, if you need assistance building kernels:
https://medium.freecodecamp.org/building-and-installing-the-latest-linux-kernel-from-source-6d8df5345980

If you're using Ubuntu, you can download kernels from http://kernel.ubuntu.com/~kernel-ppa/mainline/.

You can also download the latest kernels from kernel.org.
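
As a quick way to compare a kernel release string against the 4.6 threshold mentioned above, something like the following sketch could be used. The function name is hypothetical, and the check is only a heuristic: it looks at the version number alone and cannot tell whether a distribution or custom kernel has the patches applied or backported.

```python
import re


def kernel_at_least(release, minimum=(4, 6)):
    """Return True if a `uname -r` style string is at least `minimum`.

    Only the leading major.minor numbers are compared; any suffix such as
    "-116-generic" is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    import platform

    # platform.release() reports the same string as `uname -r` on Linux.
    release = platform.release()
    print(release, "->", "4.6 or newer" if kernel_at_least(release) else "older than 4.6")
```

For example, a 4.4.0-116-generic kernel would come out as older than 4.6, which would mean the patches from the repo are still needed there.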

@fanghuafan

Hi @falcon78921
I ran uname -r:

root@kpa:/lib/modules/4.4.117my-very-own-kernel/GIM# uname -r
4.4.117my-very-own-kernel

I'll try your method.
Thank you very much!!!
