[GVT-d][KBL-NUC]System print call trace"drm_mode_config_cleanup" and "kernel NULL pointer" after run "echo 0000:00:02.0 > /sys/bus/pci/devices/0000:00:02.0/driver/unbind" #502
Comments
It's a native i915 driver issue. The driver fails to unbind. |
Is this reported upstream? |
No. @TerrenceXu tested the upstream kernel and could not reproduce this issue. |
Are you following https://github.com/intel/gvt-linux/wiki/GVTg_Setup_Guide ? |
GVT-d guide is at https://github.com/intel/gvt-linux/wiki/GVTd_Setup_Guide |
@miguelinux No, it was found during GVT-d setup: before passing the GPU device through to a virtual machine, the i915 driver first has to be unbound from the device. |
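For readers following along, the unbind step described above can be sketched as a small shell helper. This is only an illustration, not part of the GVT-d guide: the function name `unbind_pci_device` and the `SYSFS_ROOT` override are hypothetical, added so the function can be dry-run against a scratch directory instead of the real `/sys`.

```shell
#!/bin/sh
# Sketch: unbind a PCI device from its current driver via sysfs.
# SYSFS_ROOT is a hypothetical override for dry runs; on a real host it
# defaults to /sys and the write needs root.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

unbind_pci_device() {
    bdf="$1"    # PCI address, e.g. 0000:00:02.0
    dev="$SYSFS_ROOT/bus/pci/devices/$bdf"

    # No "driver" entry means nothing is bound to the device.
    if [ ! -e "$dev/driver" ]; then
        echo "$bdf: no driver bound, nothing to do"
        return 0
    fi

    # Writing the address to driver/unbind asks the bound driver to release it.
    printf '%s' "$bdf" > "$dev/driver/unbind" || return 1
    echo "$bdf: unbind requested"
}
```

On the affected machine this would be called as `unbind_pci_device 0000:00:02.0`, which performs the same write as the `echo` command in the issue title.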
Hi All, any progress? |
FWIW, I can also easily reproduce that with the latest version of Clear Linux (i.e.: 28190), tried both the @TerrenceXu , did you use the exact same kernel configuration when you tested with the upstream kernel? |
@gvancuts , I used the same kernel configuration to build the upstream 4.20.13 kernel, and this issue did not happen. |
Earlier today I built an upstream 4.20.14 (same version as the latest Clear Linux native kernel), using the exact same kernel config as the one we use in Clear Linux. But I am also seeing the same call trace with the upstream kernel. It's getting late over here but I'll see if I can try with the upstream 4.20.13 kernel. |
Same with upstream 4.20.13 using the Clear Linux configuration. Can you double-check you are indeed using the exact same kernel configuration when comparing the upstream and Clear Linux native kernels? |
My last test of the day was to make the i915 a module (instead of built-in) but I still see the same call trace in |
@gvancuts, it looks like this issue is related to Clear Linux OS or Clear Linux OS patches. |
This points at at least some responsibility in user-space then, I don't really know where to go from here so I'll leave it to the experts to jump in and help you out. I'll just add a couple of notes:
|
@miguelinux Do you have the hardware available to try and reproduce this issue? |
@bryteise I think we don't have KBL-NUC at GDC |
@nesiusra Do you know who might have a KBL handy to work on this? |
@TerrenceXu I've been reminded that this could be due to a bug in the i915 driver related to how we load firmware. Could you disable CONFIG_EXTRA_FIRMWARE in the config and retry (that line loads the DMC firmware)? |
@bryteise I think we don't have KBL-NUC at GDC
If you have Skull Canyon, you can reproduce it on that NUC too.
|
How does Clear load dmc/guc differently? |
run modinfo i915 | grep -i guc |
@seanvk i915 is built-in on Clear I think for boot speed reasons. |
@bryteise dmc will load on native only, guc/huc will not be loaded |
@chivakker Isn't this an issue for native? |
@bryteise , we can still reproduce this issue after disabling CONFIG_EXTRA_FIRMWARE in the kernel config. :( |
Well that's unpleasantly surprising, hrm. Thoughts @fenrus75 ? |
You might need CONFIG_VFIO_IOMMU_TYPE1 and CONFIG_VFIO_PCI_IGD |
Those 2 are enabled in |
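One way to confirm that the two options mentioned above are enabled is to grep the kernel config file. The following is a minimal sketch; the helper name `check_kernel_options` is hypothetical, and the location of the config file varies by distribution, so the path is left as an argument.

```shell
#!/bin/sh
# Sketch: report whether each named option is enabled (=y or =m)
# in the given kernel config file. Returns 1 if any option is missing.
check_kernel_options() {
    config_file="$1"; shift
    missing=0
    for opt in "$@"; do
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            missing=1
        fi
    done
    return "$missing"
}
```

It would be called as `check_kernel_options <path-to-config> CONFIG_VFIO_IOMMU_TYPE1 CONFIG_VFIO_PCI_IGD`, where the config path is whatever file your kernel build or distribution provides.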
Okay. One last thought. Maybe you also need echo -n auto > /sys/bus/pci/devices/0000:00:02.0/power/control |
works for me on Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz |
Now I did it when I had a graphics target up and of course that got messy, but aside from pulling the rug out from under the target mode, it worked fine. |
Just to double check for errors. I went back and did the following:
First, reboot with iommu enabled:
Next I disabled graphical target:
Then I loaded the modules:
Then I unbound the i915:
dmesg was clean and my remote shell to the laptop is functional, no lockups or kernel oops.
[ 236.133164] calling vfio_virqfd_init+0x0/0x1000 [vfio_virqfd] @ 1247
Sean |
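To make "dmesg was clean" checkable, a saved kernel log can be scanned for the two signatures named in this issue's title. This is a rough sketch only: the helper name `log_has_unbind_crash` is hypothetical, and the match strings are taken from the issue title rather than from a full trace.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the log file contains either signature
# from the issue title, fail (non-zero) if the log looks clean.
log_has_unbind_crash() {
    log_file="$1"
    grep -Eq 'drm_mode_config_cleanup|kernel NULL pointer' "$log_file"
}
```

A typical use would be to save the log with `dmesg > dmesg.txt` right after the unbind and then run `log_has_unbind_crash dmesg.txt && echo "crash signature found"`.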
No issue seen on current Clear release either: root@clr-bend-svkelley /home/seanvk # uname -a This is a KBL-R/Coffeelake based system |
@ahkok @iphutch I suggest you close this issue as unable to reproduce unless @TerrenceXu can respond on the steps I've suggested and reproduces the error. |
I am still seeing the same issue after adding |
Same with the latest. The only thing that makes the kernel crash go away is to not start the desktop environment. |
welllllll starting the desktop environment and then yanking away the gpu is
a bit... harsh innit?
|
Yes, especially to the user staring at the monitor :-) @seanvk mentioned in an earlier comment (#502 (comment)) that he did not see the crash even when he had a graphics target up and running. |
@seanvk @gvancuts I can still reproduce it with 28320 (5.0.2-717.native).
|
|
Describe the bug
The system prints a call trace mentioning "drm_mode_config_cleanup" and a "kernel NULL pointer" error after trying to unbind the Intel graphics card with "echo 0000:00:02.0 > /sys/bus/pci/devices/0000:00:02.0/driver/unbind".
To Reproduce
Steps to reproduce the behavior:
Expected behavior
"00:02.0" unbinds cleanly and GVT-d works.
Screenshots
N/A
Environment (please complete the following information):
Additional context