Is there a good way of debugging why a kernel module is still in use? Unfortunately with this patch after manually sending a remove udev event, and
which is weird because it doesn't look like this patch is doing much different compared to my last PR which allowed |
In my testing, |
FWIW, I figured I'd see if I can help test this and I hit a segfault when triggering remove. BT:
I ran udevadm trigger in a root shell inside alacritty. I have set up alacritty so that it updates its title depending on the command being run, so I'm guessing some race condition happened. This is with the current git HEAD of both wlroots and sway, on i915 + nouveau (libglvnd patch from https://gitlab.freedesktop.org/glvnd/libglvnd/-/merge_requests/235 included to make sway start). If needed, I can add debuginfo for libffi etc., or other information that would help. EDIT: NVM, PEBKAC. I triggered with MINOR=0 instead of MINOR=1 to remove the secondary GPU. With MINOR=1 it actually works, so kudos from my end anyway. |
This sounds unrelated. Please open a separate issue. |
So I've implemented the other half of this here: https://github.com/neon64/wlroots/tree/drm-hotplug . Should I make a separate PR to wlroots that includes/depends on the changes in this one? |
Nice! Sure, feel free to open a new PR. Have you run into any bug with this PR? |
Unfortunately, I still haven't fully diagnosed why nouveau is still in use.
After some more testing, I've worked out that if, before trying to destroy the card1 backend, I pull out the HDMI cable, then the refcount drops from 5 to 1, and then after running If I don't disconnect the physical cable before using udevadm, then the refcount stays at 5 as long as sway is open. So my theory is that there's something missing in the cleanup code for outputs / drm-connectors, if you don't disconnect the output before the backend destroy. May or may not be relevant from sway.log:
|
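For anyone who wants to watch the refcount discussed above while testing, here is a minimal sketch that reads it from sysfs. It assumes the kernel exposes /sys/module/<name>/refcnt (which requires module unloading support) and /sys/module/<name>/holders; the helper name module_usage and the sysfs_root parameter are made up for illustration and are not part of wlroots or sway.

```python
from pathlib import Path


def module_usage(name, sysfs_root="/sys/module"):
    """Return (refcount, holder_modules) for a loaded kernel module.

    sysfs_root is a hypothetical parameter so the function can be pointed
    at a fake directory tree when testing.
    """
    mod = Path(sysfs_root) / name
    # refcnt holds the module's current reference count as a decimal string.
    refcnt = int((mod / "refcnt").read_text().strip())
    # holders/ contains one entry per other module that depends on this one.
    holders_dir = mod / "holders"
    holders = []
    if holders_dir.is_dir():
        holders = sorted(p.name for p in holders_dir.iterdir())
    return refcnt, holders


# Example (on a machine with nouveau loaded):
#   refcnt, holders = module_usage("nouveau")
```

Note that the holders list only names dependent modules. References held through open device files are counted in refcnt but do not show up there, so a non-zero refcount with an empty holders list would be consistent with something in userspace, or state it left behind in the kernel, still pinning the driver.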
Hmm. Yeah, if the commit fails, we don't destroy the FBs in |
Yep that appears to work - refcount drops to zero and |
The main thing I'm worried about is if this happens prior to shutdown and we try to re-use the CRTC for another connector later on. The kernel would return an error in this case (EINVAL for atomic). We could only ignore errors when we're shutting down the backend, but not sure it's worth it. In all cases, something will be printed to the logs. In any case, feel free to submit a pull request. |
Any use of the DRM FD after the remove event results in a "Permission denied" error.
On GPU unplug, disabling a CRTC can fail with EPERM. References: swaywm#2575 (comment)
Pushed the fix for EPERM. |
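As a rough illustration of the policy described in that commit message, the sketch below decides whether a failed CRTC disable should be treated as an error. It is not the wlroots code (which is C); the function name and the gpu_removed flag are hypothetical, and the only fact taken from this thread is that the disable can fail with EPERM once the GPU has been removed.

```python
import errno


def crtc_disable_failure_is_fatal(err, gpu_removed):
    """Return True if a failed CRTC disable should be reported as an error.

    err is the errno value from the failed commit; gpu_removed is a
    hypothetical flag set once the udev remove event has been seen.
    """
    # After the remove event the DRM FD is no longer usable, so EPERM here
    # is expected and cleanup should carry on.
    if gpu_removed and err == errno.EPERM:
        return False
    # Anything else (for example EINVAL from an atomic commit) stays fatal.
    return True
```

Keeping every other errno fatal means a failure that happens before shutdown, where the CRTC might later be reused for another connector, is still surfaced rather than silently ignored.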
This is half of the GPU hotplug work, see #2423.
To test: