nvidia-driver (535.146.02) in parallel with drm-61-kmod #21

Open
dasTor opened this issue Feb 19, 2024 · 11 comments

@dasTor

dasTor commented Feb 19, 2024

Hi,

I have to run 15-CURRENT to get my integrated graphics (Alder Lake) to work.
So I installed drm-61-kmod, but it fails to run with this repo.

I have hybrid graphics and have already tried building the freebsd/drm-kmod 6.1-lts branch from source,
which runs fine. But when I build this repo against it, I can load nvidia-drm, but every Wayland compositor
refuses to work. I can attach the errors I get later, if you want them.

Is there anything I am missing, or is an nvidia-drm-61-kmod port planned or in the making?

Regards,
Daniel

@amshafer
Owner

Yes, an nvidia-drm-61-kmod port is planned. It shouldn't take too long and should happen whenever nvidia-driver gets bumped to 550.

In the meantime you can follow the build instructions here to build it yourself. You'll have to also check out the matching drm-kmod tree at the 6.1-lts branch to match your installed port. You'll also have to manually apply these patches to get it to build. Sorry for the inconvenience, this will be taken care of in the future port improvements.

@amshafer
Owner

Actually I just went ahead and pushed a version for nvidia-drm-61-kmod, so you can give that a try: https://reviews.freebsd.org/D43987

@dasTor
Author

dasTor commented Feb 20, 2024

Thanks for your effort. Unfortunately the port produces the same result for me:
nvidia-modeset and nvidia load fine
nvidia-drm loads, but the system freezes instantly as soon as it is used, for example by sway

In case you are interested, I have attached some hopefully useful logs.
Or should I write to freebsd-current?

dmesg.txt
sway-i915kms.txt
sway-nvidia.txt

@amshafer
Owner

According to dmesg nvidia-drm.ko loads just fine.

Any other details you can provide? Is it freezing on kernel module load, sway startup, etc.? Is this on the laptop screen or an external monitor? I'm assuming you have the laptop in hybrid graphics mode and not NVIDIA-only mode?

Note there are still some rough edges for sway with NVIDIA. If you are using the default GL backend there can be some ugly tearing, and you'll have to disable hardware cursors. The Vulkan backend doesn't work unless you use the 550 NVIDIA driver (not yet in ports). Suspend/resume of any Wayland apps on the NVIDIA GPU won't work either.
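
One common way to disable hardware cursors in a wlroots-based compositor such as sway is the `WLR_NO_HARDWARE_CURSORS` environment variable. The comment above does not say which mechanism was meant, so treat this as an assumption rather than the recommended setup:

```shell
# Assumption: sway is later started from this same shell, so it inherits
# the variable. WLR_NO_HARDWARE_CURSORS is read by wlroots, the library
# sway is built on, and makes it fall back to software cursors.
export WLR_NO_HARDWARE_CURSORS=1
```

With the variable exported, starting sway from the same shell should pick it up.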

@dasTor
Author

dasTor commented Feb 20, 2024

I'm dual-booting with Arch, so I know the NVIDIA hassle.
These logs are all I have. Since the complete system freezes, I watched /var/log/messages and /var/log/sway via ssh.
The BIOS is in Dual-Mode, and sway + nvidia works fine on Arch.
Is there any other good way to get more useful information / debug logs?

@amshafer
Owner

So it freezes on sway startup? Can you confirm that you can fully boot into the system without a desktop, then start sway manually and see it crash?

I would configure automatic crash dumps on panic; hopefully that will catch whatever is going wrong:

# auto crashdump
kern.coredump=1
debug.minidump=1
debug.debugger_on_panic=0
debug.kdb.break_to_debugger=0

Then check with dumpon -l that you have a swap partition configured as the dump device, and you should be set. I'm guessing it's a panic coming from drm-kmod or nvidia-drm, but it's hard to say without data.
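
If `dumpon -l` shows no dump device, one way to get one configured at boot is through rc.conf. This is a minimal sketch, assuming the stock FreeBSD rc scripts and the default crash directory; neither knob is mentioned in the comment above:

```shell
# /etc/rc.conf fragment (plain sh variable assignments).
# Assumption: AUTO lets the rc scripts pick a suitable swap device
# as the dump device at boot.
dumpdev="AUTO"
# Directory where savecore writes the dump on the next boot.
# /var/crash is the default and is spelled out here only for clarity.
dumpdir="/var/crash"
```

After a reboot, `dumpon -l` should then list the selected device.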

@dasTor
Author

dasTor commented Feb 21, 2024

The order is:
Booting (with i915kms enabled) - works
Starting sway after boot - works
Booting and then kldloading nvidia-drm - works
Starting sway when no output is attached to the nvidia card - works since yesterday
Starting sway when output is attached to nvidia - or running sway and then attaching output - panic

It took me a while to get the crash dumps working. Where can I upload the dump, or should I attach it here? It's 950 MB tar.gzipped.
I attached some info and other files; maybe they already tell you what's wrong.

sway-start.txt
kldload-nvidia.txt
dmesg-hdmi-plugged-in.txt
info.txt
core.txt - is empty
core.1 - is too big to attach

@amshafer
Owner

Thanks for all the details. The core file itself isn't going to be that useful for me, since I don't have your debug symbols. You can open it with kgdb and give me the backtrace though, that would be the most helpful. You can just post it in a comment here as an inline code block.
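
Getting that backtrace could look roughly like the sketch below. It assumes kgdb is installed (on recent FreeBSD it ships with the devel/gdb package), that the running kernel is /boot/kernel/kernel, and that savecore left a vmcore.last symlink in /var/crash. It also assumes kgdb accepts gdb's standard `-batch` and `-ex` options. The function name and the kgdb.txt output file are illustrative:

```shell
#!/bin/sh
# Sketch: write a kernel backtrace from the latest crash dump to kgdb.txt.
# Both paths are assumptions; override them via the environment if needed.
KERNEL="${KERNEL:-/boot/kernel/kernel}"
VMCORE="${VMCORE:-/var/crash/vmcore.last}"

kgdb_backtrace() {
    # -batch runs the given commands and exits; "bt" prints the backtrace.
    kgdb -batch -ex "bt" "$1" "$2"
}

if command -v kgdb >/dev/null 2>&1 && [ -r "$VMCORE" ]; then
    kgdb_backtrace "$KERNEL" "$VMCORE" > kgdb.txt 2>&1
else
    echo "kgdb or $VMCORE not available; nothing to do" >&2
fi
```

Running `kgdb "$KERNEL" "$VMCORE"` interactively and typing `backtrace` at the prompt gives the same information.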

@dasTor
Author

dasTor commented Feb 21, 2024

Thanks for having a look at it

The inline code looks unformatted, so I also attached a txt:
kgdb.txt

(No debugging symbols found in /boot/modules/nvidia-drm.ko)
Reading symbols from /boot/modules/nvidia.ko...
(No debugging symbols found in /boot/modules/nvidia.ko)
Reading symbols from /boot/modules/nvidia-modeset.ko...
(No debugging symbols found in /boot/modules/nvidia-modeset.ko)
__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
57		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) backtrace
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=1) at /usr/src/sys/kern/kern_shutdown.c:403
#2  0xffffffff80b53e50 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:521
#3  0xffffffff80b54352 in vpanic (fmt=0xffffffff811da3dc "%s", ap=ap@entry=0xfffffe0245e408c0)
    at /usr/src/sys/kern/kern_shutdown.c:973
#4  0xffffffff80b541a3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:889
#5  0xffffffff81059aaf in trap_fatal (frame=0xfffffe0245e409c0, eva=32)
    at /usr/src/sys/amd64/amd64/trap.c:950
#6  0xffffffff81059b5e in trap_pfault (frame=0xfffffe0245e409c0, usermode=false, signo=<optimized out>,
    ucode=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:758
#7  <signal handler called>
#8  0xffffffff85549dd1 in nv_drm_gem_prime_import () from /boot/modules/nvidia-drm.ko
#9  0xffffffff85355744 in drm_gem_prime_fd_to_handle () from /boot/modules/drm.ko
#10 0xffffffff8534988d in drm_ioctl_kernel () from /boot/modules/drm.ko
#11 0xffffffff85349bf3 in drm_ioctl () from /boot/modules/drm.ko
#12 0xffffffff8554451b in nv_drm_ioctl () from /boot/modules/nvidia-drm.ko
#13 0xffffffff80de3756 in linux_file_ioctl_sub (fp=0x20, filp=0xfffff8005db80000, cmd=<optimized out>,
    data=<optimized out>, fop=<optimized out>, td=<optimized out>)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:946
#14 linux_file_ioctl (fp=0x20, cmd=<optimized out>, data=<optimized out>, cred=<optimized out>,
    td=<optimized out>) at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:1570
#15 0xffffffff80bceec6 in fo_ioctl (fp=0xfffff80046bb86e0, com=3222037550, data=0xffffffff853851bb,
    active_cred=0xffffffff85551740 <nv_drm_fops>, td=0xfffff80a0136d000) at /usr/src/sys/sys/file.h:368
#16 kern_ioctl (td=td@entry=0xfffff80a0136d000, fd=81, com=com@entry=3222037550,
    data=0xffffffff853851bb "/usr/ports/graphics/drm-61-kmod/work/drm-kmod-drm_v6.1.69/drivers/gpu/drm/drm_prime.c", data@entry=0xfffffe0245e40d50 "") at /usr/src/sys/kern/sys_generic.c:804
#17 0xffffffff80bcebd3 in sys_ioctl (td=0xfffff80a0136d000, uap=0xfffff80a0136d400)
    at /usr/src/sys/kern/sys_generic.c:712
#18 0xffffffff8105a473 in syscallenter (td=0xfffff80a0136d000)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:186
#19 amd64_syscall (td=0xfffff80a0136d000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1192
#20 <signal handler called>
#21 0x0000000843acfcfa in ?? ()
Backtrace stopped: Cannot access memory at address 0x82032b168
(kgdb)
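
As a side check on the backtrace, the `com` value in frames #15 and #16 can be decoded by hand. The sketch below assumes the FreeBSD ioctl layout from sys/ioccom.h: direction in the top three bits, a 13-bit argument length starting at bit 16, the group at bit 8, and the command number in the low byte. The variable names are illustrative:

```shell
#!/bin/sh
# Decode com=3222037550 from frames #15/#16 using the FreeBSD ioctl layout
# (sys/ioccom.h). Variable names here are illustrative only.
com=3222037550

dir=$(( (com >> 29) & 0x7 ))      # 4 = IOC_IN, 2 = IOC_OUT, 1 = IOC_VOID
len=$(( (com >> 16) & 0x1fff ))   # argument size in bytes (IOCPARM_LEN)
group=$(( (com >> 8) & 0xff ))    # ioctl group character (IOCGROUP)
num=$(( com & 0xff ))             # command number within the group

printf 'com=0x%X dir=%d len=%d group=0x%X num=0x%X\n' \
    "$com" "$dir" "$len" "$group" "$num"
# prints: com=0xC00C642E dir=6 len=12 group=0x64 num=0x2E
```

Group 0x64 is ASCII 'd', the DRM ioctl base. Number 0x2e with a 12-byte read/write argument matches DRM_IOCTL_PRIME_FD_TO_HANDLE, which is consistent with drm_gem_prime_fd_to_handle appearing in frame #9.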

@amshafer
Owner

I'm able to reproduce this, will take a look

@amshafer
Owner

amshafer commented Mar 6, 2024

I have a fix for this panic, but the external monitor display does not work so I'm still looking into that. Afaict this isn't something I have tested yet, so this isn't a regression.
