GNOME fails to start after installing NVIDIA proprietary driver #1274

Open
GraciousGazelles opened this issue Sep 26, 2019 · 24 comments
Labels
bug desktop Affects Desktop experience high priority
Projects

Comments

@GraciousGazelles

GraciousGazelles commented Sep 26, 2019

The last two days I've attempted multiple (5+) times to install the latest NVIDIA proprietary driver and after installation GNOME fails to start - I'm left at a black screen with a flashing cursor. I'm able to access the system via SSH and terminal with CTRL+ALT+F2.

I've followed the tutorial here: https://docs.01.org/clearlinux/latest/tutorials/nvidia.html

My system information:

H/W path          Device    Class          Description
======================================================
                            system         To Be Filled By O.E.M. (To Be Filled By O.E.M.)
/0                          bus            Z390 Taichi
/0/0                        memory         64KiB BIOS
/0/10                       memory         32GiB System Memory
/0/10/0                     memory         8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/10/1                     memory         8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/10/2                     memory         8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/10/3                     memory         8GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
/0/1f                       memory         512KiB L1 cache
/0/20                       memory         2MiB L2 cache
/0/21                       memory         16MiB L3 cache
/0/22                       processor      Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
/0/100                      bridge         8th Gen Core 8-core Desktop Processor Host Bridge/DRAM Registers [Coffee Lake S]
/0/100/1                    bridge         Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16)
/0/100/1/0                  display        GP104 [GeForce GTX 1070]
/0/100/1/0.1                multimedia     GP104 High Definition Audio Controller
/0/100/1.1                  bridge         Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8)
/0/100/1.1/0                network        BCM4360 802.11ac Wireless Network Adapter
/0/100/1.2                  bridge         Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x4)
/0/100/1.2/0                bus            ASM1042A USB 3.0 Host Controller
/0/100/1.2/0/0    usb3      bus            xHCI Host Controller
/0/100/1.2/0/0/1            multimedia     EVGA NU Audio
/0/100/1.2/0/1    usb4      bus            xHCI Host Controller
/0/100/12                   generic        Cannon Lake PCH Thermal Controller
/0/100/14                   bus            Cannon Lake PCH USB 3.1 xHCI Host Controller
/0/100/14/0       usb1      bus            xHCI Host Controller
/0/100/14/0/3               bus            ASM107x
/0/100/14/0/4               input          Gaming Mouse G502
/0/100/14/0/d               bus            USB2.0 Hub
/0/100/14/1       usb2      bus            xHCI Host Controller
/0/100/14/1/7               bus            ASM107x
/0/100/14.2                 memory         RAM memory
/0/100/16                   communication  Cannon Lake PCH HECI Controller
/0/100/17                   storage        Cannon Lake PCH SATA AHCI Controller
/0/100/1c                   bridge         Cannon Lake PCH PCI Express Root Port #7
/0/100/1c/0       /dev/fb0  bridge         ASM1184e PCIe Switch Port
/0/100/1c/0/1               bridge         ASM1184e PCIe Switch Port
/0/100/1c/0/1/0   wlp6s0    network        Dual Band Wireless-AC 3168NGW [Stone Peak]
/0/100/1c/0/3               bridge         ASM1184e PCIe Switch Port
/0/100/1c/0/3/0   enp7s0    network        I211 Gigabit Network Connection
/0/100/1c/0/5               bridge         ASM1184e PCIe Switch Port
/0/100/1c/0/7               bridge         ASM1184e PCIe Switch Port
/0/100/1c/0/7/0             storage        ASM1062 Serial ATA Controller
/0/100/1f                   bridge         Z390 Chipset LPC/eSPI Controller
/0/100/1f.4                 bus            Cannon Lake PCH SMBus Controller
/0/100/1f.5                 bus            Cannon Lake PCH SPI Controller
/0/100/1f.6       eno1      network        Ethernet Connection (7) I219-V
@puneetse
Contributor

puneetse commented Sep 26, 2019

What version of Clear Linux and what version of the NVIDIA driver?

I noticed 430.x on the Clear Linux 5.3 kernel is not working. I haven't had a chance to see if this is a Clear Linux specific issue or an upstream issue. As a workaround, interrupt the bootloader by holding the SPACE key as the system boots and select an older 5.2.x kernel to boot.

@mrkz mrkz added desktop Affects Desktop experience high priority labels Sep 27, 2019
@mrkz mrkz self-assigned this Sep 27, 2019
@mrkz mrkz removed the new label Sep 27, 2019
@mrkz mrkz added this to To Do in Desktop via automation Sep 27, 2019
@GraciousGazelles
Author

Sorry, I should have specified the versions.

This was with the 5.3 kernel and NVIDIA Driver 430.5.

I just tried the 435.21 drivers today with 5.3 and had no issues installing them or booting into GNOME after the first install. The system was stable until the first reboot. After the reboot I was left with a flashing cursor and could no longer access a terminal; the system was unresponsive.

Reverting to a 5.2.x kernel in the meantime.

@mrkz

mrkz commented Oct 3, 2019

@gmatler did the combo kernel 5.2.x + NVIDIA 430.5 get you a successful boot?

@puneetse
Contributor

puneetse commented Oct 7, 2019

@mrkz rolling back to 31080 with a 5.2 kernel works for me. Anything newer does not.

The driver appears to build, install, and load OK:

$ lsmod | grep ^nvidia
nvidia_drm             45056  0
nvidia_modeset       1122304  1 nvidia_drm
nvidia              19513344  1 nvidia_modeset

The X.org log is pretty generic:


[   145.811] (II) NVIDIA GLX Module  435.21  Sun Aug 25 08:14:27 CDT 2019
[   145.811] (II) NVIDIA: The X server does not support PRIME Render Offload.
[   146.187] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA GPU at PCI:1:0:0.  Please
[   146.187] (EE) NVIDIA(GPU-0):     check your system's kernel log for additional error
[   146.187] (EE) NVIDIA(GPU-0):     messages and refer to Chapter 8: Common Problems in the
[   146.187] (EE) NVIDIA(GPU-0):     README for additional information.
[   146.187] (EE) NVIDIA(GPU-0): Failed to initialize the NVIDIA graphics device!
[   146.187] (EE) NVIDIA(0): Failing initialization of X screen

The only things that stick out to me from journalctl are these lines, which do not appear on a successful start:

2:44 kernel: nvidia 0000:01:00.0: DMAR: 32bit DMA uses non-identity mapping
2:44 kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x24:0x59:1184)
2:44 kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
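For anyone trying to confirm they are hitting the same failure, the lines quoted above can be pulled out of the kernel log with a small filter. This is only a sketch: the function name `filter_gpu_init_errors` is made up for illustration, and the patterns simply match the `NVRM:` and `DMAR:` prefixes seen in the log excerpt in this thread.

```shell
# Keep only kernel log lines that mention the NVIDIA resource manager (NVRM)
# or the Intel IOMMU (DMAR). Reads log text on stdin, writes matches to stdout.
# The function name is hypothetical; the patterns come from the lines quoted above.
filter_gpu_init_errors() {
    # "|| true" keeps the exit status at 0 when nothing matches
    grep -E 'NVRM:|DMAR:' || true
}

# Typical use on a live system (kernel messages from the current boot):
#   journalctl -k -b --no-pager | filter_gpu_init_errors
```

An empty result on a boot where the desktop fails would suggest a different problem from the one discussed here.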

@puneetse
Contributor

puneetse commented Oct 7, 2019

Another data point: I rebuilt the 5.2.17-836 kernel and booted it on Clear Linux version 31230 and the nvidia driver started working again.

This indicates to me the issue is indeed somewhere in the kernel, and not another gcc/gnome change.

@puneetse
Contributor

puneetse commented Oct 7, 2019

Another finding that can hopefully help root cause: changing kernel parameter intel_iommu=igfx_off to intel_iommu=off also resolves the issue on the 5.3 kernel.

EDIT: intel_iommu=on also works
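To make this workaround survive reboots instead of editing the command line by hand each time, something like the following might work. It is a sketch under several assumptions: that clr-boot-manager reads extra parameters from `/etc/kernel/cmdline.d/` and parameters to drop from `/etc/kernel/cmdline-removal.d/`, that `intel_iommu=igfx_off` is part of the default command line, and that the file name `nvidia-iommu.conf` (chosen here for illustration) is acceptable. The function `stage_iommu_override` is hypothetical and takes the configuration root as an argument so it can be tried in a scratch directory first.

```shell
# Stage two drop-in files under the given configuration root:
#   cmdline.d/          parameters to add to the kernel command line
#   cmdline-removal.d/  parameters to remove from the default command line
# The directory layout is an assumption about clr-boot-manager; verify it
# against the Clear Linux documentation before using it on a real system.
stage_iommu_override() {
    mkdir -p "$1/cmdline.d" "$1/cmdline-removal.d" || return 1
    printf '%s\n' 'intel_iommu=on' > "$1/cmdline.d/nvidia-iommu.conf" || return 1
    printf '%s\n' 'intel_iommu=igfx_off' > "$1/cmdline-removal.d/nvidia-iommu.conf"
}

# On a real system, as root, followed by regenerating the boot entries:
#   stage_iommu_override /etc/kernel && clr-boot-manager update
```

After the next reboot, `cat /proc/cmdline` shows whether the change took effect.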

@SPAstef

SPAstef commented Oct 8, 2019

So you're saying it's a kernel bug? I mean, is it a coincidence that GNOME was updated at the same time?

@SPAstef

SPAstef commented Oct 14, 2019

@puneetse any updates on this? I literally can't use Clear Linux...

@puneetse
Contributor

@SPAstef Sorry, I don't have much of an update, but try the workaround I posted above. It should at least unblock you. You can hold SPACE to interrupt the bootloader and hit e to edit the kernel command line.

One more thing I was able to test: the "mainline" kernel package also has the issue, which tells me it's probably not one of the Clear Linux kernel patches causing this. Maybe it's a conflicting config or an upstream bug (but I'd expect more noise if it were).

@SPAstef

SPAstef commented Oct 14, 2019

OK, I didn't know about that SPACE interrupt thing. I have it in dual boot with Windows; I hope it will work anyway. What would be the problem with shipping the kernel with the parameter you mentioned already set?

@bryteise
Member

Likely an issue between the NVIDIA driver and the specific kernel version. Changing the kernel command line in this case would negatively affect other uses, though, so we wouldn't be likely to make it the default.

@SPAstef

SPAstef commented Oct 15, 2019

Thanks. Another issue that I have had only since the latest version of GNOME: the Xorg session doesn't recognize the external monitor (attached to the GPU via DisplayPort), while Wayland does. (This happens with the Nouveau drivers.) Should I open a new issue for this?

EDIT: anyway, it did kind of work... While I was writing this message saying it didn't, it actually did. It took 5 minutes to show GDM, but in the end it did...

@SPAstef

SPAstef commented Oct 23, 2019

With the latest version it seems that even setting the intel_iommu option doesn't work anymore.

@puneetse
Contributor

puneetse commented Oct 24, 2019

With the latest version it seems that even setting the intel_iommu option doesn't work anymore.

@SPAstef It's still working for me on 31380. Check cat /proc/cmdline to make sure it really booted without the intel_iommu option.
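The check suggested here can be narrowed down to just the `intel_iommu` settings. The helper below is a sketch, and `intel_iommu_values` is a name invented for this example. It takes a kernel command line as one string and prints the value of every `intel_iommu=` parameter it finds, one per line, so an unexpected `igfx_off` left over from the defaults is easy to spot.

```shell
# Print the value of each intel_iommu= parameter found in a kernel command
# line (passed as a single string). Prints nothing if the parameter is absent.
# The body runs in a subshell so "set -f" does not leak into the caller.
intel_iommu_values() (
    set -f                      # no glob expansion while splitting into words
    for word in $1; do
        case $word in
            intel_iommu=*) printf '%s\n' "${word#intel_iommu=}" ;;
        esac
    done
)

# Typical use against the running kernel:
#   intel_iommu_values "$(cat /proc/cmdline)"
```

All occurrences are printed rather than only the last one, since the same parameter can appear more than once on the command line.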

Likely an issue between nvidia driver and the specific kernel version.

@bryteise while that is a recurring cat-and-mouse game, I don't think this particular issue is a generic NVIDIA vs Linux kernel problem.

I would expect more noise from other distro users if that were the case, and I tested a Fedora 30/31 system with NVIDIA drivers (5.3.6 kernel, GNOME 3.32/3.34, with and without intel_iommu=igfx_off) and it works fine. So I suspect this is somehow specific to Clear Linux.

@SPAstef

SPAstef commented Nov 6, 2019

@puneetse now it works again, but the internal display isn't recognized (this might be related to the kernel option, since the internal display is connected to the Intel on-board GPU). Still hoping this will get fixed soon 😄

@robbierobs

I have the same issue, and for now it has been resolved by removing intel_iommu=igfx_off and adding intel_iommu=on as @SPAstef suggested.

I've had this issue since I installed Clear Linux last week, but I was using the LTS kernel to get around it. This is the first time I'm seeing this, so I can't say if anything changed or had an effect.

@SPAstef

SPAstef commented Nov 24, 2019

For completely different reasons, I happened to disable Intel VT-d in the BIOS, and now everything seems to be working normally. I don't know if it is related to this... Or did you fix it silently?

@JeffBolle

I just wanted to confirm that disabling Intel VT-d in the BIOS resolved this issue for me. I had no graphical boot, with the same generic failure message in my Xorg log. After disabling VT-d I booted into GDM / GNOME just fine. I did not change the intel_iommu setting; it is currently set to intel_iommu=igfx_off, and I have the internal graphics card disabled in the BIOS as well.
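One rough way to see from the running system whether the firmware change took effect is to count IOMMU groups in sysfs. This is a sketch that rests on an assumption: an active IOMMU usually shows up as numbered entries under `/sys/kernel/iommu_groups`, and the directory is empty or missing otherwise. The function name `iommu_group_count` is hypothetical, and the optional argument exists only so the function can be pointed at another directory.

```shell
# Count the entries under the IOMMU groups directory (default:
# /sys/kernel/iommu_groups). A count of 0 suggests no IOMMU is active;
# treat that reading as an assumption, not a definitive check.
iommu_group_count() (
    groups_dir=${1:-/sys/kernel/iommu_groups}
    count=0
    for entry in "$groups_dir"/*; do
        # the unexpanded glob does not exist when the directory is empty
        [ -e "$entry" ] && count=$((count + 1))
    done
    printf '%s\n' "$count"
)

# Typical use:
#   iommu_group_count
```

Comparing the count before and after toggling VT-d in the BIOS should show whether the setting actually changed what the kernel sees.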

@skabber

skabber commented Mar 3, 2020

I have this problem as well. I have followed all the steps here and still get the “Oh no! Something has gone wrong” error screen.
My BIOS does not allow me to disable the integrated graphics.
This is on a Razer Blade 13".

@SPAstef

SPAstef commented Mar 4, 2020

I have this problem as well. I have followed all the steps here and still get the “Oh no! Something has gone wrong” error screen.
My BIOS does not allow me to disable the integrated graphics.
This is on a Razer Blade 13".

My laptop doesn't either; you shouldn't need to disable it. Just follow the additional steps for Optimus laptops.

@skabber

skabber commented Mar 4, 2020

Just follow the additional steps for Optimus laptops

Am I missing what these additional steps are? The only reference I see about Optimus in the docs is to turn off the iGPU via firmware.
[Screenshot attached]
I'm unsure how to do that since my BIOS doesn't have that option.

@SPAstef

SPAstef commented Mar 4, 2020

Just follow the additional steps for Optimus laptops

Am I missing what these additional steps are? The only reference I see about Optimus in the docs is to turn off the iGPU via firmware.
[Screenshot attached]
I'm unsure how to do that since my BIOS doesn't have that option.

They've just been lazy writing that tutorial. Follow this:
https://community.clearlinux.org/t/bash-scripts-to-automate-installation-of-nvidia-proprietary-driver/368

@herreradimas

I found a solution. I installed lightdm (sudo apt install lightdm) and reconfigured it (sudo dpkg-reconfigure lightdm).
I don't understand why gdm3 fails to start GNOME with NVIDIA.
My laptop runs Debian 10: Linux 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

@mrkz mrkz removed their assignment Jul 11, 2022
@ricardobranco777

I found a solution. I installed lightdm (sudo apt install lightdm) and reconfigured it (sudo dpkg-reconfigure lightdm). I don't understand why gdm3 fails to start GNOME with NVIDIA. My laptop runs Debian 10: Linux 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

That didn't work for me on Wayland.

Projects
Desktop
  
To Do

10 participants