Linux kernel 4.18 breaks display on 14,3 #73

Open
jboyens opened this Issue Aug 19, 2018 · 33 comments

Comments

@jboyens

jboyens commented Aug 19, 2018

Just upgraded to 4.18 and it breaks the display in some way. I haven't figured it out yet, but it locks up at a backlit black screen.

I'm not using a Login Manager or anything like that. Disabling kernel mode setting got me back to a usable prompt and let me downgrade the kernel.

@jboyens

jboyens commented Aug 19, 2018

I've tried with 4.18 -> 4.18.3 to no avail. There were a ton of AMDGPU "fixes" that went into 4.18 so maybe something there broke it.

@jboyens

jboyens commented Aug 19, 2018

Definitely the amdgpu driver. If loaded early in boot, it black screens faster.

@jboyens jboyens changed the title from Linux kernel 4.18 breaks display to Linux kernel 4.18 breaks display on 14,3 Aug 19, 2018

@roadrunner2

Contributor

roadrunner2 commented Aug 26, 2018

I can confirm that booting using the dGPU gives a black screen under 4.18 on a MBP13,3 too; but booting with the iGPU works fine (see #6 for how to switch to using the iGPU).

@turenar

turenar commented Aug 29, 2018

I tried to boot Linux 4.18.5 on an MBP 14,3 (dGPU) and found that my external display (connected via HDMI) works and is treated as the only display connected to the MBP.
The built-in display is recognized at first, but seems to be lost after a few seconds...

@andersensam

andersensam commented Sep 5, 2018

Downgrading to 4.15.18 on Ubuntu fixed the issue.

I noticed when building 4.18.x that I was getting errors about missing firmware for amdgpu, perhaps forcing a revert to radeon...

@jboyens

jboyens commented Sep 8, 2018

Looking into this a bit more, I see this in sway:

  Current mode: 0x0 @ 0.000000 Hz
  Position: 5120,0
  Scale factor: 1x
  Transform: normal

That indicates to me that the display is being seen, but it seems like the backlight is off.

@arno01

arno01 commented Sep 10, 2018

Same with 4.19-rc3 :/

@wsy2220

wsy2220 commented Sep 11, 2018

4.18 also breaks 15,1

@jboyens

jboyens commented Sep 11, 2018

Booting with the iGPU does not work for me on 4.18; I get some serious graphical glitches. On <=4.17, external displays don't work with the iGPU, only with the dGPU. On 4.18, only the external displays work; eDP-1 is detected, just not enabled (no backlight). Using the drm debug flags I can see that the 2880x1440 modeline is being rejected as invalid. I wonder if I can provide a KMS modeline as a kernel arg.

I dug through a bunch of amdgpu bugs looking for hints or fixes, but I didn't find anything terribly useful.
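
The "detected, just not enabled" state described above can be read straight from sysfs, which may be quicker than digging through drm debug output. The following is a minimal sketch, assuming the usual /sys/class/drm layout with per-connector "status" and "enabled" files; the function name and the optional root-directory argument are hypothetical and only there to make the snippet testable.

```shell
#!/bin/sh
# Print each DRM connector with its "status" (connected/disconnected) and
# "enabled" (enabled/disabled) attributes as exposed by the kernel in sysfs.
# $1 is a hypothetical override of the sysfs root, useful for testing.
list_drm_connectors() {
    drm_root="${1:-/sys/class/drm}"
    for dir in "$drm_root"/card*-*; do
        # Skip entries without a readable status file (also covers an
        # unmatched glob, which is left as a literal pattern).
        [ -r "$dir/status" ] || continue
        name=$(basename "$dir")
        status=$(cat "$dir/status")
        enabled=$(cat "$dir/enabled" 2>/dev/null || echo unknown)
        printf '%s status=%s enabled=%s\n' "$name" "$status" "$enabled"
    done
}

# Usage: list_drm_connectors
```

A connector reported as connected but disabled would match the symptom reported here for eDP-1.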

@roadrunner2

Contributor

roadrunner2 commented Sep 11, 2018

@wsy2220 and @jboyens: have you tried booting with i915.fastboot=1 (on the kernel command line)?

@christophgysin

Contributor

christophgysin commented Sep 11, 2018

I believe I have the same issue on 13,1 starting from 4.17. Currently running 4.16.8.

I have no AMD GPU. It seems that the display goes black immediately after the bootloader, even without i915 in the initrd (i915.fastboot=1 does not make a difference).

@melentye

melentye commented Sep 11, 2018

Not sure if that's the same issue because for 14,3 it started with kernel 4.18, not with 4.17.

@jboyens

jboyens commented Sep 11, 2018

> @wsy2220 and @jboyens: have you tried booting with i915.fastboot=1 (on the kernel command line)?

I worked on this a bit this morning. On 4.17, I needed to remove amdgpu from the initrd and blacklist the module. I also added i915 fastboot and the iGPU seems to be working fine, but only for the embedded display and not any externals.

@jboyens

jboyens commented Sep 11, 2018

@roadrunner2 Using your various tutorials and bug reports I was able to get both GPUs enabled at the same time and switch off the dGPU.

It required adding: amdgpu.dc=1 i915.fastboot=1 to the kernel commandline and NOT adding any modules to the initrd.

@wsy2220

wsy2220 commented Sep 12, 2018

> @wsy2220 and @jboyens: have you tried booting with i915.fastboot=1 (on the kernel command line)?

Tried, without any effect.

@roadrunner2

Contributor

roadrunner2 commented Sep 13, 2018

I just played around with this a bit, and here are my notes. I tried things out on both 4.18.7 and 4.19.0-rc3 kernels, and the behaviour was the same.

Setup:

  • MBP13,3 with Radeon Pro 450
  • Apple USB-C to HDMI adapter
  • external HDMI display
  • boot using the iGPU and i915.fastboot=1 and the dGPU enabled (but not active/used).
  • both i915 and amdgpu modules are present in initrd

Observations:

  • the dGPU has to be powered on for the external display to be seen/used (looks like the gmux is for the eDP only, and the external DP is wired up to the dGPU only); however, it does not need to be "active" (i.e. rendering and eDP driving is still being done on/by the iGPU)
  • powering down the dGPU (echo OFF | sudo tee /sys/kernel/debug/vgaswitcheroo/switch) and powering it back up (echo ON | sudo tee /sys/kernel/debug/vgaswitcheroo/switch) causes the external display still to be seen, but nothing is actually shown on it (white/gray screen) - the amdgpu driver is very much not happy after powering back on (see numerous errors in dmesg)
  • amdgpu.dc=1 is not necessary, as this is default (dmesg | grep 'Display Core'); forcing it off (amdgpu.dc=0) only slightly changed the behaviour in as much as after powering the dGPU down and up again the external display wouldn't even be recognized anymore. (Update: prior to 4.18 amdgpu.dc=1 was necessary unless the kernel was compiled with DRM_AMD_DC_PRE_VEGA=y.)

I can also confirm that booting 4.19.0-rc3 with the dGPU active results in a black screen too.
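
Besides grepping dmesg for 'Display Core', the value the driver was loaded with can be read from the module parameter file. This is a rough sketch, assuming amdgpu exposes the parameter at /sys/module/amdgpu/parameters/dc and uses the values from its parameter description (1 = enable, 0 = disable, -1 = auto); the function name and the optional path argument are hypothetical.

```shell
#!/bin/sh
# Report how the amdgpu "dc" module parameter is currently set.
# $1 is a hypothetical override of the parameter file path, for testing.
amdgpu_dc_state() {
    param_file="${1:-/sys/module/amdgpu/parameters/dc}"
    if [ ! -r "$param_file" ]; then
        echo "amdgpu dc parameter not found"
        return 1
    fi
    case "$(cat "$param_file")" in
        -1) echo "auto" ;;      # driver decides per ASIC
        0)  echo "disabled" ;;  # amdgpu.dc=0 on the command line
        1)  echo "enabled" ;;   # amdgpu.dc=1 on the command line
        *)  echo "unknown" ;;
    esac
}

# Usage: amdgpu_dc_state
```

Note that "auto" only says no explicit value was given; whether Display Core actually initialized still has to be confirmed from dmesg as described above.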

@arno01

arno01 commented Sep 17, 2018

I have just tried booting 4.18.8 with amdgpu.dc=0 (without playing with switching iGPU/dGPU nor external screens but rather sticking to the default dGPU) and it is working. I'll probably just stick to this option then so I can run the new kernels on my MBP 14,3. :-)

$ cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-4.18.8-041808-generic root=UUID=[redacted] ro modprobe.blacklist=brcmfmac amdgpu.dc=0

$ cat /proc/version 
Linux version 4.18.8-041808-generic (kernel@gloin) (gcc version 8.2.0 (Ubuntu 8.2.0-6ubuntu1)) #201809150431 SMP Sat Sep 15 08:33:36 UTC 2018

$ glxinfo
...
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org (0x1002)
    Device: AMD Radeon (TM) RX Graphics (POLARIS11 / DRM 3.26.0 / 4.18.8-041808-generic, LLVM 6.0.0) (0x67ef)
    Version: 18.0.5
    Accelerated: yes
    Video memory: 4049MB
...
    Currently available dedicated video memory: 4049 MB
OpenGL vendor string: X.Org
OpenGL renderer string: AMD Radeon (TM) RX Graphics (POLARIS11 / DRM 3.26.0 / 4.18.8-041808-generic, LLVM 6.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.0.5
OpenGL core profile shading language version string: 4.50
...

Upd

I have also noticed that my 4.18 (and probably 4.19 too) kernels do not have CONFIG_DRM_AMD_DC_PRE_VEGA=y.

@melentye

melentye commented Sep 17, 2018

Kernel parameter amdgpu.dc=0 suggested by @arno01 works for me on macbook 14,3 with kernel 4.18.7 too.
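
For anyone who wants the amdgpu.dc=0 workaround to survive reboots, one possible way is to append it to the GRUB defaults. This is only a sketch, assuming a Debian/Ubuntu-style /etc/default/grub with a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line and GNU sed; the function name is hypothetical, and other bootloaders (systemd-boot, rEFInd) need a different approach.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless it is
# already there. $1 is the parameter, $2 the GRUB defaults file
# (defaults to /etc/default/grub). Requires GNU sed for "-i".
add_kernel_param() {
    param="$1"
    grub_file="${2:-/etc/default/grub}"
    # Already present on the relevant line: nothing to do.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$grub_file" | grep -qF -- "$param"; then
        return 0
    fi
    # Re-emit the quoted value with the new parameter appended.
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"|" "$grub_file"
}

# Usage (as root): add_kernel_param amdgpu.dc=0
```

On Ubuntu the change takes effect after running sudo update-grub and rebooting; cat /proc/cmdline then shows whether the parameter was picked up.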

Edit: https://bugzilla.kernel.org/show_bug.cgi?id=200695 looks relevant, doesn't it?

@arno01

arno01 commented Sep 17, 2018

@melentye yes, it does look like it is relevant!
I am now inclined to believe that the issue is caused by a combination of the following two things, rather than simply by kernel versions v4.18 and upwards:

  1. lack of CONFIG_DRM_AMD_DC_PRE_VEGA=y in the kernel config;
  2. amdgpu.dc=1 which is enabled by default as @roadrunner2 observed above;

So when these two options are true, users see the black screen.

The CONFIG_DRM_AMD_DC_PRE_VEGA option appeared in 4.15 (according to a quick search https://cateee.net/lkddb/web-lkddb/DRM_AMD_DC_PRE_VEGA.html )
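
To check the first condition on a running system, the kernel config can be searched for the option. A minimal sketch, assuming the distribution installs the config as /boot/config-$(uname -r) (Ubuntu does; others may only provide /proc/config.gz, which would need zcat first); the function name is hypothetical.

```shell
#!/bin/sh
# Print the config line for a kernel option, or say that it is absent.
# $1 is the option name, $2 the config file (defaults to the running
# kernel's /boot/config-* file).
kernel_config_option() {
    option="$1"
    config_file="${2:-/boot/config-$(uname -r)}"
    [ -r "$config_file" ] || { echo "cannot read $config_file" >&2; return 1; }
    # Matches both "OPTION=y" and "# OPTION is not set", but not longer
    # option names that merely share the prefix.
    grep -E "^(# )?${option}[= ]" "$config_file" || echo "${option} not present"
}

# Usage: kernel_config_option CONFIG_DRM_AMD_DC_PRE_VEGA
```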

Update

It looks like CONFIG_DRM_AMD_DC_PRE_VEGA is gone from 4.18 https://github.com/torvalds/linux/blob/v4.17/drivers/gpu/drm/amd/display/Kconfig#L12

And according to torvalds/linux@2fa4173#diff-531835135ce1f53a5430016399b993e4L2097
it does not look relevant.

I think we should see/bisect what happened between these two:

https://github.com/torvalds/linux/commits/v4.17/drivers/gpu/drm/amd
https://github.com/torvalds/linux/commits/v4.18-rc1/drivers/gpu/drm/amd
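
A bisection between those two tags can be restricted to commits that touch the AMD DRM tree, which keeps the number of kernels to build and boot down. This is a sketch of one way to start it, assuming a local clone of the kernel repository; the wrapper function is hypothetical, while git bisect start with a path limiter is standard git.

```shell
#!/bin/sh
# Start a bisection limited to commits that touch one path.
# $1 = known-good ref, $2 = known-bad ref,
# $3 = path to restrict to (defaults to the AMD DRM driver tree).
start_path_bisect() {
    good="$1"
    bad="$2"
    path="${3:-drivers/gpu/drm/amd}"
    git bisect start "$bad" "$good" -- "$path"
}

# Usage (inside a linux.git checkout):
#   start_path_bisect v4.17 v4.18-rc1
#   # build and boot the checked-out commit, then mark it:
#   git bisect good    # display works
#   git bisect bad     # black screen
#   git bisect reset   # when finished
```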

@arno01

arno01 commented Sep 18, 2018

I think I've found the culprit. I was bisecting the AMD GPU related commits which came to v4.18-rc1.

So if you revert the following commit torvalds/linux@e03fd3f, you will not have the "black screen" issue. At least in my case, tried with v4.18 from https://github.com/torvalds/linux/tree/v4.18. Please try reverting it, rebuilding your kernel and let us know.

commit e03fd3f300f6184c1264186a4c815e93bf658abb
Author: Mikita Lipski <mikita.lipski@amd.com>
Date:   Wed May 16 16:46:18 2018 -0400

    drm/amd/display: Do not limit color depth to 8bpc
    
    Delete if statement that would force any display's color depth higher
    than 8 bpc to 8
    
    Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
    Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1ce10bc2d37b..52e57b52cdbb 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2095,12 +2095,6 @@ convert_color_depth_from_display_info(const struct drm_connector *connector)
 {
        uint32_t bpc = connector->display_info.bpc;
 
-       /* Limited color depth to 8bit
-        * TODO: Still need to handle deep color
-        */
-       if (bpc > 8)
-               bpc = 8;
-
        switch (bpc) {
        case 0:
                /* Temporary Work around, DRM don't parse color depth for

Probably the driver cannot get the display_info.bpc from the screen, so it defaults to COLOR_DEPTH_UNDEFINED, causing the black screen? Not sure. I will email Mikita Lipski about this and see what he says.

I do not explicitly set amdgpu.dc=1, but apparently it gets automatically enabled:

$ dmesg -T |grep Display
[Tue Sep 18 15:35:39 2018] [drm] Display Core initialized with v3.1.44!

Upd

Mikita's email address no longer exists, so the email bounced.
I have forwarded it to Harry Wentland & Alex Deucher at AMD.

Upd2

4.19-rc4 from kernel.org is working with the aforementioned commit reverted:

--- linux-4.19-rc4.orig/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c	2018-09-16 20:52:37.000000000 +0200
+++ linux-4.19-rc4/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c	2018-09-18 16:13:13.520606848 +0200
@@ -2086,6 +2086,12 @@
 {
 	uint32_t bpc = connector->display_info.bpc;
 
+	/* Limited color depth to 8bit
+	 * TODO: Still need to handle deep color
+	 */
+	if (bpc > 8)
+		bpc = 8;
+
 	switch (bpc) {
 	case 0:
 		/* Temporary Work around, DRM don't parse color depth for
$ cat /proc/version 
Linux version 4.19.0-rc4-ubuntu-wo-e03fd3f (aarapov@ea1a257a26ec) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #1 SMP Tue Sep 18 14:14:48 UTC 2018
$ dmesg -T |grep Displa
[Tue Sep 18 16:41:56 2018] [drm] Display Core initialized with v3.1.59!

@roadrunner2

Contributor

roadrunner2 commented Sep 19, 2018

I just tried booting again with the dGPU. On my MBP13,3 I see the following behaviour:

  • amdgpu.dc=0 everything works fine (usable display on boot and on login)
  • dc=1 and the above patch: display during boot is fixed, but I get a black screen (backlight is on, though) after login (both Wayland and X)

Then I had a hunch the problem in the second case might be the resolution after login (I was using 1920x1200), and changing the resolution to 2880x1800 or 1440x900 fixed this. However, when I changed back to 1920x1200 and tried once more, it was suddenly fine. Go figure. In short, there still appears to be some resolution-related flakiness.

P.S. Gnome appears to remember display resolutions per GPU, so setting the resolution while on the iGPU does not affect the resolution when later booting with the dGPU, and vice versa. So if anybody plays with resolutions, make sure you change them while running on the same GPU that you're trying to test.

@roadrunner2

Contributor

roadrunner2 commented Sep 20, 2018

Btw., forgot to mention, but excellent work @arno01 on bisecting the issue - that's a tedious process!

@Dunedan

Owner

Dunedan commented Sep 20, 2018

Great work @arno01. Is my understanding correct that this is just a bug which got introduced with 4.18 and will be fixed with 4.19 again? If yes, anything else to discuss here or can we just close this issue?

@arno01

arno01 commented Sep 21, 2018

@Dunedan I'd say yes, this could be closed, but I'd love OP to confirm it first.
@jboyens are you able to try the kernel without that particular commit? If you have difficulty compiling the kernel, I can send you my 4.19-rc4 build (for Ubuntu).

@ClashTheBunny

Contributor

ClashTheBunny commented Sep 25, 2018

I also had an interesting experience on 4.18 related to the display. Did the backlight change with the new kernel? It seems like backlightctl now controls amdgpu_bl1 and not gmux_backlight, and it only turns the display on or off. The brightness is still controlled with gmux_backlight.
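
To see which backlight devices the kernel registered and what each one currently reports, the sysfs backlight class can be listed. A minimal sketch, assuming the standard /sys/class/backlight layout with "brightness" and "max_brightness" files per device; the function name and the optional root argument are hypothetical.

```shell
#!/bin/sh
# List backlight devices with their current and maximum brightness.
# $1 is a hypothetical override of the sysfs root, for testing.
list_backlights() {
    bl_root="${1:-/sys/class/backlight}"
    for dir in "$bl_root"/*; do
        # Skip anything that is not a backlight device directory.
        [ -r "$dir/brightness" ] || continue
        printf '%s %s/%s\n' "$(basename "$dir")" \
            "$(cat "$dir/brightness")" "$(cat "$dir/max_brightness")"
    done
}

# Usage: list_backlights
```

If both amdgpu_bl1 and gmux_backlight show up, comparing their values before and after pressing the brightness keys should show which device the desktop tool is actually writing to.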

@roadrunner2

Contributor

roadrunner2 commented Oct 20, 2018

@arno01 or @jboyens: can I suggest one of you open a ticket upstream (on https://bugs.freedesktop.org/, DRM/AMDgpu component) so this gets looked at there?

(edit: corrected component)

@whereswaldon

whereswaldon commented Oct 31, 2018

Applying the patch (reverted commit) to 4.19.0 worked for me. I can use the built-in display again!

@arno01

arno01 commented Nov 8, 2018

@arno01

arno01 commented Nov 20, 2018

@roadrunner2

Contributor

roadrunner2 commented Nov 27, 2018

@arno01 Thanks for shepherding this. I can confirm the patches have made it into 4.20-rc4 and fix the issue 👍 .

@frazer-jamieson

frazer-jamieson commented Dec 11, 2018

Is there an iso file available with this kernel please?

@arno01

arno01 commented Dec 13, 2018

@frazer-jamieson please try this one:

ISO with Ubuntu 18.04 for MacBook (Linux 4.15.0-22), touchbar and keyboard are working:

Download 2.4G: https://files.nixaid.com/ubuntu4mac.iso
GPG signed checksum: https://files.nixaid.com/ubuntu4mac.iso.gpg

@frazer-jamieson

frazer-jamieson commented Dec 14, 2018

@arno01 thank you sincerely for your help :)
