Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Video) KMS no longer works with mesa #7119

Closed
orbea opened this issue Aug 24, 2018 · 30 comments
Closed

(Video) KMS no longer works with mesa #7119

orbea opened this issue Aug 24, 2018 · 30 comments

Comments

@orbea
Copy link
Contributor

orbea commented Aug 24, 2018

Description

With newer mesa (Git master currently) RetroArch KMS no longer works at least with some systems.

I have tested this with nouveau, but unfortunately I can't test with the llvmpipe or swrast because they have been broken with KMS for much longer, maybe they have never worked? If anyone can test intel or AMD with a new enough mesa it would be appreciated to see if this is a more general issue or not.

I'm not sure if this is a RetroArch or mesa bug, but I'm posting it here for more visibility.

Expected behavior

KMS should work.

Actual behavior

RetroArch immediately errors when trying to start KMS.

[INFO] RetroArch 1.7.3 (Git b095511)
[INFO] === Build =======================================
Capabilities: MMX MMXEXT SSE1 SSE2 SSE3 SSSE3 SSE4 SSE4.2 AVX AES 
Built: Aug 23 2018
[INFO] Version: 1.7.3
[INFO] Git: b095511
[INFO] =================================================
[INFO] Environ SET_PIXEL_FORMAT: RGB565.
[INFO] Redirecting save file to "/media/data/home/games/roms/.saves/retroarch/.srm".
[INFO] Redirecting savestate to "/media/data/home/games/roms/.saves/retroarch/.sstates/.state".
[INFO] Version of libretro API: 1
[INFO] Compiled against API: 1
[INFO] [Audio]: Set audio input rate to: 29970.03 Hz.
[INFO] [Video]: Video @ fullscreen
[INFO] [DRM]: Found 4 connectors.
[INFO] [DRM]: Connector 0 connected: yes
[INFO] [DRM]: Connector 0 has 11 modes.
[INFO] [DRM]: Connector 0 assigned to monitor index: #1.
[INFO] [DRM]: Connector 1 connected: no
[INFO] [DRM]: Connector 1 has 0 modes.
[INFO] [DRM]: Connector 2 connected: no
[INFO] [DRM]: Connector 2 has 0 modes.
[INFO] [DRM]: Connector 3 connected: no
[INFO] [DRM]: Connector 3 has 0 modes.
[INFO] [DRM]: Mode 0: (1680x1050) 1680 x 1050, 60 Hz
[INFO] [DRM]: Mode 1: (1280x1024) 1280 x 1024, 75 Hz
[INFO] [DRM]: Mode 2: (1280x1024) 1280 x 1024, 60 Hz
[INFO] [DRM]: Mode 3: (1152x864) 1152 x 864, 75 Hz
[INFO] [DRM]: Mode 4: (1024x768) 1024 x 768, 75 Hz
[INFO] [DRM]: Mode 5: (1024x768) 1024 x 768, 60 Hz
[INFO] [DRM]: Mode 6: (800x600) 800 x 600, 75 Hz
[INFO] [DRM]: Mode 7: (800x600) 800 x 600, 60 Hz
[INFO] [DRM]: Mode 8: (640x480) 640 x 480, 75 Hz
[INFO] [DRM]: Mode 9: (640x480) 640 x 480, 60 Hz
[INFO] [DRM]: Mode 10: (720x400) 720 x 400, 70 Hz
[INFO] [GL]: Found GL context: kms
[INFO] [GL]: Detecting screen resolution 1680x1050.
[INFO] [EGL] Found EGL_EXT_platform_base, trying eglGetPlatformDisplayEXT
[INFO] [EGL]: EGL version: 1.4
[ERROR] [Video]: Cannot open video driver ... Exiting ...
[ERROR] Fatal error received in: "init_video()"

Steps to reproduce the bug

  1. Build the mesa git master.
  2. Start RetroArch on Linux with KMS.
  3. See if any menu driver still works.

Bisect Results

I bisected this to this mesa commit.

753f603b52db5eb38e27e1842fa43299a348998b is the first bad commit
commit 753f603b52db5eb38e27e1842fa43299a348998b
Author: Daniel Stone <daniels@collabora.com>
Date:   Wed Jun 13 06:04:12 2018 +0200

    gbm: Add support for 10bpp BGR formats
    
    Add support for XBGR2101010 and ABGR2101010 formats.
    
    Signed-off-by: Daniel Stone <daniels@collabora.com>
    Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
    Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
    Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
    Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

:040000 040000 9dac3f382cf27d5ca3224a3d2d5624e451dd6bc6 bf416c5c8023cad55893f313aa096261042ed5e5 M      src

https://github.com/mesa3d/mesa/commit/753f603b52db5eb38e27e1842fa43299a348998b

Version/Commit

You can find this information under Information/System Information

Environment information

  • OS: Slackware64-current
  • Compiler: gcc-8.2.0
@orbea
Copy link
Contributor Author

orbea commented Aug 24, 2018

Its failing here.

https://github.com/libretro/RetroArch/blob/master/gfx/video_driver.c#L1076

So I tried turning thread video on and I got a segfault instead.

Thread 3 "retroarch" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff1b6f700 (LWP 15580)]
0x0000000000644125 in gl_viewport_info (data=0x0, vp=0xa7f780) at gfx/drivers/gl.c:2220
2220	   *vp             = gl->vp;
(gdb) bt
#0  0x0000000000644125 in gl_viewport_info (data=0x0, vp=0xa7f780) at gfx/drivers/gl.c:2220
#1  0x000000000062b434 in video_thread_handle_packet (thr=0xa7f600, incoming=0x7ffff1b6e920) at gfx/video_thread_wrapper.c:344
#2  0x000000000062c01e in video_thread_loop (data=0xa7f600) at gfx/video_thread_wrapper.c:597
#3  0x000000000062a71a in thread_wrap (data_=0xad9260) at libretro-common/rthreads/rthreads.c:145
#4  0x00007ffff6b73637 in start_thread () from /lib64/libpthread.so.0
#5  0x00007ffff4b68e8f in clone () from /lib64/libc.so.6
(gdb) bt full
#0  0x0000000000644125 in gl_viewport_info (data=0x0, vp=0xa7f780) at gfx/drivers/gl.c:2220
        width = 0
        height = 0
        top_y = 32767
        top_dist = 4160516112
        gl = 0x0
#1  0x000000000062b434 in video_thread_handle_packet (thr=0xa7f600, incoming=0x7ffff1b6e920) at gfx/video_thread_wrapper.c:344
        i = 0
        pkt = {type = CMD_INIT, data = {b = false, i = 0, f = 0, str = 0x0, v = 0x0, set_shader = {type = RARCH_SHADER_NONE, path = 0x0}, set_viewport = {width = 0, height = 0, force_full = false, 
              allow_rotate = false}, rect = {index = 0, x = 0, y = 0, w = 0, h = 0}, image = {data = 0x0, num = 0}, output = {width = 0, height = 0}, new_mode = {width = 0, height = 0, fullscreen = false}, 
            filtering = {index = 0, smooth = false}, osd_message = {msg = '\000' <repeats 127 times>, params = {x = 0, y = 0, scale = 0, drop_mod = 0, drop_x = 0, drop_y = 0, drop_alpha = 0, color = 0, 
                full_screen = false, text_align = TEXT_ALIGN_LEFT}}, custom_command = {method = 0x0, data = 0x0, return_value = 0}, font_init = {method = 0x0, font_driver = 0x0, font_handle = 0x0, 
              video_data = 0x0, font_path = 0x0, font_size = 0, return_value = false, is_threaded = false, api = FONT_DRIVER_RENDER_DONT_CARE}}}
        ret = false
#2  0x000000000062c01e in video_thread_loop (data=0xa7f600) at gfx/video_thread_wrapper.c:597
        pkt = {type = CMD_INIT, data = {b = false, i = 0, f = 0, str = 0x0, v = 0x0, set_shader = {type = RARCH_SHADER_NONE, path = 0x0}, set_viewport = {width = 0, height = 0, force_full = false, 
              allow_rotate = false}, rect = {index = 0, x = 0, y = 0, w = 0, h = 0}, image = {data = 0x0, num = 0}, output = {width = 0, height = 0}, new_mode = {width = 0, height = 0, fullscreen = false}, 
            filtering = {index = 0, smooth = false}, osd_message = {msg = '\000' <repeats 127 times>, params = {x = 0, y = 0, scale = 0, drop_mod = 0, drop_x = 0, drop_y = 0, drop_alpha = 0, color = 0, 
                full_screen = false, text_align = TEXT_ALIGN_LEFT}}, custom_command = {method = 0x0, data = 0x0, return_value = 0}, font_init = {method = 0x0, font_driver = 0x0, font_handle = 0x0, 
              video_data = 0x0, font_path = 0x0, font_size = 0, return_value = false, is_threaded = false, api = FONT_DRIVER_RENDER_DONT_CARE}}}
        updated = false
        thr = 0xa7f600
#3  0x000000000062a71a in thread_wrap (data_=0xad9260) at libretro-common/rthreads/rthreads.c:145
        data = 0xad9260
#4  0x00007ffff6b73637 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#5  0x00007ffff4b68e8f in clone () from /lib64/libc.so.6
No symbol table info available.
(gdb) t a a f

Thread 3 (Thread 0x7ffff1b6f700 (LWP 15580)):
#3  0x000000000062a71a in thread_wrap (data_=0xad9260) at libretro-common/rthreads/rthreads.c:145
145	   data->func(data->userdata);

Thread 2 (Thread 0x7ffff2370700 (LWP 15579)):
#0  0x00007ffff6b79783 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

Thread 1 (Thread 0x7ffff7e25780 (LWP 15575)):
#0  0x00007ffff6b79783 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

@orbea
Copy link
Contributor Author

orbea commented Aug 25, 2018

This appears to be a RetroArch bug, I asked in #nouveau and was kindly explained where RetroArch is going wrong.

19:12 <imirkin> orbea: yeah, egl_init_context needs enhancement.
19:12 <imirkin> it may actually return multiple configs, in this case rgb10 and rgb8, and it 
needs to pick the one it likes (i.e. which matches its pixel format)
19:15 <imirkin> https://github.com/libretro/RetroArch/blob/master/gfx/common/egl_common.c#L312
19:15 <imirkin> have a look at the change i linked to earlier, and observe the similarity
19:15 <imirkin> https://cgit.freedesktop.org/mesa/kmscube/commit/?id=56c3917ffd1f05942246e2532ca4a5707554a2fc
19:15 <imirkin> that was fixing the same issue in kmscube
19:16 <imirkin> basically you can't just pick any ol' config for any format. you have to pick
the right one.
19:16 <imirkin> here they want GBM_FORMAT_XRGB8888
19:16 <imirkin> so you have to pick the right one
19:16 <imirkin> and for reasons which aren't apparent to me, you can't actually pass that in 
as one of the attributes
19:16 <imirkin> to eglChooseConfigs

Edit: Fixed markdown.

@RobLoach RobLoach changed the title KMS no longer works with the mesa git master. (Video) KMS no longer works with the mesa Aug 29, 2018
@orbea
Copy link
Contributor Author

orbea commented Aug 30, 2018

@RobLoach This is not a vulkan bug, I don't even have vulkan drivers. :)

I think its a general EGL issue.

@imirkin
Copy link

imirkin commented Aug 30, 2018

It's an issue when using the GBM egl platform. It returns multiple configs, one for each (matching) pixel format exposed by the driver. There's no way to stop eglChooseConfigs from doing that. And then you have to iterate over them to find which one has the EGL_NATIVE_VISUAL_ID that matches the GBM_FORMAT_* that you want to draw with. (And note that DRM_FORMAT_* == GBM_FORMAT_*.) Otherwise you'll get either failures or misrendering depending on exactly which way the fail goes, since the "front buffer" of the config will be in a different format than what the scanout engine is configured for when you feed those surfaces in as drm framebuffers.

@orbea
Copy link
Contributor Author

orbea commented Sep 1, 2018

@twinaphex Do you know who could help solve this? I think the above comment explains why this is broken and what should be done to fix it.

@gouchi
Copy link
Member

gouchi commented Sep 8, 2018

I don't reproduce the issue using Mesa 18.1.7 (August 24, 2018) and Intel HD (5500).

@imirkin
Copy link

imirkin commented Sep 8, 2018

Because the first-returned config happens to be the one you're looking for?

@gouchi
Copy link
Member

gouchi commented Sep 12, 2018

I tried to reproduce it with Nvidia card (GK107) and Mesa 18.1.7 but it is working also using RA with commit 7428fef4bc

[INFO] [GL]: Found GL context: kms
[INFO] [GL]: Detecting screen resolution 1920x1080.
[INFO] [EGL] Found EGL client version >= 1.5, trying eglGetPlatformDisplay
[INFO] [EGL]: EGL version: 1.4
[INFO] [EGL]: Current context: 0x5566cf561170.
[INFO] [KMS]: New FB: 1920x1080 (stride: 7680).
[INFO] [GL]: Vendor: nouveau, Renderer: NVE7.
[INFO] [GL]: Version: 3.1 Mesa 18.1.7.
[INFO] [GL]: Using resolution 1920x1080

@imirkin
Copy link

imirkin commented Sep 12, 2018

Now update mesa, and potentially the kernel, so that you get 30bpp properly exposed. Then I think you might get the 30bpp config first.

@orbea
Copy link
Contributor Author

orbea commented Sep 13, 2018

For reference I can still reproduce it with mesa commit https://github.com/mesa3d/mesa/commit/14fe9fa11baa4e4e782d192e0b8c150ee9e8754f from 11 days ago and linux 4.17.14.

@gouchi Also thanks for the interest!

@gouchi
Copy link
Member

gouchi commented Sep 13, 2018

@imirkin Indeed I reproduce the issue using latest mesa with Nvidia card. I will have to test with the Intel card.

@orbea No problem, you are welcome.

@gouchi
Copy link
Member

gouchi commented Sep 15, 2018

I can't reproduce the issue using latest Mesa with Intel HD card.

@imirkin
Copy link

imirkin commented Sep 15, 2018

I'm not sure what you're trying to demonstrate... I've pointed out where the bug is, as well as code for how to handle it.

You need a driver which returns EGL configs in an order where the first one returned doesn't happen to match the DRM format that you've chosen. Nouveau is an instance of this happening, but it can happen anywhere. Try e.g. requesting a 10bpp format with intel and it will fail in the way described. Or RGB instead of BGR (or vice-versa).

@orbea orbea added bug: kms and removed bug: vulkan labels Oct 8, 2018
@orbea
Copy link
Contributor Author

orbea commented Nov 19, 2018

@gouchi Any progress on this?

@gouchi
Copy link
Member

gouchi commented Nov 19, 2018

@orbea I am afraid not. Somebody more skilled on this subject should help ;-)

@orbea
Copy link
Contributor Author

orbea commented Nov 19, 2018

I know what you mean...thanks for the reply.

@orbea
Copy link
Contributor Author

orbea commented Nov 25, 2018

This issue is also reproducible with amdgpu (mesa).

@inactive123
Copy link
Contributor

We don't know how to fix this ourselves. Pull requests very much welcome.

@orbea
Copy link
Contributor Author

orbea commented Nov 26, 2018

There is a good example of how to fix this in comment #7119 (comment).

See this link where the same issue affected kmscube.

https://cgit.freedesktop.org/mesa/kmscube/commit/?id=56c3917ffd1f05942246e2532ca4a5707554a2fc

I would make a PR, but all my attempts at fixing this have been unsuccessful. My understanding of the code base and GL are too little.

@inactive123
Copy link
Contributor

inactive123 commented Nov 26, 2018

Maybe @imirkin can make the PR then. There is not enough inhouse knowledge right now on DRM/KMS where we feel reasonably confident to make such changes, and it would still have to be tested.

@orbea
Copy link
Contributor Author

orbea commented Nov 26, 2018

On a side note, this issue only affects opengl, with amdgpu and vulkan KMS still works.

However this won't work with GL cores, there are vulkan issues with parallel-n64 and parallel-psx which I can finally investigate and nouveau of course still doesn't have vulkan.

Edit: Parallel-psx seems like an easy fix. libretro/beetle-psx-libretro#449

@orbea orbea changed the title (Video) KMS no longer works with the mesa (Video) KMS no longer works with mesa Nov 27, 2018
@dtsarr
Copy link

dtsarr commented Dec 5, 2018

I threw together a quick fix following what had been done to kmscube to address the same issue.
Please, let me know if it works for you too.

https://drive.google.com/file/d/1mT_lb9Of49M3eJszhGAqVzySGnG0BTIu/view?usp=sharing

@orbea
Copy link
Contributor Author

orbea commented Dec 5, 2018

@dtsarr Yes, that does work! Thanks a lot!

However now when starting a core such as Genesis GX Plus with KMS and OpenGL RetroArch will print the following. I suspect its related to this change as the same will happen with the patch applied to v1.7.5 as it will with master.

mesa: for the -simplifycfg-sink-common option: may only occur zero or one times!
mesa: for the -global-isel-abort option: may only occur zero or one times!

And for the record here is the current patch.

diff --git a/gfx/common/egl_common.c b/gfx/common/egl_common.c
index f3c579af21..217e370a23 100644
--- a/gfx/common/egl_common.c
+++ b/gfx/common/egl_common.c
@@ -329,8 +329,40 @@ bool egl_init_context(egl_ctx_data_t *egl,
 
    RARCH_LOG("[EGL]: EGL version: %d.%d\n", *major, *minor);
 
-   if (!eglChooseConfig(egl->dpy, attrib_ptr, &egl->config, 1, n) || *n != 1)
+   EGLint count = 0;
+   EGLint matched = 0;
+   EGLConfig *configs;
+   int config_index = -1;
+
+   if (!eglGetConfigs(egl->dpy, NULL, 0, &count) || count < 1)
+   {
+      RARCH_ERR("[EGL]: No cofigs to choose from.\n");
+      return false;
+   }
+
+   configs = malloc(count * sizeof *configs);
+   if (!configs) return false;
+
+   if (!eglChooseConfig(egl->dpy, attrib_ptr, configs, count, &matched) || !matched)
+   {
+      RARCH_ERR("[EGL]: No EGL configs with appropriate attributes.\n");
       return false;
+   }
+
+   int i;
+
+   for (i = 0; i < count; ++i)
+   {
+      EGLint id;
+
+      if (!eglGetConfigAttrib(egl->dpy, configs[i], EGL_NATIVE_VISUAL_ID, &id))
+         continue;
+
+      if (id == GBM_FORMAT_XRGB8888) break;
+   }
+
+   config_index = i;
+   if (config_index != -1) egl->config = configs[config_index];
 
    egl->major = g_egl_major;
    egl->minor = g_egl_minor;

@orbea
Copy link
Contributor Author

orbea commented Dec 5, 2018

On second thought that might be unrelated and was masked by how I was using nouveau before and am now using radeonsi.

@orbea
Copy link
Contributor Author

orbea commented Dec 5, 2018

@orbea
Copy link
Contributor Author

orbea commented Dec 5, 2018

Removing the workaround in mesa silences the warnings and I suspect the rest is a mesa issue. @dtsarr What do you think? It seems like your change could be made into a PR?

diff --git a/src/amd/common/ac_llvm_util.c b/src/amd/common/ac_llvm_util.c
index dc9b684e9d..1501afe95e 100644
--- a/src/amd/common/ac_llvm_util.c
+++ b/src/amd/common/ac_llvm_util.c
@@ -50,19 +50,6 @@ static void ac_init_llvm_target()
 
 	/* For inline assembly. */
 	LLVMInitializeAMDGPUAsmParser();
-
-	/* Workaround for bug in llvm 4.0 that causes image intrinsics
-	 * to disappear.
-	 * https://reviews.llvm.org/D26348
-	 *
-	 * "mesa" is the prefix for error messages.
-	 *
-	 * -global-isel-abort=2 is a no-op unless global isel has been enabled.
-	 * This option tells the backend to fall-back to SelectionDAG and print
-	 * a diagnostic message if global isel fails.
-	 */
-	const char *argv[3] = { "mesa", "-simplifycfg-sink-common=false", "-global-isel-abort=2" };
-	LLVMParseCommandLineOptions(3, argv, NULL);
 }
 
 static once_flag ac_init_llvm_target_once_flag = ONCE_FLAG_INIT;

@dtsarr
Copy link

dtsarr commented Dec 6, 2018

@orbea Can you make the PR? I'm still very new to Git and I don't want to mess things up.

Please note that I modified the code slightly on my drive.

@orbea
Copy link
Contributor Author

orbea commented Dec 6, 2018

I made a PR with the updated code, again thanks a lot!

@orbea
Copy link
Contributor Author

orbea commented Dec 10, 2018

@dtsarr Hi, unfortunately this patch has had unfortunate consequences for wayland users, please see issue #7721. Do you happen to know how to fix this? :)

@orbea
Copy link
Contributor Author

orbea commented Dec 11, 2018

It should hopefully be fixed after PR #7729.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants