Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wayland Support for the OpenGL Backend #3545

Merged
merged 3 commits into from
Dec 28, 2020
Merged

Conversation

nemosupremo
Copy link
Contributor

@nemosupremo nemosupremo commented Dec 20, 2020

This PR isn't meant to be merged in the current state, but to illustrate what changes were need to get it working.

I've been trying to get the OpenGL backend working with wayland. The current device I'm targeting doesn't currently support Vulkan so I'm trying to get the wayland renderer working. During my tests I'm hitting some blockers with the hal::Instance API and I'm wondering if I'm doing something wrong, or if the API just wasn't developed with this in mind.

I'm hitting the following roadblocks:

  1. I need a pointer to this display to pass to get_platform_display. Both creating a new connection and using egl::DEFAULT_DISPLAY results in the window not working or rendering in the quad example. To get around this I'm passing a pointer to the create function via the version parameter.
  2. To create wl_egl_window_create I need to pass in a window size. I don't think this is a show stopper as long as there was some way get resize events so I could call wl_egl_window_resize. I think this can be done by calling wl_egl_window_resize in PresentationSurface::configure_swapchain

Another thing I had to remove the GL_COLORSPACE. This attribute is only supported in EGL 1.5, but currently are only testing 1.4. Seeing how the default value is GL_COLORSPACE_SRGB, then I just removed this attribute entirely.

Is there a proper way to overcome these hurdles without drastically modifying gfx's api?

@nemosupremo nemosupremo marked this pull request as draft December 20, 2020 22:21
Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking a stab at this!
I think I know why it doesn't work. See comments.

log::info!("Using Wayland platform");
const EGL_PLATFORM_WAYLAND_KHR: u32 = 0x31D8;
let display_attributes = [egl::ATTRIB_NONE];
egl::get_platform_display(EGL_PLATFORM_WAYLAND_KHR, ptr as _, &display_attributes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this part. The docs on EGL_PLATFORM_WAYLAND_KHR say that a wl_display is needed here for the second parameter. And we have that right here, but instead of using it the code closes the connection and passes in the pointer that was given to Instance::new.

We shouldn't be modifying the signature of Instance::new.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is the display that is passed to get_platform_display is created by winit. If I create my own, then quad does not render at all. I understand modifying Instance::new here is undesired - I just did it prove it works. What I need is a handle to wl_display here that was created by winit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a different wl_display, necessarily, than what wayland::display_connect() returns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is happening, but if I pass in wl_display that is returned from wayland::display_connect() then no window is created on screen. Same for passing egl::DEFAULT_DISPLAY.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, how come no window is created? The window is created by winit, it has to be there.
Do you mean that no rendering is done on it? And no errors occurring here?

src/backend/gl/src/window/egl.rs Show resolved Hide resolved
@@ -195,20 +219,26 @@ impl hal::Instance<crate::Backend> for Instance {
Rwh::Xcb(handle) => handle.window as _,
#[cfg(target_os = "android")]
Rwh::Android(handle) => handle.a_native_window,
#[cfg(not(target_os = "android"))]
Rwh::Wayland(handle) => {
let window = wayland::egl_window_create(handle.surface, 1024, 768);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks quite fishy to me. We aren't supposed to create a window here. We are supposed to get a window and initialize the GL context on it. Rwh::Wayland has the WL surface, and that's actually what we need here, not WL window being created. See:

To obtain an on-screen rendering surface from a Wayland window, call
eglCreatePlatformWindowSurface with a that belongs to Wayland and
a <native_window> that points to a struct wl_egl_surface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that documentation as well and I think it's outdated. This is not creating a window. The documentation here is pretty sparse but you need to create a wayland egl window (https://jan.newmarch.name/Wayland/EGL/)

An EGL window needs to be created from a Wayland surface, by the Wayland call wl_egl_window_create. This is then turned into an EGL drawing surface by the EGL call eglCreateWindowSurface. Drawing into this surface is generally done by an API such as OpenGL, and the choice of this has been set in the egl_context. This is set for the drawing surfaces by eglMakeCurrent.

Copy link
Contributor Author

@nemosupremo nemosupremo Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a more explanatory article I found (https://ppaalanen.blogspot.com/2012/03/what-does-egl-do-in-wayland-stack.html):

The client passes the wl_display object to eglGetDisplay() and receives an EGLDisplay to be used with EGL calls. Then comes the trick that is denoted by the double-arrowed blue line from Wayland client to Mesa EGL in the diagram. The client calls the wayland-egl API (implemented in Mesa) function wl_egl_window_create() to get the native window handle. Normally you would just use the "real" native window object wl_surface (or an X11 Window if you were using X). The native window handle is used to create the EGLSurface EGL handle. Wayland has this extra step and the wayland-egl API because a wl_surface carries no information of its size. When the EGL library allocates buffers, it needs to know the size, and wayland-egl API is the only way to tell that.

My theory here is that this surface is owned by the display that was created by winit, however the display we have is the once we created in Instance::create

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link says "This project hasn't been updated since mid-2017."

An EGL window needs to be created from a Wayland surface, by the Wayland call wl_egl_window_create

My expectation is that Winit calls that, or does something equivalent to this. gfx-rs shouldn't be creating any windows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That other link is even older, from 2012. It does sound peculiar indeed, but between the link and the actual spec for the specification we are using, I'd rather trust the spec. Note that the spec expects wl_egl_surface, but that function wl_egl_window_create returns wl_egl_window, so it's a clear mismatch.

Copy link
Contributor Author

@nemosupremo nemosupremo Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this then - I saw the spec and when I tried that I got a segfault. And despite what the spec says, Weston, Wayland reference compositor uses wl_egl_window_create as well. I'll see if there are any other projects on github that pass in the surface directly.

https://github.com/wayland-project/weston/blob/master/clients/simple-egl.c#L365

Edit: It's possible that this behavior was changed in 1.5 (which is what the spec notes). I'm trying to target the Pi which seems to only support 1.4.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble finding wl_egl_surface at all. Perhaps, the spec is indeed outdated, and this is now called wl_egl_window? :/

Copy link
Contributor Author

@nemosupremo nemosupremo Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually missed that - my brain read wl_egl_surface and just wl_surface - and likewise I've never seen wl_egl_surface in wayland-project/wayland. wl_egl_surface surface doesn't seem to exist outside the spec and some nvidia documentation

What I can say for sure is most other projects I've used as a reference use egl_window_create on a surface. I think the name is misleading however, like the ppaalanen's blogpost article states a wl_egl_window is just a surface with size information (you can see here that the function to create a wl_egl_window just wraps a surface with some width/height metadata).

I think, conceptually this code is correct. The bigger problem seems to be that the raw_window_handle that I have, that was created by winit has a wl_surface and a wl_display. The wl_display I get from the raw_window_handle is created by winit, not by Instance::create, and when I try to use this surface in create_platform_surface_window, wayland doesn't like this (logs from weston 8.0.0):

[21:01:54.017] libwayland: invalid object (6), type (zwp_relative_pointer_manager_v1), message attach(?oii)
[21:01:54.025] libwayland: error in client communication (pid 10906)

AFAICT, my options are:

  1. Reinitialize the glcontext with this new display/window handle. This repeats the create function, and also could require an API change (&mut self).
  2. Pass in the window handle during create. This is what I'm doing, via the version parameter, now. Again this breaks the API
  3. Somehow figure this out without needing interaction with winit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[21:01:54.017] libwayland: invalid object (6), type (zwp_relative_pointer_manager_v1), message attach(?oii)

This sounds like the object is bad. Is there a way to check if it's a good wl_display? I wonder if it's just a bug on winit side.

@@ -37,6 +37,7 @@ hal = { path = "../src/hal", version = "0.6", package = "gfx-hal" }
auxil = { path = "../src/auxil/auxil", version = "0.5", package = "gfx-auxil" }
gfx-backend-empty = { path = "../src/backend/empty", version = "0.6" }
winit = { version = "0.23", features = ["web-sys"] }
raw-window-handle = "0.3"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once the Instance API is back, we'll need to remove it

@nemosupremo
Copy link
Contributor Author

So, quickly, the code works (as in it actually renders quad.rs), but as you can imagine it hacks the API.

@kvark
Copy link
Member

kvark commented Dec 21, 2020

Alright, so we have 2 major issues so far:

  1. The wl_display from winit is not compatible with wl_display that we could create in Instance::new. We need to figure out why exactly this is the case. If it just happens to be that your platform has the default WL display different from winit window, we should just make it so the surface (created from this window) can't be rendered to by the physical adapter from our Instance, so we'll never need to stitch the two wl_display things together.
  2. EGL needs the window size in order to create a surface. The main question to resolve (in order to make progress) here is to find out why X11 path doesn't require that. I suspect one possible answer is that X11 window handle already has that information associated with it, but WL surface doesn't. In this case, I suggest this to be a problem with RawWindowHandle, and we should file an issue accordingly and discuss this.

Edit: filed rust-windowing/raw-window-handle#61

For (1) if we fail to use the default wl_display, we could also write the Wayland window integration in a way similar to the Web:

  • that Instance by itself doesn't initialize anything
  • it's only really initialized upon create_surface, which needs to be called before the enumerate_adapters, and needs to be only done once

@nemosupremo
Copy link
Contributor Author

wrt (2), I don't think this is an issue. You can call egl_window_create with some dummy resolution (I just call it with VGA 640×480), then when you configure the swap chain, you call egl_window_resize.

(1) seems to be the tricky one - I suspect wayland's design doesn't support intermixing of resources by two different wl_displays, so when we get our wl_surface from winit and try to use it libwayland hits an error. I don't think this has anything to do with the physical display (after all I only have one display), but something inherent with wayland's "secure by design" architecture. I'm going to try and take a deep dive into what the Vulkan instance does.

@kvark
Copy link
Member

kvark commented Dec 21, 2020

Digging up a bit... here is where winit establishes a Wayland display - https://docs.rs/wayland-client/0.28.2/wayland_client/struct.Display.html#method.connect_to_env

@nemosupremo
Copy link
Contributor Author

For (1) if we fail to use the default wl_display, we could also write the Wayland window integration in a way similar to the Web:

This seems like the most successful path forward, but because we don't have a mutable reference to self, it would require wrapping everything in a mutex to get interior mutability. My plan was to leave eveerything as is (so enumerate_adapters still works) but then recreate the context once create_surface was called with a display other than our own. The downside of this would be calls to create_surface, enumerate_adapters and destroy_surface would be mutex protected even in X11 - would that cause an undue performance hit?

@kvark
Copy link
Member

kvark commented Dec 21, 2020

There is no performance concern about mutexing this :) I was wondering if the logic can all be tight in src/window/wayland.rs without affecting the main window/egl.rs module, but in the end you see this more clearly since you are writing the code. So I defer to your judgement on how to best roll this out.

@fooishbar
Copy link

The wl_display from winit is not compatible with wl_display that we could create in Instance::new. We need to figure out why exactly this is the case. If it just happens to be that your platform has the default WL display different from winit window, we should just make it so the surface (created from this window) can't be rendered to by the physical adapter from our Instance, so we'll never need to stitch the two wl_display things together.

You must use the same wl_display; objects such as wl_surface are not shareable between different instances of wl_display. The ID namespace is completely private to each client connection.

@nemosupremo nemosupremo reopened this Dec 21, 2020
@nemosupremo nemosupremo marked this pull request as ready for review December 21, 2020 22:01
@nemosupremo nemosupremo force-pushed the gl-wayland branch 2 times, most recently from 6f7b7c9 to c35dfc4 Compare December 21, 2020 22:09
@nemosupremo
Copy link
Contributor Author

Ok, taking into the account the design of wayland with the current API, I've gone the route of interior mutability. What I've done is move the create function largely into an Inner struct - that way it can easily be dropped/created in create_surface once we detect a new display. Under wayland when are asked to create a surface, we drop the previous context and recreate it. The X11 implementation is largely untouched from a functionality standpoint.

@nemosupremo
Copy link
Contributor Author

The OpenGL renderer seems to work on wgpu using the gl renderer on a pi under wayland

wayland-screenshot-2020-12-22_00-21-01

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very close to a mergeable state now! Just a couple of things I noticed


#[derive(Debug)]
pub struct Inner {
egl: Starc<egl::DynamicInstance>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so both Instance and Inner now contain the same egl member?

if let Ok(library) = libloading::Library::new("libwayland-egl.so") {
Some(((), library))
} else {
None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a lot of nesting here. Perhaps, moving it into a function would help, so that you can do early-out with ??

src/backend/gl/src/window/egl.rs Outdated Show resolved Hide resolved
None
};

let display = match (wayland_display, x11_display) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the if/else if/else pattern would be cleaner here

}
attributes.push(egl::ATTRIB_NONE);

let native_window = match has_handle.raw_window_handle() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is super confusing, since now we have 2 different variables named native_window of 2 different but close types. Let's not re-use the variable name here, please.

@@ -3,7 +3,7 @@
use crate::{conv, native, GlContainer, PhysicalDevice, Starc};
use glow::HasContext;
use hal::{image, window as w};
use std::{os::raw, ptr};
use std::{os::raw, ptr, sync::Mutex};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, one last thing: let's use parking_lot::Mutex just to be consistent

Copy link
Member

@kvark kvark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!
bors r+

@kvark
Copy link
Member

kvark commented Dec 28, 2020

bors retry

@bors
Copy link
Contributor

bors bot commented Dec 28, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants