
Design of the drm backend #18

Closed
Drakulix opened this issue Jun 2, 2017 · 14 comments

@Drakulix
Member

Drakulix commented Jun 2, 2017

So this will be a small write-up of thoughts I have had about writing a drm backend for smithay, mostly for @nshp and for further discussion with everyone.

A drm backend will be the first to yield more than one graphics backend, because one device may drive multiple outputs, and those may also be attached and detached at runtime.
That makes it difficult to do correctly: an output is theoretically not owned by anyone and has no specified lifetime. I have no idea what the best way to handle this in rust is, but I bet there are many solutions (winit suffers from the same problem, as a window might get closed at any time).

Opening the devices will require some session code in the future (systemd, ConsoleKit, etc.) to allow launching the compositor as a user process. For now I would like to handle this case the way the libinput backend does: just don't open the device yourself. The drm backend can simply assume that the device is already opened. The simple example could just open the device using libc::open and thus require root privileges, until we provide a nice way to deal with sessions.
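
A minimal sketch of what the simple example could do, assuming it runs as root; the helper name and device path are illustrative:

use std::ffi::CString;
use std::io;
use std::os::unix::io::RawFd;

// Hypothetical helper for the simple example: open a DRM card node
// directly with libc::open. This requires root until session support
// (systemd/ConsoleKit) lands.
fn open_drm_device(path: &str) -> io::Result<RawFd> {
    let cpath = CString::new(path)
        .map_err(|_| io::Error::from(io::ErrorKind::InvalidInput))?;
    // O_RDWR for modesetting, O_CLOEXEC so children do not inherit the fd
    let fd = unsafe { libc::open(cpath.as_ptr(), libc::O_RDWR | libc::O_CLOEXEC) };
    if fd < 0 {
        Err(io::Error::last_os_error())
    } else {
        Ok(fd)
    }
}

// let fd = open_drm_device("/dev/dri/card0")?;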

This also leaves the problem of choosing the card out of scope for now. In the future we may also want to have a handler that uses udev to select the most appropriate graphics card available. Such a handler might, as a rough sketch, enumerate cards as shown below.
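
This is a sketch under the assumption that the udev crate is used; the naive "first card wins" policy is just a placeholder:

use std::io;
use std::path::PathBuf;

// Hypothetical helper: ask udev for all nodes in the "drm" subsystem and
// naively return the first card node found. A real handler would inspect
// device properties (e.g. boot_vga, seat assignment) to pick the best card.
fn pick_card() -> io::Result<Option<PathBuf>> {
    let mut enumerator = udev::Enumerator::new()?;
    enumerator.match_subsystem("drm")?;
    for device in enumerator.scan_devices()? {
        if let Some(node) = device.devnode() {
            if node.to_string_lossy().contains("card") {
                return Ok(Some(node.to_owned()));
            }
        }
    }
    Ok(None)
}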

Both of these suggestions flow nicely with smithay's idea of exposing all low-level functionality directly. We can always plug more logic on top of it.

My initial idea was to do something like this:

use std::io::Result as IoResult;
use std::os::unix::io::RawFd;

#[derive(Debug, PartialEq, Eq, Hash)]
pub struct DrmOutput {}

impl EglGraphicsBackend for DrmOutput { /* .. */ }

pub trait DrmHandler {
    // trait methods inherit the trait's visibility, so no `pub` here
    fn on_output_connected(&mut self, output: DrmOutput);
    fn on_output_disconnected(&mut self, output: &DrmOutput);
}

pub struct DrmBackend<H: DrmHandler> {
    gbm: Device,
    handler: H,
}

impl<H: DrmHandler> DrmBackend<H> {
    fn new_from_fd(fd: RawFd, handler: H) -> IoResult<DrmBackend<H>> {
        let gbm = Device::new(fd)?;
        // ...
        Ok(DrmBackend {
            gbm: gbm,
            handler: handler,
        })
    }
}
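
For illustration, a compositor using this sketch might implement the handler like so (hypothetical, just to show the intended flow):

struct MyHandler;

impl DrmHandler for MyHandler {
    fn on_output_connected(&mut self, output: DrmOutput) {
        // take ownership of the new output and start rendering to it
    }

    fn on_output_disconnected(&mut self, output: &DrmOutput) {
        // tear down any state that references this output
    }
}

// let backend = DrmBackend::new_from_fd(fd, MyHandler)?;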

Some inspiration for the implementation can be gathered from other compositors such as wlroots, wlc, and weston.

Please note that wlroots is very new and largely untested in a production environment,
and that wlc still suffers to this day from issues when attaching/detaching monitors at runtime, possibly related to some race conditions, so it is certainly not bug-free.

Still, the other implementations can provide some insight, and weston's implementation is certainly the most difficult to read (probably for a reason).

Providing a stable solution is our goal here. That will require reading through a lot of docs and probably some suffering during debugging, which brings us to the next topic: drm bindings.

There are many different libdrm bindings out there.

Last time I checked, none of the rusty ones were complete, which is most likely the reason why there are so many. While we already have some unsafe C bindings in rust (the egl code), I would like to see a safe wrapper being used.

rusty-desktop will be trying to solve this problem, so it looks like we should start with drm-rs and contribute to that repository when its functionality lags behind what we need.

@cyndis

cyndis commented Jun 3, 2017

Something I would like to see supported is having the GBM device be different than the DRM device. This is required on Tegra systems, since the display controller is separate from the GPU, with separate drivers (dc is tegradrm, gpu is nouveau). This has been implemented for weston in the top commits of https://github.com/Gnurou/weston/commits/fb_modifiers, or kmscube in https://github.com/Gnurou/kmscube/commits/fb_modifiers.

@Drakulix
Member Author

Drakulix commented Jun 4, 2017

@cyndis I was not aware that this is a thing, but sure, it seems easy enough to support. (Though most likely nobody here will be able to test that.)

@cyndis

cyndis commented Jun 5, 2017

Sure, I could test it eventually. I think this is the same mechanism that is used for DRM PRIME (optimus systems), so it would be useful for those systems too.

@fooishbar

> Something I would like to see supported is having the GBM device be different than the DRM device. This is required on Tegra systems, since the display controller is separate from the GPU, with separate drivers (dc is tegradrm, gpu is nouveau). This has been implemented for weston in the top commits of https://github.com/Gnurou/weston/commits/fb_modifiers, or kmscube in https://github.com/Gnurou/kmscube/commits/fb_modifiers.

To be honest, I think this is an abuse of the GBM API. Other drivers with mainline kernel & Mesa support (unlike Tegra) such as Etnaviv already handle this transparently. I would expect Tegra to behave the same.

As far as PRIME goes, having separate GBM & DRM device nodes actually makes it impossible for that to work correctly. The correct mechanism is to pass the DRM device node to GBM (i.e. as Weston does today), and then to use the proposed EGL_EXT_explicit_device extension to select a GPU to render with.

@Drakulix
Member Author

@fooishbar Can you explain where you are coming from, and whether there is something wrong with supporting that use-case other than it not being intended to be used like this?

It does not surprise me at all if nvidia is again doing some quirks here with their tegra devices, but this time it seems easy to support from our point of view (other than EGLStreams). I don't like limiting our possibilities by trying to support something unusual, but if possible, I don't like ignoring such requests either, as long as they are not difficult to implement.

PRIME should not be a big deal on tegra-based chipsets anyway, and I do not see why we should not be able to support PRIME on "normal" systems using the same device for the DRM and GBM APIs.

Also, do you think the tegra driver will actually move to a more standard-conforming implementation in the near future?

@cyndis

cyndis commented Aug 28, 2017

To be honest, I'm not fully familiar with the situation - it might be that the correct solution is to have something in Mesa do this automatically - but I think there is something uniquely complicated about it since otherwise it would already have been implemented. I'll try to find out and comment here.

Regarding NVIDIA quirks - this situation is specific to the "upstream graphics stack" running on upstream kernel, with tegradrm, nouveau and mesa, and driven by upstream developers - so this is unrelated to the NVIDIA proprietary graphics stack, and development thereof.

@Drakulix
Member Author

> Regarding NVIDIA quirks - this situation is specific to the "upstream graphics stack" running on upstream kernel, with tegradrm, nouveau and mesa, and driven by upstream developers - so this is unrelated to the NVIDIA proprietary graphics stack, and development thereof.

I get that, and I am also very happy nvidia made that decision with that platform. Still, the platform's implementation seems unusual, and although there is likely a reason for that, it does not surprise me to see this given that the initial development was done by that same company.

At least that means the situation can be fixed, and I am willing to contribute what's necessary to provide a decent experience for smithay, but we need to fully understand the current situation to decide whether this is feasible to support.

@fooishbar

@Drakulix I don't think it's quirky or anything, just not particularly well thought through. And certainly not matching etnaviv/imx, vc4/pl111, Mali/anything, or any of the other separate display controller + GPU combinations which already have proper support.

The problem is that GBM needs to allocate buffers suitable for both the GPU and the display controller. Currently GBM takes an FD for the display controller as the device, and then rendering happens through EGL, which by definition has the GPU device as well. Combining these two, you know how to allocate buffers suitable for both.
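
A minimal sketch of such an allocation, assuming the gbm crate is used; the device path, dimensions, and format are illustrative:

use std::fs::File;

use gbm::{BufferObjectFlags, Device, Format};

fn allocate_scanout_buffer() -> std::io::Result<()> {
    // The GBM device is created from the display controller's DRM node.
    let drm = File::options().read(true).write(true).open("/dev/dri/card0")?;
    let gbm = Device::new(drm)?;
    // SCANOUT | RENDERING tells GBM the buffer must be usable both for
    // rendering on the GPU and for scanout on the display controller.
    let _bo = gbm.create_buffer_object::<()>(
        1920, 1080,
        Format::Xrgb8888,
        BufferObjectFlags::SCANOUT | BufferObjectFlags::RENDERING,
    )?;
    Ok(())
}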

The Tegra patches do not pass the display controller device to GBM, only the GPU (which is already known by EGL). So they lose information, and make handling PRIME basically impossible. I assume they would be fixed with the 'renderonly' framework in newer Mesa (used by etnaviv/vc4/etc), but the driver sadly seems to have been abandoned for a couple of years.

@cyndis

cyndis commented Aug 28, 2017

There are some renderonly patches - https://github.com/Gnurou/mesa/commits/renderonly - maybe these will do the trick if they can be merged. I'll discuss the situation with @thierryreding, or maybe he'll comment himself.

@thierryreding

I think people misunderstand the evolution of all of this. The code that @Gnurou has in his tree is based on prototype code that I had initially written for kmscube and that evolved to use the newer FB modifiers rather than the device specific IOCTLs we have on Tegra. This goes way back to a time when nobody was doing anything like this, so this was a proof of concept of how we could deal with the split display controller and GPU setups that we were seeing on all SoCs starting to see support in the kernel.

Once that prototype was working I set out to implement a Tegra driver for Mesa that would essentially do the same thing but behind the scenes. The idea was that you'd pass the Tegra DRM file descriptor to GBM and the driver would internally figure out the GPU file descriptor to use and create a Nouveau pipe_context from that:

https://lists.freedesktop.org/archives/mesa-dev/2014-November/071521.html

People didn't like that very much because it was a complete Gallium driver just to wrap around essentially the Nouveau driver. It also had a couple of shortcomings that would've taken some effort to fix, and I did go back to this occasionally and tried various things over the past few years. Unfortunately this is not something I was ever able to work on full time, so I haven't made too much progress. If anyone's interested, I've got a bit more recent code here:

https://cgit.freedesktop.org/~tagr/mesa/log/?h=staging/work

That's fairly stale by now, and I might have some more up-to-date branches locally somewhere.

Later on, this work was picked up by @austriancoder for etnaviv/imx and drastically simplified to have the absolute minimum necessary to share buffers between a GPU and a KMS-only driver. This is what is known today as renderonly. Essentially what this does is make the GPU driver (etnaviv, Nouveau, ...) aware of the fact that it might be asked to share certain buffers with a KMS-only driver, then creates a tiny wrapper Gallium driver that passes some hooks to the GPU driver to tell it what to do when sharing buffers.

In my opinion the solution is suboptimal, for two reasons. The first reason is generic because it requires all GPU drivers that want to share buffers with a KMS driver to learn about this infrastructure. I think that's bad because it violates encapsulation. From an architectural point of view I think the GPU drivers should be agnostic of what they're used with. Instead, it should fall to the SoC driver to glue together the GPU backend with the KMS frontend. That's why the initial Tegra driver was written the way it was.

The second shortcoming of the renderonly infrastructure is somewhat Tegra-specific. Tegra is "special" in this regard in that the original kernel driver both provided the modesetting part and served as the GPU driver. Back then, the GPU was technically separate from the display controller, but they were tightly integrated, with no other driver available for the GPU device, so a single DRM device could be used for display and rendering. There is a reverse-engineering effort that mostly drove the kernel driver development and in recent months has made really good progress:

https://github.com/grate-driver

Contrary to etnaviv/imx and vc4/pl111, this means that a Gallium driver for Tegra actually needs to be some sort of hybrid. If it only works with the Nouveau/Tegra combination, what are the grate guys going to do when they want to add a driver? The bottom line is that for Tegra a simple renderonly driver isn't going to cut it, at least not in the long run.

Now, all of that said, ports of Tegra have been done using the renderonly infrastructure. And even though I have my reservations, I can't deny that it is fairly elegant at its core and cuts down on the LoC required to support split display/render configurations. The only reason I haven't done any work on the Tegra renderonly port myself is because I don't want to step on @kusma's and @digetx' toes with the work they're doing. I'm sure we could find ways to merge pre-Tegra124 support into Mesa later on and go with a render-only solution for now, especially since others have already taken that shortcut.

@cyndis

cyndis commented Aug 29, 2017

Thanks Thierry, that was very enlightening.

Drakulix mentioned this issue Sep 14, 2017
@Drakulix
Member Author

Drakulix commented Nov 21, 2018

@cyndis While working on the drm backend again (#116), I noticed that this issue is still open, as I still have not addressed your concerns regarding the tegra driver and other split gbm/kms configurations.

Can you give me an update on the status of these efforts and on what would currently be required to support these configurations? (Possibly with a minimal working example, like your older kmscube link?)

I would like to tackle this issue, because I think our rendering stack is now mature enough to work this support into it.

@thierryreding

Tegra support is now available in recent versions of Mesa, which means that you should be able to treat Tegra just like any other device.

@Drakulix
Member Author

Alright. Closing this monster-issue then. If there is another configuration we are not supporting and that is not already on our roadmap, feel free to open a new issue.
