Daniel-Vetter/…
Commits on Jun 22, 2021
-
RFC: drm/amdgpu: Implement a proper implicit fencing uapi
WARNING: Absolutely untested beyond "gcc isn't dying in agony". Implicit fencing done properly needs to treat the implicit fencing slots like a funny kind of IPC mailbox. In other words it needs to be explicitly. This is the only way it will mesh well with explicit fencing userspace like vk, and it's also the bare minimum required to be able to manage anything else that wants to use the same buffer on multiple engines in parallel, and still be able to share it through implicit sync. amdgpu completely lacks such an uapi. Fix this. Luckily the concept of ignoring implicit fences exists already, and takes care of all the complexities of making sure that non-optional fences (like bo moves) are not ignored. This support was added in commit 177ae09 Author: Andres Rodriguez <andresx7@gmail.com> Date: Fri Sep 15 20:44:06 2017 -0400 drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2 Unfortuantely it's the wrong semantics, because it's a bo flag and disables implicit sync on an allocated buffer completely. We _do_ want implicit sync, but control it explicitly. For this we need a flag on the drm_file, so that a given userspace (like vulkan) can manage the implicit sync slots explicitly. The other side of the pipeline (compositor, other process or just different stage in a media pipeline in the same process) can then either do the same, or fully participate in the implicit sync as implemented by the kernel by default. By building on the existing flag for buffers we avoid any issues with opening up additional security concerns - anything this new flag here allows is already. All drivers which supports this concept of a userspace-specific opt-out of implicit sync have a flag in their CS ioctl, but in reality that turned out to be a bit too inflexible. See the discussion below, let's try to do a bit better for amdgpu. This alone only allows us to completely avoid any stalls due to implicit sync, it does not yet allow us to use implicit sync as a strange form of IPC for sync_file. For that we need two more pieces: - a way to get the current implicit sync fences out of a buffer. Could be done in a driver ioctl, but everyone needs this, and generally a dma-buf is involved anyway to establish the sharing. So an ioctl on the dma-buf makes a ton more sense: https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/ Current drivers in upstream solves this by having the opt-out flag on their CS ioctl. This has the downside that very often the CS which must actually stall for the implicit fence is run a while after the implicit fence point was logically sampled per the api spec (vk passes an explicit syncobj around for that afaiui), and so results in oversync. Converting the implicit sync fences into a snap-shot sync_file is actually accurate. - Simillar we need to be able to set the exclusive implicit fence. Current drivers again do this with a CS ioctl flag, with again the same problems that the time the CS happens additional dependencies have been added. An explicit ioctl to only insert a sync_file (while respecting the rules for how exclusive and shared fence slots must be update in struct dma_resv) is much better. This is proposed here: https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/ These three pieces together allow userspace to fully control implicit fencing and remove all unecessary stall points due to them. Well, as much as the implicit fencing model fundamentally allows: There is only one set of fences, you can only choose to sync against only writers (exclusive slot), or everyone. Hence suballocating multiple buffers or anything else like this is fundamentally not possible, and can only be fixed by a proper explicit fencing model. Aside from that caveat this model gets implicit fencing as closely to explicit fencing semantics as possible: On the actual implementation I opted for a simple setparam ioctl, no locking (just atomic reads/writes) for simplicity. There is a nice flag parameter in the VM ioctl which we could use, except: - it's not checked, so userspace likely passes garbage - there's already a comment that userspace _does_ pass garbage in the priority field So yeah unfortunately this flag parameter for setting vm flags is useless, and we need to hack up a new one. v2: Explain why a new SETPARAM (Jason) v3: Bas noticed I forgot to hook up the dependency-side shortcut. We need both, or this doesn't do much. v4: Rebase over the amdgpu patch to always set the implicit sync fences. Cc: mesa-dev@lists.freedesktop.org Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: Dave Airlie <airlied@gmail.com> Cc: Rob Clark <robdclark@chromium.org> Cc: Kristian H. Kristensen <hoegsberg@google.com> Cc: Michel Dänzer <michel@daenzer.net> Cc: Daniel Stone <daniels@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Chen Li <chenli@uniontech.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: Dennis Li <Dennis.Li@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
-
drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
Spotted while trying to convert panfrost to these. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch>
-
drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
Goes through all the drivers and deletes the default hook since it's the default now. Acked-by: David Lechner <david@lechnology.com> Acked-by: Noralf Trønnes <noralf@tronnes.org> Acked-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Joel Stanley <joel@jms.id.au> Cc: Andrew Jeffery <andrew@aj.id.au> Cc: "Noralf Trønnes" <noralf@tronnes.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Emma Anholt <emma@anholt.net> Cc: David Lechner <david@lechnology.com> Cc: Kamlesh Gurudasani <kamlesh.gurudasani@gmail.com> Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: linux-aspeed@lists.ozlabs.org Cc: linux-arm-kernel@lists.infradead.org Cc: xen-devel@lists.xenproject.org
-
drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
It's tedious to review this all the time, and my audit showed that arcpgu actually forgot to set this. Make this the default and stop worrying. Again I sprinkled WARN_ON_ONCE on top to make sure we don't have strange combinations of hooks: cleanup_fb without prepare_fb doesn't make sense, and since simpler drivers are all new they better be GEM based drivers. v2: Warn and bail when it's _not_ a GEM driver (Noralf) Cc: Noralf Trønnes <noralf@tronnes.org> Acked-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch>
-
drm/omap: Follow implicit fencing in prepare_fb
I guess no one ever tried running omap together with lima or panfrost, not even sure that's possible. Anyway for consistency, fix this. Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Tomi Valkeinen <tomba@kernel.org>
-
drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
Like we have for the shadow helpers too, and roll it out to drivers. Acked-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Tian Tao <tiantao6@hisilicon.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
-
drm/armada: Remove prepare/cleanup_fb hooks
All they do is refcount the fb, which the atomic helpers already do. This is was necessary with the legacy helpers and I guess just carry over in the conversion. drm_plane_state always has a full reference for its ->fb pointer during its entire lifetime, see __drm_atomic_helper_plane_destroy_state() Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Russell King <linux@armlinux.org.uk>
-
drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
No need to set it explicitly. Acked-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Acked-by: Philippe Cornu <philippe.cornu@foss.st.com> Acked-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Cc: Fabio Estevam <festevam@gmail.com> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Chun-Kuang Hu <chunkuang.hu@kernel.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Marek Vasut <marex@denx.de> Cc: Stefan Agner <stefan@agner.ch> Cc: Sandy Huang <hjc@rock-chips.com> Cc: "Heiko Stübner" <heiko@sntech.de> Cc: Yannick Fertre <yannick.fertre@foss.st.com> Cc: Philippe Cornu <philippe.cornu@foss.st.com> Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: Jyri Sarha <jyri.sarha@iki.fi> Cc: Tomi Valkeinen <tomba@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@vger.kernel.org Cc: linux-mediatek@lists.infradead.org Cc: linux-amlogic@lists.infradead.org Cc: linux-rockchip@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Cc: linux-sunxi@lists.linux.dev
-
drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
There's a bunch of atomic drivers who don't do this quite correctly, luckily most of them aren't in wide use or people would have noticed the tearing. By making this the default we avoid the constant audit pain and can additionally remove a ton of lines from vfuncs for a bit more clarity in smaller drivers. While at it complain if there's a cleanup_fb hook but no prepare_fb hook, because that makes no sense. I haven't found any driver which violates this, but better safe than sorry. Subsequent patches will reap the benefits. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch>
-
drm/panfrost: Fix implicit sync
Currently this has no practial relevance I think because there's not many who can pull off a setup with panfrost and another gpu in the same system. But the rules are that if you're setting an exclusive fence, indicating a gpu write access in the implicit fencing system, then you need to wait for all fences, not just the previous exclusive fence. panfrost against itself has no problem, because it always sets the exclusive fence (but that's probably something that will need to be fixed for vulkan and/or multi-engine gpus, or you'll suffer badly). Also no problem with that against display. With the prep work done to switch over to the dependency helpers this is now a oneliner. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org
-
drm/panfrost: Use xarray and helpers for depedency tracking
More consistency and prep work for the next patch. Aside: I wonder whether we shouldn't just move this entire xarray business into the scheduler so that not everyone has to reinvent the same wheels. Cc'ing some scheduler people for this too. v2: Correctly handle sched_lock since Lucas pointed out it's needed. v3: Rebase, dma_resv_get_excl_unlocked got renamed v4: Don't leak job references on failure (Steven). Cc: Lucas Stach <l.stach@pengutronix.de> Cc: "Christian König" <christian.koenig@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Steven Price <steven.price@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
-
drm/panfrost: Shrink sched_lock
drm/scheduler requires a lock between _init and _push_job, but the reservation lock dance doesn't. So shrink the critical section a notch. v2: Lucas pointed out how this should really work, I got it all wrong in v1. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
-
dma-buf: Document dma-buf implicit fencing/resv fencing rules
Docs for struct dma_resv are fairly clear: "A reservation object can have attached one exclusive fence (normally associated with write operations) or N shared fences (read operations)." https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects Furthermore a review across all of upstream. First of render drivers and how they set implicit fences: - nouveau follows this contract, see in validate_fini_no_ticket() nouveau_bo_fence(nvbo, fence, !!b->write_domains); and that last boolean controls whether the exclusive or shared fence slot is used. - radeon follows this contract by setting p->relocs[i].tv.num_shared = !r->write_domain; in radeon_cs_parser_relocs(), which ensures that the call to ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the right thing. - vmwgfx seems to follow this contract with the shotgun approach of always setting ttm_val_buf->num_shared = 0, which means ttm_eu_fence_buffer_objects() will only use the exclusive slot. - etnaviv follows this contract, as can be trivially seen by looking at submit_attach_object_fences() - i915 is a bit a convoluted maze with multiple paths leading to i915_vma_move_to_active(). Which sets the exclusive flag if EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for softpin mode, or through the write_domain when using relocations. It follows this contract. - lima follows this contract, see lima_gem_submit() which sets the exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that bo - msm follows this contract, see msm_gpu_submit() which sets the exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer - panfrost follows this contract with the shotgun approach of just always setting the exclusive fence, see panfrost_attach_object_fences(). Benefits of a single engine I guess - v3d follows this contract with the same shotgun approach in v3d_attach_fences_and_unlock_reservation(), but it has at least an XXX comment that maybe this should be improved - v4c uses the same shotgun approach of always setting an exclusive fence, see vc4_update_bo_seqnos() - vgem also follows this contract, see vgem_fence_attach_ioctl() and the VGEM_FENCE_WRITE. This is used in some igts to validate prime sharing with i915.ko without the need of a 2nd gpu - vritio follows this contract again with the shotgun approach of always setting an exclusive fence, see virtio_gpu_array_add_fence() This covers the setting of the exclusive fences when writing. Synchronizing against the exclusive fence is a lot more tricky, and I only spot checked a few: - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all implicit dependencies (which is used by vulkan) - etnaviv does this. Implicit dependencies are collected in submit_fence_sync(), again with an opt-out flag ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in etnaviv_sched_dependency which is the drm_sched_backend_ops->dependency callback. - v4c seems to not do much here, maybe gets away with it by not having a scheduler and only a single engine. Since all newer broadcom chips than the OG vc4 use v3d for rendering, which follows this contract, the impact of this issue is fairly small. - v3d does this using the drm_gem_fence_array_add_implicit() helper, which then it's drm_sched_backend_ops->dependency callback v3d_job_dependency() picks up. - panfrost is nice here and tracks the implicit fences in panfrost_job->implicit_fences, which again the drm_sched_backend_ops->dependency callback panfrost_job_dependency() picks up. It is mildly questionable though since it only picks up exclusive fences in panfrost_acquire_object_fences(), but not buggy in practice because it also always sets the exclusive fence. It should pick up both sets of fences, just in case there's ever going to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a pcie port and a real gpu, which might actually happen eventually. A bug, but easy to fix. Should probably use the drm_gem_fence_array_add_implicit() helper. - lima is nice an easy, uses drm_gem_fence_array_add_implicit() and the same schema as v3d. - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT, but because it doesn't use the drm/scheduler it handles fences from the wrong context with a synchronous dma_fence_wait. See submit_fence_sync() leading to msm_gem_sync_object(). Investing into a scheduler might be a good idea. - all the remaining drivers are ttm based, where I hope they do appropriately obey implicit fences already. I didn't do the full audit there because a) not follow the contract would confuse ttm quite well and b) reading non-standard scheduler and submit code which isn't based on drm/scheduler is a pain. Onwards to the display side. - Any driver using the drm_gem_plane_helper_prepare_fb() helper will correctly. Overwhelmingly most drivers get this right, except a few totally dont. I'll follow up with a patch to make this the default and avoid a bunch of bugs. - I didn't audit the ttm drivers, but given that dma_resv started there I hope they get this right. In conclusion this IS the contract, both as documented and overwhelmingly implemented, specically as implemented by all render drivers except amdgpu. Amdgpu tried to fix this already in commit 049aca4 Author: Christian König <christian.koenig@amd.com> Date: Wed Sep 19 16:54:35 2018 +0200 drm/amdgpu: fix using shared fence for exported BOs v2 but this fix falls short on a number of areas: - It's racy, by the time the buffer is shared it might be too late. To make sure there's definitely never a problem we need to set the fences correctly for any buffer that's potentially exportable. - It's breaking uapi, dma-buf fds support poll() and differentitiate between, which was introduced in commit 9b495a5 Author: Maarten Lankhorst <maarten.lankhorst@canonical.com> Date: Tue Jul 1 12:57:43 2014 +0200 dma-buf: add poll support, v3 - Christian König wants to nack new uapi building further on this dma_resv contract because it breaks amdgpu, quoting "Yeah, and that is exactly the reason why I will NAK this uAPI change. "This doesn't works for amdgpu at all for the reasons outlined above." https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/ Rejecting new development because your own driver is broken and violates established cross driver contracts and uapi is really not how upstream works. Now this patch will have a severe performance impact on anything that runs on multiple engines. So we can't just merge it outright, but need a bit a plan: - amdgpu needs a proper uapi for handling implicit fencing. The funny thing is that to do it correctly, implicit fencing must be treated as a very strange IPC mechanism for transporting fences, where both setting the fence and dependency intercepts must be handled explicitly. Current best practices is a per-bo flag to indicate writes, and a per-bo flag to to skip implicit fencing in the CS ioctl as a new chunk. - Since amdgpu has been shipping with broken behaviour we need an opt-out flag from the butchered implicit fencing model to enable the proper explicit implicit fencing model. - for kernel memory fences due to bo moves at least the i915 idea is to use ttm_bo->moving. amdgpu probably needs the same. - since the current p2p dma-buf interface assumes the kernel memory fence is in the exclusive dma_resv fence slot we need to add a new fence slot for kernel fences, which must never be ignored. Since currently only amdgpu supports this there's no real problem here yet, until amdgpu gains a NO_IMPLICIT CS flag. - New userspace needs to ship in enough desktop distros so that users wont notice the perf impact. I think we can ignore LTS distros who upgrade their kernels but not their mesa3d snapshot. - Then when this is all in place we can merge this patch here. What is not a solution to this problem here is trying to make the dma_resv rules in the kernel more clever. The fundamental issue here is that the amdgpu CS uapi is the least expressive one across all drivers (only equalled by panfrost, which has an actual excuse) by not allowing any userspace control over how implicit sync is conducted. Until this is fixed it's completely pointless to make the kernel more clever to improve amdgpu, because all we're doing is papering over this uapi design issue. amdgpu needs to attain the status quo established by other drivers first, once that's achieved we can tackle the remaining issues in a consistent way across drivers. v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I entirely missed. This is great because it means the amdgpu specific piece for proper implicit fence handling exists already, and that since a while. The only thing that's now missing is - fishing the implicit fences out of a shared object at the right time - setting the exclusive implicit fence slot at the right time. Jason has a patch series to fill that gap with a bunch of generic ioctl on the dma-buf fd: https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/ v3: Since Christian has fixed amdgpu now in commit 8c505bd (drm-misc/drm-misc-next) Author: Christian König <christian.koenig@amd.com> Date: Wed Jun 9 13:51:36 2021 +0200 drm/amdgpu: rework dma_resv handling v3 Use the audit covered in this commit message as the excuse to update the dma-buf docs around dma_buf.resv usage across drivers. Since dynamic importers have different rules also hammer these in again while we're at it. Cc: mesa-dev@lists.freedesktop.org Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: Dave Airlie <airlied@gmail.com> Cc: Rob Clark <robdclark@chromium.org> Cc: Kristian H. Kristensen <hoegsberg@google.com> Cc: Michel Dänzer <michel@daenzer.net> Cc: Daniel Stone <daniels@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Chen Li <chenli@uniontech.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: Dennis Li <Dennis.Li@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
-
dma-buf: Switch to inline kerneldoc
Also review & update everything while we're at it. This is prep work to smash a ton of stuff into the kerneldoc for @resv. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Chen Li <chenli@uniontech.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org
-
Oversight from commit 6edbd6a Author: Christian König <christian.koenig@amd.com> Date: Mon May 10 16:14:09 2021 +0200 dma-buf: rename and cleanup dma_resv_get_excl v3 Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org
-
-
-
Merge remote-tracking branch 'drm/drm-next' into drm-tip
# Conflicts: # drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c # drivers/gpu/drm/amd/amdgpu/amdgpu_object.c # drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c # drivers/gpu/drm/vc4/vc4_hdmi.c
-
drm/amdgpu: wait for moving fence after pinning
We actually need to wait for the moving fence after pinning the BO to make sure that the pin is completed. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/ CC: stable@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-3-christian.koenig@amd.com
-
drm/radeon: wait for moving fence after pinning
We actually need to wait for the moving fence after pinning the BO to make sure that the pin is completed. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/ CC: stable@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-2-christian.koenig@amd.com
-
drm/nouveau: wait for moving fence after pinning v2
We actually need to wait for the moving fence after pinning the BO to make sure that the pin is completed. v2: grab the lock while waiting Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/ CC: stable@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-1-christian.koenig@amd.com
-
drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary crtc
Add a single point of truth for figuring out the primary/secondary crtc for bigjoiner instead of duplicating the magic pipe +/- 1 in multiple places. Also fix the pipe validity checks to properly take non-contiguous pipes into account. The current checks may theoretically overflow i915->pipe_to_crtc_mapping[pipe], albeit with a warning, due to fused off pipes, as INTEL_NUM_PIPES() returns the actual number of pipes on the platform, and the check is for INTEL_NUM_PIPES() == pipe + 1. Prefer primary/secondary terminology going forward. v2: - Improved abstractions for pipe validity etc. Fixes: 8a029c1 ("drm/i915/dp: Modify VDSC helpers to configure DSC for Bigjoiner slave") Fixes: d961eb2 ("drm/i915/bigjoiner: atomic commit changes for uncompressed joiner") Cc: Animesh Manna <animesh.manna@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Vandita Kulkarni <vandita.kulkarni@intel.com> Reviewed-by: Manasi Navare <manasi.dl.navare@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210610090528.20511-1-jani.nikula@intel.com
-
drm/amdgpu: rework dma_resv handling v3
Drop the workaround and instead implement a better solution. Basically we are now chaining all submissions using a dma_fence_chain container and adding them as exclusive fence to the dma_resv object. This way other drivers can still sync to the single exclusive fence while amdgpu only sync to fences from different processes. v3: add the shared fence first before the exclusive one Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210614174536.5188-2-christian.koenig@amd.com
-
drm/amdgpu: unwrap fence chains in the explicit sync fence
Unwrap the explicit fence if it is a dma_fence_chain and sync to the first fence not matching the owner rules. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210614174536.5188-1-christian.koenig@amd.com
-
Revert "drm: add a locked version of drm_is_current_master"
This reverts commit 1815d9c. Unfortunately this inverts the locking hierarchy, so back to the drawing board. Full lockdep splat below: ====================================================== WARNING: possible circular locking dependency detected 5.13.0-rc7-CI-CI_DRM_10254+ #1 Not tainted ------------------------------------------------------ kms_frontbuffer/1087 is trying to acquire lock: ffff88810dcd01a8 (&dev->master_mutex){+.+.}-{3:3}, at: drm_is_current_master+0x1b/0x40 but task is already holding lock: ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&dev->mode_config.mutex){+.+.}-{3:3}: __mutex_lock+0xab/0x970 drm_client_modeset_probe+0x22e/0xca0 __drm_fb_helper_initial_config_and_unlock+0x42/0x540 intel_fbdev_initial_config+0xf/0x20 [i915] async_run_entry_fn+0x28/0x130 process_one_work+0x26d/0x5c0 worker_thread+0x37/0x380 kthread+0x144/0x170 ret_from_fork+0x1f/0x30 -> #1 (&client->modeset_mutex){+.+.}-{3:3}: __mutex_lock+0xab/0x970 drm_client_modeset_commit_locked+0x1c/0x180 drm_client_modeset_commit+0x1c/0x40 __drm_fb_helper_restore_fbdev_mode_unlocked+0x88/0xb0 drm_fb_helper_set_par+0x34/0x40 intel_fbdev_set_par+0x11/0x40 [i915] fbcon_init+0x270/0x4f0 visual_init+0xc6/0x130 do_bind_con_driver+0x1e5/0x2d0 do_take_over_console+0x10e/0x180 do_fbcon_takeover+0x53/0xb0 register_framebuffer+0x22d/0x310 __drm_fb_helper_initial_config_and_unlock+0x36c/0x540 intel_fbdev_initial_config+0xf/0x20 [i915] async_run_entry_fn+0x28/0x130 process_one_work+0x26d/0x5c0 worker_thread+0x37/0x380 kthread+0x144/0x170 ret_from_fork+0x1f/0x30 -> #0 (&dev->master_mutex){+.+.}-{3:3}: __lock_acquire+0x151e/0x2590 lock_acquire+0xd1/0x3d0 __mutex_lock+0xab/0x970 drm_is_current_master+0x1b/0x40 drm_mode_getconnector+0x37e/0x4a0 drm_ioctl_kernel+0xa8/0xf0 drm_ioctl+0x1e8/0x390 __x64_sys_ioctl+0x6a/0xa0 do_syscall_64+0x39/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &dev->master_mutex --> &client->modeset_mutex --> &dev->mode_config.mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&dev->mode_config.mutex); lock(&client->modeset_mutex); lock(&dev->mode_config.mutex); lock(&dev->master_mutex); *** DEADLOCK *** 1 lock held by kms_frontbuffer/1087: #0: ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0 stack backtrace: CPU: 7 PID: 1087 Comm: kms_frontbuffer Not tainted 5.13.0-rc7-CI-CI_DRM_10254+ #1 Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019 Call Trace: dump_stack+0x7f/0xad check_noncircular+0x12e/0x150 __lock_acquire+0x151e/0x2590 lock_acquire+0xd1/0x3d0 __mutex_lock+0xab/0x970 drm_is_current_master+0x1b/0x40 drm_mode_getconnector+0x37e/0x4a0 drm_ioctl_kernel+0xa8/0xf0 drm_ioctl+0x1e8/0x390 __x64_sys_ioctl+0x6a/0xa0 do_syscall_64+0x39/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Note that this broke the intel-gfx CI pretty much across the board because it has to reboot machines after it hits a lockdep splat. Testcase: igt/debugfs_test/read_all_entries Acked-by: Petri Latvala <petri.latvala@intel.com> Fixes: 1815d9c ("drm: add a locked version of drm_is_current_master") Cc: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210622075409.2673805-1-daniel.vetter@ffwll.ch
-
drm: bridge: ti-sn65dsi83: Retrieve the display mode from the state
Instead of storing a copy of the display mode in the sn65dsi83 structure, retrieve it from the atomic state in sn65dsi83_atomic_enable(). This allows the removal of the .mode_set() operation, and completes the transition to the atomic API. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210621125518.13715-6-laurent.pinchart@ideasonboard.com
-
drm: bridge: ti-sn65dsi83: Retrieve output format from bridge state
The driver currently iterates over all connectors to get the bus format, used to configure the LVDS output format. This causes several issues: - If other connectors than the LVDS output are present, the format used by the driver may end up belonging to an entirely different output. - The code can crash if some connectors are not connected, as bus_format may then be NULL. - There's no guarantee that the bus format on the connector at the output of the pipeline matches the output of the sn65dsi83, as there may be other bridges in the pipeline. Solve this by retrieving the format from the bridge state instead, which provides the format corresponding to the output of the bridge. The struct sn65dsi83 lvds_format_24bpp and lvds_format_jeida fields are moved to local variables in sn65dsi83_atomic_enable() as they're now used in that function only. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210621125518.13715-5-laurent.pinchart@ideasonboard.com
-
drm: bridge: ti-sn65dsi83: Switch to atomic operations
Use the atomic version of the enable/disable operations to continue the transition to the atomic API, started with the introduction of .atomic_get_input_bus_fmts(). This will be needed to access the mode from the atomic state. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210621125518.13715-4-laurent.pinchart@ideasonboard.com
-
drm: bridge: ti-sn65dsi83: Pass mode explicitly to helper functions
Pass the display mode explicitly to the sn65dsi83_get_lvds_range() and sn65dsi83_get_dsi_range() functions to prepare for its removal from the sn65dsi83 structure. This is not meant to bring any functional change. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210621125518.13715-3-laurent.pinchart@ideasonboard.com