Matthew-Brost/…
Commits on Aug 16, 2021
-
drm/i915/guc: Add GuC kernel doc
Add GuC kernel doc for all structures added thus far for GuC submission and update the main GuC submission section with the new interface details. Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Move GuC priority fields in context under guc_active
Move GuC management fields in context under guc_active struct as this is where the lock that protects these fields lives. Also only set guc_prio field once during context init. Fixes: ee242ca ("drm/i915/guc: Implement GuC priority management") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org>
-
drm/i915/guc: Drop pin count check trick between sched_disable and re-pin
Drop pin count check trick between a sched_disable and re-pin, now rely on the lock and counter of the number of committed requests to determine if scheduling should be disabled on the context. Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Proper xarray usage for contexts_lookup
Lock the xarray and take ref to the context if needed. v2: (Checkpatch) - Add new line after declaration Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Rework and simplify locking
Rework and simplify the locking with GuC submission. Drop sched_state_no_lock and move all fields under the guc_state.sched_state and protect all these fields with guc_state.lock. This requires changing the locking hierarchy from guc_state.lock -> sched_engine.lock to sched_engine.lock -> guc_state.lock. Signed-off-by: Matthew Brost <matthew.brost@intel.com>
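The new hierarchy (sched_engine.lock outer, guc_state.lock inner) can be illustrated with a toy order checker. This is only a sketch in userspace C: `lock_tracker`, `tracker_acquire` and `tracker_release` are hypothetical names invented here, not i915 or lockdep interfaces, and no real locking is performed.

```c
#include <assert.h>

/* Hypothetical lock classes, ranked from outer (lower) to inner (higher). */
enum lock_class {
	LOCK_SCHED_ENGINE = 0,	/* outer lock: sched_engine.lock */
	LOCK_GUC_STATE = 1,	/* inner lock: guc_state.lock */
};

/* Bitmask of the lock classes the current code path holds. */
struct lock_tracker {
	unsigned int held;
};

/*
 * Record an acquisition. Returns 0 if it respects the hierarchy, or -1 if
 * a lock of the same or an inner class is already held (an inversion).
 */
int tracker_acquire(struct lock_tracker *t, enum lock_class c)
{
	/* Any held bit at position >= c means we would nest the wrong way. */
	if (t->held >> c)
		return -1;
	t->held |= 1u << c;
	return 0;
}

/* Record a release; the class must currently be held. */
void tracker_release(struct lock_tracker *t, enum lock_class c)
{
	assert(t->held & (1u << c));
	t->held &= ~(1u << c);
}
```

Taking LOCK_SCHED_ENGINE and then LOCK_GUC_STATE succeeds, while the old order (LOCK_GUC_STATE first) is reported as an inversion on the second acquire.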
-
drm/i915/guc: Move guc_blocked fence to struct guc_state
Move guc_blocked fence to struct guc_state as the lock which protects the fence lives there. s/ce->guc_blocked/ce->guc_state.blocked_fence/g Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Release submit fence from an IRQ
A subsequent patch will flip the locking hierarchy from ce->guc_state.lock -> sched_engine->lock to sched_engine->lock -> ce->guc_state.lock. As such we need to release the submit fence for a request from an IRQ to break a lock inversion - i.e. the fence must be released when holding ce->guc_state.lock and the releasing of the fence can acquire sched_engine->lock. Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Flush G2H work queue during reset
It isn't safe to scrub for missing G2H or continue with the reset until all G2H processing is complete. Flush the G2H work queue during reset to ensure it is done running. Fixes: eb5e7da ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915: Allocate error capture in atomic context
Error captures can now be done in a work queue processing G2H messages. These messages need to be completely done being processed in the reset path, to avoid races in the missing G2H cleanup, which create a dependency on memory allocations and dma fences (i915_requests). Requests depend on resets, thus now we have a circular dependency. To work around this, allocate the error capture in an atomic context. Fixes: dc0dad3 ("Fix for error capture after full GPU reset with GuC") Fixes: 573ba12 ("Capture error state on context reset") Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
Reset LRC descriptor if a context register returns -ENODEV as this means we are mid-reset. Fixes: eb5e7da ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Don't touch guc_state.sched_state without a lock
Before we did some clever tricks to not use a lock when touching guc_state.sched_state in certain cases. Don't do that, enforce the use of the lock. Part of this is removing a dead code path from guc_lrc_desc_pin where a context could be deregistered when the aforementioned function was called from the submission path. Remove this dead code and add a GEM_BUG_ON if this path is ever attempted to be used. Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Take context ref when cancelling request
A context can get destroyed after cancelling a request so take a reference to context when cancelling a request. Fixes: 62eaf0a ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com>
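A minimal sketch of why the extra reference matters, assuming a simplified model in which the request holds the last reference to its context. All names here (`context_get`, `context_put`, `request_cancel`, `cancel_and_account`) are hypothetical stand-ins, and "destroyed" is a flag rather than a real free so the example stays safe to run.

```c
#include <assert.h>
#include <stdbool.h>

struct context {
	int refcount;
	bool destroyed;		/* set when the last reference is dropped */
	int cancels_seen;
};

struct request {
	struct context *ce;
	bool cancelled;
};

static void context_get(struct context *ce)
{
	ce->refcount++;
}

static void context_put(struct context *ce)
{
	assert(ce->refcount > 0);
	if (--ce->refcount == 0)
		ce->destroyed = true;
}

/* Assumption: cancelling retires the request and drops its context ref. */
static void request_cancel(struct request *rq)
{
	rq->cancelled = true;
	context_put(rq->ce);
}

/*
 * Cancel rq and then touch its context. The temporary reference keeps the
 * context alive across the cancel. Returns true if the context was still
 * alive when it was touched.
 */
bool cancel_and_account(struct request *rq)
{
	struct context *ce = rq->ce;
	bool was_alive;

	context_get(ce);
	request_cancel(rq);
	was_alive = !ce->destroyed;
	ce->cancels_seen++;
	context_put(ce);

	return was_alive;
}
```

Without the `context_get()` / `context_put()` pair, the count in this model would reach zero inside `request_cancel()` and the later access would touch a destroyed context.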
-
drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
While debugging an issue with full GT resets I went down a rabbit hole thinking the scrubbing of lost G2H wasn't working correctly. This proved to be incorrect as this was working just fine but this chase inspired me to write a selftest to prove that this works. This simple selftest injects errors dropping various G2H and then issues a full GT reset proving that the scrubbing of these G2H doesn't blow up. v2: (Daniel Vetter) - Use ifdef instead of macros for selftests v3: (Checkpatch) - A space after 'switch' statement Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/selftests: Fix memory corruption in live_lrc_isolation
GuC submission has exposed an existing memory corruption in live_lrc_isolation. We believe that some writes to the watchdog offsets in the LRC (0x178 & 0x17c) can result in trashing of portions of the address space. With GuC submission there are additional objects which can move the context redzone into the space that is trashed. To work around this, avoid poisoning the watchdog. v2: (Daniel Vetter) - Add VLK ref in code to workaround Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, not registered
When unblocking a context, do not enable scheduling if the context is banned, guc_id invalid, or not registered. Fixes: 62eaf0a ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org>
-
drm/i915/selftests: Add a cancel request selftest that triggers a reset
Add a cancel request selftest that results in an engine reset to cancel the request as it is non-preemptable. Also insert a NOP request after the cancelled request and confirm that it completes successfully. Signed-off-by: Matthew Brost <matthew.brost@intel.com>
-
drm/i915/execlists: Do not propagate errors to dependent fences
Propagating errors to dependent fences is wrong, don't do it. Selftest in following patch exposes this bug. Fixes: 8e9f84c ("drm/i915/gt: Propagate change in error status to children on unhold") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org>
-
drm/i915/guc: Workaround reset G2H is received after schedule done G2H
If the context is reset as a result of the request cancelation the context reset G2H is received after schedule disable done G2H which is likely the wrong order. The schedule disable done G2H releases the waiting request cancelation code which resubmits the context. This races with the context reset G2H which also wants to resubmit the context but in this case it really should be a NOP as request cancelation code owns the resubmit. Use some clever tricks of checking the context state to seal this race until if / when the GuC firmware is fixed. v2: (Checkpatch) - Fix typos Fixes: 62eaf0a ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org>
-
drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
Don't drop ce->guc_active.lock when unwinding a context after reset. At one point we had to drop this because of a lock inversion but that is no longer the case. It is much safer to hold the lock so let's do that. Fixes: eb5e7da ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org>
-
drm/i915/guc: Unwind context requests in reverse order
When unwinding requests on a reset context, if other requests in the context are in the priority list the requests could be resubmitted out of seqno order. Traverse the list of active requests in reverse and append to the head of the priority list to fix this. Fixes: eb5e7da ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org>
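The ordering argument can be shown with a small userspace sketch. It assumes a singly linked priority list and an array of in-flight requests ordered oldest first; `prio_push_front`, `unwind_requests` and `prio_list_is_ordered` are hypothetical helpers, not the i915 list API.

```c
#include <assert.h>
#include <stddef.h>

struct request {
	unsigned int seqno;
	struct request *next;
};

/* Insert rq at the head of the singly linked priority list. */
static void prio_push_front(struct request **head, struct request *rq)
{
	rq->next = *head;
	*head = rq;
}

/*
 * Unwind n in-flight requests (oldest first in active[]) back onto the
 * priority list. Walking newest-to-oldest and inserting at the head leaves
 * them ahead of any later requests already queued, in seqno order.
 */
void unwind_requests(struct request **prio_head, struct request *active[],
		     size_t n)
{
	size_t i;

	assert(prio_head && (active || n == 0));
	for (i = n; i-- > 0; )
		prio_push_front(prio_head, active[i]);
}

/* Returns 1 if seqnos strictly increase from head to tail, else 0. */
int prio_list_is_ordered(const struct request *head)
{
	for (; head && head->next; head = head->next)
		if (head->seqno >= head->next->seqno)
			return 0;
	return 1;
}
```

With seqnos 4 and 5 already queued and 1, 2, 3 in flight, the reverse walk produces 1, 2, 3, 4, 5. Walking forward and appending to the tail would instead give 4, 5, 1, 2, 3.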
-
drm/i915/guc: Fix outstanding G2H accounting
A small race that could result in incorrect accounting of the number of outstanding G2H. Basically prior to this patch we did not increment the number of outstanding G2H if we encountered a GT reset while sending a H2G. This was incorrect as the context state had already been updated to anticipate a G2H response thus the counter should be incremented. Fixes: f4eb1f3 ("drm/i915/guc: Ensure G2H response has space in buffer") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org>
-
drm/i915/guc: Fix blocked context accounting
Prior to this patch the blocked context counter was cleared on init_sched_state (used during registering a context & resets) which is incorrect. This state needs to be persistent or the counter can read the incorrect value resulting in scheduling never getting enabled again. Fixes: 62eaf0a ("drm/i915/guc: Support request cancellation") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org>
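A rough sketch of the accounting rule, assuming a nesting block counter kept alongside transient state bits. The struct layout and the `sched_block` / `sched_unblock` helpers are hypothetical; only the name `init_sched_state` comes from the commit message, and the real function differs.

```c
#include <assert.h>

/* Hypothetical scheduling state; the real layout in i915 differs. */
struct sched_state {
	unsigned int flags;	/* transient bits such as "enabled" */
	unsigned int blocked;	/* nesting count of outstanding blocks */
};

#define SCHED_ENABLED 0x1u

void sched_block(struct sched_state *s)
{
	s->blocked++;
	s->flags &= ~SCHED_ENABLED;
}

/* Scheduling is re-enabled only when the last block is dropped. */
void sched_unblock(struct sched_state *s)
{
	assert(s->blocked > 0);
	if (--s->blocked == 0)
		s->flags |= SCHED_ENABLED;
}

/* Reset transient state on (re)registration or reset; keep the count. */
void init_sched_state(struct sched_state *s)
{
	s->flags = 0;
	/* s->blocked is deliberately left untouched. */
}
```

If `init_sched_state()` zeroed `blocked` in this model, a later unblock would trip the assertion, or, without it, wrap the unsigned counter so that it never returns to zero and scheduling is never re-enabled.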
-
Merge drm/drm-next into drm-intel-next
Catch up with drm core changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
-
Merge tag 'drm-misc-next-2021-08-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.15: UAPI Changes: Cross-subsystem Changes: - Add lockdep_assert(once) helpers. Core Changes: - Add lockdep assert to drm_is_current_master_locked. - Fix typos in dma-buf documentation. - Mark drm irq midlayer as legacy only. - Fix GPF in udmabuf_create. - Rename member to correct value in drm_edid.h Driver Changes: - Build fix to make nouveau build with NOUVEAU_BACKLIGHT. - Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels. - Assorted fixes to bridge/it66121, anx7625. - Add custom crtc_state to simple helpers, and use it to convert pll handling in mgag200 to atomic. - Convert drivers to use offset-adjusted framebuffer bo mappings. - Assorted small fixes and fix for a use-after-free in vmwgfx. - Convert remaining callers of non-legacy drivers to use linux irqs directly. - Small cleanup in ingenic. - Small fixes to virtio and ti-sn65dsi86. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1cf2d7fc-402d-1852-574a-21cbbd2eaebf@linux.intel.com
Commits on Aug 13, 2021
-
drm/i915/dg2: add SNPS PHY translations for UHBR link rates
UHBR link rates use different tx equalization settings. Using this will require changes in the link training code too. Bspec: 53920 Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210813115151.19290-3-jani.nikula@intel.com
-
drm/i915/dg2: use existing mechanisms for SNPS PHY translations
We use encoder->get_buf_trans() in many places, for example intel_ddi_dp_voltage_max(), and the hook was set to some old platform's function for DG2 SNPS PHY. Convert SNPS PHY to use the same translation mechanisms as everything else. Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210813115151.19290-2-jani.nikula@intel.com
-
drm/i915/dp: pass crtc_state to intel_ddi_dp_level()
Needed in the future. Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210813115151.19290-1-jani.nikula@intel.com
-
drm/i915/mst: use intel_de_rmw() to simplify VC payload alloc set/clear
Less is more, fewer lines to wonder about. Cc: Manasi Navare <manasi.d.navare@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210813115610.20010-1-jani.nikula@intel.com
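For readers unfamiliar with the read-modify-write pattern, here is a minimal userspace sketch. The register is modelled as plain memory, and `reg_rmw`, `vc_payload_alloc_set` and `VC_PAYLOAD_ALLOC` are hypothetical names chosen for illustration; they do not reproduce the signature of intel_de_rmw() or any real register layout.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read-modify-write on a register modelled as plain memory: clear the bits
 * in `clear`, then set the bits in `set`. Returns the previous value.
 */
uint32_t reg_rmw(volatile uint32_t *reg, uint32_t clear, uint32_t set)
{
	uint32_t old;

	assert(reg);
	old = *reg;
	*reg = (old & ~clear) | set;
	return old;
}

/* Hypothetical bit standing in for a VC payload allocation flag. */
#define VC_PAYLOAD_ALLOC (1u << 31)

/* One call replaces a separate read, mask, conditional OR and write. */
void vc_payload_alloc_set(volatile uint32_t *reg, int enable)
{
	reg_rmw(reg, VC_PAYLOAD_ALLOC, enable ? VC_PAYLOAD_ALLOC : 0);
}
```

Set and clear share one code path: the flag is always cleared first and only re-set when `enable` is non-zero.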
-
drm/i915/edp: fix eDP MSO pipe sanity checks for ADL-P
ADL-P supports stream splitter on pipe B in addition to pipe A. Update the sanity check in intel_ddi_mso_get_config() to reflect this, and remove the check in intel_ddi_mso_configure() as redundant with encoder->pipe_mask. Abstract the splitter pipe mask to a single point of truth while at it to avoid similar mistakes in the future. Fixes: 7bc188c ("drm/i915/adl_p: enable MSO on pipe B") Cc: Uma Shankar <uma.shankar@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Swati Sharma <swati2.sharma@intel.com> Reviewed-by: Swati Sharma <swati2.sharma@intel.com> Tested-by: Swati Sharma <swati2.sharma@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210812132354.10885-1-jani.nikula@intel.com
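The "single point of truth" idea can be sketched as one helper that every sanity check consults. The enums and function names below are hypothetical, and the sketch assumes that platforms other than ADL-P support the splitter on pipe A only, which is how the commit message reads.

```c
#include <assert.h>
#include <stdint.h>

enum pipe { PIPE_A, PIPE_B, PIPE_C, PIPE_D };

/* Hypothetical platform identifiers for this sketch only. */
enum platform { PLATFORM_OTHER, PLATFORM_ADL_P };

#define BIT(n) (1u << (n))

/* The only place that knows which pipes support the stream splitter. */
uint8_t splitter_pipe_mask(enum platform p)
{
	if (p == PLATFORM_ADL_P)
		return BIT(PIPE_A) | BIT(PIPE_B);
	return BIT(PIPE_A);
}

/* Sanity checks derive their answer from the mask instead of hardcoding. */
int splitter_pipe_is_valid(enum platform p, enum pipe pipe)
{
	assert(pipe <= PIPE_D);
	return (splitter_pipe_mask(p) & BIT(pipe)) != 0;
}
```

Adding splitter support on another pipe or platform then means touching `splitter_pipe_mask()` alone.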
-
fbdev/efifb: Release PCI device's runtime PM ref during FB destroy
Atm the EFI FB platform driver gets a runtime PM reference for the associated GFX PCI device during probing the EFI FB platform device and releases it only when the platform device gets unbound. When fbcon switches to the FB provided by the PCI device's driver (for instance i915/drmfb), the EFI FB will get only unregistered without the EFI FB platform device getting unbound, keeping the runtime PM reference acquired during the platform device probing. This reference will prevent the PCI driver from runtime suspending the device. Fix this by releasing the RPM reference from the EFI FB's destroy hook, called when the FB gets unregistered. While at it assert that pm_runtime_get_sync() didn't fail. v2: - Move pm_runtime_get_sync() before register_framebuffer() to avoid its race wrt. efifb_destroy()->pm_runtime_put(). (Daniel) - Assert that pm_runtime_get_sync() didn't fail. - Clarify commit message wrt. platform/PCI device/driver and driver removal vs. device unbinding. Fixes: a6c0fd3 ("efifb: Ensure graphics device for efifb stays at PCI D0") Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1) Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210809133146.2478382-1-imre.deak@intel.com
Commits on Aug 12, 2021
-
drm/bridge: ti-sn65dsi86: Avoid creating multiple connectors
If we created our own connector because the driver does not support the NO_CONNECTOR flag, we don't want the downstream bridge to *also* create a connector. And if this driver did pass the NO_CONNECTOR flag (and we supported that mode) this would change nothing. Fixes: 4e5763f ("drm/bridge: ti-sn65dsi86: Wrap panel with panel-bridge") Reported-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210811235253.924867-2-robdclark@gmail.com
-
Byte 26 in an edid struct is supposed to be "Blue and white least-significant 2 bits", not "black and white". Rename the field accordingly. This field is not used anywhere, so just renaming it here for correctness. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Simon Ser <contact@emersion.fr> Signed-off-by: Simon Ser <contact@emersion.fr> Link: https://patchwork.freedesktop.org/patch/msgid/20210811205818.156100-1-lucas.demarchi@intel.com
-
drm/virtio: set non-cross device blob uuid_state
Blob resources without the cross device flag don't have a uuid to share with other virtio devices. When exporting such blobs, set uuid_state to STATE_ERR so that virtgpu_virtio_get_uuid doesn't hang. Signed-off-by: David Stevens <stevensd@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20210811040401.1264234-1-stevensd@chromium.org Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
-
drm/i915: Tweaked Wa_14010685332 for all PCHs
dispcnlunit1_cp_xosc_clkreq clock observed to be active on TGL-H platform despite Wa_14010685332 original sequence, thus blocks entry to deeper s0ix state. The Tweaked Wa_14010685332 sequence fixes this issue, therefore use tweaked Wa_14010685332 sequence for every PCH since PCH_CNP. v2: - removed RKL from comment and simplified condition. [Rodrigo] Fixes: b896898 ("drm/i915: Tweaked Wa_14010685332 for PCHs used on gen11 platforms") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210810113112.31739-2-anshuman.gupta@intel.com
-
udmabuf: fix general protection fault in udmabuf_create
Syzbot reported a general protection fault in udmabuf_create. The problem was wrong error handling. In commit 16c243e ("udmabuf: Add support for mapping hugepages (v4)") the shmem_read_mapping_page() call was replaced with find_get_page_flags(), but find_get_page_flags() returns NULL on failure instead of an ERR_PTR(). The wrong error check caused a GPF in get_page(), since the page passed to it was NULL. Fix it by changing if (IS_ERR(!hpage)) to if (!hpage) Reported-by: syzbot+e9cd3122a37c5d6c51e8@syzkaller.appspotmail.com Fixes: 16c243e ("udmabuf: Add support for mapping hugepages (v4)") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20210811175052.21254-1-paskripkin@gmail.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
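The reason an error-pointer check misses NULL can be shown with a userspace model of the kernel's error-pointer encoding. `err_ptr` and `is_err` below are simplified re-implementations written for this sketch (the kernel's own helpers live in include/linux/err.h), and `lookup_failed` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top MAX_ERRNO values of the address space. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True only for pointers in the reserved error range; NULL is not in it. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Correct check for a lookup that signals failure by returning NULL. */
int lookup_failed(const void *page)
{
	return page == NULL;
}
```

Because `is_err(NULL)` is false, a caller that only tests for an error pointer will treat a NULL page as valid and dereference it later.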