Skip to content

Commit be0376a

Browse files
fenrus75gregkh
authored andcommitted
drm/amdgpu: fix zero-size GDS range init on RDNA4
commit 095a8b0 upstream. RDNA4 (GFX 12) hardware removes the GDS, GWS, and OA on-chip memory resources. The gfx_v12_0 initialisation code correctly leaves adev->gds.gds_size, adev->gds.gws_size, and adev->gds.oa_size at zero to reflect this. amdgpu_ttm_init() unconditionally calls amdgpu_ttm_init_on_chip() for each of these resources regardless of size. When the size is zero, amdgpu_ttm_init_on_chip() forwards the call to ttm_range_man_init(), which calls drm_mm_init(mm, 0, 0). drm_mm_init() immediately fires DRM_MM_BUG_ON(start + size <= start) -- trivially true when size is zero -- crashing the kernel during modprobe of amdgpu on an RX 9070 XT. Guard against this by returning 0 early from amdgpu_ttm_init_on_chip() when size_in_page is zero. This skips TTM resource manager registration for hardware resources that are absent, without affecting any other GPU type. DRM_MM_BUG_ON() only asserts if CONFIG_DRM_DEBUG_MM is enabled in the kernel config. This is apparently rarely enabled as these chips have been in the market for over a year and this issue was only reported now. Link: https://lore.kernel.org/all/bug-221376-2300@https.bugzilla.kernel.org%2F/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=221376 Oops-Analysis: http://oops.fenrus.org/reports/bugzilla.korg/221376/report.html Assisted-by: GitHub Copilot:Claude Sonnet 4.6 linux-kernel-oops-x86. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5719ce5) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1 parent 8e8be63 commit be0376a

1 file changed

Lines changed: 3 additions & 0 deletions

File tree

drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -75,6 +75,9 @@ static int amdgpu_ttm_init_on_chip(struct amdgpu_device *adev,
7575
unsigned int type,
7676
uint64_t size_in_page)
7777
{
78+
if (!size_in_page)
79+
return 0;
80+
7881
return ttm_range_man_init(&adev->mman.bdev, type,
7982
false, size_in_page);
8083
}

0 commit comments

Comments
 (0)