drm/amdgpu: Init zone device and drm client after mode-1 reset on reload
[ Upstream commit f679fd6 ]

In a passthrough environment, when amdgpu is reloaded after unload, a mode-1
reset is triggered after initializing the necessary IPs. That init does not
include KFD, and KFD init waits until the reset is completed. KFD init
is called in the reset handler, but in this case the zone device and
drm client are not initialized, so an app can cause a kernel panic.

v2: Remove the init KFD condition from amdgpu_amdkfd_drm_client_create,
as the previous version has the potential to create the DRM client twice.

v3: The v2 patch results in an SDMA engine hang, as DRM open causes a VM clear
to SDMA before SDMA init. Add a condition in drm client creation, on top of v1,
to guard against drm client creation being called multiple times.

Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
ahrehman authored and gregkh committed Apr 13, 2024
1 parent 8cae460 commit 4f8154f
Showing 2 changed files with 5 additions and 2 deletions.
2 changes: 1 addition & 1 deletion drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -146,7 +146,7 @@ int amdgpu_amdkfd_drm_client_create(struct amdgpu_device *adev)
 {
 	int ret;
 
-	if (!adev->kfd.init_complete)
+	if (!adev->kfd.init_complete || adev->kfd.client.dev)
 		return 0;
 
 	ret = drm_client_init(&adev->ddev, &adev->kfd.client, "kfd",
5 changes: 4 additions & 1 deletion drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2451,8 +2451,11 @@ static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work)
 	}
 	for (i = 0; i < mgpu_info.num_dgpu; i++) {
 		adev = mgpu_info.gpu_ins[i].adev;
-		if (!adev->kfd.init_complete)
+		if (!adev->kfd.init_complete) {
+			kgd2kfd_init_zone_device(adev);
 			amdgpu_amdkfd_device_init(adev);
+			amdgpu_amdkfd_drm_client_create(adev);
+		}
 		amdgpu_ttm_set_buffer_funcs_status(adev, true);
 	}
 }
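
The combined effect of the two hunks can be modelled in a small standalone sketch. This is not driver code: the struct and function names below (model_device, model_client, model_client_create, model_reset_handler) are hypothetical stand-ins, and the sketch assumes that a non-NULL client.dev means the DRM client already exists. It only illustrates that the guard makes client creation safe to call more than once, and that the reset handler now does zone device init, KFD init, and client creation together.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for driver state; not the real amdgpu structs. */
struct model_client {
	void *dev;                 /* assumed non-NULL once the client exists */
};

struct model_device {
	bool kfd_init_complete;
	bool zone_device_ready;
	struct model_client client;
	int client_create_calls;   /* counts actual creations, for illustration */
};

/* Models the guard: skip if KFD is not up yet or the client already exists. */
static int model_client_create(struct model_device *d)
{
	if (!d->kfd_init_complete || d->client.dev)
		return 0;

	d->client.dev = d;         /* stands in for drm_client_init() success */
	d->client_create_calls++;
	return 0;
}

/* Models the reset-handler order: zone device, KFD init, then client. */
static void model_reset_handler(struct model_device *d)
{
	if (!d->kfd_init_complete) {
		d->zone_device_ready = true;   /* stands in for zone device init */
		d->kfd_init_complete = true;   /* stands in for KFD device init */
		model_client_create(d);
	}
}
```

Calling model_client_create() before the handler runs does nothing, and calling it again after the handler has run leaves client_create_calls at 1, which is the double-creation case the v3 note describes guarding against.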