
Commit 20a0778

tomaszlimwajdecz
authored and committed
drm/xe/vf: Fail migration recovery if fixups needed but platform not supported
The post-migration recovery needs to be fully implemented for a specific platform in order to make continuation of workloads possible. New platforms introduce changes which affect the recovery procedure, and without a clear verification of support this leads to errors with no straightforward error message explaining the cause.

This patch fixes that issue: it introduces a message to be logged when the current driver is known to not support the current platform. Wedging the driver immediately also decreases the number of additional errors which would come afterwards if the driver continued operation.

v2: Show the message during probe as well as during recovery; do not perform any recovery steps if the recovery is bound to fail
v3: Use SRIOV-specific logging, fix typos
v4: XE_DEBUG_SRIOV to XE_DEBUG check switch, to make testing more straightforward

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250519230035.3143966-1-tomasz.lis@intel.com
1 parent 49c6dc7 commit 20a0778

1 file changed, +17 -0 lines changed


drivers/gpu/drm/xe/xe_sriov_vf.c

Lines changed: 17 additions & 0 deletions
@@ -123,6 +123,15 @@
  * | | |
  */
 
+static bool vf_migration_supported(struct xe_device *xe)
+{
+	/*
+	 * TODO: Add conditions to allow specific platforms, when they're
+	 * supported at production quality.
+	 */
+	return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
+}
+
 static void migration_worker_func(struct work_struct *w);
 
 /**
@@ -132,6 +141,9 @@ static void migration_worker_func(struct work_struct *w);
 void xe_sriov_vf_init_early(struct xe_device *xe)
 {
 	INIT_WORK(&xe->sriov.vf.migration.worker, migration_worker_func);
+
+	if (!vf_migration_supported(xe))
+		xe_sriov_info(xe, "migration not supported by this module version\n");
 }
 
 /**
@@ -236,6 +248,11 @@ static void vf_post_migration_recovery(struct xe_device *xe)
 		goto defer;
 	if (unlikely(err))
 		goto fail;
+	if (!vf_migration_supported(xe)) {
+		xe_sriov_err(xe, "migration not supported by this module version\n");
+		err = -ENOTRECOVERABLE;
+		goto fail;
+	}
 
 	need_fixups = vf_post_migration_fixup_ggtt_nodes(xe);
 	/* FIXME: add the recovery steps */
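
The gating pattern in this patch can be tried outside the kernel with a small userspace sketch. Everything prefixed `demo_` or `DEMO_` below is hypothetical and not part of the xe driver: `DEMO_DEBUG_BUILD` stands in for `IS_ENABLED(CONFIG_DRM_XE_DEBUG)`, and the `wedged` flag is an assumption about what the `fail` path does, based only on the commit message's mention of wedging the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_DRM_XE_DEBUG); build with
 * -DDEMO_DEBUG_BUILD=1 to simulate a debug kernel. */
#ifndef DEMO_DEBUG_BUILD
#define DEMO_DEBUG_BUILD 0
#endif

/* Hypothetical device state, not the real struct xe_device. */
struct demo_device {
	const char *name;
	bool wedged;        /* assumed effect of the driver's fail path */
	int fixups_applied; /* counts recovery steps that actually ran */
};

/* Mirrors vf_migration_supported(): supported only in debug builds. */
static bool demo_migration_supported(const struct demo_device *dev)
{
	(void)dev; /* no per-platform conditions yet, as in the patch's TODO */
	return DEMO_DEBUG_BUILD != 0;
}

/* Mirrors the early exit added to vf_post_migration_recovery(): bail out
 * before any fixups when recovery is known to be unsupported. */
static int demo_post_migration_recovery(struct demo_device *dev)
{
	assert(dev != NULL);

	if (!demo_migration_supported(dev)) {
		fprintf(stderr, "%s: migration not supported by this module version\n",
			dev->name);
		dev->wedged = true;
		return -ENOTRECOVERABLE;
	}

	dev->fixups_applied++; /* placeholder for the real recovery steps */
	return 0;
}
```

With the default `DEMO_DEBUG_BUILD` of 0, calling `demo_post_migration_recovery()` returns `-ENOTRECOVERABLE` and applies no fixups, so a single clear message is logged instead of a cascade of follow-on errors.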
