Commit 77052ab

rtauro1895 authored and lucasdemarchi committed
drm/xe: Add documentation for survivability mode

Add survivability mode document to pcode document as it is enabled when pcode detects a failure.

v2: fix kernel-doc (Lucas)

Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250407051414.1651616-3-riana.tauro@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
1 parent 16280de commit 77052ab

2 files changed: +30 -11 lines changed

Documentation/gpu/xe/xe_pcode.rst

Lines changed: 7 additions & 0 deletions
@@ -12,3 +12,10 @@ Internal API
 
 .. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
    :internal:
+
+==================
+Boot Survivability
+==================
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_survivability_mode.c
+   :doc: Xe Boot Survivability

drivers/gpu/drm/xe/xe_survivability_mode.c

Lines changed: 23 additions & 11 deletions
@@ -28,20 +28,32 @@
  * This is implemented by loading the driver with bare minimum (no drm card) to allow the firmware
  * to be flashed through mei and collect telemetry. The driver's probe flow is modified
  * such that it enters survivability mode when pcode initialization is incomplete and boot status
- * denotes a failure. The driver then populates the survivability_mode PCI sysfs indicating
- * survivability mode and provides additional information required for debug
+ * denotes a failure.
  *
- * KMD exposes below admin-only readable sysfs in survivability mode
+ * Survivability mode can also be entered manually using the survivability mode attribute available
+ * through configfs which is beneficial in several usecases. It can be used to address scenarios
+ * where pcode does not detect failure or for validation purposes. It can also be used in
+ * In-Field-Repair (IFR) to repair a single card without impacting the other cards in a node.
  *
- * device/survivability_mode: The presence of this file indicates that the card is in survivability
- * mode. Also, provides additional information on why the driver entered
- * survivability mode.
+ * Use below command enable survivability mode manually::
  *
- * Capability Information - Provides boot status
- * Postcode Information - Provides information about the failure
- * Overflow Information - Provides history of previous failures
- * Auxiliary Information - Certain failures may have information in
- * addition to postcode information
+ * # echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
+ *
+ * Refer :ref:`xe_configfs` for more details on how to use configfs
+ *
+ * Survivability mode is indicated by the below admin-only readable sysfs which provides additional
+ * debug information::
+ *
+ * /sys/bus/pci/devices/<device>/surivability_mode
+ *
+ * Capability Information:
+ * Provides boot status
+ * Postcode Information:
+ * Provides information about the failure
+ * Overflow Information
+ * Provides history of previous failures
+ * Auxiliary Information
+ * Certain failures may have information in addition to postcode information
  */
 
 static u32 aux_history_offset(u32 reg_value)
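
The flow described in the new comment (request survivability mode through configfs, then detect it through the PCI sysfs file) could be scripted roughly as follows. This is only a sketch, not part of the commit: the function names and the `SYSFS_ROOT` / `CONFIGFS_ROOT` override variables are hypothetical, added so the logic can be tried against a scratch directory. It also assumes the sysfs attribute is named `survivability_mode`, matching the configfs path, although the added comment spells it `surivability_mode`.

```shell
#!/bin/sh
# Sketch only: function names and the *_ROOT overrides are hypothetical.

# Request survivability mode for a PCI device (e.g. 0000:03:00.0) by
# writing 1 to its configfs attribute, mirroring the echo command in the
# kernel-doc comment. Fails if the attribute is missing or not writable.
xe_request_survivability() {
    attr="${CONFIGFS_ROOT:-/sys/kernel/config}/xe/$1/survivability_mode"
    if [ ! -w "$attr" ]; then
        echo "configfs attribute not available: $attr" >&2
        return 1
    fi
    echo 1 > "$attr"
}

# Exit status 0 when the sysfs file exists for the device, which the
# documentation describes as the indicator of survivability mode.
# Assumption: the file is spelled survivability_mode.
xe_in_survivability() {
    [ -e "${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/survivability_mode" ]
}
```

A caller would typically run `xe_in_survivability 0000:03:00.0 && cat /sys/bus/pci/devices/0000:03:00.0/survivability_mode` as root, since the documentation describes the file as admin-only readable.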
