From patchwork series 376667
Fox Snowpatch committed Oct 9, 2023
1 parent eddc90e commit 4d99dc0
Showing 6 changed files with 308 additions and 233 deletions.
12 changes: 12 additions & 0 deletions Documentation/ABI/testing/sysfs-kernel-fadump
@@ -38,3 +38,15 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:    read only
                 Provide information about the amount of memory reserved by
                 FADump to save the crash dump in bytes.
+What:           /sys/kernel/fadump/hotplug_ready
+Date:           Sep 2023
+Contact:        linuxppc-dev@lists.ozlabs.org
+Description:    read only
+                The Kdump scripts utilize udev rules to monitor memory add/remove
+                events, ensuring that FADUMP is automatically re-registered when
+                system memory changes occur. This re-registration was necessary
+                to update the elfcorehdr, which describes the system memory to the
+                second kernel. Now, if this sysfs node holds a value of 1, it
+                indicates to userspace that FADUMP does not require re-registration
+                since the elfcorehdr is now generated in the second kernel.
+User:           kexec-tools
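
Reviewer note: the new node is meant to let userspace skip FADump re-registration on memory hotplug. A minimal sketch of such a check follows. The helper name `fadump_hotplug_ready()` and its return convention are hypothetical and are not taken from kexec-tools or from this patch; only the sysfs path comes from the ABI entry above.

```c
#include <stdio.h>

/* Path taken from the ABI entry above. */
#define FADUMP_HOTPLUG_READY "/sys/kernel/fadump/hotplug_ready"

/*
 * Hypothetical helper, not part of kexec-tools or the kernel.
 * Returns 1 if the first byte of the node is '1', 0 if the node holds
 * anything else, and -1 if the node cannot be opened or is empty
 * (for example on a kernel that predates this patch).
 */
static int fadump_hotplug_ready(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	if (c == EOF)
		return -1;
	return c == '1';
}
```

A hotplug handler could call `fadump_hotplug_ready(FADUMP_HOTPLUG_READY)` and re-register only when the result is not 1, which keeps the previous behaviour on kernels that lack the node.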
91 changes: 42 additions & 49 deletions Documentation/powerpc/firmware-assisted-dump.rst
@@ -134,12 +134,12 @@ that are run. If there is dump data, then the
 memory is held.

 If there is no waiting dump data, then only the memory required to
-hold CPU state, HPTE region, boot memory dump, FADump header and
-elfcore header, is usually reserved at an offset greater than boot
-memory size (see Fig. 1). This area is *not* released: this region
-will be kept permanently reserved, so that it can act as a receptacle
-for a copy of the boot memory content in addition to CPU state and
-HPTE region, in the case a crash does occur.
+hold CPU state, HPTE region, boot memory dump, and FADump header is
+usually reserved at an offset greater than boot memory size (see Fig. 1).
+This area is *not* released: this region will be kept permanently
+reserved, so that it can act as a receptacle for a copy of the boot
+memory content in addition to CPU state and HPTE region, in the case
+a crash does occur.

 Since this reserved memory area is used only after the system crash,
 there is no point in blocking this significant chunk of memory from
@@ -153,22 +153,22 @@ that were present in CMA region::

   o Memory Reservation during first kernel

-  Low memory                                             Top of memory
-  0    boot memory size   |<--- Reserved dump area --->|       |
-  |           |           |   Permanent Reservation    |       |
-  V           V           |                            |       V
-  +-----------+-----/ /---+---+----+-------+-----+-----+----+--+
-  |           |           |///|////|  DUMP | HDR | ELF |////|  |
-  +-----------+-----/ /---+---+----+-------+-----+-----+----+--+
-        |                   ^    ^     ^      ^           ^
-        |                   |    |     |      |           |
-        \                  CPU HPTE    /      |           |
-         ------------------------------       |           |
-      Boot memory content gets transferred    |           |
-      to reserved area by firmware at the     |           |
-      time of crash.                          |           |
-                                        FADump Header     |
-                                         (meta area)      |
+  Low memory                                                Top of memory
+  0    boot memory size   |<------ Reserved dump area ----->|     |
+  |           |           |      Permanent Reservation      |     |
+  V           V           |                                 |     V
+  +-----------+-----/ /---+---+----+-----------+-------+----+-----+
+  |           |           |///|////|    DUMP   |  HDR  |////|     |
+  +-----------+-----/ /---+---+----+-----------+-------+----+-----+
+        |                   ^    ^       ^         ^      ^
+        |                   |    |       |         |      |
+        \                  CPU HPTE      /         |      |
+         --------------------------------          |      |
+      Boot memory content gets transferred         |      |
+      to reserved area by firmware at the          |      |
+      time of crash.                               |      |
+                                           FADump Header  |
+                                            (meta area)   |
                                                           |
                                                           |
                       Metadata: This area holds a metadata structure whose
@@ -186,20 +186,33 @@ that were present in CMA region::
   0    boot memory size                                        |
   |           |<------------ Crash preserved area ------------>|
   V           V           |<--- Reserved dump area --->|       |
-  +-----------+-----/ /---+---+----+-------+-----+-----+----+--+
-  |           |           |///|////|  DUMP | HDR | ELF |////|  |
-  +-----------+-----/ /---+---+----+-------+-----+-----+----+--+
-        |                                           |
-        V                                           V
-   Used by second                              /proc/vmcore
-   kernel to boot
+  +----+---+--+-----/ /---+---+----+-------+-----+-----+-------+
+  |    |ELF|  |           |///|////|  DUMP | HDR |/////|       |
+  +----+---+--+-----/ /---+---+----+-------+-----+-----+-------+
+  |    |   |  |                            |           |
+  -----    ------------------------------  ---------------
+    \              |                          |
+      \            |                          |
+        \          |                          |
+          \        |    ----------------------------
+            \      |   /
+              \    |  /
+                \  | /
+                  /proc/vmcore


   +---+
   |///| -> Regions (CPU, HPTE & Metadata) marked like this in the above
   +---+    figures are not always present. For example, OPAL platform
            does not have CPU & HPTE regions while Metadata region is
            not supported on pSeries currently.
+
+  +---+
+  |ELF| -> elfcorehdr, it is created in second kernel after crash.
+  +---+
+
+  Note: Memory from 0 to the boot memory size is used by second kernel
+
                     Fig. 2


@@ -353,26 +366,6 @@ TODO:
  - Need to come up with the better approach to find out more
    accurate boot memory size that is required for a kernel to
    boot successfully when booted with restricted memory.
- - The FADump implementation introduces a FADump crash info structure
-   in the scratch area before the ELF core header. The idea of introducing
-   this structure is to pass some important crash info data to the second
-   kernel which will help second kernel to populate ELF core header with
-   correct data before it gets exported through /proc/vmcore. The current
-   design implementation does not address a possibility of introducing
-   additional fields (in future) to this structure without affecting
-   compatibility. Need to come up with the better approach to address this.
-
-   The possible approaches are:
-
-     1. Introduce version field for version tracking, bump up the version
-        whenever a new field is added to the structure in future. The version
-        field can be used to find out what fields are valid for the current
-        version of the structure.
-     2. Reserve the area of predefined size (say PAGE_SIZE) for this
-        structure and have unused area as reserved (initialized to zero)
-        for future field additions.
-
-   The advantage of approach 1 over 2 is we don't need to reserve extra space.

 Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

24 changes: 23 additions & 1 deletion arch/powerpc/include/asm/fadump-internal.h
@@ -42,7 +42,25 @@ static inline u64 fadump_str_to_u64(const char *str)

 #define FADUMP_CPU_UNKNOWN (~((u32)0))

-#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPINF")
+/*
+ * The introduction of new fields in the fadump crash info header has
+ * led to a change in the magic key, from `FADMPINF` to `FADMPSIG`.
+ * This alteration ensures backward compatibility, enabling the kernel
+ * with the updated fadump crash info to handle kernel dumps from older
+ * kernels.
+ *
+ * To prevent the need for further changes to the magic number in the
+ * event of future modifications to the fadump header, a version field
+ * has been introduced to track the fadump crash info header version.
+ *
+ * Historically, there was no connection between the magic number and
+ * the fadump crash info header version. However, moving forward, the
+ * `FADMPINF` magic number in header will be treated as version 0, while
+ * the `FADMPSIG` magic number in header will include a version field to
+ * determine its version.
+ */
+#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPSIG")
+#define FADUMP_VERSION 1

 /* fadump crash info structure */
 struct fadump_crash_info_header {
@@ -51,6 +69,10 @@ struct fadump_crash_info_header {
 	u32 crashing_cpu;
 	struct pt_regs regs;
 	struct cpumask cpu_mask;
+	u32 version;
+	u64 elfcorehdr_size;
+	u64 vmcoreinfo_raddr;
+	u64 vmcoreinfo_size;
 };

 struct fadump_memory_range {
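
Reviewer note: the header comment in this file describes a two-level scheme in which the legacy `FADMPINF` magic implies version 0 and the new `FADMPSIG` magic defers to the `version` field. The userspace sketch below illustrates that rule only. `str_to_u64()` is a stand-in that assumes `fadump_str_to_u64()` packs up to eight ASCII bytes with the first byte most significant, and `crash_info_version()` is a hypothetical name; neither is code from this patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the kernel's fadump_str_to_u64(). Assumption: it packs up
 * to eight ASCII bytes into a u64, first byte in the most significant
 * position, padding short strings with zero bytes.
 */
static uint64_t str_to_u64(const char *str)
{
	uint64_t val = 0;
	size_t i;

	for (i = 0; i < sizeof(val); i++)
		val = *str ? (val << 8) | (unsigned char)*str++ : val << 8;
	return val;
}

/*
 * Hypothetical helper: map a crash info header's magic and version field
 * to an effective header version, following the comment in the diff.
 * Returns -1 when the magic matches neither known value.
 */
static int crash_info_version(uint64_t magic, uint32_t version_field)
{
	if (magic == str_to_u64("FADMPINF"))
		return 0;	/* legacy header: the version field does not exist */
	if (magic == str_to_u64("FADMPSIG"))
		return (int)version_field;
	return -1;
}
```

Under this reading, a future layout change would only bump `FADUMP_VERSION` and leave the `FADMPSIG` magic untouched, so a reader can branch on the returned version to decide which fields are valid.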
