The History and Future of Core Dumps in FreeBSD

Abstract

Crash dumps, also known as core dumps, have been a part of BSD since its beginnings in Research UNIX. Though 38 years have passed since doadump() came about in UNIX/32V, core dumps are still needed and utilized in much the same way they were then. However, as underlying assumptions about the ratio of swap to RAM have proven inappropriate for modern systems, several extensions have been made by those who needed core dumps on very large servers, or very small embedded systems.

We begin with a background on what core dumps are and why operators might need them. Several timelines are then used to characterize the changing nature of the core dump procedure in FreeBSD, with the side effect of providing a history of architecture support from UNIX v6 through to FreeBSD 12. Following that, the current state of the core dump facility and some of the more common extensions in use are examined.

We conclude with the author’s experience porting Illumos’ ability to dump to swap on a ZFS zvol to FreeBSD and what that provides the operator.

In addition a complete history of core dumps in UNIX and BSD was produced as research for this paper and can be found in the appendix.

1 Introduction

The BSD core dump facility performs a simple yet vital service to the operator: it preserves a copy of the contents of system memory at the time of a fatal error for later debugging.

This copy or “dump” can be a machine readable form of the complete contents of system memory, or just the set of kernel pages that are active at the time of the crash. There is also support for dumping a less complete but human readable debugger scripting output.

Throughout the history of UNIX operating systems, different methods have been used to produce a core dump. In the earliest UNIXes magnetic tape was the only supported dump device, but when hard disk support matured, swap space was used, obviating the need for changing out tapes before a dump [2]. Modern and embedded systems continue to introduce new constraints that have motivated newer methods of exfiltrating a core dump from a faltering kernel.

The FreeBSD variant of the BSD operating system has introduced gradual extensions to the core dumping facility. FreeBSD 6.2 introduced “minidumps”, a subset of a full dump that consists only of active kernel memory. FreeBSD 7.1’s textdump(4) facility produces dumps that consist of the result of debugger commands input interactively in DDB or via script [11]. FreeBSD 12-CURRENT introduced support for public-key cryptographic encryption of core dumps.

Though not in the main source tree, compressed dumps and the ability to dump to a remote network device exist and function. While promising, these extensions have been inconsistent in their integration and interoperability.

Another BSD-derived OS, Mac OS X, has also introduced similar compression and network dumping features into its kernel, albeit with a distinct pedigree from FreeBSD [10][12].

The following paper will provide a historical survey of the dump facility itself, from its introduction in UNIX to its current form in modern BSDs and BSD derived operating systems. We will also explore these core dump extensions, associated tools, and describe an active effort to fully modularize them, allowing the operator to enable one or more of them simultaneously.

2 Motivation

In UNIX and early BSDs, core dumps were originally written to magnetic tape; since at least 3BSD this has been superseded by dumping to a swap partition on a hard disk. In the decades since, increases in physical system memory and swap partition size have loosely tracked increases in available persistent storage, allowing this paradigm to remain in use.

However, recent advances in commodity system hardware have upended the traditional memory to disk space ratio, with systems now routinely utilizing 1 TB or more of physical memory whilst running on less than 256 GB of solid state disk. Given that the kernel memory footprint has also grown in size, the assumption that disk space would always allow for a swap partition large enough for a core dump has proved to be inaccurate. This change has spurred development of several extensions to the core dumping facility, including compressed dumping to swap and dumping over the network to a server with enough disk space for modern core dumps. Because dumps contain all the contents of memory, any sensitive information in flight at the time of a crash appears in the dump. For this reason encrypted dumps have recently been added to FreeBSD [13].

While dealing with the above problems, the author and his colleagues became closely familiar with the state of the core dump code and its associated documentation. As users of the core dump code, they felt a need for more flexibility and extensibility in the core dump routines of FreeBSD. The author intends to provide a basis for the argument that the core dump code should be modularized for the flexibility that provides to operators.

In addition, it is hoped that the information herein will be of use in informing further work on core dumps. Failing that, we hope it is interesting.

3 The Present

3.1 Core Dumps in FreeBSD

3.1.1 Quick Background

While reading this paper you may wish to take a crash dump on your system and play around with the features discussed. The following is a quick “crash” course.

First, configure the system dump device in /etc/rc.conf or with dumpon(8) [29]. The easiest way is to set dumpdev="AUTO" in /etc/rc.conf, which sets the dump device to the first configured swap device; make sure your swap partition is large enough to hold a core dump. If you use dumpon(8) directly, run swapinfo to find a suitable partition. Next, if you are using the default dumpdir, make sure it exists and set its permissions accordingly. Finally, to generate a kernel dump you will need to panic your kernel. There are several ways to do this, including writing a program that calls panic(9), using dtrace to call panic, or, simplest of all, setting the sysctl debug.kdb.panic=1. Note that this will crash and reboot your system.

For those who prefer shell to English:

# mkdir /var/crash
# chmod 700 /var/crash
# swapinfo
# dumpon -v /dev/da0p2
# sysctl debug.kdb.panic=1

If your dumpdir is configured correctly, savecore(8) will run automatically upon reboot. If not, run savecore(8) manually.

3.1.2 Full Core Dump Procedure

When a UNIX-like system such as FreeBSD encounters an unrecoverable and unexpected error the kernel will “panic”. Though the word panic has connotations of irrationality, the function panic(9) maintains composure while it shuts down the running system and attempts to save a core dump to a configured dump device.

What follows is a thorough description of the FreeBSD core dump routine (as of FreeBSD 11-RELEASE) starting with doadump() in sys/kern/kern_shutdown.c.

doadump() is called by kern_reboot(), which shuts down “the system cleanly to prepare for reboot, halt, or power off.” [4] kern_reboot() calls doadump() if the RB_DUMP flag is set and the system is neither “cold” nor already creating a core dump. doadump() takes a boolean telling it whether or not to take a “text dump”, a form of dump available if the online kernel debugger, DDB, is built into the running kernel. doadump() returns an error code if the system is already creating a dump or if the dumper is NULL; otherwise it returns whatever error code dumpsys() reports.

doadump(boolean_t textdump) starts the core dump procedure by saving the current context with a call to savectx(). At this point, if one is configured, a “text dump” is carried out. Otherwise a core dump is invoked using dumpsys(), passing it a struct dumperinfo. dumpsys() is defined on a per-architecture basis, which allows different architectures to set up their dump structure differently. dumpsys() calls dumpsys_generic(), passing along the struct dumperinfo it was called with. dumpsys_generic() is defined in sys/kern/kern_dump.c and is the foundation of the core dump procedure.
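The decisions described in the last two paragraphs can be modelled in userland C. This is only a sketch: the type and function names prefixed with model_ are hypothetical stand-ins rather than kernel definitions, and the specific errno values (EBUSY for a dump already in progress, ENXIO for a missing dumper) are assumptions, since the text above only says that an error code is returned.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's dumper state. */
struct dumperinfo_model {
	int (*dumper)(void);	/* NULL when no dump device is configured */
};

static int dumping;		/* non-zero while a dump is in progress */
static bool textdump_pending;	/* set once a textdump has been scheduled */

/* Two toy dumpers used to exercise the model. */
int model_dumper_ok(void) { return (0); }
int model_dumper_fail(void) { return (EIO); }

int model_textdump(struct dumperinfo_model *di) { (void)di; return (0); }
int model_dumpsys(struct dumperinfo_model *di) { return (di->dumper()); }

/* Refuse, take a textdump, or take a core dump, as described above. */
int
model_doadump(struct dumperinfo_model *di, bool textdump)
{
	int error = 0;

	if (dumping)
		return (EBUSY);		/* assumed code: already dumping */
	if (di->dumper == NULL)
		return (ENXIO);		/* assumed code: no dumper set */

	/* The kernel saves the CPU context here with savectx(). */
	dumping++;
	if (textdump && textdump_pending)
		(void)model_textdump(di);
	else
		error = model_dumpsys(di);	/* passed through to the caller */
	dumping--;
	return (error);
}
```

Calling model_doadump() with a dumperinfo_model whose dumper is NULL returns the "no dumper" error without touching the dumping counter, which is the property that lets the caller try again once a dump device has been configured.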

The dumpsys_generic() procedure has several main steps, listed below. If an error occurs at any point, control jumps to the failure cleanup at the end of the procedure.

  1. Fill in the ELF header.
  2. Calculate the dump size.
  3. Determine if the dump device is large enough.
  4. Fill in kernel dump header
  5. Begin Dump
    1. Leader
    2. ELF Header
    3. Program Headers
    4. Memory Chunks
    5. Trailer
  6. End Dump

After this is done, the kernel passes a zero-length block to dump_write() to “Signal completion, signoff and exit stage left.”, and our core dump is complete.
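Steps 2 and 3 of the list above amount to simple arithmetic, which the following sketch illustrates. The struct and function names are hypothetical, the header sizes are passed in rather than taken from real kernel structures, and placing the dump at the end of the device is an assumption of this sketch rather than something stated above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one physical memory chunk. */
struct mem_chunk {
	uint64_t start;		/* physical start address */
	uint64_t size;		/* length in bytes */
};

/* Step 2: ELF header, one program header per chunk, then the chunks. */
uint64_t
dump_payload_size(const struct mem_chunk *chunks, size_t n,
    uint64_t ehdr_size, uint64_t phdr_size)
{
	uint64_t total = ehdr_size + n * phdr_size;

	for (size_t i = 0; i < n; i++)
		total += chunks[i].size;
	return (total);
}

/* Step 3: the payload plus leader and trailer must fit on the device. */
bool
dump_fits(uint64_t media_size, uint64_t payload, uint64_t kdh_size)
{
	uint64_t needed = payload + 2 * kdh_size;

	/* The first comparison guards against 64-bit wraparound. */
	return (needed >= payload && needed <= media_size);
}

/* Assumed placement: the trailer ends exactly at the end of the device. */
uint64_t
dump_start_offset(uint64_t media_offset, uint64_t media_size,
    uint64_t payload, uint64_t kdh_size)
{
	return (media_offset + media_size - payload - 2 * kdh_size);
}
```

A caller would run dump_fits() before dump_start_offset(), since the subtraction in the latter is only meaningful once the dump is known to fit.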

3.1.3 Full Core Dump Contents

The canonical form of core dump is the “full dump”. Full dumps are created via the doadump() code path, which starts in sys/kern/kern_shutdown.c. The resulting dump is an ELF formatted binary written to a configured swap partition. The following is based on the amd64 code and is the result of dumpsys_generic(). Other architectures produce a dump in a similar format but with different values.

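| Field           | Description             |
|-----------------+-------------------------|
| Leader          | See Table tab:kdhheader |
| ELF Header      | See Table tab:elfheader |
| Program Headers |                         |
| Memory Chunks   |                         |
| Trailer         | See Table tab:kdhheader |

Table tab:kdhheader

| Field               | Value                          |
|---------------------+--------------------------------|
| magic               | “FreeBSD Kernel Dump”          |
| architecture        | “amd64”                        |
| version             | 1 (kdh format version)         |
| architectureversion | 2                              |
| dumplength          | varies, excludes headers       |
| dumptime            | current time                   |
| blocksize           | block size                     |
| hostname            | hostname                       |
| versionstring       | version of OS                  |
| panicstring         | panic(9) message               |
| parity              | parity bits                    |

Table tab:elfheader

| Field               | Value                  |
|---------------------+------------------------|
| e_ident[EI_MAG0]    | 0x7f                   |
| e_ident[EI_MAG1]    | 'E'                    |
| e_ident[EI_MAG2]    | 'L'                    |
| e_ident[EI_MAG3]    | 'F'                    |
| e_ident[EI_CLASS]   | 2 (64-bit)             |
| e_ident[EI_DATA]    | 1 (little endian)      |
| e_ident[EI_VERSION] | 1 (ELF version 1)      |
| e_ident[EI_OSABI]   | 255                    |
| e_type              | 4 (core)               |
| e_machine           | 62 (x86-64)            |
| e_phoff             | size of this header    |
| e_flags             | 0                      |
| e_ehsize            | size of this header    |
| e_phentsize         | size of program header |
| e_shentsize         | size of section header |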
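Two of the tabulated fields lend themselves to a short illustration: the ELF identification bytes and the parity field of the kernel dump header. The sketch below fills a 16-byte identification array with the amd64 values listed above and computes a parity word over an arbitrary header buffer. The function names are hypothetical, and treating the parity as an XOR over the header's 32-bit words is an assumption of this sketch, not something the table states.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill the ELF identification bytes with the amd64 values tabulated above. */
void
fill_elf_ident(unsigned char ident[16])
{
	memset(ident, 0, 16);
	ident[0] = 0x7f;	/* EI_MAG0 */
	ident[1] = 'E';		/* EI_MAG1 */
	ident[2] = 'L';		/* EI_MAG2 */
	ident[3] = 'F';		/* EI_MAG3 */
	ident[4] = 2;		/* EI_CLASS: 64-bit */
	ident[5] = 1;		/* EI_DATA: little endian */
	ident[6] = 1;		/* EI_VERSION: ELF version 1 */
	ident[7] = 255;		/* EI_OSABI, as listed in the table */
}

/*
 * Assumed parity scheme: XOR of every 32-bit word in the header.
 * Trailing bytes beyond a multiple of four are ignored.
 */
uint32_t
header_parity(const void *hdr, size_t hdr_len)
{
	const unsigned char *p = hdr;
	uint32_t parity = 0, word;

	for (size_t i = 0; i + sizeof(word) <= hdr_len; i += sizeof(word)) {
		memcpy(&word, p + i, sizeof(word));	/* avoids alignment traps */
		parity ^= word;
	}
	return (parity);
}
```

Under this scheme a writer computes the parity while the parity field is still zero and then stores the result in that field; a reader that XORs the complete header, parity field included, should then obtain zero for an intact header.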

3.1.3.1 Notes

3.1.4 Minidump Procedure and Contents

FreeBSD 6.2 introduced a new form of core dump termed “minidumps”. Instead of dumping all of physical memory to guarantee all relevant information is archived, minidumps dump “only memory pages in use by the kernel.” [14]

Minidumps use a custom format in lieu of ELF. The format of a modern minidump (version 2) can be found in table tab:minidumpformat.

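Table tab:minidumpformat

| Field                 | Description                  |
|-----------------------+------------------------------|
| Leader                | See Table tab:kdhheader      |
| Minidump Header       | See Table tab:minidumpheader |
| Message Buffer        | message buffer contents      |
| Bitmap                | map of kernel pages          |
| Kernel Page Directory |                              |
| Memory Chunks         |                              |
| Trailer               | See Table tab:kdhheader      |

Table tab:minidumpheader

| Field      | Value                        |
|------------+------------------------------|
| magic      | “minidump FreeBSD/amd64”     |
| version    | 2                            |
| msgbufsize | size of message buffer       |
| bitmapsize | size of bitmap               |
| pmapsize   | size of physical memory map  |
| kernbase   | ptr to start of kernel mem   |
| dmapbase   | ptr to start of direct map   |
| dmapend    | ptr to end of direct map     |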

The minidump procedure is in general similar to that of the full dump, but with the added step of creating a bitmap that indicates which pages are to become part of the dump. The minidump procedure detailed here is based on the amd64 code found in sys/amd64/amd64/minidump_machdep.c [15], but it is nearly identical for other architectures.

  1. Create bitmap describing pages to be dumped.
  2. Calculate the dump size.
  3. Determine if the dump device is large enough.
  4. Fill in minidump header
  5. Fill in kernel dump header
  6. Begin Dump
    1. Leader
    2. Minidump Header
    3. Message Buffer
    4. Bitmap
    5. Kernel Page Directory
    6. Memory Chunks
    7. Trailer
  7. End Dump

The minidump will fail for any of the reasons a full dump will, and also if the dump map grows while the dump is being created. In that case the routine retries up to dump_retry_count times; the default is 5, and it can be changed with the sysctl machdep.dump_retry_count.
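The bitmap and the retry behaviour can be illustrated with a small userland model. Everything here is a sketch under stated assumptions: the names prefixed with model_ are hypothetical, the toy machine has 1024 pages of 4 KB, the size calculation ignores any page rounding the kernel may apply, and MODEL_MAP_GREW is an invented error value standing in for whatever the kernel uses to signal that the dump map grew.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12				/* assumed 4 KB pages */
#define MODEL_PAGE_SIZE  ((uint64_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_NPAGES     1024				/* toy physical memory */
#define MODEL_MAP_GREW   1				/* invented error value */

static uint64_t model_bitmap[MODEL_NPAGES / 64];	/* one bit per page */
static int model_attempts;

void
model_dump_add_page(uint64_t pa)
{
	uint64_t pfn = pa >> MODEL_PAGE_SHIFT;

	if (pfn < MODEL_NPAGES)
		model_bitmap[pfn / 64] |= (uint64_t)1 << (pfn % 64);
}

void
model_dump_drop_page(uint64_t pa)
{
	uint64_t pfn = pa >> MODEL_PAGE_SHIFT;

	if (pfn < MODEL_NPAGES)
		model_bitmap[pfn / 64] &= ~((uint64_t)1 << (pfn % 64));
}

/* Count set bits; each iteration of the inner loop clears the lowest one. */
size_t
model_dump_page_count(void)
{
	size_t n = 0;

	for (size_t i = 0; i < MODEL_NPAGES / 64; i++)
		for (uint64_t w = model_bitmap[i]; w != 0; w &= w - 1)
			n++;
	return (n);
}

/* Step 2: headers, message buffer, bitmap, page directory, marked pages. */
uint64_t
model_minidump_size(uint64_t hdr, uint64_t msgbuf, uint64_t pagedir)
{
	return (hdr + msgbuf + sizeof(model_bitmap) + pagedir +
	    model_dump_page_count() * MODEL_PAGE_SIZE);
}

/* Make at most retry_count attempts while the map keeps growing. */
int
model_minidump_retry(int (*attempt)(void), int retry_count)
{
	int error, tries = 0;

	do {
		error = attempt();
	} while (error == MODEL_MAP_GREW && ++tries < retry_count);
	return (error);
}

/* Toy attempt that reports a grown map twice before succeeding. */
int
model_flaky_attempt(void)
{
	return (++model_attempts < 3 ? MODEL_MAP_GREW : 0);
}
```

With a retry count of 5, model_minidump_retry(model_flaky_attempt, 5) succeeds on the third attempt; with a retry count of 2 it gives up and returns the map-grew error.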

3.1.4.1 Notes

r157908 | peter | 2006-04-20 23:24:50 -0500 (Thu, 20 Apr 2006) | 39 lines

Introduce minidumps. Full physical memory crash dumps are still available via the debug.minidump sysctl and tunable.

Traditional dumps store all physical memory. This was once a good thing when machines had a maximum of 64M of ram and 1GB of kvm. These days, machines often have many gigabytes of ram and a smaller amount of kvm. libkvm+kgdb don’t have a way to access physical ram that is not mapped into kvm at the time of the crash dump, so the extra ram being dumped is mostly wasted.

Minidumps invert the process. Instead of dumping physical memory in in order to guarantee that all of kvm’s backing is dumped, minidumps instead dump only memory that is actively mapped into kvm.

amd64 has a direct map region that things like UMA use. Obviously we cannot dump all of the direct map region because that is effectively an old style all-physical-memory dump. Instead, introduce a bitmap and two helper routines (dump_add_page(pa) and dump_drop_page(pa)) that allow certain critical direct map pages to be included in the dump. uma_machdep.c’s allocator is the intended consumer.

Dumps are a custom format. At the very beginning of the file is a header, then a copy of the message buffer, then the bitmap of pages present in the dump, then the final level of the kvm page table trees (2MB mappings are expanded into a 4K page mappings), then the sparse physical pages according to the bitmap. libkvm can now conveniently access the kvm page table entries.

Booting my test 8GB machine, forcing it into ddb and forcing a dump leads to a 48MB minidump. While this is a best case, I expect minidumps to be in the 100MB-500MB range. Obviously, never larger than physical memory of course.

minidumps are on by default. It would want be necessary to turn them off if it was necessary to debug corrupt kernel page table management as that would mess up minidumps as well.

Both minidumps and regular dumps are supported on the same machine.

3.1.5 Textdump Procedure and Contents

FreeBSD added a new type of dump, the textdump(4). “The textdump facility allows the capture of kernel debugging information to disk in a human-readable rather than the machine-readable form normally used with kernel memory dumps and minidumps.” [18] If doadump() in kern_shutdown.c is passed a boolean value of ‘true’, the minidump or full dump is skipped and textdump_dumpsys() in sys/ddb/db_textdump.c is invoked instead.

Since textdumps are text rather than binary data, they are written out in the ustar tar file format. The archive contains the files listed in table tab:textdumpformat [19]. Several sysctls select which files an operator wishes to include; these are listed in textdump(4).

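Table tab:textdumpformat

| File        | Description             |
|-------------+-------------------------|
| Leader      | See Table tab:kdhheader |
| version.txt | Kernel version string   |
| panic.txt   | Kernel panic message    |
| msgbuf.txt  | Kernel message buffer   |
| config.txt  | Kernel configuration    |
| ddb.txt     | Captured DDB output     |
| Trailer     | See Table tab:kdhheader |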

The textdump(4) procedure is similar in its setup to the other types of dumps, but it differs in several ways because the dump is a ustar archive containing several text files instead of a binary format containing kernel pages.

  1. Check if minimum amount of space is available on dump device
  2. Set start of dump at the end of the swap partition minus the size of the dump header
  3. Fill in kernel dump header
  4. Begin Dump
    1. Trailer
    2. ddb.txt
    3. config.txt
    4. msgbuf.txt
    5. panic.txt
    6. version.txt
    7. Header
    8. Re-write Trailer with correct size
  5. End Dump

If an error occurs during this procedure, it is reported. Otherwise, dump_write() is told to write a zero-length block to signify the end of the dump, success is reported, and control returns to the rest of the machine-independent dump code.
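Each text file in the archive is preceded by a 512-byte ustar header. The sketch below builds such a header for a regular file using the standard ustar field offsets. The function names are hypothetical, and the mode, owner and timestamp values are placeholders chosen for the sketch, not the values the textdump code writes.

```c
#include <stdio.h>
#include <string.h>

#define USTAR_BLOCK 512

/* Byte sum of the header with the 8-byte checksum field read as spaces. */
unsigned int
ustar_checksum(const char hdr[USTAR_BLOCK])
{
	unsigned int sum = 0;

	for (int i = 0; i < USTAR_BLOCK; i++)
		sum += (i >= 148 && i < 156) ? ' ' : (unsigned char)hdr[i];
	return (sum);
}

/* Build a header for a regular file; numeric fields are octal text. */
void
ustar_fill_header(char hdr[USTAR_BLOCK], const char *name, unsigned long size)
{
	memset(hdr, 0, USTAR_BLOCK);
	snprintf(hdr, 100, "%s", name);			/* name */
	snprintf(hdr + 100, 8, "%07o", 0600u);		/* mode (placeholder) */
	snprintf(hdr + 108, 8, "%07o", 0u);		/* uid (placeholder) */
	snprintf(hdr + 116, 8, "%07o", 0u);		/* gid (placeholder) */
	snprintf(hdr + 124, 12, "%011lo", size);	/* size */
	snprintf(hdr + 136, 12, "%011lo", 0ul);		/* mtime (placeholder) */
	hdr[156] = '0';					/* typeflag: regular file */
	memcpy(hdr + 257, "ustar", 6);			/* magic, including NUL */
	memcpy(hdr + 263, "00", 2);			/* version */

	/* Six octal digits, a NUL, then a space, as is conventional. */
	snprintf(hdr + 148, 8, "%06o", ustar_checksum(hdr));
	hdr[155] = ' ';
}

/* File data is padded with zeroes to a whole number of 512-byte blocks. */
unsigned long
ustar_padded_size(unsigned long size)
{
	return ((size + USTAR_BLOCK - 1) / USTAR_BLOCK * USTAR_BLOCK);
}
```

Because the checksum treats its own field as spaces, ustar_checksum() can be called again on a finished header to verify it against the stored octal value.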

3.1.5.1 Notes

  • https://lists.freebsd.org/pipermail/freebsd-current/2007-December/081626.html
  • textdump email

    Dear all,

    I’ve received a few textdump-related questions that I thought I’d share my answers to.

    (1) What information is in a textdump?

    The textdump is stored as a tarfile with several subfiles in it:

    config.txt - Kernel configuration, if compiled into kernel
    ddb.txt - Captured DDB output, if present
    msgbuf.txt - Kernel message buffer
    panic.txt - Kernel panic message, if there was a panic
    version.txt - Kernel version string

    It is easy to add new files to textdumps, so if there’s some easily extractable kernel state that you feel should go in there, drop me an e-mail and/or send a patch.

    (2) Is there any information in a textdump that can’t be acquired using kgdb and other available dump analysis tools?

    In principle no, as normal dumps include all kernel memory, and textdumps operate by inspecting kernel memory using DDB, capturing only small but presumably relevant parts. However, there are some important differences in approach that mean that textdumps can be used in ways that regular dumps can’t easily be:

    • DDB textdumps are very small. Including a full debugging session, kernel message buffer, and kernel configuration, my textdumps are frequently around 100k uncompressed. This makes it possible to use them on very small machines, store them for an extended period, e-mail them around, etc, in a way that you can’t currently do with kernel memory dumps. This improved usability will (hopefully) improve our bug and crash management.

    • DDB is a specialized debugging tool with intimate knowledge of the kernel, and there are types of data trivially extracted with DDB that are awkward or quite difficult to extract using kgdb or other currently available dump analysis tools. Locking, waiting, and process information are examples of where automatic extraction is currently only possible with DDB, and one of the reasons many developers prefer to begin any diagnosis with an interactive DDB session.

    • DDB textdumps can be used without the exact source tree, kernel configuration, built kernel, and debug symbols, as they interpret rather than save the pages of memory. They’re even an architecture-independent file format so you don’t need a cross-debugger. Having that additional context is useful (ability to map symbol+offset to line of code), but you can actually go a remarkable way without it, especially looking at the results in a PR potentially years later.

    (3) What do I lose by using textdumps?

    To be clear, there are also some important things that textdumps can’t do – principally, a textdump doesn’t contain all kernel memory, so your textdump output is all you have. If you need to extract detailed structure information for something DDB doesn’t understand, or that you don’t think of in advance or during a DDB session, then there’s nothing to fall back on except configuring a textdump or regular dump and waiting for the panic to happen again.

    (4) When should I use textdumps?

    Minidumps remain the default in 7.x and 8.x, and full dumps remain the default in 6.x and earlier. Textdumps must be specifically enabled by the administrator to be used.

    DDB is an excellent live debugging tool whose use has been limited to situations where there is an accessible video console, or more ideally serial or firewire console to a second box, and generally requiring an experienced developer to be available to drive debugging. There are many problems that can be pretty much instantly understood with a couple of DDB commands, so these limitations impacted debugging effectiveness.

    The goal of adding DDB capture output, scripting, and textdumps was to broaden the range of situations in which DDB could be used: now it is usable more easily for post-mortem analysis, no console or second machine is required, and a developer can install, or even e-mail, a script of DDB commands to run automatically. Developers can simply define a few scripts to handle various DDB cases, such as panic, and get a nice debugging bundle to look at later.

    When I’m debugging network stack problems, I typically want a fairly small set of DDB commands to be run by the user, and the output sent back, and now it will go from “Read the chapter on kernel debugging, set up a serial console, run the following commands, copy and paste from your serial console – oh, you don’t have a serial console, perhaps hand-copy these fields or use a digital camera” to “run the following ddb(8) command and when the box reboots, send me the tarball in /var/crash”.

    I anticipate that textdumps will see use when developers are exchanging e-mail with users reporting problems and trying to gather concise summaries of information about a crash with minimum downtime and maximum portability, in embedded environments where dumping kernel memory to flash is tricky, or in order to save a transcript of an interactive DDB session when testing new features locally.

    Another interesting advantage of textdumps is that it’s easy to inspect them for confidential/identifying information and mask or purge it. When someone sends out a kernel memory dump, it potentially contains a lot of sensitive information, and most people (including me) would have difficulty making sure all sensitive information was purged safely.

    (5) I want to collect DDB output, but still need memory dumps – can I do both?

    Yes and no.

    Yes, you can use the DDB output capture buffer and scripting without using a textdump, as the capture buffer is stored in kernel memory. You can print it using kgdb, and we should probably add that capability to ddb(8) also. End your script with “call doadump; reset” but don’t “textdump set”. For example:

    ddb script kdb.enter.panic="capture on;show pcpu;trace;ps;show locks;alltrace;show alllocks;show lockedvnods;call doadump;reset"

    No, because you must pick one of the three dump layouts (dump, minidump, textdump) to write to the swap partition – you can’t write out all three and then decide which to extract later. In principle this could be changed so that we actually write out a textdump section and a full/minidump, but that’s not implemented.

    (6) I have a serial console so don’t need textdumps, can I still use DDB scripting to manage a crash?

    Yes. You can set up scripts in exactly the same way as with textdumps, only omit the textdump bits and end with a “reset” to reboot the system when done. That way you can extract the results from the serial console log. I.e.,

    ddb script kdb.enter.panic="show pcpu;trace;show locks;ps;alltrace;show alllocks;show lockedvnods;reset"

    (7) I’m in DDB and I suddenly realize I want to save the output, and I haven’t configured textdumps. What do I do?

    As with normal dumps, you must previously have configured support for a dump partition. These days, that is done automatically whenever you have swap configured on the box, so unless you’re in single-user mode or don’t have swap configured, you should be able to do the following:

    Schedule a textdump using the “textdump set” command.

    Turn on DDB output capture using “capture on”, run your commands of interest, and turn it off using “capture off”.

    Type “call doadump” to dump memory, and “reset” to reboot.

    (8) The buffer is small, can I pick and choose what DDB output is captured?

    The capture buffer does have a size limit, so you might find you want to explore interactively at first to figure out what information to save. Then you can turn it on and off around output to capture with “capture on” and “capture off”. Each time you turn capture back on, new output is appended after any existing output.

    If you decide you want to clear the buffer, you can use “capture reset” to do that, and you can check the status of the buffer using “capture status”.

    You can also increase the buffer size by setting the debug.ddb.capture.bufsize sysctl to a larger size. The sysctl will automatically round up to the next textdump blocksize.

    (9) Can I continue the kernel after doing a textdump?

    No. As with kernel memory dumps, textdumps invoke the storage controller dumper routine, which may hose up state in the device driver preventing its use after the dump is generated.

    However, if you do plan to continue from DDB, just use DDB output capture without a textdump. You can then extract the contents of the DDB buffer using the debug.ddb.capture.data sysctl.

3.2 Core Dumps in Mac OS X

Mac OS X is capable of creating compressed core dumps and dumping them locally, or over the network using a modified tftpd(8) from FreeBSD called kdumpd(8) [16]. Network dumping “has been present since Mac OS X 10.3 for PowerPC-based Macintosh systems, and since Mac OS X 10.4.7 for Intel-based Macintosh systems.” [10] In addition, dumps over FireWire are supported for situations where the kernel panic is caused by the Ethernet driver or network code.

In xnu/osfmk/kdp/kdp_core.c, Mac OS X gzips its core dump before writing it out to disk; the procedure is otherwise much like the FreeBSD “full dump” procedure, with one major difference besides its features [12]: Mac OS X uses a different executable image format called Mach-O, as opposed to ELF, because OS X runs a hybrid Mach and BSD kernel called XNU [7].

  1. Initialize gzip
  2. Determine where to write dump
    1. If local, determine offset to place file header, panic and core log
    2. If remote, setup buffer for compressed core and packet size
  3. Traverse the pmap for dumpable pages
  4. Fill in Mach-O header
  5. Begin Dump Write/Transmission
    1. Mach-O Header
    2. Information about panicked thread’s state
    3. Information about dump output location
    4. Pad with zeroes to page align
    5. Kernel Pages
    6. Signal Completion with zero length write
    7. Print out Information about Dump
    8. If Local, write out debug log and gzip file header
  6. End Dump Write/Transmission

If an error is detected at any point, return and report the given error message.
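Steps 2.1 and 5.4 above are page-alignment calculations, which can be seen in the do_kern_dump() listing reproduced in the notes below. The following sketch isolates that arithmetic. The function names are hypothetical and the 4096-byte page size is an assumption matching the constants used in the listing.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumed page size */

/* Round up to the next page boundary, the role round_page() plays. */
uint64_t
model_round_page(uint64_t x)
{
	return ((x + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1));
}

/*
 * Local dumps: the file header occupies the first page, the debug log
 * starts at offset 4096, and the gzip stream starts at the first page
 * boundary after the log.
 */
uint64_t
model_gzip_offset(uint64_t log_length)
{
	return ((4096 + log_length + 4095) & ~4095ULL);
}

/* Zero padding written after the Mach-O header to page align the data. */
uint64_t
model_header_padding(uint64_t header_size)
{
	return (model_round_page(header_size) - header_size);
}
```

For example, an empty log places the gzip stream at offset 4096, while a log of a single byte pushes it out to 8192; a Mach-O header that is already a whole number of pages needs no padding.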

3.2.1 Notes

static int
do_kern_dump(kern_dump_output_proc outproc, bool local)
{
    struct kern_dump_preflight_context kdc_preflight;
    struct kern_dump_send_context      kdc_sendseg;
    struct kern_dump_send_context      kdc_send;
    struct kdp_core_out_vars           outvars;
    struct mach_core_fileheader         hdr;
    kernel_mach_header_t mh;
    uint32_t	         segment_count, tstate_count;
    size_t		 command_size = 0, header_size = 0, tstate_size = 0;
    uint64_t	         hoffset, foffset;
    int                  ret;
    char *               log_start;
    uint64_t             log_length;
    uint64_t             new_logs;
    boolean_t            opened;

    opened     = false;
    log_start  = debug_buf_ptr;
    log_length = 0;
    if (log_start >= debug_buf_addr)
    {
	log_length = log_start - debug_buf_addr;
	if (log_length <= debug_buf_size) log_length = debug_buf_size - log_length;
	else log_length = 0;
    }

    if (local)
    {
	if ((ret = (*outproc)(KDP_WRQ, NULL, 0, &hoffset)) != kIOReturnSuccess) {
	    DEBG("KDP_WRQ(0x%x)\n", ret);
	    goto out;
	}
    }
    opened = true;

    // init gzip
    bzero(&outvars, sizeof(outvars));
    bzero(&hdr, sizeof(hdr));
    outvars.outproc = outproc;
    kdp_core_zs.avail_in  = 0;
    kdp_core_zs.next_in   = NULL;
    kdp_core_zs.avail_out = 0;
    kdp_core_zs.next_out  = NULL;
    kdp_core_zs.opaque    = &outvars;
    kdc_sendseg.outvars   = &outvars;
    kdc_send.outvars      = &outvars;

    if (local)
    {
	outvars.outbuf      = NULL;
        outvars.outlen      = 0;
        outvars.outremain   = 0;
	outvars.zoutput     = kdp_core_zoutput;
    	// space for file header & log
    	foffset = (4096 + log_length + 4095) & ~4095ULL;
	hdr.log_offset = 4096;
	hdr.gzip_offset = foffset;
	if ((ret = (*outproc)(KDP_SEEK, NULL, sizeof(foffset), &foffset)) != kIOReturnSuccess) { 
		DEBG("KDP_SEEK(0x%x)\n", ret);
		goto out;
	} 
    }
    else
    {
	outvars.outbuf    = (Bytef *) (kdp_core_zmem + kdp_core_zoffset);
	assert((kdp_core_zoffset + kdp_crashdump_pkt_size) <= kdp_core_zsize);
        outvars.outlen    = kdp_crashdump_pkt_size;
        outvars.outremain = outvars.outlen;
	outvars.zoutput  = kdp_core_zoutputbuf;
    }

    deflateResetWithIO(&kdp_core_zs, kdp_core_zinput, outvars.zoutput);


    kdc_preflight.region_count = 0;
    kdc_preflight.dumpable_bytes = 0;

    ret = pmap_traverse_present_mappings(kernel_pmap,
					 VM_MIN_KERNEL_AND_KEXT_ADDRESS,
					 VM_MAX_KERNEL_ADDRESS,
					 kern_dump_pmap_traverse_preflight_callback,
					 &kdc_preflight);
    if (ret)
    {
	DEBG("pmap traversal failed: %d\n", ret);
	return (ret);
    }

    outvars.totalbytes = kdc_preflight.dumpable_bytes;
    assert(outvars.totalbytes);
    segment_count = kdc_preflight.region_count;

    kern_collectth_state_size(&tstate_count, &tstate_size);

    command_size = segment_count * sizeof(kernel_segment_command_t) + tstate_count * tstate_size;

    header_size = command_size + sizeof(kernel_mach_header_t);

    /*
     *	Set up Mach-O header for currently executing kernel.
     */

    mh.magic = _mh_execute_header.magic;
    mh.cputype = _mh_execute_header.cputype;;
    mh.cpusubtype = _mh_execute_header.cpusubtype;
    mh.filetype = MH_CORE;
    mh.ncmds = segment_count + tstate_count;
    mh.sizeofcmds = (uint32_t)command_size;
    mh.flags = 0;
#if defined(__LP64__)
    mh.reserved = 0;
#endif

    hoffset = 0;	                                /* offset into header */
    foffset = (uint64_t) round_page(header_size);	/* offset into file */

    /* Transmit the Mach-O MH_CORE header, and segment and thread commands 
     */
    if ((ret = kdp_core_stream_output(&outvars, sizeof(kernel_mach_header_t), (caddr_t) &mh) != kIOReturnSuccess))
    {
	DEBG("KDP_DATA(0x%x)\n", ret);
	goto out;
    }

    hoffset += sizeof(kernel_mach_header_t);

    DEBG("%s", local ? "Writing local kernel core..." :
    	    	       "Transmitting kernel state, please wait:\n");

    kdc_sendseg.region_count   = 0;
    kdc_sendseg.dumpable_bytes = 0;
    kdc_sendseg.hoffset = hoffset;
    kdc_sendseg.foffset = foffset;
    kdc_sendseg.header_size = header_size;

    if ((ret = pmap_traverse_present_mappings(kernel_pmap,
					 VM_MIN_KERNEL_AND_KEXT_ADDRESS,
					 VM_MAX_KERNEL_ADDRESS,
					 kern_dump_pmap_traverse_send_seg_callback,
					 &kdc_sendseg)) != kIOReturnSuccess)
    {
	DEBG("pmap_traverse_present_mappings(0x%x)\n", ret);
	goto out;
    }

    hoffset = kdc_sendseg.hoffset;
    /*
     * Now send out the LC_THREAD load command, with the thread information
     * for the current activation.
     */

    if (tstate_size > 0)
    {
	void * iter;
	char tstate[tstate_size];
	iter = NULL;
	do {
	    /*
	     * Now send out the LC_THREAD load command, with the thread information
	     */
	    kern_collectth_state (current_thread(), tstate, tstate_size, &iter);

	    if ((ret = kdp_core_stream_output(&outvars, tstate_size, tstate)) != kIOReturnSuccess) {
		    DEBG("kdp_core_stream_output(0x%x)\n", ret);
		    goto out;
	    }
	}
	while (iter);
    }

    kdc_send.region_count   = 0;
    kdc_send.dumpable_bytes = 0;
    foffset = (uint64_t) round_page(header_size);	/* offset into file */
    kdc_send.foffset = foffset;
    kdc_send.hoffset = 0;
    foffset = round_page_64(header_size) - header_size;
    if (foffset)
    {
	// zero fill to page align
	if ((ret = kdp_core_stream_output(&outvars, foffset, NULL)) != kIOReturnSuccess) {
		DEBG("kdp_core_stream_output(0x%x)\n", ret);
		goto out;
	}
    }

    ret = pmap_traverse_present_mappings(kernel_pmap,
					 VM_MIN_KERNEL_AND_KEXT_ADDRESS,
					 VM_MAX_KERNEL_ADDRESS,
					 kern_dump_pmap_traverse_send_segdata_callback,
					 &kdc_send);
    if (ret) {
	DEBG("pmap_traverse_present_mappings(0x%x)\n", ret);
	goto out;
    }

    if ((ret = kdp_core_stream_output(&outvars, 0, NULL) != kIOReturnSuccess)) {
	DEBG("kdp_core_stream_output(0x%x)\n", ret);
	goto out;
    }

out:
    if (kIOReturnSuccess == ret) DEBG("success\n");
    else                         outvars.zipped = 0;

    DEBG("Mach-o header: %lu\n", header_size);
    DEBG("Region counts: [%u, %u, %u]\n", kdc_preflight.region_count,
					  kdc_sendseg.region_count, 
					  kdc_send.region_count);
    DEBG("Byte counts  : [%llu, %llu, %llu, %lu, %llu]\n", kdc_preflight.dumpable_bytes, 
							   kdc_sendseg.dumpable_bytes, 
							   kdc_send.dumpable_bytes, 
							   outvars.zipped, log_length);
    if (local && opened)
    {
    	// write debug log
    	foffset = 4096;
	if ((ret = (*outproc)(KDP_SEEK, NULL, sizeof(foffset), &foffset)) != kIOReturnSuccess) { 
	    DEBG("KDP_SEEK(0x%x)\n", ret);
	    goto exit;
	} 

	new_logs = debug_buf_ptr - log_start;
	if (new_logs > log_length) new_logs = log_length;
    	
	if ((ret = (*outproc)(KDP_DATA, NULL, new_logs, log_start)) != kIOReturnSuccess)
	{ 
	    DEBG("KDP_DATA(0x%x)\n", ret);
	    goto exit;
	} 

    	// write header

    	foffset = 0;
	if ((ret = (*outproc)(KDP_SEEK, NULL, sizeof(foffset), &foffset)) != kIOReturnSuccess) { 
	    DEBG("KDP_SEEK(0x%x)\n", ret);
	    goto exit;
	} 

	hdr.signature  = MACH_CORE_FILEHEADER_SIGNATURE;
	hdr.log_length = new_logs;
        hdr.gzip_length = outvars.zipped;

	if ((ret = (*outproc)(KDP_DATA, NULL, sizeof(hdr), &hdr)) != kIOReturnSuccess)
	{ 
	    DEBG("KDP_DATA(0x%x)\n", ret);
	    goto exit;
	}
    }

exit:
    /* close / last packet */
    if ((ret = (*outproc)(KDP_EOF, NULL, 0, ((void *) 0))) != kIOReturnSuccess)
    {
	DEBG("KDP_EOF(0x%x)\n", ret);
    }	


    return (ret);
}

int
kern_dump(boolean_t local)
{
    static boolean_t dumped_local;
    if (local) {
	if (dumped_local) return (0);
	dumped_local = TRUE;
	return (do_kern_dump(&kern_dump_disk_proc, true));
    }
#if CONFIG_KDP_INTERACTIVE_DEBUGGING
    return (do_kern_dump(&kdp_send_crashdump_data, false));
#else
    return (-1);
#endif
}
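
The zero-fill step in do_kern_dump() above pads the Mach-O header out to a page boundary before any segment data is written. A minimal sketch of that arithmetic follows, assuming a 4096-byte page; the constant and the sketch_ helper names are illustrative and are not taken from XNU.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for illustration; the kernel derives the real value. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Round len up to the next multiple of the (power-of-two) page size. */
static uint64_t
sketch_round_page(uint64_t len)
{
	return ((len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1));
}

/* Number of zero bytes needed after a header of header_size bytes. */
static uint64_t
sketch_header_pad(uint64_t header_size)
{
	return (sketch_round_page(header_size) - header_size);
}
```

With these assumptions a 5000-byte header needs 3192 bytes of padding, and a header that is already page sized needs none.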

3.2.1.1 backtrace.io email

From one of our engineers after reading your paper as FYI:

"macOS has some nifty features for kernel debugging that aren't available
on other platforms, which are not mentioned in that paper.

you can not only debug the kernel over the network (only possible via
firewire or serial on FreeBSD but not ethernet), but all the special
commands available in the console debugger (and then some) are available in
macOS's gdb-based toolkit

i've never seen a network-based debugger for linux either, but perhaps
there is one"

-Eddie
Hi Sam,

Sorry, but the OS X / MachO core dump format or code to deal with same was
never in my area so I would have no idea whether what you’ve written is
technically correct or not. :)

My company rolodex is also almost 4 years out of date and pretty much all of
my contacts have moved on, so I wouldn't even know where to point you for
what is, admittedly, a somewhat esoteric topic!

Sorry I couldn’t help.

- Jordan

3.3 Core Dumps in Solaris (Not in Scope)

Solaris has several features that others don’t, but it is arguably outside the scope of this paper. Illumos’ abilities are detailed instead.

  • savecore(1M) has the ability to “live dump”, creating a dump of a running system. savecore(1M) does note that this dump will not be entirely self consistent because the machine is not halted while dumping.
  • dumpadm(1M) allows save compression and dumping to swap on zvol(!!!)
  • dumpadm(1M) as of Solaris 11.2 has a dump size estimation feature that will attempt to estimate the size of a dump given your current configuration.
    • Illumos has this. Just going to do an illumos section instead

3.3.1 Notes

3.4 Core Dumps in Illumos

“illumos is a free and open-source Unix operating system. It derives from OpenSolaris, which in turn derives from SVR4 UNIX and Berkeley Software Distribution (BSD).”20 Illumos has several attractive features in its core dump routine including “live dumping”, compression and support for swap on zvol as a dump device.

The Illumos dump routine, dumpsys(), can be found in usr/src/uts/common/os/dumpsubr.c. In contrast to the dump routines explained previously, it is very complex, but that complexity brings the features mentioned above, which are not available elsewhere.

Illumos’ savecore(1M) has the ability to “live dump”, creating a dump of a running system22. savecore(1M) does note that this dump will not be entirely self consistent because the machine is not suspended while dumping.

In addition to a version of savecore(1M), Illumos has a tool analogous to FreeBSD’s dumpon(8) called dumpadm(1M), which is primarily used to set the current dump device. Importantly, this dump device can be a swap partition in a ZFS zvol. dumpadm(1M) is also used to configure save compression and is able to estimate the size of a dump on a running system21.

3.4.1 Notes

3.5 Backtrace.io

“Backtrace is a company that is aiming [to improve] the post-mortem debugging process.” 23 Unlike the other subjects of this paper, Backtrace is not an operating system’s dump process or one of its features, but a tool for analyzing cores once they are generated.

Backtrace supports several languages for userspace core dumps, including C, C++, Go, and Python. Most importantly, Backtrace supports FreeBSD kernel core dumps. This section will focus on FreeBSD kernel core dump support.

Backtrace does not replace the FreeBSD core dump procedure, but is a service that collects core dumps and helps the operator triage and fix the bugs that cause those cores to be dumped.

Backtrace is a system made up of several parts: coresnapd, a snapshot generator; a set of analysis modules for automated debugging; coroner, an object store; a web interface; and hydra, its terminal counterpart 28.

After a successful savecore(8), coresnapd and a set of companion scripts create a “snapshot” of any cores generated and send it back to coroner 26. A snapshot contains a stack trace across all threads, active regions of memory, requested global variables, environment information like virtual memory and CPU statistics, custom metadata such as datacenter, and annotations created by the analysis modules such as automated checking for a double free() of a pointer 28. This results in a self-contained package that is smaller than a minidump and can be analyzed on a machine with an environment differing from the machine that created the original core 24. Once collected, Backtrace’s web interface can be used to categorize and triage different faults by any metadata or by panic string, for example. After triage, the web interface or hydra can be used to analyze snapshots 27.

Backtrace has also sponsored work on FreeBSD itself, improving the physical address lookup in libkvm (kvm(3)) from linear time to constant time. This improves both the run time and the space required when working with cores via crashinfo(8) or kgdb(1), especially on systems with large amounts of RAM. 25
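
To illustrate the difference between a linear and a constant time lookup, the sketch below locates a physical page inside a dump that stores only the pages marked as present. It is an illustration under stated assumptions and not libkvm's actual data structure: struct dump_index, lookup_linear() and lookup_const() are hypothetical names, and a precomputed prefix count stands in for whatever index libkvm really uses.

```c
#include <assert.h>
#include <stdint.h>

#define NPAGES 64	/* hypothetical: physical pages tracked by the sketch */

struct dump_index {
	uint8_t  present[NPAGES];	/* 1 if the page was written to the dump */
	uint32_t before[NPAGES];	/* dumped pages preceding each page */
};

/* Build the prefix counts once, when the dump is opened. */
static void
index_build(struct dump_index *di)
{
	uint32_t seen = 0;

	for (int i = 0; i < NPAGES; i++) {
		di->before[i] = seen;
		seen += di->present[i];
	}
}

/* Linear lookup: rescan every earlier page on each query. */
static int64_t
lookup_linear(const struct dump_index *di, int page)
{
	int64_t slot = 0;

	assert(page >= 0 && page < NPAGES);
	if (!di->present[page])
		return (-1);
	for (int i = 0; i < page; i++)
		slot += di->present[i];
	return (slot);
}

/* Constant time lookup: one array read per query. */
static int64_t
lookup_const(const struct dump_index *di, int page)
{
	assert(page >= 0 && page < NPAGES);
	if (!di->present[page])
		return (-1);
	return (di->before[page]);
}
```

Both functions return the slot of the page within the dump, or -1 if the page was not dumped. Multiplying the slot by the page size would give the byte offset of the page data.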

3.5.1 Notes

  • email will@freebsd.org to look through this.

4 The Future

There are several extensions to the FreeBSD core dump code that exist as sets of patches on mailing lists and wikis but are not found in upstream FreeBSD.

First, we provide some background on several extensions and tools including dumping over the network, compressed dumps and a tool for estimating the size of a minidump. Then we will explore the benefits of modularized core dump code.

4.1 netdump - Network Dump

Crash dumping over the network can be especially useful in embedded systems that do not have adequately sized swap partitions.

The original netdump code was written by Darrell Anderson at Duke around 2000 in the FreeBSD 4.x era as a kernel module. This code was later ported to modern FreeBSD in 2010 at Sandvine with the intention of including it in FreeBSD 9.0, but that effort did not succeed.

Currently there exists working netdump code from Isilon that can be applied with some difficulty to versions of FreeBSD after 11.0. Network dumps have not yet made it into upstream FreeBSD.

4.1.1 Notes

4.2 Compressed Dump

Modern systems often have several hundred gigabytes of RAM, and systems with terabytes will soon be common. This means full crash dumps, and even minidumps, can be much larger than most sensible amounts of swap.

Though savecore(8) has the ability to compress core dumps with the -z option, this only compresses a core once it is copied into the main filesystem. The core dump that was written to the swap partition remains uncompressed.

Compressed dumps see a 6:1 to 14:1 compression ratio, with a slight penalty in the time required to write the dump initially 8. However, the following savecore(8) on the next boot is faster, resulting in a faster overall dump and reboot sequence.
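
As a rough worked example of those ratios, the sketch below computes the space a compressed dump would need on the dump device. The helper name is hypothetical and the 64 GiB figure used afterwards is an assumed input, not a measurement from the cited source.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a dump of raw_bytes compressed at ratio:1, rounded up. */
static uint64_t
compressed_dump_bytes(uint64_t raw_bytes, uint64_t ratio)
{
	assert(ratio > 0);
	return ((raw_bytes + ratio - 1) / ratio);
}
```

For a hypothetical 64 GiB dump, a 6:1 ratio leaves roughly 10.7 GiB to write and a 14:1 ratio roughly 4.6 GiB.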

Compressed dumps have not yet made it into upstream FreeBSD.

4.2.1 Notes

4.3 minidumpsz - Minidump Size Estimation

minidumpsz is a kernel module that can do an online estimation of the size of a minidump if it were to occur at the time sysctl debug.mini_dump_size is called.

minidumpsz works by performing an inactive version of the minidump routine, minidumpsys(), so the estimate reflects what a dump would contain at the moment the sysctl is read.
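
The idea of an inactive pass can be sketched as walking the same kind of page bitmap a real dump would consult, but only counting. This is a simplified userland illustration and not the minidumpsz module itself: the bitmap layout, the header size parameter, the assumed page size and the function name are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EST_PAGE_SIZE 4096ULL	/* assumed page size */

/*
 * Estimate the size of a dump that would contain hdr_bytes of headers
 * plus every page whose bit is set in bitmap (nwords 64-bit words).
 * Nothing is written; set bits are only counted.
 */
static uint64_t
estimate_dump_size(const uint64_t *bitmap, size_t nwords, uint64_t hdr_bytes)
{
	uint64_t pages = 0;

	assert(bitmap != NULL || nwords == 0);
	for (size_t i = 0; i < nwords; i++) {
		uint64_t w = bitmap[i];

		while (w != 0) {	/* clear the lowest set bit each pass */
			w &= w - 1;
			pages++;
		}
	}
	return (hdr_bytes + pages * EST_PAGE_SIZE);
}
```

Because the pass has no side effects, it could be repeated at any time on a running system, which is what makes an online estimate possible.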

Illumos is also capable of performing an online dump size estimation using dumpadm(1M)’s -e option, which estimates the size of the dump taking into account options like compression 21.

minidumpsz was created by Rodney W. Grimes for the author’s work at Groupon and applies to FreeBSD 10.1 and FreeBSD 11. minidumpsz has not yet made it into upstream FreeBSD.

4.3.1 Notes

4.3.1.1 Solaris dumpadm -e

  • Solaris 11.2 has a similar capability that is not limited to minidumps; it estimates based on the current configuration.

4.4 Modularizing Dump Code

Currently, implementing features or fixes in the core dump code requires recompiling the kernel and rebooting. This is highly undesirable when an operator wants to upgrade or fix their production systems. Refactoring the dump code into loadable kernel modules (LKM) would yield two major benefits for operators: easier development of fixes and features, and a smaller kernel for embedded systems.

There is a proof of concept modularization of the dump code working on FreeBSD 11.0p117. This code has not yet made it into upstream FreeBSD.
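
The effect of modularization can be sketched in userland as a table of dump methods that are registered and unregistered at run time, much as a loadable module would attach and detach its dumper. This is an analogy only: struct dump_method, dump_register(), dump_unregister() and dump_with() are hypothetical names and do not reflect the proof of concept or any FreeBSD kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_METHODS 4

/* Hypothetical description of one pluggable dump method. */
struct dump_method {
	const char *name;
	int (*write)(const void *buf, size_t len);	/* 0 on success */
};

static const struct dump_method *methods[MAX_METHODS];

/* Attach a method, as loading a module would; -1 if the table is full. */
static int
dump_register(const struct dump_method *m)
{
	assert(m != NULL && m->name != NULL && m->write != NULL);
	for (int i = 0; i < MAX_METHODS; i++) {
		if (methods[i] == NULL) {
			methods[i] = m;
			return (0);
		}
	}
	return (-1);
}

/* Detach a method by name, as unloading a module would. */
static int
dump_unregister(const char *name)
{
	for (int i = 0; i < MAX_METHODS; i++) {
		if (methods[i] != NULL && strcmp(methods[i]->name, name) == 0) {
			methods[i] = NULL;
			return (0);
		}
	}
	return (-1);
}

/* Dispatch a dump to the named method; -1 if it is not loaded. */
static int
dump_with(const char *name, const void *buf, size_t len)
{
	for (int i = 0; i < MAX_METHODS; i++) {
		if (methods[i] != NULL && strcmp(methods[i]->name, name) == 0)
			return (methods[i]->write(buf, len));
	}
	return (-1);
}

/* Example method that accepts any buffer and discards it. */
static int
null_write(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return (0);
}

static const struct dump_method null_method = { "null", null_write };
```

In this sketch a method can be replaced with a fixed version by unregistering and registering it again, without restarting the program. That is the same workflow the load, test, unload cycle of a kernel module would give a developer.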

4.4.1 Notes

  • Backporting features and fixes added to dump code becomes trivial
  • Development becomes easier because LKMs are easier to work with
  • Embedded systems benefit from a smaller kernel

4.4.1.1 Email from Rod Grimes

From: "Rodney W. Grimes" <freebsd@pdx.rh.cn85.dnsmgr.net>
Subject: Re: Modular Dump
To: Sam Gwydir <sam@samgwydir.com>
Date: Thu, 12 Jan 2017 22:06:55 -0800 (PST)

> On Thu, Jan 12, 2017 at 11:03 PM, Rodney W. Grimes
> <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> >> Hey Rod,
> >>
> >> Finishing up my paper on core dumps and wanted to talk about your idea for
> >> modularization of the dump code.
> >
> > Is there a copy of it some place to read?  (Please don't email it, as that
> > tends to clutter my mail folder.)
> 
> Here you go: https://github.com/gwydirsam/bsd-coredump-history
> 
> The pdf is compiled from the org file. The org file contains notes but
> may be hard to read without emacs and org-mode.

No emacs for me, so I'll be reading the pdf.

> The history is now an appendix because it is just a huge list. I'm not
> 100% on some of the architecture support claims, in particular I'm not
> familiar enough with VAX to nail down that period. In addition there
> are some important features and bug fixes I'm sure I missed in the
> FreeBSD history because I didn't go through all minor versions.
> 
> If there's anything you have to comment on let me know. Thanks for
> taking a look.

I'll make time to at least give it one fast pass.

> >> I want to talk about why FreeBSD should go
> >> in this direction and what are the pros and cons of a modular dump code?
> >
> > There are 2 major reasons I want to go in this direction, and think that
> > these reasons are benificial to the FreeBSD project and its users.
> >
> > 1)  By moving all the dump code to Loadable Kernel Modules (LKM) it
> >     makes this code easier to work on and enhance with new features.
> >     I actually did this for the netdump code so that I didnt have
> >     to go through reboot cycles while I debugged it.  I could simply
> >     load the module, test it, unload it, edit, compile, repeat.
> >
> > 2)  I am active in the embeded world of small computers, and dump
> >     code is a debug tool in that world that needs ripped out after
> >     your done with developement.  Your embeded system isnt going
> >     to do a core dump that anyone would ever see.  This shaves
> >     a tiny amount of the size of the kernel, another important
> >     thing in the embeded world.
> 
> Sounds good to me. Do you think it would take a large effort to
> modularize all the dump code?

No, I already have a working model, and have glanced at the crypted
dump that just went in the tree, and do not see any thing taking very
much effort at all.

> Would each architecture need its own
> module for the machine dependent parts?

The machine dependent part is tiny, most of it living in the pmap
code.  At present there is a large amount of duplicated code accross
the different architectures in the MD part, telling me since the
code is duplicated almost to the last character that a refactor
would move 95% of that code to MI, so 5% of what is already tiny
would be left behind. 

Realize that each architecture has to have its own module for the MI
part since your aarch64 arm isnt going to run the amd64 code!

...

-- 
Rod Grimes                                                 rgrimes@freebsd.org

4.5 Dump to swap on zvol

Many users of FreeBSD use ZFS extensively. Though FreeBSD supports most ZFS features, using swap on a zvol as a dump device is currently not recommended. However, Illumos distributions support this out of the box, and it is often the default 21.

This would be incredibly useful for users of ZFS in enterprise settings because ZFS datasets and zvols can be created, destroyed, and modified online, while modifying standard swap partitions is not possible without taking a machine offline and may not be trivial without re-imaging a machine.

4.5.1 Notes

4.5.1.1 Test dump on swap on zvol

This is the default – it works :)

4.5.1.2 omnios swap info

It is important to note that Illumos does require a “dedicated” dump device separate from its swap partition.

vagrant@omnios-vagrant:/export/home/vagrant$ swap -l
swapfile                  dev    swaplo  blocks    free
/dev/zvol/dsk/rpool/swap  266,2       8 2097144 2097032
vagrant@omnios-vagrant:/export/home/vagrant$ sudo dumpadm
      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash/unknown
  Savecore enabled: yes
   Save compressed: on

4.6 Live Dump

The ability to take a core dump on an online system can be useful when a machine is otherwise hung and the crash or panic would be difficult, if not impossible, to reproduce. Illumos can force a crash dump on an online system by issuing the savecore -L command.

This feature is not a replacement for normal crash dumps because the system is not halted during the dump which leads to an inconsistent state stored in the core dump. However, this adds another tool for enterprise FreeBSD users that must avoid taking machines offline as much as possible.

4.6.1 Notes

4.6.1.1 Live Dump Example

vagrant@omnios-vagrant:/export/home/vagrant$ sudo savecore -L
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
dumping: 0:00 100% done
100% done: 66940 pages dumped, dump succeeded
savecore: System dump time: Mon Jan 30 15:08:50 2017

savecore: Saving compressed system crash dump in /var/crash/unknown/vmdump.0
savecore: Decompress the crash dump with ‘savecore -vf /var/crash/unknown/vmdump.0’

4.7 Recommendations (incomplete)

  • textdumps by default, but with better defaults?
  • documentation should include recommendations on swap size for different amounts of ram
    • include amounts for fulldump, minidump and textdump at certain RAM sizes

5 Acknowledgments

The author would like to thank Michael Dexter, for his initial prompting to write this paper and his help debugging the original issues that led to our current combined knowledge of core dumps; Rodney W. Grimes, for historical knowledge and help reading code from PDP-11 assembly to modern C; and Allan Jude, Daniel Nowacki and Chris Findeisen, for finding and correcting the many, many spelling, grammar and syntax issues in earlier versions of this paper.

The author thanks Deb Goodkin of the FreeBSD Foundation for her help bringing the author into the FreeBSD community, and lastly thanks the FreeBSD community in general for making this day and paper possible.

6 Appendix

6.1 The Past: A Complete History of Core Dumps

The following sections list when different features of the core dump code were introduced starting with the core dump code itself. First the dump facility will be followed through the later versions of Research UNIX and then BSD through to present versions of FreeBSD.

6.2 Core Dumps in UNIX

Core dumping was initially a manual process. As documented in Version 6 AT&T UNIX’s crash(8), an operator could take a core dump “if [they felt] up to debugging”. Though 6th Edition is not the first appearance of dump code in UNIX, it is the first complete repository of code the public has access to.

6.2.1 5th Edition UNIX

5th Edition UNIX’s dump code can be found in usr/sys/conf/mch.s.

6.2.1.1 Notes

/usr/sys/conf/mch.s

.globl	dump
dump:
	mov	$4,r0	/ overwrites trap vectors
	mov	r1,(r0)+
	mov	r2,(r0)+
	mov	r3,(r0)+
	mov	r4,(r0)+
	mov	r5,(r0)+
	mov	sp,(r0)+
	mov	$KISA0,r1
	mov	$8.,r2
1:
	mov	(r1)+,(r0)+
	sob	r2,1b
	mov	$MTC,r0
	mov	$60004,(r0)+
	clr	2(r0)
1:
	mov	$-512.,(r0)
	inc	-(r0)
2:
	tstb	(r0)
	bge	2b
	tst	(r0)+
	bge	1b
	5
	mov	$60007,-(r0)
	br	.
  

6.2.2 6th Edition UNIX

In 6th Edition UNIX crash(8) shows how to manually take a core dump:

If the reason for the crash is not evident (see below for guidance on `evident’) you may want to try to dump the system if you feel up to debugging. At the moment a dump can be taken only on magtape. With a tape mounted and ready, stop the machine, load address 44, and start. This should write a copy of all of core on the tape with an EOF mark.

6th Edition UNIX’s core dump procedure is defined in m40.s and m45.s, which provide support for the PDP-11/40 and PDP-11/45 respectively.

6.2.2.1 Notes

/usr/sys/conf/m40.s

.globl	dump
dump:
	bit	$1,SSR0
	bne	dump

/ save regs r0,r1,r2,r3,r4,r5,r6,KIA6
/ starting at abs location 4

	mov	r0,4
	mov	$6,r0
	mov	r1,(r0)+
	mov	r2,(r0)+
	mov	r3,(r0)+
	mov	r4,(r0)+
	mov	r5,(r0)+
	mov	sp,(r0)+
	mov	KISA6,(r0)+

/ dump all of core (ie to first mt error)
/ onto mag tape. (9 track or 7 track 'binary')

	mov	$MTC,r0
	mov	$60004,(r0)+
	clr	2(r0)
1:
	mov	$-512.,(r0)
	inc	-(r0)
2:
	tstb	(r0)
	bge	2b
	tst	(r0)+
	bge	1b
	reset

/ end of file and loop

	mov	$60007,-(r0)
	br	.
6.2.2.1.1 /usr/sys/conf/m45.s

/usr/sys/conf/m45.s

/ Mag tape dump
/ save registers in low core and
/ write all core onto mag tape.
/ entry is thru 44 abs

.data
.globl	dump
dump:
	 bit	$1,SSR0
	 bne	dump

/ save regs r0,r1,r2,r3,r4,r5,r6,KIA6
/ starting at abs location 4

	 mov	r0,4
	 mov	$6,r0
	 mov	r1,(r0)+
	 mov	r2,(r0)+
	 mov	r3,(r0)+
	 mov	r4,(r0)+
	 mov	r5,(r0)+
	 mov	sp,(r0)+
	 mov	KDSA6,(r0)+

/ dump all of core (ie to first mt error)
/ onto mag tape. (9 track or 7 track 'binary')

	 mov	$MTC,r0
	 mov	$60004,(r0)+
	 clr	2(r0)
1:
	 mov	$-512.,(r0)
	 inc	-(r0)
2:
	 tstb	(r0)
	 bge	2b
	 tst	(r0)+
	 bge	1b
	 reset

/ end of file and loop

	 mov	$60007,-(r0)
	 br	.

6.2.3 7th Edition UNIX

7th Edition UNIX adds support for the PDP-11/70.

6.2.3.1 Notes

/usr/sys/conf/mch.s

/ Mag tape dump
/ save registers in low core and
/ write all core onto mag tape.
/ entry is thru 44 abs

.data
.globl	dump
dump:

/ save regs r0,r1,r2,r3,r4,r5,r6,KIA6
/ starting at abs location 4

	mov	r0,4
	mov	$6,r0
	mov	r1,(r0)+
	mov	r2,(r0)+
	mov	r3,(r0)+
	mov	r4,(r0)+
	mov	r5,(r0)+
	mov	sp,(r0)+
	mov	KDSA6,(r0)+

/ dump all of core (ie to first mt error)
/ onto mag tape. (9 track or 7 track 'binary')

.if HTDUMP
	mov	$HTCS1,r0
	mov	$40,*$HTCS2
	mov	$2300,*$HTTC
	clr	*$HTBA
	mov	$1,(r0)
1:
	mov	$-512.,*$HTFC
	mov	$-256.,*$HTWC
	movb	$61,(r0)
2:
	tstb	(r0)
	bge	2b
	bit	$1,(r0)
	bne	2b
	bit	$40000,(r0)
	beq	1b
	mov	$27,(r0)
.endif
HT	= 0172440
HTCS1	= HT+0
HTWC	= HT+2
HTBA	= HT+4
HTFC	= HT+6
HTCS2	= HT+10
HTTC	= HT+32

MTC = 172522
.if TUDUMP
	mov	$MTC,r0
	mov	$60004,(r0)+
	clr	2(r0)
1:
	mov	$-512.,(r0)
	inc	-(r0)
2:
	tstb	(r0)
	bge	2b
	tst	(r0)+
	bge	1b
	reset

/ end of file and loop

	mov	$60007,-(r0)
.endif
	br	.

6.2.4 UNIX/32V

UNIX/32V was an early port of UNIX to the DEC VAX architecture making use of the C programming language to decouple the code from the PDP-11. /usr/src/sys/sys/locore.s contains the first appearance of doadump(), the same function name used today, written in VAX assembly.

6.2.4.1 Notes

/usr/src/sys/sys/locore.s

#  0x200
# Produce a core image dump on mag tape
	.globl	doadump
doadump:
	movl	sp,dumpstack	# save stack pointer
	movab	dumpstack,sp	# reinit stack
	mfpr	$PCBB,-(sp)	# save u-area pointer
	mfpr	$MAPEN,-(sp)	# save value
	mfpr	$IPL,-(sp)	# ...
	mtpr	$0,$MAPEN		# turn off memory mapping
	mtpr	$HIGH,$IPL		# disable interrupts
	pushr	$0x3fff			# save regs 0 - 13
	calls	$0,_dump	# produce dump
	halt

	.data
	.align	2
	.globl	dumpstack
	.space	58*4		# seperate stack for tape dumps
dumpstack: 
	.space	4
	.text

6.3 Core Dumps in BSD

6.3.1 1BSD & 2BSD

1BSD and 2BSD inherited their dump code directly from 6th Edition UNIX and therefore support the PDP-11/40 and PDP-11/45.

6.3.2 3BSD

3BSD imports its dump code from UNIX/32V, maintaining the name doadump(). Because of this pedigree, doadump() is written in VAX assembly.

A “todo” list found in usr/src/sys/sys/TODO notes that “large core dumps are awful and even uninterruptible!”.

6.3.2.1 Notes

/usr/src/sys/sys/locore.s doadump

# =====================================
# Produce a core image dump on mag tape
# =====================================
	.globl	doadump
doadump:
	movl	sp,dumpstack		# save stack pointer
	movab	dumpstack,sp		# reinit stack
	mfpr	$PCBB,-(sp)		# save u-area pointer
	mfpr	$MAPEN,-(sp)		# save value
	mfpr	$IPL,-(sp)		# ...
	mtpr	$0,$MAPEN		# turn off memory mapping
	mtpr	$HIGH,$IPL		# disable interrupts
	pushr	$0x3fff			# save regs 0 - 13
	calls	$0,_dump		# produce dump
	halt

	.data
	.align	2
	.globl	dumpstack
	.space	58*4			# separate stack for tape dumps
dumpstack: 
	.space	4
	.text

6.3.3 4BSD

4BSD introduces a new feature to doadump, printing tracing information with dumptrc.

In addition, usr/src/sys/sys/TODO is the first mention of writing dumps to swap: “Support automatic dumps to paging area”.

6.3.3.1 Notes

6.3.4 4.1BSD

Beginning in 4.1BSD doadump() is relegated to setting up the machine for dumpsys() which is written in C and found in sys/vax/vax/machdep.c.

As of 4.1c2BSD, doadump() fulfills the “todo” listed in 4BSD and dumps to the “paging area”, or swap. savecore(8) is introduced to extract the core from the swap partition and place it in the filesystem.

  • Support for VAX750, VAX780, VAX7ZZ (VAX730)
  • 4.1c2BSD changes VAX7ZZ references to VAX730

6.3.4.1 Notes

6.3.5 4.2BSD

  • no changes.

6.3.5.1 Notes

  • check this again

6.3.6 4.3BSD

6.3.6.1 4.3 BSD-Tahoe

  • Initial support is added for the “tahoe” processor and doadump is ported to the tahoe.

6.3.6.2 4.3 BSD Net/1

  • Same as 4.3-Tahoe
6.3.6.2.1 Notes

6.3.6.3 4.3 BSD-Reno

  • hp300 and i386 core dump support is added in usr/src/sys/hp300/locore.s and usr/src/sys/i386/locore.s, respectively.
6.3.6.3.1 Notes

6.3.6.4 4.3 BSD Net/2

  • Same as Reno

6.3.7 4.4BSD

  • luna68k support added
  • news3400 support added
  • pmax support added
  • sparc support added

6.3.7.1 4.4-BSD Lite1 & 4.4-BSD Lite2

  • Same as 4.4BSD – changes made due to AT&T UNIX System Laboratories (USL) lawsuit.

6.3.7.2 4.4-BSD Lite1

Same as 4.4 – changes made due to AT&T UNIX System Laboratories (USL) lawsuit.

6.3.7.3 4.4-BSD Lite2

Same as 4.4 – changes made due to USL lawsuit.

6.3.8 386BSD

6.3.8.1 386BSD 0.0

  • Support reduced to i386 and hp300

6.3.8.2 386BSD 0.1

  • hp300 code removed

6.3.8.3 386BSD 0.1-patchkit

  • Same as 386BSD 0.1

6.4 Core Dumps in FreeBSD

6.4.1 FreeBSD 1.0

  • i386 support from 386BSD-0.1-patchkit

6.4.1.1 FreeBSD 1.1

6.4.1.2 FreeBSD 1.1.5

6.4.1.2.1 Notes
> On Thu, Jan 12, 2017 at 11:03 PM, Rodney W. Grimes
> <freebsd@pdx.rh.cn85.dnsmgr.net> wrote:
> >> Hey Rod,
> >>
> >> Finishing up my paper on core dumps and wanted to talk about your idea for
> >> modularization of the dump code.
> >
> > Is there a copy of it some place to read?  (Please don't email it, as that
> > tends to clutter my mail folder.)
>
> Here you go: https://github.com/gwydirsam/bsd-coredump-history

1:
"code at Isilon that applies cleanly to versions of
FreeBSD after 11 but before"

The patch does not apply cleanly, it took me many hours of hand
editing in applying the Isilon diff.

2:
"8.4.1 FreeBSD 1.0

i386 support, hp300 support from 386BSD-0.1-patchkit"

I do not think any version of FreeBSD ever had support for hp300.


Wow, 2 nits in all that writting, good job!

6.4.2 FreeBSD 2.0.0

6.4.2.1 FreeBSD 2.0.0

  • doadump() no longer exists, though is mentioned in comments.

6.4.2.2 FreeBSD 2.0.5

6.4.2.3 FreeBSD 2.1.0

6.4.2.4 FreeBSD 2.1.5

6.4.2.5 FreeBSD 2.1.6

6.4.2.6 FreeBSD 2.1.6.1

6.4.2.7 FreeBSD 2.1.7

6.4.2.8 FreeBSD 2.2.0

  • boot() and dumpsys() are moved into kern_shutdown.c, with dumpsys() called from boot(), because the code was not seen as machine dependent.

6.4.2.9 FreeBSD 2.2.1

6.4.2.10 FreeBSD 2.2.2

6.4.2.11 FreeBSD 2.2.5

6.4.2.12 FreeBSD 2.2.6

6.4.2.13 FreeBSD 2.2.7

6.4.2.14 FreeBSD 2.2.8

6.4.2.15 Notes

/ssh:freebsd-current:/root/src/unix-history-repo/:
find . \( -type f -exec grep -q -e dumpsys \{\} \; \) -ls
1945687      144 -rw-r--r--    1 root                             wheel                               72708 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/hp300/hp300/locore.s
1945688       80 -rw-r--r--    1 root                             wheel                               40785 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/hp300/hp300/machdep.c
1785250     1728 -rw-r--r--    1 root                             wheel                              836045 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/hp300/tags
973678       64 -rw-r--r--    1 root                             wheel                               30221 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/i386/i386/machdep.c
973686     1536 -rw-r--r--    1 root                             wheel                              746387 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/i386/tags
2506530      128 -rw-r--r--    1 root                             wheel                               63107 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/luna68k/luna68k/locore.s
2506531       64 -rw-r--r--    1 root                             wheel                               30470 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/luna68k/luna68k/machdep.c
2506880       48 -rw-r--r--    1 root                             wheel                               23756 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/news3400/news3400/machdep.c
2506902     1856 -rw-r--r--    1 root                             wheel                              891210 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/news3400/tags
2271373       56 -rw-r--r--    1 root                             wheel                               28486 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/pmax/dev/rz.c
2506949      112 -rw-r--r--    1 root                             wheel                               53461 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/pmax/pmax/machdep.c
2188752     1792 -rw-r--r--    1 root                             wheel                              862201 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/pmax/tags
2507078       48 -rw-r--r--    1 root                             wheel                               22534 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/sparc/sparc/machdep.c
1130530     1664 -rw-r--r--    1 root                             wheel                              804425 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/sparc/tags
1860931     1600 -rw-r--r--    1 root                             wheel                              773455 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/tahoe/tags
1945710     2240 -rw-r--r--    1 root                             wheel                             1096432 Dec 23 02:00 .ref-BSD-4_4_Lite1/usr/src/sys/vax/tags
2574311       80 -rw-r--r--    1 root                             wheel                               38769 Dec 23 02:00 .ref-FreeBSD-release/1.1.5/sys/i386/i386/machdep.c
1703024       88 -rw-r--r--    1 root                             wheel                               44085 Dec 23 02:00 sys/i386/i386/machdep.c

6.4.3 FreeBSD 3.0.0

  • SMP support
  • alpha support

6.4.3.1 FreeBSD 3.1.0

6.4.3.2 FreeBSD 3.2.0

6.4.3.3 FreeBSD 3.3.0

6.4.3.4 FreeBSD 3.4.0

6.4.3.5 FreeBSD 3.5.0

6.4.3.6 Notes

/*
 *  Go through the rigmarole of shutting down..
 * this used to be in machdep.c but I'll be dammned if I could see
 * anything machine dependant in it.
 */
static void
boot(howto)
	int howto;
{
	sle_p ep;

#ifdef SMP
	if (smp_active) {
		printf("boot() called on cpu#%d\n", cpuid);
	}
#endif
	/*
	 * Do any callouts that should be done BEFORE syncing the filesystems.
	 */
	LIST_FOREACH(ep, &shutdown_lists[SHUTDOWN_PRE_SYNC], links)
		(*ep->function)(howto, ep->arg);

	/* 
	 * Now sync filesystems
	 */
	if (!cold && (howto & RB_NOSYNC) == 0 && waittime < 0) {
		register struct buf *bp;
		int iter, nbusy;

		waittime = 0;
		printf("\nsyncing disks... ");

		sync(&proc0, NULL);

		/*
		 * With soft updates, some buffers that are
		 * written will be remarked as dirty until other
		 * buffers are written.
		 */
		for (iter = 0; iter < 20; iter++) {
			nbusy = 0;
			for (bp = &buf[nbuf]; --bp >= buf; ) {
				if ((bp->b_flags & (B_BUSY | B_INVAL))
						== B_BUSY) {
					nbusy++;
				} else if ((bp->b_flags & (B_DELWRI | B_INVAL))
						== B_DELWRI) {
					/* bawrite(bp);*/
					nbusy++;
				}
			}
			if (nbusy == 0)
				break;
			printf("%d ", nbusy);
			sync(&proc0, NULL);
			DELAY(50000 * iter);
		}
		if (nbusy) {
			/*
			 * Failed to sync all blocks. Indicate this and don't
			 * unmount filesystems (thus forcing an fsck on reboot).
			 */
			printf("giving up\n");
#ifdef SHOW_BUSYBUFS
			nbusy = 0;
			for (bp = &buf[nbuf]; --bp >= buf; ) {
				if ((bp->b_flags & (B_BUSY | B_INVAL))
						== B_BUSY) {
					nbusy++;
					printf(
			"%d: dev:%08lx, flags:%08lx, blkno:%ld, lblkno:%ld\n",
					    nbusy, (u_long)bp->b_dev,
					    bp->b_flags, (long)bp->b_blkno,
					    (long)bp->b_lblkno);
				}
			}
			DELAY(5000000);	/* 5 seconds */
#endif
		} else {
			printf("done\n");
			/*
			 * Unmount filesystems
			 */
			if (panicstr == 0)
				vfs_unmountall();
		}
		DELAY(100000);		/* wait for console output to finish */
	}

	/*
	 * Ok, now do things that assume all filesystem activity has
	 * been completed.
	 */
	LIST_FOREACH(ep, &shutdown_lists[SHUTDOWN_POST_SYNC], links)
		(*ep->function)(howto, ep->arg);
	splhigh();
	if ((howto & (RB_HALT|RB_DUMP)) == RB_DUMP && !cold) {
		savectx(&dumppcb);
#ifdef __i386__
		dumppcb.pcb_cr3 = rcr3();
#endif
		dumpsys();
	}

	/* Now that we're going to really halt the system... */
	LIST_FOREACH(ep, &shutdown_lists[SHUTDOWN_FINAL], links)
		(*ep->function)(howto, ep->arg);

	if (howto & RB_HALT) {
		cpu_power_down();
		printf("\n");
		printf("The operating system has halted.\n");
		printf("Please press any key to reboot.\n\n");
		switch (cngetc()) {
		case -1:		/* No console, just die */
			cpu_halt();
			/* NOTREACHED */
		default:
			howto &= ~RB_HALT;
			break;
		}
	} else if (howto & RB_DUMP) {
		/* System Paniced */

		if (PANIC_REBOOT_WAIT_TIME != 0) {
			if (PANIC_REBOOT_WAIT_TIME != -1) {
				int loop;
				printf("Automatic reboot in %d seconds - "
				       "press a key on the console to abort\n",
					PANIC_REBOOT_WAIT_TIME);
				for (loop = PANIC_REBOOT_WAIT_TIME * 10;
				     loop > 0; --loop) {
					DELAY(1000 * 100); /* 1/10th second */
					/* Did user type a key? */
					if (cncheckc() != -1)
						break;
				}
				if (!loop)
					goto die;
			}
		} else { /* zero time specified - reboot NOW */
			goto die;
		}
		printf("--> Press a key on the console to reboot <--\n");
		cngetc();
	}
die:
	printf("Rebooting...\n");
	DELAY(1000000);	/* wait 1 sec for printf's to complete and be read */
	/* cpu_boot(howto); */ /* doesn't do anything at the moment */
	cpu_reset();
	for(;;) ;
	/* NOTREACHED */
}

6.4.4 FreeBSD 4.0.0

  • Uptime is now printed before rebooting.
  • Better error message when dumps are not supported.

6.4.4.1 FreeBSD 4.1.0

6.4.4.2 FreeBSD 4.1.1

6.4.4.3 FreeBSD 4.2.0

6.4.4.4 FreeBSD 4.3.0

6.4.4.5 FreeBSD 4.4.0

6.4.4.6 FreeBSD 4.5.0

6.4.4.7 FreeBSD 4.6.0

6.4.4.8 FreeBSD 4.6.1

6.4.4.9 FreeBSD 4.6.2

6.4.4.10 FreeBSD 4.7.0

6.4.4.11 FreeBSD 4.8.0

6.4.4.12 FreeBSD 4.9.0

6.4.4.13 FreeBSD 4.10.0

6.4.4.14 FreeBSD 4.11.0

6.4.4.15 Notes

6.4.4.15.1 Check print uptime earliest version.

6.4.5 FreeBSD 5.0.0

  • Added IA64, sparc64, and pc98 support.
  • New kernel dump infrastructure. Broken out to individual architectures again. doadump() is back!
  • Crash dumps can now be obtained in the late stages of kernel initialisation before single user mode

6.4.5.1 FreeBSD 5.1.0

6.4.5.2 FreeBSD 5.2.0

  • AMD64 becomes a Tier 1 supported architecture.

6.4.5.3 FreeBSD 5.2.1

6.4.5.4 FreeBSD 5.3.0

6.4.5.5 FreeBSD 5.4.0

6.4.5.6 FreeBSD 5.5.0

6.4.5.7 Notes

  • 5.0 16 Jan 2003

6.4.5.7.1 Check NEW support

6.4.5.7.2 savecore and dumpon changes

6.4.5.7.3 BIG CHANGES – More attention here

6.4.5.7.4 2002 Poul-Henning Kamp

Here follows the new kernel dumping infrastructure.

Caveats:

The new savecore program is not complete in the sense that it emulates enough of the old savecores features to do the job, but implements none of the options yet.

I would appreciate if a userland hacker could help me out getting savecore to do what we want it to do from a users point of view, compression, email-notification, space reservation etc etc. (send me email if you are interested).

Currently, savecore will scan all devices marked as “swap” or “dump” in /etc/fstab or any devices specified on the command-line.

All architectures but i386 lack an implementation of dumpsys(), but looking at the i386 version it should be trivial for anybody familiar with the platform(s) to provide this function.

Documentation is quite sparse at this time, more to come.

Details:

ATA and SCSI drivers should work as the dump formatting code has been removed. The IDA, TWE and AAC have not yet been converted.

Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set the device as dumpdev. To implement the “off” argument, /dev/null is used as the device.

Savecore will fail if handed any options since they are not (yet) implemented. All devices marked “dump” or “swap” in /etc/fstab will be scanned and dumps found will be saved to diskfiles named from the MD5 hash of the header record. The header record is dumped in readable format in the .info file. The kernel is not saved. Only complete dumps will be saved.

All maintainer rights for this code are disclaimed: feel free to improve and extend.

Sponsored by: DARPA, NAI Labs

6.4.6 FreeBSD 6.0.0

6.4.6.1 FreeBSD 6.0.0

  • AMD64 and arm support added.
  • AMD64 and i386 switch to ELF as their crash dump format.
  • AMD64 and i386 bump their dump format to version 2.

6.4.6.2 FreeBSD 6.1.0

6.4.6.3 FreeBSD 6.2.0

  • minidump code added.

6.4.6.4 FreeBSD 6.3.0

6.4.6.5 FreeBSD 6.4.0

6.4.6.6 Notes

  • 9 October 2005
  • 6.0
    • amd64 support added
    • dump format bumped to 2
  • 6.2.0
    • As of 6.2.0 minidump - peter wemm
  • This is what started this project: commit 47c4404f96c6, “Don’t dump core into a partition that is too small for it. If we do, we usually wrote backwards into the preceding partition, which is usually the root partition.”
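
The size check that commit describes can be sketched in a few lines of C. This is a minimal illustration and not the FreeBSD code: `dump_fits()` and `dump_start_offset()` are hypothetical names, all sizes are in bytes, and the layout (a dump placed at the end of the device between a leading and a trailing header block) is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does a dump of dump_len bytes, plus a leading
 * and a trailing header of hdr_len bytes each, fit on a device of
 * media_size bytes?
 */
bool
dump_fits(uint64_t media_size, uint64_t dump_len, uint64_t hdr_len)
{
	uint64_t overhead;

	if (hdr_len > UINT64_MAX / 2)
		return (false);
	overhead = 2 * hdr_len;

	/* Guard against unsigned wrap-around before adding. */
	if (dump_len > UINT64_MAX - overhead)
		return (false);
	return (dump_len + overhead <= media_size);
}

/*
 * Offset of the leading header when the dump is placed at the end of
 * the device.  Without the dump_fits() check this subtraction would
 * wrap, which corresponds to writing before the start of the partition.
 */
uint64_t
dump_start_offset(uint64_t media_size, uint64_t dump_len, uint64_t hdr_len)
{
	assert(dump_fits(media_size, dump_len, hdr_len));
	return (media_size - dump_len - 2 * hdr_len);
}
```

Refusing the dump when `dump_fits()` returns false is the conservative choice: losing one core is preferable to overwriting a neighbouring partition.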

6.4.7 FreeBSD 7.0.0

6.4.7.1 FreeBSD 7.0.0

  • sun4v support added
  • minidumps are now default
  • alpha support is removed

6.4.7.2 FreeBSD 7.1.0

  • textdump code is added

6.4.7.3 FreeBSD 7.2.0

6.4.7.4 FreeBSD 7.3.0

6.4.7.5 FreeBSD 7.4.0

6.4.7.6 Notes

  • Architectures
    • arm
    • i386
    • sun4v
    • amd64
    • ia64
    • sparc64

6.4.8 FreeBSD 8.0.0

  • PowerPC support added.
  • mips support added.

6.4.8.1 FreeBSD 8.1.0

6.4.8.2 FreeBSD 8.2.0

6.4.8.3 FreeBSD 8.3.0

6.4.8.4 FreeBSD 8.4.0

6.4.8.5 Notes

  • 22 November 2009
  • 8.0
    • Architectures
      • arm
      • i386
      • sun4v
      • amd64
      • ia64
      • sparc64
      • mips (NEW)
      • powerpc (NEW)

6.4.9 FreeBSD 9.0.0

  • Merge common amd64/i386 dump code under sys/x86 subtree.
  • Only dump at first panic in the event of a double panic
  • Add dump command for DDB
  • Minidump v2

6.4.9.0.1 Notes

  • Explain new stuff in Minidump v2
  • 3 January 2012

6.4.9.1 FreeBSD 9.1.0

6.4.9.2 FreeBSD 9.2.0

6.4.9.3 Notes

6.4.9.3.1 commit 3ac86ffe8c96b93a27e1e5bd872497446f543899

Author: Attilio Rao <attilio@FreeBSD.org>
Date: Wed Jun 8 19:28:59 2011 +0000

In the current code, a double panic condition may lead to dumps interleaving. Signal dumping to happen only for the first panic which should be the most important.

Sponsored by: Sandvine Incorporated
Submitted by: Nima Misaghian (nmisaghian AT sandvine DOT com)
MFC after: 2 weeks
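
The idea in this commit, letting only the first panic produce a dump, can be sketched as a simple guard flag. The names below (`dump_started`, `write_dump()`, `dump_on_panic()`) are assumptions for illustration and are not the kernel's symbols; the sketch also ignores SMP and atomicity concerns that a real kernel would have to address.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative state only; not the kernel's variables. */
static bool dump_started = false;
static int dumps_written = 0;

/* Stand-in for the real dump routine; here it only counts calls. */
static void
write_dump(void)
{
	/* Invariant: the guard flag is set before any write begins. */
	assert(dump_started);
	dumps_written++;
}

/*
 * Called on every panic.  Only the first caller proceeds to dump.  The
 * flag is set before writing starts, so a second panic raised while
 * the dump is in progress returns without touching the dump device
 * and the two dumps cannot interleave.  Returns true if a dump ran.
 */
bool
dump_on_panic(void)
{
	if (dump_started)
		return (false);
	dump_started = true;
	write_dump();
	return (true);
}
```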

6.4.9.3.2 commit 7ffd4bc3ff05520393dc1061410d1b82d37af823

Author: Marcel Moolenaar <marcel@FreeBSD.org>
Date: Tue Jun 7 01:28:12 2011 +0000

Fix making kernel dumps from the debugger by creating a command for it. Do not not expect a developer to call doadump(). Calling doadump does not necessarily work when it’s declared static. Nor does it necessarily do what was intended in the context of text dumps. The dump command always creates a core dump.

Move printing of error messages from doadump to the dump command, now that we don’t have to worry about being called from DDB.

6.4.10 FreeBSD 10.0.0

  • On systems with SMP, CPUs other than the one processing the panic are stopped. This behavior is tunable with the sysctl kern.stop_scheduler_on_panic

6.4.10.1 FreeBSD 10.1.0

6.4.10.2 FreeBSD 10.2.0

6.4.10.3 FreeBSD 10.3.0

6.4.10.4 Notes

  • 15 January 2014
  • list who has minidumps
  • Found great note in sys/ddb/db_textdump.c

6.4.11 FreeBSD 11.0.0

  • RISC-V support added.
  • arm64 support added.
  • Factored out duplicated code from dumpsys() on each architecture into sys/kern/kern_dump.c
  • A “show panic” command was added to DDB
  • “4Kn” kernel dump support. Dumps are now written out in the native block size. savecore(8) updated accordingly.
  • “4Kn” minidump support for AMD64 only
  • strlcpy(3) is used to properly null-terminate strings in kernel dump header

6.4.11.1 FreeBSD 11.0.1

6.4.11.2 Notes

  • 28 September 2016
  • Architectures
    • riscv (NEW)
    • arm64 (NEW)
  • factored out duplicated code from dumpsys() on each architecture into sys/kern/kern_dump.c

6.4.11.2.1 7b143fb29f444afdeb1556558771ab521096edef

Author: Mark Johnston <markj@FreeBSD.org>
AuthorDate: Wed Jan 7 01:01:39 2015 +0000
Commit: Mark Johnston <markj@FreeBSD.org>
CommitDate: Wed Jan 7 01:01:39 2015 +0000

Parent: a2c98547f907 Use the new process reaper functionality
Containing: FreeBSD-release/11.0.0
Follows: Research-V2 (404857)

Factor out duplicated code from dumpsys() on each architecture into generic code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly identical and simply redefine a number of constants and helper subroutines; a generic implementation will make it easier to implement features around kernel core dumps. This change does not alter any minidump code and should have no functional impact.

PR: 193873
Differential Revision: https://reviews.freebsd.org/D904
Submitted by: Conrad Meyer <conrad.meyer@isilon.com>
Reviewed by: jhibbits (earlier version)
Sponsored by: EMC / Isilon Storage Division

6.4.12 FreeBSD 12-CURRENT

  • Support for encrypted kernel crash dumps added. dumpon(8) and savecore(8) updated accordingly. New tool for decrypting cores added, decryptcore(8). Tested on amd64, i386, mipsel and sparc64. Untested on arm and arm64. Encrypted textdump is not yet implemented.

6.4.12.1 Notes

  • r309818

commit f63c437216e0309e4a319c2c95a2f8ca061c0bca
Author: def <def@FreeBSD.org>
Date: Sat Dec 10 16:20:39 2016 +0000

Add support for encrypted kernel crash dumps.

   File                            Function  Line
0  sparc64/include/dump.h          <global>  38    int dumpsys(struct dumperinfo *);
1  arm/include/dump.h              dumpsys   64    dumpsys(struct dumperinfo *di)
2  arm64/include/dump.h            dumpsys   68    dumpsys(struct dumperinfo *di)
3  sys/kern/kern_shutdown.c        doadump   329   error = dumpsys(&dumper);
4  mips/include/dump.h             dumpsys   70    dumpsys(struct dumperinfo *di)
5  powerpc/include/dump.h          dumpsys   63    dumpsys(struct dumperinfo *di)
6  riscv/include/dump.h            dumpsys   76    dumpsys(struct dumperinfo *di)
7  sparc64/sparc64/dump_machdep.c  dumpsys   77    dumpsys(struct dumperinfo *di)
8  x86/include/dump.h              dumpsys   81    dumpsys(struct dumperinfo *di)

Working file: /ssh:freebsd-current:/root/src/freebsd-head-svn/sys/amd64/amd64/minidump_machdep.c


r309818 | def | 2016-12-10 10:20:39 -0600 (Sat, 10 Dec 2016) | 67 lines

Add support for encrypted kernel crash dumps.

Changes include modifications in kernel crash dump routines, dumpon(8) and savecore(8). A new tool called decryptcore(8) was added.

A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump configuration in the diocskerneldump_arg structure to the kernel. The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for backward ABI compatibility.

dumpon(8) generates an one-time random symmetric key and encrypts it using an RSA public key in capability mode. Currently only AES-256-CBC is supported but EKCD was designed to implement support for other algorithms in the future. The public key is chosen using the -k flag. The dumpon rc(8) script can do this automatically during startup using the dumppubkey rc.conf(5) variable. Once the keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O control.

When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random IV and sets up the key schedule for the specified algorithm. Each time the kernel tries to write a crash dump to the dump device, the IV is replaced by a SHA-256 hash of the previous value. This is intended to make a possible differential cryptanalysis harder since it is possible to write multiple crash dumps without reboot by repeating the following commands:

    db> call doadump(0)
    db> continue

A kernel dump key consists of an algorithm identifier, an IV and an encrypted symmetric key. The kernel dump key size is included in a kernel dump header. The size is an unsigned 32-bit integer and it is aligned to a block size. The header structure has 512 bytes to match the block size so it was required to make a panic string 4 bytes shorter to add a new field to the header structure. If the kernel dump key size in the header is nonzero it is assumed that the kernel dump key is placed after the first header on the dump device and the core dump is encrypted.

Separate functions were implemented to write the kernel dump header and the kernel dump key as they need to be unencrypted. The dump_write function encrypts data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps are not supported due to the way they are constructed which makes it impossible to use the CBC mode for encryption. It should be also noted that textdumps don’t contain sensitive data by design as a user decides what information should be dumped.

savecore(8) writes the kernel dump key to a key.# file if its size in the header is nonzero. # is the number of the current core dump.

decryptcore(8) decrypts the core dump using a private RSA key and the kernel dump key. This is performed by a child process in capability mode. If the decryption was not successful the parent process removes a partially decrypted core dump.

Description on how to encrypt crash dumps was added to the decryptcore(8), dumpon(8), rc.conf(5) and savecore(8) manual pages.

EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU. The feature still has to be tested on arm and arm64 as it wasn’t possible to run FreeBSD due to the problems with QEMU emulation and lack of hardware.

Designed by: def, pjd
Reviewed by: cem, oshogbo, pjd
Partial review: delphij, emaste, jhb, kib
Approved by: pjd (mentor)
Differential Revision: https://reviews.freebsd.org/D4712
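
The header arithmetic in the log message above (a 512-byte header, a new 32-bit key-size field, and a panic string shortened by 4 bytes to pay for it) can be checked with a small C sketch. The structure below is a simplified stand-in: its name, its field names and the sizes of the fields not mentioned in the log message are assumptions made for illustration, and byte-order handling is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DUMP_HEADER_SIZE 512	/* one 512-byte block, per the log message */

/*
 * Simplified header layout.  Only the total size, the 32-bit key size
 * field and the 4-byte reduction of the panic string come from the log
 * message; everything else here is assumed.
 */
struct dump_header_sketch {
	char		magic[20];
	char		architecture[12];
	uint32_t	version;
	uint32_t	architectureversion;
	uint64_t	dumplength;
	uint64_t	dumptime;
	uint32_t	dumpkeysize;		/* the new field */
	uint32_t	blocksize;
	char		hostname[64];
	char		versionstring[192];
	char		panicstring[188];	/* assumed 192 - 4 */
	uint32_t	parity;
};

/* Fails to compile if a field is added without shrinking another. */
_Static_assert(sizeof(struct dump_header_sketch) == DUMP_HEADER_SIZE,
    "dump header must stay exactly one 512-byte block");

/*
 * A nonzero key size means a kernel dump key follows the first header
 * and the core itself is encrypted.
 */
bool
dump_is_encrypted(const struct dump_header_sketch *h)
{
	assert(h != NULL);
	return (h->dumpkeysize != 0);
}
```

Pinning the size with a compile-time assertion keeps the on-disk format stable, which matters because savecore(8) has to locate the header by its fixed size.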


r307540 | stevek | 2016-10-17 17:57:41 -0500 (Mon, 17 Oct 2016) | 9 lines

Add sysctl to make amd64 minidump retry count tunable at runtime.

PR: 213462
Submitted by: RaviPrakash Darbha <rdarbha@juniper.net>
Reviewed by: cemi, markj
Approved by: sjg (mentor)
Obtained from: Juniper Networks
Differential Revision: https://reviews.freebsd.org/D8254


r306020 | kib | 2016-09-20 04:38:07 -0500 (Tue, 20 Sep 2016) | 6 lines

Move pmap_p*e_index() inline functions from pmap.c to pmap.h. They are already used in minidump code.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week


r298076 | cem | 2016-04-15 12:45:12 -0500 (Fri, 15 Apr 2016) | 26 lines

Add 4Kn kernel dump support

(And 4Kn minidump support, but only for amd64.)

Make sure all I/O to the dump device is of the native sector size. To that end, we keep a native sector sized buffer associated with dump devices (di->blockbuf) and use it to pad smaller objects as needed (e.g. kerneldumpheader).

Add dump_write_pad() as a convenience API to dump smaller objects with zero padding. (Rather than pull in NPM leftpad, we wrote our own.)

Savecore(1) has been updated to deal with these dumps. The format for 512-byte sector dumps should remain backwards compatible.

Minidumps for other architectures are left as an exercise for the reader.

PR: 194279
Submitted by: ambrisko@
Reviewed by: cem (earlier version), rpokala
Tested by: rpokala (4Kn/512 except 512 fulldump), cem (512 fulldump)
Relnotes: yes
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D5848
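
The zero-padding approach described in the 4Kn log message above can be sketched as follows. `pad_to_sector()` is a hypothetical helper written for this illustration; it is not the kernel's dump_write_pad(), whose actual signature is not reproduced here. It copies an object smaller than a sector into a sector-sized buffer and zero-fills the remainder, so that every write to the dump device is exactly one native sector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes from src into blockbuf and zero-fill the rest of the
 * sector.  blockbuf must point to at least sector_size bytes.
 * Returns 0 on success, or -1 if the object is larger than one sector.
 */
int
pad_to_sector(void *blockbuf, size_t sector_size, const void *src, size_t len)
{
	assert(blockbuf != NULL && src != NULL);

	if (len > sector_size)
		return (-1);
	memcpy(blockbuf, src, len);
	memset((char *)blockbuf + len, 0, sector_size - len);
	return (0);
}
```

With a 4096-byte sector and a 512-byte header, the first 512 bytes of the buffer hold the header and the remaining 3584 bytes are zero, so a reader that only understands the 512-byte header still finds it at the start of the sector.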

  • sys/kern/kern_dump.c
  • sys/kern/kern_shutdown.c
  • sys/amd64/amd64/minidump_machdep.c
  • and rarely bits might be in sys/amd64/amd64/pmap.c

7 Footnotes

1 The Design and Implementation of the FreeBSD Operating System by McKusick, Neville-Neil, and Watson

2 crash(8) - 3BSD

3 man 9 panic - https://www.freebsd.org/cgi/man.cgi?query=panic&apropos=0&sektion=0&manpath=FreeBSD+10.3-RELEASE+and+Ports&arch=default&format=html

4 kern_shutdown.c - https://svnweb.freebsd.org/base/head/sys/kern/kern_shutdown.c?view=markup#l336

5 Unix History Repository - https://github.com/dspinellis/unix-history-repo

6 A Repository with 44 Years of Unix Evolution - http://www.dmst.aueb.gr/dds/pubs/conf/2015-MSR-Unix-History/html/Spi15c.html

7 https://en.wikipedia.org/wiki/Core_dump

8 https://lists.freebsd.org/pipermail/freebsd-arch/2014-November/016231.html

9 https://en.wikipedia.org/wiki/Core_dump

10 https://developer.apple.com/library/content/technotes/tn2004/tn2118.html

11 https://lists.freebsd.org/pipermail/freebsd-current/2007-December/081626.html

12 https://opensource.apple.com/source/xnu/xnu-3789.31.2/osfmk/kdp/kdp_core.c.auto.html

13 https://svnweb.freebsd.org/base?view=revision&revision=309818

14 https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html

15 https://svnweb.freebsd.org/base/head/sys/amd64/amd64/minidump_machdep.c?revision=157908&view=markup

16 https://opensource.apple.com/source/network_cmds/network_cmds-396.6/kdumpd.tproj/kdumpd.8.auto.html

17 https://people.freebsd.org/~rgrimes/

18 https://www.freebsd.org/cgi/man.cgi?query=textdump&apropos=0&sektion=0&manpath=FreeBSD+11.0-RELEASE+and+Ports&arch=default&format=html

19 https://lists.freebsd.org/pipermail/freebsd-current/2007-December/081626.html

20 https://en.wikipedia.org/wiki/Illumos

21 https://illumos.org/man/1m/dumpadm

22 https://illumos.org/man/1m/savecore

23 https://backtrace.io/blog/supporting-freebsd-backtrace-and-bsd-now/

24 https://backtrace.io/blog/whats-a-coredump/

25 https://svnweb.freebsd.org/base?view=revision&revision=302976

26 https://documentation.backtrace.io/coresnap_integration/

27 https://documentation.backtrace.io/hydra/

28 https://documentation.backtrace.io/overview/

29 https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html