Skip to content
Permalink
Browse files
panic: disable optimistic spin after halting CPUs
A CPU executing with console lock spinning enabled might be halted
during a panic. Before console_flush_on_panic(), it's possible for
console_trylock() to attempt optimistic spinning, deadlocking the panic
CPU:

CPU 0 (panic CPU)             CPU 1
-----------------             ------
                              printk() {
                                vprintk_func() {
                                  vprintk_default() {
                                    vprintk_emit() {
                                      console_unlock() {
                                        console_lock_spinning_enable();
                                        ... printing to console ...
panic() {
  crash_smp_send_stop() {
    NMI  -------------------> HALT
  }
  atomic_notifier_call_chain() {
    printk() {
      ...
      console_trylock_spinnning() {
        // optimistic spin infinitely

This hang during panic can be induced when a kdump kernel is loaded, and
crash_kexec_post_notifiers=1 is present on the kernel command line. The
following script which concurrently writes to /dev/kmsg, and triggers a
panic, can result in this hang:

    #!/bin/bash
    date
    # 991 chars (based on log buffer size):
    chars="$(printf 'a%.0s' {1..991})"
    while :; do
        echo $chars > /dev/kmsg
    done &
    echo c > /proc/sysrq-trigger &
    date
    exit

Below are the hang rates for the above test case, based on v5.16-rc8
before and after this patch:

before:  197 hangs / 340 trials - 57.9%
after :    0 hangs / 245 trials -  0.0%

Fixes: dbdda84 ("printk: Add console owner and waiter logic to load balance console writes")

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
  • Loading branch information
brenns10 authored and intel-lab-lkp committed Jan 13, 2022
1 parent 362f533 commit 3c321d1696e15907efda52875980b82c4bbf64bc
Show file tree
Hide file tree
Showing 3 changed files with 25 additions and 0 deletions.
@@ -179,6 +179,7 @@ extern void console_unlock(void);
extern void console_conditional_schedule(void);
extern void console_unblank(void);
extern void console_flush_on_panic(enum con_flush_mode mode);
extern void console_lock_spinning_disable_on_panic(void);
extern struct tty_driver *console_device(int *);
extern void console_stop(struct console *);
extern void console_start(struct console *);
@@ -265,6 +265,13 @@ void panic(const char *fmt, ...)
crash_smp_send_stop();
}

/*
* Now that we've halted other CPUs, disable optimistic spinning in
* printk(). This avoids deadlocking in console_trylock(), until we call
* console_flush_on_panic().
*/
console_lock_spinning_disable_on_panic();

/*
* Run any panic handlers, including those that might need to
* add information to the kmsg dump output.
@@ -1822,6 +1822,23 @@ static int console_lock_spinning_disable_and_check(void)
return 1;
}

/**
* console_lock_spinning_disable_on_panic - disable spinning so that
* a panic CPU does not enter an infinite loop
*
* This is called once all CPUs are halted. A CPU halted during a section which
* allowed spinning, could trigger an infinite loop in console_trylock. To avoid
* this, mark console_owner as NULL.
*/
void console_lock_spinning_disable_on_panic(void)
{
WRITE_ONCE(console_owner, NULL);
if (raw_spin_is_locked(&console_owner_lock)) {
debug_locks_off();
raw_spin_lock_init(&console_owner_lock);
}
}

/**
* console_trylock_spinning - try to get console_lock by busy waiting
*

0 comments on commit 3c321d1

Please sign in to comment.