include sys/sysmacros.h for major/minor/makedev #4
These funcs are defined in the sys/sysmacros.h header, not sys/types.h. Linux C libraries are updating to drop the implicit include, so we need to include it explicitly.
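For illustration, a minimal sketch of what the explicit include looks like in a caller. The helper names `dev_roundtrip_ok` and `format_dev` are hypothetical and are not part of crash; only `major()`, `minor()`, `makedev()` and the header name come from the description above.

```c
#include <stdio.h>
#include <sys/types.h>      /* dev_t */
#include <sys/sysmacros.h>  /* major(), minor(), makedev() */

/* Build a dev_t and check that major()/minor() recover the two parts. */
int dev_roundtrip_ok(unsigned int maj, unsigned int min)
{
    dev_t dev = makedev(maj, min);

    return major(dev) == maj && minor(dev) == min;
}

/* Format a device number as "major:minor"; returns snprintf()'s result. */
int format_dev(char *buf, size_t len, dev_t dev)
{
    return snprintf(buf, len, "%u:%u",
                    (unsigned int)major(dev), (unsigned int)minor(dev));
}
```

With the header included explicitly, the code no longer depends on whether sys/types.h happens to pull the macros in.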
----- Original Message -----
OK thanks Mike, I'll add the change. BTW, I don't use the github pull request facility. Patches should [...]

Thanks again,
Without the patch, the backtrace displays the "cannot resolve stack trace" warning, dumps the backtrace, and then the text symbols:

  crash> bt
  PID: 0 TASK: f0962180 CPU: 6 COMMAND: "swapper/6"
  bt: cannot resolve stack trace:
   #0 [f095ff1c] __schedule at c0b6ef8d
   #1 [f095ff58] schedule at c0b6f4a9
   #2 [f095ff64] schedule_preempt_disabled at c0b6f728
   #3 [f095ff6c] cpu_startup_entry at c04b0310
   #4 [f095ff94] start_secondary at c04468c0
  bt: text symbols on stack:
      [f095ff1c] __schedule at c0b6ef8d
      [f095ff58] schedule at c0b6f4ae
      [f095ff64] schedule_preempt_disabled at c0b6f72d
      [f095ff6c] cpu_startup_entry at c04b0315
      [f095ff94] start_secondary at c04468c5
  crash>

The backtrace shown is actually correct.
(anderson@redhat.com)
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Fix for 'bt' command and options on Linux 5.8-rc1 and later kernels that contain merge commit 076f14be7fc942e112c94c841baec44124275cd0. The merged patches changed the name of exception functions that have been used by the crash utility to check the exception frame. Without the patch, the command and options cannot display it.

Before:

  crash> bt
  PID: 8752 TASK: ffff8f80cb244380 CPU: 2 COMMAND: "insmod"
   #0 [ffffa3e40187f9f8] machine_kexec at ffffffffab25d267
   #1 [ffffa3e40187fa48] __crash_kexec at ffffffffab38e2ed
   #2 [ffffa3e40187fb10] crash_kexec at ffffffffab38f1dd
   #3 [ffffa3e40187fb28] oops_end at ffffffffab222cbd
   #4 [ffffa3e40187fb48] do_trap at ffffffffab21fea1
   #5 [ffffa3e40187fb90] do_error_trap at ffffffffab21ff75
   #6 [ffffa3e40187fbd0] exc_invalid_op at ffffffffabb76a2c
   #7 [ffffa3e40187fbf0] asm_exc_invalid_op at ffffffffabc00a72
   #8 [ffffa3e40187fc78] init_module at ffffffffc042b018 [invalid]
   #9 [ffffa3e40187fca0] init_module at ffffffffc042b018 [invalid]
  #10 [ffffa3e40187fca8] do_one_initcall at ffffffffab202806
  #11 [ffffa3e40187fd18] do_init_module at ffffffffab3888ba
  #12 [ffffa3e40187fd38] load_module at ffffffffab38afde

After:

  crash> bt
  PID: 8752 TASK: ffff8f80cb244380 CPU: 2 COMMAND: "insmod"
   #0 [ffffa3e40187f9f8] machine_kexec at ffffffffab25d267
   #1 [ffffa3e40187fa48] __crash_kexec at ffffffffab38e2ed
   #2 [ffffa3e40187fb10] crash_kexec at ffffffffab38f1dd
   #3 [ffffa3e40187fb28] oops_end at ffffffffab222cbd
   #4 [ffffa3e40187fb48] do_trap at ffffffffab21fea1
   #5 [ffffa3e40187fb90] do_error_trap at ffffffffab21ff75
   #6 [ffffa3e40187fbd0] exc_invalid_op at ffffffffabb76a2c
   #7 [ffffa3e40187fbf0] asm_exc_invalid_op at ffffffffabc00a72
      [exception RIP: init_module+24]
      RIP: ffffffffc042b018 RSP: ffffa3e40187fca8 RFLAGS: 00010246
      RAX: 000000000000001c RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff8f80fbd18000 RDI: ffff8f80fbd18000
      RBP: ffffffffc042b000 R8: 000000000000029d R9: 000000000000002c
      R10: 0000000000000000 R11: ffffa3e40187fb58 R12: ffffffffc042d018
      R13: ffffa3e40187fdf0 R14: ffffffffc042d000 R15: ffffa3e40187fe90
      ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
   #8 [ffffa3e40187fca0] init_module at ffffffffc042b018 [invalid]
   #9 [ffffa3e40187fca8] do_one_initcall at ffffffffab202806
  #10 [ffffa3e40187fd18] do_init_module at ffffffffab3888ba
  #11 [ffffa3e40187fd38] load_module at ffffffffab38afde

Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
…'bt' command dumpfiles:
(1) If the kernel's crash_notes are not available, read them from ELF notes.
(2) If an online CPU did not save its ELF notes, then adjust the mapping of each ELF note to its CPU accordingly.

E.g. With this patch:

  crash> bt
  PID: 4768 TASK: 9800000243bcf200 CPU: 3 COMMAND: "bash"
   #0 [980000024291f930] __crash_kexec at ffffffff802fff84
   #1 [980000024291faa0] panic at ffffffff80248cac
   #2 [980000024291fb40] die at ffffffff8021b338
   #3 [980000024291fb70] do_page_fault at ffffffff802315e0
   #4 [980000024291fbd0] tlb_do_page_fault_1 at ffffffff80239388
   #5 [980000024291fd00] sysrq_handle_crash at ffffffff8085d308
   #6 [980000024291fd10] __handle_sysrq at ffffffff8085d9e0
   #7 [980000024291fd60] write_sysrq_trigger at ffffffff8085e020
   #8 [980000024291fd80] proc_reg_write at ffffffff804762f0
   #9 [980000024291fda0] __vfs_write at ffffffff803f3138

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
The overflow stack has been supported since kernel 4.14 (commit 872d8327ce8). Without this patch, the bt command triggers a SIGSEGV because the SP points to the overflow stack, which crash has not yet loaded.

Before:

  KERNEL: ../vmlinux
  DUMPFILE: la_guestdump.gcore
  CPUS: 8
  DATE: Tue Jul 13 19:59:44 CST 2021
  UPTIME: 00:00:42
  LOAD AVERAGE: 3.99, 1.13, 0.39
  TASKS: 1925
  NODENAME: localhost
  RELEASE: 4.14.156+
  VERSION: #1 SMP PREEMPT Tue Jul 13 10:37:23 UTC 2021
  MACHINE: aarch64 (unknown Mhz)
  MEMORY: 8.7 GB
  PANIC: "Kernel panic - not syncing: kernel stack overflow"
  PID: 1969
  COMMAND: "irq/139-0-0024"
  TASK: ffffffcc1a230000 [THREAD_INFO: ffffffcc1a230000]
  CPU: 0
  STATE: TASK_RUNNING (PANIC)

  crash-7.3.0> bt
  PID: 1969 TASK: ffffffcc1a230000 CPU: 0 COMMAND: "irq/139-0-0024"
  Segmentation fault (core dumped)

After:

  crash> bt
  PID: 1969 TASK: ffffffcc1a230000 CPU: 0 COMMAND: "irq/139-0-0024"
   #0 [ffffffcc7fd5cf50] __delay at ffffff8008c80774
   #1 [ffffffcc7fd5cf60] __const_udelay at ffffff8008c80864
   #2 [ffffffcc7fd5cf80] msm_trigger_wdog_bite at ffffff80084e9430
   #3 [ffffffcc7fd5cfa0] do_vm_restart at ffffff80087bc974
   #4 [ffffffcc7fd5cfc0] machine_restart at ffffff80080856fc
   #5 [ffffffcc7fd5cfd0] emergency_restart at ffffff80080d49bc
   #6 [ffffffcc7fd5d140] panic at ffffff80080af4c0
   #7 [ffffffcc7fd5d150] nmi_panic at ffffff80080af150
   #8 [ffffffcc7fd5d190] handle_bad_stack at ffffff800808b0b8
   #9 [ffffffcc7fd5d2d0] __bad_stack at ffffff800808285c
  --- <IRQ stack> ---
  #10 [ffffff801187bc60] el1_error_invalid at ffffff8008082e7c
  #11 [ffffff801187bcc0] cyttsp6_mt_attention at ffffff8000e8498c [cyttsp6]
  #12 [ffffff801187bd20] call_atten_cb at ffffff8000e82030 [cyttsp6]
  #13 [ffffff801187bdc0] cyttsp6_irq at ffffff8000e81e34 [cyttsp6]
  #14 [ffffff801187bdf0] irq_thread_fn at ffffff8008128dd8
  #15 [ffffff801187be50] irq_thread at ffffff8008128ca4
  #16 [ffffff801187beb0] kthread at ffffff80080d2fc4
  crash>

Signed-off-by: Hong YANG <hong.yang3@nio.com>
When we use crash to troubleshoot softlockup and other problems, we often use the 'bt -a' command to print the stacks of running processes on all CPUs. But now some servers have hundreds of CPUs (such as AMD machines), which causes the 'bt -a' command to output a lot of process stacks. And many of these stacks are the stacks of the idle process, which are not needed by us.

Therefore, in order to reduce this part of the interference information, this patch adds the -n option to the bt command. When we specify '-n idle' (meaning no idle), the stack of the idle process will be filtered out, thus speeding up our troubleshooting.

And the option works only for crash dumps captured by kdump.

The command output is as follows:

  crash> bt -a -n idle
  [...]
  PID: 0 TASK: ffff889ff8c34380 CPU: 8 COMMAND: "swapper/8"

  PID: 0 TASK: ffff889ff8c32d00 CPU: 9 COMMAND: "swapper/9"

  PID: 0 TASK: ffff889ff8c31680 CPU: 10 COMMAND: "swapper/10"

  PID: 0 TASK: ffff889ff8c35a00 CPU: 11 COMMAND: "swapper/11"

  PID: 0 TASK: ffff889ff8c3c380 CPU: 12 COMMAND: "swapper/12"

  PID: 150773 TASK: ffff889fe85a1680 CPU: 13 COMMAND: "bash"
   #0 [ffffc9000d35bcd0] machine_kexec at ffffffff8105a407
   #1 [ffffc9000d35bd28] __crash_kexec at ffffffff8113033d
   #2 [ffffc9000d35bdf0] panic at ffffffff81081930
   #3 [ffffc9000d35be70] sysrq_handle_crash at ffffffff814e38d1
   #4 [ffffc9000d35be78] __handle_sysrq.cold.12 at ffffffff814e4175
   #5 [ffffc9000d35bea8] write_sysrq_trigger at ffffffff814e404b
   #6 [ffffc9000d35beb8] proc_reg_write at ffffffff81330d86
   #7 [ffffc9000d35bed0] vfs_write at ffffffff812a72d5
   #8 [ffffc9000d35bf00] ksys_write at ffffffff812a7579
   #9 [ffffc9000d35bf38] do_syscall_64 at ffffffff81004259
      RIP: 00007fa7abcdc274 RSP: 00007fffa731f678 RFLAGS: 00000246
      RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fa7abcdc274
      RDX: 0000000000000002 RSI: 0000563ca51ee6d0 RDI: 0000000000000001
      RBP: 0000563ca51ee6d0 R8: 000000000000000a R9: 00007fa7abd6be80
      R10: 000000000000000a R11: 0000000000000246 R12: 00007fa7abdad760
      R13: 0000000000000002 R14: 00007fa7abda8760 R15: 0000000000000002
      ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
  [...]

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Acked-by: Lianbo Jiang <lijiang@redhat.com>
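The filtering described above can be sketched roughly as follows. This is not the code from the patch: `struct task_summary`, `is_idle_task` and `count_stacks_to_dump` are hypothetical names, and identifying an idle task by PID 0 is an assumption based on the "swapper/N" entries in the output above.

```c
#include <stddef.h>

/* Hypothetical per-task summary, holding only what the filter needs. */
struct task_summary {
    int pid;
    int cpu;
    const char *comm;
};

/* Assumption: the per-CPU idle ("swapper/N") tasks are the ones with PID 0. */
int is_idle_task(const struct task_summary *t)
{
    return t->pid == 0;
}

/*
 * Count the tasks whose stack frames would still be dumped.
 * With no_idle set, an idle task keeps its header line but its
 * frames are skipped, as in the 'bt -a -n idle' output above.
 */
size_t count_stacks_to_dump(const struct task_summary *tasks, size_t n,
                            int no_idle)
{
    size_t shown = 0;

    for (size_t i = 0; i < n; i++) {
        if (no_idle && is_idle_task(&tasks[i]))
            continue;           /* header only, no frames */
        shown++;
    }
    return shown;
}
```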
The previous implementation located the call instruction by running strstr for "call" and then checking whether the previous char is ' ' or '\t'. That implementation is problematic. For example, it cannot resolve the following disassembly string:

  "0xffffffffc0995378 <nfs41_callback_svc+344>:\tcall 0xffffffff8ecfa4c0 <schedule>\n"

strstr locates the "_call" inside the symbol name and the char check fails; as a result, extract_hex fails to get the calling address.

NOTE: the issue is only likely to be reproduced when patch[1] is applied. Without patch[1], the disassembly string is as follows, and the issue is not reproducible:

  "0xffffffffc0995378:\tcall 0xffffffff8ecfa4c0 <schedule>\n"

Before the patch:

  crash> bt 1472
  PID: 1472 TASK: ffff8c121fa72f70 CPU: 18 COMMAND: "nfsv4.1-svc"
   #0 [ffff8c16231a3db8] __schedule at ffffffff8ecf9ef3
   #1 [ffff8c16231a3e40] schedule at ffffffff8ecfa4e9

After the patch:

  crash> bt 1472
  PID: 1472 TASK: ffff8c121fa72f70 CPU: 18 COMMAND: "nfsv4.1-svc"
   #0 [ffff8c16231a3db8] __schedule at ffffffff8ecf9ef3
   #1 [ffff8c16231a3e40] schedule at ffffffff8ecfa4e9
   #2 [ffff8c16231a3e50] nfs41_callback_svc at ffffffffc099537d [nfsv4]
   #3 [ffff8c16231a3ec8] kthread at ffffffff8e6b966f
   #4 [ffff8c16231a3f50] ret_from_fork at ffffffff8ed07898

This patch fixes the issue by running strstr for "\tcall" and " call" to locate the correct call instruction.

[1]: https://listman.redhat.com/archives/crash-utility/2022-August/010085.html

Signed-off-by: Tao Liu <ltao@redhat.com>
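A minimal sketch of the search described above, assuming a single line of disassembly as input. The function names `find_call_mnemonic` and `call_target` are hypothetical; the real code uses crash's own extract_hex helper rather than strtoull.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a pointer to the "call" mnemonic in one line of disassembly,
 * or NULL if there is none. Searching for "\tcall" and " call" means
 * a "_call" inside a symbol name can no longer match.
 */
const char *find_call_mnemonic(const char *line)
{
    const char *p;

    if ((p = strstr(line, "\tcall")) || (p = strstr(line, " call")))
        return p + 1;           /* skip the leading whitespace char */
    return NULL;
}

/* Parse the hex operand that follows the mnemonic; 0 if not found. */
unsigned long long call_target(const char *line)
{
    const char *p = find_call_mnemonic(line);

    if (!p)
        return 0;
    /* Skip the rest of the mnemonic (covers "call" and "callq"). */
    while (*p && !isspace((unsigned char)*p))
        p++;
    /* strtoull skips leading whitespace and accepts the "0x" prefix. */
    return strtoull(p, NULL, 16);
}
```

Applied to the problematic string from the commit message, the search skips `nfs41_callback_svc` and lands on the instruction itself.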
On a KASAN kernel vmcore, the "bt" command displays truncated backtraces like this:

  crash> bt
  PID: 4131 TASK: ffff8001521df000 CPU: 3 COMMAND: "bash"
   #0 [ffff2000224b0cb0] machine_kexec_prepare at ffff2000200bff4c

After digging into the root cause, it turns out that arm64_in_kdump_text() found the wrong bt->bptr in the "machine_kexec" branch.

Disassemble machine_kexec() of the KASAN vmlinux (gcc 7.3.0):

  crash> dis -x machine_kexec
  0xffff2000200bff50 <machine_kexec>: stp x29, x30, [sp,#-208]!
  0xffff2000200bff54 <machine_kexec+0x4>: mov x29, sp
  0xffff2000200bff58 <machine_kexec+0x8>: stp x19, x20, [sp,#16]
  0xffff2000200bff5c <machine_kexec+0xc>: str x24, [sp,#56]
  0xffff2000200bff60 <machine_kexec+0x10>: str x26, [sp,#72]
  0xffff2000200bff64 <machine_kexec+0x14>: mov x2, #0x8ab3
  0xffff2000200bff68 <machine_kexec+0x18>: add x1, x29, #0x70
  0xffff2000200bff6c <machine_kexec+0x1c>: lsr x1, x1, #3
  0xffff2000200bff70 <machine_kexec+0x20>: movk x2, #0x41b5, lsl #16
  0xffff2000200bff74 <machine_kexec+0x24>: mov x19, #0x200000000000
  0xffff2000200bff78 <machine_kexec+0x28>: adrp x3, 0xffff2000224b0000
  0xffff2000200bff7c <machine_kexec+0x2c>: movk x19, #0xdfff, lsl #48
  0xffff2000200bff80 <machine_kexec+0x30>: add x3, x3, #0xcb0
  0xffff2000200bff84 <machine_kexec+0x34>: add x4, x1, x19
  0xffff2000200bff88 <machine_kexec+0x38>: stp x2, x3, [x29,#112]
  0xffff2000200bff8c <machine_kexec+0x3c>: adrp x2, 0xffff2000200bf000 <swsusp_arch_resume+0x1e8>
  0xffff2000200bff90 <machine_kexec+0x40>: add x2, x2, #0xf50
  0xffff2000200bff94 <machine_kexec+0x44>: str x2, [x29,#128]
  0xffff2000200bff98 <machine_kexec+0x48>: mov w2, #0xf1f1f1f1
  0xffff2000200bff9c <machine_kexec+0x4c>: str w2, [x1,x19]
  0xffff2000200bffa0 <machine_kexec+0x50>: mov w2, #0xf200
  0xffff2000200bffa4 <machine_kexec+0x54>: mov w1, #0xf3f3f3f3
  0xffff2000200bffa8 <machine_kexec+0x58>: movk w2, #0xf2f2, lsl #16
  0xffff2000200bffac <machine_kexec+0x5c>: stp w2, w1, [x4,#4]

We notice that:
1. machine_kexec() start address is 0xffff2000200bff50
2. the instruction at machine_kexec+0x44 stores the same value 0xffff2000200bff50 (comes from 0xffff2000200bf000 + 0xf50) into stack position [x29,#128].

When arm64_in_kdump_text() searches the stack for LR, it encounters 0xffff2000200bff50 first and so gets the wrong bt->bptr.

We know that the real LR is always greater than the start address of a function, so let's fix it by changing the search condition to (*ptr > xxx_start) && (*ptr < xxx_end).

Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
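The changed search condition can be sketched as below. This is only an illustration of the strict lower bound, not the body of arm64_in_kdump_text(); the function name `find_saved_lr` and the flat array of stack words are assumptions.

```c
#include <stddef.h>

/*
 * Scan stack words for a plausible saved LR inside a function's text.
 * A return address always lies past the function's first instruction,
 * so a word that merely equals func_start (for example a stored
 * function pointer) is rejected by the strict '>' comparison.
 * Returns the index of the first match, or -1 if none is found.
 */
long find_saved_lr(const unsigned long long *stack, size_t nwords,
                   unsigned long long func_start,
                   unsigned long long func_end)
{
    for (size_t i = 0; i < nwords; i++) {
        if (stack[i] > func_start && stack[i] < func_end)
            return (long)i;
    }
    return -1;
}
```

With an inclusive lower bound, the word stored by the instruction at machine_kexec+0x44 would have matched first.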
1. Add the implementation to get the stack frame from an active or inactive task's stack.
2. Add 'bt -l' command support to get the line number associated with the current pc address.
3. Add 'bt -f' command support to display all stack data contained in a frame.

With the patch, we can get the backtrace:

  crash> bt
  PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh"
   #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8
   #1 [ff20000010333cf0] panic at ffffffff806578c6
   #2 [ff20000010333d50] sysrq_reset_seq_param_set at ffffffff8038c03c
   #3 [ff20000010333da0] __handle_sysrq at ffffffff8038c604
   #4 [ff20000010333e00] write_sysrq_trigger at ffffffff8038cae4
   #5 [ff20000010333e20] proc_reg_write at ffffffff801b7ee8
   #6 [ff20000010333e40] vfs_write at ffffffff80152bb2
   #7 [ff20000010333e80] ksys_write at ffffffff80152eda
   #8 [ff20000010333ed0] sys_write at ffffffff80152f52

  crash> bt -l
  PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh"
   #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8
      /buildroot/qemu_riscv64_virt_defconfig/build/linux-custom/arch/riscv/kernel/crash_save_regs.S: 47
   #1 [ff20000010333cf0] panic at ffffffff806578c6
      /buildroot/qemu_riscv64_virt_defconfig/build/linux-custom/kernel/panic.c: 276
   ... ...

  crash> bt -f
  PID: 113 TASK: ff6000000226c200 CPU: 0 COMMAND: "sh"
   #0 [ff20000010333b90] riscv_crash_save_regs at ffffffff800078f8
      [PC: ffffffff800078f8 RA: ffffffff806578c6 SP: ff20000010333b90 SIZE: 352]
      ff20000010333b90: ff20000010333bb0 ffffffff800078f8
      ff20000010333ba0: ffffffff8008862c ff20000010333b90
      ff20000010333bb0: ffffffff810dde38 ff6000000226c200
      ff20000010333bc0: ffffffff8032be68 0720072007200720
   ... ...

Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Kernel commit 7d65f4a65532 ("irq: Consolidate do_softirq() arch overriden implementations") renamed call_softirq to do_softirq_own_stack, and there is likewise no exception frame when coming from do_softirq_own_stack. Without the patch, crash may unnecessarily output an exception frame with a warning as below:

  crash> foreach bt
  ...
  PID: 0 TASK: ffff914f820a8000 CPU: 25 COMMAND: "swapper/25"
   #0 [fffffe0000504e48] crash_nmi_callback at ffffffffa665d763
   #1 [fffffe0000504e50] nmi_handle at ffffffffa662a423
   #2 [fffffe0000504ea8] default_do_nmi at ffffffffa6fe7dc9
   #3 [fffffe0000504ec8] do_nmi at ffffffffa662a97f
   #4 [fffffe0000504ef0] end_repeat_nmi at ffffffffa70015e8
      [exception RIP: clone_endio+172]
      RIP: ffffffffc005c1ec RSP: ffffa1d403d08e98 RFLAGS: 00000246
      RAX: 0000000000000000 RBX: ffff915326fba230 RCX: 0000000000000018
      RDX: ffffffffc0075400 RSI: 0000000000000000 RDI: ffff915326fba230
      RBP: ffff915326fba1c0 R8: 0000000000001000 R9: ffff915308d6d2a0
      R10: 000000a97dfe5e10 R11: ffffa1d40038fe98 R12: ffff915302babc40
      R13: ffff914f94360000 R14: 0000000000000000 R15: 0000000000000000
      ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
  --- <NMI exception stack> ---
   #5 [ffffa1d403d08e98] clone_endio at ffffffffc005c1ec [dm_mod]
   #6 [ffffa1d403d08ed0] blk_update_request at ffffffffa6a96954
   #7 [ffffa1d403d08f10] scsi_end_request at ffffffffa6c9b968
   #8 [ffffa1d403d08f48] scsi_io_completion at ffffffffa6c9bb3e
   #9 [ffffa1d403d08f90] blk_complete_reqs at ffffffffa6aa0e95
  #10 [ffffa1d403d08fa0] __softirqentry_text_start at ffffffffa72000dc
  #11 [ffffa1d403d08ff0] do_softirq_own_stack at ffffffffa7000f9a
  --- <IRQ stack> ---
  #12 [ffffa1d40038fe70] do_softirq_own_stack at ffffffffa7000f9a
      [exception RIP: unknown or invalid address]
      RIP: 0000000000000000 RSP: 0000000000000000 RFLAGS: 00000000
      RAX: ffffffffa672eae5 RBX: ffffffffa83b34e0 RCX: ffffffffa672eb12
      RDX: 0000000000000010 RSI: 8b7d6c8869010c00 RDI: 0000000000000085
      RBP: 0000000000000286 R8: ffff914f820a8000 R9: ffffffffa67a94e0
      R10: 0000000000000286 R11: ffffffffa66fb4c5 R12: ffffffffa67a898b
      R13: 0000000000000000 R14: fffffffffffffff8 R15: ffffffffa67a1e68
      ORIG_RAX: 0000000000000000 CS: 0000 SS: ffffffffa672edff
  bt: WARNING: possibly bogus exception frame
  #13 [ffffa1d40038ff30] start_secondary at ffffffffa665fa2c
  #14 [ffffa1d40038ff50] secondary_startup_64_no_verify at ffffffffa6600116
  ...

Reported-by: Marco Patalano <mpatalan@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
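As a rough sketch of the idea, a symbol check that treats both the old and the new name the same way might look like the following. The helper name `is_softirq_stack_switch` is hypothetical, and how crash actually wires such a check into its x86_64 unwinder is not shown in the commit message.

```c
#include <string.h>

/*
 * Return nonzero if sym is one of the functions that switch to the
 * softirq stack. No exception frame is expected when the IRQ stack
 * was entered from either of them. Both names come from the commit
 * message: call_softirq is the older name, do_softirq_own_stack the
 * one used after kernel commit 7d65f4a65532.
 */
int is_softirq_stack_switch(const char *sym)
{
    return strcmp(sym, "call_softirq") == 0 ||
           strcmp(sym, "do_softirq_own_stack") == 0;
}
```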
On kernels configured with CONFIG_RANDOMIZE_KSTACK_OFFSET=y and random_kstack_offset=on, a random offset is added to task stacks with __kstack_alloca() at the beginning of do_syscall_64() and other syscall entry functions. This eventually executes the following instruction:

  <do_syscall_64+32>: sub %rax,%rsp

On the other hand, crash uses only part of the ORC unwinder's data to unwind stacks, and if an ip value doesn't have usable ORC data, it calculates the frame size by parsing the assembly of the function.

However, crash cannot calculate the frame size correctly with the instruction above, and prints stale return addresses like this:

  crash> bt 1
  PID: 1 TASK: ffff9c250023b880 CPU: 0 COMMAND: "systemd"
   #0 [ffffb7e5c001fc80] __schedule at ffffffff91ae2b16
   #1 [ffffb7e5c001fd00] schedule at ffffffff91ae2ed3
   #2 [ffffb7e5c001fd18] schedule_hrtimeout_range_clock at ffffffff91ae7ed8
   #3 [ffffb7e5c001fda8] ep_poll at ffffffff913ef828
   #4 [ffffb7e5c001fe48] do_epoll_wait at ffffffff913ef943
   #5 [ffffb7e5c001fe80] __x64_sys_epoll_wait at ffffffff913f0130
   #6 [ffffb7e5c001fed0] do_syscall_64 at ffffffff91ad7169
   #7 [ffffb7e5c001fef0] do_syscall_64 at ffffffff91ad7179 <<
   #8 [ffffb7e5c001ff10] syscall_exit_to_user_mode at ffffffff91adaab2 << stale entries
   #9 [ffffb7e5c001ff20] do_syscall_64 at ffffffff91ad7179 <<
  #10 [ffffb7e5c001ff50] entry_SYSCALL_64_after_hwframe at ffffffff91c0009b
      RIP: 00007f258d9427ae RSP: 00007fffda631d60 RFLAGS: 00000293
      ...

To fix this, enhance the use of ORC data. The ORC unwinder often uses the %rbp value, so keep it from exception frames and inactive task stacks.

Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Kernel commit fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in two"), which is contained in Linux 6.4 and later kernels, changed the ORC_TYPE_CALL macro from 0 to 2. As a result, the "bt" command cannot use ORC entries, and can display stale entries in a call trace.

  crash> bt 1
  PID: 1 TASK: ffff93cd06294180 CPU: 51 COMMAND: "systemd"
   #0 [ffffb72bc00cbc98] __schedule at ffffffff86e52aae
   #1 [ffffb72bc00cbd00] schedule at ffffffff86e52f6a
   #2 [ffffb72bc00cbd18] schedule_hrtimeout_range_clock at ffffffff86e58ef5
   #3 [ffffb72bc00cbd88] ep_poll at ffffffff8669624d
   #4 [ffffb72bc00cbe28] do_epoll_wait at ffffffff86696371
   #5 [ffffb72bc00cbe30] do_timerfd_settime at ffffffff8669902b <<
   #6 [ffffb72bc00cbe60] __x64_sys_epoll_wait at ffffffff86696bf0
   #7 [ffffb72bc00cbeb0] do_syscall_64 at ffffffff86e3feb9
   #8 [ffffb72bc00cbee0] __task_pid_nr_ns at ffffffff863330d7 <<
   #9 [ffffb72bc00cbf08] syscall_exit_to_user_mode at ffffffff86e466b2 << stale entries
  #10 [ffffb72bc00cbf18] do_syscall_64 at ffffffff86e3fec9 <<
  #11 [ffffb72bc00cbf50] entry_SYSCALL_64_after_hwframe at ffffffff870000aa

Also, kernel commit ffb1b4a41016 added a member to struct orc_entry. Although this does not affect crash's unwinder, its debugging information can be displayed incorrectly.

To fix these,
(1) introduce a "kernel_orc_entry_6_4" structure corresponding to 6.4 and an abstraction layer "orc_entry" structure in crash,
(2) switch ORC_TYPE_CALL to 2 or 0 according to the kernel's orc_entry structure.

Related orc_entry history:
  v4.14 39358a033b2e introduced struct orc_entry
  v4.19 d31a580266ee added orc_entry.end member
  v6.3  ffb1b4a41016 added orc_entry.signal member
  v6.4  fb799447ae29 removed end member and changed type member to 3 bits

Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
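Step (2) can be illustrated with a small sketch. The values 0 and 2 come from the commit message; the `enum orc_layout` tag and both function names are hypothetical, and how crash actually detects which layout a kernel uses is not shown here.

```c
/* Hypothetical tag for which orc_entry layout the kernel uses. */
enum orc_layout {
    ORC_LAYOUT_PRE_6_4,     /* v4.14 up to v6.3 */
    ORC_LAYOUT_6_4          /* v6.4 and later */
};

/* Value of the kernel's ORC_TYPE_CALL macro for the given layout. */
int orc_type_call(enum orc_layout layout)
{
    return layout == ORC_LAYOUT_6_4 ? 2 : 0;
}

/* Check an entry's type field against the layout-specific value. */
int orc_entry_is_call(enum orc_layout layout, unsigned int type)
{
    return type == (unsigned int)orc_type_call(layout);
}
```

Comparing against a fixed 0 on a 6.4 kernel would reject every call-type entry, which matches the symptom described above.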
Previously we could only view the stack unwinding for the tasks that are running on each CPU. This patch adds the ability to view the stack unwinding of arbitrary tasks.

After crash gets initialized, "info threads" will output something like the following:

  crash> info threads
    Id Target Id Frame
    1 CPU 0 native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...
  * 8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
    ...
    13 CPU 12 <unavailable> in ?? ()
    14 CPU 13 native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...

  crash> ps
    PID PPID CPU TASK ST %MEM VSZ RSS COMM
  > 0 0 0 ffffffff819f9480 RU 0.0 0 0 [swapper/0]
  > 0 0 1 ffff880169411fa0 RU 0.0 0 0 [swapper/1]
    ...
    0 0 23 ffff8801694e0000 RU 0.0 0 0 [swapper/23]
    1 0 13 ffff880169b30000 IN 0.0 193052 4180 systemd

"info threads" shows the tasks that are currently running on each CPU. If we'd like to view the stack unwinding of the systemd task, which is inactive, we do the following:

  crash> set 1

or

  crash> set ffff880169b30000

Then the register cache of systemd will be swapped into CPU 13:

  crash> info threads
    Id Target Id Frame
    1 CPU 0 native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...
    8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
    ...
    13 CPU 12 <unavailable> in ?? ()
  * 14 CPU 13 0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527
    ...

And we can view the stack unwinding of systemd:

  crash> bt
  PID: 1 TASK: ffff880169b30000 CPU: 13 COMMAND: "systemd"
   #0 [ffff880169b3bd58] __schedule at ffffffff816a8f65
   #1 [ffff880169b3bdc0] schedule at ffffffff816a94e9
   #2 [ffff880169b3bdd0] schedule_hrtimeout_range_clock at ffffffff816a86fd
   #3 [ffff880169b3be68] schedule_hrtimeout_range at ffffffff816a8733
   #4 [ffff880169b3be78] ep_poll at ffffffff8124bb7e
   #5 [ffff880169b3bf30] sys_epoll_wait at ffffffff8124d00d
   #6 [ffff880169b3bf80] system_call_fastpath at ffffffff816b5009
      RIP: 00007f0449407923 RSP: 00007ffc35a3c378 RFLAGS: 00010246
      RAX: 00000000000000e8 RBX: ffffffff816b5009 RCX: 0000000000000071
      RDX: 000000000000001d RSI: 00007ffc35a3d5a0 RDI: 0000000000000004
      RBP: 00007ffc35a3d810 R8: 0000000000000000 R9: 0000000000000000
      R10: 00000000ffffffff R11: 0000000000000293 R12: 0000563ca2ebe980
      R13: 0000000000000003 R14: ffffffffffffffff R15: 0000000000000001
      ORIG_RAX: 00000000000000e8 CS: 0033 SS: 002b

  crash> gdb bt
  #0 0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527
  #1 __schedule () at kernel/sched/core.c:3540
  #2 0xffffffff816a94e9 in schedule () at kernel/sched/core.c:3577
  #3 0xffffffff816a86fd in schedule_hrtimeout_range_clock (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS, clock=clock@entry=1) at kernel/hrtimer.c:1724
  #4 0xffffffff816a8733 in schedule_hrtimeout_range (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS) at kernel/hrtimer.c:1778
  #5 0xffffffff8124bb7e in ep_poll (ep=0xffff880fd861f8c0, events=events@entry=0x7ffc35a3d5a0, maxevents=maxevents@entry=29, timeout=timeout@entry=-1) at fs/eventpoll.c:1669
  #6 0xffffffff8124d00d in SYSC_epoll_wait (timeout=<optimized out>, maxevents=29, events=<optimized out>, epfd=<optimized out>) at fs/eventpoll.c:2043
  #7 SyS_epoll_wait (epfd=<optimized out>, events=140721208415648, maxevents=29, timeout=4294967295) at fs/eventpoll.c:2008
  #8 <signal handler called>
  #9 0x00007f0449407923 in ?? ()

Signed-off-by: Tao Liu <ltao@redhat.com>
Stack unwinding is meaningful for kernel addresses only. A non-kernel address encountered during unwinding is usually a user-space address or a non-address value such as a function call parameter. Stopping the unwind at the first non-kernel address therefore reduces the number of invalid frames in the result.

Before:

  crash> gdb bt
  #0  0xffffffff816a8f65 in context_switch ...
  #1  __schedule () ...
  #2  0xffffffff816a94e9 in schedule ...
  #3  0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
  #4  0xffffffff816a8733 in schedule_hrtimeout_range ...
  #5  0xffffffff8124bb7e in ep_poll ...
  #6  0xffffffff8124d00d in SYSC_epoll_wait ...
  #7  SyS_epoll_wait ...
  #8  <signal handler called>
  #9  0x00007f0449407923 in ?? ()
  #10 0xffff880100000001 in ?? ()
  #11 0xffff880169b3c010 in ?? ()
  #12 0x0000000000000040 in irq_stack_union ()
  #13 0xffff880169b3c058 in ?? ()
  #14 0xffff880169b3c048 in ?? ()
  #15 0xffff880169b3c050 in ?? ()
  #16 0x0000000000000000 in ?? ()

After:

  crash> gdb bt
  #0  0xffffffff816a8f65 in context_switch ...
  #1  __schedule () ...
  #2  0xffffffff816a94e9 in schedule () ...
  #3  0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
  #4  0xffffffff816a8733 in schedule_hrtimeout_range ...
  #5  0xffffffff8124bb7e in ep_poll ...
  #6  0xffffffff8124d00d in SYSC_epoll_wait ...
  #7  SyS_epoll_wait ...
  #8  <signal handler called>
  #9  0x00007f0449407923 in ?? ()

Signed-off-by: Tao Liu <ltao@redhat.com>
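The stopping rule described above can be sketched in C as follows. This is only an illustration, not the code from the patch: `KERNEL_SPACE_START`, `is_kernel_address()` and `frames_to_report()` are hypothetical names, and the single-constant range check is an assumed simplification of whatever per-architecture test crash really uses. The sketch keeps the first non-kernel frame and drops everything after it, which is what the "After" listing suggests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed lower bound of the x86_64 kernel half of the address space.
 * Hypothetical: the real check is architecture-specific. */
#define KERNEL_SPACE_START 0xffff800000000000ULL

/* Hypothetical helper: true if addr lies in the kernel address range. */
static bool is_kernel_address(uint64_t addr)
{
    return addr >= KERNEL_SPACE_START;
}

/* Given the program counters of the unwound frames, innermost first,
 * return how many frames to report: everything up to and including the
 * first frame whose pc is not a kernel address. */
static size_t frames_to_report(const uint64_t *pcs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!is_kernel_address(pcs[i]))
            return i + 1;   /* keep this frame, drop the rest */
    }
    return n;               /* every frame was a kernel address */
}
```

Fed the program counters from the "Before" listing, the function cuts the trace at the user-space address 0x00007f0449407923, so the trailing frames that only look like addresses are never printed.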
Live debugging can be enabled for ppc64. For inactive tasks, the stack unwinding results can be shown:

  crash> sys
  KERNEL: /usr/lib/debug/lib/modules/5.14.0-425.el9.ppc64le/vmlinux
  DUMPFILE: /proc/kcore
  ...
  crash> set 1
  crash> bt
  PID: 1  TASK: c0000000035fc900  CPU: 1  COMMAND: "systemd"
   #0 [c00000000369fa60] __schedule at c000000000fc3c58
   #1 [c00000000369fb20] schedule at c000000000fc411c
   #2 [c00000000369fb50] schedule_hrtimeout_range_clock at c000000000fcd2a4
   #3 [c00000000369fc00] ep_poll at c00000000063640c
   #4 [c00000000369fcf0] do_epoll_wait at c000000000636584
   #5 [c00000000369fd40] sys_epoll_wait at c000000000636608
   #6 [c00000000369fdb0] system_call_exception at c00000000002e994
   #7 [c00000000369fe10] system_call_vectored_common at c00000000000bfe8
  crash> gdb bt
  #0  0xc000000000fc3c58 in context_switch ...
  #1  __schedule ...
  #2  0xc000000000fc411c in schedule_loop ...
  #3  schedule ...
  #4  0xc000000000fcd2a4 in schedule_hrtimeout_range_clock ...
  #5  0xc00000000063640c in ep_poll ...
  #6  0xc000000000636584 in do_epoll_wait ...
  #7  0xc000000000636608 in __do_sys_epoll_wait ...
  #8  __se_sys_epoll_wait ...
  #9  0xc00000000002e994 in system_call_exception ...
  #10 0xc00000000000bfe8 in system_call_vectored_common ...

However, for active tasks in live mode, stack unwinding fails. The behaviour is similar for "bt" and "gdb bt":

  crash> ps
     PID  PPID  CPU        TASK        ST  %MEM  VSZ  RSS  COMM
       0     0    0  c000000002af6380  RU   0.0    0    0  [swapper/0]
  >    0     0    1  c0000000035f9000  RU   0.0    0    0  [swapper/1]
  >    0     0    2  c0000000035f0180  RU   0.0    0    0  [swapper/2]
  ...
  crash> set c0000000035f0180
  crash> bt
  PID: 0  TASK: c0000000035f0180  CPU: 2  COMMAND: "swapper/2" (active)
  crash> gdb bt
  #0  0xc000000003847d50 in ?? ()
  #1  0x0000000000000000 in ?? ()

Signed-off-by: Tao Liu <ltao@redhat.com>
Currently, gdb passthroughs of 'bt', 'frame', 'up', 'down', 'info locals' don't work. This is because gdb does not know the register values it needs to unwind the stack frames.

Every gdb passthrough goes through `gdb_interface`. gdb then expects `crash_target::fetch_registers` to give it the register values, which in turn depends on `machdep->get_cpu_reg` to read the register values for the specific architecture.

                                  ----------------------------
   gdb passthrough (eg. "bt")     |                          |
  crash ------------------------->|                          |
                                  |      gdb_interface       |
                                  |                          |
                                  |  ----------------------  |
         fetch_registers          |  |                    |  |
  crash_target <------------------+--|        gdb         |  |
               -------------------+->|                    |  |
       Registers (SP, NIP, etc.)  |  |                    |  |
                                  |  ----------------------  |
                                  ----------------------------

Implement `machdep->get_cpu_reg` on PPC64, so that crash provides the register values to gdb to unwind stack frames properly.

With these changes, on powerpc, the 'bt' command output in gdb mode will look like this:

  gdb> bt
  #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc00000000486f8d8) at ./arch/powerpc/include/asm/kexec.h:69
  #1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
  #2  0xc000000000168918 in panic (fmt=<optimized out>) at kernel/panic.c:358
  #3  0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
  #4  0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602
  #5  0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
  #6  0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc000000009ed3a80) at fs/proc/inode.c:340
  #7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
  #8  0xc0000000005b3bbc in vfs_write (file=file@entry=0xc00000009dda7d00, buf=buf@entry=0xebcfc7c6040 <error: Cannot access memory at address 0xebcfc7c6040>, count=count@entry=2, pos=pos@entry=0xc00000000486fda0) at fs/read_write.c:582

instead of the earlier output without this patch:

  gdb> bt
  #0  <unavailable> in ?? ()
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Also, 'get_dumpfile_regs' has been introduced to get registers from the multiple supported vmcore formats. Correspondingly, a 'BT_NO_PRINT_REGS' flag has been introduced to tell the helper functions that fetch registers not to print them on every backtrace call in gdb.

Note: This feature to support GDB unwinding doesn't support live debugging.

Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@nec.com>
Improved-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
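The register flow described above (gdb asks for a register, and the request ends up in a per-architecture callback) might look roughly like the following C sketch. Everything here is an assumption made for illustration: `struct machdep_sketch`, the callback's argument list, the register numbers, the `sketch_regs` table and `fetch_one_register()` are hypothetical stand-ins and are not taken from crash or gdb. Only the NIP value is borrowed from frame #0 of the listing above; the SP value is made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register numbers used only by this sketch. */
enum { SKETCH_REG_SP = 1, SKETCH_REG_NIP = 64 };

/* Simplified stand-in for a machine-dependent ops table. The callback
 * returns 1 and fills *value on success, 0 if the register is unknown
 * or unavailable. The signature is an assumption. */
struct machdep_sketch {
    int (*get_cpu_reg)(int cpu, int regno, const char *name,
                       int size, void *value);
};

/* Registers as they might have been saved in a dump, one slot per CPU. */
struct saved_regs_sketch {
    int valid;
    uint64_t sp;
    uint64_t nip;
};

static const struct saved_regs_sketch sketch_regs[2] = {
    { 1, 0xc00000000486f000ULL /* made-up SP */, 0xc0000000002a53e8ULL },
    { 0, 0, 0 },               /* CPU 1: nothing saved */
};

/* ppc64-flavoured callback: copies the requested register out. */
static int ppc64_get_cpu_reg_sketch(int cpu, int regno, const char *name,
                                    int size, void *value)
{
    (void)name;                /* lookup is by number in this sketch */
    if (cpu < 0 || cpu >= 2 || !sketch_regs[cpu].valid)
        return 0;
    if (size != (int)sizeof(uint64_t))
        return 0;
    switch (regno) {
    case SKETCH_REG_SP:
        memcpy(value, &sketch_regs[cpu].sp, sizeof(uint64_t));
        return 1;
    case SKETCH_REG_NIP:
        memcpy(value, &sketch_regs[cpu].nip, sizeof(uint64_t));
        return 1;
    default:
        return 0;
    }
}

static const struct machdep_sketch ppc64_machdep_sketch = {
    ppc64_get_cpu_reg_sketch
};

/* What a fetch_registers-style caller would do for a single register:
 * delegate to the architecture callback if one is installed. */
static int fetch_one_register(const struct machdep_sketch *m, int cpu,
                              int regno, uint64_t *out)
{
    if (m == NULL || m->get_cpu_reg == NULL)
        return 0;              /* no callback: register stays unavailable */
    return m->get_cpu_reg(cpu, regno, NULL, (int)sizeof(*out), out);
}
```

When no callback is installed the caller reports the register as unavailable, which corresponds to the `<unavailable> in ?? ()` frame gdb printed before the patch.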
Previously, stack unwinding could be viewed only for the tasks running on each CPU. This patch makes it possible to view the stack unwinding of arbitrary tasks.

After crash is initialized, "info threads" prints output like the following:

  crash> info threads
    Id   Target Id   Frame
    1    CPU 0       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...
  * 8    CPU 7       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
    ...
    13   CPU 12      <unavailable> in ?? ()
    14   CPU 13      native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...

  crash> ps
     PID  PPID  CPU        TASK        ST  %MEM     VSZ   RSS  COMM
  >    0     0    0  ffffffff819f9480  RU   0.0       0     0  [swapper/0]
  >    0     0    1  ffff880169411fa0  RU   0.0       0     0  [swapper/1]
  ...
       0     0   23  ffff8801694e0000  RU   0.0       0     0  [swapper/23]
       1     0   13  ffff880169b30000  IN   0.0  193052  4180  systemd

"info threads" shows the tasks that are currently running on each CPU. To view the stack unwinding of the systemd task, which is inactive, do either of the following:

  crash> set 1

or

  crash> set ffff880169b30000

Then the register cache of systemd will be swapped into CPU 13:

  crash> info threads
    Id   Target Id   Frame
    1    CPU 0       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...
    8    CPU 7       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
    ...
    13   CPU 12      <unavailable> in ?? ()
  * 14   CPU 13      0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527
    ...

And we can view the stack unwinding of systemd:

  crash> bt
  PID: 1  TASK: ffff880169b30000  CPU: 13  COMMAND: "systemd"
   #0 [ffff880169b3bd58] __schedule at ffffffff816a8f65
   #1 [ffff880169b3bdc0] schedule at ffffffff816a94e9
   #2 [ffff880169b3bdd0] schedule_hrtimeout_range_clock at ffffffff816a86fd
   #3 [ffff880169b3be68] schedule_hrtimeout_range at ffffffff816a8733
   #4 [ffff880169b3be78] ep_poll at ffffffff8124bb7e
   #5 [ffff880169b3bf30] sys_epoll_wait at ffffffff8124d00d
   #6 [ffff880169b3bf80] system_call_fastpath at ffffffff816b5009
      RIP: 00007f0449407923  RSP: 00007ffc35a3c378  RFLAGS: 00010246
      RAX: 00000000000000e8  RBX: ffffffff816b5009  RCX: 0000000000000071
      RDX: 000000000000001d  RSI: 00007ffc35a3d5a0  RDI: 0000000000000004
      RBP: 00007ffc35a3d810   R8: 0000000000000000   R9: 0000000000000000
      R10: 00000000ffffffff  R11: 0000000000000293  R12: 0000563ca2ebe980
      R13: 0000000000000003  R14: ffffffffffffffff  R15: 0000000000000001
      ORIG_RAX: 00000000000000e8  CS: 0033  SS: 002b

  crash> gdb bt
  #0  0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527
  #1  __schedule () at kernel/sched/core.c:3540
  #2  0xffffffff816a94e9 in schedule () at kernel/sched/core.c:3577
  #3  0xffffffff816a86fd in schedule_hrtimeout_range_clock (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS, clock=clock@entry=1) at kernel/hrtimer.c:1724
  #4  0xffffffff816a8733 in schedule_hrtimeout_range (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS) at kernel/hrtimer.c:1778
  #5  0xffffffff8124bb7e in ep_poll (ep=0xffff880fd861f8c0, events=events@entry=0x7ffc35a3d5a0, maxevents=maxevents@entry=29, timeout=timeout@entry=-1) at fs/eventpoll.c:1669
  #6  0xffffffff8124d00d in SYSC_epoll_wait (timeout=<optimized out>, maxevents=29, events=<optimized out>, epfd=<optimized out>) at fs/eventpoll.c:2043
  #7  SyS_epoll_wait (epfd=<optimized out>, events=140721208415648, maxevents=29, timeout=4294967295) at fs/eventpoll.c:2008
  #8  <signal handler called>
  #9  0x00007f0449407923 in ?? ()

Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
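One way to picture the "swap" described above is a per-CPU slot of registers that gdb sees as a thread, with the selected task's saved context copied into the slot of the CPU it last ran on. The C sketch below is a guess at the idea under that assumption, not the patch's implementation: `struct reg_view_sketch`, `struct task_sketch`, `cpu_slot[]` and `select_task_sketch()` are hypothetical names, and the stack pointer value is simply the frame #0 address from the `bt` listing, used here for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 24

/* The registers gdb would see for one "thread" (one per CPU). */
struct reg_view_sketch {
    uint64_t ip;
    uint64_t sp;
};

/* Minimal task description; saved holds the context recorded when the
 * task was last switched out. */
struct task_sketch {
    int pid;
    int cpu;        /* CPU the task last ran on */
    int active;     /* non-zero if currently running on that CPU */
    struct reg_view_sketch saved;
};

/* One register slot per CPU, initially zeroed. */
static struct reg_view_sketch cpu_slot[SKETCH_NR_CPUS];

/* Select a task for unwinding. For an inactive task, its saved context
 * replaces the register view of the CPU it belongs to; an active task
 * already owns that slot. Returns the CPU index, or -1 on bad input. */
static int select_task_sketch(const struct task_sketch *t)
{
    if (t == NULL || t->cpu < 0 || t->cpu >= SKETCH_NR_CPUS)
        return -1;
    if (!t->active)
        cpu_slot[t->cpu] = t->saved;
    return t->cpu;
}
```

With systemd described as an inactive task on CPU 13, selecting it makes slot 13 report `context_switch` at 0xffffffff816a8f65, which matches the starred thread 14 in the second "info threads" listing.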
…ame" warning

The "bogus exception frame" warning was observed again on a specific vmcore on an x86_64 machine, and the remaining frames were truncated, when executing the "bt" command as below:

  crash> bt 0 -c 8
  PID: 0  TASK: ffff9948c08f5640  CPU: 8  COMMAND: "swapper/8"
   #0 [fffffe1788788e58] crash_nmi_callback at ffffffff972672bb
   #1 [fffffe1788788e68] nmi_handle at ffffffff9722eb8e
   #2 [fffffe1788788eb0] default_do_nmi at ffffffff97e51cd0
   #3 [fffffe1788788ed0] exc_nmi at ffffffff97e51ee1
   #4 [fffffe1788788ef0] end_repeat_nmi at ffffffff980015f9
      [exception RIP: __update_load_avg_se+13]
      RIP: ffffffff9736b16d  RSP: ffffbec3c08acc78  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff994c2f2b1a40  RCX: ffffbec3c08acdc0
      RDX: ffff9948e4fe1d80  RSI: ffff994c2f2b1a40  RDI: 0000001d7ad7d55d
      RBP: ffffbec3c08acc88   R8: 0000001d921fca6f   R9: ffff994c2f2b1328
      R10: 00000000fffd0010  R11: ffffffff98e060c0  R12: 0000001d7ad7d55d
      R13: 0000000000000005  R14: ffff994c2f2b19c0  R15: 0000000000000001
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  --- <NMI exception stack> ---
   #5 [ffffbec3c08acc78] __update_load_avg_se at ffffffff9736b16d
   #6 [ffffbec3c08acce0] enqueue_entity at ffffffff9735c9ab
   #7 [ffffbec3c08acd28] enqueue_task_fair at ffffffff9735cef8
  ...
  #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
  #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
  #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
  --- <IRQ stack> ---
  #21 [ffffbec3c022ff18] do_idle at ffffffff97368288
      [exception RIP: unknown or invalid address]
      RIP: 0000000000000000  RSP: 0000000000000000  RFLAGS: 00000000
      RAX: 0000000000000000  RBX: 000000089726a2d0  RCX: 0000000000000000
      RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000000
      RBP: ffffffff9726a3dd   R8: 0000000000000000   R9: 0000000000000000
      R10: ffffffff9720015a  R11: e48885e126bc1600  R12: 0000000000000000
      R13: ffffffff973684a9  R14: 0000000000000094  R15: 0000000040000000
      ORIG_RAX: 0000000000000000  CS: 0000  SS: 0000
  bt: WARNING: possibly bogus exception frame
  crash>

Actually, there is no exception frame when called from do_softirq().

With the patch:

  crash> bt 0 -c 8
  ...
  #18 [ffffbec3c08acf90] blk_complete_reqs at ffffffff977978d0
  #19 [ffffbec3c08acfa0] __do_softirq at ffffffff97e66f7a
  #20 [ffffbec3c08acff0] do_softirq at ffffffff9730f6ef
  --- <IRQ stack> ---
  #21 [ffffbec3c022ff28] cpu_startup_entry at ffffffff973684a9
  #22 [ffffbec3c022ff38] start_secondary at ffffffff9726a3dd
  #23 [ffffbec3c022ff50] secondary_startup_64_no_verify at ffffffff9720015a
  crash>

Reported-by: Jie Li <jieli@redhat.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Previously we could only view the stack unwinding of the tasks that are running on each CPU. This patch makes it possible to view the stack unwinding of arbitrary tasks.

After crash is initialized, "info threads" outputs something like the following:

  crash> info threads
    Id   Target Id   Frame
    1    CPU 0       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...
  * 8    CPU 7       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
    ...
    13   CPU 12      <unavailable> in ?? ()
    14   CPU 13      native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...

  crash> ps
     PID  PPID  CPU  TASK              ST  %MEM  VSZ     RSS   COMM
  >  0    0     0    ffffffff819f9480  RU  0.0   0       0     [swapper/0]
  >  0    0     1    ffff880169411fa0  RU  0.0   0       0     [swapper/1]
     ...
     0    0     23   ffff8801694e0000  RU  0.0   0       0     [swapper/23]
     1    0     13   ffff880169b30000  IN  0.0   193052  4180  systemd

"info threads" shows the tasks that are currently running on each CPU. To view the stack unwinding of the systemd task, which is inactive, do the following:

  crash> set 1
or
  crash> set ffff880169b30000

The register cache of systemd is then swapped into CPU 13:

  crash> info threads
    Id   Target Id   Frame
    1    CPU 0       native_safe_halt () at arch/x86/include/asm/irqflags.h:54
    ...
    8    CPU 7       blk_mq_rq_timed_out (req=0xffff880fdb246000, reserved=reserved@entry=false) at block/blk-mq.c:640
    ...
    13   CPU 12      <unavailable> in ?? ()
  * 14   CPU 13      0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527
    ...

And we can view the stack unwinding of systemd:

  crash> bt
  PID: 1 TASK: ffff880169b30000 CPU: 13 COMMAND: "systemd"
   #0 [ffff880169b3bd58] __schedule at ffffffff816a8f65
   #1 [ffff880169b3bdc0] schedule at ffffffff816a94e9
   #2 [ffff880169b3bdd0] schedule_hrtimeout_range_clock at ffffffff816a86fd
   #3 [ffff880169b3be68] schedule_hrtimeout_range at ffffffff816a8733
   #4 [ffff880169b3be78] ep_poll at ffffffff8124bb7e
   #5 [ffff880169b3bf30] sys_epoll_wait at ffffffff8124d00d
   #6 [ffff880169b3bf80] system_call_fastpath at ffffffff816b5009
      RIP: 00007f0449407923  RSP: 00007ffc35a3c378  RFLAGS: 00010246
      RAX: 00000000000000e8  RBX: ffffffff816b5009  RCX: 0000000000000071
      RDX: 000000000000001d  RSI: 00007ffc35a3d5a0  RDI: 0000000000000004
      RBP: 00007ffc35a3d810   R8: 0000000000000000   R9: 0000000000000000
      R10: 00000000ffffffff  R11: 0000000000000293  R12: 0000563ca2ebe980
      R13: 0000000000000003  R14: ffffffffffffffff  R15: 0000000000000001
      ORIG_RAX: 00000000000000e8  CS: 0033  SS: 002b

  crash> gdb bt
  #0 0xffffffff816a8f65 in context_switch (rq=0x0, next=0x0, prev=0xffff880169b30000) at kernel/sched/core.c:2527
  #1 __schedule () at kernel/sched/core.c:3540
  #2 0xffffffff816a94e9 in schedule () at kernel/sched/core.c:3577
  #3 0xffffffff816a86fd in schedule_hrtimeout_range_clock (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS, clock=clock@entry=1) at kernel/hrtimer.c:1724
  #4 0xffffffff816a8733 in schedule_hrtimeout_range (expires=expires@entry=0x0, delta=delta@entry=0, mode=mode@entry=HRTIMER_MODE_ABS) at kernel/hrtimer.c:1778
  #5 0xffffffff8124bb7e in ep_poll (ep=0xffff880fd861f8c0, events=events@entry=0x7ffc35a3d5a0, maxevents=maxevents@entry=29, timeout=timeout@entry=-1) at fs/eventpoll.c:1669
  #6 0xffffffff8124d00d in SYSC_epoll_wait (timeout=<optimized out>, maxevents=29, events=<optimized out>, epfd=<optimized out>) at fs/eventpoll.c:2043
  #7 SyS_epoll_wait (epfd=<optimized out>, events=140721208415648, maxevents=29, timeout=4294967295) at fs/eventpoll.c:2008
  #8 <signal handler called>
  #9 0x00007f0449407923 in ?? ()

Signed-off-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Stack unwinding is for kernel addresses only. If a non-kernel address is encountered, it is usually a user-space address or a non-address value such as a function call parameter. Stopping the unwinding at a non-kernel address therefore reduces the number of invalid unwind results.

Before:

  crash> gdb bt
  #0 0xffffffff816a8f65 in context_switch ...
  #1 __schedule () ...
  #2 0xffffffff816a94e9 in schedule ...
  #3 0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
  #4 0xffffffff816a8733 in schedule_hrtimeout_range ...
  #5 0xffffffff8124bb7e in ep_poll ...
  #6 0xffffffff8124d00d in SYSC_epoll_wait ...
  #7 SyS_epoll_wait ...
  #8 <signal handler called>
  #9 0x00007f0449407923 in ?? ()
  #10 0xffff880100000001 in ?? ()
  #11 0xffff880169b3c010 in ?? ()
  #12 0x0000000000000040 in irq_stack_union ()
  #13 0xffff880169b3c058 in ?? ()
  #14 0xffff880169b3c048 in ?? ()
  #15 0xffff880169b3c050 in ?? ()
  #16 0x0000000000000000 in ?? ()

After:

  crash> gdb bt
  #0 0xffffffff816a8f65 in context_switch ...
  #1 __schedule () ...
  #2 0xffffffff816a94e9 in schedule () ...
  #3 0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
  #4 0xffffffff816a8733 in schedule_hrtimeout_range ...
  #5 0xffffffff8124bb7e in ep_poll ...
  #6 0xffffffff8124d00d in SYSC_epoll_wait ...
  #7 SyS_epoll_wait ...
  #8 <signal handler called>
  #9 0x00007f0449407923 in ?? ()

Signed-off-by: Tao Liu <ltao@redhat.com>
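The stop condition described above can be sketched as follows. This is only an illustration, not the patch itself: `KERNEL_START_SKETCH`, `is_kernel_addr()` and `frames_shown()` are hypothetical names, and the fixed lower bound is an assumption (the real tool derives kernel address ranges from the dump). In the "After" listing the user-space frame #9 is still printed and nothing follows it, so the sketch counts the first non-kernel address and then stops.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lower bound of the x86_64 kernel half of the address
 * space; an assumption for illustration only. */
#define KERNEL_START_SKETCH UINT64_C(0xffff800000000000)

/* Returns nonzero if addr looks like a kernel virtual address. */
static int is_kernel_addr(uint64_t addr)
{
    return addr >= KERNEL_START_SKETCH;
}

/* Given the candidate program counters an unwinder would walk,
 * return how many frames are shown: every kernel frame, plus the
 * first non-kernel one, after which unwinding stops. */
static size_t frames_shown(const uint64_t *pcs, size_t n)
{
    size_t shown = 0;

    while (shown < n) {
        uint64_t pc = pcs[shown++];
        if (!is_kernel_addr(pc))
            break;  /* user-space address or plain data: stop here */
    }
    return shown;
}
```

With the addresses from the "Before" listing, anything after `0x00007f0449407923` would no longer be walked, which is what removes frames #10 to #16.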
Currently, gdb passthroughs of 'bt', 'frame', 'up', 'down', 'info locals' don't work. This is because gdb does not know the register values needed to unwind the stack frames.

Every gdb passthrough goes through `gdb_interface`. gdb then expects `crash_target::fetch_registers` to give it the register values, which in turn depends on `machdep->get_cpu_reg` to read the register values for the specific architecture.

[Diagram: crash sends the gdb passthrough (e.g. "bt") through gdb_interface to gdb; gdb calls fetch_registers on crash_target, and crash_target returns the registers (SP, NIP, etc.) to gdb.]

Implement `machdep->get_cpu_reg` on PPC64, so that crash provides the register values gdb needs to unwind stack frames properly.

With these changes, on powerpc, the 'bt' command output in gdb mode looks like this:

  gdb> bt
  #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc00000000486f8d8) at ./arch/powerpc/include/asm/kexec.h:69
  #1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
  #2 0xc000000000168918 in panic (fmt=<optimized out>) at kernel/panic.c:358
  #3 0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
  #4 0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602
  #5 0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
  #6 0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc000000009ed3a80) at fs/proc/inode.c:340
  #7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
  #8 0xc0000000005b3bbc in vfs_write (file=file@entry=0xc00000009dda7d00, buf=buf@entry=0xebcfc7c6040 <error: Cannot access memory at address 0xebcfc7c6040>, count=count@entry=2, pos=pos@entry=0xc00000000486fda0) at fs/read_write.c:582

instead of the earlier output without this patch:

  gdb> bt
  #0 <unavailable> in ?? ()
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Also, 'get_dumpfile_regs' has been introduced to get registers from the multiple supported vmcore formats. Correspondingly, a flag 'BT_NO_PRINT_REGS' has been introduced to tell the helper functions that fetch the registers not to print them on every backtrace call made from gdb.

Note: This GDB unwinding support does not work with live debugging.

Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@nec.com>
Improved-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
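For readers unfamiliar with the callback idea, here is a rough sketch of what a per-architecture register lookup could look like. Everything in it is hypothetical: `struct saved_regs`, `REG_NIP_SKETCH`, the register numbering and the signature of `read_reg_sketch()` are assumptions made for illustration and are not the real `machdep->get_cpu_reg` prototype or gdb's register layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register number for the instruction pointer (NIP);
 * general purpose registers are assumed to be numbered 0..31. */
#define REG_NIP_SKETCH 64

/* Simplified stand-in for the registers saved in a dump. */
struct saved_regs {
    uint64_t gpr[32];   /* r0..r31; r1 is the stack pointer on PPC64 */
    uint64_t nip;       /* next instruction pointer */
};

/* Copies the requested register into value.
 * Returns 1 on success, 0 if the register is unknown or the
 * caller asked for an unexpected size. */
static int read_reg_sketch(const struct saved_regs *regs, int regno,
                           size_t size, void *value)
{
    uint64_t v;

    if (regs == NULL || value == NULL || size != sizeof(v))
        return 0;

    if (regno >= 0 && regno < 32)
        v = regs->gpr[regno];
    else if (regno == REG_NIP_SKETCH)
        v = regs->nip;
    else
        return 0;       /* register not available from the dump */

    memcpy(value, &v, sizeof(v));
    return 1;
}
```

The point of the design is that gdb only asks for a register by number; where the value comes from (here a saved register set) stays on the crash side.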
The following segmentation fault occurred during session initialization:

  $ crash vmlinx vmcore
  ...
  please wait... (determining panic task)Segmentation fault

Here is the backtrace of the crash utility:

  (gdb) bt
  #0 value_search_module_6_4 (value=18446603338276298752, offset=0x7ffffffface0) at symbols.c:5564
  #1 0x0000555555812bd0 in value_to_symstr (value=18446603338276298752, buf=buf@entry=0x7fffffffb9c0 "", radix=10, radix@entry=0) at symbols.c:5872
  #2 0x00005555557694a2 in display_memory (addr=<optimized out>, count=2048, flag=208, memtype=memtype@entry=1, opt=opt@entry=0x0) at memory.c:1740
  #3 0x0000555555769e1f in raw_stack_dump (stackbase=<optimized out>, size=<optimized out>) at memory.c:2194
  #4 0x00005555557923ff in get_active_set_panic_task () at task.c:8639
  #5 0x00005555557930d2 in get_dumpfile_panic_task () at task.c:7628
  #6 0x00005555557a89d3 in panic_search () at task.c:7380
  #7 get_panic_context () at task.c:6267
  #8 task_init () at task.c:687
  #9 0x00005555557305b3 in main_loop () at main.c:787
  ...

This is due to a missing existence check on the module symbol table. Not every mod_mem_type exists for a module, e.g. in the following module:

  (gdb) p lm->symtable[0]
  $1 = (struct syment *) 0x4dcbad0
  (gdb) p lm->symtable[1]
  $2 = (struct syment *) 0x4dcbb70
  (gdb) p lm->symtable[2]
  $3 = (struct syment *) 0x4dcbc10
  (gdb) p lm->symtable[3]
  $4 = (struct syment *) 0x0
  (gdb) p lm->symtable[4]
  $5 = (struct syment *) 0x4dcbcb0
  (gdb) p lm->symtable[5]
  $6 = (struct syment *) 0x4dcbd00
  (gdb) p lm->symtable[6]
  $7 = (struct syment *) 0x0

MOD_RO_AFTER_INIT(3) and MOD_INIT_RODATA(6) do not exist here and should be skipped; otherwise the segmentation fault occurs.

Fixes: 7750e61 ("Support module memory layout change on Linux 6.4")
Closes: #176
Reported-by: Naveen Chaudhary <naveenchaudhary2010@hotmail.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
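The kind of existence check described above might look like the sketch below. The structures are simplified stand-ins, and `search_module_tables()` is a hypothetical helper, not the body of `value_search_module_6_4()`; only the enum order (with MOD_RO_AFTER_INIT at 3 and MOD_INIT_RODATA at 6) is taken from the message.

```c
#include <stddef.h>

/* Module memory types, in the order implied by the gdb output above. */
enum mod_mem_type {
    MOD_TEXT,
    MOD_DATA,
    MOD_RODATA,
    MOD_RO_AFTER_INIT,
    MOD_INIT_TEXT,
    MOD_INIT_DATA,
    MOD_INIT_RODATA,
    MOD_MEM_NUM_TYPES
};

/* Simplified stand-ins for the real structures (assumption). */
struct syment {
    unsigned long value;
    const char *name;
};

struct load_module {
    struct syment *symtable[MOD_MEM_NUM_TYPES];
};

/* Returns the table-start symbol with the highest address that is
 * still <= value, skipping memory types the module does not have. */
static const struct syment *
search_module_tables(const struct load_module *lm, unsigned long value)
{
    const struct syment *best = NULL;
    int t;

    for (t = MOD_TEXT; t < MOD_MEM_NUM_TYPES; t++) {
        const struct syment *sp = lm->symtable[t];

        if (sp == NULL)
            continue;   /* e.g. MOD_RO_AFTER_INIT: nothing to search */

        if (sp->value <= value && (best == NULL || sp->value > best->value))
            best = sp;
    }
    return best;
}
```

Without the `sp == NULL` test, the loop would dereference the empty slots 3 and 6 shown in the gdb session.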
Stack unwinding is for kernel addresses only. If a non-kernel address is encountered, it is usually a user-space address or a non-address value such as a function call parameter. Stopping the unwinding at a non-kernel address therefore reduces the number of invalid unwind results.

Before:

  crash> gdb bt
  #0 0xffffffff816a8f65 in context_switch ...
  #1 __schedule () ...
  #2 0xffffffff816a94e9 in schedule ...
  #3 0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
  #4 0xffffffff816a8733 in schedule_hrtimeout_range ...
  #5 0xffffffff8124bb7e in ep_poll ...
  #6 0xffffffff8124d00d in SYSC_epoll_wait ...
  #7 SyS_epoll_wait ...
  #8 <signal handler called>
  #9 0x00007f0449407923 in ?? ()
  #10 0xffff880100000001 in ?? ()
  #11 0xffff880169b3c010 in ?? ()
  #12 0x0000000000000040 in irq_stack_union ()
  #13 0xffff880169b3c058 in ?? ()
  #14 0xffff880169b3c048 in ?? ()
  #15 0xffff880169b3c050 in ?? ()
  #16 0x0000000000000000 in ?? ()

After:

  crash> gdb bt
  #0 0xffffffff816a8f65 in context_switch ...
  #1 __schedule () ...
  #2 0xffffffff816a94e9 in schedule () ...
  #3 0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
  #4 0xffffffff816a8733 in schedule_hrtimeout_range ...
  #5 0xffffffff8124bb7e in ep_poll ...
  #6 0xffffffff8124d00d in SYSC_epoll_wait ...
  #7 SyS_epoll_wait ...
  #8 <signal handler called>
  #9 0x00007f0449407923 in ?? ()

Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@nec.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
Currently, gdb passthroughs of 'bt', 'frame', 'up', 'down', 'info locals' don't work. This is because gdb does not know the register values needed to unwind the stack frames.

Every gdb passthrough goes through `gdb_interface`. gdb then expects `crash_target::fetch_registers` to give it the register values, which in turn depends on `machdep->get_cpu_reg` to read the register values for the specific architecture.

[Diagram: crash sends the gdb passthrough (e.g. "bt") through gdb_interface to gdb; gdb calls fetch_registers on crash_target, and crash_target returns the registers (SP, NIP, etc.) to gdb.]

Implement `machdep->get_cpu_reg` on PPC64, so that crash provides the register values gdb needs to unwind stack frames properly.

With these changes, on powerpc, the 'bt' command output in gdb mode looks like this:

  gdb> bt
  #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc00000000486f8d8) at ./arch/powerpc/include/asm/kexec.h:69
  #1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
  #2 0xc000000000168918 in panic (fmt=<optimized out>) at kernel/panic.c:358
  #3 0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
  #4 0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602
  #5 0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
  #6 0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc000000009ed3a80) at fs/proc/inode.c:340
  #7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
  #8 0xc0000000005b3bbc in vfs_write (file=file@entry=0xc00000009dda7d00, buf=buf@entry=0xebcfc7c6040 <error: Cannot access memory at address 0xebcfc7c6040>, count=count@entry=2, pos=pos@entry=0xc00000000486fda0) at fs/read_write.c:582

instead of the earlier output without this patch:

  gdb> bt
  #0 <unavailable> in ?? ()
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Also, 'get_dumpfile_regs' has been introduced to get registers from the multiple supported vmcore formats. Correspondingly, a flag 'BT_NO_PRINT_REGS' has been introduced to tell the helper functions that fetch the registers not to print them on every backtrace call made from gdb.

Note: This GDB unwinding support does not work with live debugging.

Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@nec.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Improved-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
With kernel commit 65c9cc9e2c14 ("x86/fred: Reserve space for the FRED stack frame") in Linux 6.9-rc1 and later, x86_64 adds extra padding ('TOP_OF_KERNEL_STACK_PADDING (2 * 8)', see arch/x86/include/asm/thread_info.h) to the kernel stack when CONFIG_X86_FRED is enabled.

As a result, pt_regs is moved downwards by the size of the padding, and the register values read from pt_regs are incorrect, as shown below.

Without the patch:

  crash> bt
  PID: 2040 TASK: ffff969136fc4180 CPU: 16 COMMAND: "bash"
   #0 [ffffa996409aba38] machine_kexec at ffffffff9f881eb7
   #1 [ffffa996409aba90] __crash_kexec at ffffffff9fa1e49e
   #2 [ffffa996409abb48] panic at ffffffff9f91a6cd
   #3 [ffffa996409abbc8] sysrq_handle_crash at ffffffffa0015076
   #4 [ffffa996409abbd0] __handle_sysrq at ffffffffa0015640
   #5 [ffffa996409abc00] write_sysrq_trigger at ffffffffa0015ce5
   #6 [ffffa996409abc28] proc_reg_write at ffffffff9fd35bf5
   #7 [ffffa996409abc40] vfs_write at ffffffff9fc8d462
   #8 [ffffa996409abcd0] ksys_write at ffffffff9fc8dadf
   #9 [ffffa996409abd08] do_syscall_64 at ffffffffa0517429
  #10 [ffffa996409abf40] entry_SYSCALL_64_after_hwframe at ffffffffa060012b
      [exception RIP: unknown or invalid address]
      RIP: 0000000000000246  RSP: 0000000000000000  RFLAGS: 0000002b
      RAX: 0000000000000002  RBX: 00007f9b9f5b13e0  RCX: 000055cee7486fb0
      RDX: 0000000000000001  RSI: 0000000000000001  RDI: 00007f9b9f4fda57
      RBP: 0000000000000246   R8: 00007f9b9f4fda57   R9: ffffffffffffffda
      R10: 0000000000000000  R11: 00007f9b9f5b14e0  R12: 0000000000000002
      R13: 000055cee7486fb0  R14: 0000000000000002  R15: 00007f9b9f5fb780
      ORIG_RAX: 0000000000000033  CS: 7ffe65327978  SS: 0000
  bt: WARNING: possibly bogus exception frame
  crash>

With the patch:

  crash> bt
  PID: 2040 TASK: ffff969136fc4180 CPU: 16 COMMAND: "bash"
   #0 [ffffa996409aba38] machine_kexec at ffffffff9f881eb7
   #1 [ffffa996409aba90] __crash_kexec at ffffffff9fa1e49e
   #2 [ffffa996409abb48] panic at ffffffff9f91a6cd
   #3 [ffffa996409abbc8] sysrq_handle_crash at ffffffffa0015076
   #4 [ffffa996409abbd0] __handle_sysrq at ffffffffa0015640
   #5 [ffffa996409abc00] write_sysrq_trigger at ffffffffa0015ce5
   #6 [ffffa996409abc28] proc_reg_write at ffffffff9fd35bf5
   #7 [ffffa996409abc40] vfs_write at ffffffff9fc8d462
   #8 [ffffa996409abcd0] ksys_write at ffffffff9fc8dadf
   #9 [ffffa996409abd08] do_syscall_64 at ffffffffa0517429
  #10 [ffffa996409abf40] entry_SYSCALL_64_after_hwframe at ffffffffa060012b
      RIP: 00007f9b9f4fda57  RSP: 00007ffe65327978  RFLAGS: 00000246
      RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007f9b9f4fda57
      RDX: 0000000000000002  RSI: 000055cee7486fb0  RDI: 0000000000000001
      RBP: 000055cee7486fb0   R8: 0000000000000000   R9: 00007f9b9f5b14e0
      R10: 00007f9b9f5b13e0  R11: 0000000000000246  R12: 0000000000000002
      R13: 00007f9b9f5fb780  R14: 0000000000000002  R15: 00007f9b9f5f69e0
      ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
  crash>

Link: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00754.html
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
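The address arithmetic behind this can be sketched as below. The function name `user_pt_regs_addr()` and its parameters are hypothetical, and treating the user-mode pt_regs as sitting directly under the (optionally padded) top of the kernel stack is an assumption for illustration; only the padding value 2 * 8 comes from the message above.

```c
#include <stdint.h>

/* Padding reserved at the top of the kernel stack when
 * CONFIG_X86_FRED is enabled, as quoted in the commit message. */
#define FRED_STACK_PADDING_SKETCH (2 * 8)

/* Returns the address where the saved user-mode pt_regs is assumed
 * to start: just below the top of the stack, after skipping the
 * FRED padding when it is present. */
static uint64_t user_pt_regs_addr(uint64_t stack_top,
                                  uint64_t pt_regs_size,
                                  int fred_enabled)
{
    uint64_t padding = fred_enabled ? FRED_STACK_PADDING_SKETCH : 0;

    return stack_top - padding - pt_regs_size;
}
```

Ignoring the padding reads every register two 8-byte slots too high, which is consistent with the "Without the patch" output: the RIP field shows 0x246 (the real RFLAGS) and the CS field shows 00007ffe65327978 (the real RSP).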
Commit 48764a1 may cause a regression when CONFIG_X86_FRED is not enabled. This is because SIZE(fred_frame) calls SIZE_verify() to determine whether fred_frame is valid; if it is not, an error is emitted:

  crash> bt 1
  bt: invalid structure size: fred_frame
      FILE: x86_64.c  LINE: 4089  FUNCTION: x86_64_low_budget_back_trace_cmd()

  [/home/k-hagio/bin/crash] error trace: 588df3 => 5cbc72 => 5eb3e1 => 5eb366
  PID: 1 TASK: ffff9f94c024b980 CPU: 2 COMMAND: "systemd"
   #0 [ffffade44001bca8] __schedule at ffffffffb948ebbb
   #1 [ffffade44001bd10] schedule at ffffffffb948f04d
   #2 [ffffade44001bd20] schedule_hrtimeout_range_clock at ffffffffb9494fef
   #3 [ffffade44001bda8] ep_poll at ffffffffb8c91be8
   #4 [ffffade44001be48] do_epoll_wait at ffffffffb8c91d11
   #5 [ffffade44001be80] __x64_sys_epoll_wait at ffffffffb8c92590
   #6 [ffffade44001bed0] do_syscall_64 at ffffffffb947f459
   #7 [ffffade44001bf50] entry_SYSCALL_64_after_hwframe at ffffffffb96000ea

    5eb366: SIZE_verify.part.42+70
    5eb3e1: SIZE_verify+49
    5cbc72: x86_64_low_budget_back_trace_cmd+3010
    588df3: back_trace+1523

  bt: invalid structure size: fred_frame
      FILE: x86_64.c  LINE: 4089  FUNCTION: x86_64_low_budget_back_trace_cmd()

Let's replace SIZE(fred_frame) with VALID_SIZE(fred_frame) to fix it.

Fixes: 48764a1 ("x86_64: fix for adding top_of_kernel_stack_padding for kernel stack")
Reported-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
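The difference between asking for a size and asking whether a size is known can be modelled roughly as follows. This is a stand-alone model, not crash's real macros: `struct size_table_sketch`, `valid_size()` and `stack_top_padding()` are hypothetical, and using -1 for "structure not present in this kernel" is an assumption made for the example.

```c
/* Hypothetical marker for a structure that does not exist in the
 * kernel being analysed (assumption for this sketch). */
#define INVALID_SIZE_SKETCH (-1L)

/* Simplified table of structure sizes collected at startup. */
struct size_table_sketch {
    long pt_regs;
    long fred_frame;
};

/* Validity test only: never reports an error, just says whether
 * the size was found. */
static int valid_size(long size)
{
    return size >= 0;
}

/* Returns the padding to skip at the top of the kernel stack.
 * The structure size is only used as an existence check, so a
 * kernel without fred_frame simply gets zero padding. */
static long stack_top_padding(const struct size_table_sketch *st)
{
    return valid_size(st->fred_frame) ? 2 * 8 : 0;
}
```

A lookup that insists on a valid size would fail on every kernel built without CONFIG_X86_FRED, which is the regression reported above; the validity test degrades to the old behaviour instead.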
If we use crash to parse a ramdump (Qcom phone device) rather than a vmcore, the start command should look like:

  crash vmlinux --kaslr=xxx DDRCS0_0.BIN@0x0000000080000000,... --machdep vabits_actual=39

The bt command then shows misleading backtrace information:

  crash> bt 16930
  PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
   #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
   #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0
   #2 [ffffffc034c438b0] __kvm_nvhe_$d.2314 at 86c54c6004ceff80
   #3 [ffffffc034c43950] __kvm_nvhe_$d.2314 at 55d6f96003a7b120
   #4 [ffffffc034c439f0] __kvm_nvhe_$d.2314 at 9ccec46003a80a64
   #5 [ffffffc034c43ac0] __kvm_nvhe_$d.2314 at 8cf41e6003a945c4
   #6 [ffffffc034c43b10] __kvm_nvhe_$d.2314 at a8f181e00372c818
   #7 [ffffffc034c43b40] __kvm_nvhe_$d.2314 at 6dedde600372c0d0
   #8 [ffffffc034c43b90] __kvm_nvhe_$d.2314 at 62cc07e00373d0ac
   #9 [ffffffc034c43c00] __kvm_nvhe_$d.2314 at 72fb1de00373bedc
  ...
       PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0
      X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000
      X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000
      X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000
      X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000
      X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf
      X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040
      X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001
       X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000
       X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000
       X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550
      ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000

Checking the raw data below shows that the lr (fp+8) value is a pointer whose prefix has already been replaced by a PAC:

  crash> bt -f
  PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
   #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
      ffffffc034c437f0: ffffffc034c43850 6be732e004cf05a4
      ffffffc034c43800: ffffffe006186108 a0ed07e004cf09c4
      ffffffc034c43810: ffffff8a1a340000 ffffff8a8d343c00
      ffffffc034c43820: ffffff89b3eada00 ffffff8b780db540
      ffffffc034c43830: ffffff89b3eada00 0000000000000000
      ffffffc034c43840: 0000000000000004 712b828118484a00
   #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0
      ffffffc034c43850: ffffffc034c438b0 86c54c6004ceff84
      ffffffc034c43860: 000000708070f000 ffffffc034c43938
      ffffffc034c43870: ffffff88bd822878 ffffff89b3eada00
  ...

So we check CONFIG_ARM64_PTR_AUTH and CONFIG_ARM64_PTR_AUTH_KERNEL to confirm that the PAC mechanism is enabled on this ramdump, and then use vabits to determine which bits hold the PAC.

With the fix, the correct backtrace is shown:

  crash> bt 16930
  PID: 16930 TASK: ffffff89b3eada00 CPU: 2 COMMAND: "Firebase Backgr"
   #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
   #1 [ffffffc034c43850] __schedule at ffffffe004cf05a0
   #2 [ffffffc034c438b0] preempt_schedule_common at ffffffe004ceff80
   #3 [ffffffc034c43950] unmap_page_range at ffffffe003a7b120
   #4 [ffffffc034c439f0] unmap_vmas at ffffffe003a80a64
   #5 [ffffffc034c43ac0] exit_mmap at ffffffe003a945c4
   #6 [ffffffc034c43b10] __mmput at ffffffe00372c818
   #7 [ffffffc034c43b40] mmput at ffffffe00372c0d0
   #8 [ffffffc034c43b90] exit_mm at ffffffe00373d0ac
   #9 [ffffffc034c43c00] do_exit at ffffffe00373bedc
       PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0
      X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000
      X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000
      X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000
      X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000
      X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf
      X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040
      X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001
       X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000
       X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000
       X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550
      ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000

Let's use GENMASK to replace the PAC bits of the pointer to fix it. The related GKI commit is here:
https://lore.kernel.org/all/20230412160134.306148-4-mark.rutland@arm.com/

Signed-off-by: bevis_chen <bevis_chen@asus.com>
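As a worked example of the masking idea, the sketch below restores the upper bits of a kernel return address. `GENMASK_U64` is a local re-implementation of the usual GENMASK pattern and `strip_kernel_pac()` is a hypothetical helper name; the assumption is that every bit from vabits up to bit 63 of a kernel pointer should read as 1 once the PAC is removed.

```c
#include <stdint.h>

/* Builds a mask with bits h..l (inclusive) set, for 0 <= l <= h <= 63. */
#define GENMASK_U64(h, l) \
    (((~UINT64_C(0)) << (l)) & ((~UINT64_C(0)) >> (63 - (h))))

/* Replaces the PAC bits of a kernel return address with the
 * all-ones prefix a kernel pointer normally carries. */
static uint64_t strip_kernel_pac(uint64_t lr, unsigned int vabits)
{
    return lr | GENMASK_U64(63, vabits);
}
```

With vabits_actual=39 the mask is 0xffffff8000000000, so the frame #1 value 6be732e004cf05a0 from the misleading backtrace becomes ffffffe004cf05a0, the `__schedule` address shown in the fixed output.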
If we use crash to parse a ramdump (Qcom phone device) rather than a vmcore, the start command should look like:

  crash vmlinux --kaslr=xxx DDRCS0_0.BIN@0x0000000080000000,... --machdep vabits_actual=39

The bt command then shows misleading backtrace information:

  crash> bt 16930
  PID: 16930  TASK: ffffff89b3eada00  CPU: 2  COMMAND: "Firebase Backgr"
   #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
   #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0
   #2 [ffffffc034c438b0] __kvm_nvhe_$d.2314 at 86c54c6004ceff80
   #3 [ffffffc034c43950] __kvm_nvhe_$d.2314 at 55d6f96003a7b120
   #4 [ffffffc034c439f0] __kvm_nvhe_$d.2314 at 9ccec46003a80a64
   #5 [ffffffc034c43ac0] __kvm_nvhe_$d.2314 at 8cf41e6003a945c4
   #6 [ffffffc034c43b10] __kvm_nvhe_$d.2314 at a8f181e00372c818
   #7 [ffffffc034c43b40] __kvm_nvhe_$d.2314 at 6dedde600372c0d0
   #8 [ffffffc034c43b90] __kvm_nvhe_$d.2314 at 62cc07e00373d0ac
   #9 [ffffffc034c43c00] __kvm_nvhe_$d.2314 at 72fb1de00373bedc
  ...
       PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0
      X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000
      X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000
      X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000
      X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000
      X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf
      X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040
      X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001
       X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000
       X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000
       X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550
      ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000

Checking the raw data below shows that the lr (fp+8) value is a pointer whose upper bits have already been replaced by the PAC prefix:

  crash> bt -f
  PID: 16930  TASK: ffffff89b3eada00  CPU: 2  COMMAND: "Firebase Backgr"
   #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
      ffffffc034c437f0: ffffffc034c43850 6be732e004cf05a4
      ffffffc034c43800: ffffffe006186108 a0ed07e004cf09c4
      ffffffc034c43810: ffffff8a1a340000 ffffff8a8d343c00
      ffffffc034c43820: ffffff89b3eada00 ffffff8b780db540
      ffffffc034c43830: ffffff89b3eada00 0000000000000000
      ffffffc034c43840: 0000000000000004 712b828118484a00
   #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0
      ffffffc034c43850: ffffffc034c438b0 86c54c6004ceff84
      ffffffc034c43860: 000000708070f000 ffffffc034c43938
      ffffffc034c43870: ffffff88bd822878 ffffff89b3eada00
  ...

So we check CONFIG_ARM64_PTR_AUTH and CONFIG_ARM64_PTR_AUTH_KERNEL to confirm that the PAC mechanism is enabled on this ramdump, then use vabits to work out which pointer bits hold the PAC. With the fix, bt shows the right backtrace:

  crash> bt 16930
  PID: 16930  TASK: ffffff89b3eada00  CPU: 2  COMMAND: "Firebase Backgr"
   #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4
   #1 [ffffffc034c43850] __schedule at ffffffe004cf05a0
   #2 [ffffffc034c438b0] preempt_schedule_common at ffffffe004ceff80
   #3 [ffffffc034c43950] unmap_page_range at ffffffe003a7b120
   #4 [ffffffc034c439f0] unmap_vmas at ffffffe003a80a64
   #5 [ffffffc034c43ac0] exit_mmap at ffffffe003a945c4
   #6 [ffffffc034c43b10] __mmput at ffffffe00372c818
   #7 [ffffffc034c43b40] mmput at ffffffe00372c0d0
   #8 [ffffffc034c43b90] exit_mm at ffffffe00373d0ac
   #9 [ffffffc034c43c00] do_exit at ffffffe00373bedc
       PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0
      X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000
      X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000
      X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000
      X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000
      X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf
      X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040
      X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001
       X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000
       X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000
       X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550
      ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000

Let's use GENMASK to replace the PAC bits in the pointer to fix it. The related GKI commit is here:
https://lore.kernel.org/all/20230412160134.306148-4-mark.rutland@arm.com/

Signed-off-by: bevis_chen <bevis_chen@asus.com>
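The masking step described in the commit message above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from the patch: strip_kernel_pac() is a hypothetical helper name, GENMASK_ULL is defined locally to mirror the kernel macro of the same name, and the sketch assumes that for a kernel pointer every bit from vabits_actual up to bit 63 should read as 1 once the PAC is removed.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's GENMASK_ULL(h, l): bits h..l set. */
#define GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/*
 * Hypothetical helper (not part of crash): restore a kernel pointer
 * whose upper bits were overwritten by a PAC, by forcing bits
 * [63:vabits_actual] back to 1.
 */
static uint64_t strip_kernel_pac(uint64_t ptr, unsigned int vabits_actual)
{
	assert(vabits_actual < 64);	/* a shift by 64 would be undefined */
	return ptr | GENMASK_ULL(63, vabits_actual);
}
```

With vabits_actual=39, as in the start command above, the raw lr value 6be732e004cf05a4 from the bt -f output becomes ffffffe004cf05a4, which agrees with the __schedule frame in the corrected backtrace (the frame is displayed at that address minus 4).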
The major(), minor(), and makedev() macros are defined in the
sys/sysmacros.h header, not sys/types.h. Linux C libraries are dropping
the implicit include from sys/types.h, so we need to include
sys/sysmacros.h explicitly.
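As a minimal sketch of what the explicit include looks like in a caller (make_and_check() is a hypothetical helper for illustration and does not appear in crash), assuming a Linux C library that provides sys/sysmacros.h:

```c
#include <assert.h>
#include <sys/types.h>		/* dev_t */
#include <sys/sysmacros.h>	/* major(), minor(), makedev() */

/*
 * Hypothetical helper: build a device number and verify that
 * major()/minor() recover the values passed to makedev().
 */
static dev_t make_and_check(unsigned int maj, unsigned int min)
{
	dev_t dev = makedev(maj, min);

	assert(major(dev) == maj);
	assert(minor(dev) == min);
	return dev;
}
```

Without the sys/sysmacros.h line, this only builds on C libraries where sys/types.h still pulls the header in implicitly.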