crash bt -c 0 got wrong info when cpu0 save notes failed #105

Closed
xuchunmei000 opened this issue Dec 8, 2021 · 23 comments

xuchunmei000 commented Dec 8, 2021

My platform is aarch64 with kernel version 5.10.23, crash 7.2.9, kexec-tools 2.0.21, and makedumpfile 1.6.9.
When the system crashed, cpu 0 and some other cpus failed to stop. Some information from the vmcore follows.
cpu 126 is the panic cpu, and cpu 1 also failed to stop.
Using help -D to get the vmcore info, I found that only one ELF note was parsed from the vmcore. It should belong to cpu 126, because the other cpus failed to stop, and only cpu 126 can show a backtrace.

crash> bt -c 1
PID: 66538  TASK: ffff00081749c200  CPU: 1   COMMAND: "fc_vcpu41"
bt: WARNING: cannot determine starting stack frame for task ffff00081749c200
crash> bt -c 0
PID: 66516  TASK: ffff00084642e300  CPU: 0   COMMAND: "fc_vcpu19"
Segmentation fault
crash> help -D | grep prstatus
  num_prstatus_notes: 1
crash> bt -c 126
PID: 0      TASK: ffff0400064de300  CPU: 126  COMMAND: "swapper/126"
 #0 [ffff8000250f3c90] __crash_kexec at ffff80001013a064
 #1 [ffff8000250f3e30] panic at ffff800010afd028
...
...

I found that in the arm64_get_crash_notes function, when getting crash_notes fails, it falls back to calling diskdump_get_prstatus_percpu to get the ELF note from nt_prstatus_percpu, so cpu0 gets dd->nt_prstatus_percpu[0] as its note.

dd->nt_prstatus_percpu is parsed from the vmcore for each cpu. When a cpu was offline or failed to stop before the crash, its crash notes or ELF notes are not saved, so using the cpu number as an index into dd->nt_prstatus_percpu returns the wrong note.

Any ideas on how to avoid getting the wrong note for an offline cpu or a cpu that failed to save its notes?
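
To illustrate the indexing problem described above, here is a minimal sketch in C. It is not crash's actual code: `map_notes_to_cpus` and its parameters are hypothetical names, and it assumes that notes were saved only by online cpus, in ascending cpu order, and that an online bitmask is available.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of crash: spread a dense array of
 * notes over per-CPU slots. Assumes dense[] holds one note per online
 * CPU in ascending CPU order, and that bit N of online_mask is set
 * when CPU N was online. Slots for offline CPUs are set to NULL.
 * Returns the number of notes placed.
 */
static int map_notes_to_cpus(void *const dense[], int nr_notes,
                             unsigned long online_mask,
                             void *percpu[], int nr_cpus)
{
    int cpu, next = 0;

    assert(dense != NULL && percpu != NULL);
    /* The mask is a single unsigned long, so it limits the CPU count. */
    assert(nr_cpus <= (int)(sizeof(unsigned long) * 8));

    for (cpu = 0; cpu < nr_cpus; cpu++) {
        if ((online_mask & (1UL << cpu)) && next < nr_notes)
            percpu[cpu] = dense[next++];
        else
            percpu[cpu] = NULL;
    }
    return next;
}
```

With 8 cpus, a mask of 0xF0 (cpus 4-7 online) and four notes, slots 0-3 end up NULL and slots 4-7 receive the four notes in order. A lookup for cpu 0 can then report that no note exists instead of returning another cpu's registers.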

k-hagio commented Dec 27, 2021

Thanks for the report.
@lian-bo, I have not been able to look into this yet, but can this also be reproduced on RHEL?

lian-bo commented Dec 27, 2021

So far I haven't seen it on RHEL. Could you please try it on the latest crash-7.3 or crash-8.0? If this is still reproduced, would you mind sharing the vmcore or the reproducible steps?

xuchunmei000 commented

> So far I haven't seen it on RHEL. Could you please try it on the latest crash-7.3 or crash-8.0? If this is still reproduced, would you mind sharing the vmcore or the reproducible steps?

Here are the steps to reproduce.
My aarch64 VM has 8 cpus. Before crashing the OS, I set some cpus offline:

echo 0 > /sys/devices/system/cpu/cpu0/online
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online

echo c > /proc/sysrq-trigger

Then running bt -c 0 gets a segmentation fault:

crash> bt -c 1
PID: 0      TASK: ffff0000c03510c0  CPU: 1   COMMAND: "swapper/1"
 #0 [ffff800011f73e90] arch_cpu_idle at ffff800010c0b0a4
crash> bt -c 2
PID: 0      TASK: ffff0000c039a180  CPU: 2   COMMAND: "swapper/2"
 #0 [ffff800011f7be90] arch_cpu_idle at ffff800010c0b0a4
crash> bt -c 3
PID: 0      TASK: ffff0000c039b240  CPU: 3   COMMAND: "swapper/3"
 #0 [ffff800011f83e90] arch_cpu_idle at ffff800010c0b0a4
crash> bt -c 0
PID: 0      TASK: ffff8000117fa240  CPU: 0   COMMAND: "swapper/0"
Segmentation fault (core dumped)

lian-bo commented Jan 17, 2022

The crash-7.2.9 is old, can you try it with the latest upstream crash? I have never reproduced this issue.

xuchunmei000 commented

> The crash-7.2.9 is old, can you try it with the latest upstream crash? I have never reproduced this issue.

I used the latest upstream version 8.0.0 with gdb 10.2 and can still reproduce the issue.

k-hagio commented Jan 20, 2022

Hmm, I thought that map_cpus_to_prstatus_kdump_cmprs() maps cpus to prstatus notes, but it doesn't on arm64.
Is this the cause?

In case I'm misunderstanding the situation, could you please send the whole help -D output from the 8-cpu machine?

xuchunmei000 commented Jan 20, 2022

crash> help -D
diskdump_data:
          filename: ./vmcore
             flags: 1c6 (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED|LZO_SUPPORTED|SNAPPY_SUPPORTED|ZSTD_SUPPORTED)
               dfd: 3
               ofp: ffffb554b510
      machine_type: 183 (EM_AARCH64)

            header: aaab2101ee10
           signature: "KDUMP   "
      header_version: 6
             utsname:
               sysname: Linux
              nodename: localhost
               release: 5.10.60-9.al8.aarch64
               version: #1 SMP Mon Sep 6 20:56:34 CST 2021
               machine: aarch64
            domainname: (none)
           timestamp:
                tv_sec: 61cd7e17
               tv_usec: 0
              status: 2 (DUMP_DH_COMPRESSED_LZO)
          block_size: 4096
        sub_hdr_size: 2
       bitmap_blocks: 262
           max_mapnr: 4286464
    total_ram_blocks: 0
       device_blocks: 0
      written_blocks: 0
         current_cpu: 0
             nr_cpus: 4
      tasks[nr_cpus]: 0
                      0
                      0
                      0

        sub_header: 0 (n/a)

  sub_header_kdump: aaab2101fe20
           phys_base: 40000000
          dump_level: 31 (0x1f) (DUMP_EXCLUDE_ZERO|DUMP_EXCLUDE_CACHE|DUMP_EXCLUDE_CACHE_PRI|DUMP_EXCLUDE_USER_DATA|DUMP_EXCLUDE_FREE)
               split: 0
           start_pfn: (unused)
             end_pfn: (unused)
   offset_vmcoreinfo: 5872 (0x16f0)
     size_vmcoreinfo: 2885 (0xb45)
                      OSRELEASE=5.10.60-9.al8.aarch64
                      BUILD-ID=c7f4708939637fe3985ed53ecb1aad98b94c847a
                      PAGESIZE=4096
                      SYMBOL(init_uts_ns)=ffff8000117fa028
                      SYMBOL(node_online_map)=ffff8000117f1bd0
                      SYMBOL(swapper_pg_dir)=ffff8000113b2000
                      SYMBOL(_stext)=ffff8000100d0000
                      SYMBOL(vmap_area_list)=ffff800011bdb6a0
                      SYMBOL(mem_section)=ffff0003d4783200
                      LENGTH(mem_section)=1024
                      SIZE(mem_section)=16
                      OFFSET(mem_section.section_mem_map)=0
                      NUMBER(SECTION_SIZE_BITS)=30
                      NUMBER(MAX_PHYSMEM_BITS)=48
                      SIZE(page)=64
                      SIZE(pglist_data)=7680
                      SIZE(zone)=1472
                      SIZE(free_area)=88
                      SIZE(list_head)=16
                      SIZE(nodemask_t)=8
                      OFFSET(page.flags)=0
                      OFFSET(page._refcount)=52
                      OFFSET(page.mapping)=24
                      OFFSET(page.lru)=8
                      OFFSET(page._mapcount)=48
                      OFFSET(page.private)=40
                      OFFSET(page.compound_dtor)=16
                      OFFSET(page.compound_order)=17
                      OFFSET(page.compound_head)=8
                      OFFSET(pglist_data.node_zones)=0
                      OFFSET(pglist_data.nr_zones)=6944
                      OFFSET(pglist_data.node_start_pfn)=6952
                      OFFSET(pglist_data.node_spanned_pages)=6968
                      OFFSET(pglist_data.node_id)=6992
                      OFFSET(zone.free_area)=192
                      OFFSET(zone.vm_stat)=1280
                      OFFSET(zone.spanned_pages)=112
                      OFFSET(free_area.free_list)=0
                      OFFSET(list_head.next)=0
                      OFFSET(list_head.prev)=8
                      OFFSET(vmap_area.va_start)=0
                      OFFSET(vmap_area.list)=40
                      LENGTH(zone.free_area)=11
                      SYMBOL(prb)=ffff80001181f330
                      SYMBOL(printk_rb_static)=ffff80001181f370
                      SYMBOL(clear_seq)=ffff800011cfb9e0
                      SIZE(printk_ringbuffer)=80
                      OFFSET(printk_ringbuffer.desc_ring)=0
                      OFFSET(printk_ringbuffer.text_data_ring)=40
                      OFFSET(printk_ringbuffer.fail)=72
                      SIZE(prb_desc_ring)=40
                      OFFSET(prb_desc_ring.count_bits)=0
                      OFFSET(prb_desc_ring.descs)=8
                      OFFSET(prb_desc_ring.infos)=16
                      OFFSET(prb_desc_ring.head_id)=24
                      OFFSET(prb_desc_ring.tail_id)=32
                      SIZE(prb_desc)=24
                      OFFSET(prb_desc.state_var)=0
                      OFFSET(prb_desc.text_blk_lpos)=8
                      SIZE(prb_data_blk_lpos)=16
                      OFFSET(prb_data_blk_lpos.begin)=0
                      OFFSET(prb_data_blk_lpos.next)=8
                      SIZE(printk_info)=88
                      OFFSET(printk_info.seq)=0
                      OFFSET(printk_info.ts_nsec)=8
                      OFFSET(printk_info.text_len)=16
                      OFFSET(printk_info.caller_id)=20
                      OFFSET(printk_info.dev_info)=24
                      SIZE(dev_printk_info)=64
                      OFFSET(dev_printk_info.subsystem)=0
                      LENGTH(printk_info_subsystem)=16
                      OFFSET(dev_printk_info.device)=16
                      LENGTH(printk_info_device)=48
                      SIZE(prb_data_ring)=32
                      OFFSET(prb_data_ring.size_bits)=0
                      OFFSET(prb_data_ring.data)=8
                      OFFSET(prb_data_ring.head_lpos)=16
                      OFFSET(prb_data_ring.tail_lpos)=24
                      SIZE(atomic_long_t)=8
                      OFFSET(atomic_long_t.counter)=0
                      LENGTH(free_area.free_list)=5
                      NUMBER(NR_FREE_PAGES)=0
                      NUMBER(PG_lru)=4
                      NUMBER(PG_private)=13
                      NUMBER(PG_swapcache)=10
                      NUMBER(PG_swapbacked)=19
                      NUMBER(PG_slab)=9
                      NUMBER(PG_hwpoison)=22
                      NUMBER(PG_head_mask)=65536
                      NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-129
                      NUMBER(HUGETLB_PAGE_DTOR)=2
                      NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257
                      NUMBER(VA_BITS)=48
                      NUMBER(kimage_voffset)=0xffff7ffc67a00000
                      NUMBER(PHYS_OFFSET)=0x40000000
                      NUMBER(TCR_EL1_T1SZ)=0x10
                      KERNELOFFSET=c0000
                      NUMBER(KERNELPACMASK)=0x0
                      CRASHTIME=1640857111
         offset_note: 4200 (0x1068)
           size_note: 4560 (0x11d0)
           notes_buf: aaab21020e30
  num_vmcoredd_notes: 0
  num_prstatus_notes: 4
            notes[0]: aaab21020e30 (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 1408  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: ffff0000c8742800   X1: 0000000000000000   X2: ffff00036b6e90c0
                       X3: ffff800011bb22e8   X4: ffff00036b6e90c0   X5: 0000000000000000
                       X6: 000000000000000f   X7: ffff80001181f550   X8: 0000000000000000
                       X9: ffff8000102448fc  X10: 00000000ffff8000  X11: ffff800011adf550
                      X12: 0720072007200720  X13: 0720072007200720  X14: 0720072007200720
                      X15: ffff00036b6e9740  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000030  X19: ffff00036b6e90c0  X20: ffff800011bb22a8
                      X21: 0000000000000000  X22: ffff800011e08000  X23: ffff80001329bab8
                      X24: ffff800011cf2000  X25: ffff800010cdcbc0  X26: 0000000000000000
                      X27: 0000000000000000  X28: ffff00036b6e90c0  X29: ffff80001329ba70
                       LR: ffff8000102448fc   SP: ffff80001329ba70   PC: ffff8000102449d4
                      PSTATE: 60000085   FPVALID: 00000000
            notes[1]: aaab21020fcc (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 0  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: 00000000000000e0   X1: ffff800011c60520   X2: 0000000000000001
                       X3: ffff80001097e240   X4: 0000000000000015   X5: 00ffffffffffffff
                       X6: 0000be9186c23431   X7: 00000010ab4a0098   X8: ffff0000c0398d20
                       X9: ffff80001097e268  X10: 0000000000000cc0  X11: 0000000000000000
                      X12: 0000000000000000  X13: 0000000000000000  X14: 0000000000000000
                      X15: 0000000000000000  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000000  X19: 0000000000000001  X20: ffff800011c605a0
                      X21: ffff0003d4738600  X22: ffff800011c60520  X23: 0000000000000001
                      X24: 000001b6696821aa  X25: 0000000000000000  X26: 0000000000000000
                      X27: 0000000000000000  X28: 0000000000000000  X29: ffff800011f73e90
                       LR: ffff800010c0b0a0   SP: ffff800011f73e90   PC: ffff800010c0b0a8
                      PSTATE: 60c00005   FPVALID: 00000000
            notes[2]: aaab21021168 (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 0  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: 00000000000000e0   X1: ffff800011c60520   X2: 0000000000000001
                       X3: ffff80001097e240   X4: 0000000000000015   X5: 00ffffffffffffff
                       X6: 0000be9186c23431   X7: 0000000d7156c757   X8: ffff0000c039d020
                       X9: ffff80001097e268  X10: 0000000000000cc0  X11: 0000000000000000
                      X12: 0000000000000000  X13: 0000000000000000  X14: 0000000000000000
                      X15: 0000000000000000  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000000  X19: 0000000000000001  X20: ffff800011c605a0
                      X21: ffff0003d4759600  X22: ffff800011c60520  X23: 0000000000000001
                      X24: 000001b666fedcd8  X25: 0000000000000000  X26: 0000000000000000
                      X27: 0000000000000000  X28: 0000000000000000  X29: ffff800011f7be90
                       LR: ffff800010c0b0a0   SP: ffff800011f7be90   PC: ffff800010c0b0a8
                      PSTATE: 60c00005   FPVALID: 00000000
            notes[3]: aaab21021304 (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 0  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: 00000000000000e0   X1: ffff800011c60520   X2: 0000000000000001
                       X3: ffff80001097e240   X4: 0000000000000015   X5: 00ffffffffffffff
                       X6: 0000be9186c23431   X7: 00000012126509af   X8: ffff0000c039e0e0
                       X9: ffff80001097e268  X10: 0000000000000cc0  X11: 0000000000000000
                      X12: 0000000000000000  X13: 0000000000000000  X14: 0000000000000000
                      X15: 0000000000000000  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000000  X19: 0000000000000001  X20: ffff800011c605a0
                      X21: ffff0003d477a600  X22: ffff800011c60520  X23: 0000000000000001
                      X24: 000001b669615082  X25: 0000000000000000  X26: 0000000000000000
                      X27: 0000000000000000  X28: 0000000000000000  X29: ffff800011f83e90
                       LR: ffff800010c0b0a0   SP: ffff800011f83e90   PC: ffff800010c0b0a8
                      PSTATE: 60c00005   FPVALID: 00000000
       snapshot_task: 0
      num_qemu_notes: 0
        NOTE offsets: 1068 (NT_PRSTATUS)
                      1204 (NT_PRSTATUS)
                      13a0 (NT_PRSTATUS)
                      153c (NT_PRSTATUS)
    offset_eraseinfo: 0 (0x0)
      size_eraseinfo: 0 (0x0)
        start_pfn_64: (unused)
          end_pfn_64: (unused)
        max_mapnr_64: 4286464 (0x416800)

       data_offset: 109000
        block_size: 4096
       block_shift: 12
            bitmap: ffffb52c3010
        bitmap_len: 1073152
         max_mapnr: 4286464 (0x416800)
   dumpable_bitmap: ffffb51bc010
              byte: 0
               bit: 0
   compressed_page: aaab2104c330
         curbufptr: aaab21049320

 page_cache_hdr[0]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f6000
           pg_bufptr: aaab2103c320
        pg_hit_count: 1
 page_cache_hdr[1]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f7000
           pg_bufptr: aaab2103d320
        pg_hit_count: 1
 page_cache_hdr[2]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f8000
           pg_bufptr: aaab2103e320
        pg_hit_count: 1
 page_cache_hdr[3]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f9000
           pg_bufptr: aaab2103f320
        pg_hit_count: 1
 page_cache_hdr[4]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fa000
           pg_bufptr: aaab21040320
        pg_hit_count: 1
 page_cache_hdr[5]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fb000
           pg_bufptr: aaab21041320
        pg_hit_count: 1
 page_cache_hdr[6]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fc000
           pg_bufptr: aaab21042320
        pg_hit_count: 1
 page_cache_hdr[7]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fd000
           pg_bufptr: aaab21043320
        pg_hit_count: 1
 page_cache_hdr[8]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa4af000
           pg_bufptr: aaab21044320
        pg_hit_count: 1
 page_cache_hdr[9]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3a926b000
           pg_bufptr: aaab21045320
        pg_hit_count: 10
page_cache_hdr[10]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3a9540000
           pg_bufptr: aaab21046320
        pg_hit_count: 2
page_cache_hdr[11]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3a9541000
           pg_bufptr: aaab21047320
        pg_hit_count: 9
page_cache_hdr[12]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3ab6e9000
           pg_bufptr: aaab21048320
        pg_hit_count: 1
page_cache_hdr[13]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3ab6ea000
           pg_bufptr: aaab21049320
        pg_hit_count: 1
page_cache_hdr[14]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f4000
           pg_bufptr: aaab2104a320
        pg_hit_count: 1
page_cache_hdr[15]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f5000
           pg_bufptr: aaab2104b320
        pg_hit_count: 1

    page_cache_buf: aaab2103c320
       evict_index: 14
         evictions: 2734
          accesses: 23443
      cached_reads: 20693 (88%)
       valid_pages: aaab2103a250
 total_valid_pages: 154959

k-hagio commented Jan 20, 2022

Thanks. So how does it work with this change?

--- a/diskdump.c
+++ b/diskdump.c
@@ -111,8 +111,7 @@ map_cpus_to_prstatus_kdump_cmprs(void)
 	if (pc->flags2 & QEMU_MEM_DUMP_COMPRESSED)  /* notes exist for all cpus */
 		goto resize_note_pointers;
 
-	if (!(online = get_cpus_online()) || (online == kt->cpus) || 
-	    machine_type("ARM64"))
+	if (!(online = get_cpus_online()) || (online == kt->cpus))
 		goto resize_note_pointers;
 
 	if (CRASHDEBUG(1))
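
The effect of the patch above on the early-exit condition can be sketched as two predicates. This is a simplified illustration, not crash's code: the function names and the plain int/bool parameters are assumptions standing in for get_cpus_online(), kt->cpus and machine_type("ARM64").

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical condensed form of the check before the patch: the
 * cpu-to-note mapping is skipped when the online count is unknown (0),
 * when every cpu is online, or whenever the machine is arm64.
 */
static bool skip_mapping_before(int online, int total, bool is_arm64)
{
    assert(online >= 0 && online <= total);
    return online == 0 || online == total || is_arm64;
}

/*
 * Hypothetical condensed form after the patch: arm64 is no longer a
 * special case, so a partially online arm64 system gets mapped too.
 */
static bool skip_mapping_after(int online, int total)
{
    assert(online >= 0 && online <= total);
    return online == 0 || online == total;
}
```

For an arm64 dump with 4 of 8 cpus online, the first predicate returns true (mapping skipped) and the second returns false (mapping attempted).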

xuchunmei000 commented

> Thanks. so how does it work with this change?

I tried, but it does not work.

k-hagio commented Jan 20, 2022

What is printed by help -D with the patch?

xuchunmei000 commented

> What is printed by help -D with the patch?

Sorry for the late reply:

crash> help -D
diskdump_data:
          filename: /var/crash/127.0.0.1-2021-12-31-01:38:10/vmcore
             flags: 1c6 (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED|LZO_SUPPORTED|SNAPPY_SUPPORTED|ZSTD_SUPPORTED)
               dfd: 3
               ofp: ffff9f717510
      machine_type: 183 (EM_AARCH64)

            header: aaab02541e10
           signature: "KDUMP   "
      header_version: 6
             utsname:
               sysname: Linux
              nodename: localhost
               release: 5.10.60-9.al8.aarch64
               version: #1 SMP Mon Sep 6 20:56:34 CST 2021
               machine: aarch64
            domainname: (none)
           timestamp:
                tv_sec: 61cd7e17
               tv_usec: 0
              status: 2 (DUMP_DH_COMPRESSED_LZO)
          block_size: 4096
        sub_hdr_size: 2
       bitmap_blocks: 262
           max_mapnr: 4286464
    total_ram_blocks: 0
       device_blocks: 0
      written_blocks: 0
         current_cpu: 0
             nr_cpus: 4
      tasks[nr_cpus]: 0
                      0
                      0
                      0

        sub_header: 0 (n/a)

  sub_header_kdump: aaab02542e20
           phys_base: 40000000
          dump_level: 31 (0x1f) (DUMP_EXCLUDE_ZERO|DUMP_EXCLUDE_CACHE|DUMP_EXCLUDE_CACHE_PRI|DUMP_EXCLUDE_USER_DATA|DUMP_EXCLUDE_FREE)
               split: 0
           start_pfn: (unused)
             end_pfn: (unused)
   offset_vmcoreinfo: 5872 (0x16f0)
     size_vmcoreinfo: 2885 (0xb45)
                      OSRELEASE=5.10.60-9.al8.aarch64
                      BUILD-ID=c7f4708939637fe3985ed53ecb1aad98b94c847a
                      PAGESIZE=4096
                      SYMBOL(init_uts_ns)=ffff8000117fa028
                      SYMBOL(node_online_map)=ffff8000117f1bd0
                      SYMBOL(swapper_pg_dir)=ffff8000113b2000
                      SYMBOL(_stext)=ffff8000100d0000
                      SYMBOL(vmap_area_list)=ffff800011bdb6a0
                      SYMBOL(mem_section)=ffff0003d4783200
                      LENGTH(mem_section)=1024
                      SIZE(mem_section)=16
                      OFFSET(mem_section.section_mem_map)=0
                      NUMBER(SECTION_SIZE_BITS)=30
                      NUMBER(MAX_PHYSMEM_BITS)=48
                      SIZE(page)=64
                      SIZE(pglist_data)=7680
                      SIZE(zone)=1472
                      SIZE(free_area)=88
                      SIZE(list_head)=16
                      SIZE(nodemask_t)=8
                      OFFSET(page.flags)=0
                      OFFSET(page._refcount)=52
                      OFFSET(page.mapping)=24
                      OFFSET(page.lru)=8
                      OFFSET(page._mapcount)=48
                      OFFSET(page.private)=40
                      OFFSET(page.compound_dtor)=16
                      OFFSET(page.compound_order)=17
                      OFFSET(page.compound_head)=8
                      OFFSET(pglist_data.node_zones)=0
                      OFFSET(pglist_data.nr_zones)=6944
                      OFFSET(pglist_data.node_start_pfn)=6952
                      OFFSET(pglist_data.node_spanned_pages)=6968
                      OFFSET(pglist_data.node_id)=6992
                      OFFSET(zone.free_area)=192
                      OFFSET(zone.vm_stat)=1280
                      OFFSET(zone.spanned_pages)=112
                      OFFSET(free_area.free_list)=0
                      OFFSET(list_head.next)=0
                      OFFSET(list_head.prev)=8
                      OFFSET(vmap_area.va_start)=0
                      OFFSET(vmap_area.list)=40
                      LENGTH(zone.free_area)=11
                      SYMBOL(prb)=ffff80001181f330
                      SYMBOL(printk_rb_static)=ffff80001181f370
                      SYMBOL(clear_seq)=ffff800011cfb9e0
                      SIZE(printk_ringbuffer)=80
                      OFFSET(printk_ringbuffer.desc_ring)=0
                      OFFSET(printk_ringbuffer.text_data_ring)=40
                      OFFSET(printk_ringbuffer.fail)=72
                      SIZE(prb_desc_ring)=40
                      OFFSET(prb_desc_ring.count_bits)=0
                      OFFSET(prb_desc_ring.descs)=8
                      OFFSET(prb_desc_ring.infos)=16
                      OFFSET(prb_desc_ring.head_id)=24
                      OFFSET(prb_desc_ring.tail_id)=32
                      SIZE(prb_desc)=24
                      OFFSET(prb_desc.state_var)=0
                      OFFSET(prb_desc.text_blk_lpos)=8
                      SIZE(prb_data_blk_lpos)=16
                      OFFSET(prb_data_blk_lpos.begin)=0
                      OFFSET(prb_data_blk_lpos.next)=8
                      SIZE(printk_info)=88
                      OFFSET(printk_info.seq)=0
                      OFFSET(printk_info.ts_nsec)=8
                      OFFSET(printk_info.text_len)=16
                      OFFSET(printk_info.caller_id)=20
                      OFFSET(printk_info.dev_info)=24
                      SIZE(dev_printk_info)=64
                      OFFSET(dev_printk_info.subsystem)=0
                      LENGTH(printk_info_subsystem)=16
                      OFFSET(dev_printk_info.device)=16
                      LENGTH(printk_info_device)=48
                      SIZE(prb_data_ring)=32
                      OFFSET(prb_data_ring.size_bits)=0
                      OFFSET(prb_data_ring.data)=8
                      OFFSET(prb_data_ring.head_lpos)=16
                      OFFSET(prb_data_ring.tail_lpos)=24
                      SIZE(atomic_long_t)=8
                      OFFSET(atomic_long_t.counter)=0
                      LENGTH(free_area.free_list)=5
                      NUMBER(NR_FREE_PAGES)=0
                      NUMBER(PG_lru)=4
                      NUMBER(PG_private)=13
                      NUMBER(PG_swapcache)=10
                      NUMBER(PG_swapbacked)=19
                      NUMBER(PG_slab)=9
                      NUMBER(PG_hwpoison)=22
                      NUMBER(PG_head_mask)=65536
                      NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-129
                      NUMBER(HUGETLB_PAGE_DTOR)=2
                      NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE)=-257
                      NUMBER(VA_BITS)=48
                      NUMBER(kimage_voffset)=0xffff7ffc67a00000
                      NUMBER(PHYS_OFFSET)=0x40000000
                      NUMBER(TCR_EL1_T1SZ)=0x10
                      KERNELOFFSET=c0000
                      NUMBER(KERNELPACMASK)=0x0
                      CRASHTIME=1640857111
         offset_note: 4200 (0x1068)
           size_note: 4560 (0x11d0)
           notes_buf: aaab02543e30
  num_vmcoredd_notes: 0
  num_prstatus_notes: 8
            notes[0]: 0
            notes[1]: 0
            notes[2]: 0
            notes[3]: 0
            notes[4]: aaab02543e30 (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 1408  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: ffff0000c8742800   X1: 0000000000000000   X2: ffff00036b6e90c0
                       X3: ffff800011bb22e8   X4: ffff00036b6e90c0   X5: 0000000000000000
                       X6: 000000000000000f   X7: ffff80001181f550   X8: 0000000000000000
                       X9: ffff8000102448fc  X10: 00000000ffff8000  X11: ffff800011adf550
                      X12: 0720072007200720  X13: 0720072007200720  X14: 0720072007200720
                      X15: ffff00036b6e9740  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000030  X19: ffff00036b6e90c0  X20: ffff800011bb22a8
                      X21: 0000000000000000  X22: ffff800011e08000  X23: ffff80001329bab8
                      X24: ffff800011cf2000  X25: ffff800010cdcbc0  X26: 0000000000000000
                      X27: 0000000000000000  X28: ffff00036b6e90c0  X29: ffff80001329ba70
                       LR: ffff8000102448fc   SP: ffff80001329ba70   PC: ffff8000102449d4
                      PSTATE: 60000085   FPVALID: 00000000
            notes[5]: aaab02543fcc (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 0  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: 00000000000000e0   X1: ffff800011c60520   X2: 0000000000000001
                       X3: ffff80001097e240   X4: 0000000000000015   X5: 00ffffffffffffff
                       X6: 0000be9186c23431   X7: 00000010ab4a0098   X8: ffff0000c0398d20
                       X9: ffff80001097e268  X10: 0000000000000cc0  X11: 0000000000000000
                      X12: 0000000000000000  X13: 0000000000000000  X14: 0000000000000000
                      X15: 0000000000000000  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000000  X19: 0000000000000001  X20: ffff800011c605a0
                      X21: ffff0003d4738600  X22: ffff800011c60520  X23: 0000000000000001
                      X24: 000001b6696821aa  X25: 0000000000000000  X26: 0000000000000000
                      X27: 0000000000000000  X28: 0000000000000000  X29: ffff800011f73e90
                       LR: ffff800010c0b0a0   SP: ffff800011f73e90   PC: ffff800010c0b0a8
                      PSTATE: 60c00005   FPVALID: 00000000
            notes[6]: aaab02544168 (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 0  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: 00000000000000e0   X1: ffff800011c60520   X2: 0000000000000001
                       X3: ffff80001097e240   X4: 0000000000000015   X5: 00ffffffffffffff
                       X6: 0000be9186c23431   X7: 0000000d7156c757   X8: ffff0000c039d020
                       X9: ffff80001097e268  X10: 0000000000000cc0  X11: 0000000000000000
                      X12: 0000000000000000  X13: 0000000000000000  X14: 0000000000000000
                      X15: 0000000000000000  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000000  X19: 0000000000000001  X20: ffff800011c605a0
                      X21: ffff0003d4759600  X22: ffff800011c60520  X23: 0000000000000001
                      X24: 000001b666fedcd8  X25: 0000000000000000  X26: 0000000000000000
                      X27: 0000000000000000  X28: 0000000000000000  X29: ffff800011f7be90
                       LR: ffff800010c0b0a0   SP: ffff800011f7be90   PC: ffff800010c0b0a8
                      PSTATE: 60c00005   FPVALID: 00000000
            notes[7]: aaab02544304 (NT_PRSTATUS)
                      si.signo: 0  si.code: 0  si.errno: 0
                      cursig: 0  sigpend: 0  sighold: 0
                      pid: 0  ppid: 0  pgrp: 0  sid:0
                      utime: 0.000000  stime: 0.000000
                      cutime: 0.000000  cstime: 0.000000
                       X0: 00000000000000e0   X1: ffff800011c60520   X2: 0000000000000001
                       X3: ffff80001097e240   X4: 0000000000000015   X5: 00ffffffffffffff
                       X6: 0000be9186c23431   X7: 00000012126509af   X8: ffff0000c039e0e0
                       X9: ffff80001097e268  X10: 0000000000000cc0  X11: 0000000000000000
                      X12: 0000000000000000  X13: 0000000000000000  X14: 0000000000000000
                      X15: 0000000000000000  X16: 0000000000000000  X17: 0000000000000000
                      X18: 0000000000000000  X19: 0000000000000001  X20: ffff800011c605a0
                      X21: ffff0003d477a600  X22: ffff800011c60520  X23: 0000000000000001
                      X24: 000001b669615082  X25: 0000000000000000  X26: 0000000000000000
                      X27: 0000000000000000  X28: 0000000000000000  X29: ffff800011f83e90
                       LR: ffff800010c0b0a0   SP: ffff800011f83e90   PC: ffff800010c0b0a8
                      PSTATE: 60c00005   FPVALID: 00000000
       snapshot_task: 0
      num_qemu_notes: 0
        NOTE offsets: 1068 (NT_PRSTATUS)
                      1204 (NT_PRSTATUS)
                      13a0 (NT_PRSTATUS)
                      153c (NT_PRSTATUS)
    offset_eraseinfo: 0 (0x0)
      size_eraseinfo: 0 (0x0)
        start_pfn_64: (unused)
          end_pfn_64: (unused)
        max_mapnr_64: 4286464 (0x416800)

       data_offset: 109000
        block_size: 4096
       block_shift: 12
            bitmap: ffff9f48f010
        bitmap_len: 1073152
         max_mapnr: 4286464 (0x416800)
   dumpable_bitmap: ffff9f388010
              byte: 0
               bit: 0
   compressed_page: aaab0256f330
         curbufptr: aaab0256c320

 page_cache_hdr[0]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f6000
           pg_bufptr: aaab0255f320
        pg_hit_count: 1
 page_cache_hdr[1]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f7000
           pg_bufptr: aaab02560320
        pg_hit_count: 1
 page_cache_hdr[2]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f8000
           pg_bufptr: aaab02561320
        pg_hit_count: 1
 page_cache_hdr[3]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f9000
           pg_bufptr: aaab02562320
        pg_hit_count: 1
 page_cache_hdr[4]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fa000
           pg_bufptr: aaab02563320
        pg_hit_count: 1
 page_cache_hdr[5]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fb000
           pg_bufptr: aaab02564320
        pg_hit_count: 1
 page_cache_hdr[6]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fc000
           pg_bufptr: aaab02565320
        pg_hit_count: 1
 page_cache_hdr[7]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3fd000
           pg_bufptr: aaab02566320
        pg_hit_count: 1
 page_cache_hdr[8]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa4af000
           pg_bufptr: aaab02567320
        pg_hit_count: 1
 page_cache_hdr[9]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3a926b000
           pg_bufptr: aaab02568320
        pg_hit_count: 10
page_cache_hdr[10]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3a9540000
           pg_bufptr: aaab02569320
        pg_hit_count: 2
page_cache_hdr[11]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3a9541000
           pg_bufptr: aaab0256a320
        pg_hit_count: 9
page_cache_hdr[12]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3ab6e9000
           pg_bufptr: aaab0256b320
        pg_hit_count: 1
page_cache_hdr[13]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3ab6ea000
           pg_bufptr: aaab0256c320
        pg_hit_count: 1
page_cache_hdr[14]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f4000
           pg_bufptr: aaab0256d320
        pg_hit_count: 1
page_cache_hdr[15]:
            pg_flags: 1 (PAGE_VALID)
             pg_addr: 3aa3f5000
           pg_bufptr: aaab0256e320
        pg_hit_count: 1

    page_cache_buf: aaab0255f320
       evict_index: 14
         evictions: 2734
          accesses: 23443
      cached_reads: 20693 (88%)
       valid_pages: aaab0255d250
 total_valid_pages: 154959

@k-hagio
Contributor

k-hagio commented Jan 20, 2022

Thanks, it looks correctly mapped.

  num_prstatus_notes: 8
            notes[0]: 0
            notes[1]: 0
            notes[2]: 0
            notes[3]: 0
            notes[4]: aaab02543e30 (NT_PRSTATUS)
            ...

I tried, but it does not work.

What errors do you see? The same segfault by bt -c 0?

@xuchunmei000
Author

Thanks, it looks correctly mapped.

  num_prstatus_notes: 8
            notes[0]: 0
            notes[1]: 0
            notes[2]: 0
            notes[3]: 0
            notes[4]: aaab02543e30 (NT_PRSTATUS)
            ...

I tried, but it does not work.

What errors do you see? The same segfault by bt -c 0?

yes.

crash> bt -c 1
PID: 0      TASK: ffff0000c03510c0  CPU: 1   COMMAND: "swapper/1"
 #0 [ffff800011f73e90] arch_cpu_idle at ffff800010c0b0a4
crash> bt -c 2
PID: 0      TASK: ffff0000c039a180  CPU: 2   COMMAND: "swapper/2"
 #0 [ffff800011f7be90] arch_cpu_idle at ffff800010c0b0a4
crash> bt -c 2
PID: 0      TASK: ffff0000c039a180  CPU: 2   COMMAND: "swapper/2"
 #0 [ffff800011f7be90] arch_cpu_idle at ffff800010c0b0a4
crash> bt -c 3
PID: 0      TASK: ffff0000c039b240  CPU: 3   COMMAND: "swapper/3"
 #0 [ffff800011f83e90] arch_cpu_idle at ffff800010c0b0a4
crash> bt -c 0
PID: 0      TASK: ffff8000117fa240  CPU: 0   COMMAND: "swapper/0"
Segmentation fault (core dumped)

@k-hagio
Contributor

k-hagio commented Jan 20, 2022

So perhaps dd->nt_prstatus_percpu is not the cause.
Is it possible to debug where crash fails?
I don't have an arm machine and cannot reproduce this.

@xuchunmei000
Author

So perhaps dd->nt_prstatus_percpu is not the cause. Is it possible to debug where crash fails? I don't have an arm machine and cannot reproduce this.

The gdb output follows. panic_task_regs[0] is identical to panic_task_regs[4], but cpu 0 did not save crash_notes, so panic_task_regs[0] should be empty.

 #0  arm64_is_kernel_exception_frame (bt=bt@entry=0xffffcd47d9f8, stkptr=stkptr@entry=18446603336542697776) at arm64.c:1925
1925		if (INSTACK(regs->sp, bt) && INSTACK(regs->regs[29], bt) &&
[Current thread is 1 (Thread 0xffff8aa7f010 (LWP 128066))]
(gdb) bt
#0  arm64_is_kernel_exception_frame (bt=bt@entry=0xffffcd47d9f8, stkptr=stkptr@entry=18446603336542697776) at arm64.c:1925
#1  0x0000aaaab26b2ef4 in arm64_back_trace_cmd (bt=0xffffcd47d9f8) at arm64.c:2760
#2  0x0000aaaab2684058 in back_trace (bt=0xffffcd47d9f8) at kernel.c:3186
#3  0x0000aaaab2685be4 in cmd_bt () at kernel.c:2789
#4  0x0000aaaab25fe2fc in exec_command () at main.c:892
#5  0x0000aaaab25fe5b8 in main_loop () at main.c:839
#6  0x0000aaaab292216c in captured_main (data=data@entry=0xffffcd47e1e0) at main.c:1284
#7  gdb_main (args=args@entry=0xffffcd47e220) at main.c:1313
#8  0x0000aaaab292225c in gdb_main_entry (argc=<optimized out>, argv=<optimized out>) at main.c:1338
#9  0x0000aaaab25f873c in main (argc=3, argv=0xffffcd47e418) at main.c:720
(gdb) p machdep->machspec->panic_task_regs[0]
$1 = {{user_regs = {regs = {18446462602095896576, 0, 18446462613420150976, 18446603336518673128, 18446462613420150976, 0, 15,
        18446603336514925904, 0, 18446603336492009724, 4294934528, 18446603336517809488, 513418191660123936, 513418191660123936,
        513418191660123936, 18446462613420152640, 0, 0, 48, 18446462613420150976, 18446603336518673064, 0, 18446603336521121792,
        18446603336542698168, 18446603336519983104, 18446603336503118784, 0, 0, 18446462613420150976, 18446603336542698096,
        18446603336492009724}, sp = 18446603336542698096, pc = 18446603336492009940, pstate = 1610612869}, {regs = {18446462602095896576,
        0, 18446462613420150976, 18446603336518673128, 18446462613420150976, 0, 15, 18446603336514925904, 0, 18446603336492009724,
        4294934528, 18446603336517809488, 513418191660123936, 513418191660123936, 513418191660123936, 18446462613420152640, 0, 0, 48,
        18446462613420150976, 18446603336518673064, 0, 18446603336521121792, 18446603336542698168, 18446603336519983104,
        18446603336503118784, 0, 0, 18446462613420150976, 18446603336542698096, 18446603336492009724}, sp = 18446603336542698096,
      pc = 18446603336492009940, pstate = 1610612869}}, orig_x0 = 0, syscallno = 0}
(gdb) p machdep->machspec->panic_task_regs[4]
$2 = {{user_regs = {regs = {18446462602095896576, 0, 18446462613420150976, 18446603336518673128, 18446462613420150976, 0, 15,
        18446603336514925904, 0, 18446603336492009724, 4294934528, 18446603336517809488, 513418191660123936, 513418191660123936,
        513418191660123936, 18446462613420152640, 0, 0, 48, 18446462613420150976, 18446603336518673064, 0, 18446603336521121792,
        18446603336542698168, 18446603336519983104, 18446603336503118784, 0, 0, 18446462613420150976, 18446603336542698096,
        18446603336492009724}, sp = 18446603336542698096, pc = 18446603336492009940, pstate = 1610612869}, {regs = {18446462602095896576,
        0, 18446462613420150976, 18446603336518673128, 18446462613420150976, 0, 15, 18446603336514925904, 0, 18446603336492009724,
        4294934528, 18446603336517809488, 513418191660123936, 513418191660123936, 513418191660123936, 18446462613420152640, 0, 0, 48,
        18446462613420150976, 18446603336518673064, 0, 18446603336521121792, 18446603336542698168, 18446603336519983104,
        18446603336503118784, 0, 0, 18446462613420150976, 18446603336542698096, 18446603336492009724}, sp = 18446603336542698096,
      pc = 18446603336492009940, pstate = 1610612869}}, orig_x0 = 0, syscallno = 0}

@xuchunmei000
Author

So perhaps dd->nt_prstatus_percpu is not the cause. Is it possible to debug where crash fails? I don't have an arm machine and cannot reproduce this.

I have some debug information:
With your patch, dd->nt_prstatus_percpu is correctly mapped in map_cpus_to_prstatus_kdump_cmprs,
but that mapping happens after machdep->machspec->panic_task_regs has already been saved.
machdep->machspec->panic_task_regs is saved by arm64_get_crash_notes, which is called from arm64_init,
while map_cpus_to_prstatus_kdump_cmprs is called from task_init.
task_init is called later than machdep_init(arm64_init).
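
The ordering problem described above can be illustrated with a small model. This is only a sketch, not crash's actual code: `struct note`, `struct dump`, `load_notes`, `remap_notes` and `snapshot` are hypothetical stand-ins for the NT_PRSTATUS notes, dd->nt_prstatus_percpu, the initial note load, map_cpus_to_prstatus_kdump_cmprs and the copy made by arm64_get_crash_notes. The CPU layout is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 8

/* Hypothetical stand-in for one NT_PRSTATUS note; only sp is modelled. */
struct note { unsigned long sp; };

/* Hypothetical stand-in for the per-cpu note table (dd->nt_prstatus_percpu). */
struct dump {
    const struct note *percpu[NR_CPUS];
};

/* Initial load: note j lands in slot j, regardless of which cpu wrote it. */
static void load_notes(struct dump *d, const struct note *notes, size_t n)
{
    assert(n <= NR_CPUS);
    for (size_t i = 0; i < NR_CPUS; i++)
        d->percpu[i] = (i < n) ? &notes[i] : NULL;
}

/* Remap: hand out notes, in file order, only to cpus set in online_mask. */
static void remap_notes(struct dump *d, const struct note *notes, size_t n,
                        unsigned online_mask)
{
    size_t j = 0;
    for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
        d->percpu[cpu] = (((online_mask >> cpu) & 1u) && j < n)
                             ? &notes[j++] : NULL;
}

/* Snapshot: copy the table as a consumer caching per-cpu registers would. */
static void snapshot(const struct dump *d, const struct note *out[NR_CPUS])
{
    for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
        out[cpu] = d->percpu[cpu];
}
```

In this model, a snapshot taken between `load_notes` and `remap_notes` still gives cpu 0 the first note in the file, even if that note belongs to another cpu. A snapshot taken after `remap_notes` leaves cpu 0 empty, which is the behaviour expected here.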

@lian-bo
Member

lian-bo commented Jan 21, 2022

I received a vmcore with the same backtrace from customers; on this specific vmcore, "bt -a" may trigger the segfault. It seems to be a similar case.

@k-hagio
Contributor

k-hagio commented Jan 25, 2022

task_init is called later than machdep_init(arm64_init).

Thanks for debugging.

How does the following change work, in addition to the patch above?

--- a/arm64.c
+++ b/arm64.c
@@ -472,7 +472,7 @@ arm64_init(int when)
                arm64_stackframe_init();
                break;
 
-       case POST_VM:
+       case POST_INIT:
                /*
                 * crash_notes contains machine specific information about the
                 * crash. In particular, it contains CPU registers at the time

@xuchunmei000
Author

xuchunmei000 commented Jan 25, 2022

map_cpus_to_prstatus_kdump_cmprs

Yes, it works when combined with the previous patch for map_cpus_to_prstatus_kdump_cmprs.

@k-hagio
Contributor

k-hagio commented Jan 25, 2022

Thanks for testing.
It's just an idea, will check if there is a better way.

@k-hagio
Contributor

k-hagio commented Jan 25, 2022

@lian-bo, could you test this patch with the vmcore you got?
I think I will go with this.

--- a/arm64.c
+++ b/arm64.c
@@ -472,7 +472,7 @@ arm64_init(int when)
                arm64_stackframe_init();
                break;
 
-       case POST_VM:
+       case POST_INIT:
                /*
                 * crash_notes contains machine specific information about the
                 * crash. In particular, it contains CPU registers at the time
diff --git a/diskdump.c b/diskdump.c
index 3e1cfd548c96..d5674276e1fd 100644
--- a/diskdump.c
+++ b/diskdump.c
@@ -111,8 +111,7 @@ map_cpus_to_prstatus_kdump_cmprs(void)
        if (pc->flags2 & QEMU_MEM_DUMP_COMPRESSED)  /* notes exist for all cpus */
                goto resize_note_pointers;
 
-       if (!(online = get_cpus_online()) || (online == kt->cpus) || 
-           machine_type("ARM64"))
+       if (!(online = get_cpus_online()) || (online == kt->cpus))
                goto resize_note_pointers;
 
        if (CRASHDEBUG(1))
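
The effect of the diskdump.c hunk can be restated as two predicates. This is a sketch only: `skip_remap_old` and `skip_remap_new` are hypothetical names, and the integer and boolean parameters stand in for get_cpus_online(), kt->cpus and machine_type("ARM64").

```c
#include <assert.h>
#include <stdbool.h>

/* Old condition: remapping was skipped on arm64 unconditionally. */
static bool skip_remap_old(int online, int total, bool is_arm64)
{
    assert(online <= total);
    return online == 0 || online == total || is_arm64;
}

/* New condition: arm64 is treated like the other architectures. */
static bool skip_remap_new(int online, int total)
{
    assert(online <= total);
    return online == 0 || online == total;
}
```

With 4 of 8 cpus online on arm64, the old predicate skips the remapping while the new one lets it run. When every cpu is online, both still skip it.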

@lian-bo
Member

lian-bo commented Jan 25, 2022

Sure. crash printed a lot of warnings when I ran the bt command on my vmcore, but I do not have much time to investigate the details. As you know, I'm currently working on other issues. I will help test it further once I have time.
But anyway, could you post it upstream first? We can continue the discussion there. Thanks.

@k-hagio
Contributor

k-hagio commented Jan 26, 2022

ok, will post.

lian-bo pushed a commit that referenced this issue Jan 30, 2022
Currently on arm64, NT_PRSTATUS notes in dumpfile are not mapped to
online cpus and machine_specific->panic_task_regs correctly.  As a
result, the "bt" command can cause a segmentation fault.

  crash> bt -c 0
  PID: 0      TASK: ffff8000117fa240  CPU: 0   COMMAND: "swapper/0"
  Segmentation fault (core dumped)

To fix this,
1) make map_cpus_to_prstatus_kdump_cmprs() map the notes to
   dd->nt_prstatus_percpu also on arm64, and
2) move arm64_get_crash_notes() to machdep_init(POST_INIT) in order
   to apply the mapping to machine_specific->panic_task_regs.

Resolves: #105
Reported-by: xuchunmei000 <xuchunmei@linux.alibaba.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
Tested-by: David Wysochanski <dwysocha@redhat.com>
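
Point 2 of the commit message matters because the register cache is filled from the per-cpu note table. A minimal sketch of such a fill, with a check for cpus that have no note, could look like the following. All names here (`struct regs`, `struct note`, `note_for_cpu`, `fill_reg_cache`) are hypothetical and do not reflect crash's real data structures or the exact logic of arm64_get_crash_notes().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS 8

struct regs { unsigned long sp, pc; };   /* hypothetical register subset */
struct note { struct regs r; };          /* hypothetical NT_PRSTATUS payload */

/* Bounds-checked lookup: NULL for out-of-range or unmapped cpus. */
static const struct note *note_for_cpu(const struct note *percpu[],
                                       int ncpus, int cpu)
{
    if (cpu < 0 || cpu >= ncpus)
        return NULL;
    return percpu[cpu];
}

/*
 * Fill the register cache from the (already remapped) per-cpu table.
 * valid[cpu] records whether cache[cpu] holds real registers, so that a
 * later backtrace can refuse to start from an empty slot.
 * Returns the number of cpus that had a note.
 */
static int fill_reg_cache(const struct note *percpu[], int ncpus,
                          struct regs cache[], unsigned char valid[])
{
    int filled = 0;

    assert(ncpus >= 0 && ncpus <= NR_CPUS);
    for (int cpu = 0; cpu < ncpus; cpu++) {
        const struct note *n = note_for_cpu(percpu, ncpus, cpu);

        valid[cpu] = (n != NULL);
        if (n) {
            cache[cpu] = n->r;
            filled++;
        } else {
            memset(&cache[cpu], 0, sizeof cache[cpu]);
        }
    }
    return filled;
}
```

The separate `valid` array is a design choice of this sketch: it keeps "no registers saved" distinguishable from registers that happen to be zero.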