Update on top of tao-rebase-v3 #2

Open
wants to merge 3 commits into
base: tao-rebase-v3

YustasSwamp
Contributor

Added 3 patches:

  1. Enhancements to vmware_guestdump that I sent previously. They were not present in one-thread-v2, so I cherry-picked them.
    They were required for testing against many different VMware dump images.
  2. "gdb bt" fix for inactive tasks for VMware dumps
  3. set_context() minor improvement.

Alexey Makhalov added 3 commits April 29, 2024 13:29
There are several versions of the debug.guest format. The current
code can parse only version 4.

Improve the parser to support the other known versions. Split the data
structures into sub-structures and introduce helper functions that
calculate the gap between them based on the version number. Implement
an additional data structure (struct mainmeminfo_old) and logic
specifically to support the original (version 1) format.
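
The version-dependent gap idea described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the struct names (`hdr_part`, `tail_part`), the helper names, and the gap sizes per version are all hypothetical, since the real debug.guest layout is not shown in this conversation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sub-structures: a fixed head, a version-dependent gap,
 * then a fixed tail. Field names are invented for illustration. */
struct hdr_part  { uint32_t magic; uint32_t version; };
struct tail_part { uint64_t mem_size; };

/* Return the number of bytes between hdr_part and tail_part for a given
 * format version, or -1 if the version is unknown.
 * The sizes below are made up; they are NOT the real debug.guest gaps. */
static long gap_after_hdr(uint32_t version)
{
    switch (version) {
    case 1: return 0;
    case 2: return 4;
    case 3: return 8;
    case 4: return 16;
    default: return -1;
    }
}

/* Locate and copy out the tail using the version stored in the head.
 * Returns 0 on success, -1 on a short buffer or unknown version. */
static int parse_tail(const unsigned char *buf, size_t len, struct tail_part *out)
{
    struct hdr_part hdr;
    long gap;
    size_t off;

    assert(buf != NULL && out != NULL);
    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));

    gap = gap_after_hdr(hdr.version);
    if (gap < 0)
        return -1;

    off = sizeof(hdr) + (size_t)gap;
    if (len < off + sizeof(*out))
        return -1;
    memcpy(out, buf + off, sizeof(*out));
    return 0;
}
```

Keeping the offsets in one helper means each additional format version only adds a `case`, while the sub-structures themselves stay shared between versions.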

Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
vmware_vmss_get_cpu_reg() should be called only for active tasks,
to get their registers from the corresponding CPUs.
Otherwise, the standard path of fetching pt_regs from memory
(inactive_task_frame) should be used.
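
As a rough sketch of the rule above (not the patch itself), the decision can be reduced to one check: is the task the one that was running on its CPU when the dump was taken? The `reg_source` enum, the `active_tasks` array, and both function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum reg_source {
    REGS_FROM_CPU,    /* per-CPU state saved in the dump, e.g. via vmware_vmss_get_cpu_reg() */
    REGS_FROM_MEMORY  /* registers rebuilt from the task's stack, e.g. inactive_task_frame */
};

/* Hypothetical model: active_tasks[cpu] holds the address of the task that
 * was running on that CPU at dump time. */
static int task_is_active_on(unsigned long task, int cpu,
                             const unsigned long *active_tasks, int ncpus)
{
    assert(active_tasks != NULL);
    return cpu >= 0 && cpu < ncpus && active_tasks[cpu] == task;
}

/* Only an active task has meaningful register state in the CPU dump;
 * every other task falls back to the in-memory path. */
static enum reg_source choose_reg_source(unsigned long task, int cpu,
                                         const unsigned long *active_tasks,
                                         int ncpus)
{
    return task_is_active_on(task, cpu, active_tasks, ncpus)
               ? REGS_FROM_CPU
               : REGS_FROM_MEMORY;
}
```

An inactive task that last ran on a given CPU would otherwise be handed the registers of whichever task is currently on that CPU, which is why the fallback matters for "gdb bt".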

Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
By doing this, we avoid dropping gdb caches unnecessarily.

Signed-off-by: Alexey Makhalov <alexey.makhalov@broadcom.com>
liutgnu added a commit that referenced this pull request Apr 30, 2024
Stack unwinding applies to kernel addresses only. If a non-kernel
address is encountered, it is usually a user-space address or a
non-address value such as a function call parameter. Stopping the
unwind at the first non-kernel address therefore reduces invalid
unwind results.

Before:
crash> gdb bt
 #0  0xffffffff816a8f65 in context_switch ...
 #1  __schedule () ...
 #2  0xffffffff816a94e9 in schedule ...
 #3  0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
 #4  0xffffffff816a8733 in schedule_hrtimeout_range ...
 #5  0xffffffff8124bb7e in ep_poll ...
 #6  0xffffffff8124d00d in SYSC_epoll_wait ...
 #7  SyS_epoll_wait ...
 #8  <signal handler called>
 #9  0x00007f0449407923 in ?? ()
 #10 0xffff880100000001 in ?? ()
 #11 0xffff880169b3c010 in ?? ()
 #12 0x0000000000000040 in irq_stack_union ()
 #13 0xffff880169b3c058 in ?? ()
 #14 0xffff880169b3c048 in ?? ()
 #15 0xffff880169b3c050 in ?? ()
 #16 0x0000000000000000 in ?? ()

After:
crash> gdb bt
 #0  0xffffffff816a8f65 in context_switch ...
 #1  __schedule () ...
 #2  0xffffffff816a94e9 in schedule () ...
 #3  0xffffffff816a86fd in schedule_hrtimeout_range_clock ...
 #4  0xffffffff816a8733 in schedule_hrtimeout_range ...
 #5  0xffffffff8124bb7e in ep_poll ...
 #6  0xffffffff8124d00d in SYSC_epoll_wait ...
 #7  SyS_epoll_wait ...
 #8  <signal handler called>
 #9  0x00007f0449407923 in ?? ()
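
The cut-off shown in the Before/After output can be sketched as below. This is a simplified model, not the actual change: `is_kernel_addr`, `frames_to_keep`, and the x86_64 boundary constant are assumptions made for illustration, and keeping the first non-kernel frame before stopping is inferred from the "After" listing, where frame #9 is still printed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: on x86_64, kernel virtual addresses live in the upper
 * canonical half. The real check in crash may differ. */
#define KERNEL_ADDR_START 0xffff800000000000ULL

static int is_kernel_addr(uint64_t pc)
{
    return pc >= KERNEL_ADDR_START;
}

/* Given candidate frame addresses in unwind order, return how many frames
 * to keep: everything up to and including the first non-kernel address. */
static size_t frames_to_keep(const uint64_t *pcs, size_t n)
{
    size_t i;

    assert(pcs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!is_kernel_addr(pcs[i]))
            return i + 1;   /* keep this frame, drop whatever follows */
    }
    return n;
}
```

Fed the addresses from the listing (six kernel addresses, then 0x00007f0449407923, then the 0xffff88... values), the helper keeps seven entries, so the trailing stack values no longer appear as frames even though some of them would pass the address check on their own.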

Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@nec.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
liutgnu pushed a commit that referenced this pull request Apr 30, 2024
Currently, gdb passthroughs of 'bt', 'frame', 'up', 'down', and 'info
locals' don't work. This is because gdb does not know the register
values needed to unwind the stack frames.

Every gdb passthrough goes through `gdb_interface`. gdb then expects
`crash_target::fetch_registers` to give it the register values, which
in turn depends on `machdep->get_cpu_reg` to read the register values
for the specific architecture.

                                      ----------------------------
           gdb passthrough (eg. "bt") |                          |
   crash   -------------------------> |                          |
                                      |      gdb_interface       |
                                      |                          |
                                      |                          |
                                      |  ----------------------  |
                 fetch_registers      |  |                    |  |
crash_target<-------------------------+--|        gdb         |  |
            --------------------------+->|                    |  |
              Registers (SP,NIP, etc.)|  |                    |  |
                                      |  |                    |  |
                                      |  ----------------------  |
                                      ----------------------------

Implement `machdep->get_cpu_reg` on PPC64, so that crash provides the
register values to gdb to unwind stack frames properly.
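
The flow in the diagram above can be modelled in plain C as a per-architecture function pointer that a `fetch_registers`-style loop calls for each register. Everything here is a stand-in: the `*_sketch` types, the register numbering, and the fake PPC64 hook are hypothetical, and the hook's parameter list is only assumed to resemble the one in crash.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NREGS 3   /* tiny, made-up register file for the sketch */

/* Assumed hook shape: returns non-zero when the register was provided. */
typedef int (*get_cpu_reg_fn)(int cpu, int regno, const char *name,
                              int size, void *value);

struct machdep_sketch  { get_cpu_reg_fn get_cpu_reg; };
struct regcache_sketch { uint64_t value[NREGS]; int valid[NREGS]; };

/* Stand-in for an architecture hook: supplies two registers on CPU 0. */
static int fake_ppc64_get_cpu_reg(int cpu, int regno, const char *name,
                                  int size, void *value)
{
    uint64_t v;

    (void)name;
    if (cpu != 0 || size != (int)sizeof(v))
        return 0;
    switch (regno) {
    case 0: v = 0xc00000000486f8d8ULL; break;   /* pretend stack pointer */
    case 1: v = 0xc0000000002a53e8ULL; break;   /* pretend NIP */
    default: return 0;                          /* register not available */
    }
    memcpy(value, &v, sizeof(v));
    return 1;
}

/* Ask the architecture hook for every register; mark the ones it cannot
 * supply as unavailable. Returns the number of registers filled in. */
static int fetch_registers(const struct machdep_sketch *md, int cpu,
                           struct regcache_sketch *rc)
{
    int r, filled = 0;

    assert(md != NULL && rc != NULL);
    for (r = 0; r < NREGS; r++) {
        rc->valid[r] = md->get_cpu_reg != NULL &&
                       md->get_cpu_reg(cpu, r, "", (int)sizeof(rc->value[r]),
                                       &rc->value[r]);
        filled += rc->valid[r];
    }
    return filled;
}
```

With no hook installed, every register stays unavailable, which corresponds to the `<unavailable> in ?? ()` backtrace quoted in the commit message for the unpatched case.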

With these changes, on powerpc, the 'bt' command output in gdb mode
will look like this:

    gdb> bt
    #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc00000000486f8d8) at ./arch/powerpc/include/asm/kexec.h:69
    #1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
    #2  0xc000000000168918 in panic (fmt=<optimized out>) at kernel/panic.c:358
    #3  0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
    #4  0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602
    #5  0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
    #6  0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc000000009ed3a80) at fs/proc/inode.c:340
    #7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352
    #8  0xc0000000005b3bbc in vfs_write (file=file@entry=0xc00000009dda7d00, buf=buf@entry=0xebcfc7c6040 <error: Cannot access memory at address 0xebcfc7c6040>, count=count@entry=2, pos=pos@entry=0xc00000000486fda0) at fs/read_write.c:582

instead of earlier output without this patch:

    gdb> bt
    #0  <unavailable> in ?? ()
    Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Also, 'get_dumpfile_regs' has been introduced to get registers from
the multiple supported vmcore formats. Correspondingly, a flag
'BT_NO_PRINT_REGS' has been introduced to tell the helper functions
that fetch registers not to print them on every backtrace call in gdb.

 Note: This feature to support GDB unwinding doesn't support live debugging.

Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@nec.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Improved-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
@liutgnu
Owner

liutgnu commented Apr 30, 2024

Hi Alexey,

I made some small improvements since tao-rebase-v3 and merged this PR in 1. Could you please check it out?

Thanks,
Tao Liu

@YustasSwamp
Contributor Author

Hi Tao,

tao-rebase-v4 passed all tests on my side. No issues found!
Thanks for putting all the pieces together,
--Alexey

@liutgnu
Owner

liutgnu commented May 2, 2024

Hi Alexey,

Glad to hear it works, thanks for your testing!

Thanks,
Tao Liu

2 participants