debugging kernel doesn't start: invalid `Elf' handle #10

Closed
Nudin opened this issue Aug 13, 2019 · 5 comments

Comments

Nudin commented Aug 13, 2019

I followed the installation guide, then started drgn with sudo drgn -k and got:

Traceback (most recent call last):
  File "/usr/bin/drgn", line 11, in <module>
    load_entry_point('drgn==0.0.1', 'console_scripts', 'drgn')()
  File "/usr/lib/python3.7/site-packages/drgn-0.0.1-py3.7-linux-x86_64.egg/drgn/internal/cli.py", line 90, in main
    prog.load_default_debug_info()
_drgn.FileFormatError: libelf error: invalid `Elf' handle

Any idea what I'm doing wrong?

Owner

osandov commented Aug 13, 2019

Could you provide the output of running sudo strace drgn -k? Hopefully I can figure out which file failed from that.

Some other general information that would be appreciated:

  • What distro and distro version is this on?
  • What version of elfutils do you have installed? eu-readelf --version should give you that unless you did anything special to get the elfutils libraries.

Author

Nudin commented Aug 13, 2019

Owner

osandov commented Aug 13, 2019

Ah, it looks like you're trying to debug the official Arch kernel, but Arch doesn't ship debugging information. Unfortunately, the only way to get debugging information on Arch is to build the kernel yourself with CONFIG_DEBUG_INFO=y. However, drgn should definitely be giving you a better error message here. I can reproduce the issue here, so I'll fix that.
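
As a rough illustration of the point above, the sketch below checks whether a kernel config has CONFIG_DEBUG_INFO=y. It is not part of drgn. The helper names are hypothetical, and the two config locations (/proc/config.gz and /boot/config-<release>) are assumptions that depend on how the kernel was built and how the distro packages it. A positive result also does not guarantee that the installed vmlinux and modules still carry their DWARF data, since packaging may strip it.

```python
import gzip
import os
import platform
from typing import Optional


def config_option_enabled(config_text: str, option: str = "CONFIG_DEBUG_INFO") -> bool:
    """Return True if `option` is set to y in the given kernel config text."""
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_FOO is not set", so an exact
        # match on "CONFIG_FOO=y" is enough for this sketch.
        if line.strip() == f"{option}=y":
            return True
    return False


def read_running_kernel_config() -> Optional[str]:
    """Return the running kernel's config text, or None if it is not exposed."""
    proc_path = "/proc/config.gz"  # assumption: only present with CONFIG_IKCONFIG_PROC
    if os.path.exists(proc_path):
        with gzip.open(proc_path, "rt") as f:
            return f.read()
    boot_path = f"/boot/config-{platform.release()}"  # assumption: distro convention
    if os.path.exists(boot_path):
        with open(boot_path) as f:
            return f.read()
    return None
```

Calling config_option_enabled(read_running_kernel_config() or "") returns False both when the option is off and when no config could be found, so a caller that needs to tell those cases apart should check for None first.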

Author

Nudin commented Aug 13, 2019

Makes total sense! Adding a proper error message sounds like the right way! Thanks for the quick help.

Owner

osandov commented Oct 4, 2019

This is fixed by the larger rework in 423d2cd:

$ sudo drgn -k
could not get debugging information for:
/usr/lib/modules/5.3.1-arch1-1-ARCH/kernel/drivers/input/serio/serio.ko.xz (libdwfl error: No DWARF information found)
/usr/lib/modules/5.3.1-arch1-1-ARCH/kernel/drivers/input/serio/i8042.ko.xz (libdwfl error: No DWARF information found)
/usr/lib/modules/5.3.1-arch1-1-ARCH/kernel/drivers/net/net_failover.ko.xz (libdwfl error: No DWARF information found)
/usr/lib/modules/5.3.1-arch1-1-ARCH/kernel/drivers/net/virtio_net.ko.xz (libdwfl error: No DWARF information found)
/usr/lib/modules/5.3.1-arch1-1-ARCH/kernel/drivers/macintosh/mac_hid.ko.xz (libdwfl error: No DWARF information found)
... 40 more

Thanks for reporting this!
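
For readers who want to check a module file by hand, here is a minimal sketch that lists the section names of an ELF file and reports whether a .debug_info (or compressed .zdebug_info) section is present. It is only an approximation of "No DWARF information found" and is not how drgn or libdwfl decide this. All function names are hypothetical, only 64-bit little-endian ELF is handled, and the extended section numbering used by files with very many sections is ignored.

```python
import lzma
import struct

# Layouts of the ELF64 file header and section header (little-endian).
ELF64_HEADER = "<16sHHIQQQIHHHHHH"
ELF64_SECTION_HEADER = "<IIQQQQIIQQ"


def read_module_bytes(path):
    """Read a module file, decompressing it first if it ends in .xz."""
    opener = lzma.open if path.endswith(".xz") else open
    with opener(path, "rb") as f:
        return f.read()


def elf_section_names(data):
    """Return the section names of a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    fields = struct.unpack_from(ELF64_HEADER, data, 0)
    e_shoff, e_shentsize, e_shnum, e_shstrndx = (
        fields[6], fields[11], fields[12], fields[13])

    def section_header(index):
        offset = e_shoff + index * e_shentsize
        return struct.unpack_from(ELF64_SECTION_HEADER, data, offset)

    # The section header string table holds the NUL-terminated section names.
    strtab = section_header(e_shstrndx)
    strings = data[strtab[4]:strtab[4] + strtab[5]]

    names = []
    for i in range(e_shnum):
        start = section_header(i)[0]
        end = strings.find(b"\0", start)
        if end == -1:
            end = len(strings)
        names.append(strings[start:end].decode("ascii", "replace"))
    return names


def has_dwarf_sections(data):
    """Return True if the image has a .debug_info or .zdebug_info section."""
    names = elf_section_names(data)
    return ".debug_info" in names or ".zdebug_info" in names
```

For one of the modules listed in the output above, has_dwarf_sections(read_module_bytes(path)) would be expected to return False on a stock Arch kernel, matching the message drgn now prints.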

osandov closed this as completed Oct 4, 2019
ypl-coffee added a commit to ypl-coffee/drgn that referenced this issue Jan 7, 2022
Simply doing "drgn" segmentation-faults on my machine:

(gdb) bt
 #0  apply_elf_rela_x86_64 (relocating=relocating@entry=0x7fc843650c90, r_offset=0, r_type=2, r_addend=0,
     sym_value=<optimized out>) at ../../libdrgn/arch_x86_64.c:498
 #1  0x00007fc88cbd523b in relocate_elf_section (platform=0x7fc843650c80, shdrnum=59, sh_addrs=0x7fc808000b20,
     symtab_scn=<optimized out>, reloc_scn=0x17614c8, scn=<optimized out>) at ../../libdrgn/debug_info.c:761
 #2  relocate_elf_file (elf=<optimized out>) at ../../libdrgn/debug_info.c:865
 #3  drgn_debug_info_find_sections (module=<optimized out>) at ../../libdrgn/debug_info.c:883
 #4  drgn_debug_info_read_module (load=load@entry=0x7ffea9d70870, dindex_state=0x7ffea9d70810, head=0x163c6b0)
     at ../../libdrgn/debug_info.c:970
 #5  0x00007fc88cbd5474 in drgn_debug_info_update_index._omp_fn.1 () at ../../libdrgn/debug_info.c:1037
 #6  0x00007fc88cb19769 in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
 #7  0x00007fc88cb21f00 in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
 #8  0x00007fc88cb1f7aa in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
 #9  0x00007fc88dfb7fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
 #10 0x00007fc88dafe4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Always returning NULL in libdrgn/debug_info.c:relocate_elf_file() fixes
the issue.  I don't know, maybe it's a bug in drgn's implementation of
ELF relocation, but add this hack for now.

Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
imran-kn added a commit to imran-kn/drgn-work that referenced this issue Mar 18, 2024
For example:

python3 -m drgn -s vmlinux -c vmcore-writer-reader-reader-reader contrib/locks.py rwsem --spinner-callstack ffffffffc03083c0
rwsem: ffffffffc03083c0 has 1 spinners and their call-stack is as follows:

call stack for pid: 239
 #0  __read_once_size (./include/linux/compiler.h:268:2)
 #1  arch_atomic64_read (./arch/x86/include/asm/atomic64_64.h:22:9)
 #2  atomic64_read (./include/asm-generic/atomic-instrumented.h:837:9)
 #3  atomic_long_read (./include/asm-generic/atomic-long.h:28:9)
 #4  rwsem_owner_flags (kernel/locking/rwsem.c:298:24)
 #5  rwsem_spin_on_owner (kernel/locking/rwsem.c:737:9)
 #6  rwsem_optimistic_spin (kernel/locking/rwsem.c:812:17)
 #7  rwsem_down_read_slowpath (kernel/locking/rwsem.c:1018:6)
 #8  __down_read_killable (kernel/locking/rwsem.c:1366:14)
 #9  down_read_killable (kernel/locking/rwsem.c:1532:6)
 #10 0xffffffffc030622c
.................

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>