Drgn reads zeros for memory allocated by vm_map_ram() #217

brenns10 · 2022-10-26T18:22:12Z

I've tested this on Oracle UEK 5 and 7 (4.14 and 5.15 based). I'll try to reproduce this on mainline as well, maybe by hacking it into the vmtest for a quick check. A coworker reported that drgn was reading all zeros for a structure. We determined that the memory address came from the vmalloc subsystem, specifically vm_map_ram(). I went ahead and created a reproducer kernel module to demo the issue.

The module allocates some pages, uses vm_map_ram() to map them, and then writes a pattern of data. When the address is read with crash, the written pattern is visible. But when it is read with drgn, the result is all zeros:

# dmesg after insmod
[  962.825481] drgn_vmalloc_test: 0xffff9e43c2400000

# drgn output
>>> prog.read(0xffff9e43c2400000, 64)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

# crash output
crash> rd 0xffff9e43c2400000 8
ffff9e43c2400000:  0000000100000000 0000000300000002   ................
ffff9e43c2400010:  0000000500000004 0000000700000006   ................
ffff9e43c2400020:  0000000900000008 0000000b0000000a   ................
ffff9e43c2400030:  0000000d0000000c 0000000f0000000e   ................
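
The crash dump above is consistent with the module writing consecutive 32-bit integers 0 through 15 in little-endian order. That is an inference from the output, since the module source isn't shown here. A minimal sketch for checking a buffer against that assumed pattern follows; `expected_pattern` and `matches_pattern` are hypothetical helper names, not part of drgn or crash:

```python
import struct


def expected_pattern(count: int = 16) -> bytes:
    """Consecutive little-endian 32-bit integers 0..count-1 (assumed pattern)."""
    return struct.pack(f"<{count}I", *range(count))


def matches_pattern(buf: bytes) -> bool:
    """Return True if buf is exactly the assumed pattern for its length."""
    if len(buf) % 4:
        return False  # not a whole number of 32-bit words
    return buf == expected_pattern(len(buf) // 4)
```

With these helpers, `matches_pattern(prog.read(0xffff9e43c2400000, 64))` in drgn would return False for the all-zero read shown above.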

@osandov, if you have an idea what the root cause is, without too much work, then by all means let me know. But otherwise I thought it might be a useful way for me to explore the memory reader subsystem and learn a bit more about drgn internals.

osandov · 2022-10-26T21:00:25Z

This is on the live kernel, right? My wild guess is that this is a bug in /proc/kcore, because drgn isn't super fancy about reading memory. Do you get back the correct results if you read with access_remote_vm(prog["init_mm"].address_of_(), address, size) instead of prog.read()? That will translate the virtual address to a physical address from the kernel page table and then read based on the physical address. If that works, it's very likely a bug in /proc/kcore.
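
A rough sketch of that comparison. The helper `compare_reads` is hypothetical; it takes the two readers as callables so that it does not itself depend on a live kernel. The commented usage assumes the drgn CLI (where `prog` is predefined) and imports `access_remote_vm` from `drgn.helpers.linux.mm`:

```python
from typing import Callable

# A reader takes (address, size) and returns the bytes at that address.
Reader = Callable[[int, int], bytes]


def compare_reads(read_direct: Reader, read_translated: Reader,
                  address: int, size: int) -> dict:
    """Read the same range through both paths and summarize the difference."""
    direct = read_direct(address, size)
    translated = read_translated(address, size)
    return {
        "direct": direct,
        "translated": translated,
        "match": direct == translated,
        "direct_all_zero": direct == bytes(len(direct)),
    }


# Usage inside drgn on the live kernel (not run here):
# from drgn.helpers.linux.mm import access_remote_vm
# mm = prog["init_mm"].address_of_()
# result = compare_reads(
#     prog.read,
#     lambda addr, size: access_remote_vm(mm, addr, size),
#     0xffff9e43c2400000, 64)
```

If `match` is False and `direct_all_zero` is True, the page-table path sees data that the plain read does not, which would point at /proc/kcore rather than at drgn.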

brenns10 · 2022-10-26T21:18:27Z

Yeah - live kernel. I tracked it through the /proc/kcore reads with some printf debugging and was coming to that conclusion myself.

The access_remote_vm() worked! So I guess it's time to look into the kernel for the bug.

I'm rather impressed that crash is doing the virt to phys translation manually given that /proc/kcore seems to have an ELF segment for this region of memory. Maybe the original /dev/mem did not, and crash was architected around that?
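
One way to check which ELF segment covers an address is to walk the program headers. This is a minimal sketch assuming a little-endian ELF64 file such as /proc/kcore on x86_64; `find_load_segment` is a hypothetical helper name, and the sketch ignores files with extended program header counts:

```python
import struct
from typing import BinaryIO, Optional, Tuple

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header


def find_load_segment(f: BinaryIO, vaddr: int) -> Optional[Tuple[int, int, int]]:
    """Return (p_vaddr, p_memsz, file offset of vaddr) for the PT_LOAD
    segment covering vaddr, or None if no segment covers it."""
    f.seek(0)
    ehdr = EHDR.unpack(f.read(EHDR.size))
    ident, phoff, phentsize, phnum = ehdr[0], ehdr[5], ehdr[9], ehdr[10]
    # ident[4] == 2 means 64-bit; ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    for i in range(phnum):
        f.seek(phoff + i * phentsize)
        (p_type, _flags, p_offset, p_vaddr,
         _paddr, _filesz, p_memsz, _align) = PHDR.unpack(f.read(PHDR.size))
        if p_type == PT_LOAD and p_vaddr <= vaddr < p_vaddr + p_memsz:
            return p_vaddr, p_memsz, p_offset + (vaddr - p_vaddr)
    return None
```

On a live system this could be called as root with `open("/proc/kcore", "rb")` and the address from dmesg. A non-None result only shows that a segment claims the address, not that reads through it return the right data.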

brenns10 · 2022-10-26T21:31:21Z

The issue must be that vread() doesn't support memory that comes from vm_map_ram(). If the allocation is small, it comes from vb_alloc().

brenns10 · 2022-10-27T18:08:19Z

As an FYI, looks like there will be an upstream fix for this!
https://lore.kernel.org/linux-mm/87ilk6gos2.fsf@oracle.com/T/#u

brenns10 closed this as completed Oct 26, 2022
