[Bug]: kallsyms_lookup_name is called, despite not being exported via ABI since Linux 5.7 (crash on clang+lto Linux kernels like CachyOS) #644

@DeflateAwning

Version

7.2.8

Host OS Type

Linux

Host OS name + version

CachyOS @ Linux kernel v6.18, v6.19, and v7.0.0, compiled with Clang+LTO

Host Architecture

x86

Guest OS Type

all

Guest Architecture

x86

Guest OS name + version

N/A - bug is experienced during host system boot

Component

Host Support

What happened?

With the latest version of CachyOS on Linux kernel v7.0.0 (linux-cachyos, built with Clang and Link-Time Optimization (LTO) enabled), on x86-64-v3+ CPUs with Intel CET/IBT enabled, VirtualBox v7.2.8 prevents the system from booting: it hangs or crashes during the "Load Dynamic Kernel Modules" boot step. A bug report is present in the CachyOS bug tracker: CachyOS/linux-cachyos#818 (screenshot of kernel panic).

VirtualBox commit 29fa63f (@bird2tori) introduced a bug: it looks up kernel symbols using a technique that is no longer supported.

I'm not certain how the pieces connect, but the bug involves some or all of the following:

  • In RTR0DbgKrnlInfoQuerySymbol(...), kallsyms_lookup_name is used to look up symbols directly.
  • kallsyms_lookup_name has not been exported to kernel modules, nor intended for their use, since kernel v5.7 (or thereabouts, as far as I can tell).
  • It seems that rtR0DbgKrnlInfoLnxQuerySymbolKprobe, which contains a "correct" implementation, should be used anywhere that kallsyms_lookup_name is used now (e.g., in RTR0DbgKrnlInfoOpen).
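
To make the version boundary above concrete, here is a minimal userland sketch that decides whether a kernel release string (as printed by uname -r) is at or past the point where kallsyms_lookup_name stopped being exported. The helper name kernel_at_least is hypothetical, and the 5.7 threshold is the estimate given above, not something this sketch verifies. Inside a kernel module the same decision would normally be made at compile time by comparing LINUX_VERSION_CODE against KERNEL_VERSION(5, 7, 0).

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if a kernel release string such as
 * "6.18.2-2-cachyos" is at least major.minor, 0 if it is older, and
 * -1 if the string does not start with "<number>.<number>".
 */
int kernel_at_least(const char *release, int major, int minor)
{
    int rel_major, rel_minor;

    if (sscanf(release, "%d.%d", &rel_major, &rel_minor) != 2)
        return -1;
    if (rel_major != major)
        return rel_major > major;
    return rel_minor >= minor;
}
```

A caller would pass the release field from uname(2) and treat a result of 1 for (5, 7) as "do not expect kallsyms_lookup_name to be linkable from a module".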

Possible Solution 1

I believe the modern way to resolve a symbol address without calling kallsyms_lookup_name directly is as follows (which, it seems, is what rtR0DbgKrnlInfoLnxQuerySymbolKprobe implements):

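#include <linux/kprobes.h>

/* Resolve a kernel symbol's address by briefly registering a kprobe on it. */
static unsigned long lookup_name(const char *name)
{
    struct kprobe kp = { .symbol_name = name };
    unsigned long addr;

    /* register_kprobe() resolves symbol_name and fills in kp.addr. */
    if (register_kprobe(&kp) < 0)
        return 0;   /* symbol not found, or not probeable */
    addr = (unsigned long)kp.addr;
    unregister_kprobe(&kp);   /* the probe was only needed for the lookup */
    return addr;
}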

register_kprobe uses kallsyms_lookup_name internally, but it is a properly exported kernel API. The symbol resolution happens inside the kernel's own call chain, so the module itself makes no indirect call that could trip IBT. This is the most widely used workaround in out-of-tree modules post-5.7.

I think the fix is roughly to route more of the symbol lookups through rtR0DbgKrnlInfoLnxQuerySymbolKprobe.

Possible Solution 2

It seems that most of what vboxdrv needs kallsyms_lookup_name for could be replaced with fprobe (kernel 5.18+) or kprobe + kretprobe, which are the supported interfaces for exactly this kind of dynamic instrumentation from modules.

How can we reproduce this?

  1. Install CachyOS with the default kernel. This may need to be a hardware install rather than a VM, as VMs may not expose the ibt CPU feature. Recommended: use the Limine bootloader so you can easily roll back once the install stops booting.
  2. Run cat /proc/cpuinfo and ensure the ibt flag is present.
  3. Upgrade: sudo pacman -Syu
  4. Reboot.
  5. Set the boot splash screen to "Details" so you can see the error on boot.
  6. Install VirtualBox: sudo pacman -S --needed virtualbox virtualbox-host-dkms
  7. Reboot.
  8. Your computer will no longer boot.
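
The ibt check in step 2 can also be scripted. The sketch below is a hypothetical helper (has_cpu_flag is not part of any existing tool) that tests whether a flag appears as a whole word in a line of /proc/cpuinfo-style text, so that a longer flag name that merely contains "ibt" is not counted as a match.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `flag` appears as a whole
 * whitespace-separated word in `text` (for example one "flags" line
 * read from /proc/cpuinfo), and 0 otherwise.
 */
int has_cpu_flag(const char *text, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = text;

    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must be preceded and followed by a separator. */
        int starts_word = (p == text) || p[-1] == ' ' || p[-1] == '\t';
        char end = p[len];
        int ends_word = end == '\0' || end == ' ' || end == '\t' || end == '\n';

        if (starts_word && ends_word)
            return 1;
        p++;
    }
    return 0;
}
```

A small program could read /proc/cpuinfo line by line with fgets and report success as soon as has_cpu_flag(line, "ibt") returns 1 for a line beginning with "flags".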

If you add ibt=off to the kernel command line, the system boots fine.
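
To confirm whether the workaround is active on a running system, the kernel command line in /proc/cmdline can be inspected. The sketch below is a hypothetical helper (cmdline_get_value is an illustrative name) that extracts the value of a key=value parameter from a command-line string. It assumes parameters are separated by whitespace, does not handle quoted values, and returns only the first occurrence of the key.

```c
#include <string.h>

/*
 * Hypothetical helper: looks for a `key=value` parameter in a kernel
 * command line string and copies the value into `out` (truncating it
 * if needed). Returns 1 if the key was found, 0 otherwise.
 */
int cmdline_get_value(const char *cmdline, const char *key,
                      char *out, size_t out_size)
{
    size_t key_len = strlen(key);
    const char *p = cmdline;

    if (key_len == 0 || out_size == 0)
        return 0;
    while (*p != '\0') {
        /* Skip separators, then measure the next parameter token. */
        p += strspn(p, " \t\n");
        size_t tok_len = strcspn(p, " \t\n");

        if (tok_len > key_len && strncmp(p, key, key_len) == 0 &&
            p[key_len] == '=') {
            size_t val_len = tok_len - key_len - 1;

            if (val_len >= out_size)
                val_len = out_size - 1;   /* truncate to fit */
            memcpy(out, p + key_len + 1, val_len);
            out[val_len] = '\0';
            return 1;
        }
        p += tok_len;
    }
    return 0;
}
```

With the contents of /proc/cmdline in a buffer, cmdline_get_value(buffer, "ibt", value, sizeof value) returning 1 with value equal to "off" indicates the system was booted with the workaround.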

Did you upload all of your necessary log files, screenshots, etc.?

  • Yes, I've uploaded all pertinent files to this issue.
