Add check_syscall_source event #3953

Open
wants to merge 17 commits into
base: main
1 change: 1 addition & 0 deletions .github/workflows/pr.yaml
@@ -72,6 +72,7 @@ env:
DNS_DATA_SOURCE
WRITABLE_DATA_SOURCE
SET_FS_PWD
CHECK_SYSCALL_SOURCE
jobs:
#
# DOC VERIFICATION
49 changes: 49 additions & 0 deletions docs/docs/events/builtin/extra/check_syscall_source.md
@@ -0,0 +1,49 @@
# check_syscall_source

## Intro

check_syscall_source - An event reporting a syscall that was invoked from an unusual code location.

## Description

In most cases, all code running in a process is placed in dedicated code regions (VMAs, or Virtual Memory Areas) that are mapped from executable files that contain the code. Thus, the locations that syscalls are invoked from should be in one of these code regions.
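
As a rough illustration of "which region does this address fall in", here is a minimal userspace sketch of a bounded binary search tree walk. The `struct region` type and `region_containing` function are hypothetical stand-ins, not kernel or Tracee APIs. The depth bound of 25 mirrors the `MAX_VMA_RB_TREE_DEPTH` value used in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a kernel VMA tree node. */
struct region {
    uint64_t start;       /* inclusive start address */
    uint64_t end;         /* exclusive end address */
    struct region *left;  /* regions at lower addresses */
    struct region *right; /* regions at higher addresses */
};

#define MAX_DEPTH 25 /* same bound the PR uses for its RB tree walk */

/* Return the region containing addr, or NULL if none is found
 * within MAX_DEPTH steps. */
static const struct region *region_containing(const struct region *root, uint64_t addr)
{
    const struct region *node = root;

    for (int i = 0; i < MAX_DEPTH && node != NULL; i++) {
        assert(node->start < node->end); /* regions must be non-empty */
        if (addr < node->start)
            node = node->left;
        else if (addr >= node->end)
            node = node->right;
        else
            return node;
    }
    return NULL;
}
```

An address that falls in a gap between regions yields `NULL` here. This differs from the PR's `find_vma`, which follows the kernel convention of returning the first region that ends above the address.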

When a syscall is invoked from an unusual location, this event is triggered. This may happen in the following scenarios:

- Shellcode is executed from the stack, the heap, or an anonymous (non-file-backed) memory region.

- A packed program is executed that is either statically linked or invokes syscalls directly (instead of using libc wrappers).

To reduce overhead, this event relies on an event filter that specifies which syscalls to monitor. Example command line usage:

`tracee --events check_syscall_source.args.syscall=open,openat`

To reduce noise when the detected code makes many syscalls, each unique combination of process, syscall, and VMA containing the invoking code is submitted as an event only once.
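
The once-per-combination behavior can be sketched in plain userspace C. This is only an approximation under stated assumptions: the `source_key` fields follow the key built in this PR's eBPF code, while `seen_set`, `SEEN_CAPACITY`, and `seen_insert_if_absent` are hypothetical names. The real implementation uses an LRU hash map updated with `BPF_NOEXIST`, not a linear array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key layout; field names follow the key built in the PR. */
struct source_key {
    uint32_t syscall;
    uint32_t tgid;
    uint64_t tgid_start_time;
    uint64_t vma_addr;
};

#define SEEN_CAPACITY 64

struct seen_set {
    struct source_key keys[SEEN_CAPACITY];
    size_t count;
};

static bool key_equal(const struct source_key *a, const struct source_key *b)
{
    return a->syscall == b->syscall && a->tgid == b->tgid &&
           a->tgid_start_time == b->tgid_start_time && a->vma_addr == b->vma_addr;
}

/* Returns true if the key was newly recorded (the caller should submit
 * an event), false if the same combination was already seen. */
static bool seen_insert_if_absent(struct seen_set *set, const struct source_key *key)
{
    assert(set != NULL && key != NULL);

    for (size_t i = 0; i < set->count; i++)
        if (key_equal(&set->keys[i], key))
            return false;

    if (set->count == SEEN_CAPACITY)
        set->count = 0; /* crude eviction: forget everything when full */

    set->keys[set->count++] = *key;
    return true;
}
```

Because the process start time is part of the key, a recycled process ID is not mistaken for a process that was already reported.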

## Arguments

* `syscall`:`int`[K] - the syscall which was invoked from an unusual location. The syscall name is parsed if the `parse-arguments` option is specified.
* `ip`:`void *`[K] - the address from which the syscall was invoked (instruction pointer of the instruction following the syscall instruction).
* `is_stack`:`bool`[K] - whether the syscall was invoked from the stack. Mutually exclusive with `is_heap` and `is_anon_vma`.
* `is_heap`:`bool`[K] - whether the syscall was invoked from the heap. Mutually exclusive with `is_stack` and `is_anon_vma`.
* `is_anon_vma`:`bool`[K] - whether the syscall was invoked from an anonymous (non-file-backed) VMA. Mutually exclusive with `is_stack` and `is_heap`.
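
The three flags are mutually exclusive because they are derived from a single classification. The sketch below is a simplified userspace model: `region_info`, `classify`, and `flags_for` are hypothetical names, the range checks follow the stack and heap tests in this PR, and the vDSO exclusion is omitted for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum region_type { REGION_STACK, REGION_HEAP, REGION_ANON, REGION_OTHER };

/* Hypothetical flattened view of a VMA plus the owning process's layout. */
struct region_info {
    uint64_t start, end;     /* bounds of the region */
    uint64_t start_stack;    /* process stack start */
    uint64_t start_brk, brk; /* process heap bounds */
    bool file_backed;        /* false for anonymous mappings */
};

struct source_flags {
    bool is_stack, is_heap, is_anon_vma;
};

/* Checks run in a fixed order, so a region gets exactly one type. */
static enum region_type classify(const struct region_info *r)
{
    assert(r != NULL);

    if (r->start <= r->start_stack && r->start_stack <= r->end)
        return REGION_STACK;
    if (r->start < r->brk && r->start_brk < r->end)
        return REGION_HEAP;
    if (!r->file_backed)
        return REGION_ANON;
    return REGION_OTHER;
}

/* At most one flag is set, since each compares against a single type. */
static struct source_flags flags_for(enum region_type t)
{
    struct source_flags f = {t == REGION_STACK, t == REGION_HEAP, t == REGION_ANON};
    return f;
}
```

Stack and heap regions are usually anonymous too, which is why the anonymous check comes last: `is_anon_vma` is only set for anonymous regions that are neither stack nor heap.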

## Hooks

### sys_enter

#### Type

tracepoint

#### Purpose

Utilizes a tail call from the existing tracepoint on `sys_enter`. The called function analyzes the location from which the syscall was invoked. The analysis occurs only if a policy has selected this syscall as a filter for this event.
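
The dispatch idea can be modeled in userspace with a table of function pointers. This is a loose analogy rather than eBPF code, and all names (`checker_slots`, `select_syscall`, `on_sys_enter`, `struct trace`) are hypothetical. An empty slot stands in for a failed tail call, in which case execution falls through to the regular handler.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 8

/* Counts how often each stage ran, for illustration only. */
struct trace {
    int checked;
    int handled;
};

typedef void (*prog_fn)(int id, struct trace *t);

/* Populated only for syscalls selected by a filter. */
static prog_fn checker_slots[NUM_SLOTS];

static void regular_handler(int id, struct trace *t)
{
    (void) id;
    t->handled++;
}

static void source_checker(int id, struct trace *t)
{
    t->checked++;
    regular_handler(id, t); /* the checker always chains to the regular handler */
}

static void select_syscall(int id)
{
    assert(id >= 0 && id < NUM_SLOTS);
    checker_slots[id] = source_checker;
}

static void on_sys_enter(int id, struct trace *t)
{
    assert(id >= 0 && id < NUM_SLOTS);

    if (checker_slots[id] != NULL) {
        /* A successful eBPF tail call never returns to the caller. */
        checker_slots[id](id, t);
        return;
    }
    regular_handler(id, t); /* empty slot: fall through */
}
```

Syscalls that were not selected incur only the cost of the failed lookup, which is how the filter keeps overhead low.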

## Example Use Case

Detect shellcode execution.

## Issues

Unwanted events may occur in scenarios where legitimate programs run code from unusual locations. This may happen in the case of JITs that write code to anonymous VMAs. Although such code is not expected to invoke syscalls directly (instead relying on some runtime that is mapped from an executable file), exceptions may exist.
3 changes: 2 additions & 1 deletion go.mod
@@ -8,8 +8,9 @@ require (
github.com/IBM/fluent-forward-go v0.2.1
github.com/Masterminds/sprig/v3 v3.2.3
github.com/aquasecurity/libbpfgo v0.7.0-libbpf-1.4
github.com/aquasecurity/libbpfgo/helpers v0.4.5
github.com/aquasecurity/tracee/api v0.0.0-20240531131043-a237ddf7b190
github.com/aquasecurity/tracee/signatures/helpers v0.0.0-20240122160245-67dec940088c
github.com/aquasecurity/tracee/signatures/helpers v0.0.0-20240404084547-37dc481cd60c
github.com/aquasecurity/tracee/types v0.0.0-20240122122429-7f84f526758d
github.com/containerd/containerd v1.7.14
github.com/docker/docker v24.0.9+incompatible
6 changes: 4 additions & 2 deletions go.sum
@@ -25,10 +25,12 @@ github.com/agnivade/levenshtein v1.1.1 h1:QY8M92nrzkmr798gCo3kmMyqXFzdQVpxLlGPRB
github.com/agnivade/levenshtein v1.1.1/go.mod h1:veldBMzWxcCG2ZvUTKD2kJNRdCk5hVbJomOvKkmgYbo=
github.com/aquasecurity/libbpfgo v0.7.0-libbpf-1.4 h1:rQ94U12Xlz2tncE8Rxnw3vpp/9hgUIEu3/Lv0/XQM0Q=
github.com/aquasecurity/libbpfgo v0.7.0-libbpf-1.4/go.mod h1:iI7QCIZ3kXG0MR+FHsDZck6cYs1y1HyZP3sMObBg0sk=
github.com/aquasecurity/libbpfgo/helpers v0.4.5 h1:eCoLclL3yqv4N9jqGL3T/ckrLPms2r13C4V2xtU75yc=
github.com/aquasecurity/libbpfgo/helpers v0.4.5/go.mod h1:j/TQLmsZpOIdF3CnJODzYngG4yu1YoDCoRMELxkQSSA=
github.com/aquasecurity/tracee/api v0.0.0-20240531131043-a237ddf7b190 h1:NJ69oeaA2kcRxrt7YFHVrUHujXCtiee8Re69npcbmHk=
github.com/aquasecurity/tracee/api v0.0.0-20240531131043-a237ddf7b190/go.mod h1:jXLAr/iFkfaNTuNcdbx2blngdMD/qaAfxQe9rCL9jwk=
github.com/aquasecurity/tracee/signatures/helpers v0.0.0-20240122160245-67dec940088c h1:Gms5lUHPIq+OpI5HjcZ+l0NZHhSwBd/47nyUZY89c+M=
github.com/aquasecurity/tracee/signatures/helpers v0.0.0-20240122160245-67dec940088c/go.mod h1:SSh6X96P8pT/9B6eBl6ptBo8QnaSCNCZHMOZ1iXyPUw=
github.com/aquasecurity/tracee/signatures/helpers v0.0.0-20240404084547-37dc481cd60c h1:bmv9GWkpNlqVMDR8z6GuntbQvr5U7o76bWKZ+RNXK+I=
github.com/aquasecurity/tracee/signatures/helpers v0.0.0-20240404084547-37dc481cd60c/go.mod h1:SSh6X96P8pT/9B6eBl6ptBo8QnaSCNCZHMOZ1iXyPUw=
github.com/aquasecurity/tracee/types v0.0.0-20240122122429-7f84f526758d h1:6CQjy5G6Cj/VKm8RP1uZnBZxDgfyGo15HfWFnYrkGro=
github.com/aquasecurity/tracee/types v0.0.0-20240122122429-7f84f526758d/go.mod h1:J0f9nzJWrFmFgMoK0s4Yirfh82vfKMatXytd1YdfU2I=
github.com/arbovm/levenshtein v0.0.0-20160628152529-48b4e1c0c4d0 h1:jfIu9sQUG6Ig+0+Ap1h4unLjW6YQJpKZVmUzxsD4E/Q=
4 changes: 2 additions & 2 deletions pkg/ebpf/c/common/common.h
@@ -33,7 +33,7 @@ statfunc const char *get_device_name(struct device *dev)
#define has_prefix(p, s, n) \
({ \
int rc = 0; \
char *pre = p, *str = s; \
const char *pre = p, *str = s; \
_Pragma("unroll") for (int z = 0; z < n; pre++, str++, z++) \
{ \
if (!*pre) { \
@@ -49,7 +49,7 @@ statfunc const char *get_device_name(struct device *dev)

#else

static __inline int has_prefix(char *prefix, char *str, int n)
static __inline int has_prefix(const char *prefix, const char *str, int n)
{
int i;
#pragma unroll
3 changes: 2 additions & 1 deletion pkg/ebpf/c/common/kconfig.h
@@ -9,7 +9,8 @@

enum kconfig_key_e
{
ARCH_HAS_SYSCALL_WRAPPER = 1000U
ARCH_HAS_SYSCALL_WRAPPER = 1000U,
MMU = 1001U
};

// PROTOTYPES
117 changes: 117 additions & 0 deletions pkg/ebpf/c/common/memory.h
@@ -4,6 +4,7 @@
#include <vmlinux.h>

#include <common/common.h>
#include <common/kconfig.h>

// PROTOTYPES

@@ -13,6 +14,12 @@ statfunc unsigned long get_arg_end_from_mm(struct mm_struct *);
statfunc unsigned long get_env_start_from_mm(struct mm_struct *);
statfunc unsigned long get_env_end_from_mm(struct mm_struct *);
statfunc unsigned long get_vma_flags(struct vm_area_struct *);
statfunc unsigned long get_vma_start(struct vm_area_struct *);
statfunc struct vm_area_struct *find_vma(struct task_struct *task, u64 addr);
statfunc bool vma_is_stack(struct vm_area_struct *vma);
statfunc bool vma_is_heap(struct vm_area_struct *vma);
statfunc bool vma_is_anon(struct vm_area_struct *vma);
statfunc bool vma_is_vdso(struct vm_area_struct *vma);

// FUNCTIONS

@@ -51,4 +58,114 @@ statfunc struct mount *real_mount(struct vfsmount *mnt)
return container_of(mnt, struct mount, mnt);
}

statfunc unsigned long get_vma_start(struct vm_area_struct *vma)
{
return BPF_CORE_READ(vma, vm_start);
}

/**
* A busy process can have somewhere in the ballpark of 1000 VMAs.
* In an ideally balanced tree, this means that the max depth is ~10.
* A poorly balanced tree can have a leaf node that is up to twice as deep
* as another leaf node, which in the worst case scenario places its depth
* at 2*10 = 20.
* To be extra safe and accommodate VMA counts higher than 1000,
* we define the max traversal depth as 25.
*/
#define MAX_VMA_RB_TREE_DEPTH 25

// Given a task, find the first VMA that ends above the given address.
// Note: as with the kernel's find_vma, the returned VMA may start above the address.
statfunc struct vm_area_struct *find_vma(struct task_struct *task, u64 addr)
{
/**
* TODO: from kernel version 6.1, the data structure with which VMAs
* are managed changed from an RB tree to a maple tree.
* We currently don't support finding VMAs on such systems.
*/
struct mm_struct *mm = BPF_CORE_READ(task, mm);
if (!bpf_core_field_exists(mm->mm_rb))
return NULL;

// TODO: we don't support NOMMU systems yet (looking up VMAs on them requires walking the VMA
// linked list)
if (!get_kconfig(MMU)) {
return NULL;
}

struct vm_area_struct *vma = NULL;
struct rb_node *rb_node = BPF_CORE_READ(mm, mm_rb.rb_node);

#pragma unroll
for (int i = 0; i < MAX_VMA_RB_TREE_DEPTH; i++) {
barrier(); // without this, the compiler refuses to unroll the loop

if (rb_node == NULL)
break;

struct vm_area_struct *tmp = container_of(rb_node, struct vm_area_struct, vm_rb);
unsigned long vm_start = BPF_CORE_READ(tmp, vm_start);
unsigned long vm_end = BPF_CORE_READ(tmp, vm_end);

if (vm_end > addr) {
vma = tmp;
if (vm_start <= addr)
break;
rb_node = BPF_CORE_READ(rb_node, rb_left);
} else
rb_node = BPF_CORE_READ(rb_node, rb_right);
}

return vma;
}

statfunc bool vma_is_stack(struct vm_area_struct *vma)
{
struct mm_struct *vm_mm = BPF_CORE_READ(vma, vm_mm);
if (vm_mm == NULL)
return false;

u64 vm_start = BPF_CORE_READ(vma, vm_start);
u64 vm_end = BPF_CORE_READ(vma, vm_end);
u64 start_stack = BPF_CORE_READ(vm_mm, start_stack);

// logic taken from include/linux/mm.h (vma_is_initial_stack)
if (vm_start <= start_stack && start_stack <= vm_end)
return true;

return false;
}

statfunc bool vma_is_heap(struct vm_area_struct *vma)
{
struct mm_struct *vm_mm = BPF_CORE_READ(vma, vm_mm);
if (vm_mm == NULL)
return false;

u64 vm_start = BPF_CORE_READ(vma, vm_start);
u64 vm_end = BPF_CORE_READ(vma, vm_end);
u64 start_brk = BPF_CORE_READ(vm_mm, start_brk);
u64 brk = BPF_CORE_READ(vm_mm, brk);

// logic taken from include/linux/mm.h (vma_is_initial_heap)
if (vm_start < brk && start_brk < vm_end)
return true;

return false;
}

statfunc bool vma_is_anon(struct vm_area_struct *vma)
{
return BPF_CORE_READ(vma, vm_file) == NULL;
}

statfunc bool vma_is_vdso(struct vm_area_struct *vma)
{
struct vm_special_mapping *special_mapping =
(struct vm_special_mapping *) BPF_CORE_READ(vma, vm_private_data);
if (special_mapping == NULL)
return false;

return has_prefix("[vdso]", BPF_CORE_READ(special_mapping, name), 6) == 0;
}

#endif
16 changes: 16 additions & 0 deletions pkg/ebpf/c/maps.h
@@ -246,6 +246,22 @@ struct sys_exit_init_tail {

typedef struct sys_exit_init_tail sys_exit_init_tail_t;

// store program for performing syscall checking logic
struct check_syscall_source_tail {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, MAX_EVENT_ID);
__type(key, u32);
__type(value, u32);
} check_syscall_source_tail SEC(".maps");

// store syscalls with abnormal source per VMA per process
struct {
__uint(type, BPF_MAP_TYPE_LRU_HASH);
__uint(max_entries, 4096);
__type(key, syscall_source_key_t);
__type(value, bool);
} syscall_source_map SEC(".maps");

// store stack traces
#define MAX_STACK_ADDRESSES 1024 // max amount of diff stack trace addrs to buffer

101 changes: 101 additions & 0 deletions pkg/ebpf/c/tracee.bpf.c
@@ -56,6 +56,11 @@ int tracepoint__raw_syscalls__sys_enter(struct bpf_raw_tracepoint_args *ctx)

id = *id_64;
}

// Call syscall checker if registered for this syscall.
// If so, it will make sure the following tail is called.
bpf_tail_call(ctx, &check_syscall_source_tail, id);

bpf_tail_call(ctx, &sys_enter_init_tail, id);
return 0;
}
@@ -5096,6 +5101,102 @@ int BPF_KPROBE(trace_set_fs_pwd)
return events_perf_submit(&p, 0);
}

enum vma_type
{
VMA_STACK,
VMA_HEAP,
VMA_ANON,
VMA_OTHER
};

statfunc enum vma_type get_vma_type(struct vm_area_struct *vma)
{
if (vma_is_stack(vma))
return VMA_STACK;

if (vma_is_heap(vma))
return VMA_HEAP;

if (vma_is_anon(vma) && !vma_is_vdso(vma)) {
return VMA_ANON;
}

return VMA_OTHER;
}

SEC("raw_tracepoint/check_syscall_source")
int check_syscall_source(struct bpf_raw_tracepoint_args *ctx)
{
// Get syscall ID.
// NOTE: this must happen first before any logic that may fail,
// because we must know the syscall ID for the tail call we preceded.
struct task_struct *task = (struct task_struct *) bpf_get_current_task();
u32 id = ctx->args[1];
if (is_compat(task)) {
// Translate 32bit syscalls to 64bit syscalls
u32 *id_64 = bpf_map_lookup_elem(&sys_32_to_64_map, &id);
if (id_64 == 0)
return 0;
id = *id_64;
}

program_data_t p = {};
if (!init_program_data(&p, ctx, CHECK_SYSCALL_SOURCE))
goto out;

if (!evaluate_scope_filters(&p))
goto out;

// Get instruction pointer
struct pt_regs *regs = (struct pt_regs *) ctx->args[0];
#if defined(bpf_target_x86)
u64 ip = BPF_CORE_READ(regs, ip);
#elif defined(bpf_target_arm64)
u64 ip = BPF_CORE_READ(regs, pc);
#endif

// Find VMA which contains the instruction pointer
struct vm_area_struct *vma = find_vma(task, ip);
if (vma == NULL)
goto out;

// Get VMA type and make sure it's abnormal (stack/heap/anonymous VMA)
enum vma_type vma_type = get_vma_type(vma);
if (vma_type == VMA_OTHER)
goto out;

// Build a key that identifies the combination of syscall,
// source VMA and process so we don't submit it multiple times
syscall_source_key_t key = {.syscall = id,
.tgid = get_task_ns_tgid(task),
.tgid_start_time = get_task_start_time(get_leader_task(task)),
.vma_addr = get_vma_start(vma)};
bool val = true;

// Try updating the map with the requirement that this key does not exist yet
if ((int) bpf_map_update_elem(&syscall_source_map, &key, &val, BPF_NOEXIST) == -17 /* EEXIST */)
// This key already exists, no need to submit the same syscall-vma-process combination again
goto out;

bool is_stack = vma_type == VMA_STACK;
bool is_heap = vma_type == VMA_HEAP;
bool is_anon = vma_type == VMA_ANON;

save_to_submit_buf(&p.event->args_buf, &id, sizeof(id), 0);
save_to_submit_buf(&p.event->args_buf, &ip, sizeof(ip), 1);
save_to_submit_buf(&p.event->args_buf, &is_stack, sizeof(is_stack), 2);
save_to_submit_buf(&p.event->args_buf, &is_heap, sizeof(is_heap), 3);
save_to_submit_buf(&p.event->args_buf, &is_anon, sizeof(is_anon), 4);

events_perf_submit(&p, 0);

out:
// Call sys_enter_init_tail which we preceded
bpf_tail_call(ctx, &sys_enter_init_tail, id);

return 0;
}

// clang-format off

// Network Packets (works from ~5.2 and beyond)