
# Cloud-Native Monitoring with eBPF

gergely.szabo@origoss.com

# About Myself
* more than 15 years in the industry
* research, development, system architecture, etc.
* currently at Origoss Solutions
  * Cloud Native
  * Kubernetes
  * Prometheus

# Agenda
* BPF
* Linux Kernel Tracing
* eBPF
* Prometheus Exporter

# BPF

# Packet Filtering Problem
  
![network filtering](filter.svg "Network filtering")

# Filtering Requirements

* Efficient
* Flexible filter rules
* Safe

# Where to Filter?

![Where to filter?](filter2.svg "Where to Filter?")

# BPF

## Steven McCanne and Van Jacobson:

## The BSD Packet Filter: A New Architecture for User-level Packet Capture, 1992

http://www.tcpdump.org/papers/bpf-usenix93.pdf

# BPF Architecture

![BPF Architecture](bpf_paper_fig1.png "BPF Architecture")

# Capturing without Filtering

In [None]:
%%bash
sudo tcpdump -nc 4

# Simple Filtering Rule

In [None]:
%%bash
sudo tcpdump -nc 4 tcp and port 80

# Complex Rule

Print only the IPv4 HTTP packets to and from port 80 that actually carry data, skipping, for example, SYN, FIN, and ACK-only packets.

In [None]:
%%bash
sudo tcpdump -nc 4 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
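
The arithmetic inside the filter can be read as plain C: total IP length, minus the IP header length, minus the TCP header length. The sketch below is only an illustration. `tcp_payload_len` is a hypothetical helper name, and it assumes `ip` points at the first byte of a well-formed IPv4 header that is followed by a TCP header (there are no bounds checks).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of TCP payload bytes in an IPv4 packet (hypothetical helper).
 * Mirrors the filter expression:
 *   ip[2:2] - ((ip[0]&0xf)<<2) - ((tcp[12]&0xf0)>>2)
 */
int tcp_payload_len(const uint8_t *ip)
{
    assert(ip != NULL && (ip[0] >> 4) == 4);  /* sketch handles IPv4 only */

    /* ip[2:2]: 16-bit total length field, big-endian */
    int total_len = (ip[2] << 8) | ip[3];

    /* (ip[0]&0xf)<<2: IHL counts 32-bit words, so multiply by 4 */
    int ip_hdr_len = (ip[0] & 0x0f) << 2;

    /* tcp[] offsets are relative to the start of the TCP header */
    const uint8_t *tcp = ip + ip_hdr_len;

    /* (tcp[12]&0xf0)>>2: data offset sits in the high nibble, in 32-bit
       words; shifting right by 4 and left by 2 is a single >>2 */
    int tcp_hdr_len = (tcp[12] & 0xf0) >> 2;

    return total_len - ip_hdr_len - tcp_hdr_len;
}
```

The filter keeps a packet when this value is non-zero. An ACK-only segment with a 20-byte IP header and a 20-byte TCP header has a total length of 40, so the result is 0 and the packet is dropped.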

# How Does This Work?

![BPF Architecture](bpf_paper_fig1.png "BPF Architecture")

# BPF VM Instruction Set

![BPF Instructions](bpf_instructions.png "BPF Instructions")

# Simple Filtering Rule

In [None]:
%%bash
tcpdump -d tcp and port 80
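
`tcpdump -d` prints the compiled filter as BPF VM instructions. To give a feel for how such a program is executed, here is a minimal sketch of an interpreter for a tiny subset of classic BPF. It assumes a Linux system, because it borrows `struct sock_filter` and the `BPF_*` macros from `<linux/filter.h>`. The names `run_filter` and `ipv4_tcp_prog` are hypothetical, the program is hand-written (it is not tcpdump output), and it only checks "Ethernet frame carrying IPv4 and TCP".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, BPF_JUMP */

/* Hand-written filter: accept Ethernet frames carrying IPv4 + TCP. */
const struct sock_filter ipv4_tcp_prog[] = {
    BPF_STMT(BPF_LD  | BPF_H   | BPF_ABS, 12),          /* A = EtherType       */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 3),  /* not IPv4 -> reject  */
    BPF_STMT(BPF_LD  | BPF_B   | BPF_ABS, 23),          /* A = IP protocol     */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 6, 0, 1),       /* not TCP -> reject   */
    BPF_STMT(BPF_RET | BPF_K, 0xffff),                  /* accept              */
    BPF_STMT(BPF_RET | BPF_K, 0),                       /* reject              */
};
#define IPV4_TCP_PROG_LEN (sizeof ipv4_tcp_prog / sizeof ipv4_tcp_prog[0])

/* Runs the program on one packet. Returns 0 to drop the packet, otherwise
   the number of bytes to keep. Only the four opcodes used above are handled. */
uint32_t run_filter(const struct sock_filter *prog, size_t n,
                    const uint8_t *pkt, size_t len)
{
    assert(prog != NULL && pkt != NULL);
    uint32_t a = 0;                        /* the accumulator register */

    for (size_t pc = 0; pc < n; ) {
        const struct sock_filter *in = &prog[pc++];
        switch (in->code) {
        case BPF_LD | BPF_H | BPF_ABS:     /* load 16 bits, network order */
            if (len < 2 || in->k > len - 2)
                return 0;                  /* out-of-bounds load drops the packet */
            a = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case BPF_LD | BPF_B | BPF_ABS:     /* load 8 bits */
            if (in->k >= len)
                return 0;
            a = pkt[in->k];
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:    /* forward-only relative jump */
            pc += (a == in->k) ? in->jt : in->jf;
            break;
        case BPF_RET | BPF_K:
            return in->k;
        default:
            return 0;                      /* opcode not covered by this sketch */
        }
    }
    return 0;                              /* fell off the end: drop */
}
```

Jumps only go forward and every load is bounds-checked, which is how such a filter stays safe to run on every packet.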

# Complex Rule

In [None]:
%%bash
tcpdump -d 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'

# Linux Kernel Tracepoints

* A tracepoint placed in code provides a hook to call a function (probe) that you can provide at runtime.
* A tracepoint can be "on" or "off".
  * When a tracepoint is "on", the function you provide is called each time the tracepoint is executed.
* They can be used for tracing and performance accounting.
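
The on/off behavior can be modeled in a few lines of user-space C. This is a simplified sketch and not the kernel's tracepoint API: `tracepoint_model`, `tp_enable`, `tp_disable`, `tp_hit`, `count_probe`, and `requeue_model` are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A probe is a function supplied at runtime. */
typedef void (*probe_fn)(void *ctx, int value);

/* User-space model of one tracepoint: "off" when no probe is attached. */
struct tracepoint_model {
    probe_fn probe;   /* NULL means the tracepoint is off */
    void *ctx;        /* opaque data handed back to the probe */
};

/* Turn the tracepoint on by attaching a probe. */
void tp_enable(struct tracepoint_model *tp, probe_fn probe, void *ctx)
{
    assert(tp != NULL && probe != NULL);
    tp->probe = probe;
    tp->ctx = ctx;
}

/* Turn the tracepoint off again. */
void tp_disable(struct tracepoint_model *tp)
{
    tp->probe = NULL;
    tp->ctx = NULL;
}

/* The hook compiled into the instrumented code: a cheap check when off. */
void tp_hit(const struct tracepoint_model *tp, int value)
{
    if (tp->probe != NULL)
        tp->probe(tp->ctx, value);
}

/* Example probe: counts how often the tracepoint fired. */
void count_probe(void *ctx, int value)
{
    (void)value;
    ++*(unsigned long *)ctx;
}

/* Instrumented function: does its work and passes through the hook. */
int requeue_model(const struct tracepoint_model *tp, int request_id)
{
    tp_hit(tp, request_id);
    return request_id;
}
```

Calling `requeue_model` with the tracepoint off does nothing extra. After `tp_enable(&tp, count_probe, &hits)`, every call increments `hits`, which is the kind of accounting `perf stat` performs on real tracepoints.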


# Adding Tracepoints
```c
void blk_requeue_request(struct request_queue *q, struct request *rq)
{
	blk_delete_timer(rq);
	blk_clear_rq_complete(rq);
	trace_block_rq_requeue(q, rq);   // <- Tracepoint hook

	if (rq->cmd_flags & REQ_QUEUED)
		blk_queue_end_tag(q, rq);

	BUG_ON(blk_queued_rq(rq));

	elv_requeue_request(q, rq);
}
```


# List of Tracepoints

In [None]:
%%bash
perf list tracepoint

# Tracepoints in Action

In [None]:
%%bash
sudo perf stat -a -e kmem:kmalloc sleep 10

# Linux Kernel KProbes

* Dynamically break into any kernel routine and collect debugging and performance information non-disruptively.
  * Some parts of the kernel code cannot be probed.
* Two types of probes: kprobes and kretprobes.
* A kprobe can be inserted on virtually any instruction in the kernel.
* A return probe (kretprobe) fires when a specified function returns.


# List of KProbes

In [None]:
%%bash
sudo cat /sys/kernel/debug/kprobes/list

# Probing a Linux Function

```c
void blk_delete_timer(struct request *req)
{
	list_del_init(&req->timeout_list);
}
```

In [4]:
%%bash
sudo sh -c 'echo p:demo_probe blk_delete_timer >> /sys/kernel/debug/tracing/kprobe_events'

# List of KProbes

In [6]:
%%bash
sudo cat /sys/kernel/debug/kprobes/list

00000000356f0433  k  blk_delete_timer+0x0    [DISABLED][FTRACE]


In [7]:
%%bash
sudo perf list | grep demo

  kprobes:demo_probe                                 [Tracepoint event]


# KProbes in Action

In [None]:
%%bash
sudo perf stat -a -e kprobes:demo_probe sleep 10

# Removing the KProbe

In [9]:
%%bash
sudo sh -c 'echo "-:demo_probe" >> /sys/kernel/debug/tracing/kprobe_events'

In [10]:
%%bash
sudo cat /sys/kernel/debug/kprobes/list

In [11]:
%%bash
sudo perf list | grep demo