bpf: initial pcap exporter for lb #15376

Merged
borkmann merged 19 commits into master from pr/lb-observ
Mar 30, 2021

borkmann
Member

@borkmann borkmann commented Mar 17, 2021

See commit msgs.

Basis for agent + Hubble recorder framework:

@borkmann borkmann added labels Mar 17, 2021:
  - sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  - area/daemon: Impacts operation of the Cilium daemon.
  - release-note/misc: This PR makes changes that have no direct user impact.
  - sig/hubble: Impacts hubble server or relay
  - feature/lb-only: Impacts cilium running in lb-only datapath mode
  - sig/loadbalancing
@borkmann borkmann requested review from gandro and brb March 17, 2021 14:32
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Mar 17, 2021
@borkmann borkmann force-pushed the pr/lb-observ branch 3 times, most recently from ba9e583 to a4ad6d6 Compare March 22, 2021 20:58
Add a per-cpu ktime cache we can use for the packet capturing exporter
facility from the LB.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add a new capture type for pcap exporter to perf RB that we're going to use
for the lb-only mode out of XDP initially.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add both to the XDP DSR load balancer for capture points upon
packet arrival and departure after encap.

Also guard it under ENABLE_CAPTURE so that there's no overhead
when not compiled in, plus so that the agent can probe the boot
ktime helper availability.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
And also add __ prefixed variants that abstract the ktime handling
away from the caller. Right now the capture is unconditional; as a next
step we'll add a classifier where the value (rule_id) of the match is
then passed through the capture helpers.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann borkmann force-pushed the pr/lb-observ branch 8 times, most recently from e6f0de6 to 4da7db7 Compare March 29, 2021 23:27
Add a common interface so that both the v4 and v6 maps can have their
entries dumped from the CLI.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann borkmann force-pushed the pr/lb-observ branch 2 times, most recently from 2d696d5 to ab0f473 Compare March 30, 2021 12:41
@borkmann
Member Author

test-me-please

Member

@brb brb left a comment


Amazing!

@borkmann
Member Author

retest-1.21-4.9

Move the v4/v6 maps under ENABLE_CAPTURE so that no attempt is made to
include them otherwise, given we also don't define their names/sizes from
the agent side.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add `cilium bpf recorder list` command to dump both the v4 and v6 wildcarded
maps that contain the capture n-tuple filters.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
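
To illustrate what a wildcarded n-tuple filter means here, the following is a minimal sketch of the matching semantics only, not the BPF datapath implementation. The `Filter` class, `matches` and `classify` are hypothetical names, and treating port 0 and protocol 0 as "any" is an assumption based on the `0.0.0.0/0:0 ANY` notation printed by the list command. A linear first-match scan is used purely for readability.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    """Hypothetical in-memory form of one capture filter and its rule id."""
    src: str      # source prefix, e.g. "192.168.160.4/32"
    dst: str      # destination prefix; "0.0.0.0/0" matches any address
    sport: int    # 0 is treated as "any port" (assumption)
    dport: int    # 0 is treated as "any port" (assumption)
    proto: int    # 0 is treated as "any protocol" (assumption)
    rule_id: int


def matches(f, saddr, daddr, sport, dport, proto):
    """Return True when the packet 5-tuple falls inside the filter."""
    return (
        ipaddress.ip_address(saddr) in ipaddress.ip_network(f.src)
        and ipaddress.ip_address(daddr) in ipaddress.ip_network(f.dst)
        and f.sport in (0, sport)
        and f.dport in (0, dport)
        and f.proto in (0, proto)
    )


def classify(filters, saddr, daddr, sport, dport, proto):
    """Return the rule id of the first matching filter, or None."""
    for f in filters:
        if matches(f, saddr, daddr, sport, dport, proto):
            return f.rule_id
    return None


# A filter on source 192.168.160.4/32 with everything else wildcarded.
rules = [Filter("192.168.160.4/32", "0.0.0.0/0", 0, 0, 0, 256)]
print(classify(rules, "192.168.160.4", "192.168.160.3", 52128, 8080, 6))
```

The returned rule id is the value that would then be handed to the capture helpers.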
Add the dump of the recorder map to the sysdump for introspection.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add generated cmdref to the documentation.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann
Member Author

test-me-please

@borkmann
Member Author

borkmann commented Mar 30, 2021

retest-runtime (CI provision fail)

@borkmann
Member Author

retest-runtime

Allow for easy debugging of recorder events and thus provide a monitoring
dissector.

Example:

  # ./daemon/cilium-agent --enable-ipv4=true --enable-ipv6=true       \
        --datapath-mode=lb-only --bpf-lb-algorithm=maglev             \
        --bpf-lb-maglev-table-size=2039 --bpf-lb-mode=dsr             \
        --bpf-lb-acceleration=native --devices=enp10s0f0np0           \
        --bpf-lb-dsr-dispatch=ipip --disable-envoy-version-check=true \
        --enable-bpf-clock-probe=true --enable-recorder=true

  # bpftool map update pinned /sys/fs/bpf/tc/globals/cilium_capture4_rules \
        key hex 0 0 0 0 c0 a8 a0 04 0 0 0 0 0 0 20 0 value hex 0 1 0 0 0 0 0 0

  # ./cilium/cilium bpf recorder list
  192.168.160.4/32:0 -> 0.0.0.0/0:0 ANY ID:256 CapLen:0

  # ./cilium/cilium service list
  ID   Frontend             Service Type   Backend
  1    192.168.160.3:8080   ExternalIPs    1 => 192.168.160.4:8080

  (from remote)# curl 192.168.160.3:8080
  [...]

  # ./cilium/cilium monitor
  [...]
  Recorder capture: dir:ingress rule:256 ts:99992076120464 caplen:74 len:74
  Ethernet	{Contents=[..14..] Payload=[..62..] SrcMAC=b8:ce:f6:05:e7:76 DstMAC=b8:ce:f6:05:e7:62 EthernetType=IPv4 Length=0}
  IPv4	{Contents=[..20..] Payload=[..40..] Version=4 IHL=5 TOS=0 Length=60 Id=44495 Flags=DF FragOffset=0 TTL=64 Protocol=TCP Checksum=52115 SrcIP=192.168.160.4 DstIP=192.168.160.3 Options=[] Padding=[]}
  TCP	{Contents=[..40..] Payload=[] SrcPort=52128 DstPort=8080(http-alt) Seq=592744369 Ack=0 DataOffset=10 FIN=false SYN=true RST=false PSH=false ACK=false URG=false ECE=false CWR=false NS=false Window=42340 Checksum=59261 Urgent=0 Options=[..5..] Padding=[]}
  ----
  Recorder capture: dir:egress rule:256 ts:99992076120464 caplen:94 len:94
  Ethernet	{Contents=[..14..] Payload=[..86..] SrcMAC=b8:ce:f6:05:e7:62 DstMAC=b8:ce:f6:05:e7:76 EthernetType=IPv4 Length=0}
  IPv4	{Contents=[..20..] Payload=[..40..] Version=4 IHL=5 TOS=0 Length=60 Id=44495 Flags=DF FragOffset=0 TTL=64 Protocol=TCP Checksum=52115 SrcIP=192.168.160.4 DstIP=192.168.160.3 Options=[] Padding=[]}
  IPv4	{Contents=[..20..] Payload=[..40..] Version=4 IHL=5 TOS=0 Length=60 Id=44495 Flags=DF FragOffset=0 TTL=64 Protocol=TCP Checksum=52115 SrcIP=192.168.160.4 DstIP=192.168.160.3 Options=[] Padding=[]}
  TCP	{Contents=[..40..] Payload=[] SrcPort=52128 DstPort=8080(http-alt) Seq=592744369 Ack=0 DataOffset=10 FIN=false SYN=true RST=false PSH=false ACK=false URG=false ECE=false CWR=false NS=false Window=42340 Checksum=59261 Urgent=0 Options=[..5..] Padding=[]}
  ----
  [...]

Note that our monitor dissector code is currently broken in that it cannot
handle IPIP correctly (hence the same IP header twice above). It can only
cache/dump the IP layer once right now; this is subject to a separate fix at
some point.
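
The raw bytes in the `bpftool map update` line of the example can be related to the `cilium bpf recorder list` output with a small decoder. This is only a sketch: the key layout (daddr, saddr, dport, sport, nexthdr, dmask, smask, flags) and the value layout (rule_id, reserved, cap_len in host byte order) are assumptions inferred from the example itself, not taken from the datapath headers, and the helper names are hypothetical.

```python
import ipaddress
import struct

# Bytes copied from the `bpftool map update` line in the example.
KEY_HEX = "0 0 0 0 c0 a8 a0 04 0 0 0 0 0 0 20 0"
VALUE_HEX = "0 1 0 0 0 0 0 0"

# ASSUMED key layout, inferred from the example output: daddr, saddr
# (4 bytes each), dport, sport (network-order u16), then nexthdr, dmask,
# smask, flags (u8 each).
KEY_FMT = "!4s4sHHBBBB"
# ASSUMED value layout: rule_id (u16), reserved (u16), cap_len (u32), in
# host byte order (little-endian on an x86 machine).
VALUE_FMT = "<HHI"

PROTO_NAMES = {0: "ANY", 6: "TCP", 17: "UDP"}


def parse_hex_bytes(text):
    """Turn bpftool-style space-separated hex bytes into a bytes object."""
    return bytes(int(tok, 16) for tok in text.split())


def describe_rule(key, value):
    """Format one map entry in the style printed by the list command."""
    daddr, saddr, dport, sport, nexthdr, dmask, smask, _flags = struct.unpack(
        KEY_FMT, key
    )
    rule_id, _reserved, cap_len = struct.unpack(VALUE_FMT, value)
    src = f"{ipaddress.IPv4Address(saddr)}/{smask}:{sport}"
    dst = f"{ipaddress.IPv4Address(daddr)}/{dmask}:{dport}"
    proto = PROTO_NAMES.get(nexthdr, str(nexthdr))
    return f"{src} -> {dst} {proto} ID:{rule_id} CapLen:{cap_len}"


line = describe_rule(parse_hex_bytes(KEY_HEX), parse_hex_bytes(VALUE_HEX))
print(line)
```

Under these assumptions the value bytes `0 1` read as a little-endian u16 give rule id 256, which is the `rule:256` seen in the monitor output.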

The future pcap writer code needs to:

  1) Translate the boot time ts into a time of day ts.
  2) Dump into a different pcap file based on rule id.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
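
The two requirements listed for the future pcap writer could look roughly like the sketch below. It is not the agent's or Hubble's implementation: `PerRuleWriter`, the `rule-<id>.pcap` file naming and the helper functions are hypothetical, and it assumes the `ts` field is a CLOCK_BOOTTIME value in nanoseconds. The boot-to-wall-clock offset is sampled once, so later wall clock adjustments are not tracked.

```python
import os
import struct
import time

PCAP_MAGIC_NS = 0xA1B23C4D  # pcap magic for nanosecond-resolution timestamps
LINKTYPE_ETHERNET = 1
NSEC_PER_SEC = 1_000_000_000


def boot_to_wall_offset_ns():
    """Nanoseconds to add to a CLOCK_BOOTTIME value to get Unix time (Linux)."""
    return time.time_ns() - time.clock_gettime_ns(time.CLOCK_BOOTTIME)


def pcap_global_header(snaplen=65535):
    """24-byte pcap file header: magic, version 2.4, zone, sigfigs, snaplen, linktype."""
    return struct.pack(
        "<IHHiIII", PCAP_MAGIC_NS, 2, 4, 0, 0, snaplen, LINKTYPE_ETHERNET
    )


def pcap_record(boot_ts_ns, offset_ns, data, orig_len):
    """16-byte record header with a time-of-day timestamp, followed by the data."""
    sec, nsec = divmod(boot_ts_ns + offset_ns, NSEC_PER_SEC)
    return struct.pack("<IIII", sec, nsec, len(data), orig_len) + data


class PerRuleWriter:
    """Keeps one pcap file per rule id (file naming is hypothetical)."""

    def __init__(self, directory, offset_ns):
        self.directory = directory
        self.offset_ns = offset_ns
        self.files = {}

    def write(self, rule_id, boot_ts_ns, data, orig_len):
        f = self.files.get(rule_id)
        if f is None:
            # First packet for this rule: create the file and its header.
            path = os.path.join(self.directory, f"rule-{rule_id}.pcap")
            f = open(path, "wb")
            f.write(pcap_global_header())
            self.files[rule_id] = f
        f.write(pcap_record(boot_ts_ns, self.offset_ns, data, orig_len))

    def close(self):
        for f in self.files.values():
            f.close()
        self.files.clear()
```

A caller would create `PerRuleWriter(directory, boot_to_wall_offset_ns())` once and pass each recorder event's rule id, timestamp, captured bytes and original length to `write`.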
@borkmann
Member Author

test-me-please

@borkmann
Member Author

retest-runtime

@borkmann
Member Author

Hit #15469 in 4.19: unrelated.

@borkmann
Member Author

Hit #14959 in 4.9: unrelated.

@borkmann
Member Author

Hit #15469 and #15472 in net-next: unrelated.

@borkmann borkmann merged commit 2ff996c into master Mar 30, 2021
1.10.0 automation moved this from In progress to Done Mar 30, 2021
@borkmann borkmann deleted the pr/lb-observ branch March 30, 2021 23:26

5 participants