Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cilium: optimize bpf to use jiffies for ct maps #11434

Merged
merged 5 commits into from May 9, 2020
Merged

Conversation

borkmann
Copy link
Member

@borkmann borkmann commented May 8, 2020

See commit msgs, gives ~ +1.1Mpps on my test machine under xdp benchmark.

@borkmann borkmann added pending-review sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. labels May 8, 2020
@borkmann borkmann requested review from brb, tklauser, joestringer and a team May 8, 2020 19:45
@borkmann borkmann requested review from a team as code owners May 8, 2020 19:45
@borkmann borkmann requested a review from a team May 8, 2020 19:45
@borkmann
Copy link
Member Author

borkmann commented May 8, 2020

test-me-please

@borkmann
Copy link
Member Author

borkmann commented May 8, 2020

test-me-please

@borkmann borkmann requested a review from jrfastab May 8, 2020 19:57
@borkmann borkmann requested a review from a team as a code owner May 8, 2020 20:19
@borkmann
Copy link
Member Author

borkmann commented May 8, 2020

test-me-please

@coveralls
Copy link

coveralls commented May 8, 2020

Coverage Status

Coverage increased (+0.007%) to 37.88% when pulling 691a455 on pr/ktime_ns into 78739a1 on master.

Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oO That's neat! And I didn't think it would work so well. warp is almost always <=2 on my system.

I also checked that the number of retries is high enough (in case of on interrupts). I couldn't make ./cilium-probe-kernel-hz fail even while creating artificial load (though not RT).

bpf/cilium-probe-kernel-hz.c Outdated Show resolved Hide resolved
@borkmann
Copy link
Member Author

borkmann commented May 8, 2020

test-me-please

@borkmann
Copy link
Member Author

borkmann commented May 8, 2020

test-me-please

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! A few small nits inline, address as you see fit.

bpf/cilium-probe-kernel-hz.c Outdated Show resolved Hide resolved
bpf/cilium-probe-kernel-hz.c Show resolved Hide resolved
pkg/bpf/bpf_linux.go Outdated Show resolved Hide resolved
In order to use and work with jiffies64() BPF helper, we need to know
the HZ value that the kernel is operating in. Doing something like a
sysconf(_SC_CLK_TCK) from user space to get the tick frequency won't
work since this is the user space HZ value whereas BPF programs run
in kernel space, so we need the kernel HZ. It is not exposed directly
and we cannot assume that the kernel config is always exposed by
distros, so as an alternative /proc/timer_list can be used. This work
adds a smal tool cilium-probe-kernel-hz which probes and sanity checks
the HZ value and emits a define KERNEL_HZ which can be used by the
BPF prog.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Rework our time-keeping helpers a bit and add new jiffie helpers
for converting seconds back and forth based on the KERNEL_HZ.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Switch CT timeout and report intervals to more lightweight jiffies
instead. Our bpf_ktime_get_sec() can become very expensive and we
only need a low resolution time source anyway in all our code. For
now the most expensive operation is the constant update of the
CT entry timeout which is needed for every packet. Switch it to
jiffies for newer kernels. Precision is scaled down in order to
better fit entry->lifetime.

Under XDP stress testing this gave a performance improvement of
approx +1.1Mpps. More candidates are on todo to convert later as
well (services w/ session affinity, SNAT create timeout).

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Adapt the CT GC accordingly to either work on monotonic clock or jiffies.
Given the timeout of low-res anyway, we can simply read out the current
kernel value of jiffies via /proc/timer_list and use that for comparing
timeouts.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann
Copy link
Member Author

borkmann commented May 8, 2020

test-me-please

For existing deployments, we need to continue to use ktime as source, but
for newly deployed ones via helm, we can opt-into probing for jiffies as
source instead. Therefore implement an agent switch --enable-bpf-clock-probe
to configure it as well as helm support. Also update cilium-agent.md doc.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann
Copy link
Member Author

borkmann commented May 9, 2020

test-me-please

@borkmann
Copy link
Member Author

borkmann commented May 9, 2020

test-with-kernel

1 similar comment
@borkmann
Copy link
Member Author

borkmann commented May 9, 2020

test-with-kernel

@borkmann borkmann merged commit d02e541 into master May 9, 2020
@borkmann borkmann deleted the pr/ktime_ns branch May 9, 2020 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/misc This PR makes changes that have no direct user impact. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants