cilium: disable bind-protection in kube-proxy free probe mode
The probe mode is only expected to run alongside kube-proxy in a hybrid setup.
There was confusion about kube-proxy throwing (harmless) warnings to its log
saying that it could not bind sockets to service ports in the hostns.
This is due to Cilium performing bind protection right out of the bind(2)
syscall with eBPF. To avoid this confusion, defer to kube-proxy to bind
sockets instead. This is less efficient and consumes more resources, but
users who want to avoid the overhead would run kube-proxy free in strict
mode anyway, where Cilium performs the bind protection by default.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
borkmann committed Nov 26, 2020
1 parent 05ac4ac commit 2a3e5d4
Showing 2 changed files with 15 additions and 4 deletions.
12 changes: 9 additions & 3 deletions Documentation/gettingstarted/kubeproxy-free.rst
@@ -131,7 +131,7 @@ Note, in above Helm configuration, the ``kubeProxyReplacement`` has been set to
 underlying Linux kernel support is missing.

 By default, Helm sets ``kubeProxyReplacement=probe``, which automatically
-disable a subset of the features to implement the kube-proxy replacement instead
+disables a subset of the features to implement the kube-proxy replacement instead
 of bailing out if the kernel support is missing. This makes the assumption that
 Cilium's eBPF kube-proxy replacement would co-exist with kube-proxy on the system
 to optimize Kubernetes services. Given we've used kubeadm to explicitly deploy
@@ -1008,15 +1008,21 @@ This section therefore elaborates on the various ``kubeProxyReplacement`` option
 (see :ref:`kubeproxy-free` note), then the Cilium agent will bail out on start-up
 with an error message.

-- ``kubeProxyReplacement=probe``: This option is intended for a hybrid setup,
+- ``kubeProxyReplacement=probe``: This option is only intended for a hybrid setup,
 that is, kube-proxy is running in the Kubernetes cluster where Cilium partially
 replaces and optimizes kube-proxy functionality. Once the Cilium agent is up and
 running, it probes the underlying kernel for the availability of needed eBPF kernel
 features and, if not present, disables a subset of the functionality in eBPF by
 relying on kube-proxy to complement the remaining Kubernetes service handling. The
 Cilium agent will emit an info message into its log in such case. For example, if
 the kernel does not support :ref:`host-services`, then the ClusterIP translation
-for the node's host-namespace is done through kube-proxy's iptables rules.
+for the node's host-namespace is done through kube-proxy's iptables rules. Also,
+the Cilium agent will set ``nodePort.bindProtection`` to ``false`` in this mode in
+order to defer to kube-proxy for performing the bind-protection of the host namespace.
+This is done to avoid having kube-proxy throw (harmless) warnings to its log stating
+that it could not perform bind calls. In the ``strict`` mode this bind protection is
+performed by Cilium in a more efficient manner with the help of eBPF instead of
+allocating and binding actual sockets.

 - ``kubeProxyReplacement=partial``: Similarly to ``probe``, this option is
 intended for a hybrid setup, that is, kube-proxy is running in the Kubernetes cluster
7 changes: 6 additions & 1 deletion daemon/cmd/kube_proxy_replacement.go
@@ -109,7 +109,12 @@ func initKubeProxyReplacementOptions() (strict bool) {
 		log.Fatalf("Invalid value for --%s: %s", option.NodePortAcceleration, option.Config.NodePortAcceleration)
 	}

-	if !option.Config.NodePortBindProtection {
+	if option.Config.KubeProxyReplacement == option.KubeProxyReplacementProbe {
+		// We let kube-proxy do the less efficient bind-protection in
+		// this case to avoid the latter throwing (harmless) warnings
+		// to its log that bind request is rejected.
+		option.Config.NodePortBindProtection = false
+	} else if !option.Config.NodePortBindProtection {
 		log.Warning("NodePort BPF configured without bind(2) protection against service ports")
 	}
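
The branch order in the hunk above can be read as a small decision function. What follows is only a minimal, self-contained sketch of that logic, not the Cilium code itself: the `Mode` type, the `Mode*` constants and `resolveBindProtection` are hypothetical names standing in for `option.Config.KubeProxyReplacement`, the `option.KubeProxyReplacement*` values and the in-place update of `option.Config.NodePortBindProtection`.

```go
package main

import "fmt"

// Mode is a hypothetical stand-in for the kubeProxyReplacement setting.
type Mode string

// Hypothetical constants; the string values follow the option names used
// in the documentation hunk (strict, probe, partial).
const (
	ModeStrict  Mode = "strict"
	ModeProbe   Mode = "probe"
	ModePartial Mode = "partial"
)

// resolveBindProtection mirrors the branch order of the hunk: probe mode
// always defers bind protection to kube-proxy and stays silent; any other
// mode keeps the requested value and would warn if it is disabled.
func resolveBindProtection(mode Mode, requested bool) (effective bool, warn bool) {
	if mode == ModeProbe {
		// kube-proxy binds the sockets itself, so no warning is needed.
		return false, false
	}
	if !requested {
		// Corresponds to the log.Warning branch in the hunk.
		return false, true
	}
	return true, false
}

func main() {
	for _, mode := range []Mode{ModeStrict, ModeProbe, ModePartial} {
		for _, requested := range []bool{true, false} {
			effective, warn := resolveBindProtection(mode, requested)
			fmt.Printf("mode=%-7s requested=%-5t effective=%-5t warn=%t\n",
				mode, requested, effective, warn)
		}
	}
}
```

Under this reading, the only behavioral change is the probe row: bind protection ends up disabled regardless of the requested value, and the warning is suppressed because kube-proxy is expected to hold the sockets.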
