From 6421eccb846146c907cf89bdf400ef168527d7e3 Mon Sep 17 00:00:00 2001 From: Sebastian Wicki Date: Tue, 2 Jun 2020 14:04:04 +0200 Subject: [PATCH] doc: Troubleshooting with Hubble This adds a Hubble section to the Troubleshooting chapter, explaining how to use the CLI for embedded Hubble. Signed-off-by: Sebastian Wicki --- Documentation/troubleshooting.rst | 116 +++++++++++++++++++++++++++++- 1 file changed, 115 insertions(+), 1 deletion(-) diff --git a/Documentation/troubleshooting.rst b/Documentation/troubleshooting.rst index b0d162fd2f3b..cf905e2305e3 100644 --- a/Documentation/troubleshooting.rst +++ b/Documentation/troubleshooting.rst @@ -137,7 +137,117 @@ e.g.: IPv6 address pool: 4/4294967295 allocated Controller Status: 20/20 healthy Proxy Status: OK, ip 10.0.28.238, port-range 10000-20000 - Cluster health: 2/2 reachable (2018-04-11T15:41:01Z) + Hubble: Ok Current/Max Flows: 2542/4096 (62.06%), Flows/s: 164.21 Metrics: Disabled + Cluster health: 2/2 reachable (2018-04-11T15:41:01Z) + +.. _hubble_troubleshooting: + +Observing Flows with Hubble +=========================== + +Hubble is a built-in observability tool which allows you to inspect recent flow +events on all endpoints managed by Cilium. It needs to be enabled via the Helm +value ``global.hubble.enabled=true`` or the ``--enable-hubble`` option on +cilium-agent. + +Observing flows of a specific pod +--------------------------------- + +In order to observe the traffic of a specific pod, you will first have to +:ref:`retrieve the name of the cilium instance managing it`. +The Hubble CLI is part of the Cilium container image and can be accessed via +``kubectl exec``. The following query for example will show all events related +to flows which either originated or terminated in the ``default/tiefighter`` pod +in the last three minutes: + +.. code:: bash + + $ kubectl exec -n kube-system cilium-77lk6 -- hubble observe --since 3m --pod default/tiefighter + Jun 2 11:14:46.041 default/tiefighter:38314 kube-system/coredns-66bff467f8-ktk8c:53 to-endpoint FORWARDED UDP + Jun 2 11:14:46.041 kube-system/coredns-66bff467f8-ktk8c:53 default/tiefighter:38314 to-endpoint FORWARDED UDP + Jun 2 11:14:46.041 default/tiefighter:38314 kube-system/coredns-66bff467f8-ktk8c:53 to-endpoint FORWARDED UDP + Jun 2 11:14:46.042 kube-system/coredns-66bff467f8-ktk8c:53 default/tiefighter:38314 to-endpoint FORWARDED UDP + Jun 2 11:14:46.042 default/tiefighter:57746 default/deathstar-5b7489bc84-9bftc:80 L3-L4 FORWARDED TCP Flags: SYN + Jun 2 11:14:46.042 default/tiefighter:57746 default/deathstar-5b7489bc84-9bftc:80 to-endpoint FORWARDED TCP Flags: SYN + Jun 2 11:14:46.042 default/deathstar-5b7489bc84-9bftc:80 default/tiefighter:57746 to-endpoint FORWARDED TCP Flags: SYN, ACK + Jun 2 11:14:46.042 default/tiefighter:57746 default/deathstar-5b7489bc84-9bftc:80 to-endpoint FORWARDED TCP Flags: ACK + Jun 2 11:14:46.043 default/tiefighter:57746 default/deathstar-5b7489bc84-9bftc:80 to-endpoint FORWARDED TCP Flags: ACK, PSH + Jun 2 11:14:46.043 default/deathstar-5b7489bc84-9bftc:80 default/tiefighter:57746 to-endpoint FORWARDED TCP Flags: ACK, PSH + Jun 2 11:14:46.043 default/tiefighter:57746 default/deathstar-5b7489bc84-9bftc:80 to-endpoint FORWARDED TCP Flags: ACK, FIN + Jun 2 11:14:46.048 default/deathstar-5b7489bc84-9bftc:80 default/tiefighter:57746 to-endpoint FORWARDED TCP Flags: ACK, FIN + Jun 2 11:14:46.048 default/tiefighter:57746 default/deathstar-5b7489bc84-9bftc:80 to-endpoint FORWARDED TCP Flags: ACK + +You may also use ``-o json`` to obtain more detailed information about each +flow event. + +In the following example the first command extracts the numeric security +identities for all dropped flows which originated in the ``default/xwing`` pod +in the last three minutes. The numeric security identity can then be used +together with the Cilium CLI to obtain more information about why flow was +dropped: + +.. code:: bash + + $ kubectl exec -n kube-system cilium-77lk6 -- \ + hubble observe --since 3m --type drop --from-pod default/xwing -o json | \ + jq .destination.identity | sort -u + 788 + + $ kubectl exec -n kube-system cilium-77lk6 -- \ + cilium policy trace --src-k8s-pod default:xwing --dst-identity 788 + ---------------------------------------------------------------- + + Tracing From: [k8s:class=xwing, k8s:io.cilium.k8s.policy.cluster=default, k8s:io.cilium.k8s.policy.serviceaccount=default, k8s:io.kubernetes.pod.namespace=default, k8s:org=alliance] => To: [k8s:class=deathstar, k8s:io.cilium.k8s.policy.cluster=default, k8s:io.cilium.k8s.policy.serviceaccount=default, k8s:io.kubernetes.pod.namespace=default, k8s:org=empire] Ports: [0/ANY] + + Resolving ingress policy for [k8s:class=deathstar k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:org=empire] + * Rule {"matchLabels":{"any:class":"deathstar","any:org":"empire","k8s:io.kubernetes.pod.namespace":"default"}}: selected + Allows from labels {"matchLabels":{"any:org":"empire","k8s:io.kubernetes.pod.namespace":"default"}} + No label match for [k8s:class=xwing k8s:io.cilium.k8s.policy.cluster=default k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:org=alliance] + 1/1 rules selected + Found no allow rule + Ingress verdict: denied + + Final verdict: DENIED + + +Please refer to the :ref:`policy troubleshooting guide` for +more detail about how to troubleshoot policy related drops. + +.. note:: + **Hubble Relay** (beta) allows you to query multiple Hubble instances + simultaneously without having to first manually target a specific node. + +Ensure Hubble is running correctly +---------------------------------- + +To ensure the Hubble client can connect to the Hubble server running inside +Cilium, you may use the ``hubble status`` command: + +.. code:: bash + + $ hubble status + Healthcheck (via unix:///var/run/cilium/hubble.sock): Ok + Max Flows: 4096 + Current Flows: 2542 (62.06%) + +``cilium-agent`` must be running with the ``--enable-hubble`` option in order +for the Hubble server to be enabled. When deploying Cilium with Helm, make sure +to set the ``global.hubble.enabled=true`` value. + +To check if Hubble is enabled in your deployment, you may look for the +following output in ``cilium status``: + +.. code:: bash + + $ cilium status + ... + Hubble: Ok Current/Max Flows: 2542/4096 (62.06%), Flows/s: 164.21 Metrics: Disabled + ... + +.. note:: + Pods need to be managed by Cilium in order to be observable by Hubble. + See how to :ref:`ensure a pod is managed by Cilium` + for more details. Connectivity Problems ===================== @@ -289,6 +399,8 @@ consumed. Policy Troubleshooting ====================== +.. _ensure_managed_pod: + Ensure pod is managed by Cilium ------------------------------- @@ -416,6 +528,8 @@ When running in :ref:`arch_direct_routing` mode: Useful Scripts ============== +.. _retrieve_cilium_pod: + Retrieve Cilium pod managing a particular pod ---------------------------------------------