doc/hubble-internals: update Hubble Relay section to reflect current state #14042

Merged 1 commit on Nov 17, 2020
75 changes: 44 additions & 31 deletions Documentation/hubble.rst
@@ -18,11 +18,11 @@ achieve all of this at large scale.
Hubble's server component is embedded into the Cilium agent in order to achieve
high performance with low-overhead. The gRPC services offered by Hubble server
may be consumed locally via a Unix domain socket or, more typically, through
-Hubble Relay. Hubble-relay is a standalone component which is aware of all
-running Hubble instances and offers full cluster visibility by connecting to
-their respective gRPC APIs. This capability is usually referred to as
-multi-node. Hubble Relay's main goal is to offer a rich API that can be safely
-exposed and consumed by the Hubble UI and CLI.
+Hubble Relay. Hubble Relay is a standalone component which is aware of all
+Hubble instances and offers full cluster visibility by connecting to their
+respective gRPC APIs. This capability is usually referred to as multi-node.
+Hubble Relay's main goal is to offer a rich API that can be safely exposed and
+consumed by the Hubble UI and CLI.

.. note:: This guide does not cover Hubble in standalone mode, which is
          deprecated with the release of Cilium v1.8.
@@ -44,31 +44,33 @@ local Unix domain socket.
The Observer service
^^^^^^^^^^^^^^^^^^^^

-The Observer service is the principal service. It makes two methods available:
-``GetFlows`` and ``ServerStatus``. While the ``ServerStatus`` method is pretty
-straightforward (it provides metrics related to the running server), the
-``GetFlows`` one is far more sophisticated and the more important one.
+The Observer service is the principal service. It provides three RPC endpoints:
+``GetFlows``, ``GetNodes`` and ``ServerStatus``. While the ``ServerStatus`` and
+``GetNodes`` endpoints are pretty straightforward (they provide metrics and
+other information about the running instance(s)), ``GetFlows`` is far more
+sophisticated and the most important one.

-Using ``GetFlows``, callers can get a stream of payloads. Request parameters
-allow callers to specify filters in the form of blacklists and whitelists to
-allow for fine-grained filtering of data.
+Using ``GetFlows``, callers get a stream of payloads. Request parameters allow
+callers to specify filters in the form of blacklists and whitelists for
+fine-grained filtering of data.

In order to answer ``GetFlows`` requests, Hubble stores monitoring events from
-Cilium's event monitor into a ring buffer structure. Monitoring events are
-obtained by registering a new listener to Cilium's monitor. The ring buffer is
-capable of storing a configurable amount of events in memory. Events are
-continuously consumed, overriding older ones once the ring buffer is full.
+Cilium's event monitor into a user-space ring buffer structure. Monitoring
+events are obtained by registering a new listener on the Cilium monitor. The
+ring buffer is capable of storing a configurable number of events in memory.
+Events are continuously consumed, overwriting older ones once the ring buffer
+is full.

.. image:: ./images/hubble_getflows.png

For efficiency, the internal buffer length is a bit mask of ones + 1; the most
significant bit of this bit mask is at the same position as the most
significant bit of the requested capacity ``n``. In other words, the internal
buffer size is always a power
-of 2. As the ring buffer is a hot code path, it has been designed to not employ
-any locking mechanisms and uses atomic operations instead. While this approach
-has performance benefits, it also has the downsides of being a complex
-component and that reading the very last event written to the buffer is not
-possible as it cannot be guaranteed that it has been fully written.
+of 2 with 1 slot reserved for the writer. In effect, from a user perspective,
+the ring buffer capacity is one less than a power of 2. As the ring buffer is a
+hot code path, it has been designed not to employ any locking mechanisms and
+uses atomic operations instead. While this approach has performance benefits,
+it also has the downside of being a complex component.

Due to its complex nature, the ring buffer is typically accessed via a ring
reader that abstracts the complexity of this data structure for reading. The
@@ -81,8 +83,8 @@ The Peer service

The Peer service sends information about Hubble peers in the cluster in a
stream. When the ``Notify`` method is called, it reports information about all
-the peers in the cluster and subsequently sends information about peers that are
-updated, added or removed from the cluster. Thus , it allows the caller to
+the peers in the cluster and subsequently sends information about peers that
+are updated, added or removed from the cluster. Thus, it allows the caller to
keep track of all Hubble instances and query their respective gRPC services.

This service is typically only exposed on a local Unix domain socket and is
@@ -98,11 +100,22 @@ Cilium's datapath node handler interface.
Hubble Relay
------------

-.. note:: At the time of this writing, Hubble Relay component is still
-          work in progress and may undergo major changes. For this reason,
-          internal documentation about Hubble Relay is limited.
-
-Hubble Relay is a component that was introduced in the context of multi-node
-support. It leverages the Peer service to obtain information about Hubble
-instances and consume their gRPC API in order to provide a more rich API that
-covers events from across the entire cluster.
+Hubble Relay is the Hubble component that brings multi-node support. It
+leverages the Peer service to obtain information about Hubble instances and
+consumes their gRPC API in order to provide a richer API that covers events
+from across the entire cluster (or even multiple clusters in a ClusterMesh
+scenario).
+
+Hubble Relay was first introduced as a technology preview with the release of
+Cilium v1.8. It is declared stable with the release of Cilium v1.9.
+
+Hubble Relay implements the Observer service for multi-node. To that end, it
+maintains, via a peer manager, a persistent connection with every Hubble peer
+in the cluster. This component provides callers with the list of peers.
+Callers may report when a peer is unreachable, in which case the peer manager
+will attempt to reconnect.
+
+As Hubble Relay connects to every node in a cluster, the Hubble server
+instances must make their API available (by default on port 4244). By default,
+Hubble server endpoints are secured using mutual TLS (mTLS) when exposed on a
+TCP port in order to limit access to Hubble Relay only.