-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hubble/relay: improve peer connections handling #12556
Commits on Jul 21, 2020
-
hubble/relay: decouple peer syncing logic from the server struct
This is a first step that will allow implementing unit tests for the relay package. A new interface, `peerSyncer`, is defined and implemented by a `syncer` struct. Having the interface will allow creating a mock for testing purposes. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 510b2e7 - Browse repository at this point
Copy the full SHA 510b2e7View commit details -
hubble/relay: move reconnection logic to peer syncer
This provides a better abstraction as the client that calls `List()` is expected to receive a list of peers that are "ready" to connect to and does not have to manage re-connect. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 49194dd - Browse repository at this point
Copy the full SHA 49194ddView commit details -
hubble/relay: add a pool manager to handle peers connections
This commit is an in-depth reimplementation of the peers synchronization/connections handling logic. All logic regarding management of peers and connections to them has been moved to a separate package with a clearly defined interface for the consumer (`pool.PeerManager`). The goal here is threefold: 1. Separation of concerns (by hiding all internal peer handling logic into a dedicated package) 2. Increased testability 3. Improved connections handling The consumer of the `pool` package does not have to care whether a peer is connected or not anymore. If a peer is not connected, it's `Conn` attribute will be `<nil>`. If `Conn` is not `<nil>` but the peer cannot be reached, the consumer can signal it by using the manager's `ReportOffline(name string)` method. The new `pool.Manager` (which implements the `pool.PeerManager` interface) also implements a better mechanism to ensure peers connection are healthy by using a dedicated routine that monitors for offline/reported offline peers and attempts to (re-)connect to them. This improves the situation as with the older implementation, re-connecting to a peer required a new call to the Hubble Relay service (and the peers that were "woken up" were not available for the request). Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for ec924ef - Browse repository at this point
Copy the full SHA ec924efView commit details -
hubble/peer: define a Peer service Client interface and Builder
Such interface are useful to create good abstractions with code that require a Peer service client. They can also easily be mocked which is essential to write good unit tests. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 215bc08 - Browse repository at this point
Copy the full SHA 215bc08View commit details -
hubble/peer: implement ClientBuilder for a local connection
LocalClientBuilder creates a new Client which is suitable when the gRPC connection to the Peer service is local (typically a Unix Domain Socket). This scenario is the most common one and is directly useful to Hubble Relay. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 0bb2c86 - Browse repository at this point
Copy the full SHA 0bb2c86View commit details -
hubble/relay: use peer.ClientBuilder to build Peer service clients
Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 68b45a1 - Browse repository at this point
Copy the full SHA 68b45a1View commit details -
hubble/relay: define and use a gRPC client connection builder
This commits defines an interface for a gRPC client connection builder. When instantiating a new pool.Manager, the caller may specify a grpc client connection builder implementation to be used to create connection to Hubble peers. This allows for fine grained control over the client connection parameters by the caller. In addition, as this is an interface, it can easily be mocked which is useful when writing unit tests. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for a1c87a7 - Browse repository at this point
Copy the full SHA a1c87a7View commit details -
hubble/relay: implement backoff for peer connection attempts
With the rework of peer connections handling, a goroutine periodically checks whether peer connections are healthy or not. This check is necessary as there are scenarios where a connection can be lost without any peer service notification or client request (for instance, a node reboot). This commit improves reconnection attempts by implementing a backoff mechanism so attempts are not too frequent. There is a default backoff mechanism that can be configured by caller when instantiating a new pool.Manager. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 7bf60d3 - Browse repository at this point
Copy the full SHA 7bf60d3View commit details -
hubble/relay: make the connection check interval configurable
Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for cfe2615 - Browse repository at this point
Copy the full SHA cfe2615View commit details -
hubble/relay: report offline peers to the pool manager
Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 1db1402 - Browse repository at this point
Copy the full SHA 1db1402View commit details -
hubble/relay: ignore backoff on direct connection requests
On peer change notifications or call to `pool.Manager.ReportOffline(name string)`, the backoff interval should be ignored to re-attempt a connection. This change typically addresses the following scenario: 1. Node Foo is offline 2. The connectivity checker attempts to reconnect several time; each time the backoff is increased. 3. Node Foo becomes available; a peer change notification is received 4. Oops, no connection attempt is made to Foo because of a large backoff duration Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 0427a13 - Browse repository at this point
Copy the full SHA 0427a13View commit details -
hubble/relay: do not call disconnect before connection attempt
This is not necessary as the connect function closes pre-existing connections. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for c3107e0 - Browse repository at this point
Copy the full SHA c3107e0View commit details -
hubble: move FakePeerNotifyServer to testutils
Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 6e113c0 - Browse repository at this point
Copy the full SHA 6e113c0View commit details -
hubble: move peer interface definitions to hubble/peer/types
This is necessary to avoid import cycles when implementing fakes for the peer interfaces in the `testutils` package. Note that the `types` sub-package is commonly found throughout Cilium's codebase so this change follows the convention. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for ed450b3 - Browse repository at this point
Copy the full SHA ed450b3View commit details -
hubble/testutils: implement fakers for peer/types/{Client,ClientBuilder}
Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for c9a5df5 - Browse repository at this point
Copy the full SHA c9a5df5View commit details -
hubble/testutils: add FakeGRPCClientStream
This will allow mocking gRPC clients which is useful for unit testing. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 431382f - Browse repository at this point
Copy the full SHA 431382fView commit details -
hubble/testutils: implement FakePeerNotifyClient
Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for c34a041 - Browse repository at this point
Copy the full SHA c34a041View commit details -
hubble/relay: implement unit tests for the peer pool manager
These unit tests are not exhaustive although they still provide a decent coverage. However, it provides a good structure to implement more tests in the future such as complex peer connectivity issue scenarios or simply regression tests for bugs that are fixed. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for a9b8845 - Browse repository at this point
Copy the full SHA a9b8845View commit details -
hubble/[peer|relay]: remove Target() from the ClientBuilder interface
This commit removes `Target()` from the peer `ClientBuilder` interface and, instead, adds the target as a parameter to `Client()`. As a consequence, Hubble Relay code is adapted to the change. The peer pool manager now has a new option to provide the address of the peer gRPC service. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for d395dd2 - Browse repository at this point
Copy the full SHA d395dd2View commit details -
hubble/relay: add pool option to set the logger and remove debug option
The debug option was only used to set the log level of the logger. As the caller now has control over the logger to be used, this option is no longer necessary. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 58d9524 - Browse repository at this point
Copy the full SHA 58d9524View commit details -
hubble/relay: wait on goroutine to finish when pool.Stop() is called
This change makes tests more reliable and is cleaner at the same time as once the function returns, the pool manager is really shut down. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 334035b - Browse repository at this point
Copy the full SHA 334035bView commit details -
hubble/relay: assert logger messages in pool manager tests
This commit tests the pool manager logger output for certain tests. This change revealed some flakiness in the tests that have been addressed. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 7c4f4e5 - Browse repository at this point
Copy the full SHA 7c4f4e5View commit details -
hubble/relay: use a RWLock to protect the peers map in the pool Manager
The peers map is expected to be more frequently read than written to, especially with the `List()` method being exposed and used by Hubble Relay for every request it handles. Therefore, a RWLock is more approriate than a regular lock. Suggested-by: André Martins <andre@cilium.io> Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for 864f3f0 - Browse repository at this point
Copy the full SHA 864f3f0View commit details -
hubble/relay: rename struct member from pool to pm to avoid confusion
The name `pool` clashes with the package `pool`. It is not a problem per se but creates a bit of confusion so rename it to `pm`. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Configuration menu - View commit details
-
Copy full SHA for f95e76a - Browse repository at this point
Copy the full SHA f95e76aView commit details