upstream: subset load balancer (#1857)
Fixes #1279, and is the final part of splitting #1735.

Provides the actual load balancer for subsets with basic stats for the number of subsets active, created, removed, selected (used for host selection), and fallbacks (used fallback path).

Signed-off-by: Stephan Zuercher <stephan@turbinelabs.io>
zuercher authored and htuch committed Oct 24, 2017
1 parent 9e07aa3 commit e8a6881
Showing 12 changed files with 1,839 additions and 26 deletions.
16 changes: 16 additions & 0 deletions docs/configuration/cluster_manager/cluster_stats.rst
@@ -175,3 +175,19 @@ the following statistics:
lb_zone_routing_cross_zone, Counter, Zone aware routing mode but have to send cross zone
lb_local_cluster_not_ok, Counter, Local host set is not set or it is panic mode for local cluster
lb_zone_number_differs, Counter, Number of zones in local and upstream cluster different

Load balancer subset statistics
-------------------------------

Statistics for monitoring :ref:`load balancer subset <arch_overview_load_balancer_subsets>`
decisions. Stats are rooted at *cluster.<name>.* and contain the following statistics:

.. csv-table::
  :header: Name, Type, Description
  :widths: 1, 1, 2

  lb_subsets_active, Gauge, Number of currently available subsets.
  lb_subsets_created, Counter, Number of subsets created.
  lb_subsets_removed, Counter, Number of subsets removed due to no hosts.
  lb_subsets_selected, Counter, Number of times any subset was selected for load balancing.
  lb_subsets_fallback, Counter, Number of times the fallback policy was invoked.
152 changes: 152 additions & 0 deletions docs/intro/arch_overview/load_balancing.rst
@@ -128,3 +128,155 @@ with regard to percentage relations in the local zone between originating and up
In this case the local zone of the upstream cluster can get all of the requests from the
local zone of the originating cluster and also have some space to allow traffic from other zones
in the originating cluster (if needed).

.. _arch_overview_load_balancer_subsets:

Load Balancer Subsets
---------------------

Envoy may be configured to divide hosts within an upstream cluster into subsets based on metadata
attached to the hosts. Routes may then specify the metadata that a host must match in order to be
selected by the load balancer, with the option of falling back to a predefined set of hosts,
including any host.

Subsets use the load balancer policy specified by the cluster. The original destination policy may
not be used with subsets because the upstream hosts are not known in advance. Subsets are compatible
with zone aware routing, but be aware that the use of subsets may easily violate the minimum hosts
condition described above.

If subsets are `configured
<https://github.com/envoyproxy/data-plane-api/blob/9897e3f/api/cds.proto#L237>`_ and a route
specifies no metadata or no subset matching the metadata exists, the subset load balancer initiates
its fallback policy. The default policy is ``NO_ENDPOINT``, in which case the request fails as if
the cluster had no hosts. Conversely, the ``ANY_ENDPOINT`` fallback policy load balances across all
hosts in the cluster, without regard to host metadata. Finally, the ``DEFAULT_SUBSET`` policy causes
the fallback to load balance among hosts that match a specific set of metadata.

Subsets must be predefined to allow the subset load balancer to efficiently select the correct
subset of hosts. Each definition is a set of keys, which translates to zero or more
subsets. Conceptually, each host that has a metadata value for all of the keys in a definition is
added to a subset specific to its key-value pairs. If no host has all the keys, no subsets result
from the definition. Multiple definitions may be provided, and a single host may appear in multiple
subsets if it matches multiple definitions.
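
The construction described above can be sketched in C++. This is a minimal illustration and not the code added by this commit: ``Host``, ``Metadata``, ``SubsetKey``, and ``buildSubsets`` are hypothetical names, metadata values are assumed to be plain strings, and the selectors are assumed to be distinct.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified types; real host metadata is a structured protobuf value.
using Metadata = std::map<std::string, std::string>;  // top-level key -> value
using SubsetKey = std::map<std::string, std::string>; // key-value pairs naming one subset

struct Host {
  std::string name;
  Metadata metadata;
};

// Builds every subset implied by the selectors. A host joins a selector's subset only if it
// has a value for all of that selector's keys; the subset is identified by those key-value
// pairs. A selector that no host fully satisfies produces no subsets.
std::map<SubsetKey, std::vector<std::string>>
buildSubsets(const std::vector<Host>& hosts,
             const std::vector<std::vector<std::string>>& selectors) {
  std::map<SubsetKey, std::vector<std::string>> subsets;
  for (const auto& keys : selectors) {
    for (const auto& host : hosts) {
      SubsetKey subset_key;
      bool has_all_keys = true;
      for (const auto& key : keys) {
        const auto it = host.metadata.find(key);
        if (it == host.metadata.end()) {
          has_all_keys = false;
          break;
        }
        subset_key[key] = it->second;
      }
      if (has_all_keys) {
        subsets[subset_key].push_back(host.name);
      }
    }
  }
  return subsets;
}
```

Because each selector is evaluated independently, a host with values for both ``v`` and ``stage`` lands in one subset for the ``{v, stage}`` selector and another for the ``{stage}`` selector.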

During routing, the route's metadata match configuration is used to find a specific subset. If there
is a subset with the exact keys and values specified by the route, the subset is used for load
balancing. Otherwise, the fallback policy is used. The cluster's subset configuration must,
therefore, contain a definition that has the same keys as a given route in order for subset load
balancing to occur.

This feature can only be enabled using the V2 configuration API. Furthermore, host metadata is only
supported when using the EDS discovery type for clusters. Host metadata for subset load balancing
must be placed under the filter name ``"envoy.lb"``. Similarly, route metadata match criteria use
the ``"envoy.lb"`` filter name. Host metadata may be hierarchical (e.g., the value for a top-level
key may be a structured value or list), but the subset load balancer only compares top-level keys
and values. Therefore, when using structured values, a route's match criteria will only match if an
identical structured value appears in the host's metadata.

Examples
^^^^^^^^

We'll use simple metadata where all values are strings. Assume the following hosts are defined and
associated with a cluster:

======  ======================
Host    Metadata
======  ======================
host1   v: 1.0, stage: prod
host2   v: 1.0, stage: prod
host3   v: 1.1, stage: canary
host4   v: 1.2-pre, stage: dev
======  ======================

The cluster may enable subset load balancing like this:

::

  ---
  name: cluster-name
  type: EDS
  eds_cluster_config:
    eds_config:
      path: '.../eds.conf'
  connect_timeout:
    seconds: 10
  lb_policy: LEAST_REQUEST
  lb_subset_config:
    fallback_policy: DEFAULT_SUBSET
    default_subset:
      stage: prod
    subset_selectors:
    - keys:
      - v
      - stage
    - keys:
      - stage

The following table describes some routes and the result of their application to the
cluster. Typically the match criteria would be used with routes matching specific aspects of the
request, such as the path or header information.

======================  =============  ==========================================
Match Criteria          Balances Over  Reason
======================  =============  ==========================================
stage: canary           host3          Subset of hosts selected
v: 1.2-pre, stage: dev  host4          Subset of hosts selected
v: 1.0                  host1, host2   Fallback: No subset selector for "v" alone
other: x                host1, host2   Fallback: No subset selector for "other"
(none)                  host1, host2   Fallback: No subset requested
======================  =============  ==========================================
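
The rows above can be reproduced with a small C++ sketch of the lookup and fallback logic. It is an illustration only, not the ``SubsetLoadBalancer`` from this commit: ``chooseHosts``, ``hostsMatching``, ``SubsetConfig``, and the ``FallbackPolicy`` enum are hypothetical names, values are assumed to be strings, and for brevity hosts are filtered on every lookup instead of being grouped into subsets in advance.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical simplified types: top-level string metadata only.
using Metadata = std::map<std::string, std::string>;

struct Host {
  std::string name;
  Metadata metadata;
};

enum class FallbackPolicy { NoEndpoint, AnyEndpoint, DefaultSubset };

struct SubsetConfig {
  FallbackPolicy fallback_policy;
  Metadata default_subset;                      // consulted only for DefaultSubset
  std::vector<std::set<std::string>> selectors; // each selector is a set of keys
};

// Names of the hosts whose metadata contains every key-value pair in `criteria`.
// Empty criteria match every host.
std::vector<std::string> hostsMatching(const std::vector<Host>& hosts, const Metadata& criteria) {
  std::vector<std::string> names;
  for (const auto& host : hosts) {
    bool matches = true;
    for (const auto& kv : criteria) {
      const auto it = host.metadata.find(kv.first);
      if (it == host.metadata.end() || it->second != kv.second) {
        matches = false;
        break;
      }
    }
    if (matches) {
      names.push_back(host.name);
    }
  }
  return names;
}

// Uses a subset only when a selector has exactly the keys of the match criteria and at
// least one host carries the requested values; otherwise applies the fallback policy.
std::vector<std::string> chooseHosts(const std::vector<Host>& hosts, const SubsetConfig& config,
                                     const Metadata& criteria) {
  std::set<std::string> criteria_keys;
  for (const auto& kv : criteria) {
    criteria_keys.insert(kv.first);
  }
  const bool has_selector =
      !criteria.empty() && std::find(config.selectors.begin(), config.selectors.end(),
                                     criteria_keys) != config.selectors.end();
  if (has_selector) {
    std::vector<std::string> subset = hostsMatching(hosts, criteria);
    if (!subset.empty()) {
      return subset;
    }
  }
  switch (config.fallback_policy) {
  case FallbackPolicy::AnyEndpoint:
    return hostsMatching(hosts, Metadata{});
  case FallbackPolicy::DefaultSubset:
    return hostsMatching(hosts, config.default_subset);
  case FallbackPolicy::NoEndpoint:
    break;
  }
  return {};
}
```

The exact-key comparison is the reason ``v: 1.0`` falls back even though two hosts carry that value: the cluster defines no selector consisting of ``v`` alone.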

Metadata match criteria may also be specified on a route's weighted clusters. Metadata match
criteria from the selected weighted cluster are merged with and override the criteria from the
route:

====================  ===============================  =====================
Route Match Criteria  Weighted Cluster Match Criteria  Final Match Criteria
====================  ===============================  =====================
stage: canary         stage: prod                      stage: prod
v: 1.0                stage: prod                      v: 1.0, stage: prod
v: 1.0, stage: prod   stage: canary                    v: 1.0, stage: canary
v: 1.0, stage: prod   v: 1.1, stage: canary            v: 1.1, stage: canary
(none)                v: 1.0                           v: 1.0
v: 1.0                (none)                           v: 1.0
====================  ===============================  =====================
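
The merge rule in the table above amounts to a key-wise overlay. A minimal C++ sketch, assuming string-valued top-level keys; ``mergeCriteria`` is a hypothetical helper name and not an Envoy API:

```cpp
#include <map>
#include <string>

using Metadata = std::map<std::string, std::string>;

// Starts from the route's criteria and lets the selected weighted cluster's criteria
// override any key that both define; keys present on only one side are kept as-is.
Metadata mergeCriteria(const Metadata& route, const Metadata& weighted_cluster) {
  Metadata merged = route;
  for (const auto& kv : weighted_cluster) {
    merged[kv.first] = kv.second;
  }
  return merged;
}
```

The merged result is then what the subset lookup sees, so a weighted cluster can redirect a route's ``stage: canary`` traffic to ``stage: prod`` hosts, as in the first row.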


Example Host With Metadata
**************************

An EDS ``LbEndpoint`` with host metadata:

::

  ---
  endpoint:
    address:
      socket_address:
        protocol: TCP
        address: 127.0.0.1
        port_value: 8888
  metadata:
    filter_metadata:
      envoy.lb:
        version: '1.0'
        stage: 'prod'


Example Route With Metadata Criteria
************************************

An RDS ``Route`` with metadata match criteria:

::

  ---
  match:
    prefix: /
  route:
    cluster: cluster-name
    metadata_match:
      filter_metadata:
        envoy.lb:
          version: '1.0'
          stage: 'prod'
5 changes: 5 additions & 0 deletions include/envoy/upstream/upstream.h
@@ -184,6 +184,11 @@ class HostSet {
   COUNTER (lb_zone_routing_all_directly) \
   COUNTER (lb_zone_routing_sampled) \
   COUNTER (lb_zone_routing_cross_zone) \
+  GAUGE (lb_subsets_active) \
+  COUNTER (lb_subsets_created) \
+  COUNTER (lb_subsets_removed) \
+  COUNTER (lb_subsets_selected) \
+  COUNTER (lb_subsets_fallback) \
   COUNTER (upstream_cx_total) \
   GAUGE (upstream_cx_active) \
   COUNTER (upstream_cx_http1_total) \
21 changes: 21 additions & 0 deletions source/common/upstream/BUILD
@@ -55,6 +55,7 @@ envoy_cc_library(
         ":load_balancer_lib",
         ":load_stats_reporter_lib",
         ":ring_hash_lb_lib",
+        ":subset_lb_lib",
         "//include/envoy/event:dispatcher_interface",
         "//include/envoy/http:codes_interface",
         "//include/envoy/local_info:local_info_interface",
@@ -258,6 +259,26 @@ envoy_cc_library(
         "//source/common/json:config_schemas_lib",
         "//source/common/json:json_loader_lib",
         "//source/common/protobuf",
+        "//source/common/router:router_lib",
+    ],
+)
+
+envoy_cc_library(
+    name = "subset_lb_lib",
+    srcs = ["subset_lb.cc"],
+    hdrs = ["subset_lb.h"],
+    deps = [
+        ":load_balancer_lib",
+        ":ring_hash_lb_lib",
+        ":upstream_lib",
+        "//include/envoy/runtime:runtime_interface",
+        "//include/envoy/upstream:load_balancer_interface",
+        "//source/common/common:assert_lib",
+        "//source/common/common:logger_lib",
+        "//source/common/config:metadata_lib",
+        "//source/common/config:well_known_names",
+        "//source/common/protobuf",
+        "//source/common/protobuf:utility_lib",
     ],
 )

58 changes: 32 additions & 26 deletions source/common/upstream/cluster_manager_impl.cc
@@ -27,6 +27,7 @@
 #include "common/upstream/load_balancer_impl.h"
 #include "common/upstream/original_dst_cluster.h"
 #include "common/upstream/ring_hash_lb.h"
+#include "common/upstream/subset_lb.h"

 #include "fmt/format.h"

@@ -543,33 +544,38 @@ ClusterManagerImpl::ThreadLocalClusterManagerImpl::ClusterEntry::ClusterEntry(
                          parent.parent_.local_info_, parent.parent_, parent.parent_.runtime_,
                          parent.parent_.random_,
                          Router::ShadowWriterPtr{new Router::ShadowWriterImpl(parent.parent_)}) {
-
-  switch (cluster->lbType()) {
-  case LoadBalancerType::LeastRequest: {
-    lb_.reset(new LeastRequestLoadBalancer(host_set_, parent.local_host_set_, cluster->stats(),
-                                           parent.parent_.runtime_, parent.parent_.random_));
-    break;
-  }
-  case LoadBalancerType::Random: {
-    lb_.reset(new RandomLoadBalancer(host_set_, parent.local_host_set_, cluster->stats(),
-                                     parent.parent_.runtime_, parent.parent_.random_));
-    break;
-  }
-  case LoadBalancerType::RoundRobin: {
-    lb_.reset(new RoundRobinLoadBalancer(host_set_, parent.local_host_set_, cluster->stats(),
-                                         parent.parent_.runtime_, parent.parent_.random_));
-    break;
-  }
-  case LoadBalancerType::RingHash: {
-    lb_.reset(new RingHashLoadBalancer(host_set_, cluster->stats(), parent.parent_.runtime_,
-                                       parent.parent_.random_));
-    break;
-  }
-  case LoadBalancerType::OriginalDst: {
-    lb_.reset(new OriginalDstCluster::LoadBalancer(
-        host_set_, parent.parent_.primary_clusters_.at(cluster->name()).cluster_));
-    break;
-  }
-  }
+  if (cluster->lbSubsetInfo().isEnabled()) {
+    lb_.reset(new SubsetLoadBalancer(cluster->lbType(), host_set_, parent.local_host_set_,
+                                     cluster->stats(), parent.parent_.runtime_,
+                                     parent.parent_.random_, cluster->lbSubsetInfo()));
+  } else {
+    switch (cluster->lbType()) {
+    case LoadBalancerType::LeastRequest: {
+      lb_.reset(new LeastRequestLoadBalancer(host_set_, parent.local_host_set_, cluster->stats(),
+                                             parent.parent_.runtime_, parent.parent_.random_));
+      break;
+    }
+    case LoadBalancerType::Random: {
+      lb_.reset(new RandomLoadBalancer(host_set_, parent.local_host_set_, cluster->stats(),
+                                       parent.parent_.runtime_, parent.parent_.random_));
+      break;
+    }
+    case LoadBalancerType::RoundRobin: {
+      lb_.reset(new RoundRobinLoadBalancer(host_set_, parent.local_host_set_, cluster->stats(),
+                                           parent.parent_.runtime_, parent.parent_.random_));
+      break;
+    }
+    case LoadBalancerType::RingHash: {
+      lb_.reset(new RingHashLoadBalancer(host_set_, cluster->stats(), parent.parent_.runtime_,
+                                         parent.parent_.random_));
+      break;
+    }
+    case LoadBalancerType::OriginalDst: {
+      lb_.reset(new OriginalDstCluster::LoadBalancer(
+          host_set_, parent.parent_.primary_clusters_.at(cluster->name()).cluster_));
+      break;
+    }
+    }
+  }

   host_set_.addMemberUpdateCb([this](const std::vector<HostSharedPtr>&,