Skip to content

Commit

Permalink
Make over provisioning factor configurable (envoyproxy#4003)
Browse files Browse the repository at this point in the history
Description:
Add over-provisioning-factor to EDS proto.
Add a over_provisioning_factor() method for HostSet class.

Risk Level: Medium
Testing: Unit tests.
Docs Changes: inline
envoyproxy#3222

Signed-off-by: Xin Zhuang <stevenzzz@google.com>
  • Loading branch information
stevenzzzz authored and alyssawilk committed Aug 15, 2018
1 parent 6c08fb4 commit 8c189a5
Show file tree
Hide file tree
Showing 26 changed files with 298 additions and 125 deletions.
12 changes: 12 additions & 0 deletions api/envoy/api/v2/eds.proto
Expand Up @@ -12,6 +12,7 @@ import "google/api/annotations.proto";

import "validate/validate.proto";
import "gogoproto/gogo.proto";
import "google/protobuf/wrappers.proto";

option (gogoproto.equal_all) = true;

Expand Down Expand Up @@ -80,6 +81,17 @@ message ClusterLoadAssignment {
// "lb"_drop = 20% // 50% of the remaining 'actual' load, which is 40%.
// actual_outgoing_load = 20% // remaining after applying all categories.
repeated DropOverload drop_overloads = 2;

// Priority levels and localities are considered overprovisioned with this
// factor (in percentage). This means that we don't consider a priority
// level or locality unhealthy until the percentage of healthy hosts
// multiplied by the overprovisioning factor drops below 100.
// With the default value 140(1.4), Envoy doesn't consider a priority level
// or a locality unhealthy until their percentage of healthy hosts drops
// below 72%.
// Read more at :ref:`priority levels <arch_overview_load_balancing_priority_levels>` and
// :ref:`localities <arch_overview_load_balancing_locality_weighted_lb>`.
google.protobuf.UInt32Value overprovisioning_factor = 3 [(validate.rules).uint32.gt = 0];
}

// Load balancing policy settings.
Expand Down
26 changes: 20 additions & 6 deletions docs/root/intro/arch_overview/load_balancing.rst
Expand Up @@ -132,6 +132,17 @@ Please note that fully resolved IP address should be passed in this header. For
routed to a host with IP address 10.195.16.237 at port 8888, the request header value should be set as
``10.195.16.237:8888``.

.. _arch_overview_load_balancing_overprovisioning_factor:

Overprovisioning Factor
-----------------------
Priority levels and localities are considered overprovisioned with
:ref:`this percentage <envoy_api_field_ClusterLoadAssignment.Policy.overprovisioning_factor>`.
Envoy doesn't consider a priority level or locality unhealthy until the
percentage of healthy hosts multiplied by the overprovisioning factor drops
below 100. The default value is 1.4, so a priority level or locality will not be
considered unhealthy until the percentage of healthy endpoints goes below 72%.

.. _arch_overview_load_balancing_panic_threshold:

Panic threshold
Expand Down Expand Up @@ -163,9 +174,10 @@ priority may also be specified. When endpoints at the highest priority level (P=
traffic will land on endpoints in that priority level. As endpoints for the highest priority level
become unhealthy, traffic will begin to trickle to lower priority levels.

Currently, it is assumed that each priority level is over-provisioned by a (hard-coded) factor of
1.4. So if 80% of the endpoints are healthy, the priority level is still considered healthy because
80*1.4 > 100. As the number of healthy endpoints dips below 72%, the health of the priority level
Currently, it is assumed that each priority level is over-provisioned by the
:ref:`overprovisioning factor <arch_overview_load_balancing_overprovisioning_factor>`.
With default factor value 1.4, if 80% of the endpoints are healthy, the priority level is still considered
healthy because 80*1.4 > 100. As the number of healthy endpoints dips below 72%, the health of the priority level
goes below 100. At that point the percent of traffic equivalent to the health of P=0 will go to P=0
and remaining traffic will flow to P=1.

Expand Down Expand Up @@ -303,12 +315,14 @@ When all endpoints are healthy, the locality is picked using a weighted
round-robin schedule, where the locality weight is used for weighting. When some
endpoints in a locality are unhealthy, we adjust the locality weight to reflect
this. As with :ref:`priority levels
<arch_overview_load_balancing_priority_levels>`, we assume an over-provision
factor (currently hardcoded at 1.4), which means we do not perform any weight
<arch_overview_load_balancing_priority_levels>`, we assume an
:ref:`over-provision factor <arch_overview_load_balancing_overprovisioning_factor>`
(default value 1.4), which means we do not perform any weight
adjustment when only a small number of endpoints in a locality are unhealthy.

Assume a simple set-up with 2 localities X and Y, where X has a locality weight
of 1 and Y has a locality weight of 2, L=Y 100% healthy.
of 1 and Y has a locality weight of 2, L=Y 100% healthy,
with default overprovisioning factor 1.4.

+----------------------------+---------------------------+----------------------------+
| L=X healthy endpoints | Percent of traffic to L=X | Percent of traffic to L=Y |
Expand Down
9 changes: 8 additions & 1 deletion include/envoy/upstream/upstream.h
Expand Up @@ -237,17 +237,24 @@ class HostSet {
* @param locality_weights supplies a map from locality to associated weight.
* @param hosts_added supplies the hosts added since the last update.
* @param hosts_removed supplies the hosts removed since the last update.
* @param overprovisioning_factor if presents, overwrites the current overprovisioning_factor.
*/
virtual void updateHosts(HostVectorConstSharedPtr hosts, HostVectorConstSharedPtr healthy_hosts,
HostsPerLocalityConstSharedPtr hosts_per_locality,
HostsPerLocalityConstSharedPtr healthy_hosts_per_locality,
LocalityWeightsConstSharedPtr locality_weights,
const HostVector& hosts_added, const HostVector& hosts_removed) PURE;
const HostVector& hosts_added, const HostVector& hosts_removed,
absl::optional<uint32_t> overprovisioning_factor) PURE;

/**
* @return uint32_t the priority of this host set.
*/
virtual uint32_t priority() const PURE;

/**
* @return uint32_t the overprovisioning factor of this host set.
*/
virtual uint32_t overprovisioning_factor() const PURE;
};

typedef std::unique_ptr<HostSet> HostSetPtr;
Expand Down
2 changes: 1 addition & 1 deletion source/common/upstream/cluster_manager_impl.cc
Expand Up @@ -958,7 +958,7 @@ void ClusterManagerImpl::ThreadLocalClusterManagerImpl::updateClusterMembership(
cluster_entry->priority_set_.getOrCreateHostSet(priority).updateHosts(
std::move(hosts), std::move(healthy_hosts), std::move(hosts_per_locality),
std::move(healthy_hosts_per_locality), std::move(locality_weights), hosts_added,
hosts_removed);
hosts_removed, absl::nullopt);

// If an LB is thread aware, create a new worker local LB on membership changes.
if (cluster_entry->lb_factory_ != nullptr) {
Expand Down
29 changes: 18 additions & 11 deletions source/common/upstream/eds.cc
Expand Up @@ -13,6 +13,7 @@
#include "common/network/resolver_impl.h"
#include "common/network/utility.h"
#include "common/protobuf/utility.h"
#include "common/upstream/load_balancer_impl.h"
#include "common/upstream/sds_subscription.h"

namespace Envoy {
Expand Down Expand Up @@ -69,6 +70,7 @@ void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources, const std::
PriorityStateManager priority_state_manager(*this, local_info_);
for (const auto& locality_lb_endpoint : cluster_load_assignment.endpoints()) {
const uint32_t priority = locality_lb_endpoint.priority();

if (priority > 0 && !cluster_name_.empty() && cluster_name_ == cm_.localClusterName()) {
throw EnvoyException(
fmt::format("Unexpected non-zero priority for local cluster '{}'.", cluster_name_));
Expand All @@ -85,6 +87,9 @@ void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources, const std::
// Track whether we rebuilt any LB structures.
bool cluster_rebuilt = false;

const uint32_t overprovisioning_factor = PROTOBUF_GET_WRAPPED_OR_DEFAULT(
cluster_load_assignment.policy(), overprovisioning_factor, kDefaultOverProvisioningFactor);

// Loop over existing priorities not present in the config. This will empty out any priorities
// the config update did not refer to
auto& priority_state = priority_state_manager.priorityState();
Expand All @@ -93,9 +98,9 @@ void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources, const std::
if (locality_weights_map_.size() <= i) {
locality_weights_map_.resize(i + 1);
}
cluster_rebuilt |=
updateHostsPerLocality(i, *priority_state[i].first, locality_weights_map_[i],
priority_state[i].second, priority_state_manager, updated_hosts);
cluster_rebuilt |= updateHostsPerLocality(
i, overprovisioning_factor, *priority_state[i].first, locality_weights_map_[i],
priority_state[i].second, priority_state_manager, updated_hosts);
}
}

Expand All @@ -109,8 +114,8 @@ void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources, const std::
locality_weights_map_.resize(i + 1);
}
cluster_rebuilt |=
updateHostsPerLocality(i, empty_hosts, locality_weights_map_[i], empty_locality_map,
priority_state_manager, updated_hosts);
updateHostsPerLocality(i, overprovisioning_factor, empty_hosts, locality_weights_map_[i],
empty_locality_map, priority_state_manager, updated_hosts);
}

updateHostMap(std::move(updated_hosts));
Expand All @@ -125,10 +130,11 @@ void EdsClusterImpl::onConfigUpdate(const ResourceVector& resources, const std::
}

bool EdsClusterImpl::updateHostsPerLocality(
const uint32_t priority, const HostVector& new_hosts, LocalityWeightsMap& locality_weights_map,
LocalityWeightsMap& new_locality_weights_map, PriorityStateManager& priority_state_manager,
const uint32_t priority, const uint32_t overprovisioning_factor, const HostVector& new_hosts,
LocalityWeightsMap& locality_weights_map, LocalityWeightsMap& new_locality_weights_map,
PriorityStateManager& priority_state_manager,
std::unordered_map<std::string, HostSharedPtr>& updated_hosts) {
const auto& host_set = priority_set_.getOrCreateHostSet(priority);
const auto& host_set = priority_set_.getOrCreateHostSet(priority, overprovisioning_factor);
HostVectorSharedPtr current_hosts_copy(new HostVector(host_set.hosts()));

HostVector hosts_added;
Expand All @@ -141,16 +147,17 @@ bool EdsClusterImpl::updateHostsPerLocality(
// out of the locality scheduler, we discover their new weights. We don't currently have a shared
// object for locality weights that we can update here, we should add something like this to
// improve performance and scalability of locality weight updates.
if (updateDynamicHostList(new_hosts, *current_hosts_copy, hosts_added, hosts_removed,
if (host_set.overprovisioning_factor() != overprovisioning_factor ||
updateDynamicHostList(new_hosts, *current_hosts_copy, hosts_added, hosts_removed,
updated_hosts) ||
locality_weights_map != new_locality_weights_map) {
locality_weights_map = new_locality_weights_map;
ENVOY_LOG(debug, "EDS hosts or locality weights changed for cluster: {} ({}) priority {}",
info_->name(), host_set.hosts().size(), host_set.priority());

priority_state_manager.updateClusterPrioritySet(priority, std::move(current_hosts_copy),
hosts_added, hosts_removed, absl::nullopt);

hosts_added, hosts_removed, absl::nullopt,
overprovisioning_factor);
return true;
}
return false;
Expand Down
4 changes: 2 additions & 2 deletions source/common/upstream/eds.h
Expand Up @@ -36,8 +36,8 @@ class EdsClusterImpl : public BaseDynamicClusterImpl,
private:
using LocalityWeightsMap =
std::unordered_map<envoy::api::v2::core::Locality, uint32_t, LocalityHash, LocalityEqualTo>;
bool updateHostsPerLocality(const uint32_t priority, const HostVector& new_hosts,
LocalityWeightsMap& locality_weights_map,
bool updateHostsPerLocality(const uint32_t priority, const uint32_t overprovisioning_factor,
const HostVector& new_hosts, LocalityWeightsMap& locality_weights_map,
LocalityWeightsMap& new_locality_weights_map,
PriorityStateManager& priority_state_manager,
std::unordered_map<std::string, HostSharedPtr>& updated_hosts);
Expand Down
2 changes: 1 addition & 1 deletion source/common/upstream/health_discovery_service.cc
Expand Up @@ -248,7 +248,7 @@ void HdsCluster::initialize(std::function<void()> callback) {
auto healthy = createHealthyHostList(*initial_hosts_);

first_host_set.updateHosts(initial_hosts_, healthy, HostsPerLocalityImpl::empty(),
HostsPerLocalityImpl::empty(), {}, *initial_hosts_, {});
HostsPerLocalityImpl::empty(), {}, *initial_hosts_, {}, absl::nullopt);
}

void HdsCluster::setOutlierDetector(const Outlier::DetectorSharedPtr&) {
Expand Down
8 changes: 4 additions & 4 deletions source/common/upstream/load_balancer_impl.cc
Expand Up @@ -58,13 +58,13 @@ void LoadBalancerBase::recalculatePerPriorityState(uint32_t priority) {

// Determine the health of the newly modified priority level.
// Health ranges from 0-100, and is the ratio of healthy hosts to total hosts, modified by the
// somewhat arbitrary overprovision factor of kOverProvisioningFactor.
// Eventually the overprovision factor will likely be made configurable.
// overprovisioning factor.
HostSet& host_set = *priority_set_.hostSetsPerPriority()[priority];
per_priority_health_[priority] = 0;
if (host_set.hosts().size() > 0) {
per_priority_health_[priority] = std::min<uint32_t>(
100, kOverProvisioningFactor * host_set.healthyHosts().size() / host_set.hosts().size());
per_priority_health_[priority] =
std::min<uint32_t>(100, (host_set.overprovisioning_factor() *
host_set.healthyHosts().size() / host_set.hosts().size()));
}

// Now that we've updated health for the changed priority level, we need to caculate percentage
Expand Down
6 changes: 2 additions & 4 deletions source/common/upstream/load_balancer_impl.h
Expand Up @@ -15,10 +15,8 @@
namespace Envoy {
namespace Upstream {

// Priority levels and localities are considered overprovisioned with this factor. This means that
// we don't consider a priority level or locality unhealthy until the percentage of healthy hosts
// multiplied by kOverProvisioningFactor drops below 100.
static constexpr uint32_t kOverProvisioningFactor = 140;
// Priority levels and localities are considered overprovisioned with this factor.
static constexpr uint32_t kDefaultOverProvisioningFactor = 140;

/**
* Base class for all LB implementations.
Expand Down
4 changes: 2 additions & 2 deletions source/common/upstream/original_dst_cluster.cc
Expand Up @@ -150,7 +150,7 @@ void OriginalDstCluster::addHost(HostSharedPtr& host) {
new_hosts->emplace_back(host);
first_host_set.updateHosts(new_hosts, createHealthyHostList(*new_hosts),
HostsPerLocalityImpl::empty(), HostsPerLocalityImpl::empty(), {},
{std::move(host)}, {});
{std::move(host)}, {}, absl::nullopt);
}

void OriginalDstCluster::cleanup() {
Expand All @@ -175,7 +175,7 @@ void OriginalDstCluster::cleanup() {
if (to_be_removed.size() > 0) {
host_set.updateHosts(new_hosts, createHealthyHostList(*new_hosts),
HostsPerLocalityImpl::empty(), HostsPerLocalityImpl::empty(), {}, {},
to_be_removed);
to_be_removed, absl::nullopt);
}

cleanup_timer_->enableTimer(cleanup_interval_ms_);
Expand Down
4 changes: 2 additions & 2 deletions source/common/upstream/ring_hash_lb.cc
@@ -1,6 +1,7 @@
#include "common/upstream/ring_hash_lb.h"

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

Expand Down Expand Up @@ -125,9 +126,8 @@ RingHashLoadBalancer::Ring::Ring(
std::sort(ring_.begin(), ring_.end(), [](const RingEntry& lhs, const RingEntry& rhs) -> bool {
return lhs.hash_ < rhs.hash_;
});

if (ENVOY_LOG_CHECK_LEVEL(trace)) {
for (auto entry : ring_) {
for (const auto& entry : ring_) {
ENVOY_LOG(trace, "ring hash: host={} hash={}", entry.host_->address()->asString(),
entry.hash_);
}
Expand Down
12 changes: 9 additions & 3 deletions source/common/upstream/subset_lb.cc
Expand Up @@ -544,10 +544,16 @@ void SubsetLoadBalancer::HostSubsetImpl::update(const HostVector& hosts_added,
}
}

HostSetImplPtr SubsetLoadBalancer::PrioritySubsetImpl::createHostSet(uint32_t priority) {
HostSetImplPtr SubsetLoadBalancer::PrioritySubsetImpl::createHostSet(
uint32_t priority, absl::optional<uint32_t> overprovisioning_factor) {
// Use original hostset's overprovisioning_factor.
RELEASE_ASSERT(priority < original_priority_set_.hostSetsPerPriority().size(), "");
return HostSetImplPtr{new HostSubsetImpl(*original_priority_set_.hostSetsPerPriority()[priority],
locality_weight_aware_)};

const HostSetPtr& host_set = original_priority_set_.hostSetsPerPriority()[priority];

ASSERT(!overprovisioning_factor.has_value() ||
overprovisioning_factor.value() == host_set->overprovisioning_factor());
return HostSetImplPtr{new HostSubsetImpl(*host_set, locality_weight_aware_)};
}

void SubsetLoadBalancer::PrioritySubsetImpl::update(uint32_t priority,
Expand Down
7 changes: 4 additions & 3 deletions source/common/upstream/subset_lb.h
Expand Up @@ -39,8 +39,8 @@ class SubsetLoadBalancer : public LoadBalancer, Logger::Loggable<Logger::Id::ups
class HostSubsetImpl : public HostSetImpl {
public:
HostSubsetImpl(const HostSet& original_host_set, bool locality_weight_aware)
: HostSetImpl(original_host_set.priority()), original_host_set_(original_host_set),
locality_weight_aware_(locality_weight_aware) {}
: HostSetImpl(original_host_set.priority(), original_host_set.overprovisioning_factor()),
original_host_set_(original_host_set), locality_weight_aware_(locality_weight_aware) {}

void update(const HostVector& hosts_added, const HostVector& hosts_removed,
HostPredicate predicate);
Expand Down Expand Up @@ -79,7 +79,8 @@ class SubsetLoadBalancer : public LoadBalancer, Logger::Loggable<Logger::Id::ups
LoadBalancerPtr lb_;

protected:
HostSetImplPtr createHostSet(uint32_t priority) override;
HostSetImplPtr createHostSet(uint32_t priority,
absl::optional<uint32_t> overprovisioning_factor) override;

private:
const PrioritySet& original_priority_set_;
Expand Down

0 comments on commit 8c189a5

Please sign in to comment.