Merge pull request envoyproxy#38 from istio-private/backport-1.4
Backport 1.4
jplevyak committed Jun 17, 2020
2 parents a5363aa + 6c46280 commit dece5fd
Showing 71 changed files with 1,536 additions and 141 deletions.
@@ -300,6 +300,14 @@ message HttpConnectionManager {
// is terminated with a 408 Request Timeout error code if no upstream response
// header has been received, otherwise a stream reset occurs.
//
// This timeout also specifies the amount of time that Envoy will wait for the peer to open enough
// window to write any remaining stream data once the entirety of stream data (local end stream is
// true) has been buffered pending available window. In other words, this timeout defends against
// a peer that does not release enough window to completely write the stream, even though all
// data has been proxied within available flow control windows. If the timeout is hit in this
// case, the :ref:`tx_flush_timeout <config_http_conn_man_stats_per_codec>` counter will be
// incremented.
//
// Note that it is possible to idle timeout even if the wire traffic for a stream is non-idle, due
// to the granularity of events presented to the connection manager. For example, while receiving
// very large request headers, it may be the case that there is traffic regularly arriving on the
@@ -287,6 +287,14 @@ message HttpConnectionManager {
// is terminated with a 408 Request Timeout error code if no upstream response
// header has been received, otherwise a stream reset occurs.
//
// This timeout also specifies the amount of time that Envoy will wait for the peer to open enough
// window to write any remaining stream data once the entirety of stream data (local end stream is
// true) has been buffered pending available window. In other words, this timeout defends against
// a peer that does not release enough window to completely write the stream, even though all
// data has been proxied within available flow control windows. If the timeout is hit in this
// case, the :ref:`tx_flush_timeout <config_http_conn_man_stats_per_codec>` counter will be
// incremented.
//
// Note that it is possible to idle timeout even if the wire traffic for a stream is non-idle, due
// to the granularity of events presented to the connection manager. For example, while receiving
// very large request headers, it may be the case that there is traffic regularly arriving on the
15 changes: 15 additions & 0 deletions docs/root/configuration/best_practices/edge.rst
@@ -23,6 +23,9 @@ HTTP proxies should additionally configure:
* :ref:`HTTP/2 maximum concurrent streams limit <envoy_api_field_core.Http2ProtocolOptions.max_concurrent_streams>` to 100,
* :ref:`HTTP/2 initial stream window size limit <envoy_api_field_core.Http2ProtocolOptions.initial_stream_window_size>` to 64 KiB,
* :ref:`HTTP/2 initial connection window size limit <envoy_api_field_core.Http2ProtocolOptions.initial_connection_window_size>` to 1 MiB.
* :ref:`headers_with_underscores_action setting <envoy_api_field_core.HttpProtocolOptions.headers_with_underscores_action>` to REJECT_REQUEST, to protect upstream services that treat '_' and '-' as interchangeable.
* :ref:`Listener connection limits <config_listeners_runtime>`.
* :ref:`Global downstream connection limits <config_overload_manager>`.

The following is a YAML example of the above recommendation.

@@ -108,3 +111,15 @@ The following is a YAML example of the above recommendation.
http2_protocol_options:
initial_stream_window_size: 65536 # 64 KiB
initial_connection_window_size: 1048576 # 1 MiB
layered_runtime:
  layers:
  - name: static_layer_0
    static_layer:
      envoy:
        resource_limits:
          listener:
            example_listener_name:
              connection_limit: 10000
      overload:
        global_downstream_max_connections: 50000
9 changes: 9 additions & 0 deletions docs/root/configuration/http/http_conn_man/stats.rst
@@ -137,7 +137,16 @@ All http2 statistics are rooted at *http2.*
rx_reset, Counter, Total number of reset stream frames received by Envoy
too_many_header_frames, Counter, Total number of times an HTTP2 connection is reset due to receiving too many headers frames. Envoy currently supports proxying at most one header frame for 100-Continue, one non-100 response code header frame, and one frame with trailers
trailers, Counter, Total number of trailers seen on requests coming from downstream
tx_flush_timeout, Counter, Total number of :ref:`stream idle timeouts <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.stream_idle_timeout>` waiting for open stream window to flush the remainder of a stream
tx_reset, Counter, Total number of reset stream frames transmitted by Envoy
streams_active, Gauge, Active streams as observed by the codec
pending_send_bytes, Gauge, Currently buffered body data in bytes waiting to be written when stream/connection window is opened.

.. attention::

The HTTP/2 `streams_active` gauge may be greater than the HTTP connection manager
`downstream_rq_active` gauge due to differences in stream accounting between the codec and the
HTTP connection manager.

Tracing statistics
------------------
1 change: 1 addition & 0 deletions docs/root/configuration/listeners/listeners.rst
@@ -8,6 +8,7 @@ Listeners

overview
stats
runtime
listener_filters/listener_filters
network_filters/network_filters
lds
8 changes: 8 additions & 0 deletions docs/root/configuration/listeners/runtime.rst
@@ -0,0 +1,8 @@
.. _config_listeners_runtime:

Runtime
-------
The following runtime settings are supported:

envoy.resource_limits.listener.<name of listener>.connection_limit
Sets a limit on the number of active connections to the specified listener.
2 changes: 2 additions & 0 deletions docs/root/configuration/listeners/stats.rst
@@ -16,8 +16,10 @@ Every listener has a statistics tree rooted at *listener.<address>.* with the fo
downstream_cx_destroy, Counter, Total destroyed connections
downstream_cx_active, Gauge, Total active connections
downstream_cx_length_ms, Histogram, Connection length milliseconds
downstream_cx_overflow, Counter, Total connections rejected due to enforcement of listener connection limit
downstream_pre_cx_timeout, Counter, Sockets that timed out during listener filter processing
downstream_pre_cx_active, Gauge, Sockets currently undergoing listener filter processing
global_cx_overflow, Counter, Total connections rejected due to enforcement of the global connection limit
no_filter_chain_match, Counter, Total connections that didn't match any filter chain
ssl.connection_error, Counter, Total TLS connection errors not including failed certificate verifications
ssl.handshake, Counter, Total successful TLS connection handshakes
@@ -53,6 +53,30 @@ The following overload actions are supported:
envoy.overload_actions.stop_accepting_connections, Envoy will stop accepting new network connections on its configured listeners
envoy.overload_actions.shrink_heap, Envoy will periodically try to shrink the heap by releasing free memory to the system

Limiting Active Connections
---------------------------

Currently, the only supported way to limit the total number of active connections across all
listeners is to specify an integer via the runtime key
``overload.global_downstream_max_connections``. The connection limit should be kept below
half of the system's file descriptor limit, to account for upstream connections, files, and other
uses of file descriptors.
If the value is unspecified, there is no global limit on the number of active downstream connections
and Envoy will emit a warning at startup indicating this. To disable the warning without limiting
the number of active downstream connections, the runtime value may be set to a very large
value (~2e9).

To limit the number of downstream connections for a particular listener only, per-listener limits
can be set via the :ref:`listener configuration <config_listeners>`.

One may simultaneously specify both per-listener and global downstream connection limits and the
conditions will be enforced independently. For instance, if it is known that a particular listener
should have a smaller number of open connections than others, one may specify a smaller connection
limit for that specific listener and allow the global limit to enforce resource utilization among
all listeners.

An example configuration can be found in the :ref:`edge best practices document <best_practices_edge>`.
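
For illustration only, a static runtime layer could set a per-listener limit and the global limit together roughly as follows; the listener name and numeric values are placeholders, not recommendations:

layered_runtime:
  layers:
  - name: static_layer_0
    static_layer:
      envoy:
        resource_limits:
          listener:
            # Hypothetical listener that should stay well below the global limit.
            small_listener:
              connection_limit: 1000
      overload:
        # Enforced independently of any per-listener limit.
        global_downstream_max_connections: 50000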

Statistics
----------

20 changes: 20 additions & 0 deletions docs/root/faq/configuration/resource_limits.rst
@@ -0,0 +1,20 @@
.. _faq_resource_limits:

How does Envoy prevent file descriptor exhaustion?
==================================================

:ref:`Per-listener connection limits <config_listeners_runtime>` may be configured as an upper bound
on the number of active connections a particular listener will accept. The listener may accept more
connections than the configured value, by an amount on the order of the number of worker threads.

In addition, one may configure a :ref:`global limit <config_overload_manager>` on the number of
connections that will apply across all listeners.

On Unix-based systems, it is recommended to keep the sum of all connection limits less than half of
the system's file descriptor limit to account for upstream connections, files, and other usage of
file descriptors.

.. note::

This per-listener connection limiting will eventually be handled by the :ref:`overload manager
<arch_overview_overload_manager>`.
4 changes: 3 additions & 1 deletion docs/root/faq/configuration/timeouts.rst
@@ -52,7 +52,9 @@ context request/stream is interchangeable.
<envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.stream_idle_timeout>`
is the amount of time that the connection manager will allow a stream to exist with no upstream
or downstream activity. The default stream idle timeout is *5 minutes*. This timeout is strongly
recommended for streaming APIs (requests or responses that never end).
recommended for all requests (not just streaming requests/responses) as it additionally defends
against an HTTP/2 peer that does not open stream window once an entire response has been buffered
to be sent to a downstream client).

Route timeouts
^^^^^^^^^^^^^^
1 change: 1 addition & 0 deletions docs/root/faq/overview.rst
@@ -33,6 +33,7 @@ Configuration
configuration/zipkin_tracing
configuration/flow_control
configuration/timeouts
configuration/resource_limits

Load balancing
--------------
4 changes: 4 additions & 0 deletions docs/root/intro/version_history.rst
@@ -9,6 +9,10 @@ Version history
* http: added HTTP/1.1 flood protection. Can be temporarily disabled using the runtime feature `envoy.reloadable_features.http1_flood_protection`.
1.12.5 (Pending)
================
* http: the :ref:`stream_idle_timeout <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.stream_idle_timeout>`
now also defends against an HTTP/2 peer that does not open stream window once an entire response has been buffered to be sent to a downstream client.
* listener: add runtime support for :ref:`per-listener limits <config_listeners_runtime>` on active/accepted connections.
* overload management: add runtime support for :ref:`global limits <config_overload_manager>` on active/accepted connections.

1.12.4 (June 8, 2020)
=====================
10 changes: 10 additions & 0 deletions examples/front-proxy/front-envoy.yaml
@@ -4,6 +4,7 @@ static_resources:
socket_address:
address: 0.0.0.0
port_value: 80
name: example_listener_name
filter_chains:
- filters:
- name: envoy.http_connection_manager
@@ -64,3 +65,12 @@ admin:
socket_address:
address: 0.0.0.0
port_value: 8001
layered_runtime:
  layers:
  - name: static_layer_0
    static_layer:
      envoy:
        resource_limits:
          listener:
            example_listener_name:
              connection_limit: 10000
9 changes: 9 additions & 0 deletions include/envoy/buffer/buffer.h
@@ -57,6 +57,15 @@ class Instance {
public:
virtual ~Instance() = default;

/**
* Register function to call when the last byte in the last slice of this
* buffer has fully drained. Note that slices may be transferred to
* downstream buffers; drain trackers are transferred along with the bytes
* they track, so the function is called only after the last byte is drained
* from all buffers.
*/
virtual void addDrainTracker(std::function<void()> drain_tracker) PURE;

/**
* Copy data into the buffer (deprecated, use absl::string_view variant
* instead).
7 changes: 7 additions & 0 deletions include/envoy/http/codec.h
@@ -203,6 +203,13 @@ class Stream {
* @return uint32_t the stream's configured buffer limits.
*/
virtual uint32_t bufferLimit() PURE;

/**
* Set the flush timeout for the stream. At the codec level this is used to bound the amount of
* time the codec will wait to flush body data pending open stream window. It does *not* count
* small window updates as satisfying the idle timeout as this is a potential DoS vector.
*/
virtual void setFlushTimeout(std::chrono::milliseconds timeout) PURE;
};

/**
11 changes: 11 additions & 0 deletions include/envoy/network/listener.h
@@ -6,6 +6,7 @@

#include "envoy/api/io_error.h"
#include "envoy/common/exception.h"
#include "envoy/common/resource.h"
#include "envoy/network/connection.h"
#include "envoy/network/connection_balancer.h"
#include "envoy/network/listen_socket.h"
@@ -108,6 +109,11 @@ class ListenerConfig {
* though the implementation may be a NOP balancer.
*/
virtual ConnectionBalancer& connectionBalancer() PURE;

/**
* Open connection resources for this listener.
*/
virtual ResourceLimit& openConnections() PURE;
};

/**
@@ -122,6 +128,11 @@ class ListenerCallbacks {
* @param socket supplies the socket that is moved into the callee.
*/
virtual void onAccept(ConnectionSocketPtr&& socket) PURE;

/**
* Called when a new connection is rejected.
*/
virtual void onReject() PURE;
};

/**
1 change: 1 addition & 0 deletions source/common/buffer/BUILD
@@ -28,6 +28,7 @@ envoy_cc_library(
"//source/common/common:stack_array",
"//source/common/common:utility_lib",
"//source/common/event:libevent_lib",
"//source/server:backtrace_lib",
],
)

15 changes: 12 additions & 3 deletions source/common/buffer/buffer_impl.cc
@@ -37,6 +37,12 @@ void OwnedImpl::addImpl(const void* data, uint64_t size) {
}
}

void OwnedImpl::addDrainTracker(std::function<void()> drain_tracker) {
ASSERT(!old_impl_);
ASSERT(!slices_.empty());
slices_.back()->addDrainTracker(std::move(drain_tracker));
}

void OwnedImpl::add(const void* data, uint64_t size) { addImpl(data, size); }

void OwnedImpl::addBufferFragment(BufferFragment& fragment) {
@@ -305,9 +311,11 @@ void* OwnedImpl::linearize(uint32_t size) {
auto dest = static_cast<uint8_t*>(reservation.mem_);
do {
uint64_t data_size = slices_.front()->dataSize();
memcpy(dest, slices_.front()->data(), data_size);
bytes_copied += data_size;
dest += data_size;
if (data_size > 0) {
memcpy(dest, slices_.front()->data(), data_size);
bytes_copied += data_size;
dest += data_size;
}
slices_.pop_front();
} while (bytes_copied < linearized_size);
ASSERT(dest == static_cast<const uint8_t*>(reservation.mem_) + linearized_size);
@@ -331,6 +339,7 @@ void OwnedImpl::coalesceOrAddSlice(SlicePtr&& other_slice) {
// Copy content of the `other_slice`. The `move` methods which call this method effectively
// drain the source buffer.
addImpl(other_slice->data(), slice_size);
other_slice->transferDrainTrackersTo(*slices_.back());
} else {
// Take ownership of the slice.
slices_.emplace_back(std::move(other_slice));
23 changes: 22 additions & 1 deletion source/common/buffer/buffer_impl.h
@@ -35,7 +35,11 @@ class Slice {
public:
using Reservation = RawSlice;

virtual ~Slice() = default;
virtual ~Slice() {
for (const auto& drain_tracker : drain_trackers_) {
drain_tracker();
}
}

/**
* @return a pointer to the start of the usable content.
@@ -137,6 +141,9 @@ class Slice {
*/
uint64_t append(const void* data, uint64_t size) {
uint64_t copy_size = std::min(size, reservableSize());
if (copy_size == 0) {
return 0;
}
uint8_t* dest = base_ + reservable_;
reservable_ += copy_size;
// NOLINTNEXTLINE(clang-analyzer-core.NullDereference)
@@ -193,6 +200,15 @@ class Slice {
*/
virtual bool canCoalesce() const { return true; }

void transferDrainTrackersTo(Slice& destination) {
destination.drain_trackers_.splice(destination.drain_trackers_.end(), drain_trackers_);
ASSERT(drain_trackers_.empty());
}

void addDrainTracker(std::function<void()> drain_tracker) {
drain_trackers_.emplace_back(std::move(drain_tracker));
}

protected:
Slice(uint64_t data, uint64_t reservable, uint64_t capacity)
: data_(data), reservable_(reservable), capacity_(capacity) {}
@@ -208,6 +224,8 @@ class Slice {

/** Total number of bytes in the slice */
uint64_t capacity_;

std::list<std::function<void()>> drain_trackers_;
};

using SlicePtr = std::unique_ptr<Slice>;
@@ -512,6 +530,7 @@ class OwnedImpl : public LibEventInstance {
OwnedImpl(const void* data, uint64_t size);

// Buffer::Instance
void addDrainTracker(std::function<void()> drain_tracker) override;
void add(const void* data, uint64_t size) override;
void addBufferFragment(BufferFragment& fragment) override;
void add(absl::string_view data) override;
@@ -567,6 +586,8 @@ class OwnedImpl : public LibEventInstance {
*/
static void useOldImpl(bool use_old_impl);

static bool newBuffersUseOldImpl() { return use_old_impl_; }

/**
* Describe the in-memory representation of the slices in the buffer. For use
* in tests that want to make assertions about the specific arrangement of
2 changes: 1 addition & 1 deletion source/common/http/codec_client.cc
@@ -17,7 +17,7 @@ namespace Http {
CodecClient::CodecClient(Type type, Network::ClientConnectionPtr&& connection,
Upstream::HostDescriptionConstSharedPtr host,
Event::Dispatcher& dispatcher)
: type_(type), connection_(std::move(connection)), host_(host),
: type_(type), host_(host), connection_(std::move(connection)),
idle_timeout_(host_->cluster().idleTimeout()) {
if (type_ != Type::HTTP3) {
// Make sure upstream connections process data and then the FIN, rather than processing
6 changes: 4 additions & 2 deletions source/common/http/codec_client.h
@@ -155,9 +155,11 @@ class CodecClient : Logger::Loggable<Logger::Id::client>,
}

const Type type_;
ClientConnectionPtr codec_;
Network::ClientConnectionPtr connection_;
// The order of host_, connection_, and codec_ matters as during destruction each can refer to
// the previous, at least in tests.
Upstream::HostDescriptionConstSharedPtr host_;
Network::ClientConnectionPtr connection_;
ClientConnectionPtr codec_;
Event::TimerPtr idle_timer_;
const absl::optional<std::chrono::milliseconds> idle_timeout_;
