Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validation: Add config validation mode. (#499) #863

Merged
merged 33 commits into from
May 26, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
51eba6e
async client: send headers and data in initialize(), not ctor
Apr 24, 2017
2b80353
upstream: Move the ClusterManagerFactory into the Server::Instance, a…
Apr 27, 2017
10bf44f
validation: Add config validation mode. (#499)
Mar 3, 2017
3e5a64c
tools/check_format.py fix
Apr 28, 2017
235be89
review comments
May 1, 2017
44e5b53
tools/check_format.py fix (sigh)
May 1, 2017
6ecdca9
review comments
May 8, 2017
9fda0da
check_format.py fix, again
May 8, 2017
b45e4ae
use NOT_IMPLEMENTED instead of EnvoyException()
May 8, 2017
d351abc
Merge remote-tracking branch 'upstream/master' into validate
May 16, 2017
8555de0
Envoy namespaces to match upstream
May 16, 2017
41e3e62
Remove the list<WorkerPtr> from ValidationInstance.
May 16, 2017
7732881
Replace one more EnvoyException that I missed earlier
May 16, 2017
a3af91c
Include some BUILD updates that got missed in the merge
May 16, 2017
d4770dc
Replace the ValidationInstance's StoreRoot with an IsolatedStoreImpl.
May 16, 2017
9948580
review comments
May 18, 2017
20a021b
Merge remote-tracking branch 'upstream/master' into validate
May 19, 2017
10a4556
Remove ValidationHotRestart.
May 22, 2017
47670a2
Add tests for ValidationAsyncClient and ValidationClusterManager.
May 22, 2017
f82eea8
Move validateConfig() out of main.cc for testability.
May 22, 2017
93d6069
Test ValidationConnectionHandler too
May 22, 2017
569265b
fix format
May 22, 2017
aecac42
review comments
May 24, 2017
1a7d6c7
Refactor ConfigurationImplTest to dedupe the setup noise.
May 24, 2017
9aa303b
tools/check_format.py fix
May 24, 2017
8499fb1
Add docs for validation mode.
May 24, 2017
1e4ed13
Merge remote-tracking branch 'upstream/master' into validate
May 24, 2017
18fb3c2
Add death tests for NOT_IMPLEMENTED methods.
May 25, 2017
f7ab514
exclude config_validation/ from coverage metrics, since it has so man…
May 26, 2017
1fdbb46
Revert "Add death tests for NOT_IMPLEMENTED methods."
May 26, 2017
f0918d2
Merge branch 'master' of https://github.com/lyft/envoy into validate
May 26, 2017
1a99564
catch up with upstream api change
May 26, 2017
c95e78c
tools/check_format.py fix
May 26, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions docs/operations/cli.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,17 @@ following are the command line options that Envoy supports.

*(required)* The path to the :ref:`JSON configuration file <config>`.

.. option:: --mode <string>

*(optional)* One of the operating modes for Envoy:

* ``serve``: *(default)* Validate the JSON configuration and then serve traffic normally.

* ``validate``: Validate the JSON configuration and then exit, printing either an "OK" message (in
which case the exit code is 0) or any errors generated by the configuration file (exit code 1).
No network traffic is generated, and the hot restart process is not performed, so no other Envoy
process on the machine will be disturbed.

.. option:: --admin-address-path <path string>

*(optional)* The output file path where the admin address and port will be written.
Expand Down
28 changes: 28 additions & 0 deletions include/envoy/server/options.h
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,28 @@
namespace Envoy {
namespace Server {

/**
* Whether to run Envoy in serving mode, or in config validation mode at one of two levels (in which
* case we'll verify the configuration file is valid, print any errors, and exit without serving.)
*/
enum class Mode {
/**
* Default mode: Regular Envoy serving process. Configs are validated in the normal course of
* initialization, but if all is well we proceed to serve traffic.
*/
Serve,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine in the enum here, but for the command-line, I'm in agreement with @PiotrSikora, it's probably cleaner to have a --dry-run and --dry-run-lite flags, I don't think anyone will ever set --mode=serve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, my thinking is that we might end up with any number of distinct modes for running Envoy -- "full validation" and "lite validation" are two, but conceptually --hot-restart-version is another: if we implemented that today, it might be a value for --mode. These all have two important things in common: that they significantly modify the semantics of what it means to run the binary (you expect drastically different things to happen) and, more importantly to the flag question, that it doesn't make sense to specify more than one.

We can grow a lot of boolean flags and document them all with "set either zero or one of these flags because they're mutually exclusive" or we can make that implicit with an enum-valued flag, and I argue that's much clearer. I agree nobody will ever explicitly set --mode serve, but I think it's still semantically a default state.

I'm 100% open to different names (e.g. I don't feel strongly about "validate" over "dry run"). I'm also willing to be outvoted on an enum vs. a fleet of booleans, but it was an intentional stance.

(cc @PiotrSikora because this is in reply to your comment too)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does hot restart version do that isn't done by --mode=serve?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It prints the hot restart version and exits immediately (https://github.com/lyft/envoy/blob/master/source/server/options_impl.cc#L63) without serving, validating, or doing anything else.


/**
* Validate as much as possible without opening network connections upstream or downstream.
*/
Validate,

// TODO(rlazarus): Add a third option for "light validation": Mock out access to the filesystem.
// Perform no validation of files referenced in the config, such as runtime configs, SSL certs,
// etc. Validation will pass even if those files are malformed or don't exist, allowing the config
// to be validated in a non-prod environment.
};

/**
* General options for the server.
*/
Expand Down Expand Up @@ -62,6 +84,12 @@ class Options {
*/
virtual uint64_t restartEpoch() PURE;

/**
* @return whether to verify the configuration file is valid, print any errors, and exit
* without serving.
*/
virtual Mode mode() const PURE;

/**
* @return std::chrono::milliseconds the duration in msec between log flushes.
*/
Expand Down
3 changes: 3 additions & 0 deletions include/envoy/upstream/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -15,9 +15,12 @@ envoy_cc_library(
":load_balancer_interface",
":thread_local_cluster_interface",
":upstream_interface",
"//include/envoy/access_log:access_log_interface",
"//include/envoy/http:async_client_interface",
"//include/envoy/http:conn_pool_interface",
"//include/envoy/json:json_object_interface",
"//include/envoy/local_info:local_info_interface",
"//include/envoy/runtime:runtime_interface",
],
)

Expand Down
15 changes: 15 additions & 0 deletions include/envoy/upstream/cluster_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,12 @@
#include <string>
#include <unordered_map>

#include "envoy/access_log/access_log.h"
#include "envoy/http/async_client.h"
#include "envoy/http/conn_pool.h"
#include "envoy/json/json_object.h"
#include "envoy/local_info/local_info.h"
#include "envoy/runtime/runtime.h"
#include "envoy/upstream/load_balancer.h"
#include "envoy/upstream/thread_local_cluster.h"
#include "envoy/upstream/upstream.h"
Expand Down Expand Up @@ -103,6 +106,8 @@ class ClusterManager {
virtual void shutdown() PURE;
};

typedef std::unique_ptr<ClusterManager> ClusterManagerPtr;

/**
* Global configuration for any SDS clusters.
*/
Expand Down Expand Up @@ -139,6 +144,16 @@ class ClusterManagerFactory {
public:
virtual ~ClusterManagerFactory() {}

/**
* Allocate a cluster manager from configuration JSON.
*/
virtual ClusterManagerPtr clusterManagerFromJson(const Json::Object& config, Stats::Store& stats,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the other methods from this interface now that we have this? E.g. can allocateConPool just be part of ClusterManagerImpl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not easily -- that distinction is there for testability, see https://github.com/lyft/envoy/blob/master/test/common/upstream/cluster_manager_impl_test.cc#L38. (I'm actually wasn't originally 100% sure that the ClusterManagerFactory is morally speaking the right place for, well, a ClusterManager factory function -- but I successfully talked myself into it. I could be talked back out of it, if there's a better place.)

ThreadLocal::Instance& tls,
Runtime::Loader& runtime,
Runtime::RandomGenerator& random,
const LocalInfo::LocalInfo& local_info,
AccessLog::AccessLogManager& log_manager) PURE;

/**
* Allocate an HTTP connection pool.
*/
Expand Down
4 changes: 2 additions & 2 deletions source/common/common/assert.h
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ namespace Envoy {
#define RELEASE_ASSERT(X) \
{ \
if (!(X)) { \
Logger::Registry::getLog(Logger::Id::assert) \
Envoy::Logger::Registry::getLog(Envoy::Logger::Id::assert) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the changes in this file needed? Seems like they shouldn't be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I was using NOT_IMPLEMENTED and NOT_REACHED over in main.cc, which isn't itself namespaced. Wrapping these macro definitions in namespace Envoy { } doesn't have the intended effect on their expansions, because, well, macros.

Turns out I don't need those macros there anymore, but I think morally speaking this is still correct. Happy to pull it out of this PR if you'd rather have it separately, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope fine, was just curious.

.critical("assert failure: {}: {}:{}", #X, __FILE__, __LINE__); \
abort(); \
} \
Expand All @@ -26,7 +26,7 @@ namespace Envoy {
* Indicate a panic situation and exit.
*/
#define PANIC(X) \
Logger::Registry::getLog(Logger::Id::assert) \
Envoy::Logger::Registry::getLog(Envoy::Logger::Id::assert) \
.critical("panic: {}: {}:{}", X, __FILE__, __LINE__); \
abort();

Expand Down
3 changes: 3 additions & 0 deletions source/common/http/async_client_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ AsyncClient::Request* AsyncClientImpl::send(MessagePtr&& request, AsyncClient::C
const Optional<std::chrono::milliseconds>& timeout) {
AsyncRequestImpl* async_request =
new AsyncRequestImpl(std::move(request), *this, callbacks, timeout);
async_request->initialize();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want consistency in AsyncStreamImpl with the initialization pattern? I.e. it might be surprising to someone using both AsyncRequestImpl and AsyncStreamImpl that they have slightly different ways to kick off a request after construction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AsyncStreamImpl doesn't currently do any real work in its ctor, though. I could add an initialize() method but it would only be for consistency's sake.

std::unique_ptr<AsyncStreamImpl> new_request{async_request};

// The request may get immediately failed. If so, we will return nullptr.
Expand Down Expand Up @@ -155,7 +156,9 @@ AsyncRequestImpl::AsyncRequestImpl(MessagePtr&& request, AsyncClientImpl& parent
AsyncClient::Callbacks& callbacks,
const Optional<std::chrono::milliseconds>& timeout)
: AsyncStreamImpl(parent, *this, timeout), request_(std::move(request)), callbacks_(callbacks) {
}

void AsyncRequestImpl::initialize() {
sendHeaders(request_->headers(), !request_->body());
if (!remoteClosed() && request_->body()) {
sendData(*request_->body(), true);
Expand Down
1 change: 1 addition & 0 deletions source/common/http/async_client_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,7 @@ class AsyncRequestImpl final : public AsyncClient::Request,
virtual void cancel() override;

private:
void initialize();
void onComplete();

// AsyncClient::StreamCallbacks
Expand Down
8 changes: 8 additions & 0 deletions source/common/upstream/cluster_manager_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -552,6 +552,14 @@ ClusterManagerImpl::ThreadLocalClusterManagerImpl::ClusterEntry::connPool(
return container.pools_[enumToInt(priority)].get();
}

ClusterManagerPtr ProdClusterManagerFactory::clusterManagerFromJson(
const Json::Object& config, Stats::Store& stats, ThreadLocal::Instance& tls,
Runtime::Loader& runtime, Runtime::RandomGenerator& random,
const LocalInfo::LocalInfo& local_info, AccessLog::AccessLogManager& log_manager) {
return ClusterManagerPtr{
new ClusterManagerImpl(config, *this, stats, tls, runtime, random, local_info, log_manager)};
}

Http::ConnectionPool::InstancePtr
ProdClusterManagerFactory::allocateConnPool(Event::Dispatcher& dispatcher, HostConstSharedPtr host,
ResourcePriority priority) {
Expand Down
5 changes: 5 additions & 0 deletions source/common/upstream/cluster_manager_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,11 @@ class ProdClusterManagerFactory : public ClusterManagerFactory {
local_info_(local_info) {}

// Upstream::ClusterManagerFactory
ClusterManagerPtr clusterManagerFromJson(const Json::Object& config, Stats::Store& stats,
ThreadLocal::Instance& tls, Runtime::Loader& runtime,
Runtime::RandomGenerator& random,
const LocalInfo::LocalInfo& local_info,
AccessLog::AccessLogManager& log_manager) override;
Http::ConnectionPool::InstancePtr allocateConnPool(Event::Dispatcher& dispatcher,
HostConstSharedPtr host,
ResourcePriority priority) override;
Expand Down
2 changes: 2 additions & 0 deletions source/exe/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ envoy_cc_library(
"//source/common/event:libevent_lib",
"//source/common/local_info:local_info_lib",
"//source/common/network:utility_lib",
"//source/common/stats:stats_lib",
"//source/common/stats:thread_local_store_lib",
"//source/server:drain_manager_lib",
"//source/server:options_lib",
Expand Down Expand Up @@ -58,6 +59,7 @@ envoy_cc_library(
"//source/common/common:compiler_requirements_lib",
":envoy_common_lib",
":hot_restart_lib",
"//source/server/config_validation:server_lib",
] + select({
"//bazel:disable_signal_trace": [],
"//conditions:default": [":sigaction_lib"],
Expand Down
22 changes: 17 additions & 5 deletions source/exe/main.cc
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
#include "common/event/libevent.h"
#include "common/local_info/local_info_impl.h"
#include "common/network/utility.h"
#include "common/stats/stats_impl.h"
#include "common/stats/thread_local_store.h"

#include "exe/hot_restart.h"
Expand All @@ -13,6 +14,7 @@
#include "exe/signal_action.h"
#endif

#include "server/config_validation/server.h"
#include "server/drain_manager_impl.h"
#include "server/options_impl.h"
#include "server/server.h"
Expand Down Expand Up @@ -45,10 +47,24 @@ int main(int argc, char** argv) {
// Enabled by default. Control with "bazel --define=signal_trace=disabled"
Envoy::SignalAction handle_sigs;
#endif
ares_library_init(ARES_LIB_INIT_ALL);
Envoy::Event::Libevent::Global::initialize();
Envoy::OptionsImpl options(argc, argv, Envoy::Server::SharedMemory::version(),
spdlog::level::warn);
Envoy::Server::ProdComponentFactory component_factory;
Envoy::LocalInfo::LocalInfoImpl local_info(
Envoy::Network::Utility::getLocalAddress(Envoy::Network::Address::IpVersion::v4),
options.serviceZone(), options.serviceClusterName(), options.serviceNodeName());

switch (options.mode()) {
case Envoy::Server::Mode::Serve:
break;
case Envoy::Server::Mode::Validate:
Envoy::Thread::MutexBasicLockable log_lock;
Envoy::Logger::Registry::initialize(options.logLevel(), log_lock);
return Envoy::Server::validateConfig(options, component_factory, local_info) ? 0 : 1;
}

ares_library_init(ARES_LIB_INIT_ALL);

std::unique_ptr<Envoy::Server::HotRestartImpl> restarter;
try {
Expand All @@ -61,11 +77,7 @@ int main(int argc, char** argv) {
Envoy::Logger::Registry::initialize(options.logLevel(), restarter->logLock());
Envoy::DefaultTestHooks default_test_hooks;
Envoy::Stats::ThreadLocalStoreImpl stats_store(*restarter);
Envoy::Server::ProdComponentFactory component_factory;
// TODO(henna): Add CLI option for local address IP version.
Envoy::LocalInfo::LocalInfoImpl local_info(
Envoy::Network::Utility::getLocalAddress(Envoy::Network::Address::IpVersion::v4),
options.serviceZone(), options.serviceClusterName(), options.serviceNodeName());
Envoy::Server::InstanceImpl server(options, default_test_hooks, *restarter, stats_store,
restarter->accessLogLock(), component_factory, local_info);
server.run();
Expand Down
2 changes: 1 addition & 1 deletion source/server/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -136,13 +136,13 @@ envoy_cc_library(
"//source/common/api:api_lib",
"//source/common/common:utility_lib",
"//source/common/common:version_lib",
"//source/common/json:config_schemas_lib",
"//source/common/memory:stats_lib",
"//source/common/network:utility_lib",
"//source/common/runtime:runtime_lib",
"//source/common/ssl:context_lib",
"//source/common/stats:statsd_lib",
"//source/common/thread_local:thread_local_lib",
"//source/common/upstream:cluster_manager_lib",
"//source/server/http:admin_lib",
],
)
Expand Down
98 changes: 98 additions & 0 deletions source/server/config_validation/BUILD
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
licenses(["notice"]) # Apache 2

load("//bazel:envoy_build_system.bzl", "envoy_cc_library", "envoy_package")

envoy_package()

envoy_cc_library(
name = "api_lib",
srcs = ["api.cc"],
hdrs = ["api.h"],
deps = [
":dispatcher_lib",
"//include/envoy/api:api_interface",
"//include/envoy/filesystem:filesystem_interface",
"//source/common/api:api_lib",
],
)

envoy_cc_library(
name = "async_client_lib",
srcs = ["async_client.cc"],
hdrs = ["async_client.h"],
deps = [
"//include/envoy/http:async_client_interface",
"//include/envoy/http:message_interface",
],
)

envoy_cc_library(
name = "cluster_manager_lib",
srcs = ["cluster_manager.cc"],
hdrs = ["cluster_manager.h"],
deps = [
":async_client_lib",
"//include/envoy/upstream:cluster_manager_interface",
"//source/common/upstream:cluster_manager_lib",
],
)

envoy_cc_library(
name = "connection_handler_lib",
srcs = ["connection_handler.cc"],
hdrs = ["connection_handler.h"],
deps = [
"//include/envoy/api:api_interface",
"//include/envoy/network:connection_handler_interface",
"//include/envoy/network:filter_interface",
"//include/envoy/network:listen_socket_interface",
"//source/common/common:assert_lib",
],
)

envoy_cc_library(
name = "dispatcher_lib",
srcs = ["dispatcher.cc"],
hdrs = ["dispatcher.h"],
deps = [
"//include/envoy/event:dispatcher_interface",
"//source/common/common:assert_lib",
"//source/common/event:dispatcher_lib",
],
)

envoy_cc_library(
name = "dns_lib",
srcs = ["dns.cc"],
hdrs = ["dns.h"],
deps = [
"//include/envoy/event:dispatcher_interface",
"//include/envoy/network:dns_interface",
],
)

envoy_cc_library(
name = "server_lib",
srcs = ["server.cc"],
hdrs = ["server.h"],
deps = [
":api_lib",
":cluster_manager_lib",
":connection_handler_lib",
":dns_lib",
"//include/envoy/common:optional",
"//include/envoy/server:drain_manager_interface",
"//include/envoy/server:instance_interface",
"//include/envoy/ssl:context_manager_interface",
"//include/envoy/tracing:http_tracer_interface",
"//source/common/access_log:access_log_manager_lib",
"//source/common/common:assert_lib",
"//source/common/runtime:runtime_lib",
"//source/common/ssl:context_lib",
"//source/common/stats:stats_lib",
"//source/common/thread_local:thread_local_lib",
"//source/server:configuration_lib",
"//source/server:server_lib",
"//source/server/http:admin_lib",
],
)
16 changes: 16 additions & 0 deletions source/server/config_validation/api.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
#include "server/config_validation/api.h"

#include "server/config_validation/dispatcher.h"

namespace Envoy {
namespace Api {

ValidationImpl::ValidationImpl(std::chrono::milliseconds file_flush_interval_msec)
: Impl(file_flush_interval_msec) {}

Event::DispatcherPtr ValidationImpl::allocateDispatcher() {
return Event::DispatcherPtr{new Event::ValidationDispatcher()};
}

} // Api
} // Envoy
Loading