-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sds api: implement SDS api #3700
Conversation
name: "abc.com" | ||
tls_certificate: | ||
certificate_chain: | ||
filename: "test/config/integration/certs/cacert.pem" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use {{ test_rundir }}
and substitute, same for readFileToStringForTest below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks!
source/common/secret/sds_api.cc
Outdated
// TODO(jaebong): replace next line with | ||
// "envoy.service.discovery.v2.SecretDiscoveryService.StreamSecrets" to support streaming | ||
// service | ||
"envoy.service.discovery.v2.SecretDiscoveryService.FetchSecrets"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just use StreamSecrets here, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to StreamSecrets and removed TODO.
@JimmyCYJ can you take a look on test failures? @PiotrSikora @mattklein123 PTAL once test passes. |
Please ping me when ready. |
@lizan both coverage and ipv6_tests show that some directories do not exist. I am not clear how to fix them. Have you seen that before? |
@JimmyCYJ you can ignore those warnings, download full log. The coverage failure is due to the test coverage is 97.9%, which is below envoy threshold 98.0%. See https://67413-65214191-gh.circle-artifacts.com/0/build/envoy/generated/coverage/coverage.html for detail reports. |
|
||
class SecretManagerUtil { | ||
public: | ||
virtual ~SecretManagerUtil() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not need for this destructor. or just change SecretManagerUtil to namespace
* @param config_source envoy::api::v2::core::ConfigSource. | ||
* @return hash code. | ||
*/ | ||
static std::string configSourceHash(const envoy::api::v2::core::ConfigSource& config_source) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that it is only used in one place, so not need to have it as a separate file. call in inside the findOrCreateDynamicSecretProvider.
@@ -102,6 +68,8 @@ ContextConfigImpl::ContextConfigImpl(const envoy::api::v2::auth::CommonTlsContex | |||
tlsVersionFromProto(config.tls_params().tls_minimum_protocol_version(), TLS1_VERSION)), | |||
max_protocol_version_( | |||
tlsVersionFromProto(config.tls_params().tls_maximum_protocol_version(), TLS1_2_VERSION)) { | |||
readConfig(config); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename it to readCertChainConfig
private: | ||
class InitManager : public Init::Manager { | ||
public: | ||
void initialize(std::function<void()> callback); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this function defined?
|
||
std::string yaml = | ||
R"EOF( | ||
name: "abc.com" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we combine this static secret config with overall config defined in line 307? I think it will be easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per offline discussion, I am going to keep it this way.
test/common/secret/sds_api_test.cc
Outdated
|
||
TEST_F(SdsApiTest, BasicTest) { | ||
::testing::InSequence s; | ||
Server::MockInstance server; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap it with NiceMock<> to eliminate these warnings:
Uninteresting mock function call - taking default action specified at:
test/mocks/server/mocks.cc:124:
Function call: random()
Returns: @0x7ffff7fc3ff8 8-byte object <F8-F5 C6-04 00-00 00-00>
unknown file: Failure
@mattklein123, would you mind to take a look? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this. Some comments to get started. I didn't review tests yet. Please make sure you have 100% test coverage for all the code you added. Thank you!
/** | ||
* An interface to fetch dynamic secret. | ||
*/ | ||
class DynamicSecretProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the plan here to add other secret types? Otherwise should this be named DynamicTlsCertificateSecretProvider or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we plan to support other secret types, https://github.com/envoyproxy/data-plane-api/blob/master/envoy/api/v2/auth/cert.proto#L320. Is it okay to add TODO here and keep the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit different from the static secret API, so I would match what we did there and make these typed, or change static to be similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Then I will rename it to DynamicTlsCertificateSecretProvider. Thanks!
* @return the dynamic secret provider. | ||
*/ | ||
virtual DynamicSecretProviderSharedPtr | ||
findOrCreateDynamicSecretProvider(const envoy::api::v2::core::ConfigSource& config_source, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per comment above should this be specific to TlsCertificates for now like the static versions?
include/envoy/ssl/context_config.h
Outdated
/** | ||
* @return true of the config is valid. | ||
*/ | ||
virtual bool isValid() const PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is going to be used in next PR. Please refer to our high level design PR(https://github.com/envoyproxy/envoy/pull/3748/files). For example, ContextManagerImpl::createSslClientContext() needs this method to decide whether to create ssl client context.
I find it is convenient to use this method in tests when I add tests into context_impl_test.cc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I don't have time to go through the other change. It seems a bit odd to have an isValid()
on this. The context should always be valid or it should throw an exception, so I would change the logic and remove this. Is there something I'm missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe isValid() is not accurate. It returns true if we have static secrets to create context, or we have created a dynamic secret provider and the provider has downloaded secrets already. We are going to check if secrets are ready or not before creating any context objects. If we have created a secret provider but that provider has not downloaded any secret, then we should not create context. Does that make sense? Is it better to rename it to isSecretAvailable()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will you do if the secrets aren't ready? Can this ever happen with warming? I don't think it should ever happen...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are considering that if the gRPC connection breaks when envoy is waiting for secrets, then maybe we need isValid() before creating context. I am going to remove isValid() from this PR, and come up with a better idea in the next PR.
include/envoy/ssl/context_config.h
Outdated
/** | ||
* @return the DynamicSecretProvider object. | ||
*/ | ||
virtual Secret::DynamicSecretProvider* getDynamicSecretProvider() const PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this return a const pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to high level design PR(https://github.com/envoyproxy/envoy/pull/3748/files), this method is needed for adding and removing callbacks. Perhaps I can remove it from this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove code that is not used in this PR. It's hard to review without context.
|
||
void SecretManagerImpl::addOrUpdateSecret(const envoy::api::v2::auth::Secret& secret) { | ||
std::string configSourceHash(const envoy::api::v2::core::ConfigSource& config_source) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be moved into https://github.com/envoyproxy/envoy/blob/master/source/common/config/utility.h or some other common location? Also, can we just use https://github.com/envoyproxy/envoy/blob/master/source/common/protobuf/utility.h#L141 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the advice! I will use MessageUtil::hash() to replace this hash function.
return; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we need to throw some type of config error here if neither of the above cases are true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is okay to do nothing here. Without this change, the readConfig() returns empty string.
https://github.com/envoyproxy/envoy/blob/master/source/common/ssl/context_config_impl.cc#L19
In that case cert_chain_ and private_key_ are empty. I am trying to make this readCertChainConfig() consistent with existing behavior.
source/common/secret/sds_api.h
Outdated
const envoy::api::v2::core::ConfigSource sds_config_; | ||
std::unique_ptr<Config::Subscription<envoy::api::v2::auth::Secret>> subscription_; | ||
std::function<void()> initialize_callback_; | ||
std::string sds_config_name_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
|
||
// Manages pairs of secret name and Ssl::TlsCertificateConfig. | ||
std::unordered_map<std::string, Ssl::TlsCertificateConfigPtr> static_tls_certificate_secrets_; | ||
mutable std::shared_timed_mutex static_tls_certificate_secrets_mutex_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this lock needed? AFAICT all processing in this area should happen on the main thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lock will be removed.
|
||
// map hash code of SDS config source and SdsApi object. | ||
std::unordered_map<std::string, std::weak_ptr<DynamicSecretProvider>> dynamic_secret_providers_; | ||
mutable std::shared_timed_mutex dynamic_secret_providers_mutex_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this lock needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed. I will remove this lock and the one above.
mutable std::shared_timed_mutex static_tls_certificate_secrets_mutex_; | ||
|
||
// map hash code of SDS config source and SdsApi object. | ||
std::unordered_map<std::string, std::weak_ptr<DynamicSecretProvider>> dynamic_secret_providers_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do you handle removal? If there is no removal do we need a weak pointer here? Though it seems like we should handle removal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the PR description, that covers the lifetime and ownership of DynamicTlsCertificateSecretProvider. This DynamicTlsCertificateSecretProvider object will be removed if nobody is using it. The weak_ptr entry will remain in this map.
@mattklein123 PTAL, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more comments to work through, then will take another pass.
*/ | ||
virtual DynamicTlsCertificateSecretProviderSharedPtr | ||
findOrCreateDynamicTlsCertificateSecretProvider( | ||
const envoy::api::v2::core::ConfigSource& config_source, std::string config_name) PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const std::string& config_name
@@ -5,6 +5,7 @@ | |||
#include <vector> | |||
|
|||
#include "envoy/common/pure.h" | |||
#include "envoy/secret/dynamic_secret_provider.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this needed?
@@ -1,22 +1,37 @@ | |||
#pragma once | |||
|
|||
#include <shared_mutex> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
del
source/common/secret/sds_api.cc
Outdated
SdsApi::SdsApi(Server::Instance& server, const envoy::api::v2::core::ConfigSource& sds_config, | ||
std::string sds_config_name) | ||
: server_(server), sds_config_(sds_config), sds_config_name_(sds_config_name), secret_hash_(0) { | ||
server_.initManager().registerTarget(*this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this is correct in all cases. For listener secrets, you are going to need to use the listener init manager, not the server manager. Likewise for cluster secrets you need to use the cluster init manager (which might be the server during startup). You should remove all references to server in these files are pass the init manager directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out.
I think we can add this method into SdsApi,
void SdsApi::registerInitTarget(Init::Manager& init_manager) {
init_manager.registerTarget(*this);
}
and add secret_provider_->registerInitTarget(init_manager)
into ContextConfigImpl::readCertChainConfig()
.
We can add accessor virtual Init::Manager&initManager() PURE
into Configuration::TransportSocketFactoryContext.
-
Pass listener init manager to ContextConfigImpl.
As ListenerImpl inherits TransportSocketFactoryContext and provides initManager(), we can modify ServerContextConfigImpl(), and pass context.initManager() into ServerContextConfigImpl() via DownstreamSslSocketFactory::createTransportSocketFactory(), and then pass initManager to ContextConfigImpl(). -
Pass cluster init manager to ContextConfigImpl.
To pass cluster init manager to ClientContextConfigImpl, we can implement ClusterInfoImpl::initManager(), as ClusterInfoImpl inherits Server::Configuration::TransportSocketFactoryContext(). Then pass initManager to ClientContextConfigImpl() via UpstreamSslSocketFactory::createTransportSocketFactory().
Does this make sense? I haven't figured out an convenient way to pass initManager to ClusterInfoImpl. Not sure if it is okay to just let ClusterInfoImpl::initManager() return secret_manager_.Server() (assume that we expose SecretManager::server_). Do you have any suggestions? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we definitely cannot add any concept of InitManager to ClusterInfo. The init manager stuff must happen on the main thread only and cluster info is a multi-threaded concept. I would need to sit down with the code in detail to figure out the right way to handle this, but it's possible that the overall design will need rethinking. In the mean time I would consult with @lizan and @PiotrSikora and see if they can help you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattklein123 @lizan and @PiotrSikora I have added a new member sds_init_manager_ into ClusterImplBase, and register SdsApi target at sds_init_manager_. sds_init_manager_ initializes registered targets when ClusterImplBase::onPreInitComplete() is called.
Both cluster init manager and listener manager are passed to TransportSocketFactoryContext, so that ContextConfigImpl could use init manager to create SdsApi objects.
: secret_manager_(secret_manager), | ||
init_manager_(init_manager), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If context_config object is holding a reference to init_manager, need to make sure its life time is longer than context_config. For LDS, its initManager is the whole ListenerImpl, which is fine. Not sure about ClusterManagerImpl initManager, what is its lifetime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per offline discussion, this init_manager_ is now removed from ContextConfigImpl. Only the ClusterInfoImpl owns a reference to ClusterImplBase::sds_init_manager_.
We registered SDS api at this init manager within the constructor of ClusterInfoImpl, and ClusterInfoImpl is created by the constructor of ClusterImplBase.
This ClusterImplBase::sds_init_manager_ is only called to initialize SDS api within ClusterImplBase::onPreInitComplete(). Nobody is using it once ClusterImplBase is destructed. I think it is safe.
@@ -474,7 +481,8 @@ class ClusterImplBase : public Cluster, protected Logger::Loggable<Logger::Id::u | |||
* over and determines if there is an initial health check pass needed, etc. | |||
*/ | |||
void onPreInitComplete(); | |||
|
|||
|
|||
Server::InitManagerImpl sds_init_manager_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment to state that this init_manager is only used by SDS api.
@@ -506,6 +508,8 @@ void ClusterImplBase::onPreInitComplete() { | |||
pending_initialize_health_checks_ += host_set->hosts().size(); | |||
} | |||
|
|||
sds_init_manager_.initialize([]() -> void {}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong.
How about split the ClusterImplBase::onPreInitComplete into following two functions:
ClusterImplBase::onPreInitComplete() {
sds_init_manager_.initialize( [this]() { onSdsInitDone(); });
}
ClusterImplBase::onSdsInitDone() {
// exact same code as onPreInitComplete
}
I'm going to look at this change in detail later today. Something about all of this does not sound right to me, but until I look in detail I don't have a lot to say about it. |
* Called by every concrete cluster when pre-init is complete. At this point, shared init takes | ||
* over and determines if there is an initial health check pass needed, etc. | ||
* Called by every concrete cluster when pre-init is complete. At this point, SDS init manager | ||
* initializes every registered SDS api target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update the comment as:
At this point, shared init starts sds_init_manager_ initialization and determines if there is an initial health check pass needed, etc.
@JimmyCYJ @lizan @PiotrSikora I went through the code. Largely, I think we are in good shape, but there is something I think we need to change to not confuse things in the future. I don't think it's a super fundamental design change so I think it shouldn't be too hard to do. I'm going to describe my concerns and the solutions below. Primarily, the cluster init manager has be owned by the cluster manager, and not the cluster info. The cluster manager is actually where init happens and in the future we might add other things that require init at this level. So we we want to add an init manager concept within the cluster manager, and pass that through as needed. Concretely, there are two changes I would like to make:
Thank you for working on this and please LMK if this makes sense or if you need additional clarification before proceeding. Hopefully, @lizan can also help you more directly with this set of changes. |
dynamic_secret_providers_[map_key] = dynamic_secret_provider; | ||
} | ||
|
||
for (auto it = dynamic_secret_providers_.begin(); it != dynamic_secret_providers_.end();) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract this part to a function, use weak_ptr::expired and std::remove_if.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have extracted this part into SecretManagerImpl::removeDeletedSecretProvider(), and use weak_ptr::expired() instead of lock(). Thanks!
@@ -434,14 +414,52 @@ ClusterSharedPtr ClusterImplBase::create(const envoy::api::v2::Cluster& cluster, | |||
return std::move(new_cluster); | |||
} | |||
|
|||
Stats::ScopePtr ClusterImplBase::generateStatsScope(const envoy::api::v2::Cluster& config, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not need to be a member function. It can be a global function in the annoymous namespace.
: std::string(config.alt_stat_name()))); | ||
} | ||
|
||
Network::TransportSocketFactoryPtr ClusterImplBase::createTransportSocketFactory( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be a global function inside annoymous namespace
Stats::IsolatedStoreImpl load_report_stats_store_; | ||
mutable ClusterLoadReportStats load_report_stats_; | ||
Network::TransportSocketFactoryPtr transport_socket_factory_; | ||
Stats::ScopePtr stats_scope_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is stats_scope_ needed in this class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll need this to own the scope, it is passed by rvalue-ref.
createTransportSocketFactory(cluster, *stats_scope_.get(), | ||
ssl_context_manager, secret_manager, | ||
init_manager_), | ||
stats_scope_.release(), added_via_api)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need following in the ClusterImplBase constuctor
ClusterImplBase:: ClusterImplBase(...) {
auto stat_scope = generateStatsScope(cluster, stats);
auto socket_factory = createTransportSocketFactory(cluster, *stat_scope, ...);
info_ = std::make_unique<ClusterInfoImpl>(..., socket_factory, std::move(stat_scope), ...);
...
}
@lizan @PiotrSikora This PR is ready for review. Could you help to review it? Thanks |
* @return true if the config is valid. Only when SDS dynamic secret is needed, but has not been | ||
* downloaded yet, the config is invalid. | ||
*/ | ||
virtual bool isValid() const PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would isReady
be better name for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
||
private: | ||
std::unordered_map<std::string, Ssl::TlsCertificateConfigPtr> tls_certificate_secrets_; | ||
void removeDeletedSecretProvider(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removeExpiredSecretProvider
?
public: | ||
ClusterInfoImpl(const envoy::api::v2::Cluster& config, | ||
const envoy::api::v2::core::BindConfig& bind_config, Runtime::Loader& runtime, | ||
Stats::Store& stats, Ssl::ContextManager& ssl_context_manager, | ||
Secret::SecretManager& secret_manager, bool added_via_api); | ||
Network::TransportSocketFactoryPtr&& socket_factory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lizan sorry for the missing context. I had them do this. Before we were passing a bunch of stuff into ClusterInfo and also exposing accessors, which IMO was pretty confusing as Info is multi-threaded where is the cluster base is only for the main thread. IMO what is done now is clearer and easier to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explanation. I'm OK with this, just was curious why :)
Stats::IsolatedStoreImpl load_report_stats_store_; | ||
mutable ClusterLoadReportStats load_report_stats_; | ||
Network::TransportSocketFactoryPtr transport_socket_factory_; | ||
Stats::ScopePtr stats_scope_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll need this to own the scope, it is passed by rvalue-ref.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -54,18 +55,29 @@ class ContextConfigImpl : public virtual Ssl::ContextConfig { | |||
unsigned minProtocolVersion() const override { return min_protocol_version_; }; | |||
unsigned maxProtocolVersion() const override { return max_protocol_version_; }; | |||
|
|||
bool isValid() const override { | |||
// either secret_provider_ is nullptr or secret_provider_->secret() is NOT nullptr. | |||
return !secret_provider_ || secret_provider_->secret(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this function actually end up getting called. Is it always on the main thread? Or will it be called from multiple threads? I'm not sure this is thread safe. Can you explain the full flow? This ends up being quite confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is called from the main thread. It is called in the ContextManagerImpl::createClientContext() and createServerContext(), if pass-in ContextConfig is not valid, return nullptr for SocketFactory.
In the next PR, SocketFactory will register a UpdateCallback to SDS_api, if SDS secret is fetched, it will call the onUpdate(), SocketFactory will call ContextManagerImpl::createClientContext() again to create a valid Context if ContextConfig is valid. This is called in the main thread too since SDS_api onConfigUpdate will be called in the main thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thanks. I think we will need proper locking around the shared_ptr swaps and we might want to use TLS at a later time, but this should work for now and we can figure it out in the follow up PR.
public: | ||
ClusterInfoImpl(const envoy::api::v2::Cluster& config, | ||
const envoy::api::v2::core::BindConfig& bind_config, Runtime::Loader& runtime, | ||
Stats::Store& stats, Ssl::ContextManager& ssl_context_manager, | ||
Secret::SecretManager& secret_manager, bool added_via_api); | ||
Network::TransportSocketFactoryPtr&& socket_factory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lizan sorry for the missing context. I had them do this. Before we were passing a bunch of stuff into ClusterInfo and also exposing accessors, which IMO was pretty confusing as Info is multi-threaded where is the cluster base is only for the main thread. IMO what is done now is clearer and easier to understand.
*/ | ||
void onPreInitComplete(); | ||
|
||
/** | ||
* Called by every concrete cluster after all Sds api targets registered at SDS init manager are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove SDS from comments, this is generic for the future.
@@ -22,6 +22,10 @@ void ContextManagerImpl::releaseContext(Context* context) { | |||
|
|||
ClientContextPtr ContextManagerImpl::createSslClientContext(Stats::Scope& scope, | |||
const ClientContextConfig& config) { | |||
if (!config.isValid()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this going to work on the future? Can you take me through the logic that is coming in the follow up update PRs so I understand? Right now, this will return a null context which is fine, things will fail (and we can have integration tests in this PR just proving things fail). But in the future, how is the context going to get updated inside the transport factory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this from my other comment.
In the next PR, SocketFactory will register a UpdateCallback to SDS_api, if SDS secret is fetched or updated, it will call the SocketFactory::onUpdate(), SocketFactory will call ContextManagerImpl::createClientContext() to create a valid Context if ContextConfig is valid. This is called in the main thread too since SDS_api::onConfigUpdate will be called in the main thread.
@JimmyCYJ @qiwzhang @lizan two other things that we need (can be done as follow up PRs):
Can you please add some TODOs to this PR for ^ and we can do both of these in follow ups? Thank you. |
@mattklein123 Thanks for the advice, will add TODO in this PR. |
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Signed-off-by: Wayne Zhang <qiwzhang@google.com>
Just updated this PR with all changes needed to support dynamic SDS. This PR only serves as a reference PR for complete implementation. Its changes will be split into smaller PR to actual code review. |
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
@mattklein123 Yes, I am closing it. Thanks! |
Implement SDS api which fetches dynamic secrets.
Description:
Implement SDS api that fetches secrets from remote SDS server. Secrets are stored in Secret Manager.
Updating secrets at listener and cluster will follow.
From the high level design PR: #3748,
or a smaller PR (istio#4) which has removed the changes from this PR and #3754.
Here is the lifetime and ownership of DynamicSecretProvider.
When all ListenImpl instances are going away, DynamicSecretProvider object will be deleted.
Here is the relationship between SecretCallback and DynamicSecretProvider.
Risk Level: Low
Fixes #1194
Signed-off-by: Jimmy Chen jimmychen.0102@gmail.com