-
Notifications
You must be signed in to change notification settings - Fork 552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OpenSSL Library Context Service #16927
Add OpenSSL Library Context Service #16927
Conversation
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45770#018e16c4-73d5-47fc-b484-66d128498dcd ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45770#018e16c4-73d2-474f-b56c-6f845761e7e6 ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45770#018e16ec-9f37-4b6c-9e88-f3a8c3cf7f2b ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45989#018e2eb6-89c6-4365-b461-2db5f7e67d1d |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meat and potatoes lgtm. still need to go through the big unit testing diff in a bit more detail.
template<class R> | ||
using initialize_result = result<R, std::string>; | ||
|
||
ss::logger lg("ossl-library-context-service"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to break this out into it's own compilation unit in case upcoming additions to crypto
need logging facilities as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally did actually... but there are some shared items between the two... open to going back to that idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idk, up to you. I don't have a reliable heuristic for deciding logger scope
} | ||
|
||
if (fips_mode) { | ||
if (!EVP_set_default_properties(ctx, "fips=yes")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"fips=yes"
😐
* by the Apache License, Version 2.0 | ||
*/ | ||
|
||
#pragma once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would it be a good idea to namespace these as a matter of course?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah probably a good idea
src/v/README.md
Outdated
@@ -13,6 +13,7 @@ compat | Compatibility checks | | |||
compression | utilities supporting compression/decompression of many types | | |||
config | Redpanda cluster level and node level configuration options | | |||
container | Generic Redpanda specific containers and data structures | | |||
crypto | Middelware library used to perform cryptographic operations | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo with Middleware
* @param thread_worker Thread worker used to perform I/O operations outside | ||
* of the Seastar context | ||
* @param config_file Path to the OpenSSL config file | ||
* @param module_path Path to the directory that contains the FIPS module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be optional? Then you may not really need the fips mode bit...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which one? The config_file? Sorry GH is showing me 4 lines :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The module_path is only valid if fips_mode::yes right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we wanted to use the legacy provider we would need it... but I don't think we would (no need for MD2 or RC4)
In fact the config file really is only needed in FIPS mode (I have tests that show that starting up in non FIPS mode without a config works fine).
I'll come up with some thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe there is just std::optional<fips_config>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of like keeping the flag. Then I can check that both config file and module path are provided when in FIPS mode and error out if they're not provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is sort of an awkward thing to express in the typesystem in C++ and std::variant<no_fips_tag, fips_config>
is awkward. I'll stop trying to be pedantic about the typesystem enforcing you can't do silly things (so you don't even need to handle errors of the required config not being present because the typesystem doesn't allow you to set the flag true without giving values).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I appreciate where you're coming from =). I guess part of me is trying to future proof for an event where we do need module_path or conf_File even when not in FIPS mode. This may come up when trying to add support for MD5 (which the FIPS module doens't provide but the base one does)
src/v/crypto/ssl_utils.h
Outdated
fn(ptr); | ||
} else if constexpr (std::is_invocable_r_v<int, decltype(fn), T*>) { | ||
if (1 != fn(ptr)) { | ||
throw ossl_error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to throw in a destructor, which I am pretty sure is a bad thing to do...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I guess i'll just have to ignore it then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest asserting if they fail, but if it's actually expected they can fail at runtime, then I would suggest logging? I don't know.
How where you expecting to handle these exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the only one that needs it is the OSSL_PROVIDER_unload which only happens at shut down. Maybe I'll wrap this up in a function that logs a warning if it fails
v::gtest_main | ||
LABELS crypto | ||
ARGS "-- -c 2" | ||
ENV "OPENSSL_CONF=${CMAKE_CURRENT_BINARY_DIR}/test/openssl_conf.cnf;MODULE_DIR=${REDPANDA_DEPS_INSTALL_DIR}/lib64/ossl-modules" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This second bit doesn't work for the open source build does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it won't. But then again running in FIPS mode isn't really supported for the open source build. The user would have to build the FIPS module and configure it for it to work.
I check to see if the directory exists and if it doesn't skip the FIPS tests? Do we have any other unit tests that do not run in the oss build?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any other unit tests that do not run in the oss build?
Not that I know of - FWIW I don't know if our unit test cmake wrapper stuff works well in the open source build at all, but we could just disable these tests (i.e. not build them) in non OSS build with some CMake flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still need resolving right? You can use GTEST_SKIP()
to skip tests in the setup phase I think.
src/v/crypto/ossl_context_service.cc
Outdated
_shard_ctx.reset(); | ||
_shard_ctx = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW these two lines do the same thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops meant to replace 228 with 227 and not have both :)
src/v/crypto/ossl_context_service.cc
Outdated
_thread_worker_ctx.reset(); | ||
_thread_worker_ctx = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These do the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double whammy of goodness
* These contexts will be fetched by the OpenSSL library when performing | ||
* cryptographic operations. | ||
*/ | ||
class ossl_context_service final { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the lifecycle of this service? Is it bound to the full application lifecycle? I'm assuming it exists on every core? Should we mention the special responsibilities of core0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add that to the comment, but yes full application lifecycle
/* | ||
* Copyright 2024 Redpanda Data, Inc. | ||
* | ||
* Use of this software is governed by the Business Source License | ||
* included in the file licenses/BSL.md | ||
* | ||
* As of the Change Date specified in that file, in accordance with | ||
* the Business Source License, use of this software will be governed | ||
* by the Apache License, Version 2.0 | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What stuff is going to be RCL vs BSL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That... is an interesting question. Because FIPS mode is definitely an enterprise feature, but the use of OpenSSL is not. I guess I"d argue this is BSL since it's available in the main product regardless of FIPS or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair I wonder if it's worth while to have an implementation that is "fips mode" that is RCL, but if that's just passing a flag to openssl, I guess there isn't anything interesting there...
is_fips_mode fips_mode() const; | ||
|
||
private: | ||
class impl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is a private impl neccessary here? I can't think of a future instance there could be a second inner implementation of this class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is a private impl neccessary here? I can't think of a future instance there could be a second inner implementation of this class
I assume is is for a compiler firewall; to hide #include <openssl/*>
from the header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, hides away all the annoying OpenSSL stuff. Pretty much why I did the same pattern for the crypto library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right, sweet definitely a great idea
src/v/crypto/ossl_context_service.cc
Outdated
impl& operator=(impl&&) noexcept = delete; | ||
|
||
ss::future<> start() { | ||
lg.trace("Starting service..."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this log occurs once i think its fine to be at a higher level, also it could print the service name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Service name should be printed as the logger name, yes?
}); | ||
|
||
if (init_resp.has_error()) { | ||
throw exception(init_resp.assume_error()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throw ossl_error ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was constructed from ossl_error::build_error()
which steps through the OpenSSL error stack to build up the message. I didn't want to throw from the functions that were running on the alien thread. Also ossl_error
inherits from crypto::exception
.
src/v/crypto/ssl_utils.h
Outdated
if constexpr (std::is_invocable_v<decltype(fn), T*>) { | ||
fn(ptr); | ||
} else if constexpr (std::is_invocable_r_v<int, decltype(fn), T*>) { | ||
if (1 != fn(ptr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do all of the deleters return 1 on success?
return msg + internal::ossl_error::build_error(); | ||
} | ||
|
||
initialize_result<initialize_return> initialize_openssl( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
using inititalize_result = result<initialize_return, std::string>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the two initialize functions return different structures
src/v/crypto/ossl_context_service.cc
Outdated
std::string_view conf_file, | ||
is_fips_mode fips_mode) { | ||
if (!conf_file.empty()) { | ||
lg.debug("Attempting to load config file {}", conf_file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to ossl config file
src/v/crypto/ossl_context_service.cc
Outdated
// On main shard, load the 'null' provider to the default context | ||
// This prevents the default context from performing any | ||
// cryptographic operation | ||
_defctxnull = internal::OSSL_PROVIDER_ptr( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: This means _defctxnull
is also expected to always be nullptr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no _defctxnull is not a nullptr. It loads a null provider into the global context so the global context can not be used to perform any crypto operations.
// perform | ||
_thread_worker_ctx = internal::OSSL_LIB_CTX_ptr(OSSL_LIB_CTX_new()); | ||
auto init_resp = co_await _thread_worker.submit([this] { | ||
return initialize_worker_thread( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Could the context made on shard0 clash with the shard0 context _shard_ctx
since ossl will be looking for one context per shard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _thread_worker_ctx
will get assigned to the worker thread's context via the call to OSSL_LIB_CTX_set0_default
"Created new shard context for {} replacing {}", | ||
fmt::ptr(_shard_ctx.get()), | ||
fmt::ptr(_old_context)); | ||
auto init_resp = co_await _thread_worker.submit([this] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: How does this intended to work? The memory for the contexts is allocated on the correct shard but then the work to initialize and set them is dispatched to thread worker which may not run the function on the calling shard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory allocated in an alien thread uses the system allocator, not the seastar allocator. The seastar allocator has a slow path for freeing memory not allocated on that shard, but it's generally not recommended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 , i was referring to _shard_ctx
which is allocated within a seastar shards context, due to the call being a few lines above the call to .submit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To further clarify my initial comment, I would assume we would want to allocate and initialize the OSSLContext on all shards , allocation and initialization occurring in each respective shard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allocation and assignment of the OpenSSL library context happens per shard. The initialization of the library context needs to happen in an alien thread because there's a bunch of system call stuff that OpenSSL does under the hood (namely dlloading the providers and opening and reading files, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chatted with @graphcareful , I'll add some comments that hopefully will clear this up :)
src/v/README.md
Outdated
@@ -13,6 +13,7 @@ compat | Compatibility checks | | |||
compression | utilities supporting compression/decompression of many types | | |||
config | Redpanda cluster level and node level configuration options | | |||
container | Generic Redpanda specific containers and data structures | | |||
crypto | Middelware library used to perform cryptographic operations | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also missing license: https://github.com/redpanda-data/redpanda/blob/dev/licenses/third_party.md
is_fips_mode fips_mode() const; | ||
|
||
private: | ||
class impl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is a private impl neccessary here? I can't think of a future instance there could be a second inner implementation of this class
I assume is is for a compiler firewall; to hide #include <openssl/*>
from the header.
internal::OSSL_PROVIDER_ptr base_provider; | ||
}; | ||
|
||
std::string make_ssl_error_response(const std::string& msg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Feels strange that initialising libcrypto provider has ssl in the name. I think we should stick to "ossl" for openssl (libcrypto and libssl).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you referring to OSSL_PROVIDER_ptr
? Sorry not following the context :)
53797f0
to
8d35f8c
Compare
Force push
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my vscode linter decided to lint this file... if people object to the changes I'll revert them
licenses/third_party.md
Outdated
@@ -32,12 +31,13 @@ please keep this up to date with every new library use. | |||
| lksctp-tools | LGPL v2.1 | | |||
| lz4 | BSD 2 | | |||
| nettle | LGPL v3 | | |||
| OpenSSL | Apache License 2 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to specify v3? I know older versions had different licensing.
v::gtest_main | ||
LABELS crypto | ||
ARGS "-- -c 2" | ||
ENV "OPENSSL_CONF=${CMAKE_CURRENT_BINARY_DIR}/test/openssl_conf.cnf;MODULE_DIR=${REDPANDA_DEPS_INSTALL_DIR}/lib64/ossl-modules" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still need resolving right? You can use GTEST_SKIP()
to skip tests in the setup phase I think.
8d35f8c
to
c466590
Compare
Force push
|
Signed-off-by: Michael Boquard <michael@redpanda.com>
Signed-off-by: Michael Boquard <michael@redpanda.com>
Added service to support loading shard local library contexts Signed-off-by: Michael Boquard <michael@redpanda.com>
Signed-off-by: Michael Boquard <michael@redpanda.com>
Signed-off-by: Michael Boquard <michael@redpanda.com>
Signed-off-by: Michael Boquard <michael@redpanda.com>
c466590
to
2275bb4
Compare
Force push
|
Fixes: https://github.com/redpanda-data/core-internal/issues/1143
Backports Required
Release Notes