-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support listening on UNIX domain sockets. #2701
Conversation
Since the `Address` protobuf type is already used for listener addresses, there is no API change. Both file-like and Linux's abstract socket addresses are supported. There's a comment in `ProdListenerComponentFactory` that implies UNIX sockets aren't supported by hot restart. I didn't try to get that working, since I'm planning to use this feature in a container where Envoy never hot restarts. Signed-off-by: John Millikin <jmillikin@stripe.com>
afa5246
to
0eba9e9
Compare
I'm having some trouble writing an integration test for this, since the existing test class hierarchy ( |
ASSERT(address->type() == Network::Address::Type::Ip || | ||
address->type() == Network::Address::Type::Pipe); | ||
if (address->type() == Network::Address::Type::Pipe) { | ||
return std::make_shared<Network::UdsListenSocket>(address->asString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick non-full review comment: This is not going to support hot restart. I think it should be really easy to support hot restart by folding this logic into the hot restart duplication logic below. We already pass a string for the address name to the parent process, where we can then look up the socket by name. I'm pretty sure that it will mostly just work if you add code below to handle creating a "unix://" socket name and passing it.
Also, we should definitely have an integration test for this, and optimally add a unix socket to the hot restart test also. @ggreenway or @alyssawilk can likely guide you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, updated this code to use the same hot-restart recovery logic for UDS as is used for TCP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good.
I'd see if you can override the connect code in the test tcp client of the integration tests; I don't think you need to do any UDS testing on the upstream side.
Also, please add some coverage in the hot-restart integration test.
fd_ = local_address_->socket(Address::SocketType::Stream); | ||
UdsListenSocket::UdsListenSocket(const Address::InstanceConstSharedPtr& address) | ||
: ListenSocketImpl(-1, address) { | ||
fd_ = address->socket(Address::SocketType::Stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be directly passed to ListenSocketImpl on the previous line? Then the code would mirror the TcpListenSocket constructor a few lines up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an interaction here with the filesystem (see call to remove()
in the old code) that I haven't fully understood yet. If I can verify in the tests that it's safe to remove within the address impl, I'll clean up this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. If it can't be cleaned up, please add a comment explaining why it is that way.
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
c5c1eb8
to
3144b52
Compare
Signed-off-by: John Millikin <jmillikin@stripe.com>
Integration tests added for UNIX listeners and hot restart. |
Signed-off-by: John Millikin <jmillikin@stripe.com>
|
||
std::string getAdminSocketName() { | ||
return abstract_namespace_ | ||
? "@/envoy-tests/uds-integration-test/admin" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need some variation in the name to avoid collisions between concurrent test runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
5f5bede
to
3faa599
Compare
return abstract_namespace_ | ||
? "@/envoy-tests/uds-integration-test/admin" | ||
: TestEnvironment::unixDomainSocketPath("uds-integration-test.admin.sock"); | ||
? fmt::format("@/envoy-tests/{}/{}/admin", testInfo->test_case_name(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about the case where multiple top-level builds are running concurrently on a box, like coverage and ASAN. These abstract-namespace UDS paths are global to the box, which is great for their intended use case, and difficult for making these tests work. @htuch is there any precedent anywhere else in the tests for a collision type like this? Should we put a random number in the path? Or maybe call mktmp and use the pathname it returns?
Sorry to be so picky about this; just trying to avoid flaky tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the existing UDS integration test is using abstract sockets for its admin address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is an issue, see what was done recently in
envoy/test/exe/main_common_test.cc
Line 47 in b6b1564
/** |
unixDomainSocketAbstract()
util that does the same as TestEnvironment::unixDomainSocketPath
for abstract sockets? We could have a per-instance global for the entropy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I changed this to take the output of TestEnvironment::unixDomainSocketPath
and prepend "@"
for abstract sockets. This should be sufficiently random, since the file paths are already sharing a system-wide temp dir.
@@ -280,6 +280,7 @@ int PipeInstance::bind(int fd) const { | |||
return ::bind(fd, reinterpret_cast<const sockaddr*>(&address_), | |||
offsetof(struct sockaddr_un, sun_path) + address_length_); | |||
} | |||
unlink(address_.sun_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we RELEASE_ASSERT
that this is successful? Is this behavior tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a test. If this fails, the bind will return an appropriate errno depending on whether the existing file is a pre-existing socket, or some other file type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just add a comment here. I generally find that when reading code, if an error is deliberately ignored it's helpful to know this vs. suspecting some missing error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
source/common/network/utility.h
Outdated
@@ -201,6 +201,9 @@ class Utility { | |||
*/ | |||
static absl::uint128 Ip6htonl(const absl::uint128& address); | |||
|
|||
static Address::InstanceConstSharedPtr | |||
parseProtobufAddress(const envoy::api::v2::core::Address& proto_address); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to protobufAddressToAddress
for symmetry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This lets `--max-obj-name-len 500` be used in validation mode, and fixes assertion errors from the name length being calculated during the construction of `HotRestartImpl`. Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work. This is great. Just 2 minor comments if you feel like it.
@@ -261,6 +261,11 @@ PipeInstance::PipeInstance(const sockaddr_un* address, socklen_t ss_len) | |||
} | |||
|
|||
PipeInstance::PipeInstance(const std::string& pipe_path) : InstanceBase(Type::Pipe) { | |||
if (pipe_path.size() >= sizeof(address_.sun_path)) { | |||
throw EnvoyException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get an EXPECT_THROW_WITH_MESSAGE test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I used WITH_REGEX
to avoid making too many assumptions about sizeof(sun_path)
.
source/common/network/utility.cc
Outdated
case envoy::api::v2::core::Address::kPipe: | ||
return std::make_shared<Address::PipeInstance>(proto_address.pipe().path()); | ||
default: | ||
throw EnvoyException("Address must be a socket or pipe: " + proto_address.DebugString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would do NOT_REACHED
here. This should never happen w/ schema checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Signed-off-by: John Millikin <jmillikin@stripe.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo comments.
@@ -280,6 +280,7 @@ int PipeInstance::bind(int fd) const { | |||
return ::bind(fd, reinterpret_cast<const sockaddr*>(&address_), | |||
offsetof(struct sockaddr_un, sun_path) + address_length_); | |||
} | |||
unlink(address_.sun_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just add a comment here. I generally find that when reading code, if an error is deliberately ignored it's helpful to know this vs. suspecting some missing error handling.
@@ -0,0 +1,38 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a v2 YAML? We're trying to avoid adding any new v1 configs to integration tests. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started out with v2 yaml, then went to v1 json because the test helpers aren't compatible with the v2 configs at all.
sed -e "s#{{ socket_dir }}#${SOCKET_DIR}#" | \ | ||
sed -e "s#{{ ip_loopback_address }}#127.0.0.1#" | \ | ||
cat > "${HOT_RESTART_JSON_UDS}" | ||
JSON_TEST_ARRAY+=("${HOT_RESTART_JSON_UDS}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, hotrestart_test
is also v1 JSON. This makes me sad, but no need to fix in this PR.
if (!abstract_namespace_) { | ||
return path; | ||
} | ||
return "@" + path; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and the above method could be slightly refactored to reduce common code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Signed-off-by: John Millikin <jmillikin@stripe.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
…ogle import. envoyproxy#2701 hardcoded IPv4 for the integration test. When parameterizing this, envoyproxy#2829 was encountered, so the corresponding test was disabled. Signed-off-by: Harvey Tuch <htuch@google.com>
* fix duration unit conversion * format * wasm gen * add test
Rename release_validator to validation_cleanup. The release_validator function does not actually release the validator. On iOS is does nothing and on Android it merely detaches the JVM from the current thread. So "cleanup validation" is a better description of it's function than "release validator" and matches the existing description "to clean up after validation completion." Signed-off-by: Ryan Hamilton rch@google.com Signed-off-by: JP Simard <jp@jpsim.com>
Rename release_validator to validation_cleanup. The release_validator function does not actually release the validator. On iOS is does nothing and on Android it merely detaches the JVM from the current thread. So "cleanup validation" is a better description of it's function than "release validator" and matches the existing description "to clean up after validation completion." Signed-off-by: Ryan Hamilton rch@google.com Signed-off-by: JP Simard <jp@jpsim.com>
Description:
Since the
Address
protobuf type is already used for listener addresses, there is no API change. Both file-like and Linux's abstract socket addresses are supported.Risk Level: Medium
Testing:
Unit test for the proto->address conversion, integration tests for listeners and hot restart, and I've done some local testing on a MacOS machine to verify basic functionality.
Docs Changes:
N/A
Release Notes:
Added support for listening on UNIX domain sockets.
Fixes #1605
Fixes #2697
Signed-off-by: John Millikin jmillikin@stripe.com