Verify socket passing in hot restart test#805
Verify socket passing in hot restart test#805mattklein123 merged 14 commits intoenvoyproxy:masterfrom
Conversation
|
@htuch for first pass. |
|
|
||
| .. option:: -a <path string>, --admin-address-path <path string> | ||
|
|
||
| *(optional)* The output file path where the admin address will be written. |
| /** | ||
| * @return int the total number of listeners | ||
| */ | ||
| virtual int numListeners() PURE; |
There was a problem hiding this comment.
Do we strictly need this? A caller could just iterate incrementally on getListenSocketByIndex until a nullptr is returned for the purposes we need in this PR.
| * Constructs a Json Object from a String. | ||
| */ | ||
| static ObjectPtr LoadFromString(const std::string& json); | ||
|
|
There was a problem hiding this comment.
As an aside, looks like LoadFromString and peers violate the coding style in their naming. A distinct cleanup PR might be nice to make this file consistent.
| for (int i = 0; i < server_.numListeners(); ++i) { | ||
| listeners.push_back(server_.getListenSocketByIndex(i)->localAddress()->asString()); | ||
| } | ||
| response.add(fmt::format("{}", Json::Factory::listAsJsonString(listeners))); |
There was a problem hiding this comment.
I don't think you need the fmt::format here.
|
|
||
| if (options.adminAddressPath().length() > 0) { | ||
| std::ofstream admin_address_file(options.adminAddressPath()); | ||
| ASSERT(admin_address_file.is_open()); |
There was a problem hiding this comment.
There should be better error handling here - it's easy for the user to supply a bad path, somewhere not writable, etc. We should log().criticial the failure and return false (add return code). The exception catching in the caller is too focused on the JSON case, there are other reasons we can fail here, so that should also be refactored to reflect that.
| def ReplaceListenerAddresses(original_json, admin_address, no_port_change): | ||
| # Get original listener addresses | ||
| with open(original_json, 'r') as original_json_file: | ||
| parsed_json = json.load(original_json_file, object_pairs_hook=OrderedDict) |
There was a problem hiding this comment.
Am I correct in thinking we're using OrderedDict here just so we get deterministic output? If so, maybe just add a comment.
| admin_response = admin_conn.getresponse() | ||
|
|
||
| if not admin_response.status == 200: | ||
| admin_conn.close() |
There was a problem hiding this comment.
I would use a try: ... finally: admin_conn.close() block here.
| if no_port_change: | ||
| return CheckNoChange(original_listeners, updated_listeners) | ||
| else: | ||
| for updated, original in zip(updated_listeners, original_listeners): |
There was a problem hiding this comment.
Nit: maybe s/updated_listeners/discovered_listeners/
| return CheckNoChange(original_listeners, updated_listeners) | ||
| else: | ||
| for updated, original in zip(updated_listeners, original_listeners): | ||
| original['address'] = "tcp://" + updated |
There was a problem hiding this comment.
Nit: the style guide says we need to have consistent use of "" and ''. Basically, prefer '' over "" unless there is legacy in a file you want to be consistent with.
| while not os.path.exists(admin_address_path): | ||
| time.sleep(1) | ||
| counter += 1 | ||
| if counter > 20: |
There was a problem hiding this comment.
You should probably document that this also waits for the admin address file to appear on the filesystem. Also, best not to use magic numbers like 20, make this a constant at the top of the file, e.g. ADMIN_FILE_WAIT_TIMEOUT_SECS.
BTW, as mentioned in person, the really robust way to do this is with something like inotify, but that's not worth doing for a simple test like this.
htuch
left a comment
There was a problem hiding this comment.
Looking solid, almost ready for external review.
| Http::Code AdminImpl::handlerListenerAddresses(const std::string&, Buffer::Instance& response) { | ||
| std::list<std::string> listeners; | ||
| int listener_index = 0; | ||
| while (server_.getListenSocketByIndex(listener_index) != nullptr) { |
There was a problem hiding this comment.
Can write this as while (auto listen_socket = server_.getListenSocketByIndex(listener_index) ) { ...
| Http::Code handlerServerInfo(const std::string& url, Buffer::Instance& response); | ||
| Http::Code handlerStats(const std::string& url, Buffer::Instance& response); | ||
| Http::Code handlerQuitQuitQuit(const std::string& url, Buffer::Instance& response); | ||
| Http::Code handlerListenerAddresses(const std::string& url, Buffer::Instance& response); |
There was a problem hiding this comment.
Might want to make this more generic, e.g. handlerListenerInfo, to allow later extension for more than addresses.
|
|
||
| std::list<std::string> list4 = {"127.0.0.1:46465", "127.0.0.1:52211", "127.0.0.1:58941"}; | ||
| EXPECT_STREQ("[\"127.0.0.1:46465\",\"127.0.0.1:52211\",\"127.0.0.1:58941\"]", | ||
| Json::Factory::listAsJsonString(list4).c_str()); |
There was a problem hiding this comment.
I wonder if it would be more robust to ::listAsJsonString and then parse it back with RapidJSON and make sure you get back an equivalent C++ std::list. That way any changes to the RapidJSON output formatting (e.g. whitespace use) don't cause the test to fail.
| exports_files([ | ||
| "gen_git_sha.sh", | ||
| "git_sha_rewriter.py", | ||
| "socket_passing.py", |
There was a problem hiding this comment.
This should now be declared with envoy_py_test_binary (hot off the press, since #823, which is still inflight but will probably be merged before this PR).
|
|
||
| echo "Updating original config json listener addresses" | ||
| UPDATED_HOT_RESTART_JSON="${TEST_TMPDIR}"/hot_restart_updated.json | ||
| tools/socket_passing.py "-o" "${HOT_RESTART_JSON}" "-a" "${ADMIN_ADDRESS_PATH_0}" \ |
There was a problem hiding this comment.
This should be ${TEST_RUNDIR}"/tools/socket_passing.py since #819 (sorry about the churn!).
| TestEnvironment::temporaryPath("some/unlikely/bad/path.prof"), | ||
| Network::Utility::resolveUrl("tcp://127.0.0.1:0"), server_); | ||
| AdminImpl admin_bad_profile_path( | ||
| "/dev/null", TestEnvironment::temporaryPath("some/unlikely/bad/path.prof"), "", |
There was a problem hiding this comment.
Yeah, I think this points to making AdminImplOptions something worth considering.
| AdminImpl admin_bad_address_out_path( | ||
| "/dev/null", cpu_profile_path_, bad_path, | ||
| Network::Test::getSomeLoopbackAddress(Network::Address::IpVersion::v4), server_); | ||
| EXPECT_FALSE(std::ifstream(bad_path)); |
There was a problem hiding this comment.
Is it possible to EXPECT on the log output? I'm not sure, but if it's easy to do it might be worth doing that as well.
There was a problem hiding this comment.
So far, I haven't found a way to check log outputs.
There was a problem hiding this comment.
The logger could be mocked and the log statement EXPECTed, but, the way the logger object is brought into existence in many places including in AdminImpl makes this difficult. It's super convenient to be able to inherit in any class from Logger::Loggable and just get the magic log() method but this kills the flexibility of being able to pass in a mocked logger. Some test code added to the logger could provide a special mode where you can check what's been logged, but probably this is nontrivial and best done as a separate PR.
There was a problem hiding this comment.
You could also just make the error a PANIC() and then do a death test. I don't have a strong preference here.
| # exisiting json config file with updated listener addresses. This script is | ||
| # currently called in the hot restart integration test to update listener | ||
| # addresses bound to port 0 in the intial json config file. With the | ||
| # -n or -no_port_change option, this script does not update an exisiting json |
| import sys | ||
| import time | ||
|
|
||
| ADMIN_FILE_TIMEOUT_SECS = 20 |
There was a problem hiding this comment.
Add a comment explaining what this means.
| return False | ||
| discovered_listeners = json.loads(admin_response.read()) | ||
| except: | ||
| sys.stderr.write('Cannot connect to admin.\n') |
There was a problem hiding this comment.
Generally it's scary to do a wildcard exception catch in Python and then not propagate further. In this case, it means exceptions will be hard to debug as we lose info. You could do except e: \n sys.stderr.write('Cannot connect to admin: %s' % e) or something like that. Or re-raise with
except e:
sys.stdout.write(...)
raise e
or create a custom exception class. Probably the first is fine for this test tool.
| #pragma once | ||
|
|
||
| #include <chrono> | ||
| #include <fstream> |
There was a problem hiding this comment.
Nit: shouldn't this be in admin.cc only?
| std::vector<Json::ObjectPtr> output3 = | ||
| Json::Factory::LoadFromString(Json::Factory::listAsJsonString(list3))->asObjectArray(); | ||
| EXPECT_EQ(output3.size(), 4); | ||
| EXPECT_STREQ(output3[0]->asString().c_str(), "one"); |
There was a problem hiding this comment.
Isn't it possible to EXPECT_EQ with std::string? Might be cleaner than dropping to C strings.
| std::vector<Json::ObjectPtr> listener_info = | ||
| Json::Factory::LoadFromString(response->body())->asObjectArray(); | ||
| for (std::size_t index = 0; index < listener_info.size(); index++) { | ||
| EXPECT_STREQ( |
There was a problem hiding this comment.
Same point on EXPECT_EQ here and elsewhere.
| @@ -1,6 +1,15 @@ | |||
| package(default_visibility = ["//visibility:public"]) | |||
There was a problem hiding this comment.
Can you switch this to envoy_package since #828, thanks!
| import time | ||
|
|
||
| # Seconds to wait for the admin address output file to appear. The script exits | ||
| # if the file is not found. |
| sys.stderr.write('Cannot connect to admin: %s\n' % e) | ||
| return False | ||
| else: | ||
| if not len(discovered_listeners) == len(original_listeners): |
There was a problem hiding this comment.
A != B is clearer than not A == B. The second is in danger of being read (not A) == B, which isn't what it means, but the casual reader might easily make that mistake.
| # if the file is not found. | ||
| ADMIN_FILE_TIMEOUT_SECS = 20 | ||
|
|
||
| def ReplaceListenerAddresses(original_json, admin_address, updated_json): |
There was a problem hiding this comment.
I think this name is now legacy, it should probably be PrintListenerAddresses or something you think is better.
| with open(admin_address_path, 'r') as admin_address_file: | ||
| admin_address = admin_address_file.read() | ||
|
|
||
| result = ReplaceListenerAddresses(args.original_json, admin_address, args.updated_json) |
htuch
left a comment
There was a problem hiding this comment.
This is ready for wider review. @mattklein123
| @@ -1,5 +1,6 @@ | |||
| load( | |||
| "//bazel:envoy_build_system.bzl", | |||
| "envoy_py_test_binary", | |||
There was a problem hiding this comment.
Nit: prefer alphabetical ordering here.
|
@hennna can we get this passing tests? If you want to wait until cmake removal that's fine. It should be soon. |
|
Nevermind looks like you fixed it. Will look in a bit. |
|
The fix is only for the alphabetical ordering comment by @htuch. I'll wait for cmake removal rather than add changes that will need to be reverted. |
|
OK let's just wait until this is passing tests then we can take a quick final pass through. |
|
@hennna please merge master. This should pass tests now. |
|
@mattklein123 updated |
mattklein123
left a comment
There was a problem hiding this comment.
looks good, a few small comments.
| {"/stats", "print server stats", MAKE_HANDLER(handlerStats)}, | ||
| {"/listeners", "print listener addresses", MAKE_HANDLER(handlerListenerInfo)}} { | ||
|
|
||
| if (address_out_path.length() > 0) { |
There was a problem hiding this comment.
nit: !address_out_path.empty()
|
|
||
| *(required)* The path to the :ref:`JSON configuration file <config>`. | ||
|
|
||
| .. option:: -a <path string>, --admin-address-path <path string> |
There was a problem hiding this comment.
Can we drop the -a short option. This is a very uncommon option and would rather it be fully explicit. (Obviously need to fix tclap stuff below).
| public: | ||
| TestOptionsImpl(const std::string& config_path) : config_path_(config_path) {} | ||
| TestOptionsImpl(const std::string& config_path) | ||
| : config_path_(config_path), admin_address_path_("") {} |
There was a problem hiding this comment.
nit: admin_address_path_("") not needed, as its the default.
| AdminImpl admin_bad_address_out_path( | ||
| "/dev/null", cpu_profile_path_, bad_path, | ||
| Network::Test::getSomeLoopbackAddress(Network::Address::IpVersion::v4), server_); | ||
| EXPECT_FALSE(std::ifstream(bad_path)); |
There was a problem hiding this comment.
You could also just make the error a PANIC() and then do a death test. I don't have a strong preference here.
|
I'll defer the log.critical vs. PANIC change to after the wider discussion on failure modes in the upcoming doc PR. There doesn't seem to be strong feelings for either option in this specific PR. |
We don't currently run the Kotlin linter on anything other than `hello_world`, but we should. This PR fixes all existing violations and enables the linter. Resolves envoyproxy/envoy-mobile#599. Signed-off-by: Michael Rebello <me@michaelrebello.com> Signed-off-by: JP Simard <jp@jpsim.com>
We don't currently run the Kotlin linter on anything other than `hello_world`, but we should. This PR fixes all existing violations and enables the linter. Resolves envoyproxy/envoy-mobile#599. Signed-off-by: Michael Rebello <me@michaelrebello.com> Signed-off-by: JP Simard <jp@jpsim.com>
In order to get updated admin and listener address information, the following changes were made:
Add an optional command line flag <-a> <--admin-address-path>. When set, the admin address will be written to the specified file.
Add a /listeners URL admin handler that returns all server listener addresses.
Write a python helper script that either updates the listener addresses in the original config file or checks that the server listener addresses have not changed after envoy startup.
Fixes #654.