INDY-1431: implement client stack restart on reached connections limit. #799
Conversation
The client stack is restarted if the connections threshold is reached and a specified minimal time (configurable; it should be greater than TIME_WAIT) has elapsed since the previous client stack restart, to avoid overhead from sockets stuck in the TIME_WAIT or FIN_WAIT state.
Signed-off-by: Sergey Shilov <sergey.shilov@dsr-company.com>
plenum/common/stacks.py
Outdated
def init_stack_restart_params(self):
    self.connected_clients_num = 0
    self.stack_restart_is_needed = False
    self.last_start_time = time.time()
Should we use time.perf_counter()?
time.perf_counter() doesn't seem appropriate here. From the time module reference:
"The reference point of the returned value is undefined, so that only the difference between the results of consecutive calls is valid."
But I'm calculating a time point in the future, so time.time() is much more understandable here, as it returns the time in seconds since the epoch.
Also, regarding time.perf_counter(): "A clock with the highest available resolution to measure a short duration." We don't need the highest resolution here, as the calculated timeout is on the order of tens of minutes.
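For illustration, a minimal sketch of the epoch-based deadline arithmetic described above. The constant names are hypothetical stand-ins for the configurable restart parameters discussed in this PR, not the PR's actual identifiers:

```python
import time
from random import randint

# Hypothetical stand-ins for the configurable restart parameters.
MIN_RESTART_TIMEOUT = 60.0   # seconds; must exceed the platform's TIME_WAIT
MAX_RESTART_DEVIATION = 10   # random spread, in whole seconds

def next_restart_min_time(last_start_time: float) -> float:
    """Absolute epoch time before which a restart is not allowed."""
    return last_start_time + MIN_RESTART_TIMEOUT + randint(0, MAX_RESTART_DEVIATION)

def can_restart(deadline: float) -> bool:
    # time.time() is epoch-based, so an absolute deadline computed earlier
    # remains meaningful; time.perf_counter()'s reference point is undefined.
    return deadline < time.time()
```

Because both the deadline and the current reading come from the same epoch-based clock, the comparison stays valid across the whole process lifetime.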
if event['event'] == zmq.EVENT_ACCEPTED:
    self.connected_clients_num += 1
if event['event'] == zmq.EVENT_DISCONNECTED:
    assert self.connected_clients_num > 0
Asserts will be dropped in production (when Python runs with -O). Should we use an exception here? Or can it be expected that we get a DISCONNECTED event without a preceding ACCEPTED one?
I've added a condition, in addition to the assertion, so that the counter cannot go negative, and also added logging. My concern here is about lost ZMQ events, and I don't think an exception is needed.
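A sketch of the guarded counter described in this reply; the class and method names are illustrative, not the PR's exact identifiers:

```python
import logging

logger = logging.getLogger(__name__)

class ConnectionCounter:
    """Illustrative guarded client-connection counter."""

    def __init__(self):
        self.connected_clients_num = 0

    def on_accepted(self):
        self.connected_clients_num += 1

    def on_disconnected(self):
        if self.connected_clients_num > 0:
            self.connected_clients_num -= 1
        else:
            # A lost ZMQ event can make counts drift; log rather than raise,
            # so a missed ACCEPTED event cannot crash the stack.
            logger.warning("DISCONNECTED event while tracking zero clients")
```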
def restart(self):
    logger.warning("Stopping client stack on node {}".format(self))
    self.stop()
    time.sleep(0.2)
Magic number 0.2 here.
Just for safety.
plenum/test/test_node.py
Outdated
@@ -410,6 +410,8 @@ def restart_clientstack(self):
    time.sleep(0.2)
    logger.debug("Starting clientstack on node {}".format(self))
    self.clientstack.start()
    # Sleep to allow disconnected clients to reconnect before sending replies from the server side.
Should we use the restart method from clientstack here?
Of course, done.
plenum/common/stacks.py
Outdated
def _can_restart(self):
    return self.next_restart_min_time < time.time()

def check_for_stack_restart(self):
Please add unit tests for this method.
Done.
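For reference, a unit test of the _can_restart logic shown in the diff might look like the sketch below. The stub class and test name are illustrative, not the PR's actual test code:

```python
import time

class _RestartChecker:
    """Minimal stand-in exposing the _can_restart logic from the diff."""
    def __init__(self, next_restart_min_time):
        self.next_restart_min_time = next_restart_min_time

    def _can_restart(self):
        return self.next_restart_min_time < time.time()

def test_can_restart_respects_deadline():
    # A deadline in the past allows a restart...
    assert _RestartChecker(time.time() - 1)._can_restart() is True
    # ...while a deadline far in the future forbids it.
    assert _RestartChecker(time.time() + 3600)._can_restart() is False
```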
plenum/common/stacks.py
Outdated
self.min_stack_restart_timeout + \
    randint(0, self.max_stack_restart_time_deviation)

def check_listener_events(self):
Please add tests for this method and for the listener monitor.
@@ -946,6 +955,20 @@ def prepare_to_send(self, msg: Any):
    self.msgLenVal.validate(msg_bytes)
    return msg_bytes

@staticmethod
def get_monitor_events(monitor_socket, non_block=True):
- There is a similar method in Remote. Should we get rid of the duplication?
- There is an events.append(message['event']) line in Remote's implementation. Should it be the same here?
I've added a comment for this method. My concern is that the way we work with the monitor socket is not fully correct and should be reviewed. Currently we call get_monitor_socket() every time we want to get it, which is strange; if it returns a singleton then it's OK, but I'm not sure. So I'll create a separate ticket and clean up this implementation.
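A self-contained sketch of such a non-blocking drain helper, using pyzmq's recv_monitor_message. This is an illustration of the pattern under discussion, not the PR's actual ZStack/Remote implementations:

```python
import zmq
from zmq.utils.monitor import recv_monitor_message

def get_monitor_events(monitor_socket, non_block=True):
    """Drain all pending events from a ZMQ monitor socket.

    Collects the 'event' field of each pending monitor message, matching
    the events.append(message['event']) behavior noted in Remote.
    """
    events = []
    flags = zmq.NOBLOCK if non_block else 0
    while True:
        try:
            message = recv_monitor_message(monitor_socket, flags=flags)
        except zmq.Again:
            # No more pending events on the monitor socket.
            break
        events.append(message['event'])
    return events
```

With non_block=True the helper returns immediately with whatever events have accumulated, which is what a periodic check_listener_events-style poll wants.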
@@ -359,6 +365,9 @@ def open(self):
)

def close(self):
    if self.listener_monitor is not None:
        self.listener.disable_monitor()
Disabling monitor on the non-listener socket is done differently. Is it intended?
See comment above.
Now connections tracking and stack restart are separated and optional. NOTE: connections tracking must be enabled if stack restart is enabled, as stack restart uses the connections-tracking mechanism to trigger a restart.
Signed-off-by: Sergey Shilov <sergey.shilov@dsr-company.com>
'please check your configuration'.format(self)
raise RuntimeError(error_str)

self.track_connected_clients_num_enabled = config.TRACK_CONNECTED_CLIENTS_NUM_ENABLED
Should it be self.track_connected_clients_num_enabled = create_listener_monitor?
create_listener_monitor is an input parameter of the ZStack class, while config.TRACK_CONNECTED_CLIENTS_NUM_ENABLED is a configuration parameter, so I don't see why one should be replaced by the other.
Signed-off-by: Sergey Shilov <sergey.shilov@dsr-company.com>
node.clientstack.connected_clients_num = max_connected_clients_num + 1
node.clientstack.handle_connections_limit()

assert is_restarted is False
Please also check that we eventually restart (after the timeout).
Done.
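A toy sketch of the eventual-restart check requested above, using a stub with a short timeout instead of a real node. All names here are illustrative:

```python
import time

class RestartableStub:
    """Illustrative stand-in for the client stack's restart throttling."""
    def __init__(self, min_timeout):
        self.min_timeout = min_timeout
        self.last_start_time = time.time()
        self.restarted = False

    def handle_connections_limit(self):
        # Restart only once the minimal interval since the last start passed.
        if self.last_start_time + self.min_timeout < time.time():
            self.restarted = True

stub = RestartableStub(min_timeout=0.05)
stub.handle_connections_limit()
assert stub.restarted is False   # too soon after start: no restart yet
time.sleep(0.06)
stub.handle_connections_limit()
assert stub.restarted is True    # eventually restarts after the timeout
```

A real test would use the node fixture and a shortened configured timeout rather than sleeping against wall-clock time.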
Signed-off-by: Sergey Shilov <sergey.shilov@dsr-company.com>