-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INDY-1680: add function for removing replica #912
INDY-1680: add function for removing replica #912
Conversation
Changes: - add function to remove a replica (and cleaning client requests for this replica) - add grow count of replicas in view change - add tests for check this function and check the correct work of other systems TODO: - correctly process messages for switched off replicas (for implementation this replicas should be a dict) Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Changes: - correctly process messages for switched off replicas (for implementation this replicas should be a dict) - change a replicas list format to a dict Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
…cas_status() Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Changes: - add method add_replica - add method remove_replica - rename method to restoreReplicas - adjustReplicas(from, to) - change monitor replicas list to dict - in _select_primaries add restore_replicas - in restore_replicas in remove requests use requestQueues Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
…nto task-1680-switch-of-replica
plenum/recorder/replayable_node.py
Outdated
@@ -87,7 +87,7 @@ def create_replicas(self, config=None): | |||
return _TestReplicas(self, self.monitor, config) | |||
|
|||
def adjustReplicas(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this method violate the base class contract?
Also the logic of this method should be corrected since replicas
is a dict now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it.
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
plenum/server/instances.py
Outdated
|
||
@property | ||
def masterId(self) -> Optional[int]: | ||
""" | ||
Return the index of the replica that belongs to the master protocol | ||
instance | ||
""" | ||
return 0 if self.count > 0 else None | ||
return 0 if len(self.started) > 0 else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct the condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
plenum/server/monitor.py
Outdated
def removeInstance(self, index=None): | ||
# TODO: This doesn't take into account index, but this function is never called with defined index, | ||
# probably we can simplify this thing? | ||
def removeInstance(self, index): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please correct the parameter name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. And in Instances.remove() changed parameter name too.
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
plenum/server/node.py
Outdated
while len(self.replicas) > self.requiredNumberOfInstances: | ||
self.replicas.shrink() | ||
self.processStashedMsgsForReplica(old_required_number_of_instances) | ||
old_required_number_of_instances += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter name does not suppose the value changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@@ -73,15 +76,17 @@ def _master_replica(self): | |||
|
|||
def service_inboxes(self, limit: int = None): | |||
number_of_processed_messages = \ | |||
sum(replica.serviceQueues(limit) for replica in self._replicas) | |||
sum(replica.serviceQueues(limit) for replica in self._replicas.values()) | |||
return number_of_processed_messages | |||
|
|||
def pass_message(self, message, instance_id=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also Node.msgHasAcceptableInstId
method should be corrected. It must stash messages for instances with a number higher than the current maximum required number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
plenum/server/replicas.py
Outdated
def primaries(self) -> dict: | ||
primaries = dict() | ||
for r in self._replicas.values(): | ||
primaries[r.instId] = r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This property must return not replicas but names of instances primaries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
plenum/test/test_node.py
Outdated
@@ -145,7 +145,7 @@ def __init__(self, *args, **kwargs): | |||
notifierEventTriggeringConfig=notifierEventTriggeringConfig, | |||
pluginPaths=pluginPaths) | |||
for i in range(len(self.replicas)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be more clear to iterate over the keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
plenum/server/replicas.py
Outdated
if index is None: | ||
index = self.num_replicas - 1 | ||
replica = self._replicas.pop(index) | ||
for request_key in replica.requestQueues: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requestQueues
is not a request queue, it is a map from ledger IDs to request queues.
Is there a unit test verifying freeing requests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed it.
Now all tests from test_replica_removing.py check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to add verification of freeing the request by the replica (i.e. forwardedTo
has been decreased by 1) to test_replica_removing_in_ordering
.
ensureElectionsDone(looper=looper, nodes=txnPoolNodeSet) | ||
|
||
|
||
def _check_replica_removed(node, start_replicas_count): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should take an instance ID. Also it should check that items related to the replica are removed from all the relevant collections and it should not verify conditions not related to removing the replica (such as master degradation and requests freeing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added check that replica removed from all relevant collections and changed parameters list. But I think, we should check things as a master degraded in _check_replica_removed() because it's a proof that replica was successfully removed without negative effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check this but not in _check_replica_removed
method since it is not its responsibility according to its name.
Also assert len(node.requests) == 0
should not be a responsibility of this method too.
index = start_replicas_count - 1 | ||
node.replicas.remove_replica(index) | ||
_check_replica_removed(node, start_replicas_count) | ||
# trigger view change on all nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such the teardown sections in tests make them dependent on each other. Instead of this introduce a fixture which ensures view change and use this fixture in every test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
looper.run(checkNodesConnected(txnPoolNodeSet)) | ||
waitForViewChange(looper, txnPoolNodeSet, expectedViewNo=1, | ||
customTimeout=2 * tconf.VIEW_CHANGE_TIMEOUT) | ||
ensureElectionsDone(looper=looper, nodes=txnPoolNodeSet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restoration of replicas should be verified in an obvious way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
plenum/recorder/replayable_node.py
Outdated
old_required_number_of_instances: int, | ||
new_required_number_of_instances: int): | ||
r = super().adjustReplicas(old_required_number_of_instances, | ||
new_required_number_of_instances) | ||
if r > 0: | ||
if hasattr(self, 'sent_pps'): | ||
new_replicas = self.replicas[-r:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.replicas
is a dict now. Correct its usage here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
plenum/recorder/replayable_node.py
Outdated
new_replicas = self.replicas[-r:] | ||
new_replicas = [r for inst_id, r in self.replicas | ||
if old_required_number_of_instances >= | ||
inst_id < new_required_number_of_instances] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the inequation.
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
changes: - updated test_request_time_tracker.py - updated test_replica_removing_before_ordering() Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Changes: