-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INDY-1404: Do not do view change until catchup is finished #757
Conversation
- do not process any View Change related messages (including ViewChangeDone from Current State for primary propagation) until the current catchup is finished; - this fixes initialization of a node where we mixed together initial catchup and primary propagation (initial view change); - also fixed recovering when primary is disconnected and (n-f)th node joins the pool Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
test this please |
Signed-off-by: Andrey Kononykhin <andkononykhin@gmail.com>
test this please |
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
Signed-off-by: ArtObr <artemobruchnikov@gmail.com>
fix for failing test
test this please |
…rledger/indy-plenum into catchup-cancellation
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
…rledger/indy-plenum into catchup-cancellation
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
@@ -404,3 +404,9 @@ def _validate_message(self, dct): | |||
self._raise_invalid_fields( | |||
f.MSG.nm, msg, | |||
"The message type must be {} ".format(expected_type_cls.typename)) | |||
|
|||
|
|||
class FutureViewChangeDone: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if purpose of this message was documented, as with all other messages in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
plenum/server/node.py
Outdated
@@ -1905,6 +1909,9 @@ def is_catchup_needed(self) -> bool: | |||
Check if all requests ordered till last prepared certificate | |||
Check if last catchup resulted in no txns | |||
""" | |||
if not self.view_change_in_progress: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks a bit counterintuitive, it would be nice to have a comment explaining why no catchup is needed if view change is not in progress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
# This is especially critical for Propagate Primary mode (on receiving CURRENT_STATE by a new node). | ||
if self.view_no in self._next_view_indications: | ||
for frm, vcd in self._next_view_indications[self.view_no].items(): | ||
self._on_verified_view_change_done_msg(vcd, frm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably it's better to call process_vchd_msg
here, which will call _on_verified_view_change_done_msg
only when it's truly verified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will not work. Added a comment with a description.
# remove all for previous views | ||
for view_no in tuple(self._next_view_indications.keys()): | ||
if view_no <= self.view_no: | ||
self._next_view_indications.pop(view_no) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
del self._next_view_indications[view_no]
could be more appropriate, since pop returns removed item which we don't actually need
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
ensure_view_change(looper, other_nodes) | ||
checkProtocolInstanceSetup(looper=looper, nodes=other_nodes, numInstances=2) | ||
ensure_all_nodes_have_same_data(looper, nodes=other_nodes) | ||
looper.runFor(expectedPoolViewChangeStartedTimeout(len(txnPoolNodeSet))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a probably source of flakiness. Can we come up with some predicate which can indicate that view change won't happen for sure?
old_meths = do_view_change(txnPoolNodeSet) | ||
for node in txnPoolNodeSet: | ||
node.view_changer.sendInstanceChange(old_view_no + 1) | ||
looper.runFor(expectedPoolViewChangeStartedTimeout(len(txnPoolNodeSet))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a probably source of flakiness. Can we come up with some predicate which can indicate that view change won't happen for sure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
from plenum.test.waits import expectedPoolViewChangeStartedTimeout | ||
|
||
|
||
@pytest.fixture(params=['starting', 'discovering', 'discovered', 'syncing']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need strings here and lot's of if/else? Would the following do the job?
@pytest.fixture(params=[Mode.starting, Mode.discovering...])
def mode(request)
return request.param
from plenum.test.waits import expectedPoolViewChangeStartedTimeout | ||
|
||
|
||
@pytest.fixture(params=['starting', 'discovering', 'discovered', 'syncing']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need strings here and lot's of if/else? Would the following do the job?
@pytest.fixture(params=[Mode.starting, Mode.discovering...])
def mode(request)
return request.param
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
test this please |
1 similar comment
test this please |
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
test this please |
2 similar comments
test this please |
test this please |
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
test this please |
Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com>
…hup-cancellation Signed-off-by: ashcherbakov <alexander.sherbakov@dsr-company.com> # Conflicts: # plenum/test/recorder/test_replay_with_view_change.py
Signed-off-by: ashcherbakov alexander.sherbakov@dsr-company.com