INDY-1759: Restoration of lastPrePrepareSeqNo #971
- Implemented restoration of lastPrePrepareSeqNo on a backup primary after restart.
- Made some renaming in the code related to view change to make it clearer.
- Corrected sdk_send_batches_of_random and sdk_send_batches_of_random_and_check test helper functions.
- Wrote a test for a positive case of lastPrePrepareSeqNo restoration on a backup primary after restart.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>
…nto backup-primary-restoration

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>

# Conflicts:
#   plenum/common/messages/node_messages.py
#   plenum/server/view_change/view_changer.py
Signed-off-by: ArtObr <artemobruchnikov@gmail.com>
Signed-off-by: ArtObr <artemobruchnikov@gmail.com>
…nto indy_1759_test
Signed-off-by: ArtObr <artemobruchnikov@gmail.com>
- Corrected tests that restarted node instances which had been stopped previously. Now a new node instance is created for restarting a node.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>
Signed-off-by: ArtObr <artemobruchnikov@gmail.com>
- Added more tests for lastPrePrepareSeqNo restoration feature.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>
Signed-off-by: ArtObr <artemobruchnikov@gmail.com>
@@ -962,14 +962,17 @@ def sdk_send_batches_of_random_and_check(looper, txnPoolNodeSet, sdk_pool, sdk_w
    if num_batches == 1:
        return sdk_send_random_and_check(looper, txnPoolNodeSet, sdk_pool, sdk_wallet, num_reqs, **kwargs)

    reqs_in_batch = num_reqs // num_batches
    reqs_in_last_batch = reqs_in_batch + num_reqs % num_batches
This looks incorrect; should it be num_reqs % num_batches?
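The arithmetic under discussion can be checked in isolation. The sketch below is not plenum code: `batch_sizes` is a hypothetical helper that only reuses the two expressions from the hunk (integer division for the regular batches, remainder folded into the last one) so that the resulting split can be inspected.

```python
def batch_sizes(num_reqs, num_batches):
    """Return the number of requests in each batch (hypothetical helper)."""
    if num_batches < 1:
        raise ValueError("num_batches must be at least 1")
    # Same expressions as in the diff hunk.
    reqs_in_batch = num_reqs // num_batches
    reqs_in_last_batch = reqs_in_batch + num_reqs % num_batches
    # All batches but the last get the base size; the last one absorbs the remainder.
    return [reqs_in_batch] * (num_batches - 1) + [reqs_in_last_batch]


sizes = batch_sizes(10, 3)
print(sizes, sum(sizes))
```

With this formula the sizes always add up to `num_reqs` across exactly `num_batches` batches. Using only the remainder as the last batch size would change the total unless an extra batch were sent.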
plenum/server/node.py
Outdated
@@ -674,6 +683,69 @@ def on_view_change_complete(self):
            replica.clear_requests_and_fix_last_ordered()
        self.monitor.reset()

    def store_last_sent_pre_prepare_seq_no(self, inst_id, pp_seq_no):
- Please move all logic related to last_sent_pre_prepare_seq_no to a separate helper class
- Cover it by unit tests
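A minimal sketch of what such a helper might look like, written so that it can be unit tested without a running node. Everything here is an assumption made for illustration: the class name `LastSentPpStore`, the JSON encoding, and the use of a plain dict in place of the node's persistent key-value storage. Only the key name echoes the `lastSentPrePrepare` wording from the log message in the PR.

```python
import json


class LastSentPpStore:
    """Hypothetical helper that owns all last-sent-PrePrepare persistence."""

    KEY = "lastSentPrePrepare"  # assumed key name

    def __init__(self, storage):
        # `storage` is any dict-like object; a real node would pass its
        # persistent key-value store here (assumption).
        self._storage = storage

    def store(self, inst_id, view_no, pp_seq_no):
        self._storage[self.KEY] = json.dumps([inst_id, view_no, pp_seq_no])

    def load(self):
        """Return (inst_id, view_no, pp_seq_no), or None if nothing is stored."""
        raw = self._storage.get(self.KEY)
        if raw is None:
            return None
        inst_id, view_no, pp_seq_no = json.loads(raw)
        return inst_id, view_no, pp_seq_no

    def erase(self):
        self._storage.pop(self.KEY, None)


store = LastSentPpStore({})
store.store(inst_id=1, view_no=0, pp_seq_no=7)
print(store.load())
```

Keeping the storage behind a small class like this means the unit tests only need a dict, not a node fixture.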
plenum/server/replica.py
Outdated
@@ -745,6 +745,7 @@ def send3PCBatch(self):
                if ppReq is None:
                    continue
                self.sendPrePrepare(ppReq)
                self.node.store_last_sent_pre_prepare_seq_no(self.instId, ppReq.ppSeqNo)
Please do not store it for master instance.
plenum/server/node.py
Outdated
            logger.info("{} restoring lastPrePrepareSeqNo "
                        "from stored lastSentPrePrepare value {}"
                        .format(self, serialized_value))
            primary_replica.lastPrePrepareSeqNo = pp_seq_no
What about watermarks?
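To illustrate why the watermarks matter, here is a toy model, not the plenum replica. It assumes a window of the form (h, H] with H = h + LOG_SIZE, an illustrative LOG_SIZE of 300, and that the low watermark is moved to exactly the restored sequence number; all three are assumptions made for the sketch.

```python
LOG_SIZE = 300  # assumed window width, chosen for illustration


class BackupPrimaryState:
    """Toy stand-in for the state a backup primary restores after restart."""

    def __init__(self):
        self.last_pre_prepare_seq_no = 0
        self.h = 0          # low watermark
        self.H = LOG_SIZE   # high watermark

    def accepts(self, pp_seq_no):
        # A 3PC message is processed only inside the window (h, H].
        return self.h < pp_seq_no <= self.H

    def restore(self, pp_seq_no, shift_watermarks=True):
        self.last_pre_prepare_seq_no = pp_seq_no
        if shift_watermarks:
            self.h = pp_seq_no
            self.H = pp_seq_no + LOG_SIZE


naive = BackupPrimaryState()
naive.restore(500, shift_watermarks=False)

shifted = BackupPrimaryState()
shifted.restore(500)

print(naive.accepts(501), shifted.accepts(501))
```

In this model, restoring only the sequence number leaves the next PrePrepare (501) outside the window, while shifting the watermarks together with it keeps the window aligned.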
plenum/server/node.py
Outdated
        if self.view_changer.previous_view_no == 0:
            backup_primary_pp_seq_no_restored = \
                self._try_restore_last_sent_pre_prepare_seq_no()
            if not backup_primary_pp_seq_no_restored:
Why do we need this flag? Why can't we always erase it?
        looper, txnPoolNodeSet, sdk_pool_handle, sdk_wallet_client,
        tconf, tdir, allPluginsPath):

    for _ in range(6):
Why do we need 6 view changes here? This will be a very long test.
I think the check for viewNo can be done on unit test level.
backup_inst_id = 1


def test_backup_primary_does_not_restore_pp_seq_no_if_view_is_not_same(
The test looks too complex just to check that a backup primary does not restore pp_seq_no if viewNo is not the same.
Why do we need to restart all nodes here?
backup_inst_id = 1


def test_backup_primary_restores_pp_seq_no_if_view_is_same(
This can also be done as a unit test.
nodeCount = 7


def test_backup_replica_does_not_restore_pp_seq_no_if_not_primary_anymore(
This can be done as a unit test.
I think we need to have the following tests:
Signed-off-by: ArtObr <artemobruchnikov@gmail.com>
Indy 1759 test
- Extracted last_sent_pp_seq_no restoration feature to a separate class.
- Added shifting watermarks correspondingly to restored last_sent_pp_seq_no.
- Removed storing last_sent_pp_seq_no for master replica.
- Added unit tests for last_sent_pp_seq_no restoration class.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>
def check_last_warning(expected_msg):
    global warning_msg_count
    warning_msg_count += 1
    assert warning_msg_count == len(container)
I would prefer error codes (rather than just true/false) over analyzing logs.
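One way to read this suggestion is to have the restoration routine return a status value that tests can assert on directly. The sketch below is hypothetical: `RestoreStatus`, `try_restore` and their arguments are invented for illustration. The three failure reasons mirror the conditions named by the tests in this PR (nothing stored, view not the same, not primary anymore).

```python
from enum import Enum


class RestoreStatus(Enum):
    RESTORED = "restored"
    NOTHING_STORED = "nothing_stored"
    VIEW_CHANGED = "view_changed"
    NOT_PRIMARY = "not_primary"


def try_restore(stored, current_view_no, primary_inst_ids):
    """Decide whether a stored value may be restored (hypothetical helper).

    stored           -- None or a tuple (inst_id, view_no, pp_seq_no)
    current_view_no  -- the view the node is in after restart
    primary_inst_ids -- instance ids for which this node is currently primary
    """
    if stored is None:
        return RestoreStatus.NOTHING_STORED
    inst_id, view_no, _pp_seq_no = stored
    if view_no != current_view_no:
        return RestoreStatus.VIEW_CHANGED
    if inst_id not in primary_inst_ids:
        return RestoreStatus.NOT_PRIMARY
    return RestoreStatus.RESTORED


# A test can then check the reason without capturing log output.
status = try_restore((1, 0, 5), current_view_no=1, primary_inst_ids={1})
print(status)
```

Compared with counting warning messages, a status value keeps each test independent of log wording and of how many messages earlier tests emitted.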
- Added integration tests for last_sent_pp_seq_no restoration feature.
- Added a unit test for LastSentPpStoreHelper._restore_last_sent_pp_seq_no method.
- Rolled back changes in FakeNode class that had caused regression in existing tests.
- Removed OutputWarningHandler test helper class.
- Removed tests of last_sent_pp_seq_no restoration feature which had been replaced with new ones.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>

- Removed tests of last_sent_pp_seq_no restoration feature which had been replaced with new ones.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>
backup_inst_id = 1


def test_node_erases_stored_last_sent_pp_key_on_pool_restart(
This test stops the tested node, then restarts the other nodes in the pool, then starts the tested node.
It would be great to write another test, where all nodes (including the tested one) are restarted at the same time.
- Added an integration test for erasing last_sent_pp_seq_no in case of simultaneous restart of all the nodes in the pool.
- Made advancing last_ordered_3pc on a backup replica up to master's last_ordered_3pc on completion of a regular view change (not propagate primary) conditional: now it is made only if the backup replica is not primary.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>
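The conditional described in this commit can be pictured with a small stand-in model. This is a sketch under assumptions rather than the code from the PR: `ReplicaStub` and `advance_backups` are hypothetical names, instance 0 is taken to be the master, and last_ordered_3pc is modelled as a (view_no, pp_seq_no) tuple that only ever moves forward.

```python
from dataclasses import dataclass


@dataclass
class ReplicaStub:
    """Toy replica holding only the fields this sketch needs."""
    inst_id: int
    is_primary: bool
    last_ordered_3pc: tuple  # (view_no, pp_seq_no)


def advance_backups(replicas, master_last_ordered):
    """After a regular view change, catch non-primary backups up to the master."""
    for replica in replicas:
        if replica.inst_id == 0:
            continue  # the master instance is the reference, not a target
        if replica.is_primary:
            continue  # a backup primary keeps its own last_ordered_3pc
        if replica.last_ordered_3pc < master_last_ordered:
            replica.last_ordered_3pc = master_last_ordered


replicas = [
    ReplicaStub(inst_id=0, is_primary=False, last_ordered_3pc=(1, 10)),
    ReplicaStub(inst_id=1, is_primary=True, last_ordered_3pc=(0, 4)),
    ReplicaStub(inst_id=2, is_primary=False, last_ordered_3pc=(0, 6)),
]
advance_backups(replicas, master_last_ordered=(1, 10))
print([r.last_ordered_3pc for r in replicas])
```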
…nto backup-primary-restoration

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>

# Conflicts:
#   plenum/server/node.py
- Updated tests in test_no_propagate_request_on_different_last_ordered_before_vc module according to the change in the logic of advancing last_ordered_3pc on backup replicas on a regular view change.
- Fixed intermittent failures in tests in test_no_propagate_request_on_different_last_ordered_before_vc module.

Signed-off-by: Nikita Spivachuk <nikita.spivachuk@dsr-company.com>
(ci) test this please
- Implemented restoration of lastPrePrepareSeqNo on a backup primary after restart.
- Corrected sdk_send_batches_of_random and sdk_send_batches_of_random_and_check test helper functions.
- Added tests for the lastPrePrepareSeqNo restoration feature.
- Made advancing last_ordered_3pc on a backup replica up to master's last_ordered_3pc on completion of a regular view change (not propagate primary) conditional: now it is made only if the backup replica is not primary.
- Updated tests in test_no_propagate_request_on_different_last_ordered_before_vc module according to the change in the logic of advancing last_ordered_3pc on backup replicas on a regular view change.
- Fixed intermittent failures in tests in test_no_propagate_request_on_different_last_ordered_before_vc module.