-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INDY-1682: bugfix in logic for instance performance degraded #963
INDY-1682: bugfix in logic for instance performance degraded #963
Conversation
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
test this please |
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@@ -74,7 +74,7 @@ def test_rsr_ema_tm_past_windows_processed_on_add_request(tm): | |||
assert tm.reqs_in_window == 1 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests that getThroughput is None in the start window
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We test it in test_rsr_ema_tm_after_start_switches_to_revival_on_not_empty_window
. Is it ok or we need more tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to add a test which will specifically verify get_throughput
value during the safe start to TESTS OF THROUGHPUT CALCULATION
section of this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added test_rsr_ema_tm_throughput_in_safe_start()
@@ -98,7 +98,35 @@ def test_rsr_ema_tm_past_windows_processed_on_get_throughput(tm): | |||
|
|||
# [30, 45) | |||
tm.get_throughput(42) | |||
assert tm.window_start_ts == 30 | |||
assert tm.window_start_ts == 15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 15? We are expecting that there is no info before window_size * min_cnt seconds pass, that is 15*16?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because get_throughput recalculated throughput value early, but now in the safe start we do it only in add_request().
After tm.add_request(16)
value of tm.window_start_ts
will 15. And after tm.get_throughput(42)
it will not be change.
plenum/test/view_change/conftest.py
Outdated
@@ -44,7 +44,7 @@ def fake_view_changer(request, tconf): | |||
) | |||
monitor = FakeSomething( | |||
isMasterDegraded=lambda: False, | |||
areBackupsDegraded=lambda: [], | |||
areBackupsDegraded=lambda a: [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a parameter in lambda here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, my mistake. Fixed it.
@@ -278,3 +285,15 @@ def test_master_not_degraded_on_revival_spike_on_one_backup_while_load_stopped(t | |||
throughput_ratio = get_throughput_ratio(inst_req_streams, tconf) | |||
|
|||
assert_master_not_degraded(throughput_ratio, tconf) | |||
|
|||
|
|||
def test_master_not_degraded_on_new_instance(fake_monitor, tconf): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test rather checks that newly added instance is not degraded.
I think we need two tests: that master is not degraded when a new instance is added, and that new instance is not degraded right after it's added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be I can add a new check for master instance in this test or we need two tests with different checkings for one case?
|
||
assert tm.throughput_before_idle == 0 | ||
assert tm.idle_start_ts == 0 | ||
assert tm.empty_windows_count == 4 | ||
assert tm.empty_windows_count == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it's 0 now? Should we have a new test or edit this one to get the expected behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is value "0" bad? We didn't order any transactions and empty_windows_count is 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment to the code explaining why empty_windows_count
must be 0 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@@ -74,7 +74,7 @@ def test_rsr_ema_tm_past_windows_processed_on_add_request(tm): | |||
assert tm.reqs_in_window == 1 | |||
|
|||
|
|||
def test_rsr_ema_tm_past_windows_processed_on_get_throughput(tm): | |||
def test_rsr_ema_tm_past_windows_processed_on_get_throughput_after_start(tm): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is incorrect because windows are not processed. We would name this test test_rsr_ema_tm_past_windows_not_processed_on_get_throughput_during_safe_start
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it.
refactoring test_revival_spike_resistant_ema_throughput_measurement.py Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@@ -180,7 +210,8 @@ def tm_in_normal(tm_after_start): | |||
tm.add_request(ts) | |||
|
|||
# [15, 30) - [225, 240) -- up to 16 not empty windows | |||
tm.get_throughput(15) | |||
tm.add_request(15) | |||
assert tm.get_throughput(15) is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assert should not be a responsibility of this fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -137,9 +164,11 @@ def test_rsr_ema_tm_after_start_switches_to_revival_on_not_empty_window(tm_after | |||
|
|||
# [60, 75) | |||
throughput = tm.get_throughput(62) | |||
tm.add_request(62) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last two lines should be swapped because the test should verify get_throughput
return value in REVIVAL, not FADED state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
refactoring test_revival_spike_resistant_ema_throughput_measurement.py Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@@ -180,7 +210,7 @@ def tm_in_normal(tm_after_start): | |||
tm.add_request(ts) | |||
|
|||
# [15, 30) - [225, 240) -- up to 16 not empty windows | |||
tm.get_throughput(15) | |||
tm.add_request(15) | |||
assert tm.state == State.REVIVAL | |||
|
|||
for ts in range(15, 240, 5): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now there is one extra request at 15.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
refactoring test_revival_spike_resistant_ema_throughput_measurement.py Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
The master replica can have throughput = 0, but before that the monitor is always reseted.
When we adding a new instance with adding a new node, other instances can have throughput> 0. In the old logic a new insance perfomance was degraded.
In the new logic added checking that a new replica received any requests.