Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

INDY-1682: bugfix in logic for instance performance degraded #963

Merged

Conversation

Toktar
Copy link
Contributor

@Toktar Toktar commented Oct 31, 2018

The master replica can have throughput = 0, but before that the monitor is always reseted.
When we adding a new instance with adding a new node, other instances can have throughput> 0. In the old logic a new insance perfomance was degraded.
In the new logic added checking that a new replica received any requests.

Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@Toktar
Copy link
Contributor Author

Toktar commented Oct 31, 2018

test this please

Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@Toktar Toktar changed the title INDY-1682: bugfix in getThroughputs() INDY-1682: bugfix in logic for insurance performance degraded Nov 2, 2018
@@ -74,7 +74,7 @@ def test_rsr_ema_tm_past_windows_processed_on_add_request(tm):
assert tm.reqs_in_window == 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests that getThroughput is None in the start window

Copy link
Contributor Author

@Toktar Toktar Nov 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We test it in test_rsr_ema_tm_after_start_switches_to_revival_on_not_empty_window . Is it ok or we need more tests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to add a test which will specifically verify get_throughput value during the safe start to TESTS OF THROUGHPUT CALCULATION section of this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test_rsr_ema_tm_throughput_in_safe_start()

@@ -98,7 +98,35 @@ def test_rsr_ema_tm_past_windows_processed_on_get_throughput(tm):

# [30, 45)
tm.get_throughput(42)
assert tm.window_start_ts == 30
assert tm.window_start_ts == 15
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 15? We are expecting that there is no info before window_size * min_cnt seconds pass, that is 15*16?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because get_throughput recalculated throughput value early, but now in the safe start we do it only in add_request().
After tm.add_request(16) value of tm.window_start_ts will 15. And after tm.get_throughput(42) it will not be change.

@@ -44,7 +44,7 @@ def fake_view_changer(request, tconf):
)
monitor = FakeSomething(
isMasterDegraded=lambda: False,
areBackupsDegraded=lambda: [],
areBackupsDegraded=lambda a: [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a parameter in lambda here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, my mistake. Fixed it.

@@ -278,3 +285,15 @@ def test_master_not_degraded_on_revival_spike_on_one_backup_while_load_stopped(t
throughput_ratio = get_throughput_ratio(inst_req_streams, tconf)

assert_master_not_degraded(throughput_ratio, tconf)


def test_master_not_degraded_on_new_instance(fake_monitor, tconf):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test rather checks that newly added instance is not degraded.
I think we need two tests: that master is not degraded when a new instance is added, and that new instance is not degraded right after it's added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be I can add a new check for master instance in this test or we need two tests with different checkings for one case?


assert tm.throughput_before_idle == 0
assert tm.idle_start_ts == 0
assert tm.empty_windows_count == 4
assert tm.empty_windows_count == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it's 0 now? Should we have a new test or edit this one to get the expected behaviour?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is value "0" bad? We didn't order any transactions and empty_windows_count is 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment to the code explaining why empty_windows_count must be 0 here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
ashcherbakov
ashcherbakov previously approved these changes Nov 2, 2018
@@ -74,7 +74,7 @@ def test_rsr_ema_tm_past_windows_processed_on_add_request(tm):
assert tm.reqs_in_window == 1


def test_rsr_ema_tm_past_windows_processed_on_get_throughput(tm):
def test_rsr_ema_tm_past_windows_processed_on_get_throughput_after_start(tm):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is incorrect because windows are not processed. We would name this test test_rsr_ema_tm_past_windows_not_processed_on_get_throughput_during_safe_start.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it.

refactoring test_revival_spike_resistant_ema_throughput_measurement.py

Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@@ -180,7 +210,8 @@ def tm_in_normal(tm_after_start):
tm.add_request(ts)

# [15, 30) - [225, 240) -- up to 16 not empty windows
tm.get_throughput(15)
tm.add_request(15)
assert tm.get_throughput(15) is None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert should not be a responsibility of this fixture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -137,9 +164,11 @@ def test_rsr_ema_tm_after_start_switches_to_revival_on_not_empty_window(tm_after

# [60, 75)
throughput = tm.get_throughput(62)
tm.add_request(62)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last two lines should be swapped because the test should verify get_throughput return value in REVIVAL, not FADED state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

refactoring test_revival_spike_resistant_ema_throughput_measurement.py

Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@@ -180,7 +210,7 @@ def tm_in_normal(tm_after_start):
tm.add_request(ts)

# [15, 30) - [225, 240) -- up to 16 not empty windows
tm.get_throughput(15)
tm.add_request(15)
assert tm.state == State.REVIVAL

for ts in range(15, 240, 5):
Copy link
Contributor

@spivachuk spivachuk Nov 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there is one extra request at 15.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

refactoring test_revival_spike_resistant_ema_throughput_measurement.py

Signed-off-by: toktar <renata.toktar@dsr-corporation.com>
@Toktar Toktar changed the title INDY-1682: bugfix in logic for insurance performance degraded INDY-1682: bugfix in logic for instance performance degraded Nov 2, 2018
@ashcherbakov ashcherbakov merged commit b6d4335 into hyperledger:master Nov 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants