-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[INDY-1468] move avgLatency computing to EMA algorithm #812
[INDY-1468] move avgLatency computing to EMA algorithm #812
Conversation
Signed-off-by: Andrew Nikitin <andrew.nikitin@dsr-corporation.com>
Signed-off-by: Andrew Nikitin <andrew.nikitin@dsr-corporation.com>
plenum/server/monitor.py
Outdated
self._moving_average(curr_avg_lat, | ||
duration)) | ||
|
||
def _moving_average(self, p0, p1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming p0
to something like accum
or old_accum
and p1
back to next_val
will make intentions more clear
plenum/server/monitor.py
Outdated
|
||
def add_request(self, ordered_ts): | ||
self.update_time(ordered_ts) | ||
self.reqs_in_window += 1 | ||
|
||
def _moving_average(self, next_val): | ||
def add_duration(self, identifier, duration): | ||
if identifier not in self.avg_latencies: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setdefault
can be used here to avoid triple lookup
plenum/server/monitor.py
Outdated
|
||
def add_request(self, ordered_ts): | ||
self.update_time(ordered_ts) | ||
self.reqs_in_window += 1 | ||
|
||
def _moving_average(self, next_val): | ||
def add_duration(self, identifier, duration): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need separate add_duration
and add_request
? Why not single add_request(id, timestamp, tto/time_to_order/latency)
?
Also can we come up with some name for identifier
so it can be more clear what it identifies?
plenum/server/monitor.py
Outdated
total_reqs, curr_avg_lat = self.avg_latencies[identifier] | ||
total_reqs += 1 | ||
self.avg_latencies[identifier] = (total_reqs, | ||
self._moving_average(curr_avg_lat, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using moving average function to calculate average latency here, and it's using alpha
which is based on throughput_min_cnt
. Either "throughput" part should be removed from names of variables that control alpha, or separate set for latencies should be created.
Also here we're calculating moving average over latencies taking into account only their number, not their temporal placement. When we do the same with throughput we get measurements at well defined intervals, so we can really say that throughput is averaged over window with well defined duration. Yet this is not a case with latencies now, do we really intend that here?
plenum/server/monitor.py
Outdated
pass | ||
def get_avg_latency(self, identifier): | ||
if identifier not in self.avg_latencies: | ||
return .0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not None
?
plenum/server/monitor.py
Outdated
return None | ||
self.update_time(request_time) | ||
return self._moving_average(self.reqs_in_window / self.inner_window) | ||
return self._moving_average(self.throughput, self.reqs_in_window / self.throughput_window_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some thinking I came to conclusion that if we want to get smooth averages between window boundaries it's not that simple, and what's done here is not helping in any way. Imagine you had throughput of 1, then you measure it before you get another request and this gives you values less than 1, which is yet constant until you get a request, after which throughput jumps back to 1 and remains here until next window starts, at which point it jumps below 1 again. So, I believe right thing to do here is to either just return self.throughput
or come up with better solution (which can turn out quite complicated). I would just stick with simple options for now.
assert rm.avg_latencies['some_client_identifier'][1] != 0 | ||
|
||
|
||
def test_avg_latency_accuracy(request_measurement): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests with different duration values
… max latency check Signed-off-by: Andrew Nikitin <andrew.nikitin@dsr-corporation.com>
plenum/config.py
Outdated
@@ -137,6 +137,11 @@ | |||
LatencyWindowSize = 30 | |||
LatencyGraphDuration = 240 | |||
|
|||
# This parameter defines minimal count of accumulated latencies for each client | |||
MIN_LATENCY_COUNT = 50 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think requiring 50 for each client is too much
plenum/config.py
Outdated
# This parameter defines minimal count of accumulated latencies for each client | ||
MIN_LATENCY_COUNT = 50 | ||
# This parameter defines coefficient alpha, which represents the degree of weighting decrease. | ||
LATENCY_ALPHA = 0.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better than the Alpha based on MinCount?
…LATENCY_COUNT Signed-off-by: Andrew Nikitin <andrew.nikitin@dsr-corporation.com>
plenum/server/monitor.py
Outdated
self.reqs_in_window = 0 | ||
self.throughput = 0 | ||
self.inner_window = inner_window | ||
self.min_cnt = min_cnt | ||
self.throughput_window_size = throughput_window_size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since class now is named ThroughputMeasurement
probably there's no longer need in adding "thoughput" to some fields and parameter names. What about thoughput_window_size
-> window_size
, throughput_min_cnt
-> min_window_count
?
|
||
def add_request(self, ordered_ts): | ||
self.update_time(ordered_ts) | ||
self.reqs_in_window += 1 | ||
|
||
def _moving_average(self, next_val): | ||
def _accumulate(self, old_accum, next_val): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always called with old_accum= self.throughput
, probably this parameter can be removed
plenum/server/monitor.py
Outdated
self.inner_window = inner_window | ||
self.min_cnt = min_cnt | ||
self.throughput_window_size = throughput_window_size | ||
self.throughput_min_cnt = throughput_min_cnt | ||
self.first_ts = time.perf_counter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timestamps are injected everywhere except here. This is both inconsistency and can lead to problems if someone decides to use some other timing function instead of time.perf_counter
.
plenum/server/monitor.py
Outdated
def add_duration(self, identifier, duration): | ||
if identifier not in self.avg_latencies: | ||
self.avg_latencies[identifier] = (0, .0) | ||
total_reqs, curr_avg_lat = self.avg_latencies[identifier] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 3 dict lookups can be replaced with just one:
total_reqs, curr_avg_lat = self.avg_latencies.get(identifier, (0, .0))
plenum/server/monitor.py
Outdated
if avg_lat: | ||
if cid not in avgLatencies: | ||
avgLatencies[cid] = [] | ||
avgLatencies[cid].append(avg_lat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about replacing triple lookup with
avgLatencies.setdefault(cid, []).append(avg_lat)
Signed-off-by: Andrew Nikitin <andrew.nikitin@dsr-corporation.com>
Signed-off-by: Andrew Nikitin andrew.nikitin@dsr-corporation.com