[PR-1] Node level metrics [latency and data rate] measurement on Kernel Servers #794
Conversation
Thanks @VenuReddy2103. Let me know when this is ready for review.
Sorry @VenuReddy2103, I was busy on other things today. Will try to review tomorrow.
@VenuReddy2103 I tried to review this today, but it's essentially impossible. Please self-review this, explain how you would like me to review it, and explicitly request review (with a written comment) when it is ready for my review. In the meantime I'm removing myself from the reviewers list.
@VenuReddy2103 Still no reply to #794 (comment)?
Force-pushed from 46e6632 to 791f3fe.
Force-pushed from 791f3fe to e221e23.
The data rate calculation was not correct; have fixed it. Will test further to find and fix issues, and will notify when this is ready for review.
@@ -36,8 +36,9 @@
 /* TODO: Need to check if size of data need to be increased to get the better data transfer rates. Sometimes, empty
    heartbeats are taking longer time than heartbeat with 2k arbitrary data. Thus, resulting in negative data transfer
    rate */
-private static final int RANDOM_DATA_SIZE = 2 * 1024;
-private static final byte[] randomBytes = new byte[RANDOM_DATA_SIZE];
+private static final int DATA_SIZE_IN_MB = 10; // size in MegaBytes
You still need to explain here why 10MB has been chosen.
Have modified it to start with 1 KB of data and increase by a step size if the data transfer time is not significant compared to the latency.
@@ -267,13 +268,10 @@ public void measureServerMetrics() {
 /* Measure data transfer rate by sending some arbitrary data */
 server.receiveHeartBeat(randomBytes);
 long t3 = System.nanoTime();
-metric.latency = (t2 - t1) / 2;
+metric.latency = (t2 - t1); // RTT
Latency needs to be 1-way latency, so the original code was correct.
Latency is not implicitly equal to half of RTT, because delay may be asymmetrical between two given endpoints. Since we make RPC calls to the destination and then wait for an acknowledgement/result to come back before making further calls, RTT may be the better measure?
Was searching about it and found a few links treating it that way:
https://www.sas.co.uk/blog/what-is-network-latency-how-do-you-use-a-latency-calculator-to-calculate-throughput
Let me know your opinion.
-metric.rate =
-        ((RANDOM_DATA_SIZE * 8 * 1000.0 * 1000.0)
-                / (((t3 - t2) - (t2 - t1)) * 1024.0 * 1024.0));
+metric.rate = (8.0 / DATA_SIZE_IN_MB) / ((t3 - t2) / (1000.0 * 1000.0 * 1000.0));
This is wrong, or at best very confusing. You want bytes per second.
So:
DATA_SIZE_IN_BYTES/(t3 - t2)/TimeUnit.SECONDS.toNanos(1);
?
Of course, beware of integer arithmetic.
The unit is Mbps (megabits/sec).
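For reference, a minimal sketch of that computation in the Mbps unit the author states, done entirely in double precision to sidestep the integer-arithmetic trap the reviewer warns about. The method wrapper and the `dataSizeInBytes` parameter name are illustrative, not the PR's actual code:

```java
import java.util.concurrent.TimeUnit;

/* Illustrative sketch only, not the PR's implementation. */
public final class RateSample {
    /** Transfer rate in Mbps for dataSizeInBytes sent between t2 and t3 (nanoseconds). */
    static double rateMbps(long dataSizeInBytes, long t2, long t3) {
        // Promote to double before dividing so nothing truncates to zero.
        double seconds = (t3 - t2) / (double) TimeUnit.SECONDS.toNanos(1);
        double megabits = dataSizeInBytes * 8.0 / (1000.0 * 1000.0);
        return megabits / seconds;
    }
}
```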
Force-pushed from e221e23 to 6759829.
Still working on this PR. Progress so far -
Yet to do:
Force-pushed from 67185e2 to 77bea4c.
The metrics measurement process is independent for each server, and the frequency of measurement also differs per server. The following mechanism is used:
   between client and server, took the round trip time (RTT) as latency */

/* Make an empty heartbeat to ensure session is established and cached before measurement */
nit: This is a bit messy, but I guess necessary.
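For context, the timing pattern these diff fragments come from, as a sketch. It assumes an empty-payload `receiveHeartBeat` overload (implied by the "empty heartbeat" comment above) and the PR's `KernelServer`/`KernelServerInfo`/`NodeMetric` types; the method wrapper itself is illustrative:

```java
/* Sketch only: assumes an empty-payload receiveHeartBeat overload and the PR's types. */
void measureOnce(KernelServer server, KernelServerInfo serverInfo, NodeMetric metric)
        throws java.rmi.RemoteException {
    /* Empty heartbeat so session setup cost is not charged to the first sample */
    server.receiveHeartBeat();

    long t1 = System.nanoTime();
    server.receiveHeartBeat();                // empty heartbeat: RTT sample
    long t2 = System.nanoTime();
    server.receiveHeartBeat(serverInfo.data); // heartbeat carrying arbitrary data
    long t3 = System.nanoTime();

    metric.latency = t2 - t1;                 // RTT, as in the diff above
    // (t3 - t2) is the payload transfer time used for the rate sample
}
```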
/* Measure data transfer rate by sending some arbitrary data */
server.receiveHeartBeat(serverInfo.data);
long t3 = System.nanoTime();
if (t3 - t2 < t2 - t1) {
Please add a comment to describe what you're doing in this block, e.g. "data too small relative to line latency - try again immediately with one more step of data".
Added a comment
server.receiveHeartBeat(serverInfo.data);
long t3 = System.nanoTime();
if (t3 - t2 < t2 - t1) {
    period = MIN_METRIC_POLL_PERIOD_MS;
With the current values and logic, this is not going to work very well.
For example, for a 10 Gbit Ethernet with, say, 50 ms latency (a fairly common link between data centers), you will need to send roughly 1 GByte of data (which will take about 1 second) for the round-trip latency to be about 10% of the total transfer time (so that the measurement is off by less than about 10%).
Because you are only starting with 1 KByte and incrementing by 1 KByte every 1 second, it will take a million seconds to get up to 1 GByte. That's about 12 days, and in the process you will waste many, many GBytes of transferred data getting there.
I would suggest that you come up with a better plan. One suggestion might be to double the data size every time (and also perhaps start with a bigger number than 1 KByte).
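To make the arithmetic explicit (assuming the 10x rule of thumb implied by the 10% error target):

```latex
t_{xfer} \ge 10 \cdot t_{RTT} = 10 \times 50\,\mathrm{ms} = 0.5\,\mathrm{s}
\quad\Rightarrow\quad
\mathrm{data} = 0.5\,\mathrm{s} \times 10\,\mathrm{Gbit/s} = 5\,\mathrm{Gbit} \approx 625\,\mathrm{MByte}
```

which is the "about 1 GByte" figure quoted above.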
PS I think slow 3G links are a few hundred Kbit/s to 1 Mbit/s, with latencies around 100 ms.
https://www.ofcom.org.uk/about-ofcom/latest/media/media-releases/2014/3g-4g-bb-speeds
So the smallest amount of data you're ever likely to need to send (using similar assumptions to the above) is around 128 KBytes. You may as well start just below that (say 32 KBytes) and keep doubling until your data transmission time is significantly greater than your basic line latency (10x in the above example).
In that case, to get from 2^7 KBytes (128 KB) to 2^20 KBytes (1 GB) would take 20 − 7 = 13 attempts. That seems fine. Way better than a million above.
Agreed. Will double the amount of data until the data transmission time is greater than the line latency.
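A sketch of the agreed policy; the constants are illustrative, derived from the figures discussed above rather than taken from the PR:

```java
/* Illustrative sketch of the agreed probe-size policy. */
public final class ProbeSizePolicy {
    static final int MIN_PROBE_BYTES = 32 * 1024; // reviewer's suggested starting size
    static final int MAX_PROBE_BYTES = 1 << 30;   // ~1 GByte ceiling from the 10 Gbit example

    /** Doubles the payload until its transfer time clearly exceeds the RTT. */
    static int nextProbeBytes(int current, long transferNanos, long rttNanos) {
        if (transferNanos < rttNanos && current < MAX_PROBE_BYTES) {
            return current << 1; // sample too noisy: double and retry immediately
        }
        return current; // payload large enough for a meaningful rate sample
    }
}
```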
        >= KernelServerInfo.MIN_STABLE_DATA_RATE_TIMES) {
    /* Data rates are consistent. Double the poll period */
    period = period << 1;
    /* Limit max poll period to 512 sec */
nit: Limit the max poll period to MAX_METRIC_POLL_PERIOD_MS
Updated the comment
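The backoff under discussion, sketched with the constants from the diff below (MAX was 512 s at this point in the review; it is reduced to 128 s later in the thread). The method wrapper is illustrative:

```java
/* Illustrative sketch of the poll-period backoff described above. */
public final class PollPeriodPolicy {
    static final int MIN_METRIC_POLL_PERIOD_MS = 1000;
    static final int MAX_METRIC_POLL_PERIOD_MS = 512000;

    static int nextPeriodMs(int periodMs, boolean dataRateStable) {
        if (!dataRateStable) {
            // The rate moved: fall back to aggressive polling to track the change.
            return MIN_METRIC_POLL_PERIOD_MS;
        }
        // Consecutive stable samples: double the period, capped at the max.
        return Math.min(periodMs << 1, MAX_METRIC_POLL_PERIOD_MS);
    }
}
```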
@@ -28,41 +34,180 @@
 * @author iyzhang
 */
public class KernelClient {
    private static final int MIN_METRIC_POLL_PERIOD_MS = 1000; /* Minimum metrics poll period */
    private static final int MAX_METRIC_POLL_PERIOD_MS = 512000; /* Maximum metrics poll period */
I think this is probably too long? It will take almost 10 minutes to detect a faster or slower link? I think we should aim for a faster response than that. Either way, this number needs to be properly considered, and you need to document here why you chose the given number.
Since this heartbeat between kernel servers exists just to measure latency and transfer rate (i.e., not liveness), once we get stable samples we can gradually reduce the frequency of these heartbeats.
If the remote kernel server goes down for some reason, any RPC call to that particular server fails anyway; the caller then contacts the respective DM group policy to get an available server and uses it. The KernelServerInfo gets removed from this server's servers map after MAX_FAILED_HEARTBEATS.
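A hypothetical sketch of that eviction path. Only `MAX_FAILED_HEARTBEATS` and the `servers` map are named in the thread; the threshold value, the failure counter, and the method are assumptions:

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/* Hypothetical sketch, not the PR's actual code. */
public final class HeartbeatEviction {
    static final int MAX_FAILED_HEARTBEATS = 3; // assumed threshold

    final Map<InetSocketAddress, Integer> failedHeartbeats = new ConcurrentHashMap<>();
    final Map<InetSocketAddress, Object> servers = new ConcurrentHashMap<>();

    void onHeartbeatFailure(InetSocketAddress host) {
        int failures = failedHeartbeats.merge(host, 1, Integer::sum);
        if (failures >= MAX_FAILED_HEARTBEATS) {
            servers.remove(host); // stop polling a server that stays unreachable
            failedHeartbeats.remove(host);
        }
    }
}
```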
I think you missed my point. If a node moves between networks of different speeds (e.g. from wifi to 3G or vice versa) or if its link simply gets congested, then it will take until the next metric poll period for that change to be detected. If the metric poll period is 512 seconds, then that's almost 10 minutes before the change gets detected. That's way too long.
Got it. Have reduced it to 128 seconds now. From the test results, the amount of data sent in heartbeats to measure the data rate is negligible compared to the link capacity, so I believe 128 s (~2 min) is OK as an upper bound.
}

serverInfo.stableDataRateTimes++;
metric.latency = (t2 - t1); // RTT
Surely you should set this irrespective of whether the sample was ignored above. The measured latency is always correct. The measured throughput is only correct in some cases.
Actually the node metric object holds both the latency and the data rate, so we ignore the latency when the data rate cannot be calculated; otherwise we would have to send a node metric object (sample) to OMS with a valid latency but a data rate of 0.
Since this case can only occur a few initial times, until the data size is significant enough to get data transfer time > latency, we ignore the sample.
KernelServer server = serverInfo.remoteRef;
int dataLength = serverInfo.data.len;
NodeMetric metric = serverInfo.metric;
int period = serverInfo.metricPollPeriod;
I think you'll find that if you remove this alias, your logic will get a lot simpler. Just use serverInfo.metricPollPeriod directly? Similar argument for above 3 aliases, perhaps.
Modified
@@ -74,4 +79,47 @@ private boolean matchRequirements(List<Requirement> requirements) {
    }
    return !requirements.isEmpty();
}

private void writeObject(java.io.ObjectOutputStream s) throws java.io.IOException {
I don't understand why this method is here and so weird. Please comment to explain why you didn't do the standard stuff:
Added header and comments in it.
No you didn't.
 * @param data
 * @throws RemoteException
 */
void receiveHeartBeat(RandomData data) throws RemoteException;
It would be good to add comments explaining why it doesn't just take a String as a parameter. Superficially it looks quite silly to create a whole new data type just for this purpose. I understand you're probably trying to make the serialization and deserialization more efficient or something, but I think there's probably a much better way (e.g. by passing in a String.substring() created on the fly).
Added comments for RandomData class.
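For illustration, one shape such a type could take if the goal is to avoid allocating and copying a fresh payload per heartbeat. This is a guess at the intent, not the PR's actual RandomData; only the `len` field is named in the thread, and the buffer size is assumed:

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/* Speculative sketch of a payload carrier with custom serialization. */
public class RandomData implements Serializable {
    private static final int MAX_LEN = 1 << 20;             // assumed 1 MB ceiling
    private static final byte[] SHARED = new byte[MAX_LEN]; // contents are irrelevant
    int len;

    /* Write only the length plus a slice of the shared buffer,
     * so the sender never copies a per-heartbeat array. */
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.writeInt(len);
        out.write(SHARED, 0, len);
    }

    private void readObject(ObjectInputStream in) throws IOException {
        len = in.readInt();
        in.readFully(new byte[len]); // force the receiver to consume the payload
    }
}
```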
 * @param data
 */
@Override
public void receiveHeartBeat(RandomData data) {}
Come to think of it, the compiler/JIT might optimize this data away because it's never used. Might be worth checking that. I'm not sure what the best solution is if it is being optimized away. You might need to access it somehow, without using up too many CPU cycles.
Have checked it. The call is not optimized away, probably because it is an RPC, and the data is also deserialized on the remote end.
See inline comments for requested changes.
Force-pushed from 248871c to f71b126.
Link speed between the two systems used for testing: 1000 Mbps. KS1 (192.168.59.2) is running on system1 along with OMS; KS2 (192.168.59.4) is running on system2. The data rate unit is Bytes/sec. PFA the test logs below:
Force-pushed from f71b126 to f4841a7.
Force-pushed from f4841a7 to c7a4e5d.
@@ -28,41 +38,256 @@
 * @author iyzhang
 */
public class KernelClient {
    private static final int MIN_METRIC_POLL_PERIOD_MS = 1000; /* Minimum metrics poll period */
    private static final int MAX_METRIC_POLL_PERIOD_MS = 128000; /* Maximum metrics poll period */
You still don't explain here in comments why these numbers were chosen. They seem fairly arbitrary.
Put yourself in the position of someone coming after you to change the code. They will ask themselves, "can I change this number, to what should I change it, and why".
Force-pushed from 51f5f4b to 177a3ff.
…ts local kernel server is not registered to OMS. And also, kernel servers do metric measurement to the kernel server collocated within OMS
Have fixed the APP client not exiting issue. An APP client creates a dummy local kernel server, which is meant to route RPC calls through it to the remote kernel servers where the MicroServices it interacts with reside. But that local kernel server is not registered to OMS and does not send heartbeats to OMS, yet we were measuring node metrics from that dummy local kernel server to all the remaining remote kernel servers. In fact, such APP clients do not allow deployment of MicroServices on them (and also do not participate in automatic migration of MicroServices); hence, they shouldn't measure node metrics.
Fundmover app logs:
HanksTodo app logs: Have 4 kernel servers with 2 servers in each region.
KVStore app logs: Have 4 kernel servers with 2 servers in each region.
OK, to avoid further delays I'm going to merge this PR and make the proposed improvements in follow-up PRs.
This PR measures the latency and data rate from each node to every other available node, and reports these metrics to OMS in the existing heartbeat between kernel server and OMS. Handling of the received metrics on OMS is not part of this PR.
This PR is the same as the old PR #742, just raised from a new fork; the old PR is being closed.