Enhance the client phone home metrics #18326
Formerly, we sent only the active client count at the time of the phone home operation. This PR enhances the metrics sent during phone home. Along with the active client count at the time of the phone home operation, we now also send the following information:
- Connections opened in the last 24 hours (connections authenticated)
- Connections closed in the last 24 hours
- Total connection duration in the last 24 hours (sum of the durations of the closed connections + active connections)
- Client versions connected in the last 24 hours

Now, during the phone home, each node sends the number of active clients in the cluster plus the local information listed above.
Note that if phone home is not enabled, we do not collect the information listed above.
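As a rough illustration of the per-node bookkeeping described above, here is a minimal sketch. The class, method, and field names (`ClientStatsSketch`, `onAuthenticated`, `onClosed`, `Snapshot`) are hypothetical and are not taken from this PR; keying by client type and the `phoneHomeEnabled` flag are assumptions made only to mirror the description.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names come from the PR itself.
final class ClientStatsSketch {

    /** Read-only copy of the counters for one client type. */
    static final class Snapshot {
        final long opened;
        final long closed;
        final long totalDurationMillis;
        final Set<String> versions;

        Snapshot(long opened, long closed, long totalDurationMillis, Set<String> versions) {
            this.opened = opened;
            this.closed = closed;
            this.totalDurationMillis = totalDurationMillis;
            this.versions = versions;
        }
    }

    /** Mutable, thread-safe counters for one client type. */
    private static final class Stats {
        final AtomicLong opened = new AtomicLong();
        final AtomicLong closed = new AtomicLong();
        final AtomicLong durationMillis = new AtomicLong();
        final Set<String> versions = ConcurrentHashMap.newKeySet();
    }

    private final Map<String, Stats> statsByType = new ConcurrentHashMap<>();
    private final boolean phoneHomeEnabled;

    ClientStatsSketch(boolean phoneHomeEnabled) {
        this.phoneHomeEnabled = phoneHomeEnabled;
    }

    /** Records an authenticated (opened) connection and its client version. */
    void onAuthenticated(String clientType, String version) {
        if (!phoneHomeEnabled) {
            return; // nothing is collected when phone home is disabled
        }
        Stats stats = statsByType.computeIfAbsent(clientType, type -> new Stats());
        stats.opened.incrementAndGet();
        stats.versions.add(version);
    }

    /** Records a closed connection and how long it was open. */
    void onClosed(String clientType, long connectionDurationMillis) {
        if (!phoneHomeEnabled) {
            return;
        }
        Stats stats = statsByType.computeIfAbsent(clientType, type -> new Stats());
        stats.closed.incrementAndGet();
        stats.durationMillis.addAndGet(connectionDurationMillis);
    }

    /** Copies the current counters into immutable snapshots, keyed by client type. */
    Map<String, Snapshot> snapshot() {
        Map<String, Snapshot> result = new TreeMap<>();
        for (Map.Entry<String, Stats> entry : statsByType.entrySet()) {
            Stats s = entry.getValue();
            result.put(entry.getKey(), new Snapshot(s.opened.get(), s.closed.get(),
                    s.durationMillis.get(), new TreeSet<>(s.versions)));
        }
        return result;
    }
}
```

This sketch leaves out the contribution of still-active connections and the 24-hour reset; both are discussed in the review comments.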
Added a few nit comments; they aren't super-important to address.
ClientEndpointStatistics stat = getStats(endpoint);
long uptime = getUptime(now, endpoint);
if (uptime > 0) {
    stat.incrementTotalConnectionDuration(uptime);
Does it make sense to keep it in-memory and increment? We could also set it to `now - creationTime` and that would have the same result (unless I overlook something).
If we were reporting the all-time total connection duration, `now - creationTime` would work. But we planned to send stats in 24-hour intervals and reset them afterward. Do you think we should store and report all-time records instead?
I think this way is much easier to query: you just sum up the values over a given time interval. If we were reporting all-time values, queries would have to sum up the most recent values reported by each unique node.
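To make the difference concrete, here is a minimal sketch of per-window duration accounting. It is an assumption about how such accounting could work, not the PR's implementation: `WindowedDurationSketch`, `onClose`, and `reportAndReset` are hypothetical names, and clamping each connection's start to the window start is one possible way to avoid counting the same time in two reports.

```java
import java.util.Collection;

// Hypothetical sketch; the names below are illustrative, not the PR's classes.
final class WindowedDurationSketch {
    private long windowStart;    // when the current reporting window began
    private long closedDuration; // time contributed by connections closed in this window

    WindowedDurationSketch(long startMillis) {
        this.windowStart = startMillis;
    }

    /** Adds only the part of the connection's lifetime that falls inside the current window. */
    synchronized void onClose(long creationTime, long now) {
        closedDuration += now - Math.max(creationTime, windowStart);
    }

    /** Returns the window total (closed plus still-active connections) and starts a new window. */
    synchronized long reportAndReset(long now, Collection<Long> activeCreationTimes) {
        long total = closedDuration;
        for (long creationTime : activeCreationTimes) {
            total += now - Math.max(creationTime, windowStart);
        }
        closedDuration = 0;
        windowStart = now;
        return total;
    }
}
```

With this scheme, a connection created at time 0 that is still active at a report at time 100 and closes at time 150 contributes 100 to the first report and 50 to the second. Summing the reports over any interval gives the connection time spent in that interval, which is the query simplification described above.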
 * active at that time.
 * @return map of client type to statistics snapshots.
 */
Map<String, ClientEndpointStatisticsSnapshot> getSnapshotsAndReset(Collection<ClientEndpoint> activeEndpoints);
nit: this could be called `drainSnapshots()`
If it is okay, I would like to keep this and the method name on the Counter class as they are
 * @param newValue the new value.
 * @return the old value of the counter.
 */
long getAndSet(long newValue);
nit: similarly, this could be `drain()` without parameters (the actual `newValue` is always `0` anyway)
I thought this method might be useful in the future, since we use counters in other places too. So I tried to be as generic as possible here.
Sure, no prob, let's get this merged then.
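For reference, the two options discussed in this thread are compatible: a parameterless `drain()` can be expressed on top of the generic `getAndSet(long)`. The sketch below uses a hypothetical `CounterSketch` interface backed by `java.util.concurrent.atomic.AtomicLong`; it is not the project's `Counter` class, and `inc`, `get`, and `drain` are assumed names.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter interface, not the project's Counter class.
interface CounterSketch {
    long inc(long amount);

    long get();

    /** Sets a new value and returns the old one. */
    long getAndSet(long newValue);

    /** Convenience form of the reviewer's suggestion: read the value and reset it to 0. */
    default long drain() {
        return getAndSet(0L);
    }
}

final class AtomicCounterSketch implements CounterSketch {
    private final AtomicLong value = new AtomicLong();

    @Override
    public long inc(long amount) {
        return value.addAndGet(amount);
    }

    @Override
    public long get() {
        return value.get();
    }

    @Override
    public long getAndSet(long newValue) {
        // AtomicLong.getAndSet swaps atomically, so no increment is lost between read and reset.
        return value.getAndSet(newValue);
    }
}
```

Keeping `getAndSet` generic leaves room for other callers, while `drain()` documents the only value used by the phone home path.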
Closes #18308