Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance the client phone home metrics #18326

Merged
merged 2 commits into from
Mar 4, 2021

Conversation

mdumandag
Copy link
Contributor

Formerly, we were just sending the active clients count at the time
of the phone home operation. This is an attempt to enhance the metrics
sent during the phone home. Now, along with the active clients count
at the time of the phone home operation, we are also sending the following
information:

  • Connections opened in the last 24 hours (connections authenticated)
  • Connections closed in the last 24 hours
  • Total connection duration in the last 24 hours (sum of the duration of the
    closed connections + active connections)
  • Client versions connected in the last 24 hours

Now, during the phone home, each node will send the number of active clients
in the cluster plus the local informations listed above.

Note that, If the phone home is not enabled, we will not collect the informations
listed above.

Closes #18308

Formerly, we were just sending the active clients count at the time
of the phone home operation. This is an attempt to enhance the metrics
sent during the phone home. Now, along with the active clients count
at the time of the phone home operation, we are also sending the following
information:

- Connections opened in the last 24 hours (connections authenticated)
- Connections closed in the last 24 hours
- Total connection duration in the last 24 hours (sum of the duration of the
closed connections + active connections)
- Client versions connected in the last 24 hours

Now, during the phone home, each node will send the number of active clients
in the cluster plus the local informations listed above.

Note that, If the phone home is not enabled, we will not collect the informations
listed above.
@mdumandag mdumandag added Type: Enhancement Team: Client Source: Internal PR or issue was opened by an employee labels Mar 2, 2021
@mdumandag mdumandag added this to the 4.2 milestone Mar 2, 2021
@mdumandag mdumandag self-assigned this Mar 2, 2021
@mdumandag mdumandag requested a review from a team as a code owner March 2, 2021 11:32
@emre-aydin emre-aydin requested a review from erosb March 3, 2021 14:18
Copy link
Contributor

@erosb erosb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few nit comments, they aren't super-important to address.

ClientEndpointStatistics stat = getStats(endpoint);
long uptime = getUptime(now, endpoint);
if (uptime > 0) {
stat.incrementTotalConnectionDuration(uptime);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to keep it in-memory and increment? We could also set it to now - creationTime and that would have the same result (unless I overlook something).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were reporting all-time total connection duration, now-creationTime would work. But, we planned to send stats in 24-hour intervals and reset them afterward. Do you think that we should store and report all-time records instead?

I think this way, it is much easier to query, you just sum up values on a certain time interval. If we were reporting all-time values, queries have to sum up values reported recently by the unique nodes.

* active at that time.
* @return map of client type to statistics snapshots.
*/
Map<String, ClientEndpointStatisticsSnapshot> getSnapshotsAndReset(Collection<ClientEndpoint> activeEndpoints);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this could be called drainSnapshots()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is okay, I would like to keep this and the method name on the Counter class as they are

* @param newValue the new value.
* @return the old value of the counter.
*/
long getAndSet(long newValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: similarly, this could be drain() without parameters (the actual newValue is always 0 anyway)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this method might be useful in the future, we use counters in other places too. So, I tried to be as generic as possible here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, no prob, let's get this merged then.

@erosb erosb merged commit a55161d into hazelcast:master Mar 4, 2021
@mdumandag mdumandag deleted the client-phone-homes branch March 4, 2021 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Account correctly for number of total clients in phone homes
4 participants