Required Client side metrics #96

NiteshKant · 2014-04-21T07:26:53Z

Initially there was no insight into which metrics should be provided, so this comment has been updated post-implementation to describe the available metrics.
The following metrics will be available out of the box with the servo metrics plugin.

TCP

Live Connections: The number of open connections from this client to a server. This is a gauge.
Connection count: The total number of connections ever created by this client. This is a monotonically increasing counter.
Pending Connections: The number of connections that are pending. This is a gauge.
Failed connects: Total number of connect failures.
Connection times: Time taken to establish a connection.
Pending connection close: Number of connections which are requested to be closed but are not yet closed. This is a gauge.
Failed connection close: Number of times when the connection close failed. This is a monotonically increasing counter.
Pending pool acquires: For clients with a connection pool, the number of acquires that are pending. This is a gauge.
Failed pool acquires: For clients with a connection pool, the number of acquires that failed. This is a monotonically increasing counter.
Pool acquire times: For clients with a connection pool, time taken to acquire a connection from the pool.
Pending pool releases: For clients with a connection pool, the number of releases that are pending. This is a gauge.
Failed pool releases: For clients with a connection pool, the number of releases that failed. This is a monotonically increasing counter.
Pool release times: For clients with a connection pool, time taken to release a connection to the pool.
Pool acquires: For clients with a connection pool, the total number of acquires from the pool.
Pool evictions: For clients with a connection pool, the total number of evictions from the pool.
Pool reuse: For clients with a connection pool, the total number of times a connection from the pool was reused.
Pool releases: For clients with a connection pool, the total number of releases to the pool.
Pending Writes: Writes that are pending to be written over the socket. This includes writes which are not flushed. This is a gauge.
Pending Flushes: Flushes that are issued but are not over yet. This is a gauge.
Bytes Written: Total number of bytes written. This is a monotonically increasing counter.
Write Times: The time taken to finish a write.
Bytes Read: The total number of bytes read. This is a monotonically increasing counter.
Failed Writes: The total number of writes that failed. This is a monotonically increasing counter.
Failed Flushes: The total number of flushes that failed. This is a monotonically increasing counter.
Flush times: The time taken to finish a flush.
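
To make the gauge versus counter distinction in the list above concrete, here is a minimal sketch. The class and method names are hypothetical and are not taken from the servo plugin; it only assumes that a gauge tracks current state while a monotonically increasing counter is never decremented.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder for two of the TCP metrics above; not the plugin's actual classes.
final class TcpConnectionMetricsSketch {
    // Gauge ("Live Connections"): reflects current state, so it moves up and down.
    private final AtomicLong liveConnections = new AtomicLong();
    // Monotonically increasing counter ("Connection count"): never decremented.
    private final AtomicLong connectionCount = new AtomicLong();

    void onConnectionCreated() {
        liveConnections.incrementAndGet();
        connectionCount.incrementAndGet();
    }

    void onConnectionClosed() {
        // Only the gauge drops; the total number of connections ever created is unchanged.
        liveConnections.decrementAndGet();
    }

    long getLiveConnections() { return liveConnections.get(); }

    long getConnectionCount() { return connectionCount.get(); }
}
```

After two connections are created and one is closed, the gauge reads 1 while the counter stays at 2.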

HTTP

HTTP contains all the metrics that are available from TCP. The following metrics are specific to HTTP:

Request backlog: The number of requests that have been submitted but whose processing has not yet started. This is a gauge.
Inflight requests: The number of requests whose processing has started but not yet finished. This is a gauge.
Processed Requests: Total number of requests processed. This is a monotonically increasing counter.
Request Write Times: Time taken to write requests, including headers and content.
Response Read Times: Time taken to read a response.
Failed Responses: Total number of responses that failed, i.e. the request was sent but the response was an error.
Failed request writes: Total number of requests for which the writes failed.
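
One way to read the backlog, inflight and processed metrics above is as stages a request moves through. The sketch below assumes that reading; the class and method names are hypothetical and not the actual implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a request moves from backlog to inflight to processed.
final class HttpRequestMetricsSketch {
    private final AtomicLong requestBacklog = new AtomicLong();    // gauge
    private final AtomicLong inflightRequests = new AtomicLong();  // gauge
    private final AtomicLong processedRequests = new AtomicLong(); // monotonic counter

    // Request submitted but processing not yet started.
    void onRequestSubmitted() {
        requestBacklog.incrementAndGet();
    }

    // Processing started: leaves the backlog, becomes inflight.
    void onRequestProcessingStarted() {
        requestBacklog.decrementAndGet();
        inflightRequests.incrementAndGet();
    }

    // Processing finished: no longer inflight, counted as processed.
    void onRequestProcessed() {
        inflightRequests.decrementAndGet();
        processedRequests.incrementAndGet();
    }

    long getRequestBacklog() { return requestBacklog.get(); }

    long getInflightRequests() { return inflightRequests.get(); }

    long getProcessedRequests() { return processedRequests.get(); }
}
```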

UDP

UDP contains all the metrics that are available from TCP.

NiteshKant · 2014-05-19T21:58:43Z

Moving this to milestone 0.3.5.
Had to invest time on issue #117 which requires a release now.

allenxwang · 2014-05-23T21:36:40Z

Here are the metrics I can think of:

Connection time (for connection pool, this should be the time to make a new connection)
For HttpClient, time for the first read after a request is submitted
Counters for different exceptions (ConnectException, ReadTimeoutException, etc)
For HttpClient, counters for different status codes
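
As a rough illustration of the last two suggestions, counters keyed by status code and by exception type could look like the sketch below. All names here are hypothetical, and the map-based storage is just one possible choice.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one counter per HTTP status code and per exception class.
final class ResponseCountersSketch {
    private final ConcurrentMap<Integer, AtomicLong> byStatus =
            new ConcurrentHashMap<>();
    private final ConcurrentMap<Class<? extends Throwable>, AtomicLong> byException =
            new ConcurrentHashMap<>();

    void onResponse(int statusCode) {
        // Create the counter lazily the first time a status code is seen.
        byStatus.computeIfAbsent(statusCode, k -> new AtomicLong()).incrementAndGet();
    }

    void onError(Throwable error) {
        byException.computeIfAbsent(error.getClass(), k -> new AtomicLong()).incrementAndGet();
    }

    long statusCount(int statusCode) {
        AtomicLong counter = byStatus.get(statusCode);
        return counter == null ? 0L : counter.get();
    }

    long exceptionCount(Class<? extends Throwable> type) {
        AtomicLong counter = byException.get(type);
        return counter == null ? 0L : counter.get();
    }
}
```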

benjchristensen · 2014-05-23T21:38:25Z

How do you envision these working? Maintaining them all internally with an API for retrieving them? Or firing the events to a plugin so the implementation decides what to do with them?

allenxwang · 2014-05-23T21:59:34Z

An event-based implementation should do.

NiteshKant · 2014-05-23T22:21:00Z

This is a sub-task for issue #98 where we decided on an event-based implementation.

I was actually thinking about the event publishing per se, and I am in two minds about whether we should have a model like:

Option 1: One callback like Rx onNext(), passing the event data.
OR
Option 2: A callback interface with specific methods like requestStart(), requestEnd() etc.

There are of course pros & cons of both.
The biggest con of the first approach is that an event object is created per callback.
Judging by netty's evolution from 3.x to 4.x, the first approach creates a lot of garbage and hence GC pressure in high-throughput applications.

The pro of the first approach is simplicity and the ability to add new callbacks without breaking the contract.
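
The two models can be sketched side by side as follows. Every type and method name here is hypothetical and only illustrates the trade-off; the demo methods return the number of event objects allocated per request to make the difference explicit.

```java
// Option 1 (hypothetical names): one generic callback. Each notification
// needs an event object to carry its data.
final class MetricEventSketch {
    final String type;
    final long durationNanos;

    MetricEventSketch(String type, long durationNanos) {
        this.type = type;
        this.durationNanos = durationNanos;
    }
}

interface GenericMetricListenerSketch {
    void onEvent(MetricEventSketch event);
}

// Option 2 (hypothetical names): one method per event, primitives as arguments.
interface RequestMetricListenerSketch {
    void requestStart();

    void requestEnd(long durationNanos);
}

final class CallbackModelsDemo {
    // Publishes one request's events via option 1; returns the event objects allocated.
    static int publishWithOption1(GenericMetricListenerSketch listener, long durationNanos) {
        listener.onEvent(new MetricEventSketch("requestStart", 0L));
        listener.onEvent(new MetricEventSketch("requestEnd", durationNanos));
        return 2;
    }

    // Publishes the same events via option 2; no event objects are created.
    static int publishWithOption2(RequestMetricListenerSketch listener, long durationNanos) {
        listener.requestStart();
        listener.requestEnd(durationNanos);
        return 0;
    }
}
```

With option 1, a new event type only needs a new `type` value. With option 2, it needs a new method on the interface, which is the contract change mentioned above.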

NiteshKant · 2014-05-23T23:49:44Z

After looking at netty's evolution and also the model that hystrix follows I am leaning towards Option 2 above.
I think we should also add an extension point onCustomEvent() in these metrics receivers to keep it open for any changes to metrics without breaking backward compatibility.

Thoughts?
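
A minimal sketch of what such an extension point might look like. The base class, the no-op defaults and the onCustomEvent(String, long) signature are all assumptions made for illustration, not a settled design.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical receiver with explicit methods plus a generic extension point.
// No-op defaults let implementations override only what they care about.
abstract class ClientMetricsListenerSketch {
    void requestStart() { }

    void requestEnd(long duration, TimeUnit unit) { }

    // Extension point: events that have no dedicated method yet arrive here,
    // so new metrics do not require changing this class's existing methods.
    void onCustomEvent(String eventName, long value) { }
}

// Example implementation that only cares about a custom "retry" event.
final class RetryCountingListenerSketch extends ClientMetricsListenerSketch {
    private final AtomicLong retries = new AtomicLong();

    @Override
    void onCustomEvent(String eventName, long value) {
        if ("retry".equals(eventName)) {
            retries.addAndGet(value);
        }
    }

    long getRetries() { return retries.get(); }
}
```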

headinthebox · 2014-05-26T00:34:27Z

Just to understand correctly, when you talk about exposing events for each metric, you expose them as Observables, as in https://github.com/Netflix/RxNetty/blob/master/rx-netty/src/main/java/io/reactivex/netty/client/PoolInsightProvider.java#L19?

NiteshKant · 2014-05-26T03:04:59Z

The issue with exposing them as an Observable arises when there is per-event state (e.g. the duration of request processing). It means we have to create a new object for every event, and there will be many of them (roughly 5-6 events per request). In high-throughput systems this creates a lot of short-lived objects and hence GC pressure. This behavior is also described in the netty case study I pointed to in the last comment.
This is the reason I am leaning towards having an EventListener interface with explicit methods like requestStart() & requestComplete(duration), which can be registered with the client/server instance.
I will publish the initial design in this issue & we can iterate.
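
Until the design is published, the idea can be sketched roughly as below. This is only an illustration under assumptions: the interface, the subscribe() method and the fire* methods are hypothetical names, and the array snapshot is just one way to notify listeners without creating a per-event object.

```java
import java.util.Arrays;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical listener with explicit methods; state is passed as primitives.
interface RequestEventListenerSketch {
    void requestStart();

    void requestComplete(long duration, TimeUnit unit);
}

// Hypothetical stand-in for the client/server instance that listeners register with.
final class ClientEventsSketch {
    private volatile RequestEventListenerSketch[] listeners = new RequestEventListenerSketch[0];

    // Copy-on-write registration keeps the notification path lock-free.
    synchronized void subscribe(RequestEventListenerSketch listener) {
        RequestEventListenerSketch[] copy = Arrays.copyOf(listeners, listeners.length + 1);
        copy[copy.length - 1] = listener;
        listeners = copy;
    }

    void fireRequestStart() {
        RequestEventListenerSketch[] snapshot = listeners;
        for (int i = 0; i < snapshot.length; i++) {
            snapshot[i].requestStart();
        }
    }

    void fireRequestComplete(long duration, TimeUnit unit) {
        RequestEventListenerSketch[] snapshot = listeners;
        for (int i = 0; i < snapshot.length; i++) {
            snapshot[i].requestComplete(duration, unit);
        }
    }
}

// Example listener that folds events straight into counters.
final class RequestStatsListenerSketch implements RequestEventListenerSketch {
    final AtomicLong inflight = new AtomicLong();
    final AtomicLong processed = new AtomicLong();
    final AtomicLong totalNanos = new AtomicLong();

    @Override
    public void requestStart() {
        inflight.incrementAndGet();
    }

    @Override
    public void requestComplete(long duration, TimeUnit unit) {
        inflight.decrementAndGet();
        processed.incrementAndGet();
        totalNanos.addAndGet(unit.toNanos(duration));
    }
}
```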

headinthebox · 2014-05-26T03:13:07Z

Cool.

NiteshKant · 2014-05-26T21:11:47Z

I have put the design proposal in the parent issue #98. Any feedback is much appreciated!

benjchristensen · 2014-05-28T16:19:26Z

Agreed on the need to avoid object allocations for this.

Metrics (Fixes Issues #96 #97 #98)

NiteshKant mentioned this issue Apr 21, 2014

Metrics #98

Closed

NiteshKant self-assigned this Apr 21, 2014

NiteshKant added the enhancement label Apr 21, 2014

NiteshKant added this to the 0.3.4 milestone Apr 28, 2014

NiteshKant modified the milestones: 0.3.5, 0.3.4 May 19, 2014

NiteshKant modified the milestones: 0.3.7, 0.3.6 Jun 24, 2014

NiteshKant pushed a commit to NiteshKant/RxNetty that referenced this issue Jun 24, 2014

Fixes issue ReactiveX#96 ReactiveX#97 ReactiveX#98

5733172

NiteshKant added a commit that referenced this issue Jun 25, 2014

Merge pull request #149 from NiteshKant/metrics

4e26dd7

Metrics (Fixes Issues #96 #97 #98)

NiteshKant closed this as completed Jun 25, 2014

NiteshKant removed their assignment Aug 19, 2014
