Required Client side metrics #96

Closed
NiteshKant opened this issue Apr 21, 2014 · 11 comments
@NiteshKant
Member

There was initially no insight into which metrics should be provided, so this comment has been updated post-implementation to describe the available metrics.
The following metrics are available from the out-of-the-box Servo metrics plugin.

TCP
  • Live Connections: The number of open connections from this client to a server. This is a gauge.
  • Connection count: The total number of connections ever created by this client. This is a monotonically increasing counter.
  • Pending Connections: The number of connection attempts that have been issued but have not yet completed. This is a gauge.
  • Failed connects: Total number of connect failures.
  • Connection times: Time taken to establish a connection.
  • Pending connection close: Number of connections which are requested to be closed but are not yet closed. This is a gauge.
  • Failed connection close: Number of times when the connection close failed. This is a monotonically increasing counter.
  • Pending pool acquires: For clients with a connection pool, the number of acquires that are pending. This is a gauge.
  • Failed pool acquires: For clients with a connection pool, the number of acquires that failed. This is a monotonically increasing counter.
  • Pool acquire times: For clients with a connection pool, time taken to acquire a connection from the pool.
  • Pending pool releases: For clients with a connection pool, the number of releases that are pending. This is a gauge.
  • Failed pool releases: For clients with a connection pool, the number of releases that failed. This is a monotonically increasing counter.
  • Pool release times: For clients with a connection pool, time taken to release a connection to the pool.
  • Pool acquires: For clients with a connection pool, the total number of acquires from the pool.
  • Pool evictions: For clients with a connection pool, the total number of evictions from the pool.
  • Pool reuse: For clients with a connection pool, the total number of times a connection from the pool was reused.
  • Pool releases: For clients with a connection pool, the total number of releases to the pool.
  • Pending Writes: Writes that are pending to be written over the socket. This includes writes which are not flushed.
    This is a gauge.
  • Pending Flushes: Flushes that are issued but are not over yet. This is a gauge.
  • Bytes Written: Total number of bytes written. This is a monotonically increasing counter.
  • Write Times: The time taken to finish a write.
  • Bytes Read: The total number of bytes read. This is a monotonically increasing counter.
  • Failed Writes: The total number of writes that failed. This is a monotonically increasing counter.
  • Failed Flushes: The total number of flushes that failed. This is a monotonically increasing counter.
  • Flush times: The time taken to finish a flush.
HTTP

HTTP contains all the metrics that are available from TCP. The following metrics are specific to HTTP:

  • Request backlog: The number of requests that have been submitted but not started processing. This is a gauge.
  • Inflight requests: The number of requests that have been started processing but not yet finished processing. This is a gauge.
  • Processed Requests: Total number of requests processed. This is a monotonically increasing counter.
  • Request Write Times: Time taken to write requests, including headers and content.
  • Response Read Times: Time taken to read a response.
  • Failed Responses: Total number of responses that failed, i.e. the request was sent but the response was an error.
  • Failed request writes: Total number of requests for which the writes failed.
UDP

UDP contains all the metrics that are available from TCP.
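
For reference, the three kinds of metrics used above (gauges, monotonically increasing counters, and timers) map naturally onto Servo monitor types. The sketch below is illustrative only, assuming the stock com.netflix.servo.monitor API; the class and metric names are hypothetical and not necessarily what the plugin registers:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import com.netflix.servo.DefaultMonitorRegistry;
import com.netflix.servo.monitor.BasicCounter;
import com.netflix.servo.monitor.BasicGauge;
import com.netflix.servo.monitor.BasicTimer;
import com.netflix.servo.monitor.MonitorConfig;

public class TcpMetricsSketch {

    // Gauge backing value, e.g. "Live Connections": can go up and down.
    private final AtomicLong liveConnections = new AtomicLong();
    private final BasicGauge<Long> liveConnectionsGauge = new BasicGauge<Long>(
            MonitorConfig.builder("liveConnections").build(),
            new Callable<Long>() {
                @Override
                public Long call() {
                    return liveConnections.get();
                }
            });

    // Monotonically increasing counter, e.g. "Connection count".
    private final BasicCounter connectionCount =
            new BasicCounter(MonitorConfig.builder("connectionCount").build());

    // Timer, e.g. "Connection times".
    private final BasicTimer connectionTimes = new BasicTimer(
            MonitorConfig.builder("connectionTimes").build(), TimeUnit.MILLISECONDS);

    public TcpMetricsSketch() {
        DefaultMonitorRegistry.getInstance().register(liveConnectionsGauge);
        DefaultMonitorRegistry.getInstance().register(connectionCount);
        DefaultMonitorRegistry.getInstance().register(connectionTimes);
    }

    // How the monitors would be updated around a successful connect.
    public void recordConnect(long connectMillis) {
        connectionCount.increment();
        liveConnections.incrementAndGet();
        connectionTimes.record(connectMillis, TimeUnit.MILLISECONDS);
    }

    public void recordDisconnect() {
        liveConnections.decrementAndGet();
    }
}
```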

@NiteshKant NiteshKant mentioned this issue Apr 21, 2014
@NiteshKant NiteshKant self-assigned this Apr 21, 2014
@NiteshKant NiteshKant added this to the 0.3.4 milestone Apr 28, 2014
@NiteshKant NiteshKant modified the milestones: 0.3.5, 0.3.4 May 19, 2014
@NiteshKant
Member Author

Moving this to milestone 0.3.5.
Had to invest time on issue #117, which requires a release now.

@allenxwang

Here are the metrics I can think of:

  • Connection time (for connection pool, this should be the time to make a new connection)
  • For HttpClient, time for the first read after a request is submitted
  • Counters for different exceptions (ConnectException, ReadTimeoutException, etc.)
  • For HttpClient, counters for different status codes
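
A hypothetical sketch of the per-exception and per-status-code counters suggested above, again using Servo monitors; the class and method names are illustrative and not part of RxNetty:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import com.netflix.servo.DefaultMonitorRegistry;
import com.netflix.servo.monitor.BasicCounter;
import com.netflix.servo.monitor.MonitorConfig;

public class HttpClientErrorCounters {

    // One counter per distinct exception class or status code, created lazily.
    private final ConcurrentMap<String, BasicCounter> counters =
            new ConcurrentHashMap<String, BasicCounter>();

    public void recordException(Throwable error) {
        // e.g. "exception.ConnectException", "exception.ReadTimeoutException"
        counterFor("exception." + error.getClass().getSimpleName()).increment();
    }

    public void recordStatusCode(int statusCode) {
        // e.g. "status.200", "status.503"
        counterFor("status." + statusCode).increment();
    }

    private BasicCounter counterFor(String name) {
        BasicCounter counter = counters.get(name);
        if (counter == null) {
            BasicCounter created = new BasicCounter(MonitorConfig.builder(name).build());
            BasicCounter existing = counters.putIfAbsent(name, created);
            if (existing == null) {
                DefaultMonitorRegistry.getInstance().register(created);
                counter = created;
            } else {
                counter = existing;
            }
        }
        return counter;
    }
}
```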

@benjchristensen
Member

How do you envision these working? Maintaining them all internally with an API for retrieving them, or firing the events to a plugin so the implementation decides what to do with them?

@allenxwang

An event-based implementation should do.

@NiteshKant
Member Author

This is a sub-task of issue #98, where we decided on an event-based implementation.

I was actually thinking about the event publishing itself, and I am in two minds about which model we should have:

  • One callback, like Rx onNext(), that is passed the event data.
    OR
  • A callback interface with specific methods like requestStart(), requestEnd(), etc.

There are, of course, pros and cons to both.
The biggest con of the first approach is that an event object is created per callback.
Based on what netty's evolution has seen (3.x to 4.x), the first approach creates a lot of garbage and hence GC pressure in high-throughput applications.

The pro of the first approach is its simplicity and the ability to add new callbacks without breaking the contract.
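
To make the trade-off concrete, a rough sketch of the two shapes being compared; all names below are hypothetical and not from the RxNetty codebase:

```java
// Option 1: a single callback receives an event object per occurrence,
// similar to Rx onNext(). Flexible, but allocates one event object per event.
final class MetricEvent {
    final String name;          // e.g. "requestStart", "requestEnd"
    final long durationMillis;  // optional state carried with the event

    MetricEvent(String name, long durationMillis) {
        this.name = name;
        this.durationMillis = durationMillis;
    }
}

interface MetricEventsCallback {
    void onNext(MetricEvent event); // a new MetricEvent is created per callback
}

// Option 2: a callback interface with one explicit method per event.
// No per-event allocation, but adding an event means adding a method.
interface HttpClientMetricsCallback {
    void requestStart();
    void requestEnd(long durationMillis);
}
```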

@NiteshKant
Member Author

After looking at netty's evolution and also the model that Hystrix follows, I am leaning towards Option 2 above.
I think we should also add an extension point, onCustomEvent(), to these metrics receivers to keep them open to changes in the metrics without breaking backward compatibility.

Thoughts?

@headinthebox

Just to understand correctly, when you talk about exposing events for each metric, you expose them as Observables, as in https://github.com/Netflix/RxNetty/blob/master/rx-netty/src/main/java/io/reactivex/netty/client/PoolInsightProvider.java#L19?

@NiteshKant
Member Author

The issue with exposing them as an Observable arises when we have state (e.g. duration for request processing) per event. It means that for every event (and these events will be numerous, roughly 5-6 per request) we have to create a new object. In high-throughput systems this can create a lot of short-lived objects and hence GC pressure. This behavior is also referred to in the netty case study I pointed to in the last comment.
This is the reason I am leaning towards having an EventListener interface with explicit methods like requestStart() & requestComplete(duration) which can be registered with the client/server instance.
I will publish the initial design in this issue & we can iterate.
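
A minimal sketch of this direction, folding in the onCustomEvent() extension point mentioned earlier; the interface, its methods, and the registration call shown are assumptions for illustration, not the final RxNetty API:

```java
// Hypothetical listener contract: explicit methods avoid allocating an event
// object per callback, and onCustomEvent() leaves room for new metrics without
// breaking existing implementations.
interface HttpClientEventsListener {
    void requestStart();
    void requestComplete(long durationMillis);
    void onCustomEvent(Object event); // extension point for future additions
}

// A simple listener that records request latencies.
class LatencyPrintingListener implements HttpClientEventsListener {
    @Override
    public void requestStart() {
        // nothing to do here; the duration arrives with requestComplete()
    }

    @Override
    public void requestComplete(long durationMillis) {
        System.out.println("request took " + durationMillis + " ms");
    }

    @Override
    public void onCustomEvent(Object event) {
        // ignore events this listener does not understand
    }
}

// Registration with a client instance would then look something like:
//   httpClient.subscribe(new LatencyPrintingListener());
```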

@headinthebox

Cool.

@NiteshKant
Member Author

I have put the design proposal in the parent issue #98. Any feedback is much appreciated!

@benjchristensen
Member

Agreed on the need to avoid object allocations for this.

@NiteshKant NiteshKant modified the milestones: 0.3.7, 0.3.6 Jun 24, 2014
NiteshKant pushed a commit to NiteshKant/RxNetty that referenced this issue Jun 24, 2014
NiteshKant added a commit that referenced this issue Jun 25, 2014
@NiteshKant NiteshKant removed their assignment Aug 19, 2014