Added endpoint-level metrics keeping track of the request rate for each path. #104

Merged Feb 7, 2018 (13 commits)

Conversation

lducharme
Contributor

I added metrics for request rate at the endpoint level. In addition to having this:

com.nordstrom.xrpc.server.Router.requests.Rate
             count = 0
         mean rate = 0.00 events/second
     1-minute rate = 0.00 events/second
     5-minute rate = 0.00 events/second
    15-minute rate = 0.00 events/second

we now also have metrics that keep track of the rate of requests for each endpoint (that is, combination of HTTP method and path):

com.nordstrom.xrpc.server.Router.handler.rate.get.people
             count = 0
         mean rate = 0.00 events/second
     1-minute rate = 0.00 events/second
     5-minute rate = 0.00 events/second
    15-minute rate = 0.00 events/second
com.nordstrom.xrpc.server.Router.handler.rate.post.people.{person}
             count = 0
         mean rate = 0.00 events/second
     1-minute rate = 0.00 events/second
     5-minute rate = 0.00 events/second
    15-minute rate = 0.00 events/second

A few things I'm unsure about and would especially appreciate feedback on:
Is ServiceRateLimiter the right place to configure these metrics? Since the overall server level rate is configured there, that's what I went with, but it seems a little off.

Re: issue 103, these metric names might contribute to further inconsistency. Any thoughts on naming these better?

Note we also may consider adding metrics per client. If we have metrics per client and per endpoint, this could end up being a large/unreadable number of metrics... how might we deal with this?
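
To make the naming scheme concrete, here is a minimal sketch of per-endpoint counting keyed by HTTP method and path. It uses only the JDK: `LongAdder` stands in for the metrics library's `Meter` (it counts events but does not compute rates), and `EndpointCounters`, `endpointKey`, `mark`, and `count` are hypothetical names for illustration, not code from this PR.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the per-endpoint meters; not the xrpc implementation.
final class EndpointCounters {
  private final Map<String, LongAdder> countersByEndpoint = new ConcurrentHashMap<>();

  // Builds a key such as "get.people" from "GET" and "/people".
  static String endpointKey(String httpMethod, String path) {
    return httpMethod.toLowerCase(Locale.ROOT) + path.replace('/', '.');
  }

  // Records one request for the endpoint, creating its counter on first use.
  void mark(String httpMethod, String path) {
    countersByEndpoint
        .computeIfAbsent(endpointKey(httpMethod, path), key -> new LongAdder())
        .increment();
  }

  // Returns the number of requests seen for the endpoint, or 0 if none were recorded.
  long count(String httpMethod, String path) {
    LongAdder counter = countersByEndpoint.get(endpointKey(httpMethod, path));
    return counter == null ? 0L : counter.sum();
  }
}
```

With this sketch, `endpointKey("POST", "/people/{person}")` yields `post.people.{person}`, matching the suffix of the metric names shown above.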

Contributor

@andyday andyday left a comment

let me know if you have questions


protected static String getMeterNameForRoute(Route route, String httpMethod) {
String result = "";
if (httpMethod != null) {
Contributor

we should default this to 'ANY'

result += httpMethod.toLowerCase();
}

if (route != null && route.toString() != null) {
Contributor

no route should cause us to throw a new IllegalArgumentException

return getMeterNameForRoute(route, httpMethod.name());
}

protected static String getMeterNameForRoute(Route route, String httpMethod) {
Contributor

since httpMethod can be null we should use Optional<String> httpMethod here

Contributor

Optional is much better for readability, but it adds extra garbage for each request. I'm planning on running a benchmarking series at some point; it would be good to see how this impacts things.
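
The suggestions in this thread (an `Optional` method parameter, a fallback when the method is absent, and failing fast on a missing route) could look roughly like the sketch below. It is an illustration under assumptions, not the merged `MetricsUtil`: `MeterNameSketch` is a hypothetical class, a plain `String` path stands in for `Route.toString()`, and the fallback is written as lowercase `any` to match the lowercase method names in the metric output.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper illustrating the review suggestions; not the merged MetricsUtil.
final class MeterNameSketch {
  private MeterNameSketch() {}

  // routePath stands in for Route.toString(); an absent method falls back to "any".
  static String meterNameForRoute(String routePath, Optional<String> httpMethod) {
    if (routePath == null) {
      throw new IllegalArgumentException("Route cannot be null.");
    }
    String method = httpMethod.map(m -> m.toLowerCase(Locale.ROOT)).orElse("any");
    return method + routePath.replace('/', '.');
  }
}
```

As noted above, wrapping the method in an `Optional` allocates an object per call, so whether this belongs on the request path is a question for the benchmarks.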

@@ -46,4 +46,8 @@
routes = new AtomicReference<>();

@Getter private final ObjectMapper mapper;

@Getter
private final ConcurrentHashMap<String, Meter> rateMetersByRouteAndMethod =
Contributor

this should be named something like metersByRoute. rate doesn't apply and Method is unneeded. ideally, method is already part of the route (conceptually)

configEndpointLevelRateMeters(metrics, ctx);
}

private void configEndpointLevelRateMeters(
Contributor

this doesn't belong in this class. it should be done just below the configResponseCodeMeters() in Router

Contributor Author

I originally had it where you suggest, in the constructor for Router, but at that point the routes aren't configured yet, so I can't set up meters for each route.

Looking at it again, perhaps a more appropriate place to put it would be early in Router.listenAndServe?

ImmutableSortedMap<Route, List<ImmutableMap<XHttpMethod, Handler>>> routes =
ctx.getRoutes().get();

final String NAME_PREFIX = "handler.rate.";
Contributor

this should be changed to something like "routes."; also the name should be changed to namePrefix


final String NAME_PREFIX = "handler.rate.";

Set<String> routeAndMethodNames = new HashSet<>();
Contributor

no need to have this temp holding set. the routes can be added directly to the metric registry


for (String routeName : routeAndMethodNames) {
ctx.getRateMetersByRouteAndMethod()
.put(routeName, metricRegistry.meter(name(Router.class, NAME_PREFIX + routeName)));
Contributor

change to .meter(namePrefix + routeName)... this includes my other comment about changing NAME_PREFIX to namePrefix
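
Taken together, the comments on this method (a `namePrefix` of `"routes."`, no temporary holding set, and no class name in the metric name) point toward a single registration loop. The following is a rough sketch under assumptions: `RouteMeterSetup` is hypothetical, a plain map from path to HTTP methods stands in for the route table, `LongAdder` stands in for `Meter`, and the returned map stands in for both the metric registry and the context's meter map.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: LongAdder stands in for Meter, and plain maps stand in for
// the metric registry and the route table.
final class RouteMeterSetup {
  private RouteMeterSetup() {}

  // routes maps a path to the HTTP methods it handles.
  static Map<String, LongAdder> metersByRoute(Map<String, List<String>> routes) {
    final String namePrefix = "routes.";
    Map<String, LongAdder> meters = new ConcurrentHashMap<>();
    routes.forEach(
        (path, methods) -> {
          for (String method : methods) {
            String routeName = method.toLowerCase(Locale.ROOT) + path.replace('/', '.');
            // Register as we go; no temporary set of names is needed.
            meters.putIfAbsent(namePrefix + routeName, new LongAdder());
          }
        });
    return meters;
  }
}
```

Because this needs the routes to exist, it has to run after route configuration, which is the constraint raised in the reply above about the `Router` constructor.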

@xjdr
Contributor

xjdr commented Feb 5, 2018

I'd like to see the regression numbers for this branch versus master to make sure we aren't adding latency with this.

Lauren DuCharme and others added 2 commits February 5, 2018 09:54
@lducharme
Contributor Author

Thanks for the comments, Andy. I moved the metrics config out of ServiceRateLimiter and into Router.listenAndServe, which is hopefully a better place for it. I also updated the variable names and metric names, and changed MetricsUtil to use Optional to handle possible null parameters.

Let me know if any of this still has issues. I am working on doing benchmarking to check on the latency.

}

return result;
return method + routeIdentifier;
Contributor

this return can happen inside the if, eliminating the need for the routeIdentifier variable. in addition it is a good practice to use inverted if conditions to allow the happy path logic to happen outside of nested conditionals. for instance...

if (route == null || route.toString() == null) {
  throw new IllegalArgumentException("Route cannot be null.");
}

return String.format("%s%s", method, route.toString().replace('/', '.'));

is better than the alternative you've presented. it keeps the happy path unencumbered by levels of nesting while reducing the overall cyclomatic complexity of the method.

Contributor Author

Makes sense, I updated this.

However, I also added a route identifier map to cache the route identifiers per one of Jesse's comments, so hopefully the way I refactored it still makes sense and reduces complexity.
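
For readers following along, one way a route identifier cache could be combined with the inverted-condition guard is sketched below. This is an assumption about the general shape, not the code that was merged: `RouteIdentifierCache` and its methods are hypothetical, and a `String` path stands in for `Route`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache; the merged change may organize this differently.
final class RouteIdentifierCache {
  private final Map<String, String> identifiersByPath = new ConcurrentHashMap<>();

  // Returns the dotted identifier for a path, computing it only on first use.
  String identifierFor(String routePath) {
    if (routePath == null) {
      throw new IllegalArgumentException("Route cannot be null.");
    }
    return identifiersByPath.computeIfAbsent(routePath, path -> path.replace('/', '.'));
  }

  // Number of distinct paths cached so far.
  int size() {
    return identifiersByPath.size();
  }
}
```

The guard clause keeps the happy path at one level of nesting, and `computeIfAbsent` means the `replace` call runs once per distinct path rather than once per request.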

@@ -365,6 +369,26 @@ public void listenAndServe(boolean serveAdmin, boolean scheduleHealthChecks) thr
channel = future.channel();
}

private void configEndpointLevelRateMeters(
Contributor

I'm not sure that these are accurately Rate meters in the same sense as we use the word Rate in Rate Limiting. Or at minimum they are using different Rates. That being said, we should use a different word here to avoid confusion. Some suggestions: Throughput, RequestCount, Requests...

for (XHttpMethod httpMethod : map.keySet()) {
String routeName = MetricsUtil.getMeterNameForRoute(route, httpMethod);
ctx.getMetersByRoute()
.put(routeName, metricRegistry.meter(name(Router.class, namePrefix + routeName)));
Contributor

this should use the .meter method that uses only namePrefix + routeName as the name without Router.class

Contributor

using the class makes the name overly verbose and noisy

Contributor Author

Sounds good, I'll remove it.

I originally included the class name because ServiceRateLimiter did it, here:
https://github.com/Nordstrom/xrpc/blob/master/src/main/java/com/nordstrom/xrpc/server/ServiceRateLimiter.java#L56
Does it also make sense to remove the class name there, too?

Contributor

yes i'd say making the names as short and sweet but unique is the way to go.

Contributor Author

great, removed them.

Lauren DuCharme added 2 commits February 5, 2018 11:44
…d with Rate meters. Removed class name from metrics registry.
@lducharme
Contributor Author

I used the run regression script to check out latency in master vs my branch. The max is definitely higher but the average is very close. (Note I had to increase the server and global rate limits for this to work):

master:

  4 threads and 15 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.44ms    2.89ms  62.62ms   92.39%
    Req/Sec   312.18    208.79     1.32k    80.91%
  Latency Distribution
     50%    1.70ms
     75%    3.37ms
     90%    4.97ms
     99%    9.21ms
  54451 requests in 1.00m, 3.74MB read
  Socket errors: connect 12, read 54451, write 0, timeout 0
Requests/sec:    905.97
Transfer/sec:     63.70KB

my branch:

  4 threads and 15 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.46ms    3.18ms  87.48ms   93.80%
    Req/Sec   312.49    200.63     1.18k    79.12%
  Latency Distribution
     50%    1.68ms
     75%    3.30ms
     90%    4.93ms
     99%    9.37ms
  56996 requests in 1.00m, 3.91MB read
  Socket errors: connect 12, read 56996, write 0, timeout 0
Requests/sec:    948.68
Transfer/sec:     66.70KB
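
To put the two runs side by side, the relative differences can be computed with a small helper. This is only a convenience sketch; `BenchmarkDelta` and `percentChange` are hypothetical names, and the inputs are the figures reported above.

```java
// Hypothetical helper for comparing the two benchmark runs above.
final class BenchmarkDelta {
  private BenchmarkDelta() {}

  // Percent change from the baseline (master) to the candidate (branch).
  static double percentChange(double baseline, double candidate) {
    return (candidate - baseline) / baseline * 100.0;
  }
}
```

Applied to the numbers above, `percentChange(2.44, 2.46)` gives roughly +0.8% for average latency, `percentChange(62.62, 87.48)` roughly +40% for max latency, and `percentChange(905.97, 948.68)` roughly +4.7% for requests per second. A single one-minute run per branch is a small sample, so the max-latency difference in particular may be noise.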

@@ -88,6 +89,7 @@ private void executeHandler(ChannelHandlerContext ctx, int streamId, Route route
.findFirst();

if (handlerMapOptional.isPresent()) {
httpRequestMethodName = handlerMapOptional.get().keySet().asList().get(0).name();
Contributor

On line 81 above, the method name is calculated.

You should be able to speed things up by calculating this once before line 69, and saving the method name in the local.

Contributor

(you'll speed things up by avoiding the calls here, but also by avoiding re-calculating the method in the inner loop)

// in
// Router.serveAdmin()); we do not track metrics for admin endpoints.
String meterName = MetricsUtil.getMeterNameForRoute(route, httpRequestMethodName);
if (xctx.getMetersByRoute().get(meterName) != null) {
Contributor

Cache result of this expression:

// too lazy to look up actual type
MeterType meter = xctx.getMetersByRoute().get(meterName);
if (meter != null) {
  meter.mark();
}


String routeIdentifier;
if (route != null && route.toString() != null) {
routeIdentifier = route.toString().replace('/', '.');
Contributor

This should probably be cached as a property of the route, or in a map somewhere.

// configured in
// Router.serveAdmin()); we do not track metrics for admin endpoints.
String meterName = MetricsUtil.getMeterNameForRoute(route, request.method().name());
if (xctx.getMetersByRoute().get(meterName) != null) {
Contributor

Cache expression result as above.

Lauren DuCharme added 2 commits February 5, 2018 16:15
…and meters when applicable, reduced redundant calls to calculate method name
@@ -108,6 +102,15 @@ private void executeHandler(ChannelHandlerContext ctx, int streamId, Route route
.handle(ctx.channel().attr(XrpcConstants.XRPC_REQUEST).get());
}

// Check here for the case of an admin endpoint (eg /metrics, /health, and all others configured
// in Router.serveAdmin()); we do not track metrics for admin endpoints.
Meter meter =
Contributor

I'd like to see the functional Optional pattern here like on line 113

Optional<Meter> routeMeter =
    Optional.ofNullable(xctx.getMetersByRoute()
          .get(MetricsUtil.getMeterNameForRoute(route, httpRequestMethod.name())));

routeMeter.ifPresent(Meter::mark);

Contributor Author

Sounds good, changed it in Http2Handler and UrlRouter.
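
A self-contained version of the `Optional` lookup-and-mark pattern discussed in this thread might look like the sketch below. It relies on assumptions: `OptionalMarkSketch` and its methods are hypothetical, and `LongAdder.increment()` stands in for `Meter.mark()` so the example runs on the JDK alone.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: LongAdder stands in for Meter, whose mark() records one event.
final class OptionalMarkSketch {
  private final Map<String, LongAdder> metersByRoute = new ConcurrentHashMap<>();

  // Registers a meter for the given name if one does not exist yet.
  void register(String meterName) {
    metersByRoute.putIfAbsent(meterName, new LongAdder());
  }

  // Marks the meter when one is registered; unregistered names (for example admin
  // endpoints) are skipped. Returns whether a meter was marked.
  boolean markIfTracked(String meterName) {
    Optional<LongAdder> routeMeter = Optional.ofNullable(metersByRoute.get(meterName));
    routeMeter.ifPresent(LongAdder::increment);
    return routeMeter.isPresent();
  }

  // Returns the number of marks recorded for the name, or 0 if it is not registered.
  long count(String meterName) {
    LongAdder meter = metersByRoute.get(meterName);
    return meter == null ? 0L : meter.sum();
  }
}
```

The map is looked up once per request and the result is reused, which also covers the earlier request to cache the expression instead of calling `get` twice.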

@xjdr
Contributor

xjdr commented Feb 7, 2018

Once you fix the merge conflicts, I am good to approve.

@lducharme
Contributor Author

Merge conflicts now fixed (and addressed one more comment that I missed before).

@andyday
Contributor

andyday commented Feb 7, 2018

👍 All of my remaining comments are non-blocking... The one about cyclomatic complexity is worth noting...

@andyday
Contributor

andyday commented Feb 7, 2018

you still need to fix the build issues. CorsTest is failing...

@lducharme
Contributor Author

Build issues fixed! Thanks all for the help

@lducharme lducharme merged commit e4fca46 into master Feb 7, 2018
@lducharme lducharme deleted the feat-endpoint-level-metrics branch February 7, 2018 19:19
4 participants