-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Prometheus metrics for the internal API #236
Conversation
end | ||
|
||
def internal_api_req_type(path) | ||
(_regex, type) = INTERNAL_API_PATHS.find { |(regex, _)| regex.match path } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very unfortunate. With this we need to match against regexes a path which Sinatra already matched for us, so double work, but this time much more expensive than matchings strings.
We could use a Custom Route Matcher although that seems a bit too much for this case (and it's not like it's greatly documented), or in a more practical way we could use request.env['prometheus.iapi.req_type'] = 'alerts'
in the relevant Internal API endpoints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your solution would be more efficient, that's true. However, it implies adding a similar line to the one you pasted in all the files that define the endpoints. Also, this is part of an optional feature, and it only affects the internal API, not the main one. For those reasons, I'd rather merge it as it is, and only change this if it proves to be a problem when we run our benchmarks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
# Most requests will be under 100ms, so use a higher granularity from there | ||
TIME_BUCKETS = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.25, 0.5, 0.75, 1] | ||
private_constant :TIME_BUCKETS | ||
|
||
class << self | ||
ERRORS_4XX_TO_TRACK = Set[403, 404, 409].freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I know this is not part of this PR (it's from #174 and I overlooked this), but this is a case for an ordered (by frequency) Array
rather than a Set
- rationale is it performs better for sets in which the 1st and 2nd elements are queried with high frequency compared to the rest - I just unearthed this gist and enabled perf
and disabled accumulated
for numbers.
Maybe a case for a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was an interesting investigation 😸
We can take a look in a different PR. Honestly, I don't think it's too important and I'm not sure we can apply your results to this specific case without doing some measurements first. First of all, I'm not sure about the distribution of hits in the set. I think it depends a lot on the use case. Also, this is the kind of thing that can change between Ruby versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[...] apply your results to this specific case without doing some measurements first.
That's easy to check with the gist and the frequencies.
Also, this is the kind of thing that can change between Ruby versions.
It hasn't changed from 2.1 up to 2.7, and it fundamentally should not change because Array
and Set
should have basic guarantees about their structure. You can read the details [private link] for some fun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this for another PR if you are willing to take it :)
This PR adds prometheus metrics for the internal API. The metrics exposed are the request type (services, applications, metrics, etc.) and the response times. It's similar to what we did for the auth and report endpoints in #174