-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Periodic background reporter for "Server" metrics. #12
Conversation
In most cases the metrics being collected are not actually timings, but instead are counters. There are, however, some things that *are* actually "timers" which we need to address in a future commit. For now, we still submit the counters, as timers, to statsd to maintain the statsd aggregation facilities that are useful for per request stat tracking. (Tests are probably failing after this commit)
RackReporter reports only counters (as timers), which are useful in a per request context (e.g. how many minor collections are happening per request), using sampling strategies. However, data for the entire process, which previously was reported as Rack::Server should not be sampled. Instead, the PeriodicReporter reports to statsd as counts and gauges, providing a more accurate look at the server's characteristics over time. The intention of PeriodicReporter is to be used by a clock that measures, and then reports on an interval corresponding to the statsd flushInterval (by default something like 10 seconds). A clock is not included in this commit. Stay tuned.
Periodic collects metrics and reports them to reporter on an interval, rather than on a per request basis (such as the Rack class). To avoid large rearchitecting, it mimics that of the Rack Middleware by conforming to the environment passing model of the reporter and metrics storage, using *it's* thread local storage for that of persisted last values of counters to track changes. In addition, we add a RailTie, specifically for the enabling and starting the periodic reporter.
- ReporterTest ensures that report implementations do nothing - Move ReporterTest -> RackReporterTest - Add PeriodicReporterTest which tests sample rates implementation.
Provide a test helper for Statsd which yields after a call to `:block`. This provides a basis for mocking out the `:count` and `gauge` methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Thanks for the PR, @apg!
lib/trashed/reporter.rb
Outdated
# We emit the counters as timings to take advantage of statsd | ||
# aggregations, min, max, etc. | ||
send_to_statsd statsd, :timing, @counter_sample_rate, env[Trashed::COUNTERS], :'Rack.Request', @counter_dimensions.call(env) | ||
send_to_statsd statsd, :timing, @gauge_sample_rate, env[Trashed::GAUGES], :'Rack.Server', @gauge_dimensions.call(env) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of "timer" naming is to be clear that we're using StatsD's metric types and aggregations. It's understood that timings aren't literally elapsed time measurements but are any independent variable for which we'd like to gather summary statistics.
StatsD counters, on the other hand, are aggregated by summing all counts reported in an interval, so it'd confuse things further to mismatch our metric types and StatsD's.
Perhaps better docs are the way to go here? Link to https://github.com/etsy/statsd/blob/master/docs/metric_types.md for clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping @apg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bah! Github notifications.
Linking out to that probably makes sense. I'll update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's revert the timings
→ counters
mass name change as well. The Ruby naming directly reflects the StatsD metric types we're using.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only the per request metrics use :timing
statsd types though. The PeriodicReporter used :count
and :gauge
. I'd argue that the use of counters
and gauges
provides semantic meaning to the values that are being collected, even if they are reported to statsd in a different manner (that seems irrelevant to me, really).
lib/trashed/railclock.rb
Outdated
Periodic.new config.trashed_periodic | ||
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Railties are meant to handle multiple initializers and setup. We're good to merge this into the main Railtie.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only issue there is that I want some ability to only run the Periodic reporter, or the PerRequest reporter (or possibly both). Do you have a suggestion for how to manage that?
lib/trashed/reporter.rb
Outdated
end | ||
|
||
def report_statsd(env) | ||
raise "must implement interface" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise NotImplementedError
test/reporter_test.rb
Outdated
@@ -4,15 +4,32 @@ | |||
require 'stringio' | |||
|
|||
class ReporterTest < Minitest::Test | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✂️ prefer no extra space after class declaration
test/lib/trashed/test_helper.rb
Outdated
def batch | ||
yield @batcher | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's nest this within PeriodicReporterTest
where it's used.
Also periodic reporter can track number of threads. And it will be useful for Sidekiq server metrics |
@jeremy anything else required to finish this PR? |
@jeremy I'm not likely to pursue this in trashed, no. It seems like the goals of trashed are fairly different than what we're trying to do in barnes so our new path makes more sense at this stage in the game. That being said, @stereobooster, if @jeremy is still willing to accept, please feel free to push on with it. |
Cool. Best of luck, @apg! 🚢 |
First of all, I'm not typically a Rubyist, so all comments are welcome in order to get this into tip top shape. Secondly, here is my attempt at getting something landed that addresses my question in #11, in which @jeremy suggested he'd entertain a PR.
I've tried to keep the spirit of the original functionality, but have pulled out counters and gauges that were previously reported as
Rack.Server
and made them be reported periodically. This involves more setup, but is also, arguably, a better, more accurate approach. The per-request data is still being generated as timings emitted per request, like before, provided you set that up as well (the original directions suffice). This doesn't do aggregation of the per request metrics, which may, or may not, be the right thing to do here...