Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Periodic background reporter for "Server" metrics. #12

Closed
wants to merge 13 commits into from

Conversation

apg
Copy link

@apg apg commented Aug 14, 2017

First of all, I'm not typically a Rubyist, so all comments are welcome in order to get this into tip top shape. Secondly, here is my attempt at getting something landed that addresses my question in #11, in which @jeremy suggested he'd entertain a PR.

I've tried to keep the spirit of the original functionality, but have pulled out counters and gauges that were previously reported as Rack.Server and made them be reported periodically. This involves more setup, but is also, arguably, a better, more accurate approach. The per-request data is still being generated as timings emitted per request, like before, provided you set that up as well (the original directions suffice). This doesn't do aggregation of the per request metrics, which may, or may not, be the right thing to do here...

In most cases the metrics being collected are not actually timings,
but instead are counters. There are, however, some things that *are*
actually "timers" which we need to address in a future commit.

For now, we still submit the counters, as timers, to statsd to
maintain the statsd aggregation facilities that are useful for per
request stat tracking.

(Tests are probably failing after this commit)
RackReporter reports only counters (as timers), which are useful in a
per request context (e.g. how many minor collections are happening per
request), using sampling strategies.

However, data for the entire process, which previously was reported as
Rack::Server should not be sampled. Instead, the PeriodicReporter
reports to statsd as counts and gauges, providing a more accurate look
at the server's characteristics over time.

The intention of PeriodicReporter is to be used by a clock that
measures, and then reports on an interval corresponding to the statsd
flushInterval (by default something like 10 seconds). A clock is
not included in this commit. Stay tuned.
Periodic collects metrics and reports them to reporter on an interval,
rather than on a per request basis (such as the Rack class). To avoid
large rearchitecting, it mimics that of the Rack Middleware by
conforming to the environment passing model of the reporter and
metrics storage, using *it's* thread local storage for that of
persisted last values of counters to track changes.

In addition, we add a RailTie, specifically for the enabling and
starting the periodic reporter.
- ReporterTest ensures that report implementations do nothing
- Move ReporterTest -> RackReporterTest
- Add PeriodicReporterTest which tests sample rates implementation.
Provide a test helper for Statsd which yields after a call to
`:block`. This provides a basis for mocking out the `:count` and
`gauge` methods.
Copy link
Member

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Thanks for the PR, @apg!

# We emit the counters as timings to take advantage of statsd
# aggregations, min, max, etc.
send_to_statsd statsd, :timing, @counter_sample_rate, env[Trashed::COUNTERS], :'Rack.Request', @counter_dimensions.call(env)
send_to_statsd statsd, :timing, @gauge_sample_rate, env[Trashed::GAUGES], :'Rack.Server', @gauge_dimensions.call(env)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of "timer" naming is to be clear that we're using StatsD's metric types and aggregations. It's understood that timings aren't literally elapsed time measurements but are any independent variable for which we'd like to gather summary statistics.

StatsD counters, on the other hand, are aggregated by summing all counts reported in an interval, so it'd confuse things further to mismatch our metric types and StatsD's.

Perhaps better docs are the way to go here? Link to https://github.com/etsy/statsd/blob/master/docs/metric_types.md for clarity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @apg

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bah! Github notifications.

Linking out to that probably makes sense. I'll update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's revert the timingscounters mass name change as well. The Ruby naming directly reflects the StatsD metric types we're using.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the per request metrics use :timing statsd types though. The PeriodicReporter used :count and :gauge. I'd argue that the use of counters and gauges provides semantic meaning to the values that are being collected, even if they are reported to statsd in a different manner (that seems irrelevant to me, really).

Periodic.new config.trashed_periodic
end
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Railties are meant to handle multiple initializers and setup. We're good to merge this into the main Railtie.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only issue there is that I want some ability to only run the Periodic reporter, or the PerRequest reporter (or possibly both). Do you have a suggestion for how to manage that?

end

def report_statsd(env)
raise "must implement interface"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raise NotImplementedError

@@ -4,15 +4,32 @@
require 'stringio'

class ReporterTest < Minitest::Test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✂️ prefer no extra space after class declaration

def batch
yield @batcher
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's nest this within PeriodicReporterTest where it's used.

@stereobooster
Copy link

stereobooster commented Sep 8, 2017

Also periodic reporter can track number of threads. And it will be useful for Sidekiq server metrics

@stereobooster
Copy link

@jeremy anything else required to finish this PR?

@jeremy
Copy link
Member

jeremy commented Oct 20, 2017

@jeremy
Copy link
Member

jeremy commented Oct 24, 2017

@apg Are you interested in pursuing this, given that you've forked (#18)? (I'm assuming that you parked it and decided to fork at the naming change.)

@apg
Copy link
Author

apg commented Oct 24, 2017

@jeremy I'm not likely to pursue this in trashed, no. It seems like the goals of trashed are fairly different than what we're trying to do in barnes so our new path makes more sense at this stage in the game.

That being said, @stereobooster, if @jeremy is still willing to accept, please feel free to push on with it.

@jeremy
Copy link
Member

jeremy commented Oct 24, 2017

Cool. Best of luck, @apg! 🚢

@jeremy jeremy closed this Oct 24, 2017
@edmorley edmorley deleted the background-reporter branch September 2, 2021 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants