Refactor report #21

Open
wants to merge 31 commits into
from

Conversation

Projects
None yet
4 participants
Contributor

lmarburger commented Aug 7, 2012

Feeling things out for now. @eric, do you have any thoughts on adding #report to meters? It's spiked on Counter and Meter so far.

Contributor

lmarburger commented Aug 8, 2012

The regression test added to librato_metrics_reporter_test.rb will only pass on Ruby 1.9. It's only a temporary measure which will be replaced by a better test once complete.

Contributor

lmarburger commented Aug 8, 2012

@eric Have any opinions on the isolated LibratoMetrics tests? They're only separate for the time being. Some are really messy but I think that's a code smell coming from the reporter doing too much.

@eric eric and 1 other commented on an outdated diff Aug 8, 2012

lib/metriks/histogram.rb
@@ -108,5 +108,17 @@ def update_variance(value)
new_values
end
end
+
+ def each(&block)
+ report_snapshot = snapshot
+ { 'count' => count,
+ 'min' => min,
+ 'max' => max,
+ 'mean' => mean,
+ 'stddev' => stddev,
+ 'median' => report_snapshot.median,
+ '95th_percentile' => report_snapshot.get_95th_percentile
+ }.each(&block)
@eric

eric Aug 8, 2012

Owner

I realized this code doesn't even need a hash. Would either of these be better than using a hash here?

  report_snapshot = snapshot
  block.call 'count', count
  block.call 'min', min
  block.call 'max', max

or

  report_snapshot = snapshot
  block['count', count]
  block['min', min]
  block['max', max]
@lmarburger

lmarburger Aug 8, 2012

Contributor

Great idea. I like this a lot better than the unnecessary hash.

@lmarburger

lmarburger Aug 8, 2012

Contributor

This would also make it easier for Metrics to be composed of other Metrics. The parent would call #each on its children passing block instead of merging all responses into a single hash.

On that note, is there a reason why things like Histogram and Meter keep their own counts instead of delegating to an internal Counter?

@lmarburger

lmarburger Aug 8, 2012

Contributor

We're not doing anything special with block. Would this be cleaner?

yield 'count', count
yield 'min', min
yield 'max', max
@lmarburger

lmarburger Aug 8, 2012

Contributor

Changed in b349e50

@eric

eric Aug 8, 2012

Owner

Works for me.

@eric eric and 1 other commented on an outdated diff Aug 8, 2012

lib/metriks/reporter/librato_metrics.rb
@@ -6,15 +7,25 @@ class LibratoMetrics
attr_accessor :prefix, :source
def initialize(email, token, options = {})
- @email = email
- @token = token
-
- @prefix = options[:prefix]
- @source = options[:source]
+ options = default_options.merge options
@eric

eric Aug 8, 2012

Owner

I think I prefer the way the defaulting was happening before.

It allows for things like:

config = YAML.load_file('config.yml')
reporter = Reporter.new(whatever, :interval => config['interval'])

to work.

@lmarburger

lmarburger Aug 8, 2012

Contributor

Is it common for config files to set a value to nil instead of deleting or commenting the unused key?

@eric

eric Aug 8, 2012

Owner

well, in this case, the config file wouldn't have the nil, it is just the caller is passing the option in a way that would result in a nil.

I don't know how often such code is written, but I think I would prefer the way it was working, that just lets that work.

@lmarburger

lmarburger Aug 9, 2012

Contributor

That makes total sense. Fixed in 7b32008

@eric eric and 1 other commented on an outdated diff Aug 8, 2012

lib/metriks/reporter/librato_metrics.rb
- unless gauges.empty?
- submit(form_data(gauges.flatten))
+ metric.each do |name, value|
+ time = @time_tracker.now_floored
@eric

eric Aug 8, 2012

Owner

This assignment could be outside of the loop.

@lmarburger

lmarburger Aug 8, 2012

Contributor

Fixed in 616efdf

@eric eric and 1 other commented on an outdated diff Aug 8, 2012

lib/metriks/reporter/librato_metrics.rb
- unless gauges.empty?
- submit(form_data(gauges.flatten))
+ metric.each do |name, value|
+ time = @time_tracker.now_floored
+ metrics << {
+ :type => name.to_s == "count" ? "counter" : "gauge",
@eric

eric Aug 8, 2012

Owner

Having this check (for "count") bothers me on some level, but I can't think of a big reason why it's actually a problem.

@lmarburger

lmarburger Aug 8, 2012

Contributor

Me too. There's still more refactoring to be done in #prepare_all_metrics.

@eric eric and 1 other commented on an outdated diff Aug 8, 2012

lib/metriks/reporter/librato_metrics.rb
- prepare_metric name, metric, [
- :count, :one_minute_rate, :five_minute_rate,
- :fifteen_minute_rate, :mean_rate,
- :min, :max, :mean, :stddev
- ], [
- :median, :get_95th_percentile
- ]
- when Metriks::Histogram
- prepare_metric name, metric, [
- :count, :min, :max, :mean, :stddev
- ], [
- :median, :get_95th_percentile
- ]
- end
- end
+ metrics = filter(prepare_all_metrics)
@eric

eric Aug 8, 2012

Owner

It's too bad we have to do this filtering after we generate the hashes of all of the values, but I can't think of a reasonable way to do it otherwise.

@lmarburger

lmarburger Aug 8, 2012

Contributor

Is this a performance issue? I'm sure we could cook something up.

@eric

eric Aug 8, 2012

Owner

I don't think it matters enough. We call it once per minute, usually.

Owner

eric commented Aug 8, 2012

I think this is a great improvement.

Should we make sure we include more of the snapshot in the list, too?

Contributor

lmarburger commented Aug 8, 2012

Thanks for the feedback! Glad you approve of the direction. I wasn't sure how you'd take to all the mocking but I love testing against an abstraction instead of a concrete object. For example, I like that the :only/:except tests pass in a stub that responds to #=== and not separate tests for a string, regex, etc. I'm still not entirely comfortable with mocha but it's growing on me.

What do you mean by including more of the snapshot in the list?

@lmarburger lmarburger and 1 other commented on an outdated diff Aug 9, 2012

test/librato_metrics_reporter_isolated_test.rb
+ data.has_key?('gauges[1][name]') &&
+ !data.has_key?('gauges[2][name]') &&
+ data['gauges[0][name]'] == 'registry.one' &&
+ data['gauges[1][name]'] == 'registry.three'
+ end
+ reporter.write
+ end
+
+ def test_only_matches_using_threequals_operator
+ registry = stub_iterator([ 'registry', stub_iterator([ 'one', 1.1 ])],
+ [ 'registry', stub_iterator([ 'two', 2.2 ],
+ [ 'three', 3.3 ]) ])
+ matcher = stub
+ matcher.expects(:===).with('registry.one')
+ matcher.expects(:===).with('registry.two')
+ matcher.expects(:===).with('registry.three')
@lmarburger

lmarburger Aug 9, 2012

Contributor

What do you think of this compromise, @eric? :only and :except have two higher level tests using string matchers for clarity and a third expectation test. The later defines the protocol for the matchers and asserts the report is sending the right messages. If the implementation changes from sending #=== to sending another message, the expectation test will fail. That's by design because it means that change may break clients coded against the original protocol.

@eric

eric Aug 9, 2012

Owner

Is there anything that is better by doing this vs just passing in strings directly?

@lmarburger

lmarburger Aug 10, 2012

Contributor

The difference is writing an expectation to use #=== means it can't be changed to something like this without failing the mock test:

if matcher.is_a? Regexp
  matcher =~ gauge[:name]
else
  matcher == gauge[:name]
end

@eric eric and 1 other commented on an outdated diff Aug 9, 2012

test/librato_metrics_reporter_isolated_test.rb
+ new('user', 'password', :registry => registry)
+ reporter.write
+ end
+
+ def test_uses_default_registry
+ registry = stub
+ registry.expects(:each)
+ Metriks::Registry.expects(:default).returns(registry)
+ Metriks::Reporter::LibratoMetrics.new('user', 'password').write
+ end
+
+ # TODO: These tests don't really test the interval. Tie them in with #start
+ # tests when written.
+ def test_specify_interval
+ Metriks::TimeTracker.expects(:new).with(42)
+ Metriks::Reporter::LibratoMetrics.new('user', 'password', :interval => 42)
@eric

eric Aug 9, 2012

Owner

I don't think it's useful to test the internal mechanics of how the reporter decides when to send stuff.

@lmarburger

lmarburger Aug 9, 2012

Contributor

I agree. That TODO is a placeholder to remind me that I hate that test.

This pull request fails (merged d8078cd into eb336ed).

Contributor

lmarburger commented Aug 13, 2012

Here's a new reporter for you to scrutinize.

This pull request fails (merged c1aa55a into eb336ed).

This pull request fails (merged 069c4e6 into eb336ed).

This pull request fails (merged 461ca3d into eb336ed).

This pull request fails (merged 98bd87c into eb336ed).

This pull request fails (merged 9b2f96f into eb336ed).

Contributor

lmarburger commented Aug 15, 2012

@eric OK most of the big stuff is done. I'd like to get better full-blown integration tests and then combine them all. There's still duplication and interface inconsistencies that could be addressed at some point.

Owner

eric commented Dec 7, 2012

The more I've thought about this, the more I've realized that it could be helpful to use some sort of "adapter" classes that could take a reporter and yield, for instance, a flat namespace of metric names to values.

Another example adapter could report rate by storing and recording the difference in Meter#count instead of reporting the rate_per_second.

Contributor

lukaszx0 commented Jul 29, 2014

Omg, started doing exactly the same refactoring as you are doing here without checking open PRs first and notice yours now, when I wanted to open mine. Bummer.

Let me know if I can help in any way with this 👍

@lukaszx0 lukaszx0 commented on the diff Jul 29, 2014

lib/metriks/reporter.rb
@@ -0,0 +1,44 @@
+require 'metriks/registry'
+require 'metriks/time_tracker'
+
+module Metriks
+ class Reporter
+ attr_reader :registry
+
+ def initialize(options = {})
+ @time_tracker = Metriks::TimeTracker.new(options[:interval] || 60)
+ @registry = options[:registry] || Metriks::Registry.default
+ @on_error = options[:on_error] || proc { |ex| }
+ end
+
+ def write
+ # Override in subclass
@lukaszx0

lukaszx0 Jul 29, 2014

Contributor

raise NotImplementedError maybe?

@lukaszx0 lukaszx0 commented on the diff Jul 29, 2014

lib/metriks/reporter/graphite.rb
-module Metriks::Reporter
- class Graphite
- attr_reader :host, :port
+class Metriks::Reporter
+ class Graphite < Metriks::Reporter
+ attr_accessor :host, :port, :prefix
def initialize(host, port, options = {})
@lukaszx0

lukaszx0 Jul 29, 2014

Contributor

While we're doing major refactoring here and metriks is still 0.9.x which according to semver means that Anything may change at any time., how about unifying all reporter's constructors and accepting only one, options hash, argument?

@lukaszx0 lukaszx0 commented on the diff Jul 29, 2014

lib/metriks/reporter.rb
+ def write
+ # Override in subclass
+ end
+
+ def start
+ @thread ||= Thread.new do
+ loop do
+ @time_tracker.sleep
+
+ Thread.new do
+ begin
+ write
+ rescue Exception => ex
+ @on_error[ex] rescue nil
+ end
+ end
@lukaszx0

lukaszx0 Jul 29, 2014

Contributor

When I was looking into this I was thinking about extracting this subthread to method called flush so that any Reporter would be able to send metrics at any time if there's such need.

Contributor

lukaszx0 commented Jul 29, 2014

@lmarburger This looks pretty great! Left few suggestions in comments.

Owner

eric commented Jul 29, 2014

I've been re-thinking the idea of having the reporters in this gem at all.

Here's an example of the Librato Metrics reporter that has been extracted:

Owner

eric commented Jul 29, 2014

I'm not convinced there's a lot of common logic that is worth encapsulating into a reporter base class or helper class so far.

It's hard to standardize a lot of this, because the correct way to translate the metrics to be sent seems to be fairly different from one destination to the next.

Contributor

lukaszx0 commented Jul 29, 2014

Actual sate yells for refactoring and having base reporter class. I've just implemented custom reporter and 90% of it is copypasted from librato metrics. The only difference are hash keys in prepare_metrics and custom send using different client. Whole write remained the same, not to mention start, stop and restart...

It's true and I agree that many different reporters/destinations will format data differently but I don't think it'll hurt them if the base class will provide some convenience methods which they can use if they want, if not - the base won't add any burden. In most cases though (graphite, librato, riemann) it'll be quite the same and the only difference will be at the end of the process which is actually sending metrics.

Regarding extracting reporters from this gem. Seems like a premature optimization and over kill for me 🙊 Those are only one-class implementations and we have only couple of them right now, the most demanded I'd say. Additonaly, I almost always use 3 of reporters at the time. The main one (graphite, librato, custom one) and proctitle on production and logger for development.

I'd personally love to see this refactor finished and merged to master. I also offer my help in finishing it up 😸

Owner

eric commented Jul 29, 2014

The problem with having all of the reporters in this gem is that if a reporter has a dependency (like the library generally used to submit metrics to that service), it means that everyone who uses metriks will be pulling in that dependency, even if they don't use that reporter.

Over time, that becomes a huge pain (especially if a given reporter needs a gem that has a compiled dependency).

Owner

eric commented Jul 29, 2014

So far, I've found with the reporters that I have written, that many of them actually need the metric type to be able to know how to act on it.

Having a simple Metriks::Service class that just deals with starting and stopping of threads seems fine to me if people think that's worth extracting.

Contributor

lukaszx0 commented Jul 29, 2014

The problem with having all of the reporters in this gem is that if a reporter has a dependency (like the library generally used to submit metrics to that service), it means that everyone who uses metriks will be pulling in that dependency, even if they don't use that reporter.

There's a simple pattern used in many gems to overcome this:

begin
  require 'librato'
resuce LoadError
  raise "You need to include `librato` gem in your gemfile if you want to use `Metriks::Reporter::Librato`"
end
Owner

eric commented Jul 29, 2014

I agree that that can solve the problem, but it feels like a pretty dirty solution to me. You now have to manually start explaining versioning dependencies and checking for them, etc. Yuck.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment