This repository has been archived by the owner on Mar 28, 2019. It is now read-only.

Sample successful delivery and connections stats at 10% #217

Open
wants to merge 4 commits into
base: dev
from

Conversation

@ghost ghost commented Feb 2, 2015

  • Add Metrics.IncrementByRate() and TimerByRate() for specifying the sampling rate.
  • Sample success stats at 10%. We can look at reducing the sampling rate for errors, too.
  • Fix a bug where IncrementByRate() would call StatsClient.Dec() with a negative value.
  • Replace TestMetrics with a StatsClient interface.
  • Short-circuit recording zero values for counters, timers, and gauge deltas.
  • Add tests.

Closes #177.
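
To make the counter changes above concrete, here is a minimal sketch of how a `Metrics` wrapper might forward deltas to a `StatsClient`. It is not the code from this branch: the `StatsClient` method signatures, the `recordingClient` test double, and the `client.connections` stat name are assumptions for illustration. It shows two behaviors from the list: zero deltas are dropped before reaching the client, and negative deltas reach `Dec()` as a positive magnitude.

```go
package main

import "fmt"

// StatsClient is a hypothetical stand-in for the interface this PR
// introduces; the real method set and signatures may differ.
type StatsClient interface {
	Inc(stat string, value int64, rate float32) error
	Dec(stat string, value int64, rate float32) error
}

// Metrics forwards counter updates to a StatsClient.
type Metrics struct {
	client StatsClient
}

// IncrementByRate records delta for stat at the given sample rate.
// Zero deltas are skipped, and negative deltas are passed to Dec as
// their absolute value.
func (m *Metrics) IncrementByRate(stat string, delta int64, rate float32) error {
	switch {
	case delta == 0:
		return nil // nothing to record
	case delta > 0:
		return m.client.Inc(stat, delta, rate)
	default:
		return m.client.Dec(stat, -delta, rate)
	}
}

// call captures one invocation seen by recordingClient.
type call struct {
	Op    string
	Stat  string
	Value int64
	Rate  float32
}

// recordingClient is an illustrative test double that remembers every
// call instead of sending anything over the network.
type recordingClient struct {
	calls []call
}

func (c *recordingClient) Inc(stat string, value int64, rate float32) error {
	c.calls = append(c.calls, call{"Inc", stat, value, rate})
	return nil
}

func (c *recordingClient) Dec(stat string, value int64, rate float32) error {
	c.calls = append(c.calls, call{"Dec", stat, value, rate})
	return nil
}

func main() {
	rc := &recordingClient{}
	m := &Metrics{client: rc}
	m.IncrementByRate("balancer.fetch.success", 1, 0.1)
	m.IncrementByRate("balancer.fetch.success", 0, 0.1) // skipped
	m.IncrementByRate("client.connections", -2, 0.1)    // Dec with value 2
	for _, c := range rc.calls {
		fmt.Printf("%s %s %d rate=%v\n", c.Op, c.Stat, c.Value, c.Rate)
	}
}
```

In this sketch the wrapper only passes the rate through; deciding whether a given sample is actually sent is left to the underlying client.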

@ghost ghost added this to the v1.6 milestone Feb 2, 2015
@ghost ghost force-pushed the feature/sample-rate branch from 93ae105 to 04ea63a Compare February 3, 2015 03:06
@ghost ghost force-pushed the feature/sample-rate branch from 04ea63a to a3cf377 Compare February 20, 2015 21:30
@ghost ghost force-pushed the feature/sample-rate branch 2 times, most recently from 8c85f6a to 603c84d Compare March 3, 2015 17:02
@@ -426,7 +426,7 @@ func (b *EtcdBalancer) Fetch() (peers *EtcdPeers, err error) {
 		b.metrics.Increment("balancer.fetch.error")
 		return nil, err
 	}
-	b.metrics.Increment("balancer.fetch.success")
+	b.metrics.IncrementByRate("balancer.fetch.success", 1, 0.1)
Author

Not sure if this needs a low sample rate: we're only polling etcd every 10 seconds by default, and updating every minute. Even with both intervals reduced to one second, that seems like a trickle compared to the storm of delivery metrics.

Kit Cambridge added 4 commits March 3, 2015 09:19
* Ensure `IncrementByRate()` passes the absolute value of the delta to
  `StatsClient.Dec()`.
* Replace `TestMetrics` with a `StatsClient` interface for `Metrics`.
* Lazily initialize `Metrics` snapshots.
* Remove remaining logged metrics.
* Add tests.

Closes #177.
`{IncrementBy, Timer}Rate()` and `GaugeDelta()` no longer call the
underlying `StatsClient` methods for zero values.
@ghost ghost force-pushed the feature/sample-rate branch from 603c84d to 74fdb69 Compare March 3, 2015 17:41
@bbangert
Member

bbangert commented Mar 3, 2015

Looks fine, except the sampling rate shouldn't be hard-coded; it should be a config value so we can set it when we deploy.
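
One way the suggestion above could look, as a rough sketch rather than anything from this branch: a config struct carries the rate, and an accessor falls back to the 10% used in this PR when the value is unset or out of range. `MetricsConfig`, `SuccessSampleRate`, and `EffectiveRate` are hypothetical names, and how the struct would be populated from the deployment config is left out.

```go
package main

import "fmt"

// defaultSampleRate mirrors the 10% rate hard-coded in this PR.
const defaultSampleRate = 0.1

// MetricsConfig is a hypothetical config section; the field name is
// illustrative and not taken from the project.
type MetricsConfig struct {
	// SuccessSampleRate is the fraction of success stats to record,
	// expected to be in the range (0, 1].
	SuccessSampleRate float32
}

// EffectiveRate returns the configured rate, or the default when the
// configured value is missing (zero) or outside (0, 1].
func (c MetricsConfig) EffectiveRate() float32 {
	if c.SuccessSampleRate <= 0 || c.SuccessSampleRate > 1 {
		return defaultSampleRate
	}
	return c.SuccessSampleRate
}

func main() {
	unset := MetricsConfig{}
	half := MetricsConfig{SuccessSampleRate: 0.5}
	fmt.Println(unset.EffectiveRate(), half.EffectiveRate())
}
```

A call site would then pass the accessor's result in place of the literal `0.1`.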

@ghost
Author

ghost commented Mar 3, 2015

@bbangert Sounds good. Do you think a single config value for all metrics should be sufficient? Does the sampling logic look sound?

@bbangert
Member

bbangert commented Mar 4, 2015

@kitcambridge I think a single value for the heaviest-usage metrics is fine. Sampling logic looks sound. I am curious how much faster the stats might be if they stopped doing all that inefficient string concatenation.

Development

Successfully merging this pull request may close these issues.

Add configurable metrics levels
1 participant