Add support for sampling metrics #182

ianks · 2023-05-04T03:25:18Z

This PR adds some initial support for sampling individual metrics. I made it an opt-in feature in Cargo so folks won't pull in the rand dependency by default.

Since I'm new to the codebase, I probably did some things sub-optimally. Let me know how I can improve it!

resolves #179

56quarters · 2023-05-04T13:54:54Z

Thanks! I'll try to take a first pass over this in a bit.

ianks · 2023-05-04T14:58:15Z

I messed around with a slight refactor of this PR which tries a slightly different approach by moving the sampling behavior to a new type. LMK which you prefer:

diff --git i/cadence/src/types.rs w/cadence/src/types.rs
index 3ececec..ee4c227 100644
--- i/cadence/src/types.rs
+++ w/cadence/src/types.rs
@@ -8,6 +8,7 @@
 // option. This file may not be copied, modified, or distributed
 // except according to those terms.
 
+use crate::builder::SampleRate;
 use crate::builder::{MetricFormatter, MetricValue};
 use std::error;
 use std::fmt;
@@ -19,6 +20,76 @@ use std::io;
 /// types of metrics as defined in the [Statsd spec](https://github.com/b/statsd_spec).
 pub trait Metric {
     fn as_metric_str(&self) -> &str;
+
+    fn is_sampled(&self) -> bool {
+        true
+    }
+}
+
+/// Marker trait for metrics that can be sampled.
+///
+/// > A float between 0 and 1, inclusive. Only works with COUNT, HISTOGRAM,
+/// > DISTRIBUTION, and TIMER metrics. The default is 1, which samples 100% of the
+/// > time.
+/// > - via [DataDog](https://docs.datadoghq.com/developers/dogstatsd/datagram_shell)
+pub trait Sampleable: Metric {
+    /// Returns a new metric that indicates this metric was sampled.
+    fn sampled(self, sample_rate: f32) -> Result<Sampled<Self>, MetricError>
+    where
+        Self: Sized,
+    {
+        Sampled::new(self, sample_rate)
+    }
+}
+
+impl Sampleable for Counter {}
+impl Sampleable for Timer {}
+impl Sampleable for Gauge {}
+impl Sampleable for Histogram {}
+
+/// Wraps a metric to indicate that it was sampled.
+pub struct Sampled<T: Sampleable> {
+    is_sampled: bool,
+    repr: String,
+    _marker: std::marker::PhantomData<T>,
+}
+
+impl<T: Sampleable> Sampled<T> {
+    pub fn new<S: TryInto<SampleRate>>(
+        metric: T,
+        sample_rate: S,
+    ) -> Result<Self, <S as std::convert::TryInto<SampleRate>>::Error> {
+        let sample_rate = sample_rate.try_into()?;
+        let mut repr = String::with_capacity(metric.as_metric_str().len() + sample_rate.kv_size());
+        repr.push_str(metric.as_metric_str());
+        repr.push('|');
+        repr.push_str(sample_rate.as_str());
+
+        #[cfg(feature = "sample-rate")]
+        use rand::Rng;
+
+        #[cfg(feature = "sample-rate")]
+        let is_sampled = rand::thread_rng().gen_bool(sample_rate.value() as f64);
+
+        #[cfg(not(feature = "sample-rate"))]
+        let is_sampled = true;
+
+        Ok(Sampled {
+            repr,
+            is_sampled,
+            _marker: std::marker::PhantomData,
+        })
+    }
+}
+
+impl<T: Sampleable> Metric for Sampled<T> {
+    fn as_metric_str(&self) -> &str {
+        &self.repr
+    }
+
+    fn is_sampled(&self) -> bool {
+        self.is_sampled
+    }
 }
 
 /// Counters are simple values incremented or decremented by a client.
@@ -307,7 +378,7 @@ pub type MetricResult<T> = Result<T, MetricError>;
 mod tests {
     #![allow(deprecated, deprecated_in_future)]
 
-    use super::{Counter, ErrorKind, Gauge, Histogram, Meter, Metric, MetricError, Set, Timer};
+    use super::{Counter, ErrorKind, Gauge, Histogram, Meter, Metric, MetricError, Sampleable, Set, Timer};
     use std::error::Error;
     use std::io;
 
@@ -421,4 +492,17 @@ mod tests {
         let our_err = MetricError::from((ErrorKind::InvalidInput, "Nope!"));
         assert!(our_err.source().is_none());
     }
+
+    #[test]
+    fn test_metrics_can_be_wrapped_as_sampled() {
+        let counter = Counter::new("my.app.", "test.counter", 4).sampled(1.0 / 3.0).unwrap();
+        let gauge = Gauge::new("my.app.", "test.gauge", 2).sampled(0.5).unwrap();
+        let histogram = Histogram::new("my.app.", "test.histogram", 45).sampled(0.5).unwrap();
+        let timer = Timer::new("my.app.", "test.timer", 34).sampled(0.5).unwrap();
+
+        assert_eq!("my.app.test.counter:4|c|@0.33333", counter.as_metric_str());
+        assert_eq!("my.app.test.timer:34|ms|@0.5", timer.as_metric_str());
+        assert_eq!("my.app.test.gauge:2|g|@0.5", gauge.as_metric_str());
+        assert_eq!("my.app.test.histogram:45|h|@0.5", histogram.as_metric_str());
+    }
 }

56quarters · 2023-05-04T15:12:16Z

Before I take a more thorough look, I've come around to unconditionally including the dependency on rand since it would allow us to get rid of all the conditional compilation. WDYT?

ianks · 2023-05-04T16:17:34Z

100% agree. Most folks will have it in their dep tree already anyway, I imagine.

56quarters

The public API for this roughly looks how I want, thanks! There are a few changes I'd like to see with how this is implemented that fall into a few categories:

- I'd like to propagate errors related to bad sample rates instead of panicking or ignoring them.
- ByteStr is a nice touch but I'd rather do something simpler/dumber and use float formatting for write!() with a fixed number of digits.
- Removing the feature flag as discussed and combining the no-op and rand based Samplers.

I think once the above changes are made, there might be further changes we could make to simplify this a bit. Thanks again for this feature!

56quarters · 2023-05-05T14:42:35Z

cadence/Cargo.toml

@@ -14,5 +14,13 @@ autobenches = false

 [dependencies]
 crossbeam-channel = "0.5.1"
+rand = { version = "0.8.5", optional = true }


As discussed, fine to make this required and remove the feature flag.

56quarters · 2023-05-05T14:43:12Z

cadence/src/builder.rs

I didn't realize it was possible to have a builder.rs and builder/. Can you move this to builder/mod.rs?

56quarters · 2023-05-05T14:43:37Z

cadence/src/builder.rs

 use crate::client::{MetricBackend, StatsdClient};
 use crate::types::{Metric, MetricError, MetricResult};
+#[cfg(feature = "sample-rate")]
+use sample_rate::SampleRate;


Please use full import paths everywhere: crate::builder::sample_rate

56quarters · 2023-05-05T14:45:42Z

cadence/src/builder.rs

+                self.sample_rate = Some(sr);
+                self.kv_size += sr.kv_size();
+            }
+            Err(e) => panic!("invalid sample rate for metric {}: {}", self.key, e),


I don't want Cadence to crash people's application since statsd metrics are by definition lossy (and so dropping something invalid or returning an error would be preferable). Instead, I think we should take a different approach to setting the sample rate. Described below on the MetricBuilder changes.

56quarters · 2023-05-05T14:47:27Z

cadence/src/builder.rs

+    #[cfg(feature = "sample-rate")]
+    fn write_sample_rate(&self, out: &mut String) {
+        if let Some(sample_rate) = self.sample_rate {
+            if sample_rate.is_applicable_to_metric(self.type_) {


I don't like the idea of silently ignoring sample rates for metrics where it doesn't make sense. This seems like the type of API guarantee that people would come to rely on even if they shouldn't. Instead, I'd rather return an error as described below.

56quarters · 2023-05-05T15:09:29Z

cadence/src/builder.rs

+    #[cfg(feature = "sample-rate")]
+    pub fn with_sample_rate(mut self, sample_rate: f32) -> Self {
+        if let BuilderRepr::Success(ref mut formatter, _) = self.repr {
+            formatter.with_sample_rate(sample_rate);


We should eagerly parse the sample rate here, validating that it's between 0 and 1, and return an error if it's not by changing BuilderRepr. Example below, sketching the idea (I haven't tested this and it doesn't need to look like this verbatim):

pub fn with_sample_rate(mut self, sample_rate: f32) -> Self {
    if let BuilderRepr::Success(ref mut formatter, type_) = self.repr {
        // note that we're passing type_ here and so can enforce that sampling
        // is only applied when it makes sense for the metric type.
        match SampleRate::parse(sample_rate, type_) {
            Ok(rate) => {
                formatter.with_sample_rate(rate);
            },
            Err(e) => {
                self.repr = BuilderRepr::Error(e);
            }
        }
    }
    self
}
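To make the validation step concrete, here is a minimal, self-contained sketch of what a `SampleRate::parse` along these lines might look like. `MetricType`, `SampleRateError`, and the `value` accessor are hypothetical stand-ins, not Cadence's actual types, and the set of sampleable types follows the DataDog note quoted earlier in the thread (counters, histograms, distributions, timers), which is an assumption about what the final API should accept.

```rust
/// Hypothetical stand-in for the metric type tracked by the builder.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MetricType {
    Counter,
    Timer,
    Gauge,
    Histogram,
    Distribution,
    Set,
}

/// Hypothetical error type; the real code would presumably map this to MetricError.
#[derive(Debug, PartialEq)]
pub enum SampleRateError {
    /// The rate was outside [0, 1] (or NaN).
    OutOfRange(f32),
    /// Sampling makes no sense for this metric type.
    NotSampleable(MetricType),
}

/// A sample rate that has already been validated.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SampleRate(f32);

impl SampleRate {
    /// Validates the rate eagerly so the builder can store an error instead of panicking.
    pub fn parse(rate: f32, type_: MetricType) -> Result<Self, SampleRateError> {
        // NaN also fails this check because every comparison with NaN is false.
        if !(0.0..=1.0).contains(&rate) {
            return Err(SampleRateError::OutOfRange(rate));
        }
        match type_ {
            MetricType::Counter
            | MetricType::Timer
            | MetricType::Histogram
            | MetricType::Distribution => Ok(SampleRate(rate)),
            other => Err(SampleRateError::NotSampleable(other)),
        }
    }

    /// Returns the validated rate.
    pub fn value(self) -> f32 {
        self.0
    }
}
```

With this shape, `SampleRate::parse(1.5, MetricType::Counter)` and `SampleRate::parse(0.5, MetricType::Gauge)` both return an `Err`, which the builder can carry along rather than crashing or silently dropping the rate.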

56quarters · 2023-05-05T15:19:50Z

cadence/src/builder.rs

+
+    /// Returns the sampler to use for this metric, based on features.
+    #[cfg(feature = "sample-rate")]
+    fn sampler(&self) -> Option<Sampler> {


Instead of returning an Option<Sampler> here could we use a no-op version instead at runtime?
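One way to read this suggestion, sketched under assumptions: the sampler always exists, and "no sampling" is just a variant that allows everything. The `Sampler` enum and `sampler_for` helper below are hypothetical, not the PR's code. The uniform draw is passed in as an argument so the sketch needs only the standard library and stays deterministic; in the PR it would presumably come from the rand crate.

```rust
/// Hypothetical sampler; the no-op case is a variant rather than a missing value.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Sampler {
    /// No-op: every metric is allowed through.
    Always,
    /// Allow a metric with the given probability in [0, 1].
    Rate(f64),
}

impl Sampler {
    /// Decides whether to send, given a uniform draw in [0, 1).
    /// The draw is injected here (an assumption made for this sketch) instead of
    /// being generated internally.
    pub fn allow(&self, draw: f64) -> bool {
        match *self {
            Sampler::Always => true,
            Sampler::Rate(p) => draw < p,
        }
    }
}

/// Builds a sampler from an optional rate so callers never handle an Option<Sampler>.
pub fn sampler_for(rate: Option<f64>) -> Sampler {
    match rate {
        Some(p) => Sampler::Rate(p),
        None => Sampler::Always,
    }
}
```

The call site then has a single code path: it asks the sampler whether to send, regardless of whether a rate was configured.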

56quarters · 2023-05-05T15:46:52Z

cadence/src/builder/byte_str.rs

+    len: usize,
+}
+
+impl<const N: usize> ByteStr<N> {


I appreciate the care taken to not cause any allocations but I'd be fine with doing something less involved to format the sample rate like write!(some_buf, "|@{:.6}", rate); instead of using this type.
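As a rough illustration of the simpler approach, the sketch below appends the sample rate with `write!` and a fixed six-digit precision. The helper name `push_sample_rate` is made up for this example. Note that fixed precision pads with zeros, so a rate of 0.5 is rendered as `0.500000` rather than `0.5`.

```rust
use std::fmt::Write;

/// Appends a sample-rate suffix such as "|@0.500000" to an already formatted metric.
/// `push_sample_rate` is a hypothetical helper name, not part of Cadence.
pub fn push_sample_rate(out: &mut String, rate: f32) {
    // Writing into a String cannot fail, so the Result is safe to ignore.
    let _ = write!(out, "|@{:.6}", rate);
}
```

Called on a buffer holding `my.app.test.timer:34|ms` with a rate of 0.5, this leaves `my.app.test.timer:34|ms|@0.500000` in the buffer.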

56quarters · 2023-05-05T15:59:20Z

cadence/src/builder.rs

+
+                match formatter.sampler() {
+                    Some(sampler) => {
+                        if let Some(sampled_metric) = sampler.sample(&metric) {


It doesn't seem like sampler.sample(m) actually needs the metric, it's effectively just returning a bool to allow the send or not.

if formatter.sampler().allow() {
    client.send_metric(&metric)?
}
Ok(())
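A self-contained sketch of that send path, under assumptions: the `Sink` trait and `Recorder` type are hypothetical stand-ins for the client, and the sampler's decision is passed in as a plain `bool` to reflect that it does not depend on the metric itself.

```rust
/// Hypothetical stand-in for the client's send path.
pub trait Sink {
    fn send_metric(&mut self, metric: &str) -> Result<(), String>;
}

/// Test double that records every metric it is asked to send.
#[derive(Default)]
pub struct Recorder {
    pub sent: Vec<String>,
}

impl Sink for Recorder {
    fn send_metric(&mut self, metric: &str) -> Result<(), String> {
        self.sent.push(metric.to_string());
        Ok(())
    }
}

/// Sends the metric only when the sampling decision allows it.
/// A metric that is sampled out is not treated as an error.
pub fn send_sampled<S: Sink>(sink: &mut S, allow: bool, metric: &str) -> Result<(), String> {
    if allow {
        sink.send_metric(metric)?;
    }
    Ok(())
}
```

Both outcomes return `Ok(())`; only a failure in the underlying send is propagated to the caller.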

56quarters · 2023-05-05T15:59:38Z

cadence/src/builder.rs

@@ -710,17 +798,6 @@ mod tests {
        assert_eq!(1, errors.load(Ordering::Acquire));
    }

-    #[test]


Was this removed on purpose?

56quarters · 2023-05-05T16:39:34Z

Unrelated to your changes, but make sure you're running tests and linting locally since it seems like I haven't correctly set CI to run for forks.

56quarters · 2023-05-05T21:03:33Z

> I messed around with a slight refactor of this PR which tries a slightly different approach by moving the sampling behavior to a new type. LMK which you prefer:

My preference would be for the original implementation, adding a .with_sample_rate() method to MetricBuilder. It feels more natural to me that you can only add sampling when you're "building" a metric to send, rather than applying a sample after a Metric (Counter, Gauge, etc) has already been constructed.

ianks added 2 commits May 3, 2023 16:53

Add sample rate to builder

a5975e7

Initial implementation of sample rates

90d65c9

56quarters requested changes May 5, 2023
