Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API changes between 0.6 and 0.7, proposal for more improvements #36

Closed
mixalturek opened this issue Jan 16, 2019 · 6 comments
Closed

API changes between 0.6 and 0.7, proposal for more improvements #36

mixalturek opened this issue Jan 16, 2019 · 6 comments

Comments

@mixalturek
Copy link
Contributor

I'm trying to update Dipstick dependency to 0.7.1 in our project. There are a lot of great API improvements, but I like some parts of 0.6 more. Migration guide with few items would help a lot during the switch.

Nice .with_name() has been renamed to strange .add_prefix(). .with_name() was clear, it extended an existing name by a new subname. Current .add_prefix() is quite hard to use. Does it prefix the existing name or adds a suffix? I don't understand this change, it looks like the implementation detail leaked to the API.

AppMetrics was the "core" object of the library in 0.6. It's now called AtomicBucket in our use case. What about to add a type alias for it? We are using Monitor, Metrics or something other would be also good.

pub(crate) type Monitor = AppMetrics<Arc<Scoreboard>>; // 0.6
pub(crate) type Monitor = AtomicBucket; // 0.7

There is no clean "entry point" for the library that is otherwise very good. There are many of them in the examples (listed below), which one should be preferred? It's extremely hard for newcomers. Finding AtomicBucket was difficult even for me who was already using 0.6 for some time. I know it's in the example at crates.io but who would expect AtomicBucket?! What about to introduce a builder or helper struct called Dipstick and start all library uses with it?

  • Stream::write_to()
  • AtomicBucket::new()
  • MultiInputScope::new()
  • Graphite::send_to()
  • MultiInput::input()
  • MultiOutput::output()
  • Prometheus::send_json_to()
  • dipstick::Log::log_to().input()
  • Statsd::send_to()
  • Stream::to_stderr()
@fralalonde
Copy link
Owner

Thanks for your input. I am glad you address these points, as I've struggled with naming for quite a while and never seemed to be able to come up with a truly consistent way of doing things. My goal was to come up with fluent, meaningful method names for common use cases, hence the send_to and such.

The with_name / add_prefix in particular was a hard call. I wanted to convey the idea that the namespace gets appended to rather than entirely replaced. Also, the "prefix" is a prefix to the eventual metrics, but the added prefix is itself appended to the existing prefixes actuallyt making it a suffix of the existing namespace. Finally, to be consistent, the with_* convention would have to be used with other properties, such as with_cache, with_buffer, etc.

Some structs were not meant to be instantiated directly:

  • Multi* should be created using some form of and_then on a first output
  • Proxy would usually be instantiated from within the metrics!() macro
  • Cache* is just a decorator

Following the previous points, AtomicBucket is a special case and might be represented better with AtomicBucket::aggregate() instead of new(). It is now called AtomicBucket because there could be a non-Atomic version of it in the future, and I thought Bucket was better than Aggregator which is long-ish and sounds like the name of a basement-dwelling death metal band.

@mixalturek
Copy link
Contributor Author

Regarding the with_name and add_prefix... What about just named()? We are using it in our Java/Scala library and it's nicely accepted.

https://github.com/avast/metrics#naming-of-monitors

@mixalturek
Copy link
Contributor Author

I have just finished the update to 0.7, our setup method is below. The rest of the code operates with add_prefix/counters/timers/... - nothing special. Our application writes to StatsD but it aggregates all metrics internally and pushes them every minute. StatsD is close to be only 1:1 transport channel.

The issues with API that I see from our use case point of view:

  • I didn't expect that AtomicBucket is the "core" object of the library.
  • It's impossible to use ? operator for errors propagation, Broken Result type, missing Send/Sync in v0.7.1 #35.
  • I didn't expect the cached() method to be on the StatsD object, I was searching it in AtomicBucket` for a long time and I was thinking this feature was fully removed for a while. I'm still quite uncertain what exactly it does. It's connection with StatsD object is just against my mental model how it works (no deep looking to the code).
  • add_prefix() uses fluent API, but set_stats() and set_drain() do not, I'm unsure why such difference.
pub use dipstick::{AtomicBucket as Monitor, Counter, Gauge, InputScope, Prefixed, Timer};
use dipstick::{
    CachedOutput, CancelHandle, Flush, InputKind, MetricName, MetricValue, ScheduleFlush,
    ScoreType, Statsd,
};

    fn statsd_metrics(config: &MetricsConfig) -> Result<Self, Error> {
        let statsd = Statsd::send_to((config.statsd_host.as_ref(), config.statsd_port))
            // TODO: https://github.com/fralalonde/dipstick/issues/35
            .map_err(|e| err_msg(format!("Creation of StatsD connector failed: {}", e)))?
            .cached(100);

        let monitor = Monitor::new().add_prefix(config.prefix.clone());
        monitor.set_stats(Self::selected_stats);
        monitor.set_drain(statsd);
        let cancel_handle = Some(monitor.flush_every(config.flush_period));

        Ok(Metrics {
            monitor,
            cancel_handle,
        })
    }

@fralalonde
Copy link
Owner

cached() does not exist on AtomicBucket because it is not required. To track scores, the Bucket internally maintains a list of name-deduplicated metrics, which is what the Cache decorator does for non-persistent outputs. For example, a Statsd output does not remember that you asked it for a Counter metrics named "bananas" and would rebuild a new Counter everytime you asked for it.

Also, the cache is only useful if the metrics names are dynamically generated at runtime to prevent any cost of re-creating the metrics of the same name. If using static metrics, either through the metrics!() macro or programatically, caching will only add a slight overhead but provide no benefit.

@mixalturek
Copy link
Contributor Author

We use both static-programmatic metrics wherever possible - counters of requests, errors, latency, etc. - and the dynamic ones for e.g. HTTP status codes in responses.

We don't use fully static metrics!(), they work only for the simplest cases. Separately monitored multiple instances of the same struct are simply impossible with it. Imagine for example multiple HTTP handlers (/endpoint/a, /envpoint/b, ...) and generic MonitoredHandler wrapper that tracks number of requests, in/out bytes, latency, status codes, etc.

@fralalonde
Copy link
Owner

Closing this as the new names have been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants