This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

Monitoring Background Workers

Michael Beale edited this page Apr 30, 2016 · 19 revisions

Both librato-rails and librato-rack are designed to operate inside a long-running process where they can queue up measurements and then periodically submit them in the background.

This model doesn't work well with background workers that run intermittently, exist for variable lengths of time, or fork. If you are using delayed_job, resque, cron-driven or otherwise scheduled rake tasks, or queue_classic (forking), you should disable librato-rails or librato-rack if they are loaded and use an alternate strategy for reporting metrics.

Disabling librato-rails or librato-rack

With librato-rails v0.10+ or librato-rack v0.4+, the reporter does not start until the process receives its first web request. Since worker processes never service web requests, you can safely include these libraries without manually disabling either. If you are using an earlier version, please upgrade.

You may still have access to the collector methods from these libraries (Librato.increment, etc.), but you shouldn't use them in workers: with no reporter running to submit them, metrics gathered that way will be lost.

Strategies for instrumenting your workers

There are a few options to consider when instrumenting your workers:

Use librato-metrics

librato-metrics is a lower-level gem that gives you direct access to the Librato API. You can queue up simple metrics, aggregate measurements for frequently occurring operations locally before sending, and more.

If your workers take a while to run, you can submit periodically while doing work. If they are short-lived, you can use an at_exit handler to ensure metrics are submitted right before the process exits.

Pros: Works anywhere, pure Ruby, no extra process to run, lots of control
Cons: May add a small amount of time to worker exit as metrics are submitted
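
As a rough illustration of the periodic-submit and at_exit patterns described above, here is a minimal sketch. WorkerMetrics is a hypothetical helper (not part of any gem), and the environment variable names in the comments are assumptions. It relies only on a queue object that responds to add and submit, which is the interface Librato::Metrics::Queue provides.

```ruby
# Sketch only: WorkerMetrics is a hypothetical helper, not part of librato-metrics.
# `queue` is any object responding to `add(hash)` and `submit`; in a real worker
# this would be a Librato::Metrics::Queue.
class WorkerMetrics
  def initialize(queue, flush_every: 60,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @queue = queue
    @flush_every = flush_every   # seconds between periodic submissions
    @clock = clock               # injectable so the timing logic can be tested
    @last_flush = clock.call
    @pending = 0
  end

  # Queue one measurement, then submit if the flush interval has elapsed.
  def measure(name, value)
    @queue.add(name.to_sym => value)
    @pending += 1
    flush if @clock.call - @last_flush >= @flush_every
  end

  # Submit anything queued so far; does nothing when the queue is empty.
  def flush
    return if @pending.zero?

    @queue.submit
    @pending = 0
    @last_flush = @clock.call
  end
end

# Hypothetical wiring inside a worker (requires the librato-metrics gem;
# the ENV names are assumptions):
#
#   require 'librato/metrics'
#   Librato::Metrics.authenticate ENV['LIBRATO_USER'], ENV['LIBRATO_TOKEN']
#   METRICS = WorkerMetrics.new(Librato::Metrics::Queue.new)
#   at_exit { METRICS.flush }   # short-lived workers: submit right before exit
#   METRICS.measure('jobs.processed', 1)
```

Long-running workers get periodic submission from measure, while short-lived ones rely on the at_exit hook. That final flush is where the small exit delay listed under Cons comes from.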

Use statsd

statsd is a tiny Node.js-based daemon that runs as a separate process; your workers send it UDP packets as they run. We have a Librato plugin that makes it easy to hook up.

NOTE: Be sure to set the countersAsGauges configuration option to true to ensure compatibility with metrics reported by librato-rails and librato-rack.

This offloads all the work of submission to the statsd daemon, so it no longer matters when your workers start or exit. You can then use the statsd-ruby library to report stats inline.

Pros: Extremely fast, no delay in workers exiting
Cons: You need to run another process
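
To show why reporting to statsd adds no delay to a worker, the sketch below sends statsd-format lines (name:value|type) over UDP using only Ruby's standard library. TinyStatsd is a hypothetical stand-in written for illustration; in practice you would use the statsd-ruby library mentioned above. The host and port are assumptions (8125 is the conventional statsd port).

```ruby
require 'socket'

# Sketch only: a hypothetical fire-and-forget client, not the statsd-ruby gem.
class TinyStatsd
  def initialize(host = '127.0.0.1', port = 8125)
    @host = host
    @port = port
    @socket = UDPSocket.new
  end

  # Counter: "name:1|c"
  def increment(name, by = 1)
    send_stat("#{name}:#{by}|c")
  end

  # Timer in milliseconds: "name:320|ms"
  def timing(name, ms)
    send_stat("#{name}:#{ms}|ms")
  end

  # Run the block, report how long it took, and return the block's result.
  def time(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    timing(name, (elapsed * 1000).round)
    result
  end

  private

  # UDP is best-effort: the send does not wait for a reply, and socket
  # errors are swallowed so metrics can never break the job itself.
  def send_stat(payload)
    @socket.send(payload, 0, @host, @port)
  rescue SystemCallError
    nil
  end
end

# Hypothetical use inside a job:
#
#   STATS = TinyStatsd.new
#   STATS.increment('jobs.processed')
#   STATS.time('jobs.duration') { do_the_work }
```

Because each call is a single datagram with no acknowledgement, a worker can exit immediately after its last measurement; aggregation and submission to Librato happen in the statsd daemon.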

On Heroku?

If you are on Heroku you can't run your own independent processes, so statsd is out. Start with librato-metrics, and if you find the performance profile isn't a good fit, send us an email to find out more about some alternative options we're working on.

Monitoring long-running threaded workers (Sidekiq, etc)

If you are using sidekiq or queue_classic (threaded), librato-rails or librato-rack may actually be a good fit for your workers. Be sure to configure them through environment variables, as your workers may not be able to find your YAML-based config files.

Newer versions of librato-rails and librato-rack don't start the background reporter until the process receives its first request. Since your workers will most likely never see a request, you need to tell the library explicitly to start the reporter.

If you are using librato-rails v0.10.1 or greater you can set LIBRATO_AUTORUN=1 in your environment before starting your sidekiq or queue_classic process. This will force the reporter to start during process startup.

Note that you should not set LIBRATO_AUTORUN=1 for your web processes, as it will likely interfere with reporting from them. The best ways to isolate LIBRATO_AUTORUN are to set it in a wrapper script that starts your workers or to put it on the workers' line in your Procfile, if you are using one.
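
One possible shape for such a wrapper script, written in Ruby, is sketched below. The script location, the worker_command helper, and the default sidekiq command are all assumptions for illustration; the only detail taken from this page is the LIBRATO_AUTORUN=1 setting itself.

```ruby
# Sketch of a hypothetical wrapper script (for example bin/start_worker)
# that is used ONLY to start worker processes, never web processes.

# Build the argument list for Kernel#exec: an environment hash followed by
# the command. Variables in the hash are added to the inherited environment,
# so LIBRATO_AUTORUN is set for the worker without touching anything else.
def worker_command(command = %w[bundle exec sidekiq])
  [{ 'LIBRATO_AUTORUN' => '1' }, *command]
end

# In the real script, the last line would replace this process with the
# worker. It is commented out here so that loading the sketch is harmless.
#
#   exec(*worker_command)
```

If you use a Procfile instead, the equivalent is to prefix only the worker entry's command with LIBRATO_AUTORUN=1 and leave the web entry unchanged.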

If you are using librato-rack v0.4.5 or greater you can force the reporter to start by calling Librato.tracker.start! after initialization. If you are sharing code between your web and sidekiq processes, make sure this call only happens in the sidekiq environment or it may interfere with metric collection from your web processes.
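
One way to confine the call to worker processes, assuming Sidekiq, is to place it inside a Sidekiq.configure_server block, which Sidekiq runs only in its own server (worker) process. The sketch below is an assumption about how you might wire this up, not an official recipe. The helper name is hypothetical, and Sidekiq and the tracker are passed in as arguments so the sketch runs without either gem installed.

```ruby
# Sketch only: start_librato_for_sidekiq is a hypothetical helper.
# In an application, `sidekiq` would be the Sidekiq module and `tracker`
# would be Librato.tracker (librato-rack v0.4.5 or greater).
def start_librato_for_sidekiq(sidekiq, tracker)
  # configure_server blocks are only executed inside the Sidekiq worker
  # process, so a web process that loads this same file never reaches
  # tracker.start! and its request-driven reporter is left alone.
  sidekiq.configure_server do |_config|
    tracker.start!
  end
end

# Hypothetical use in an initializer shared by web and worker processes:
#
#   start_librato_for_sidekiq(Sidekiq, Librato.tracker)
```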