
Using Dragonfly


Sidekiq can use Dragonfly, an open-source in-memory data store compatible with Redis APIs, to store all the job and operational data. By default, Sidekiq connects to localhost:6379. Since Dragonfly is 100% compatible with Redis clients, your connection configuration will look identical to what Redis requires. In terms of other server configurations, Dragonfly exposes different flags and options to tune the server for your specific use case.
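Dragonfly serves the Redis protocol on port 6379 by default, the same as Redis, so a locally running instance is picked up by Sidekiq's default connection without any configuration. A minimal local invocation might look like the following sketch; the --proactor_threads flag (covered later on this page) is optional and the value here is only an example:

$ dragonfly --proactor_threads=4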

Note: The Sidekiq-Dragonfly integration requires Dragonfly v1.13+.

Using an ENV variable

You can configure Dragonfly's location using an environment variable. The easiest option is to set REDIS_URL; Sidekiq will pick it up and use it to connect to the backend store, which in this case is simply Dragonfly, wire-compatible with Redis. A Redis URL looks like redis://[hostname]:[port]/[dbnumber], e.g. redis://my.dragonfly.instance.com:7777/0.
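For example, you might export the variable in the environment before starting your Sidekiq processes; the hostname and port below are placeholders:

$ export REDIS_URL=redis://my.dragonfly.instance.com:7777/0
$ bundle exec sidekiq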

Using an initializer

Prefer to use the ENV variable above. If you need logic in Ruby, you can configure the connection in an initializer. Note that to configure the location of Dragonfly, you must define both the Sidekiq.configure_server and Sidekiq.configure_client blocks. To do this, put the following code in config/initializers/sidekiq.rb.

Sidekiq.configure_server do |config|
  config.redis = { url: 'redis://my.dragonfly.instance.com:7777/0' }
end

Sidekiq.configure_client do |config|
  config.redis = { url: 'redis://my.dragonfly.instance.com:7777/0' }
end

Unknown parameters are passed to the underlying Redis client, so any parameters supported by the driver can go in the Hash. Keys can be strings or symbols and will be unified before being passed on to the Redis client.
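As a sketch, if your Dragonfly instance requires authentication or TLS, you might pass those driver options alongside the URL. The password variable and the ssl option below are assumptions for illustration, not required settings:

Sidekiq.configure_server do |config|
  # :url is consumed by Sidekiq; the remaining keys are handed to the
  # underlying Redis client driver unchanged.
  config.redis = {
    url: 'redis://my.dragonfly.instance.com:7777/0',
    password: ENV['DRAGONFLY_PASSWORD'], # hypothetical variable, for illustration
    ssl: true                            # only if your instance terminates TLS
  }
end

# Remember to mirror the same Hash in Sidekiq.configure_client.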

Life in the Cloud

One problem with cloud-based systems like EC2 and Heroku can be multi-tenancy and unpredictable network performance. If you are seeing occasional timeout errors, you can tune your network and pool timeouts to be a little more lenient; both default to 1 second.

config.redis = { url: 'redis://...', network_timeout: 5, pool_timeout: 5 }

REMEMBER: THIS IS A BANDAID. You are not solving the actual cause of the slow performance.

If you are seeing Dragonfly timeout errors, you should check your Dragonfly instance latency by using the redis-cli --latency and --latency-history flags:

$ redis-cli --latency-history localhost
min: 0, max: 1, avg: 0.20 (1359 samples) -- 15.01 seconds range
min: 0, max: 1, avg: 0.18 (1356 samples) -- 15.01 seconds range
min: 0, max: 1, avg: 0.18 (1355 samples) -- 15.01 seconds range
min: 0, max: 1, avg: 0.17 (1359 samples) -- 15.01 seconds range
min: 0, max: 1, avg: 0.19 (1358 samples) -- 15.00 seconds range
min: 0, max: 1, avg: 0.18 (1353 samples) -- 15.01 seconds range
min: 0, max: 1, avg: 0.19 (1357 samples) -- 15.01 seconds range

This says my average latency to localhost is 0.2ms, or 200 microseconds: excellent. Note how large a 5-second timeout configuration is by comparison. You can move to a different cloud provider or run your own Dragonfly server on a dedicated machine, but there's nothing Sidekiq can do if the network performance is terrible. Contact your cloud provider and ask about your available options.

Architecture

Dragonfly is developed from the ground up with a multi-threaded shared-nothing architecture, which can utilize all the CPU cores and memory of modern servers.

Dragonfly offers many different topologies:

  • Standalone -- offers no fault tolerance
  • Primary/Replica -- manually fails over to a replica in case of primary failure
  • Primary/Replica managed by Sentinel -- automatically fails over to a replica in case of primary failure
  • Primary/Replica managed by Kubernetes -- automatically fails over to a replica in case of primary failure

Depending on your requirements, you can choose the best topology for your use case. At the time of writing (January 2024, Dragonfly v1.14.1), Dragonfly does not support a multi-instance cluster mode like Redis Cluster. However, Redis Cluster is not appropriate for Sidekiq anyway: Sidekiq has a few very hot, constantly changing keys (its queues), and Cluster cannot guarantee the high-performance transactions necessary to keep your job system fast and consistent.

On the other hand, Dragonfly can handle large datasets (up to 1TB) and fully utilize the hardware resources of a single server. This makes Dragonfly a viable option for Sidekiq users who need to process large numbers of jobs.

Tuning

See Dragonfly's server flags for performance tuning, specifically the --shard_round_robin_prefix flag, which is explained in more detail here.
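For example, a Dragonfly server dedicated to Sidekiq could be started with the prefix flag pointing at Sidekiq's queue keys, mirroring the benchmark setup in the Scale section below; the thread count is only an example:

$ dragonfly --proactor_threads=8 --shard_round_robin_prefix="queue"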

Scale

The Sidekiq benchmark was slightly revamped in bin/multi_queue_bench to scale better with multi-threaded systems like Dragonfly. We used an AWS c5.4xlarge to run Dragonfly and a monstrous 96-core c5a.24xlarge to run the benchmark. All job types are no-op Sidekiq::Job. We ran Dragonfly with dragonfly --proactor_threads=n --shard_round_robin_prefix="queue", where n is the number of threads Dragonfly uses. This is different from Redis, which is single-threaded and scales horizontally via Redis Cluster. To demonstrate performance and scalability, we ran the benchmark with three different setups: RUBY_YJIT_ENABLE=1 PROCESSES=96 QUEUES=k THREADS=10 ./multi_queue_bench, with k=1, 2, 8. We used Ruby 3.3.0 and Sidekiq trunk.

The results are as follows:

Dragonfly Threads (n)   Number of Queues (k)   Jobs per Queue   Total Jobs   Throughput (Jobs/Sec)
1                       1                      1M               1M           115,528
2                       2                      1M               2M           241,259
8                       8                      1M               8M           487,781

Memory

Dragonfly runs best when all data fits in memory (i.e., no swapping). You should set the --cache_mode server flag to false so Dragonfly doesn't drop Sidekiq's data silently.
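A sketch of a startup command with caching disabled; this assumes Dragonfly's usual --flag=value syntax:

$ dragonfly --cache_mode=false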

Multiple Dragonfly instances

It is recommended to run Sidekiq with a dedicated Dragonfly instance. If you also use Dragonfly for other purposes, such as caching, run a separate Dragonfly instance for Sidekiq. This avoids potential issues such as cache eviction, conflicting configurations, and performance degradation.
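As a sketch, assuming a Rails application, the split might look like the following; the hostnames and ports are placeholders:

# config/initializers/sidekiq.rb -- jobs live on a dedicated Dragonfly instance
Sidekiq.configure_server do |config|
  config.redis = { url: 'redis://dragonfly-jobs.internal:6379/0' }
end

Sidekiq.configure_client do |config|
  config.redis = { url: 'redis://dragonfly-jobs.internal:6379/0' }
end

# config/environments/production.rb -- the Rails cache uses a separate instance
config.cache_store = :redis_cache_store, { url: 'redis://dragonfly-cache.internal:6380/0' }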

Notes

  • redis-cli --help shows several useful options, including --latency-history and --bigkeys, which can be used with Dragonfly.
  • Monitoring In-memory Data Stores covers the most essential things you should monitor when using Dragonfly.
  • BRPOPLPUSH and other B* commands are blocking; large latency spikes with them are normal and not a cause for concern.

Previous: Best Practices Next: Error Handling