We are sending out roughly 300k ActionMailer emails via Sidekiq::Batches. These batches take 2-3 hours to complete. When they are finished, the Sidekiq::Batch::Status still reports that is_complete is false and shows a negative pending job count. No errors are thrown, but the whole process takes far too long. How can we speed this up?
We spawn up to 350k Sidekiq batch jobs at one time. It takes 30 threads a couple of hours to get through the whole lot.
We spawn the batches in BroadcastMessageSendWorker. message.contacts can contain up to 350k contacts.
    class BroadcastMessageSendWorker
      include Sidekiq::Worker

      def perform(message_guid)
        ActiveRecord::Base.connection_pool.with_connection do
          message = BroadcastMessage.find(message_guid)
          message.with_lock do
            return unless message.pending?
            message.pickup!
            if message.contacts.count == 0
              message.finish!
              return
            end
            batch = Sidekiq::Batch.new
            batch.on(:complete, self.class, 'guid' => message_guid)
            batch.jobs do
              # We can't use `uniq` or `DISTINCT` with find_in_batches because after 1000 records it
              # will start blowing up. Instead, use an in-memory `seen` index
              seen = Set.new({})
              message.contacts.select(:id).find_in_batches do |contact_batch|
                args = contact_batch.pluck(:id).map do |contact_id|
                  next unless seen.add?(contact_id) # add? returns nil if the object is already in the set
                  [message_guid, contact_id]
                end
                Sidekiq::Client.push_bulk('class' => BroadcastMessageDeliverWorker, 'args' => args.compact)
              end
            end
            message.update(batch_id: batch.bid)
          end
        end
      end

      def on_complete(_, options)
        message = BroadcastMessage.find(options['guid'])
        message.finish! if message.sending?
      end
    end
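The `seen` index in the worker above relies on `Set#add?`, which returns `nil` when the element is already present. A minimal standalone sketch of that deduplication across batches follows; the helper name `each_unique_batch` and the sample IDs are hypothetical and not part of the original code.

```ruby
require 'set'

# Hypothetical helper: yields each batch of IDs with duplicates removed,
# where "duplicate" means already seen in this batch or any earlier one.
def each_unique_batch(batches)
  seen = Set.new
  batches.each do |batch|
    # Set#add? returns nil when the element is already in the set,
    # so select keeps only first occurrences.
    fresh = batch.select { |id| seen.add?(id) }
    yield fresh unless fresh.empty?
  end
end

result = []
each_unique_batch([[1, 2, 2], [2, 3], [1, 3]]) { |ids| result << ids }
result # => [[1, 2], [3]]
```

Unlike the worker above, this sketch skips batches that turn out to be entirely duplicates. The set holds every ID seen so far in memory for as long as the enqueue runs.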
Here is the BroadcastMessageDeliverWorker that we use to send the Action Mailer messages.
    class BroadcastMessageDeliverWorker
      include Sidekiq::Worker
      sidekiq_options queue: 'broadcast_mailers', retry: 2, dead: false

      def perform(message_guid, contact_id)
        ActiveRecord::Base.connection_pool.with_connection do
          if Rails.env.staging?
            Rails.logger.error "Not running in staging"
            return
          end
          return unless valid_within_batch?
          BroadcastMessageMailer.send_message(message_guid, contact_id).deliver_now
        end
      end
    end
Here is the status info of the last batch that went out. It looks like things aren’t completing correctly:
First, a back-of-the-envelope calculation: 360k emails. IME it takes ~1 second to send an email on average. One hour is 3600 seconds, so that's 3600 emails/hour/thread, or 100 threads to send those emails in one hour. If it's taking 30 threads about 3 hours to send the emails, that's about what I would expect in the real world. You can always profile the Worker code to look for hotspots, but at first glance your code looks like it's doing all the right things: find_in_batches, pluck(:id), push_bulk, etc.
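The estimate above can be written out as a small calculation. This is a rough sketch: it assumes the ~1 second per email figure and perfectly linear scaling across threads, and the method names are made up for illustration.

```ruby
# Back-of-the-envelope throughput estimate.
# Assumptions: ~1 second per email, threads scale linearly.
SECONDS_PER_HOUR = 3600

# Threads required to send `emails` within `hours`.
def threads_needed(emails, hours, seconds_per_email: 1.0)
  (emails * seconds_per_email / (hours * SECONDS_PER_HOUR)).ceil
end

# Hours required to send `emails` with `threads` workers.
def hours_needed(emails, threads, seconds_per_email: 1.0)
  emails * seconds_per_email / (threads * SECONDS_PER_HOUR)
end

threads_needed(360_000, 1)         # => 100
hours_needed(360_000, 30).round(2) # => 3.33
hours_needed(360_000, 60).round(2) # => 1.67
```

If doubling the thread count does not come close to halving the measured time, the bottleneck is probably somewhere other than thread count, for example the mail provider or the database.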
Things I would try:
double your dynos to 60 threads and see if that halves the time
double the concurrency within each dyno and see if that halves the time
As for the negative pending, that's a worry. Make sure you are shutting down your Sidekiqs cleanly when deploying, sending TERM with plenty of time to shut down and avoid Heroku's kill -9 on shutdown timeout:
Another customer has found that they get negative pending errors when they push too much data into Redis so that it starts evicting Batch data. Make sure you've enabled maxmemory-policy noeviction in your Redis instance so Redis does not silently break Sidekiq.
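One way to check the eviction policy from Ruby is to inspect the Redis INFO output. The sketch below assumes INFO exposes the policy under the `maxmemory_policy` key and that the hash can be obtained from something like `Sidekiq.redis_info`; verify both against your Redis and Sidekiq versions. The helper name `eviction_warning` is hypothetical.

```ruby
# Hypothetical helper; `info` is a Hash shaped like the output of Redis INFO.
# Returns nil when the policy is safe, otherwise a warning message.
def eviction_warning(info)
  policy = info["maxmemory_policy"]
  return nil if policy == "noeviction"
  "Redis maxmemory-policy is #{policy.inspect}; Batch data may be evicted. " \
    "Set it to noeviction."
end

# In a running app (assumption: Sidekiq.redis_info is available):
#   warning = eviction_warning(Sidekiq.redis_info)
#   Sidekiq.logger.warn(warning) if warning

eviction_warning({ "maxmemory_policy" => "noeviction" })  # => nil
eviction_warning({ "maxmemory_policy" => "allkeys-lru" }) # => warning string
```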
Ruby version: 2.4.1
Sidekiq: 5.2.2
Pro: 4.0.4