Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pro Batch question #4142

Closed
bragboy opened this Issue Apr 9, 2019 · 5 comments

Comments

Projects
None yet
2 participants
@bragboy
Copy link

bragboy commented Apr 9, 2019

This is more of an implementation question on using the Batch feature of Sidekiq Pro.

Currently, I have a bunch of synchronous code execution done inside a background job which needs parallel execution now and then (as exactly shown in the Really complex workflow of batches - https://github.com/mperham/sidekiq/wiki/Really-Complex-Workflows-with-Batches )

I have solved it like this (note: this is a trimmed down version of my complex workflow)

class SampleWorker
  include Sidekiq::Worker

  def perform
    puts 'SOME LONG RUNNING TASK BEFORE'
    DummyWorker.dummy_task
    puts 'SOME LONG RUNNING TASK AFTER'
  end
end
class DummyWorker
  include Sidekiq::Worker

  def perform(page_id)
    sleep(1)
    puts "DONE FOR #{page_id}"
  end

  def on_success(status, options)
    puts 'Batch processing ended'
    Rails.cache.write(options['bid'], 'complete')
  end

  def self.dummy_task
    puts 'Batch processing started'
    batch = Sidekiq::Batch.new
    batch.on(:success, self, 'bid' => batch.bid)
    batch.jobs do
      (1...10).to_a.each do |dummy_id|
        DummyWorker.perform_async(dummy_id)
      end
    end
    sleep(1) until Rails.cache.read(batch.bid) == 'complete'
  end
end

In my opinion (although I use sleep(1)), this looks cleaner and helps me put more complex batch workflows as a single line instead of chaining as shown in the documentation example.

Question: Is this a okay design ( need @mperham's expert opinion )

And thanks for the Batch feature, it kicks a**!

@mperham

This comment has been minimized.

Copy link
Owner

mperham commented Apr 9, 2019

It's a hack but it'll work for the normal case.

The issue is that your SampleWorker#perform method will block, possibly for a long time, and do nothing. You are allocating a Sidekiq thread to sit around, doing nothing but check status every second, which is costly at scale. If there are retries, the success callback can take hours or days to fire.

@mperham

This comment has been minimized.

Copy link
Owner

mperham commented Apr 9, 2019

Another bad case: If you restart Sidekiq while the job is polling, you will lose your place and restart the workflow from scratch. It's best to follow the Batch workflow documentation, it's designed to handle all of these edge cases cleanly.

@bragboy

This comment has been minimized.

Copy link
Author

bragboy commented Apr 10, 2019

@mperham - Thanks for the advice. I knew this was a code smell for sure. Will try to refactor it to the model shown in the documentation. The reason I raised this question was to use something similar to the await/async in Javascript workflow wherein if I use the line await, it will essentially make the code synchronous (under the hood it does not block anything like I've done).

async function sample() {
  let response = await fetch('/article/promise-chaining/user.json');
}

If my understanding is correct, Ruby does not have such capability. Right?

@mperham

This comment has been minimized.

Copy link
Owner

mperham commented Apr 10, 2019

JavaScript uses events under the covers so it can wait efficiently but even then, if you restart the Node process, you'll lose where you were. With Batches, they are designed to run across machines/processes and fire callbacks on complete/success. It's a true persistent workflow system.

Ruby can use something like EventMachine but that's rare, it's fallen out of fashion.

@bragboy

This comment has been minimized.

Copy link
Author

bragboy commented Apr 11, 2019

Thanks @mperham !

@bragboy bragboy closed this Apr 11, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.