features to pause batch execution #1323
I think this looks good so far. I think some Readme-driven development might be in order here. What do you think about an interface like this:
my_job = MyJob.set(good_job_paused: true).perform_later(1234)
my_job.good_job_unpause
# or
job_id = my_job.job_id
MyJob.good_job_unpause(job_id) # or an array of IDs, or an array of ActiveJob::Base instances
Another thing we'll want to think about is under what job states a job could be paused (after job creation). Those get a little messy because we'll probably want to take an advisory lock (to prevent pausing a job that is actively executing). But we can defer that too if we just want to focus on pausing jobs at creation (we don't have to land everything all at once).
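The question of which job states allow pausing can be made concrete with a small sketch. This is a toy in-memory model, not GoodJob's implementation: `ToyJob`, `pause_if_idle`, and the `locked` flag (a stand-in for an advisory lock held by an executing thread) are all hypothetical, and it assumes a paused job is one with no `scheduled_at`.

```ruby
# Toy model only; none of these names are part of GoodJob's API.
ToyJob = Struct.new(:id, :scheduled_at, :finished_at, :locked, keyword_init: true) do
  # Assumption: a paused job is an unfinished job with no scheduled time.
  def paused?
    finished_at.nil? && scheduled_at.nil?
  end
end

# Returns true if the job was paused, false if it was not in a pausable state.
def pause_if_idle(job)
  return false if job.finished_at # already finished: nothing left to pause
  return false if job.locked      # stand-in for "advisory lock held, job is executing"

  job.scheduled_at = nil
  true
end

queued   = ToyJob.new(id: 1, scheduled_at: Time.now, locked: false)
running  = ToyJob.new(id: 2, scheduled_at: Time.now, locked: true)
finished = ToyJob.new(id: 3, scheduled_at: Time.now, finished_at: Time.now, locked: false)

results = [queued, running, finished].map { |job| pause_if_idle(job) }
```

In a real database-backed version the lock check and the update would need to happen atomically; the sketch only shows which states would be rejected.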
The snippet in my original post was just a proof of concept; I started with pausing/unpausing single jobs just to get a handle on the mechanics. I'm of the same mind about using the batch for the pause/unpause methods. I'm keeping the 'pausing an arbitrary batch/jobs' feature in the back of my mind while I'm working, but for the first pass I'm just targeting pausing a batch at creation and adding jobs to it:
batch = GoodJob::Batch.new(paused: true)
batch.save
batch.add do
10.times { MyJob.perform_later(1234) } # All jobs added to the batch at this point are paused
end
batch.enqueue # can be called before or after #unpause
batch.unpause # Jobs will now start running
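To illustrate the intended behaviour of a batch that is paused at creation, here is a minimal plain-Ruby sketch. `ToyBatch` and its methods are hypothetical and only mimic the shape of the snippet above; it assumes paused jobs carry no `scheduled_at` and that a runner only picks up jobs that have one.

```ruby
# Toy model only; this is not GoodJob::Batch.
class ToyBatch
  attr_reader :jobs

  def initialize(paused: false)
    @paused = paused
    @jobs = []
  end

  def paused?
    @paused
  end

  # Jobs added while the batch is paused get no scheduled time, so a runner
  # that only selects jobs with a scheduled_at would skip them.
  def add(job_name)
    job = { name: job_name, scheduled_at: (paused? ? nil : Time.now) }
    @jobs << job
    job
  end

  # Gives every paused job a scheduled time so it becomes runnable.
  def unpause
    @paused = false
    now = Time.now
    @jobs.each { |job| job[:scheduled_at] ||= now }
    self
  end

  def runnable_jobs
    @jobs.reject { |job| job[:scheduled_at].nil? }
  end
end

batch = ToyBatch.new(paused: true)
10.times { |i| batch.add("MyJob-#{i}") }
runnable_before = batch.runnable_jobs.size
batch.unpause
runnable_after = batch.runnable_jobs.size
```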
These don't get distributed as options to a job, they are "properties" of the batch. So this won't work:
OK! I still need to document everything and add/update tests (I think the failures are caused by seed data containing NULL values), but I'd appreciate a once-over before that in case there's anything you'd like changed.
def paused?
  # TODO: consider querying to see if any jobs within the batch are paused, and if/how that should be represented if that result does not match properties[:paused]
  # I think there are probably going to need to be separate methods for "is the batch set to pause all jobs within it" and "does the batch contain any paused jobs" as those cases aren't always lined up
  properties[:paused] || false
I don't really want to support (at least right now, initially) having a special property of a batch ("paused") have a side effect or meaning to the library; I consider them arbitrary application data.
I realize it makes it a little less optimal to say: "To create a paused batch, enqueue any job that is paused" but I think that's a fine workaround for now. I sort of see the order here being:
- Design and implement the functionality to allow jobs to be paused/unpaused
  - I think you have all the changes for allowing scheduled_at to be nil
  - I think we need to handle the unpausing interface, especially in the dashboard
- Handle the situation in which a batch has paused jobs in it
And honestly, other than that, this looks great. I can help polish stuff up whenever you're OK with where things are at.
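One way to picture the last step, "a batch has paused jobs in it", without giving `properties[:paused]` any library meaning is to derive the answer from the jobs themselves. The sketch below is hypothetical: `ToyBatchView` and both predicate names are invented for illustration, and it again assumes a paused job is an unfinished job with no `scheduled_at`.

```ruby
# Toy model only; these predicates are not part of GoodJob.
ToyBatchView = Struct.new(:properties, :jobs) do
  # What the application chose to store on the batch: arbitrary data.
  def marked_paused?
    properties[:paused] == true
  end

  # Derived from the jobs, independent of whatever the properties say.
  def any_paused_jobs?
    jobs.any? { |job| job[:finished_at].nil? && job[:scheduled_at].nil? }
  end
end

view = ToyBatchView.new(
  {}, # the application stored nothing about pausing
  [
    { scheduled_at: Time.now, finished_at: nil },
    { scheduled_at: nil,      finished_at: nil } # a paused job
  ]
)
```

The two predicates can disagree, which is the mismatch the TODO in the diff points at.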
Would you be open to storing a batch's paused state in a separate column/hash/etc to avoid the interference, or would you prefer to remove that entirely?
re: unpausing interface
# public API
j = OtherJob.set(good_job_paused: true).perform_later
j.good_job_unpause # calls unpause(self) on the current adapter
# public API (maybe?) I put this on the adapter by default, not sure where else to put it
GoodJob::Adapter#unpause(jobs_or_ids) # The 'real' method for unpausing arbitrary jobs, loads GoodJob::Job records, calls unpause on individual records
# internal API, basically just what Batch#unpause is doing
r = GoodJob::Job.find(my_job_id)
r.paused?
r.unpause # probably need to provide some option to disable the notifier within the method so the caller can batch notifications
What functionality did you have in mind for the dashboard? I could see a use for unpausing jobs by hand if we also had the capability to pause jobs by hand, but if they're only paused programmatically I don't see as much benefit.
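The remark about disabling the notifier so the caller can batch notifications could look roughly like the following. Everything here is a hypothetical stand-in (`ToyNotifier`, `ToyRecord`, `unpause_all`, and the `notify:` keyword); it is a sketch of the idea, not the adapter or record API in this PR.

```ruby
# Toy model only; stands in for whatever wakes up job runners.
class ToyNotifier
  attr_reader :messages

  def initialize
    @messages = []
  end

  def notify(payload)
    @messages << payload
  end
end

ToyRecord = Struct.new(:id, :queue_name, :scheduled_at) do
  # Assumption: paused means "no scheduled time".
  def paused?
    scheduled_at.nil?
  end

  # notify: false lets a caller unpause many records and notify once afterwards.
  def unpause(notifier:, notify: true)
    return false unless paused?

    self.scheduled_at = Time.now
    notifier.notify(queue_name: queue_name) if notify
    true
  end
end

# Unpauses every paused record, then sends one notification per affected queue.
def unpause_all(records, notifier:)
  changed = records.select { |record| record.unpause(notifier: notifier, notify: false) }
  changed.map(&:queue_name).uniq.each { |queue| notifier.notify(queue_name: queue) }
  changed.size
end

notifier = ToyNotifier.new
records = [
  ToyRecord.new(1, "default", nil),
  ToyRecord.new(2, "default", nil),
  ToyRecord.new(3, "mailers", nil),
  ToyRecord.new(4, "default", Time.now) # not paused, left untouched
]
unpaused_count = unpause_all(records, notifier: notifier)
```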
> Would you be open to storing a batch's paused state in a separate column/hash/etc to avoid the interference, or would you prefer to remove that entirely?
I'd like to defer that. So not one way or another in perpetuity, but for right now it looks like we have a path to allow for a pausing feature without changing any existing interfaces or database columns. This is mostly about my personal ability to handle complexity, not the technical stuff.
> What functionality did you have in mind for the dashboard? I could see a use for unpausing jobs by hand if we also had the capability to pause jobs by hand, but if they're only paused programmatically I don't see as much benefit.
We'd need to show that the jobs are in a paused state on the dashboard, to answer the question of "why isn't this job running?" And it would be nice to also add options (simple ones) to pause and unpause through the UI.
On the programmatic side, I think the pause/unpause methods should be class methods on the job, e.g. OtherJob.unpause(jobs_or_ids)
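A class-level `unpause(jobs_or_ids)` mostly comes down to normalizing its argument. The sketch below shows one way to accept a single job, a single ID, or a mixed array; `ToyJobBase`, its in-memory `STORE`, and `enqueue_paused` are hypothetical and stand in for real job records.

```ruby
require "securerandom"

# Toy model only; not ActiveJob::Base or a GoodJob class.
class ToyJobBase
  STORE = {} # job_id => scheduled_at (nil means paused, by assumption)

  attr_reader :job_id

  def initialize
    @job_id = SecureRandom.uuid
  end

  def self.enqueue_paused
    job = new
    STORE[job.job_id] = nil
    job
  end

  # Accepts a job, an ID, or an array mixing both.
  # Returns the IDs that were actually unpaused.
  def self.unpause(jobs_or_ids)
    ids = Array(jobs_or_ids).map { |item| item.respond_to?(:job_id) ? item.job_id : item.to_s }
    ids.select { |id| STORE.key?(id) && STORE[id].nil? }
       .each { |id| STORE[id] = Time.now }
  end
end

first  = ToyJobBase.enqueue_paused
second = ToyJobBase.enqueue_paused

unpaused_first = ToyJobBase.unpause(first)                   # a job instance
unpaused_rest  = ToyJobBase.unpause([first, second.job_id])  # mixed array; first is already unpaused
```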
I noticed there are some situations where jobs will be enqueued with |
I don't consider those public methods, and I would expect job enqueues to go through the Adapter. You bring up a good point, though, that I should double-check that. Reading the code now, maybe I'm wrong and it's changed, though I do think it's intended given breadcrumbs like this: good_job/app/models/good_job/execution.rb, lines 514 to 523 in 4fd5ac4
We should do a separate PR if there are places that we discover are not reliably setting |
It looks like |
There are a handful of tests that assert that As far as my understanding goes I don't think this should cause any issues, but a spot-check to make sure my understanding is right would be appreciated. |
fyi, I did some exploratory testing here: #1332 The takeaway is that this feature might have to hold until I get a major release for GoodJob done, because if someone hasn't migrated to the new form of jobs (there is a temporary column called "is_discrete" to denote that migration), the
Yes, if the job has been enqueued without a scheduled time, scheduled_at and created_at should have the exact same value. From my exploratory testing, it seems the two values only match exactly once the jobs have been migrated to Discrete, so other tests may have assertions from older behavior. I would expect a test I wrote today to either be like |
OK, should I skip the "always set |
dbd770e to 4922fac
… select all paused jobs if the batch had not been persisted
… the intended behavior is that it is always set
4922fac to 7c599a7
Implements #1319
Pausing jobs works as far as enqueueing and the runners are concerned; unpausing works too, but has to be done by hand. Opening this now so I have the diff to look at and to make it easy to discuss implementation.