Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable/Enable #7

Closed
tooolbox opened this issue Aug 5, 2020 · 6 comments · Fixed by #20
Closed

Disable/Enable #7

tooolbox opened this issue Aug 5, 2020 · 6 comments · Fixed by #20

Comments

@tooolbox
Copy link
Contributor

tooolbox commented Aug 5, 2020

I realized that Dalga does not have a pause/resume functionality. Would you be open to adding this feature?

There are some specific semantics that I'm interested in for this. For example, if a job would run weekly and the next run is on the 2nd of the month, pausing on the 1st and resuming on the 3rd should skip that iteration, and the job would run again on the 9th.

This is different than a delay caused by downtime. For instance, if the next run is on the 2nd and the system went down on the 1st and came up on the 3rd, it would immediately process the job and reschedule for the 9th.

(From what I understand this is already how Dalga would handle downtime, I am just providing a counter-example to illustrate what I am looking for in a Pause/Resume functionality.)

@cenkalti
Copy link
Owner

cenkalti commented Aug 6, 2020

I agree that downtime and pause/resume are different cases.

Adding this feature requires a change in data model that I am not comfortable with it. I think this problem can be solved outside Dalga.

What about this?

  • For pausing the job, DELETE the job and possibly save it somewhere else.
  • For resuming the job, POST the job again.

@tooolbox
Copy link
Contributor Author

tooolbox commented Aug 6, 2020

Hm, I see. It does complicate things for me, what with storing data in two places and therefore building a distributed transaction with Dalga. Let me review this and get back to you.

@tooolbox
Copy link
Contributor Author

Hey @cenkalti, so now that I have added ISO8601 support and some key retry features (more to come, though) I took another look at the pause/resume functionality.

Assuming that #13 lands, and with what I have in #14 so far, (i.e. the next_sched column) it looks like this feature could be implemented by making the next_run column nullable. To pause the job, make next_run null and none of the instances will pick it up.

For resumption, it depends on the FixedIntervals setting:

  • If FixedIntervals is false, next_run is set to the value of next_sched. (This means that if next_sched is in the past, the job would run immediately.)
  • If FixedIntervals is true, both next_run and next_sched are moved shifted forward by interval until they're in the future.

If we don't want to make next_run nullable then a flag column of some kind could be used to exclude the row from the Front() query. Anyway, this is roughly what I'm thinking and I wanted to get your input.


(Note that I wrote most of this last night, before I saw your comments on #13, I'm going to go read them now.)

@cenkalti
Copy link
Owner

Sounds good. Making the next_run column nullable can work. I prefer this method instead of adding a separate boolean column.

We should also handle the case where job is paused while it is running. execute method may update the next_run field. We may need to add a WHERE clause to the SQL that updates the next_run field.

A suggestion about the terminology, maybe we can call it disable/enable instead of pause/resume. You decide.

@tooolbox
Copy link
Contributor Author

Sounds good. Making the next_run column nullable can work. I prefer this method instead of adding a separate boolean column.

Okay, fantastic.

We should also handle the case where job is paused while it is running. execute method may update the next_run field. We may need to add a WHERE clause to the SQL that updates the next_run field.

...Huh. That is a very good point. I guess when it finishes it should update next_sched but if next_run has been NULL'd it doesn't update. Possibly to avoid 2 queries we could use CASE, so in UpdateNextRun() you have something like UPDATE jobs SET next_sched=?, next_run=CASE next_run IS NULL THEN NULL ELSE ? END, instance_id=NULL, location=? WHERE path = ? AND body = ?. I'm not 100% sure that will work, would need to try it out.

But yeah, we should account for that for sure.

A suggestion about the terminology, maybe we can call it disable/enable instead of pause/resume. You decide.

I agree. Now that I think of it, pause/resume sounds like it relates to an instance of the job running, which could be misleading. So we'll do enable/disable.

@tooolbox tooolbox changed the title Pause/Resume Disable/Enable Aug 24, 2020
@tooolbox
Copy link
Contributor Author

This is my next item, to follow #18

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants