
Allow for custom job ID #264

Closed
rzajac opened this issue May 20, 2015 · 13 comments

@rzajac

rzajac commented May 20, 2015

```
put <pri> <delay> <ttr> <bytes> <id>\r\n
<data>\r\n
```

This would allow clients to implement failover.
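For reference, a minimal sketch (not part of the original proposal) of how a client issues the existing `put` command over a raw socket, following the current protocol; the proposed `<id>` field appears only in a comment, since the extension was never adopted. Host, port, and payload are placeholders.

```python
import socket

def put_job(host: str, port: int, data: bytes,
            pri: int = 1024, delay: int = 0, ttr: int = 60) -> int:
    """Issue the current protocol's put command; return the server-assigned job ID."""
    with socket.create_connection((host, port)) as sock:
        # Current protocol:  put <pri> <delay> <ttr> <bytes>\r\n<data>\r\n
        # Proposed variant:  put <pri> <delay> <ttr> <bytes> <id>\r\n<data>\r\n
        header = f"put {pri} {delay} {ttr} {len(data)}\r\n".encode()
        sock.sendall(header + data + b"\r\n")
        reply = sock.recv(1024).decode()  # e.g. "INSERTED 42\r\n" (sketch: assumes one recv)
        status, _, rest = reply.partition(" ")
        if status != "INSERTED":
            raise RuntimeError(f"put failed: {reply.strip()}")
        return int(rest)  # int() tolerates the trailing \r\n

job_id = put_job("127.0.0.1", 11300, b"daily-report")
print("server-assigned job ID:", job_id)
```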

@emanuelecasadio

I can't see the reason for specifying a custom job ID within a queue.

@aight8

aight8 commented May 21, 2015

I really do see a reason, and I really hope this gets implemented!
Here is an example:

  • You have a number of repeatable jobs that you want to execute once every day, but you want them handled in the queue. Today you first have to delete the whole tube (by iterating over and deleting the queue items), which is ugly. With this feature it would be a lot easier and more flexible!
  • You can check a specific job's state.
  • and so on...

Please implement this!! 👍

@ifduyue

ifduyue commented May 21, 2015

You have a number of repeatable jobs that you want to execute once every day, but you want them handled in the queue. Today you first have to delete the whole tube (by iterating over and deleting the queue items), which is ugly. With this feature it would be a lot easier and more flexible!

This can be done easily by checking whether a daily job has already been executed, either before putting it into beanstalkd or after reserving it (see the sketch below).

And why not just use /etc/cron.daily/?
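For what it's worth, ifduyue's pre-put check could look something like this minimal sketch, assuming Redis as the shared marker store and the third-party greenstalk client (key name, host, and expiry are illustrative):

```python
import datetime

import greenstalk  # third-party beanstalkd client, assumed here
import redis       # shared marker store, assumed here

r = redis.Redis()
queue = greenstalk.Client(("127.0.0.1", 11300))

def put_daily_job(name: str, body: str) -> None:
    # SET NX with a ~25h expiry: only the first producer of the day wins,
    # so the job is enqueued at most once per day.
    today = datetime.date.today().isoformat()
    if r.set(f"daily:{name}:{today}", 1, nx=True, ex=90000):
        queue.put(body)

put_daily_job("report", "generate-daily-report")
```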

@rzajac
Author

rzajac commented May 22, 2015

Allowing clients to specify custom job IDs would let me implement a sort of HA and failover at the library level:

  • adding/deleting... jobs to two or more beanstalkd servers
  • in case one server fails, workers/producers could connect to other servers from the pool

Unless I'm missing something and the current protocol allows for better solutions.

PS. Adding a job with an ID that already exists should trigger an error.
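Under the current protocol, the pattern @rzajac describes can only be approximated by embedding a client-generated ID in the job body; a hedged sketch, assuming the greenstalk client and an illustrative server pool (the divergence concerns raised later in this thread still apply):

```python
import json
import uuid

import greenstalk  # third-party client, assumed here

SERVERS = [("10.0.0.1", 11300), ("10.0.0.2", 11300)]  # illustrative pool

def put_everywhere(payload: dict) -> str:
    # The client-generated ID travels inside the body, since the protocol
    # has no <id> field; consumers must de-duplicate on it themselves.
    job_id = str(uuid.uuid4())
    body = json.dumps({"id": job_id, "payload": payload})
    for addr in SERVERS:
        try:
            greenstalk.Client(addr).put(body)
        except OSError:
            pass  # server down: the other replica still holds the job
    return job_id
```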

@jabdoa2

jabdoa2 commented May 27, 2015

How do you maintain your job IDs without a central point of failure? In general, in a distributed system you cannot have a job which runs exactly once (http://bravenewgeek.com/you-cannot-have-exactly-once-delivery/). Beanstalkd provides at-most-once delivery and we do not want to change that. You can get at-least-once with more than one beanstalkd plus some distributed locking or similar (some people use memcached or a DB for that purpose).

@rzajac
Author

rzajac commented May 29, 2015

Maintaining job IDs is out of the scope of this ticket. What I need is to be able to specify my own job ID.

@JensRantil
Contributor

This would allow clients to implement failover.

@rzajac Could you elaborate a little on this? I'm not exactly sure about your use-case.

@rzajac
Author

rzajac commented Apr 3, 2016

@JensRantil I explained it a little bit here: #264 (comment)

@sergeyklay
Member

This feature request breaks backward compatibility with protocol v1.x.

@JensRantil
Contributor

@rzajac Ah, sorry. Missed that. Thanks!

I'm going to be the devil's advocate here and shoot down some of the use cases :-)

@aight8 wrote:

You have a number of repeatable jobs that you want to execute once every day, but you want them handled in the queue. Today you first have to delete the whole tube (by iterating over and deleting the queue items), which is ugly. With this feature it would be a lot easier and more flexible!

There are various approaches to regular cronjobs:

  • Simply having an /etc/cron.daily script that puts your daily job on the queue.
  • Having a permanent job that is released back each day with a delay until the next midnight (see the sketch after this list).
  • For multiple jobs executed at the same time every day, you could use either of the two approaches above and simply have your daily job put smaller tasks on the queue. That is, the daily task would split itself into smaller tasks.
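A minimal sketch of the second approach, assuming the third-party greenstalk Python client (any client with reserve/release works): the worker runs the task, then releases the same job back with a delay computed to land at the next midnight, so one permanent job recurs daily.

```python
import datetime

import greenstalk  # third-party client, assumed here

def seconds_until_next_midnight() -> int:
    now = datetime.datetime.now()
    midnight = datetime.datetime.combine(
        now.date() + datetime.timedelta(days=1), datetime.time.min)
    return int((midnight - now).total_seconds())

def run_daily_task(body: str) -> None:
    print("running daily task:", body)  # placeholder for the real work

client = greenstalk.Client(("127.0.0.1", 11300))
while True:
    job = client.reserve()
    run_daily_task(job.body)
    # Put the same job back, delayed until the next midnight.
    client.release(job, delay=seconds_until_next_midnight())
```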

You can check a specific job's state.

Valid point. A workaround is to store the job ID in another datastore.

and so on...

Not an argument. Carry on. ;)

@rzajac wrote:

Allowing clients to specify custom job IDs would let me implement a sort of HA and failover at the library level:

I really don't think this is a good idea. Doing double writes independently to two queues is bound to eventually make them diverge into different states; there are all sorts of race conditions. For example, a job's TTR times out on one queue but not on the other. Another problem is that you currently can't reserve a specific job. You can delete a specific job, but then you can't be sure that no other consumer has already reserved it, etc.

The real solution here would be to use something like ZooKeeper's ZAB or, probably even better, the Raft algorithm: all writes would go through a master, and a majority would need to acknowledge each state change. This would obviously introduce complexity, new failure modes, and additional latency to every operation.

@urjitbhatia

@rzajac @JensRantil I've also run into this.
The way I do it right now is to put the external ID in Redis, as @JensRantil suggested, and save a mapping to the beanstalkd-generated ID. Then I use it later to cancel the job, query it, etc. In a way, having beanstalkd take a custom ID would eliminate the need for an extra piece of infra.
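A sketch of that mapping, assuming redis-py and the greenstalk client (key names and addresses are illustrative): the producer stores external ID → beanstalkd job ID at put time, and later operations resolve through Redis.

```python
import greenstalk  # third-party client, assumed here
import redis       # mapping store, assumed here

r = redis.Redis()
queue = greenstalk.Client(("127.0.0.1", 11300))

def put_with_external_id(external_id: str, body: str) -> int:
    bs_id = queue.put(body)                # beanstalkd assigns the real ID
    r.set(f"jobmap:{external_id}", bs_id)  # remember the mapping
    return bs_id

def cancel(external_id: str) -> None:
    bs_id = r.get(f"jobmap:{external_id}")
    if bs_id is not None:
        queue.delete(int(bs_id))           # delete by the resolved job ID
        r.delete(f"jobmap:{external_id}")
```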

@yellow1912

This would also help with self-throttling jobs on the client side :) Simply checking whether the job is already there lets us avoid sending another one, or just increase the delay time.
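Building on the mapping above, a hedged sketch of that client-side throttle (greenstalk and Redis again assumed): stats-job tells us whether the previous job is still alive before we enqueue another one.

```python
import greenstalk
import redis

r = redis.Redis()
queue = greenstalk.Client(("127.0.0.1", 11300))

def put_unless_pending(external_id: str, body: str) -> None:
    bs_id = r.get(f"jobmap:{external_id}")
    if bs_id is not None:
        try:
            queue.stats_job(int(bs_id))  # still known to beanstalkd?
            return                       # yes: throttle, don't re-send
        except greenstalk.NotFoundError:
            pass                         # gone: safe to enqueue again
    r.set(f"jobmap:{external_id}", queue.put(body))
```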

@JensRantil
Contributor

Okay, I'm going to close this issue as a no-go. Reasons are as follows:

  • Allowing a custom ID breaks the uniqueness of job IDs. There is also potential for concurrency confusion, since a very recently deleted job might seem to "pop up" again when a new job with the same ID is added, confusing both clients and developers.
  • There are lots of ways this can be solved without expanding the scope of beanstalkd:
    • submit multiple identical cronjob tasks and make the task processing idempotent, such as gracefully ignoring a recently processed message (see the sketch after this list).
    • only run a single process that pushes the cronjob task to beanstalkd.
    • take a distributed lock to make sure only a single process pushes the cronjob task to beanstalkd. See for example https://dkron.io.
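A minimal sketch of the first workaround, assuming Redis as the dedup store and the greenstalk client (both illustrative): the job body doubles as the dedup key, and a SET NX guard makes reprocessing a no-op.

```python
import greenstalk  # third-party client, assumed here
import redis       # dedup store, assumed here

r = redis.Redis()
client = greenstalk.Client(("127.0.0.1", 11300))

def process(body: str) -> None:
    print("processing:", body)  # placeholder for the real task

while True:
    job = client.reserve()
    # 24h dedup window is illustrative; duplicates become no-ops.
    if r.set(f"done:{job.body}", 1, nx=True, ex=86400):
        process(job.body)
    client.delete(job)
```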

Please open a new issue describing your use case if you believe it can't be worked around using the above approaches.
