[FEATURE] - Run a scheduled job once in a distributed system #181

Closed
marcsantiago opened this issue May 27, 2021 · 13 comments
Labels
enhancement New feature or request hacktoberfest

Comments

@marcsantiago

marcsantiago commented May 27, 2021

Is your feature request related to a problem? Please describe

Currently, if I use this package on a cluster of AWS EC2 instances, every machine running production code also runs the code in the scheduler. For instance, if I need a background goroutine to scan data every minute and I have 5 machines deployed, I don't need or want all of them to scan the same data. It's also not cost-effective to spin up a single isolated EC2 instance to run a single job.

Describe the solution you'd like

I would like gocron to have the ability to connect to persistent storage through some sort of interface, so that users can plug in any persistent store (Redis, memcache, SQL, etc.) and gocron can keep track of which jobs are running. If a job is being run by one machine and the settings allow only one instance of that job to run globally, then no other machine runs the same job. In essence, it would be safe to deploy code on a fleet of web servers and have only one web server at any given time run a given job.

some considerations:

  • When a cluster of machines is being deployed, restarted, or added to a load balancer, a race condition can occur because of the latency before the fastest instance's write reaches persistent storage. For example, if machine A writes to storage saying "hey, I'm responsible for the job" and machine B checks storage a few moments later and doesn't find anything, it will also say "hey, I'm responsible for the job". This is because machines get updates in batches. To prevent this, there needs to be some sort of jitter, or perhaps a queue. For instance, all the servers start up and schedule a job. The namespace for that job should be the same on each machine. If I'm using Redis, I would upsert a job by some shared ID. When the scheduler ticks on an application instance, it pops from the queue; if there is an item, the instance is granted the ability to run the job, otherwise the job does not run. When the job completes on the machine with the successful pop, it adds the job ID back to the queue for the next tick... something like that.
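
The pop-then-push idea in the consideration above could be sketched roughly as follows. This is only an illustration: `TokenQueue`, `memoryQueue`, and `RunOnTick` are hypothetical names that are not part of gocron, and the in-memory map stands in for a shared store such as a Redis list.

```go
package main

import "sync"

// TokenQueue is a hypothetical abstraction over shared storage (for example a
// Redis list). It is an assumption for this sketch, not a gocron type.
type TokenQueue interface {
	// Pop removes one token for jobID and reports whether one was available.
	Pop(jobID string) bool
	// Push returns the token for jobID so a later tick can claim it.
	Push(jobID string)
}

// memoryQueue is an in-process stand-in for the shared store.
type memoryQueue struct {
	mu     sync.Mutex
	tokens map[string]int
}

// newMemoryQueue seeds exactly one token per job ID, playing the role of an
// idempotent upsert performed when the instances start up.
func newMemoryQueue(jobIDs ...string) *memoryQueue {
	q := &memoryQueue{tokens: make(map[string]int)}
	for _, id := range jobIDs {
		q.tokens[id] = 1
	}
	return q
}

func (q *memoryQueue) Pop(jobID string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.tokens[jobID] == 0 {
		return false
	}
	q.tokens[jobID]--
	return true
}

func (q *memoryQueue) Push(jobID string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.tokens[jobID]++
}

// RunOnTick runs job only if this instance pops the token, then hands the
// token back. It reports whether the job ran on this instance.
func RunOnTick(q TokenQueue, jobID string, job func()) bool {
	if !q.Pop(jobID) {
		return false // another instance holds the token for this tick
	}
	defer q.Push(jobID) // return the token for the next tick
	job()
	return true
}
```

One thing this sketch does not handle: if the instance holding the token crashes before pushing it back, the token is lost and the job never runs again, so a real implementation would likely need some form of expiry.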

Describe alternatives you've considered

  • Spinning up isolated EC2 instances that do nothing other than run a single job, at the cost of operational cognitive load and money
  • Using other distributed systems like machinery; however, projects like machinery aren't necessarily meant for cron-like tasks
  • Compiling a Go binary and using the native Linux cron system, but again this means moving away from the very clean package and system that gocron presents
@marcsantiago marcsantiago added the enhancement New feature or request label May 27, 2021
@JohnRoesler
Contributor

Definitely something we’re interested in supporting

@JohnRoesler
Contributor

I like the way that this platform (https://benthos.dev/) abstracts different data sources, and I think it would be ideal to add a reasonable interface to gocron so that we could support multiple specific implementations like Redis, EC2, etc.

@ianaz

ianaz commented Jun 28, 2021

The original version had SetLocker where you were able to provide an implementation to lock and unlock. Is there still something similar?

@JohnRoesler
Contributor

@ianaz it never worked properly so we removed it. That’d be a good place to start, adding a locker interface.

@derkan

derkan commented Jul 2, 2021

In a cluster environment, the nodes in the cluster are known. Maybe consistenthash could be used here to determine which node runs which jobs.
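
As a rough sketch of that idea, a minimal consistent-hash ring could map each job ID to exactly one node. Everything here (`Ring`, `NewRing`, `Owner`, `ShouldRun`, the node names) is hypothetical and not part of gocron, and it assumes every node sees the same membership list.

```go
package main

import (
	"hash/fnv"
	"sort"
	"strconv"
)

// Ring is a minimal consistent-hash ring. All names are illustrative.
type Ring struct {
	points []uint32          // sorted hash positions on the ring
	owner  map[uint32]string // ring position -> node name
}

// hashKey maps a string to a position on the ring using FNV-1a.
func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// NewRing places `replicas` virtual points per node on the ring. Nodes are
// sorted first so every instance builds the same ring regardless of the
// order in which it learned about its peers.
func NewRing(replicas int, nodes ...string) *Ring {
	sorted := append([]string(nil), nodes...)
	sort.Strings(sorted)
	r := &Ring{owner: make(map[uint32]string)}
	for _, n := range sorted {
		for i := 0; i < replicas; i++ {
			p := hashKey(n + "#" + strconv.Itoa(i))
			if _, taken := r.owner[p]; taken {
				continue // rare hash collision: keep the first owner
			}
			r.owner[p] = n
			r.points = append(r.points, p)
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Owner returns the node responsible for jobID, or "" if the ring is empty.
func (r *Ring) Owner(jobID string) string {
	if len(r.points) == 0 {
		return ""
	}
	h := hashKey(jobID)
	// First point clockwise from the job's position, wrapping around at the end.
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

// ShouldRun reports whether the node named self owns jobID.
func (r *Ring) ShouldRun(self, jobID string) bool {
	return r.Owner(jobID) == self
}
```

Each instance would call something like `ring.ShouldRun(myName, "scan")` at the top of its job function and return early when the answer is false. While membership is changing, two nodes may briefly disagree about the owner, so this alone does not guarantee a single run.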

@JohnRoesler
Contributor

@derkan that is an interesting idea. Perhaps we could introduce some sort of distributed provider that could have different implementations. I was thinking something along the lines of a redis cache as a locker for jobs - example
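
A cache-backed job locker of the kind mentioned above might behave roughly like the sketch below. It is an in-memory stand-in with hypothetical names (`LeaseStore`, `Acquire`, `Release`, `ManualClock`), not gocron's API and not a Redis client. `Acquire` plays the role that Redis's `SET key value NX PX ttl` would play, and `Release` only deletes the lease if the caller still owns it.

```go
package main

import (
	"sync"
	"time"
)

// lease records who holds a key and until when.
type lease struct {
	owner   string
	expires time.Time
}

// LeaseStore is a hypothetical in-memory stand-in for a shared cache.
type LeaseStore struct {
	mu     sync.Mutex
	leases map[string]lease
	now    func() time.Time // injectable clock; use time.Now in real code
}

// NewLeaseStore builds a store that reads the current time from now.
func NewLeaseStore(now func() time.Time) *LeaseStore {
	return &LeaseStore{leases: make(map[string]lease), now: now}
}

// Acquire grants key to owner if it is free or its previous lease has expired.
// The TTL means a crashed holder cannot block the job forever.
func (s *LeaseStore) Acquire(key, owner string, ttl time.Duration) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	now := s.now()
	if l, ok := s.leases[key]; ok && now.Before(l.expires) {
		return false // someone else holds a live lease
	}
	s.leases[key] = lease{owner: owner, expires: now.Add(ttl)}
	return true
}

// Release deletes the lease only if owner still holds it, so a slow job cannot
// remove a lease that another node acquired after expiry.
func (s *LeaseStore) Release(key, owner string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	l, ok := s.leases[key]
	if !ok || l.owner != owner {
		return false
	}
	delete(s.leases, key)
	return true
}

// ManualClock is a hand-advanced clock that keeps the sketch deterministic.
// It is not safe for concurrent use.
type ManualClock struct{ t time.Time }

// Now returns the clock's current time.
func (c *ManualClock) Now() time.Time { return c.t }

// Advance moves the clock forward by d.
func (c *ManualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }
```

The TTL should comfortably exceed the job's expected run time. Otherwise the lease can expire mid-run and a second node may start the same job.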

@avimess23

Hi guys,
@marcsantiago @JohnRoesler
I wanted to know what your final thoughts on the subject were.

I have a similar situation where I am using gocron to send out a status email once a week.
The program runs on several servers for efficiency.
I am looking into implementing the suggested "Distributed Locks with Redis" and would like to hear whether this was successful for you, and maybe even get some advice on things to watch out for while I pursue this solution.
Thanks

@dnitsch

dnitsch commented Nov 7, 2022

I wonder if the approach here could be similar to APScheduler: begin with out-of-the-box support for persistent stores like Redis/Postgres and, once an interface can be reasonably standardized, allow people to pass in their own implementations that satisfy it. The core would have to handle queues/locks, moving on to the next available, etc.

I realize the current gocron kind of gives everyone the ability to roll their own version of the above, but it would be nice to provide that as a built-in option.

@vuhoanghiep1993

Hello, is this feature supported? I'm looking for job locking like ShedLock for a distributed system. I think having this included in the library would be better than implementing it in my own project.

@manuelarte

Hi, I am also interested in this

@JohnRoesler
Contributor

If anyone is able to test out the feature I have in the works that would be great! v1.24.0-rc2

@JohnRoesler
Contributor

I've released distributed locker support with redis and marked it as in beta. Please test and provide feedback when you have a chance!

@manuelarte

Maybe this is not the place, but this is a locker I created and could be used as an example/inspiration for someone else with the same needs:

#529

8 participants