Advice: Scheduling with intervals #387

Closed
softwaregravy opened this issue Jun 16, 2022 · 6 comments

@softwaregravy
Contributor

Hi all.

Thank you for the work you put in to this gem. I appreciate it.

I wanted to make sure I had grokked all the docs correctly. I was hoping someone very familiar with the gem could tell me if I'm approaching this the best way.

  1. I want to schedule jobs that run on a regular cadence: every X seconds or Y minutes. These are quick jobs which I want to run often.
  2. The jobs are idempotent and fault tolerant -- if we fail to schedule 1, it's okay. If we fail to schedule 2-3 in a row, we're effectively down. Even though the jobs individually are cheap, I don't want 2x or 10x of them running.
  3. I'm on Rails 7 running on Heroku

Here's an example in my scheduler.yml

# config/scheduler.yml
CalendarActionOrganizer:
  description: "This job queues the CalendarActionOrganizer to drive actions from the calendar"
  interval: ["1m"]
  queue: actions
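
For reference, here is a minimal sketch of what that file looks like once parsed, using only Ruby's standard YAML library. The YAML is inlined (rather than read from disk) purely so the snippet runs on its own; the point is that the result is a plain Hash with String keys, which is the shape assigned to `Sidekiq.schedule` in the initializer.

```ruby
require "yaml"

# Same content as config/scheduler.yml, inlined so the sketch is self-contained.
SCHEDULE_YAML = <<~YAML
  CalendarActionOrganizer:
    description: "This job queues the CalendarActionOrganizer to drive actions from the calendar"
    interval: ["1m"]
    queue: actions
YAML

# YAML.safe_load returns a plain Hash with String keys.
schedule = YAML.safe_load(SCHEDULE_YAML)

schedule.each do |job_name, options|
  puts "#{job_name}: interval=#{options["interval"].inspect} queue=#{options["queue"]}"
end
```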

And then I load this from my config/initializers/sidekiq.rb

# config/initializers/sidekiq.rb
if ENV.fetch("IS_SCHEDULER", false)
  Sidekiq.configure_server do |config|
    config.on(:startup) do
      Sidekiq.schedule = YAML.load_file(File.expand_path("../scheduler.yml", File.dirname(__FILE__)))
      Sidekiq::Scheduler.reload_schedule!
    end
  end
end
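
One caveat with an environment-variable flag like this: in Ruby every non-empty String is truthy, so setting `IS_SCHEDULER=false` on a process would still enable the scheduler. The setup here avoids the problem by leaving the variable unset everywhere else, but a stricter check is cheap. A minimal sketch, where `scheduler_enabled?` is a hypothetical helper name and not part of Sidekiq or sidekiq-scheduler:

```ruby
# Hypothetical helper (not part of Sidekiq or sidekiq-scheduler): treat only an
# explicit "true" (any case, surrounding whitespace ignored) as enabled.
def scheduler_enabled?(env = ENV)
  env.fetch("IS_SCHEDULER", "").strip.casecmp?("true")
end

# A plain Hash stands in for ENV here so the three cases can be shown side by side.
puts scheduler_enabled?({ "IS_SCHEDULER" => "true" })
puts scheduler_enabled?({ "IS_SCHEDULER" => "false" })
puts scheduler_enabled?({})
```

The initializer's guard could then read `if scheduler_enabled?` instead of relying on the truthiness of the fetched value.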

I then set the scheduler flag to true in my Procfile, and also reduce the threads to 1.

# Procfile snippet
release: ./release-tasks.sh

web: bin/rails server
worker: RAILS_MAX_THREADS=10 bundle exec sidekiq -q user_facing -q actions -q calendar -q background -q default
scheduler: RAILS_MAX_THREADS=1 IS_SCHEDULER=true bundle exec sidekiq -q default

In Heroku, I then scale the scheduler process type to 1 and rely on Heroku to keep exactly one process of that type running. The idea is that we will have 1 thread running the scheduler across all my dynos.

Does this seem like the right approach? Is there an easier way I should be doing this?

Thank you !

@marcelolx
Member

Hey @softwaregravy, yes, this seems like the right approach: the scheduler will push the jobs to Sidekiq, and Sidekiq will process them.

If you're running a single instance of the worker, you could run the scheduler in there and you should be OK as well; sidekiq-scheduler wouldn't enqueue the same job multiple times. The only scenario where the same job can be enqueued multiple times is when the scheduler is running on multiple hosts, as mentioned here.

But running a single instance of the scheduler as you're doing should be 💯

@softwaregravy
Contributor Author

Thank you very much for getting back to me so quickly.

I will have lots of workers. I also trimmed my Procfile for this, I have a couple of other types of workers as well. We're architected to scale mostly linearly with workers. (meaning, as we scale, we can add workers and we will maintain overall latency.)

I'm assuming that a heroku dyno counts as a host in this context. So having 10 worker dynos would count as 10 hosts. This is exactly the scenario I'm trying to correct for.
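
To make the concern concrete, here is a toy model in plain Ruby. It is not how sidekiq-scheduler is implemented: it assumes the worst case, where every process with the scheduler enabled pushes one job per tick and nothing deduplicates across hosts. `enqueue_counts` is a hypothetical name used only for this illustration.

```ruby
# Toy model: every process with the scheduler enabled pushes one job per tick
# onto a shared queue. Assumes no cross-host deduplication (worst case).
def enqueue_counts(scheduler_processes:, ticks:)
  queue = []
  ticks.times do |tick|
    scheduler_processes.times do |host|
      queue << { job: "CalendarActionOrganizer", tick: tick, host: host }
    end
  end
  # Count how many copies of the job were enqueued for each tick.
  queue.group_by { |job| job[:tick] }.transform_values(&:size)
end

all_workers = enqueue_counts(scheduler_processes: 10, ticks: 3)
single      = enqueue_counts(scheduler_processes: 1, ticks: 3)

puts "10 scheduling processes: #{all_workers}"
puts "1 scheduling process:    #{single}"
```

Under that assumption, ten scheduling dynos produce ten copies per interval, while a single dedicated scheduler process produces one.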

2 quick followups:

  1. is limiting the scheduler process to 1 thread needed? Or can it be a "regular" worker that happens to have the scheduler flag to true and we only ever have 1 of it?
  2. If this is the correct approach, I'd be happy to add this to the readme or a wiki page. Do you think this would be helpful?

@marcelolx
Member

> So having 10 worker dynos would count as 10 hosts.

Exactly

> is limiting the scheduler process to 1 thread needed? Or can it be a "regular" worker that happens to have the scheduler flag to true and we only ever have 1 of it?

I was going to mention this before, but yes, it can be a regular worker; I don't see any need to limit it to a single thread.

> If this is the correct approach, I'd be happy to add this to the readme or a wiki page. Do you think this would be helpful?

I would suggest testing this approach first. If it works and you don't see any problems, a PR improving the readme is more than welcome! We can keep this issue open until then.

@softwaregravy
Contributor Author

softwaregravy commented Jun 16, 2022

Great. I'll plan to check back in early next week (~June 22nd). If everything is smooth, I will see about converting this into the readme.

@marcelolx
Member

marcelolx commented Jul 1, 2022

Just a note @softwaregravy: keep an eye on #332. I'm working on introducing a different locking strategy so we can run sidekiq-scheduler on multiple hosts without having to worry about the same job being scheduled by multiple hosts.

@softwaregravy
Contributor Author

softwaregravy commented Jul 1, 2022

@marcelolx Thanks. Your ping reminded me to come back to this.

As for the fix in #332, distributed locking is always a hard problem. The setup I used here sort of cheats by relying on Heroku to only give us 1 host, so we delegate the need for a single instance to their infrastructure. Good luck with the more general solution.
