Skip to content

Perf suggestion #29

Merged
merged 2 commits into from Feb 10, 2013

3 participants

@scosman
scosman commented Jan 25, 2013

Suggestion: Instead of fetching 5 candidates at attempting to lock each one by one, just lock the next job in a single query. Let me know if I'm missing something.

Example of old system and 100 workers (worst but not uncommon case):
1) 100 workers wake up and fetch 5 candidate jobs (they all get the same, or very similar set of 5 jobs), making a total of 100 SELECT calls
2) they all try to lock the first job (only 1 of 99 succeeds). The failing workers try the next, making a total of roughly 500 UPDATE calls
3) a total of 5 jobs get processed from about 600 database calls, and we restart the whole process

New system:
1) 100 workers wake up and fetch the next available job, each getting a single unique job (1 SELECT and 1 UPDATE call each).
2) a total of 100 jobs are processed with exactly 200 SQL calls

I've also included an example to make it all work in 1 call, avoiding an extra round-trip. This requires custom SQL only tested with PostgreSQL

scosman added some commits Jan 25, 2013
@scosman scosman Fix the way we fetch jobs to work better for a large number of workers.
Instead of fetching 5 candidates at attempting to lock each one by one, just lock the next job in 1 query.

Example of old system and 100 workers (worst but not uncommon case):
1) 100 workers wake up and fetch 5 candidate jobs (they all get the same, or very similar set of 5 jobs), making a total of 100 SELECT calls
2) they all try to lock the first job (only 1 of 99 succeeds). The failing workers try the next, making a total of roughly 500 UPDATE calls
3) a total of 5 jobs get processed from about 600 database calls, and we restart the whole process

New system:
1) 100 workers wake up and fetch the next available job, each getting a single unique job (1 SELECT and 1 UPDATE call each).
2) a total of 100 jobs are processed with exactly 200 SQL calls

I've also included an example to make it all work in 1 call, avoiding an extra round-trip. This requires custom SQL only tested with PostgreSQL
940bf92
@scosman scosman Even faster for postgresql 5de3205
@sferik sferik merged commit 666cbc1 into collectiveidea:master Feb 10, 2013

1 check failed

Details default The Travis build could not complete due to an error
@carlosantoniodasilva

Is there any reason why this patch dropped the silencing of Active Record? With 0.4.0 forward all pooling queries are being logged, making the log file grow a lot when log_level is set to debug (ie development or other envs used for validating stuff). /cc @sferik

Hey @albus522 @sferik, do you have any input on this? :)

Have you asked @scosman why he made this change? I wouldn’t be opposed to adding it back if it wasn’t removed for a good reason.

@sferik I didn't ask @scosman directly besides that first comment, but I imagined he'd get a notification and could answer back with some reasoning. In any case, hearing from him would definitely be good, thanks.

I can't recall at this point. Generally I prefer to not be hide db requests from the logs. It's annoying since it's polling, but these calls were causing major perf issues for us and were harder to track down because they were not in the logs.

@sferik your call.

@scosman That logic seems reasonable to me.

@carlosantoniodasilva Any reason why you can’t just silence the logs in your app? This ought to do the trick:

module Delayed
  module Backend
    module ActiveRecord
      class Job < ::ActiveRecord::Base
        ::ActiveRecord::Base.silence do
          scope.by_priority.all(:limit => limit)
        end
      end
    end
  end
end

I see the reasoning (in fact that was my thought while reading the commit), however I'm unsure that the application logs are the best place to track such perf issues down. Anyway the problem I faced is that the AR logs are growing a lot due to the frequency DJ queries the database, and it was a bit hard to figure out why at first.

@sferik I could probably hack something out definitely, although I'm not sure the code you pasted is going to work :). Are you saying to override inside the Job class I presume?

The code has changed considerably since this commit, so I believe it'd probably not be so future proof, but I can give it a shot.

Thanks anyway.

@sferik I could probably hack something out definitely, although I'm not sure the code you pasted is going to work :). Are you saying to override inside the Job class I presume?

Yeah, that’s what I meant. Updated the snipped above so it should work.

Alright, thanks.

For the record, this seems to do the trick (although it is not the most beautiful thing in the world):

# Silence DJ querying AR frequently.
Delayed::Backend::ActiveRecord::Job.instance_eval do
  def reserve_with_silence(*args)
    silence { reserve_without_silence(*args) }
  end

  class << self
    alias_method_chain :reserve, :silence
  end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.