Perf suggestion #29

Merged
merged 2 commits into from Feb 10, 2013

Conversation

Projects
None yet
3 participants
@scosman
Contributor

scosman commented Jan 25, 2013

Suggestion: Instead of fetching 5 candidates at attempting to lock each one by one, just lock the next job in a single query. Let me know if I'm missing something.

Example of old system and 100 workers (worst but not uncommon case):

  1. 100 workers wake up and fetch 5 candidate jobs (they all get the same, or very similar set of 5 jobs), making a total of 100 SELECT calls
  2. they all try to lock the first job (only 1 of 99 succeeds). The failing workers try the next, making a total of roughly 500 UPDATE calls
  3. a total of 5 jobs get processed from about 600 database calls, and we restart the whole process

New system:

  1. 100 workers wake up and fetch the next available job, each getting a single unique job (1 SELECT and 1 UPDATE call each).
  2. a total of 100 jobs are processed with exactly 200 SQL calls

I've also included an example to make it all work in 1 call, avoiding an extra round-trip. This requires custom SQL only tested with PostgreSQL

scosman added some commits Jan 25, 2013

Fix the way we fetch jobs to work better for a large number of workers.
Instead of fetching 5 candidates at attempting to lock each one by one, just lock the next job in 1 query.

Example of old system and 100 workers (worst but not uncommon case):
1) 100 workers wake up and fetch 5 candidate jobs (they all get the same, or very similar set of 5 jobs), making a total of 100 SELECT calls
2) they all try to lock the first job (only 1 of 99 succeeds). The failing workers try the next, making a total of roughly 500 UPDATE calls
3) a total of 5 jobs get processed from about 600 database calls, and we restart the whole process

New system:
1) 100 workers wake up and fetch the next available job, each getting a single unique job (1 SELECT and 1 UPDATE call each).
2) a total of 100 jobs are processed with exactly 200 SQL calls

I've also included an example to make it all work in 1 call, avoiding an extra round-trip. This requires custom SQL only tested with PostgreSQL

sferik added a commit that referenced this pull request Feb 10, 2013

@sferik sferik merged commit 666cbc1 into collectiveidea:master Feb 10, 2013

1 check failed

default The Travis build could not complete due to an error
Details

@albus522 albus522 referenced this pull request Mar 28, 2013

Merged

Fix job locking #46

@carlosantoniodasilva

This comment has been minimized.

Show comment Hide comment
@carlosantoniodasilva

carlosantoniodasilva Aug 12, 2013

Is there any reason why this patch dropped the silencing of Active Record? With 0.4.0 forward all pooling queries are being logged, making the log file grow a lot when log_level is set to debug (ie development or other envs used for validating stuff). /cc @sferik

Is there any reason why this patch dropped the silencing of Active Record? With 0.4.0 forward all pooling queries are being logged, making the log file grow a lot when log_level is set to debug (ie development or other envs used for validating stuff). /cc @sferik

This comment has been minimized.

Show comment Hide comment
@carlosantoniodasilva

carlosantoniodasilva Dec 17, 2013

Hey @albus522 @sferik, do you have any input on this? :)

Hey @albus522 @sferik, do you have any input on this? :)

This comment has been minimized.

Show comment Hide comment
@sferik

sferik Dec 17, 2013

Collaborator

Have you asked @scosman why he made this change? I wouldn’t be opposed to adding it back if it wasn’t removed for a good reason.

Collaborator

sferik replied Dec 17, 2013

Have you asked @scosman why he made this change? I wouldn’t be opposed to adding it back if it wasn’t removed for a good reason.

This comment has been minimized.

Show comment Hide comment
@carlosantoniodasilva

carlosantoniodasilva Dec 17, 2013

@sferik I didn't ask @scosman directly besides that first comment, but I imagined he'd get a notification and could answer back with some reasoning. In any case, hearing from him would definitely be good, thanks.

@sferik I didn't ask @scosman directly besides that first comment, but I imagined he'd get a notification and could answer back with some reasoning. In any case, hearing from him would definitely be good, thanks.

This comment has been minimized.

Show comment Hide comment
@scosman

scosman Dec 17, 2013

Contributor

I can't recall at this point. Generally I prefer to not be hide db requests from the logs. It's annoying since it's polling, but these calls were causing major perf issues for us and were harder to track down because they were not in the logs.

@sferik your call.

Contributor

scosman replied Dec 17, 2013

I can't recall at this point. Generally I prefer to not be hide db requests from the logs. It's annoying since it's polling, but these calls were causing major perf issues for us and were harder to track down because they were not in the logs.

@sferik your call.

This comment has been minimized.

Show comment Hide comment
@sferik

sferik Dec 17, 2013

Collaborator

@scosman That logic seems reasonable to me.

@carlosantoniodasilva Any reason why you can’t just silence the logs in your app? This ought to do the trick:

module Delayed
  module Backend
    module ActiveRecord
      class Job < ::ActiveRecord::Base
        ::ActiveRecord::Base.silence do
          scope.by_priority.all(:limit => limit)
        end
      end
    end
  end
end
Collaborator

sferik replied Dec 17, 2013

@scosman That logic seems reasonable to me.

@carlosantoniodasilva Any reason why you can’t just silence the logs in your app? This ought to do the trick:

module Delayed
  module Backend
    module ActiveRecord
      class Job < ::ActiveRecord::Base
        ::ActiveRecord::Base.silence do
          scope.by_priority.all(:limit => limit)
        end
      end
    end
  end
end

This comment has been minimized.

Show comment Hide comment
@carlosantoniodasilva

carlosantoniodasilva Dec 17, 2013

I see the reasoning (in fact that was my thought while reading the commit), however I'm unsure that the application logs are the best place to track such perf issues down. Anyway the problem I faced is that the AR logs are growing a lot due to the frequency DJ queries the database, and it was a bit hard to figure out why at first.

@sferik I could probably hack something out definitely, although I'm not sure the code you pasted is going to work :). Are you saying to override inside the Job class I presume?

The code has changed considerably since this commit, so I believe it'd probably not be so future proof, but I can give it a shot.

Thanks anyway.

I see the reasoning (in fact that was my thought while reading the commit), however I'm unsure that the application logs are the best place to track such perf issues down. Anyway the problem I faced is that the AR logs are growing a lot due to the frequency DJ queries the database, and it was a bit hard to figure out why at first.

@sferik I could probably hack something out definitely, although I'm not sure the code you pasted is going to work :). Are you saying to override inside the Job class I presume?

The code has changed considerably since this commit, so I believe it'd probably not be so future proof, but I can give it a shot.

Thanks anyway.

This comment has been minimized.

Show comment Hide comment
@sferik

sferik Dec 17, 2013

Collaborator

@sferik I could probably hack something out definitely, although I'm not sure the code you pasted is going to work :). Are you saying to override inside the Job class I presume?

Yeah, that’s what I meant. Updated the snipped above so it should work.

Collaborator

sferik replied Dec 17, 2013

@sferik I could probably hack something out definitely, although I'm not sure the code you pasted is going to work :). Are you saying to override inside the Job class I presume?

Yeah, that’s what I meant. Updated the snipped above so it should work.

This comment has been minimized.

Show comment Hide comment
@carlosantoniodasilva

carlosantoniodasilva Dec 17, 2013

Alright, thanks.

Alright, thanks.

This comment has been minimized.

Show comment Hide comment
@carlosantoniodasilva

carlosantoniodasilva Dec 18, 2013

For the record, this seems to do the trick (although it is not the most beautiful thing in the world):

# Silence DJ querying AR frequently.
Delayed::Backend::ActiveRecord::Job.instance_eval do
  def reserve_with_silence(*args)
    silence { reserve_without_silence(*args) }
  end

  class << self
    alias_method_chain :reserve, :silence
  end
end

For the record, this seems to do the trick (although it is not the most beautiful thing in the world):

# Silence DJ querying AR frequently.
Delayed::Backend::ActiveRecord::Job.instance_eval do
  def reserve_with_silence(*args)
    silence { reserve_without_silence(*args) }
  end

  class << self
    alias_method_chain :reserve, :silence
  end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment