Skip to content

gvarela/job_queue_benchmark

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Queue benchmark
This app was designed to benchmark Resque against Delayed::Job for the purpose of proving that we need to change out the queue system on an inventory project we have been working on.

I created two types of jobs. One as a simple baseline and the other to represent a facet of the tasks we queue up currently.
1) a simple AR find
2) fetch an image from an akamai server and then save and create a thumb with attachment_fu

Each of these tasks were created as jobs in Dj and Resque a varying number of times. To have a baseline we ran these tasks without a queue. We then ran several tests adding each task as a job 5,000 and 500,000 times into Dj. We did this with one worker and three workers to see the overhead of the queue as well as see how the performance improves with each worker. We then queued the same tasks in Resque 500,000 and ran with one worker and three workers respectively.

Results Summary:
We essentially confirmed our hypothesis that Dj doesn't scale when the queue becomes large. Synchronously fetching and saving images took took 91.7 seconds inline. Processing 1000 jobs of the that same task from a 500,000 job queue with one worker took 691.1 seconds in Dj, and 93.8 seconds in Resque. This means that Resque has very little overhead when compared to Dj which has huge database overhead. When we ran the same 1000 jobs with the same total number in the queue with 3 workers Dj took roughly 411 seconds and Resque took roughly 36 seconds. Thus Resque scales much better than Dj with more workers.

Method:
There are several rake tasks for seeding the database and the queues. After seeding the data I ran some benchmark tasks to time the execution. Take a look at benchmark.rake and seed.rake for examples.

Raw Results:

* 5,000 row image table straight synchronous processing:

  basic AR find with no queue ( 1,000 queries )
    user       system     total    real
    0.510000   0.060000   0.570000 (  2.094698)  [secs]

  fetching an image off of akamai and doing an insert ( 1,000 inserts and image fetches )
   user     system      total        real
   15.080000   2.900000  17.980000 ( 91.660208)


* 5,000 row image table 5,000 row jobs table dj queue single worker process:

  Dj with just a basic AR find ( 1000 jobs )
   user       system     total    real
   3.140000   0.290000   3.430000 ( 10.598753)

  Dj with fetching an image off of akamai and doing an insert ( 1000 jobs )
   user       system     total     real
   18.810000   3.090000  21.900000 ( 95.330724)
   

* 5,000 row image table 500,000 row jobs table dj queue single worker process:

  Dj with just a basic AR find ( 1000 jobs )
    user       system     total    real
    3.710000   0.720000   4.430000 (473.825784)  [secs]

  Dj with fetching an image off of akamai and doing an insert ( 1000 jobs )
    user        system    total     real
    19.610000   3.730000  23.340000 (691.071522)  [secs]

* 5,000 row image table 5,000 row jobs table dj queue 3 worker processes:

  Dj with just a basic AR find ( 1000 jobs / 333 per worker ) [This has a rather large standard deviation which is due to failed lock attempts]
    user        system    total     real
84180 =>   0.290000   0.170000   0.460000 (  1.257616)
84178 =>   1.100000   0.260000   1.360000 (  4.334029)
84179 =>   1.390000   0.280000   1.670000 (  5.335976)

  Dj with fetching an image off of akamai and doing an insert ( 1000 jobs / 333 per worker )
      user        system    total     real
84307 =>   7.440000   1.370000   8.810000 ( 41.680948)
84309 =>   7.500000   1.360000   8.860000 ( 42.371396)
84308 =>   7.480000   1.360000   8.840000 ( 42.777726)


* 5,000 row image table 500,000 row jobs table dj queue 3 worker processes:

  Dj with just a basic AR find ( 1000 jobs )
    user       system     total    real
84970 =>   1.940000   0.590000   2.530000 (333.338117)
84971 =>   1.940000   0.590000   2.530000 (333.344340)
84969 =>   1.990000   0.580000   2.570000 (334.971635)

  Dj with fetching an image off of akamai and doing an insert ( 1000 jobs / 333 per worker )
      user        system    total     real
85809 =>   7.780000   1.660000   9.440000 (409.771486)
85811 =>   7.830000   1.630000   9.460000 (410.615919)
85810 =>   7.840000   1.670000   9.510000 (411.344920)


* 5,000 row image table 500,000 row jobs table Resque queue single worker process:

  Resque with just a basic AR find ( 1000 jobs )
    user       system     total    real
    1.080000   0.200000   1.280000 (  3.261001)

  Resque with fetching an image off of akamai and doing an insert ( 1000 jobs )
    user        system    total     real
    15.610000   3.150000  18.760000 ( 93.730797)

* 5,000 row image table 500,000 row jobs table Resque queue 3 worker processes:

  Resque with just a basic AR find ( 1000 jobs / 333 per worker )
    user       system     total    real
    0.480000   0.180000   0.660000 (  2.446347)
    0.480000   0.170000   0.650000 (  2.542048)
    0.490000   0.180000   0.670000 (  2.551230)

  Resque with fetching an image off of akamai and doing an insert ( 1000 jobs / 333 per worker )
    user        system    total     real
    5.930000   1.230000   7.160000 ( 35.850032)
    6.010000   1.230000   7.240000 ( 36.007538)
    5.950000   1.240000   7.190000 ( 36.010807)


About

[demo] A simple rails app to benchmark differences between Delayed::Job and Resque with large queues

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published