Lots of documentation updates.

que-rb · Sep 8, 2017 · fb1ddd9
1 parent 353341d
commit fb1ddd9
Showing 15 changed files with 167 additions and 472 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,30 +4,38 @@
 
 *   Que's implementation has been changed from one in which worker threads hold their own PG connections and lock their own jobs to one in which a single thread (and PG connection) locks jobs through LISTEN/NOTIFY and batch polling, and passes jobs along to worker threads. This has many benefits, including:
 
-    *   Individual workers no longer need to monopolize their own (possibly idle) connections while working jobs, so Ruby processes may require many fewer Postgres connections.
-
-    *   PgBouncer can be used for workers' connections (though not for the connection used to lock and listen for jobs).
-
     *   Jobs queued for immediate processing can be actively distributed to workers with LISTEN/NOTIFY, which is more efficient than having workers repeatedly poll for new jobs.
 
     *   When polling is necessary (to pick up jobs that are scheduled for the future or that need to be retried due to errors), jobs can be locked in batches, rather than one at a time.
 
+    *   Individual workers no longer need to monopolize their own (possibly idle) connections while working jobs, so Ruby processes may require many fewer Postgres connections.
+
+    *   PgBouncer or another external connection pool can be used for workers' connections (though not for the connection used to lock and listen for jobs).
+
 *   Other features introduced in this version include:
 
+    *   All versions of ActiveJob are much better supported. In particular, you can include `Que::ActiveJob::JobExtensions` into your `ApplicationJob` subclass to get support for all of Que's job methods.
+
     *   `Que.connection_proc=` has been added, to allow for the easy integration of custom connection pools.
 
     *   `Que.job_states` returns a list of locked jobs and the hostname/pid of the Ruby processes that have locked them.
 
+    *   Job configuration options are now inherited by subclasses.
+
+    *   It is now possible to define middleware that wrap running jobs.
+
+    *   Worked jobs can now be retained in the database, and marked with a timestamp in the finished_at column. To hold onto instances of a job indefinitely, replace the `destroy` calls in your jobs with `finish`. `destroy` will still perform its old behavior, for job classes that you don't want to keep. If you don't resolve the job by calling any of these methods, Que will `finish` the job for you by default (previously it would `destroy` it).
+
 *   In keeping with semantic versioning, the major version is being bumped since the new implementation requires some backwards-incompatible changes. These changes include:
 
     *   Support for MRI Rubies below 2.2 has been dropped.
 
     *   Support for Postgres versions below 9.4 has been dropped.
 
-    *   The Railtie and generators providing simple integration with Rails have been removed, due to the difficulty of supporting Rails reliably. Que still supports hooking into the ActiveRecord connection pool, if it is present, but doesn't make any special effort to integrate with Rails any more than it would with any other Ruby framework.
-
     *   JRuby support has been dropped. It will be reintroduced whenever the jruby-pg gem is production-ready.
 
+    *   The Que.mode= setter has been removed. To run jobs synchronously when they are enqueued (the old :sync behavior) you can set `Que.run_synchronously = true`. To start up the worker pool (the old :async behavior) you should use the `que` executable to start up a worker process. Running a worker pool outside of the `que` executable is still possible, but there's no supported API for it.
+
     *   Que no longer uses prepared statements for its built-in queries. This should have no outward-facing change, except that the `Que.disable_prepared_statements` configuration accessor no longer exists.
 
     *   In addition to `Que.disable_prepared_statements=`, the following methods are not meaningful under the new implementation and have been removed: `Que.wake_interval`, `Que.wake_interval=`, `Que.wake!`, `Que.wake_all!`, `Que.worker_count`, `Que.worker_count=`.
@@ -38,13 +46,19 @@
 
     *   For simplicity, job attributes and keys in argument hashes are now converted to symbols by default when retrieved from the database, rather than made indifferently accessible.
 
-    *   Features marked as deprecated in 0.x releases have been removed.
+    *   Calling Que.log() directly is no longer supported/recommended.
 
-    *   Que.connection= has been removed, use Que.connection_proc= instead.
+    *   Configuring workers using QUE_* environment variables is no longer supported; please pass options to the `que` executable instead.
+
+    *   Arguments passed to jobs are now deep-frozen, to prevent unexpected behavior when the args are mutated and the job is reenqueued. In Rails, argument hash keys are also now symbolized, rather than converted to HashWithIndifferentAccesses.
+
+    *   It's now possible to set priority thresholds for individual workers, to ensure that there will always be workers available for high-priority jobs.
+
+    *   Features marked as deprecated in 0.x releases have been removed.
 
-*   Other new features:
+### 0.13.1 (2017-07-05)
 
-    *   There is now a `Que.constantizer=` option, which you can set to a proc to customize how job classes are converted from strings to classes. If you're on Rails and experiencing problems with autoloading, you may want to set `Que.constantizer = proc(&:constantize)`.
+*   Fix issue that caused error stacktraces to not be persisted in most cases.
 
 ### 0.13.0 (2017-06-08)
 

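The CHANGELOG entry above about deep-frozen job arguments can be illustrated with a small standalone sketch. It does not use Que: `deep_freeze` is a hypothetical helper written for this example, and Que's own implementation may differ. It only shows what "deep-frozen" means in practice, namely that nested hashes and arrays reject mutation.

```ruby
# Hypothetical helper (not part of Que): recursively freeze a nested
# structure of hashes and arrays so that no level can be mutated.
def deep_freeze(value)
  case value
  when Hash
    value.each do |key, val|
      deep_freeze(key)
      deep_freeze(val)
    end
  when Array
    value.each { |element| deep_freeze(element) }
  end
  value.freeze
end

# Arguments shaped like those a job might receive.
args = deep_freeze([42, { user_id: 7, tags: ["vip"] }])

mutation_blocked =
  begin
    args[1][:tags] << "mutated"
    false
  rescue RuntimeError
    # FrozenError (Ruby 2.5+) is a RuntimeError subclass; older Rubies
    # raise a plain RuntimeError when a frozen object is modified.
    true
  end

puts "mutation blocked: #{mutation_blocked}"
```

A job that needs a modified copy of its arguments would build a new object (for example with `dup` or `merge`) rather than changing the arguments in place.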
diff --git a/LICENSE.txt b/LICENSE.txt
@@ -1,4 +1,4 @@
-Copyright (c) 2013 Chris Hanks
+Copyright (c) 2013-2017 Chris Hanks
 
 MIT License
 

diff --git a/README.md b/README.md
@@ -1,11 +1,11 @@
 # Que
 
-**TL;DR: Que is a high-performance alternative to DelayedJob or QueueClassic that improves the reliability of your application by protecting your jobs with the same [ACID guarantees](https://en.wikipedia.org/wiki/ACID) as the rest of your data.**
+**TL;DR: Que is a high-performance job queue that improves the reliability of your application by protecting your jobs with the same [ACID guarantees](https://en.wikipedia.org/wiki/ACID) as the rest of your data.**
 
 Que ("keɪ", or "kay") is a queue for Ruby and PostgreSQL that manages jobs using [advisory locks](http://www.postgresql.org/docs/current/static/explicit-locking.html#ADVISORY-LOCKS), which gives it several advantages over other RDBMS-backed queues:
 
   * **Concurrency** - Workers don't block each other when trying to lock jobs, as often occurs with "SELECT FOR UPDATE"-style locking. This allows for very high throughput with a large number of workers.
-  * **Efficiency** - Locks are held in memory, so locking a job doesn't incur a disk write. These first two points are what limit performance with other queues - all workers trying to lock jobs have to wait behind one that's persisting its UPDATE on a locked_at column to disk (and the disks of however many other servers your database is synchronously replicating to). Under heavy load, Que's bottleneck is CPU, not I/O.
+  * **Efficiency** - Locks are held in memory, so locking a job doesn't incur a disk write. These first two points are what limit performance with other queues. Under heavy load, Que's bottleneck is CPU, not I/O.
   * **Safety** - If a Ruby process dies, the jobs it's working won't be lost, or left in a locked or ambiguous state - they immediately become available for any other worker to pick up.
 
 Additionally, there are the general benefits of storing jobs in Postgres, alongside the rest of your data, rather than in Redis or a dedicated queue:
@@ -17,11 +17,12 @@ Additionally, there are the general benefits of storing jobs in Postgres, alongs
 
 Que's primary goal is reliability. You should be able to leave your application running indefinitely without worrying about jobs being lost due to a lack of transactional support, or left in limbo due to a crashing process. Que does everything it can to ensure that jobs you queue are performed exactly once (though the occasional repetition of a job can be impossible to avoid - see the docs on [how to write a reliable job](https://github.com/chanks/que/blob/master/docs/writing_reliable_jobs.md)).
 
-Que's secondary goal is performance. It won't be able to match the speed or throughput of a dedicated queue, or maybe even a Redis-backed queue, but it should be fast enough for most use cases. In [benchmarks of RDBMS queues](https://github.com/chanks/queue-shootout) using PostgreSQL 9.3 on a AWS c3.8xlarge instance, Que approaches 10,000 jobs per second, or about twenty times the throughput of DelayedJob or QueueClassic. You are encouraged to try things out on your own production hardware, though.
+Que's secondary goal is performance. The worker process is multithreaded, so that a single process can run many jobs simultaneously. In [benchmarks of RDBMS queues](https://github.com/chanks/queue-shootout) using PostgreSQL 9.3 on an AWS c3.8xlarge instance, Que approaches 10,000 jobs per second, or about twenty times the throughput of DelayedJob or QueueClassic. You are encouraged to try things out on your own production hardware, though. (TODO: Run new benchmarks)
 
-Que also includes a worker pool, so that multiple threads can process jobs in the same process. It can even do this in the background of your web process - if you're running on Heroku, for example, you don't need to run a separate worker dyno.
-
-Que is tested on Ruby 2.0, Rubinius and JRuby (with the `jruby-pg` gem, which is [not yet functional with ActiveRecord](https://github.com/chanks/que/issues/4#issuecomment-29561356)). It requires Postgres 9.2+ for the JSON datatype.
+Compatibility:
+- Ruby 2.2+
+- PostgreSQL 9.4+ (JSONB required)
+- Rails 4.1+ (optional)
 
 **Please note** - Que's job table undergoes a lot of churn when it is under high load, and like any heavily-written table, is susceptible to bloat and slowness if Postgres isn't able to clean it up. The most common cause of this is long-running transactions, so it's recommended to try to keep all transactions against the database housing Que's job table as short as possible. This is good advice to remember for any high-activity database, but bears emphasizing when using tables that undergo a lot of writes.
 
@@ -43,7 +44,8 @@ Or install it yourself as:
 
 ## Usage
 
-The following assumes you're using Rails 4.0 and ActiveRecord. *Que hasn't been tested with versions of Rails before 4.0, and may or may not work with them.* See the [/docs directory](https://github.com/chanks/que/blob/master/docs) for instructions on using Que [outside of Rails](https://github.com/chanks/que/blob/master/docs/advanced_setup.md), and with [Sequel](https://github.com/chanks/que/blob/master/docs/using_sequel.md) or [no ORM](https://github.com/chanks/que/blob/master/docs/using_plain_connections.md), among other things.
+# TODO: Update Rails version
+The following assumes you're using Rails 4.0 and ActiveRecord. See the [/docs directory](https://github.com/chanks/que/blob/master/docs) for instructions on using Que [outside of Rails](https://github.com/chanks/que/blob/master/docs/advanced_setup.md), with [Sequel](https://github.com/chanks/que/blob/master/docs/using_sequel.md) or with [no ORM](https://github.com/chanks/que/blob/master/docs/using_plain_connections.md).
 
 First, generate and run a migration for the job table.
 
@@ -58,24 +60,30 @@ Create a class for each type of job you want to run:
 class ChargeCreditCard < Que::Job
   # Default settings for this job. These are optional - without them, jobs
   # will default to priority 100 and run immediately.
-  @priority = 10
   @run_at = proc { 1.minute.from_now }
 
-  def run(user_id, options)
+  # We use the Linux priority scale - a lower number is more important.
+  @priority = 10
+
+  def run(credit_card_id, user_id:)
     # Do stuff.
-    user = User[user_id]
-    card = CreditCard[options[:credit_card_id]]
+    user = User.find(user_id)
+    card = CreditCard.find(credit_card_id)
 
-    ActiveRecord::Base.transaction do
+    User.transaction do
       # Write any changes you'd like to the database.
-      user.update_attributes charged_at: Time.now
-
-      # It's best to destroy the job in the same transaction as any other
-      # changes you make. Que will destroy the job for you after the run
-      # method if you don't do it yourself, but if your job writes to the
-      # DB but doesn't destroy the job in the same transaction, it's
-      # possible that the job could be repeated in the event of a crash.
-      destroy
+      user.update charged_at: Time.now
+
+      # It's best to finish the job in the same transaction as any other changes
+      # you make. Que will mark the job as finished for you after the run method
+      # if you don't do it yourself, but if your job writes to the DB but
+      # doesn't finish the job in the same transaction, it's possible that the
+      # job could be repeated in the event of a crash.
+      finish
+
+      # This will leave the job record in the database, so it can be available
+      # for historical inspection. To delete the job entirely, simply replace
+      # the `finish` call with a `destroy` call.
     end
   end
 end
@@ -84,25 +92,22 @@ end
 Queue your job. Again, it's best to do this in a transaction with other changes you're making. Also note that any arguments you pass will be serialized to JSON and back again, so stick to simple types (strings, integers, floats, hashes, and arrays).
 
 ``` ruby
-ActiveRecord::Base.transaction do
+CreditCard.transaction do
   # Persist credit card information
   card = CreditCard.create(params[:credit_card])
-  ChargeCreditCard.enqueue(current_user.id, credit_card_id: card.id)
+  ChargeCreditCard.enqueue(card.id, user_id: current_user.id)
 end
 ```
 
 You can also add options to run the job after a specific time, or with a specific priority:
 
 ``` ruby
-# The default priority is 100, and a lower number means a higher priority. 5 would be very important.
-ChargeCreditCard.enqueue current_user.id, credit_card_id: card.id, run_at: 1.day.from_now, priority: 5
+ChargeCreditCard.enqueue card.id, user_id: current_user.id, run_at: 1.day.from_now, priority: 5
 ```
 
-To determine what happens when a job is queued, you can set Que's mode. There are a few options for the mode:
+## Testing
 
-  * `Que.mode = :off` - In this mode, queueing a job will simply insert it into the database - the current process will make no effort to run it. You should use this if you want to use a dedicated process to work tasks (there's an executable included that will do this, `que`). This is the default when running `bin/rails console`.
-  * `Que.mode = :async` - In this mode, a pool of background workers is spun up, each running in their own thread. See the docs for [more information on managing workers](https://github.com/chanks/que/blob/master/docs/managing_workers.md).
-  * `Que.mode = :sync` - In this mode, any jobs you queue will be run in the same thread, synchronously (that is, `MyJob.enqueue` runs the job and won't return until it's completed). This makes your application's behavior easier to test, so it's the default in the test environment.
+There are a couple of ways to test your jobs. You may want to set `Que::Job.run_synchronously = true`, which will cause `JobClass.enqueue` to simply execute the job's logic synchronously, as if you'd run `JobClass.run(*your_args)`. Or, you may want to leave it disabled so you can assert on the state of jobs once they are stored in the database.
 
 **If you're using ActiveRecord to dump your database's schema, [set your schema_format to :sql](http://guides.rubyonrails.org/migrations.html#types-of-schema-dumps) so that Que's table structure is managed correctly.** (You can use schema_format as :ruby if you want but keep in mind this is highly advised against, as some parts of Que will not work.)
 
@@ -128,12 +133,14 @@ Regarding contributions, one of the project's priorities is to keep Que as simpl
 
 ### Specs
 
-A note on running specs - Que's worker system is multithreaded and therefore prone to race conditions (especially on interpreters without a global lock, like Rubinius or JRuby). As such, if you've touched that code, a single spec run passing isn't a guarantee that any changes you've made haven't introduced bugs. One thing I like to do before pushing changes is rerun the specs many times and watching for hangs. You can do this from the command line with something like:
+A note on running specs - Que's worker system is multithreaded and therefore prone to race conditions. As such, if you've touched that code, a single spec run passing isn't a guarantee that any changes you've made haven't introduced bugs. One thing I like to do before pushing changes is rerun the specs many times and watch for hangs. You can do this from the command line with something like:
 
-    for i in {1..1000}; do bundle exec rspec -b --seed $i; done
+    for i in {1..1000}; do SEED=$i bundle exec rake; done
 
 This will iterate the specs one thousand times, each with a different ordering. If the specs hang, note what the seed number was on that iteration. For example, if the previous specs finished with a "Randomized with seed 328", you know that there's a hang with seed 329, and you can narrow it down to a specific spec with:
 
-    for i in {1..1000}; do LOG_SPEC=true bundle exec rspec -b --seed 329; done
+    for i in {1..1000}; do LOG_SPEC=true SEED=329 bundle exec rake; done
 
 Note that we iterate because there's no guarantee that the hang would reappear with a single additional run, so we need to rerun the specs until it reappears. The LOG_SPEC parameter will output the name and file location of each spec before it is run, so you can easily tell which spec is hanging, and you can continue narrowing things down from there.
+
+Another helpful technique is to replace an `it` spec declaration with `hit` - this will run that particular spec 100 times during the run.
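
The README hunk above advises sticking to simple argument types because arguments are serialized to JSON and back, and the CHANGELOG notes that hash keys come back as symbols. A minimal sketch of that round trip, using only Ruby's standard `json` library rather than Que itself (the argument values here are made up for illustration):

```ruby
require "json"

# Arguments as they might be passed to an enqueue call. The symbol value
# and the Time object are deliberately "non-simple" types.
original = [123, { user_id: 7, run_token: :abc, charged_at: Time.utc(2017, 9, 8) }]

# Serialize to JSON and parse back, symbolizing hash keys on the way in.
round_tripped = JSON.parse(JSON.generate(original), symbolize_names: true)

options = round_tripped[1]
puts options[:user_id].inspect      # integers survive unchanged
puts options[:run_token].inspect    # the symbol comes back as a string
puts options[:charged_at].class     # the Time comes back as a String
```

Values such as symbols and `Time` objects lose their type in transit, which is why passing IDs and other plain values, and reloading records inside `run`, is the safer pattern.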
diff --git a/docs/README.md b/docs/README.md
@@ -1,6 +1,8 @@
 Docs Index
 ===============
 
+TODO: Fix doc links.
+
 - [Advanced Setup](advanced_setup.md#advanced-setup)
   - [Using ActiveRecord Without Rails](advanced_setup.md#using-activerecord-without-rails)
   - [Forking Servers](advanced_setup.md#forking-servers)