Merge pull request #172 from timgit/cron

scheduled jobs
timgit · Jul 15, 2020 · 1aecca6
2 parents 52f03cd + 9432990
commit 1aecca6

Showing 35 changed files with 1,338 additions and 763 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -6,9 +6,9 @@ services:
     - postgresql
 language: node_js
 node_js:
+  - "14"
   - "12"
   - "10"
-  - "8"
 before_script:
   - psql -c 'create database pgboss' -U postgres
   - psql -c 'create extension pgcrypto' -d pgboss -U postgres

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,45 @@
 # Changes
 
+## 5.0.0 :tada:
+
+The pg-boss team hired a timekeeper and now has distributed cron-based scheduling! This works across all instances based on the database server's time as a central clock.
+
+  New functions:
+
+  - `schedule(name, cron, data, options)`
+  - `unschedule(name)`
+  - `getSchedules()`
+
+  New constructor configuration properties:
+
+  - `clockMonitorIntervalSeconds`
+  - `clockMonitorIntervalMinutes`
+  - `noScheduling`
+
+### Changes
+
+- MAJOR: Removed `connect()` and `disconnect()` to simplify usage since these functions became obsolete in v4.  If you had relied on secondary instances running with `connect()`, you should switch to `start()`. Since `start()` is multi-master, it's safe to let it monitor and submit maintenance work, but if you need to opt out of this for whatever reason on a particular instance, set the `noSupervisor` and `noScheduling` constructor options to `true`.
+- MAJOR: Dropped `poolSize` in constructor database config to standardize on `max` property used in the pg package.
+- MAJOR: Dropped Node 8 support and removed it from Travis CI builds.
+- MAJOR: Adjusted maintenance configuration settings for clarity. For example, some operations run on an interval and contain the word "interval". However, other settings are time-based policies evaluated only after maintenance is run. These also contained "interval" which made it challenging to explain the differences between them.
+  - Removed properties related to moving completed jobs to the archive table. Completed jobs will be moved to the archive table based on the maintenance interval going forward.
+
+    | Old | New |
+    | - | - |
+    | `archiveIntervalSeconds` | ** REMOVED ** |
+    | `archiveIntervalMinutes` | ** REMOVED ** |
+    | `archiveIntervalHours` | ** REMOVED ** |
+    | `archiveIntervalDays` | ** REMOVED ** |
+
+  - Renamed properties for controlling when to delete jobs from the archive table
+
+    | Old | New |
+    | - | - |
+    | `deleteIntervalSeconds` | `deleteAfterSeconds` |
+    | `deleteIntervalMinutes` | `deleteAfterMinutes` |
+    | `deleteIntervalHours` | `deleteAfterHours` |
+    | `deleteIntervalDays` | `deleteAfterDays` |
+
 ## 4.3.4
 
 - Typescript types fix for db connections.  Includes PR from @mlegenhausen
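
As a rough illustration of the three scheduling functions listed in the 5.0.0 notes, the sketch below calls them on an instance supplied by the caller. Only the function names and parameter order come from the changelog. The rest is assumed: that the functions return promises, that `cron` is a standard five-field expression, and that an empty options object is acceptable. The names `syncNightlySchedule` and `nightly-report` are hypothetical.

```javascript
// Sketch only: `boss` is assumed to be a started pg-boss v5 instance, or any
// object exposing the same three functions. Only the function names and the
// parameter order are taken from the changelog above.
async function syncNightlySchedule (boss, enabled) {
  const name = 'nightly-report' // hypothetical queue name

  if (enabled) {
    // schedule(name, cron, data, options): 03:00 every day, assuming a
    // standard five-field cron expression.
    await boss.schedule(name, '0 3 * * *', { format: 'pdf' }, {})
  } else {
    // unschedule(name): remove the schedule registered under this name.
    await boss.unschedule(name)
  }

  // getSchedules(): the shape of the result is not described in the
  // changelog, so it is passed through to the caller unchanged.
  return boss.getSchedules()
}
```

Taking the instance as a parameter keeps the sketch independent of how the connection is configured, and lets it be exercised against a stub without a database.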

diff --git a/README.md b/README.md
@@ -34,31 +34,24 @@ async function someAsyncJobHandler(job) {
 
 pg-boss is a job queue built in Node.js on top of PostgreSQL in order to provide background processing and reliable asynchronous execution to Node.js applications.
 
-Why would you consider using this queue over others? pg-boss is actually a light abstraction over features added in PostgreSQL 9.5
-(specifically [SKIP LOCKED](http://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5) and upserts)
-which significantly enhanced its ability to act as a reliable, distributed message queue. I wrote this to remove a dependency on Redis (via the kue package), consolidating systems I have to support in production as well as upgrading to guaranteed message processing (hint: [Redis persistence docs](https://redis.io/topics/persistence#ok-so-what-should-i-use)).
+pg-boss relies on [SKIP LOCKED](http://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5), a feature introduced in PostgreSQL 9.5 written specifically for message queues, in order to resolve record locking challenges inherent with relational databases. This brings the safety of guaranteed atomic commits of a relational database to your asynchronous job processing.
 
-This will likely cater the most to teams already familiar with the simplicity of relational database semantics and operations (querying and backups, for example).
+This will likely cater the most to teams already familiar with the simplicity of relational database semantics and operations (SQL, querying, and backups). It will be especially useful to those already relying on PostgreSQL who want to limit how many systems they must monitor and support in their architecture.
 
 ## Features
-* Guaranteed delivery and finalizing of jobs using a promise API
-* Delayed jobs
-* Job retries (opt-in exponential backoff)
-* Job throttling (unique jobs, rate limiting and/or debouncing)
-* Job batching for high volume use cases
-* Backpressure-compatible subscriptions
-* Configurable job concurrency
-* Distributed and/or clustered workers
-* Completion subscriptions to support orchestrations/sagas
-* On-demand job fetching and completion for external integrations (such as web APIs)
+* Backpressure-compatible subscriptions for monitoring queues on an interval (with configurable concurrency)
+* Distributed cron-based job scheduling with database clock synchronization
+* Job deferral, retries (with exponential backoff), throttling, rate limiting, debouncing
+* Job Completion subscriptions for orchestrations/sagas
+* Direct publish, fetch and completion APIs for custom integrations
+* Batching API for chunked job fetching
+* Direct table access for bulk loads via COPY or INSERT
 * Multi-master capable using tools such as Kubernetes ReplicaSets
-* Direct table access for bulk loading via COPY or other advanced usage
 * Automatic provisioning of required storage into a dedicated schema
-* Automatic monitoring for expired jobs
-* Automatic archiving for completed jobs
+* Automatic maintenance operations to manage table growth
 
 ## Requirements
-* Node 8 or higher
+* Node 10 or higher
 * PostgreSQL 9.5 or higher
 
 ## Documentation

diff --git a/docs/configuration.md b/docs/configuration.md
@@ -8,7 +8,6 @@ pg-boss can be customized using configuration options when an instance is create
   - [Database options](#database-options)
   - [Queue options](#queue-options)
   - [Maintenance options](#maintenance-options)
-    - [Archive completed jobs](#archive-completed-jobs)
     - [Delete archived jobs](#delete-archived-jobs)
     - [Maintenance interval](#maintenance-interval)
 - [Publish options](#publish-options)
@@ -41,7 +40,7 @@ Alternatively, the following options can be set as properties in an object.
 
 * **port** - int,  defaults to 5432
 
-* **ssl** - bool, defaults to false
+* **ssl** - boolean or object
 
 * **database** - string, *required*
 
@@ -53,9 +52,9 @@ Alternatively, the following options can be set as properties in an object.
 
   PostgreSQL connection string will be parsed and used instead of `host`, `port`, `ssl`, `database`, `user`, `password`.
 
-* **poolSize** or **max** - int, defaults to 10
+* **max** - int, defaults to 10
 
-    Maximum number of connections that will be shared by all subscriptions in this instance
+  Maximum number of connections that will be shared by all subscriptions in this instance
 
 * **application_name** - string, defaults to "pgboss"
 
@@ -110,47 +109,27 @@ Maintenance operations include checking active jobs for expiration, archiving co
 
   If this is set to true, maintenance and monitoring operations will not be started during a `start()` after the schema is created.  This is an advanced use case, as bypassing maintenance operations is not something you would want to do under normal circumstances.
 
-#### Archive completed jobs
+* **noScheduling**, bool, default false
 
-When jobs become eligible for archive after completion.
-
-* **archiveIntervalSeconds**, int
-
-    archive interval in seconds, must be >=1
-
-* **archiveIntervalMinutes**, int
-
-    archive interval in minutes, must be >=1
-
-* **archiveIntervalHours**, int
-
-    archive interval in hours, must be >=1
-
-* **archiveIntervalDays**, int
-
-    archive interval in days, must be >=1
-
-Default: 1 hour.
-
-> When a higher unit is is specified, lower unit configuration settings are ignored.
+  If this is set to true, this instance will not monitor scheduled jobs during `start()`. However, it can still use the scheduling API. This is an advanced use case, useful for testing or for disabling clock skew warnings when the server's clock is skewed.
 
 #### Delete archived jobs
 
 When jobs in the archive table become eligible for deletion.
 
-* **deleteIntervalSeconds**, int
+* **deleteAfterSeconds**, int
 
     delete interval in seconds, must be >=1
 
-* **deleteIntervalMinutes**, int
+* **deleteAfterMinutes**, int
 
     delete interval in minutes, must be >=1
 
-* **deleteIntervalHours**, int
+* **deleteAfterHours**, int
 
     delete interval in hours, must be >=1
 
-* **deleteIntervalDays**, int
+* **deleteAfterDays**, int
 
     delete interval in days, must be >=1
 
@@ -219,19 +198,19 @@ Default: 15 minutes
 
 * **retentionSeconds**, number
 
-    How many seconds a job may be in created state before it becomes eligible to be archived. Must be >=1
+    How many seconds a job may be in created state before it's archived. Must be >=1
 
 * **retentionMinutes**, number
 
-    How many minutes a job may be in created state before it becomes eligible to be archived. Must be >=1
+    How many minutes a job may be in created state before it's archived. Must be >=1
 
 * **retentionHours**, number
 
-    How many hours a job may be in created state before it becomes eligible to be archived. Must be >=1
+    How many hours a job may be in created state before it's archived. Must be >=1
 
 * **retentionDays**, number
 
-    How many days a job may be in created state before it becomes eligible to be archived. Must be >=1
+    How many days a job may be in created state before it's archived. Must be >=1
 
 Default: 30 days
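
The option renames and removals in this release can be applied mechanically to an existing configuration object. The helper below is a minimal sketch of that idea and is not part of pg-boss: `migrateOptions`, `RENAMED`, and `REMOVED` are hypothetical names, and the choice to let an explicit `max` win over `poolSize` is an assumption rather than documented behavior.

```javascript
// Hypothetical helper (not part of pg-boss): rewrites a v4-style constructor
// options object using the renames and removals listed for 5.0.0.
const RENAMED = {
  deleteIntervalSeconds: 'deleteAfterSeconds',
  deleteIntervalMinutes: 'deleteAfterMinutes',
  deleteIntervalHours: 'deleteAfterHours',
  deleteIntervalDays: 'deleteAfterDays',
  poolSize: 'max' // v5 standardizes on the `max` property of the pg package
}

const REMOVED = [
  'archiveIntervalSeconds',
  'archiveIntervalMinutes',
  'archiveIntervalHours',
  'archiveIntervalDays'
]

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key)

function migrateOptions (v4Options) {
  const options = {}
  const dropped = []

  for (const [key, value] of Object.entries(v4Options)) {
    if (REMOVED.includes(key)) {
      // No v5 equivalent: archiving now follows the maintenance interval.
      dropped.push(key)
    } else if (has(RENAMED, key)) {
      const newKey = RENAMED[key]
      // Assumption: if the new name is already set explicitly, keep that value.
      if (!has(v4Options, newKey)) options[newKey] = value
    } else {
      options[key] = value
    }
  }

  return { options, dropped }
}
```

For example, `migrateOptions({ poolSize: 5, deleteIntervalDays: 7, archiveIntervalHours: 2 })` yields `options` of `{ max: 5, deleteAfterDays: 7 }` and reports `archiveIntervalHours` in `dropped`, so the caller can log which settings no longer have any effect.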