scheduled jobs #172

Merged
merged 30 commits into from
Jul 15, 2020
Commits
a378bb8
hired a timekeeper
timgit Jun 10, 2020
e19d137
skew monitoring and timezone support
timgit Jun 13, 2020
46bf078
clean up process event listener
timgit Jun 14, 2020
9d8b9f2
scheduling tests
timgit Jun 14, 2020
5524079
removed unused api
timgit Jun 14, 2020
7467114
scheduling tests and some docs
timgit Jun 15, 2020
a614ea8
linting
timgit Jun 15, 2020
3491c52
custom clock monitoring
timgit Jun 15, 2020
0b04136
switching to database time for cron
timgit Jun 16, 2020
b3ca060
docs wip [skip ci]
timgit Jun 17, 2020
da8695c
Merge branch 'master' into cron
timgit Jun 17, 2020
abae02c
scheduling validation
timgit Jun 17, 2020
c569fef
update travis config to drop node 8
timgit Jun 21, 2020
214fb2b
update deps
timgit Jun 21, 2020
c48cfac
renamed clock monitoring configuration props
timgit Jun 21, 2020
02fd46b
removed connect() and disconnect()
timgit Jun 21, 2020
89d8e1e
updated configuration and tests
timgit Jun 25, 2020
a438ed9
clock sync ms rounding
timgit Jun 25, 2020
681150c
fixed friendly skew messaging
timgit Jun 27, 2020
3af5b73
updated manifest and deps for v5
timgit Jun 27, 2020
165b5cc
reordered changelog
timgit Jun 27, 2020
2fb77dd
added maintenance job monitor
timgit Jul 1, 2020
10993ae
Merge branch 'master' into cron
timgit Jul 1, 2020
b08c8ef
typescript types
timgit Jul 2, 2020
7121668
fix throttline in cron handler
timgit Jul 9, 2020
ca50058
pg dep update
timgit Jul 9, 2020
cd5f94b
debouncing cron jobs
timgit Jul 9, 2020
ba16eb5
linting
timgit Jul 9, 2020
278095d
schedule() validation
timgit Jul 9, 2020
9432990
debouncing defers based on time slot
timgit Jul 11, 2020
2 changes: 1 addition & 1 deletion .travis.yml
@@ -6,9 +6,9 @@ services:
  - postgresql
  language: node_js
  node_js:
+ - "14"
  - "12"
  - "10"
- - "8"
  before_script:
  - psql -c 'create database pgboss' -U postgres
  - psql -c 'create extension pgcrypto' -d pgboss -U postgres
40 changes: 40 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,45 @@
# Changes

## 5.0.0 :tada:

The pg-boss team hired a timekeeper and now has distributed cron-based scheduling! This works across all instances based on the database server's time as a central clock.
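
One way to picture a central database clock is as an offset that each instance applies to its own clock. The sketch below is illustrative only and is not taken from pg-boss: the function names and the midpoint assumption are hypothetical, and the three timestamps are assumed to be epoch milliseconds captured around a query for the database's current time.

```javascript
// Illustrative only: not pg-boss's actual implementation.
// localBeforeMs / localAfterMs: local clock readings taken just before and
// just after asking the database for its current time (databaseNowMs).
function estimateClockSkewMs (localBeforeMs, databaseNowMs, localAfterMs) {
  // Assume the database sampled its clock roughly halfway through the round trip.
  const localMidpointMs = (localBeforeMs + localAfterMs) / 2
  // Positive result: the database clock is ahead of the local clock.
  return Math.round(databaseNowMs - localMidpointMs)
}

// Translate a local timestamp into the database's frame of reference.
function toDatabaseTime (localMs, skewMs) {
  return localMs + skewMs
}
```

With a skew estimate in hand, every instance can evaluate a cron expression against the same notion of "now", even when the local clocks disagree.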

New functions:

- `schedule(name, cron, data, options)`
- `unschedule(name)`
- `getSchedules()`
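
The new functions might be used along these lines. This is a minimal sketch, assuming `boss` is an already started pg-boss v5 instance and that each function returns a promise; the queue name, cron expression, payload, and the two wrapper functions are made up for illustration.

```javascript
// Sketch only: `boss` is assumed to be a started pg-boss v5 instance.
// 'nightly-report', the cron expression and the payload are example values.
async function registerNightlyReport (boss) {
  // Five-field cron expression: minute 0, hour 3, every day.
  await boss.schedule('nightly-report', '0 3 * * *', { format: 'pdf' })

  // Hand back whatever getSchedules() resolves to so the caller can inspect it.
  return boss.getSchedules()
}

async function removeNightlyReport (boss) {
  // Stop creating new jobs for this schedule.
  await boss.unschedule('nightly-report')
}
```

Passing `boss` in as an argument keeps the sketch independent of how the instance is constructed and makes it easy to exercise with a stub.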

New constructor configuration properties:

- `clockMonitorIntervalSeconds`
- `clockMonitorIntervalMinutes`
- `noScheduling`

### Changes

- MAJOR: Removed `connect()` and `disconnect()` to simplify usage since these functions became obsolete in v4. If you had relied on secondary instances running with `connect()`, you should switch to `start()`. Since `start()` is multi-master, it's safe to let it monitor and submit maintenance work, but if you need to opt out of this for whatever reason on a particular instance, set the `noSupervisor` and `noScheduling` constructor options to `true`.
- MAJOR: Dropped `poolSize` in constructor database config to standardize on `max` property used in the pg package.
- MAJOR: Dropped Node 8 support and removed Node 8 from the Travis CI builds.
- MAJOR: Adjusted maintenance configuration settings for clarity. For example, some operations run on an interval and contain the word "interval". However, other settings are time-based policies evaluated only after maintenance is run. These also contained "interval" which made it challenging to explain the differences between them.
- Removed properties related to moving completed jobs to the archive table. Completed jobs will be moved to the archive table based on the maintenance interval going forward.

| Old | New |
| - | - |
| `archiveIntervalSeconds` | ** REMOVED ** |
| `archiveIntervalMinutes` | ** REMOVED ** |
| `archiveIntervalHours` | ** REMOVED ** |
| `archiveIntervalDays` | ** REMOVED ** |

- Renamed properties for controlling when to delete jobs from the archive table

| Old | New |
| - | - |
| `deleteIntervalSeconds` | `deleteAfterSeconds` |
| `deleteIntervalMinutes` | `deleteAfterMinutes` |
| `deleteIntervalHours` | `deleteAfterHours` |
| `deleteIntervalDays` | `deleteAfterDays` |
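
The renames and removals above can be applied mechanically to an existing v4 options object. The helper below is a hypothetical sketch and is not part of pg-boss; `migrateOptionsToV5` and its lookup tables are names invented here, built only from the property names listed in this changelog.

```javascript
// Hypothetical helper, not part of pg-boss: rewrites v4-style constructor
// options into their v5 names using the changelog tables.
const RENAMED = new Map([
  ['poolSize', 'max'],
  ['deleteIntervalSeconds', 'deleteAfterSeconds'],
  ['deleteIntervalMinutes', 'deleteAfterMinutes'],
  ['deleteIntervalHours', 'deleteAfterHours'],
  ['deleteIntervalDays', 'deleteAfterDays']
])

const REMOVED = new Set([
  'archiveIntervalSeconds',
  'archiveIntervalMinutes',
  'archiveIntervalHours',
  'archiveIntervalDays'
])

function migrateOptionsToV5 (v4Options) {
  const v5Options = {}
  for (const [key, value] of Object.entries(v4Options)) {
    // Archive interval settings no longer exist in v5, so drop them.
    if (REMOVED.has(key)) continue
    const newKey = RENAMED.get(key) || key
    // If the caller already set the v5 name, it wins over the legacy name.
    if (newKey !== key && newKey in v4Options) continue
    v5Options[newKey] = value
  }
  return v5Options
}
```

For example, `{ poolSize: 5, archiveIntervalHours: 2, deleteIntervalDays: 7 }` becomes `{ max: 5, deleteAfterDays: 7 }`.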

## 4.3.4

- Typescript types fix for db connections. Includes PR from @mlegenhausen
29 changes: 11 additions & 18 deletions README.md
@@ -34,31 +34,24 @@ async function someAsyncJobHandler(job) {

pg-boss is a job queue built in Node.js on top of PostgreSQL in order to provide background processing and reliable asynchronous execution to Node.js applications.

Why would you consider using this queue over others? pg-boss is actually a light abstraction over features added in PostgreSQL 9.5
(specifically [SKIP LOCKED](http://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5) and upserts)
which significantly enhanced its ability to act as a reliable, distributed message queue. I wrote this to remove a dependency on Redis (via the kue package), consolidating systems I have to support in production as well as upgrading to guaranteed message processing (hint: [Redis persistence docs](https://redis.io/topics/persistence#ok-so-what-should-i-use)).
pg-boss relies on [SKIP LOCKED](http://blog.2ndquadrant.com/what-is-select-skip-locked-for-in-postgresql-9-5), a feature introduced in PostgreSQL 9.5 written specifically for message queues, in order to resolve record locking challenges inherent with relational databases. This brings the safety of guaranteed atomic commits of a relational database to your asynchronous job processing.

This will likely cater the most to teams already familiar with the simplicity of relational database semantics and operations (querying and backups, for example).
This will likely appeal most to teams already familiar with the simplicity of relational database semantics and operations (SQL, querying, and backups). It will be especially useful to those already relying on PostgreSQL who want to limit how many systems they have to monitor and support in their architecture.

## Features
* Guaranteed delivery and finalizing of jobs using a promise API
* Delayed jobs
* Job retries (opt-in exponential backoff)
* Job throttling (unique jobs, rate limiting and/or debouncing)
* Job batching for high volume use cases
* Backpressure-compatible subscriptions
* Configurable job concurrency
* Distributed and/or clustered workers
* Completion subscriptions to support orchestrations/sagas
* On-demand job fetching and completion for external integrations (such as web APIs)
* Backpressure-compatible subscriptions for monitoring queues on an interval (with configurable concurrency)
* Distributed cron-based job scheduling with database clock synchronization
* Job deferral, retries (with exponential backoff), throttling, rate limiting, debouncing
* Job Completion subscriptions for orchestrations/sagas
* Direct publish, fetch and completion APIs for custom integrations
* Batching API for chunked job fetching
* Direct table access for bulk loads via COPY or INSERT
* Multi-master capable using tools such as Kubernetes ReplicaSets
* Direct table access for bulk loading via COPY or other advanced usage
* Automatic provisioning of required storage into a dedicated schema
* Automatic monitoring for expired jobs
* Automatic archiving for completed jobs
* Automatic maintenance operations to manage table growth

## Requirements
* Node 8 or higher
* Node 10 or higher
* PostgreSQL 9.5 or higher

## Documentation
47 changes: 13 additions & 34 deletions docs/configuration.md
@@ -8,7 +8,6 @@ pg-boss can be customized using configuration options when an instance is create
- [Database options](#database-options)
- [Queue options](#queue-options)
- [Maintenance options](#maintenance-options)
- [Archive completed jobs](#archive-completed-jobs)
- [Delete archived jobs](#delete-archived-jobs)
- [Maintenance interval](#maintenance-interval)
- [Publish options](#publish-options)
@@ -41,7 +40,7 @@ Alternatively, the following options can be set as properties in an object.

* **port** - int, defaults to 5432

* **ssl** - bool, defaults to false
* **ssl** - boolean or object

* **database** - string, *required*

@@ -53,9 +52,9 @@ Alternatively, the following options can be set as properties in an object.

PostgreSQL connection string will be parsed and used instead of `host`, `port`, `ssl`, `database`, `user`, `password`.

* **poolSize** or **max** - int, defaults to 10
* **max** - int, defaults to 10

Maximum number of connections that will be shared by all subscriptions in this instance
Maximum number of connections that will be shared by all subscriptions in this instance

* **application_name** - string, defaults to "pgboss"

@@ -110,47 +109,27 @@ Maintenance operations include checking active jobs for expiration, archiving co

If this is set to true, maintenance and monitoring operations will not be started during a `start()` after the schema is created. This is an advanced use case, as bypassing maintenance operations is not something you would want to do under normal circumstances.

#### Archive completed jobs
* **noScheduling**, bool, default false

When jobs become eligible for archive after completion.

* **archiveIntervalSeconds**, int

archive interval in seconds, must be >=1

* **archiveIntervalMinutes**, int

archive interval in minutes, must be >=1

* **archiveIntervalHours**, int

archive interval in hours, must be >=1

* **archiveIntervalDays**, int

archive interval in days, must be >=1

Default: 1 hour.

> When a higher unit is specified, lower unit configuration settings are ignored.
If this is set to true, this instance will not monitor scheduled jobs during `start()`, although it can still use the scheduling API. This is an advanced option intended for testing, or for silencing clock skew warnings on a server whose clock is known to be skewed.
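
An instance that should neither run maintenance nor monitor schedules could combine this flag with `noSupervisor`. The helper below is a hypothetical sketch: `workerOnlyOptions` is a name invented for illustration, and only the two option names come from the documentation.

```javascript
// Hypothetical helper: returns a copy of the given constructor options with
// both maintenance supervision and schedule monitoring switched off.
function workerOnlyOptions (baseOptions) {
  return {
    ...baseOptions,
    noSupervisor: true, // skip maintenance and monitoring on this instance
    noScheduling: true // skip schedule monitoring on this instance
  }
}
```

The resulting object would be passed to the pg-boss constructor before calling `start()`; the input object is left unmodified.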

#### Delete archived jobs

When jobs in the archive table become eligible for deletion.

* **deleteIntervalSeconds**, int
* **deleteAfterSeconds**, int

delete interval in seconds, must be >=1

* **deleteIntervalMinutes**, int
* **deleteAfterMinutes**, int

delete interval in minutes, must be >=1

* **deleteIntervalHours**, int
* **deleteAfterHours**, int

delete interval in hours, must be >=1

* **deleteIntervalDays**, int
* **deleteAfterDays**, int

delete interval in days, must be >=1

@@ -219,19 +198,19 @@ Default: 15 minutes

* **retentionSeconds**, number

How many seconds a job may be in created state before it becomes eligible to be archived. Must be >=1
How many seconds a job may be in created state before it's archived. Must be >=1

* **retentionMinutes**, number

How many minutes a job may be in created state before it becomes eligible to be archived. Must be >=1
How many minutes a job may be in created state before it's archived. Must be >=1

* **retentionHours**, number

How many hours a job may be in created state before it becomes eligible to be archived. Must be >=1
How many hours a job may be in created state before it's archived. Must be >=1

* **retentionDays**, number

How many days a job may be in created state before it becomes eligible to be archived. Must be >=1
How many days a job may be in created state before it's archived. Must be >=1

Default: 30 days
