v2.0.0 - The "Database-Driven Development" Release!

@smudge smudge released this 19 Dec 22:16
· 2 commits to main since this release
62df905

What's Changed

Numerous changes focused on index coverage and query performance went into this release.

At a high level, this release consists of:

  • New optional-but-encouraged DB indexes (available via rake delayed:install:migrations) that improve index coverage of all queries.
  • Adjustments to queries to improve selectivity and reduce cost (the number of scanned/filtered rows).
  • First-class support for HOT updates in PostgreSQL (during the job pickup query).
  • Daylight saving time fixes in non-UTC time (:local) contexts. (#81)
  • Job timeouts can no longer be rescued from within the perform method as a StandardError. (#66)
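
To illustrate the last point, here is a minimal Ruby sketch of why a timeout error that is not a StandardError escapes a broad rescue inside perform. The class names HypotheticalJobTimeout and HypotheticalJob and the run_job helper are invented for this example and are not the gem's actual API. Inheriting directly from Exception is an assumption about how such behavior can be achieved, not a description of how #66 implements it.

```ruby
# Hypothetical stand-in for a job-timeout error (not the gem's real class).
# Assumption: it inherits from Exception rather than StandardError, so a
# bare `rescue` or `rescue StandardError` will not catch it.
class HypotheticalJobTimeout < Exception; end

# Hypothetical job whose perform method rescues broadly.
class HypotheticalJob
  attr_reader :swallowed

  def initialize
    @swallowed = false
  end

  def perform
    # Simulate the worker interrupting the job because it ran too long.
    raise HypotheticalJobTimeout, "job exceeded its time budget"
  rescue StandardError
    # Ordinary errors would land here; the timeout above does not.
    @swallowed = true
  end
end

# Hypothetical worker-side wrapper that still sees the timeout.
def run_job(job)
  job.perform
  :completed
rescue HypotheticalJobTimeout
  :timed_out
end

job = HypotheticalJob.new
puts run_job(job)    # prints "timed_out"
puts job.swallowed   # prints "false"
```

In practice this means a job that rescues StandardError to handle its own errors cannot accidentally swallow a timeout, so the worker can still treat the run as timed out.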

YMMV: results will depend heavily on the contents of your queue and the number of workers you run. In real-world, at-scale testing against a PostgreSQL (RDS Aurora) backed queue, in an environment where the table held many millions of future-scheduled and/or failed rows (i.e. "non-claimable" rows), improvements were observed across the board:

  • The worker pickup query saw a 100-1000x query time improvement.
  • The "monitor" queries saw a net 10-100x improvement.
  • Overall CPU usage and disk I/O of an active (but not back-logged) PostgreSQL queue dropped significantly.

The full list of changes is as follows:

  • test: Add "golden" tests for worker/monitor SQL by @smudge in #61
  • refactor: Add 'lock_timeout' & clarify that 'max_run_time' is a process-wide config by @smudge in #62
  • refactor: Remove 'ready_scope' and improve remaining scopes. by @smudge in #63
  • fix(tests): timing issue on test flake by @smudge in #65
  • fix: max_run_time timeout error should not be rescuable as a StandardError by @smudge in #66
  • test: Use snapshots for easier SQL testing/iteration. by @smudge in #67
  • test: Normalize DB versions used in CI by @smudge in #68
  • test: Snapshot test all EXPLAIN query plans by @smudge in #69
  • feat(perf): New indexes for job pickup & monitoring by @smudge in #70
  • fix: Ensure that upsert_index is reversible by @smudge in #72
  • fix(monitor): exclude failed jobs from metrics that shouldn't count them by @smudge in #73
  • fix(monitor): Obey lock timeout when reporting claimed/unclaimed rows by @smudge in #74
  • fix: Include 'claimed' as part of 'claimed_by' scope by @smudge in #75
  • perf(monitor): use 'attempts > 0' for better indexability by @smudge in #76
  • perf: HOT updates for PostgreSQL, index locked_at for everyone else. by @smudge in #77
  • perf(worker): avoid sequential scan during worker shutdown by @smudge in #79
  • perf: Improve selectivity of pickup query / "claimed" states by @smudge in #80
  • perf: remove legacy/unused "delayed_jobs_priority" index by @smudge in #78
  • fix: prevent DST + :local from breaking job backoff by @smudge in #81
  • fix: Improve upsert_index to only drop+rebuild if there is no matching & valid index by @smudge in #83
  • test(migration path): ensure that new queries are compatible with old index by @smudge in #84
  • perf(monitor): avoid querying the same thing twice, avoid seq scans by @smudge in #82
  • build: Ship 2.0.0! by @smudge in #71

Full Changelog: v1.2.1...v2.0.0