Fix the way we fetch jobs to work better for a large number of workers.
Instead of fetching 5 candidates and attempting to lock each one by one, just lock the next job in 1 query.

Example of the old system with 100 workers (worst but not uncommon case):
1) 100 workers wake up and fetch 5 candidate jobs (they all get the same, or a very similar, set of 5 jobs), making a total of 100 SELECT calls.
2) They all try to lock the first job (1 succeeds, the other 99 fail). The failing workers try the next, making a total of roughly 500 UPDATE calls.
3) A total of 5 jobs get processed from about 600 database calls, and we restart the whole process.

New system:
1) 100 workers wake up and fetch the next available job, each getting a single unique job (1 SELECT and 1 UPDATE call each).
2) A total of 100 jobs are processed with exactly 200 SQL calls.

I've also included an example that does it all in 1 call, avoiding an extra round-trip. This requires custom SQL that has only been tested with PostgreSQL.
Showing with 25 additions and 10 deletions.