Per-job AllowPromotion opts out of ZADD GT guard #4
Merged
Enqueue and PromoteJob have both used ZADD XX GT since #3 to preserve the deferral guarantee for jobs enqueued with deterministic IDs: once a schedule sits at time T in the future, a duplicate enqueue at T' < T must be a no-op, and PromoteJob must not demote a job whose score has been bumped to now + InvisibleSec by Dequeue. That is the right semantic for dedup-style jobs that may race their own re-enqueue.
It is the wrong semantic for two cases that have grown up around the queue since:
1. Worker retry rescheduling. When a handler returns an error the retry middleware computes a backoff delay and calls Enqueue with score = now + delay. With GT, that score is rejected because the Dequeue invisibility mark (now + InvisibleSec, typically 60s) is greater. The configured Backoff is effectively dead for any value less than InvisibleSec; gated handlers re-run only on the InvisibleSec cadence regardless of how short the backoff is.
2. Subqueue PromoteOnAck. The subqueue middleware advances the next gated job after the prior handler Acks by calling PromoteJob on that job's ID. With GT, the score (still sitting at the InvisibleSec mark from its last dequeue/gated cycle) is also rejected and the next job continues to wait out its full invisibility window.
Add a per-job AllowPromotion flag. Default false preserves today's GT semantics so dedup-deferral jobs are unaffected; setting true causes Enqueue to use plain ZADD XX so backoff can lower the score, and causes PromoteJob to use ZADD XX (without GT) so the next gated subqueue entry can be advanced. The flag rides in the job's msgpack storage so it survives the worker retry round-trip without callers having to track it across Enqueue/PromoteJob boundaries.
The Enqueue Lua script splits the per-job arg list into two ZADD calls (one with gt, one without) so a mixed BulkEnqueue stays atomic. PromoteJob does an HGET to read the flag before issuing the ZADD; this adds one round-trip per promotion but keeps the API stable.
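To make the retry case concrete, here is a minimal in-memory sketch of the score-update rule described above. It is not the queue's code: the `scores` map stands in for the Redis sorted set, and `zaddXX` and `reschedule` are hypothetical names that model only the "update an existing member" step of ZADD XX with and without GT. The numbers (now = 1000, InvisibleSec = 60, backoff = 5) are illustrative.

```go
package main

import "fmt"

// scores is a toy stand-in for the queue's Redis sorted set: job ID -> score.
type scores map[string]int64

// zaddXX models ZADD XX [GT] for one member. XX means missing members are
// never added; GT means the score is only ever raised. It reports whether
// the stored score changed.
func (s scores) zaddXX(id string, score int64, gt bool) bool {
	cur, ok := s[id]
	if !ok || score == cur {
		return false
	}
	if gt && score < cur {
		return false
	}
	s[id] = score
	return true
}

// reschedule is a hypothetical helper: AllowPromotion=false keeps the GT
// guard, AllowPromotion=true drops it so the score may move earlier.
func reschedule(s scores, id string, score int64, allowPromotion bool) bool {
	return s.zaddXX(id, score, !allowPromotion)
}

func main() {
	const now, invisibleSec, backoff = 1000, 60, 5

	// Dequeue has already bumped the score to now + InvisibleSec.
	s := scores{"job-1": now + invisibleSec}

	// Default flag: the 5s backoff is rejected, the job stays at 1060.
	changed := reschedule(s, "job-1", now+backoff, false)
	fmt.Println(changed, s["job-1"]) // false 1060

	// AllowPromotion=true: the backoff score is accepted.
	changed = reschedule(s, "job-1", now+backoff, true)
	fmt.Println(changed, s["job-1"]) // true 1005
}
```

The same rule explains the PromoteOnAck case: a promotion to "now" is a lower score than the invisibility mark, so it only lands when the GT guard is off.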
AllowPromotion is consulted only server-side (by Enqueue and PromoteJob), so rehydrating it onto jobs returned by Dequeue/BulkFind added a brittle fields[0].(string) assertion and a wider Lua-vs-Go contract for no benefit. Drop the rehydration; the Lua scripts return jobm strings as before, and the Go side reverts to its prior simple form. PromoteJob now runs as a single Lua script (HGET allow_promotion + ZADD XX [GT]) instead of two round trips. This closes a narrow race where an Ack + re-enqueue between the HGET and ZADD could let a stale "AllowPromotion=true" read demote a freshly-enqueued job, and matches the atomic style of every other queue op.
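The race closed by the single-script PromoteJob can be sketched in memory. This is an analogy under stated assumptions, not the actual Lua script: a mutex plays the role of Redis script atomicity, and the `job`, `queue`, `enqueue`, and `promote` names are hypothetical. The point is that the flag read and the score write happen in one critical section, so a re-enqueue can never slip in between them.

```go
package main

import (
	"fmt"
	"sync"
)

// job holds the two fields that matter for promotion in this model.
type job struct {
	allowPromotion bool
	score          int64
}

// queue is a toy store; mu stands in for the atomicity of one Lua script.
type queue struct {
	mu   sync.Mutex
	jobs map[string]job
}

// enqueue overwrites the stored job, as a fresh Enqueue after an Ack would.
func (q *queue) enqueue(id string, j job) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.jobs[id] = j
}

// promote reads the flag and writes the score inside one critical section,
// so the flag it acts on always belongs to the job it updates.
func (q *queue) promote(id string, score int64) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	j, ok := q.jobs[id]
	if !ok || score == j.score {
		return false
	}
	if !j.allowPromotion && score < j.score {
		return false // GT guard applies when AllowPromotion is false
	}
	j.score = score
	q.jobs[id] = j
	return true
}

func main() {
	q := &queue{jobs: map[string]job{
		"job-1": {allowPromotion: true, score: 1060},
	}}

	// The job is acked and re-enqueued as a dedup-deferral job far in the future.
	q.enqueue("job-1", job{allowPromotion: false, score: 2000})

	// A late promotion sees the current flag, not a stale "true", and is refused.
	ok := q.promote("job-1", 1000)
	fmt.Println(ok, q.jobs["job-1"].score) // false 2000
}
```

With a separate read and write, the promotion could have read `allowPromotion=true` before the re-enqueue and lowered the score to 1000 afterwards.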
Reset previously called FlushAll, which wiped the entire Redis DB. Because go test ./... runs packages in parallel against a shared Redis instance, one package's Reset could erase another package's in-flight data mid-test, surfacing as flakes (most visibly the 100k-job bulk-enqueue tests). Reset now takes namespaces and scans+deletes only matching keys, and each test package uses a distinct namespace so parallel packages no longer collide.
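A namespace-scoped Reset can be sketched as follows. This is an in-memory illustration only: the `store` map stands in for the shared Redis DB, the `reset` helper is hypothetical, and the `<namespace>:<rest>` key layout is an assumption, since the real key format is not shown here. Against Redis the loop would be a cursor-based scan with a match pattern followed by deletes.

```go
package main

import (
	"fmt"
	"strings"
)

// store is a toy key space standing in for the shared Redis DB.
type store map[string]string

// reset deletes only the keys that belong to the given namespaces and
// returns how many were removed. Keys are assumed to look like
// "<namespace>:<rest>"; that layout is an assumption of this sketch.
func (s store) reset(namespaces ...string) int {
	deleted := 0
	for key := range s {
		for _, ns := range namespaces {
			if strings.HasPrefix(key, ns+":") {
				delete(s, key) // deleting during range is safe in Go
				deleted++
				break
			}
		}
	}
	return deleted
}

func main() {
	s := store{
		"queue_test:jobs":      "a",
		"queue_test:scheduled": "b",
		"worker_test:jobs":     "c", // another package's in-flight data
	}

	n := s.reset("queue_test")
	fmt.Println(n, len(s)) // 2 1
}
```

Matching on the namespace plus the separator keeps a namespace such as `queue` from also wiping `queue_test` keys.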