fix(job.id): Use unique hex string for job id to avoid a race condition #193
|
I don't understand the test failure involving the stack trace contents. |
|
So the core change of this PR breaks the default id ordering which supports the estimation of job-creation-throughput that the documentation for

The separation of user-provided ids from default ids is exercised by the remaining id-based test failure:

AFAICT, the idea of keeping user-provided ids separate from the default (automatic) ids seems like an artifact of the old way of generating ids in Redis, rather than a bona-fide feature. The documented behavior has changed, and the non-user-provided and ordering semantics of checkHealth().newestJob are no longer supported.

I would like to change the documentation, and change the test to show that user-provided ids do show up in newestJob... Actually, I would prefer to get rid of the

One possible adjustment: we could make

This adjustment would also make explicit that maintenance of the Redis key for |
|
Can you share more about how generating the job id from calling Redis' |
|
Please see #189 which reproduces the problem. Specifically to your question: Redis does the increment and pushes the job onto the waiting queue atomically. The worker can pop the job off the waiting queue and get it finished before the job's ID has been set in the queue's container (which happens asynchronously with the worker running the job). So, when the job finishes, the queue does not emit a succeeded event because the queue doesn't yet know about the job. |
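To make the ordering problem concrete, here is a minimal sketch of the race using an in-memory stand-in instead of Redis. `FakeBroker`, `FakeQueue`, and `demo` are hypothetical names invented for this illustration and are not the library's real classes; the timers only imitate a worker that finishes before the producer receives the id.

```javascript
const { EventEmitter } = require('events');

// Hypothetical in-memory stand-in for Redis plus a fast worker.
class FakeBroker extends EventEmitter {
  constructor() {
    super();
    this.counter = 0;
  }

  // Imitates an atomic "increment the id and push onto the waiting list" step.
  save(callback) {
    const id = String(++this.counter);
    setImmediate(() => this.emit('done', id)); // the worker finishes first
    setTimeout(() => callback(id), 5); // the reply carrying the id arrives later
  }
}

// Hypothetical queue that only emits events for jobs it already tracks.
class FakeQueue extends EventEmitter {
  constructor(broker) {
    super();
    this.broker = broker;
    this.jobs = new Map();
    this.missed = [];
    broker.on('done', (id) => {
      if (this.jobs.has(id)) this.emit('succeeded', id);
      else this.missed.push(id); // the queue does not know about the job yet
    });
  }

  createJob() {
    return new Promise((resolve) => {
      this.broker.save((id) => {
        this.jobs.set(id, {}); // the id is registered only after the reply
        resolve(id);
      });
    });
  }
}

async function demo() {
  const queue = new FakeQueue(new FakeBroker());
  const succeeded = [];
  queue.on('succeeded', (id) => succeeded.push(id));
  const id = await queue.createJob();
  return { id, succeeded, missed: queue.missed };
}

demo().then((result) => console.log(result));
```

In this sketch the job completes while the queue still has no record of its id, so no succeeded event is emitted and the id ends up in `missed`.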
|
I see now - thanks. Just to confirm - the only symptom of this is that events won't be emitted for those jobs, right? I'll let @skeggse or @LewisJEllis review. |
|
Correct. And I believe only the Queue events won't be emitted. That is, the

BTW, I am considering an alternate approach to solving the problem: making a separate Redis call to get the new incr ID. This was suggested in the analysis in #78. It entails the extra overhead of two Redis calls, but has some simplicity, and is much less disruptive vis-a-vis the

I'm still struggling with intermittent test failures in a Docker environment that suggest there are lurking race conditions in some of the test logic (or conceivably the library). |
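A rough sketch of the two-call alternate approach, again against an in-memory stand-in rather than Redis. `FakeBroker`, `nextId`, `enqueue`, and `createJob` are hypothetical names for this illustration only; the first call plays the role of a separate increment, the second enqueues the job.

```javascript
const { EventEmitter } = require('events');

// Hypothetical in-memory stand-in for Redis and a fast worker.
class FakeBroker extends EventEmitter {
  constructor() {
    super();
    this.counter = 0;
    this.calls = 0;
  }

  // First round trip: reserve an id (stands in for a separate increment call).
  nextId() {
    this.calls += 1;
    const id = String(++this.counter);
    return new Promise((resolve) => setTimeout(() => resolve(id), 1));
  }

  // Second round trip: enqueue the job; the worker finishes before the reply.
  enqueue(id) {
    this.calls += 1;
    setImmediate(() => this.emit('done', id));
    return new Promise((resolve) => setTimeout(resolve, 5));
  }
}

async function createJob(broker, jobs) {
  const id = await broker.nextId();
  jobs.add(id); // the queue knows the id before the job can be processed
  await broker.enqueue(id);
  return id;
}

async function demo() {
  const broker = new FakeBroker();
  const jobs = new Set();
  const succeeded = [];
  broker.on('done', (id) => {
    if (jobs.has(id)) succeeded.push(id);
  });
  const id = await createJob(broker, jobs);
  return { id, succeeded, calls: broker.calls };
}

demo().then((result) => console.log(result));
```

Because the id is registered before the enqueue call, the completion is observed even when the worker is fast, at the cost of two round trips per job.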
|
There may be some Queue assumptions that are causing test failures for either approach (this PR or a separate incr call). The existing behavior is that the Redis state has the saved job in play before the Queue has the job.id in its activeJobs set. The new behavior is that the Queue has the job.id in its activeJobs set before Redis has the job entered into its state. Thus, there are different constraints on what can be done... There's more for me to learn before I can untangle the implications and formulate a solution... |
|
See also #194 which implements the Docker test harness I've been using, and which documents two intermittent test failures I've seen. |
|
This proposed fix is obsoleted by #197, so I'm closing it. |
These changes fix the problems reported in #189 and originally in #78. They do so by managing the job id in the Job object itself rather than in Redis. Various simplifications emerge. Note that with these changes, the only purpose of the bq:name:id key in Redis is to support the newestJob field returned by checkHealth().

A couple of tests fail because they rely on the incrementing numerical properties of the old job-id creation or on the idea that user-provided job ids are special. Whether user-provided ids are special isn't clear to me. So, I haven't tried to fix the tests yet.
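For illustration, a minimal sketch of generating the id in the Job object itself. `makeJobId` and this `Job` class are hypothetical and only assume Node's built-in `crypto` module; the exact id scheme used in the PR may differ.

```javascript
const crypto = require('crypto');

// Hypothetical helper: returns a random hex string, two hex characters per byte.
function makeJobId(bytes = 16) {
  return crypto.randomBytes(bytes).toString('hex');
}

// Hypothetical Job class, not the library's real one.
class Job {
  constructor(data, id) {
    // A user-provided id is used as-is; otherwise one is generated locally,
    // so the id is known before anything is written to Redis.
    this.id = id !== undefined ? String(id) : makeJobId();
    this.data = data;
  }
}

const autoJob = new Job({ x: 1 });
const userJob = new Job({ x: 2 }, 'my-id');
console.log(autoJob.id.length, userJob.id);
```

Ids generated this way carry no ordering, which is why tests that depend on incrementing numeric ids would no longer hold.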