-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: handle script reinstantiation during Redis failover #316
Conversation
I believe this may address #147. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want/need to revisit lib/lua/index.js
, seeing as it was written with the assumption that none of the script loading was performance-critical. I suspect it's more important now.
Also worth noting:
An alternative and less obtuse implementation could be to utilize
SCRIPT EXISTS
to check before everyEVALSHA
^ this is actually how the proposed changes operate, because buildCache
invokes loadScriptIfMissing
, which runs SCRIPT EXISTS
.
lib/queue.js
Outdated
err.name === 'ReplyError' && | ||
err.message === 'NOSCRIPT No matching script. Please use EVAL.' | ||
) { | ||
return lua.buildCache(this.client).then((client) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat concerned that this will hammer the redis server if the client tries to execute a bunch of scripts all at once - they'll all come back with NOSCRIPT
errors and will all separately trigger this code path. IMO we should handle that by introducing new code (either in buildCache
or between here and buildCache
) to memoize the promise. It'd want to be per-client, I suspect - maybe using a WeakMap
(just storing it on the Queue
would work too)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe WeakMap
requires node versions >= 12 (10 behind an interpreter flag); does this change this recommendation at all? I see that you are planning deprecation of node 4 & 6 - wondering about 8/10 in this context & given that your friends at Mixmax (probably) aren't the only users of this package. I know I personally use 10 in some places still :L (tho this could be the kick in the butt to prioritize some other things.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm seeing Node 0.12
, not 12
- am I looking at something different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nah, you're correct - I mean 0.12 and not 12. d'oh, I'm just mixed up.
lib/queue.js
Outdated
if (err.code !== 'NOSCRIPT') { | ||
throw err; | ||
} | ||
return lua.buildCache(this.client).then((client) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The promise produced by buildCache
will not resolve to the client. We'll want to use _commandable
again anyway in case the Queue
is in the process of exiting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d'oh!
lib/lua/index.js
Outdated
if (scriptsLoaded) return scriptsLoadedPromise; | ||
scriptsLoaded = true; | ||
// We could theoretically pipeline this, but it's pretty insignificant. | ||
return readScripts().then(() => | ||
return (scriptsLoadedPromise = readScripts().then(() => | ||
Promise.all( | ||
Object.keys(shas).map((key) => loadScriptIfMissing(client, key)) | ||
) | ||
); | ||
)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skeggse is this a reasonable way to accomplish this, in lieu of a WeakMap? I see this being done above when reading the scripts in. perhaps this isn't reasonable because it's not done at the Client
level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we'll need to do it per-Queue or per-Client - otherwise we won't correctly load scripts into auxiliary clients.
lib/queue.js
Outdated
@@ -32,6 +32,8 @@ class Queue extends Emitter { | |||
this.bclient = null; | |||
this.eclient = null; | |||
|
|||
this._scriptCachePromise = new WeakMap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect you're already aware but I don't see where this is used.
In light of a potential rewrite, given that Mixmax has successfully accomplished the overarching task that prompted this, and a lack of time on my part to fixup this PR, I'm closing this. Thanks for the time spent reviewing! It has been good for me 💟 |
This PR helps handle script re-instantiation during Redis failover events (in AWS at least.) When swapping instances, the script cache disappears, even while keeping the contents of the Redis cache intact. At present,
bee-queue
only loads scripts during startup, which means that after failover events thisevalsha
call fails for already running bee-queue instances. This PR attempts to catch this error, re-instantiate the scripts on the new Redis server, and try again.An alternative and less obtuse implementation could be to utilize
SCRIPT EXISTS
to check before everyEVALSHA
, but that doubles the round trips to Redis on every run; instead we catch this odd edgecase and only try to re-instantiate them when we encounter it.My javascript isn't great; I'd appreciate any and all checks for correctness, code quality & style, and things I just hadn't considered. As a result, I'm filing this as a draft to receive early feedback on implementation. I'll be testing this out more fully on localhost later; at present,
npm test
passes all 140 tests.