-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use the async api when a connection error triggers a slot update #144
Conversation
1ddb360
to
96b9389
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good! A few comments and questions.
hircluster.c
Outdated
acc->update_route_time = hi_usec_now() + SLOTMAP_UPDATE_THROTTLE_USEC; | ||
|
||
if (r == NULL) { | ||
/* Ignore failures, next random node is hopefully working */ | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we set update_route_time even if r == NULL?
Shouldn't we trigger another slot update with a different node here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bit simplified as a start.
Hiredis calls the callback-function with a NULL reply when either the command has timedout, or when there is connection errors.
The scenario I thought of:
If a slot map update don't find any connected node it will take a random node and start an async connect, it will also send/give the CLUSTER SLOTS command to hiredis for it to send it when the connection is up.
If the connect fails this callback is called with a NULL reply, due to the outstanding CLUSTER SLOTS command.
If all redis nodes are gone (like in the issue we have) I was afraid slotmap updates would go into a busy-loop.
One option could be to only trigger a new slotupdate to one of connected nodes, if that exists.
This would speedup updates for timeout cases, or temporary node issues.
..or some bookkeeping to avoid busy-looping cases..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want a loop until we have a slot map, but wait one second between each attempt. (That's what we do in ered.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Waiting a second is the problem here, hiredis owns the eventloop and its timers.
We need to add a interface and functionality to each event-adapter to be able to kick-off timers in each eventsystem (or have hiredis expose something).
The modularity and being able to choose an own eventsystem its a bit annoying :-)
We have users that has implemented an own eventsystem as well, so updating existing adapters might not be enough for everybody.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is now updated to retry until all nodes has been attempted (by using a lastConnectionAttempt and throttling).
Since there are no timerhandling available yet to hiredis-cluster the first slotmap update attempt is triggered by the user via a command, the retries are triggered by the NULL-reply callback.
Breakout of a new function that also will be reused for the async api.
The attempt to update the slot map will be immediate.
Make sure we throttle the slot map update to 1/sec after communications errors. Update testcase timeouts to include slotmap updates.
Make sure that a connection to the first node exists so that CLUSTER SLOTS is sent to that node instead of a random node.
Let hiredis cleanup after a disconnect like it does after a failure, i.e call unlinkAsyncContextAndNode(). This fixes a thread-sanitizer issue.
Add a timestamp to each redisClusterNode to indicate when the last connection attempt was performed to this node.
When a command response indicates a communication error the slot map is updated. This is now updated using the async api, which avoids blocking calls to connect when querying the cluster node. Primarily select a connected node by picking one of four first found connected nodes. If there are no connected nodes then pick a node where a connect has not been attempted within throttle time (1 sec). cc->update_route_time is used to throttle the slotmap update and to make sure we only have one ongoing query at a time (SLOTMAP_UPDATE_ONGOING).
daa9bde
to
c83f290
Compare
Add missing srandom() and random()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much nicer now, to use an existing connection.
Is there any way to see how many requests are pending on a connection? Just thinking about the case the CLUSTER SLOTS will be added to a very long pipeline and therefore be delayed... Maybe not a valid concern?
Ouch, the randomness makes the simulated-redis tests a bit harder to handle.. |
Right :-) We can inject some fake-randomness just for testing, e.g. using an ifdef or a known fixed random seed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty clean algorithm now. I like it. However, I found a corner case that might need to be addressed.
When a command response indicates a communication error the slot map is updated.
This is now updated using the async api to avoid blocking calls to connect when querying a cluster node.
This fixes the problem with hanging connects that blocks the event handling system.
Changed behaviors
Previously communication errors were counted. When the client had received more errors that the configured max_retry_count value the Redis configuration "cluster-node-time" was fetched from a cluster node. This configured value was then used to determine when to perform a slotmap update. When an additional error was received after the time-to-wait the slotmap update procedure started. This procedure used blocking calls on a new TCP connection.
After this PR a communication error triggers the slotmap update procedure directly.
Primarily a connected node is selected that is found close to a randomly picked index of all known nodes.
The random index should give a more even distribution of selected nodes.
If no connected node is found while iterating to this index the remaining nodes are also checked until a connected node is found. If no connected node is found; a node close o the picked index, for which a connection establishment has not been attempted within throttle-time, is selected.
The commands are sent using the async api to avoid blocking sends (or connects). During the time the slotmap update procedure runs and until a second after it is finish other sent commands that triggers communication errors/timeouts will not start additional slotmap updates, ie the slotmap update is throttled.
Other:
Using async-api during MOVED should be implemented as well, but done in other PR.
Fixes #142