Replication timeout is responsibility of the slave alone, should we try this? #918
Since the master -> slave channel is one-way in Redis, currently replication timeout is only up to the slave. Basically this is what happens:
Because of this, there is always data going from master to slave, so the slave will be able to detect when the connection is down and reconnect if possible.
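As a rough illustration of the slave-side check described above, here is a minimal Python sketch (not Redis source code). The class name `SlaveLink`, its methods, and the injectable `now` clock are all hypothetical names chosen for this example; the only idea taken from the discussion is that the slave remembers when it last received data from the master and treats a long silence as a dead link.

```python
import time


class SlaveLink:
    """Hypothetical slave-side view of the link to its master."""

    def __init__(self, timeout_seconds, now=time.monotonic):
        # 'now' is injectable so the logic can be exercised without sleeping.
        self.timeout_seconds = timeout_seconds
        self._now = now
        self.last_io = now()

    def on_data_from_master(self):
        # Any traffic from the master (commands or PINGs) refreshes the timer.
        self.last_io = self._now()

    def timed_out(self):
        # True when nothing has arrived for longer than the timeout,
        # at which point the slave would drop the link and reconnect.
        return self._now() - self.last_io > self.timeout_seconds
```

The check only works because the master is expected to send something periodically; without that steady traffic, silence would be indistinguishable from an idle dataset.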
On the master side, if we don't get an error from the socket, we keep sending data. The output buffer can get too large, but the limits in Redis 2.6 will detect that and close the connection with the slave.
However, there is currently no way for the master, in the absence of socket errors, to detect that the slave is down and close the connection earlier. This is usually not a big problem, but I'm opening this issue to get some feedback and to reconsider the matter before 2.8 final.
Hello @charsyam, sorry, I'm not sure I understand your message. Slaves of course reconnect automatically in every case, as they handle the timeout explicitly. Here the problem is only that a dead slave can still be seen by the master as active.
About the patch: thanks, that's appreciated, but I assigned the issue to myself because when things are very critical I usually try to address them myself. It is very little code and mostly design, so reviewing a patch tends to be as time-consuming as writing the code, or more ;-)
Yes, there are a number of solutions. We already send PINGs to slaves, suppressing the reply. It is possible to enable replies just for slaves and implement a master-driven timeout as well. But I'm not convinced, so I opened this issue just as a reminder to evaluate this matter in the future. Cheers.
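To make the master-driven timeout idea concrete, here is a minimal Python sketch under stated assumptions; it is not how Redis implements anything. It assumes slaves are made to reply to the master's PINGs, and that the master records the time of each reply. `MasterSlaveTracker`, its method names, and the slave identifiers are hypothetical.

```python
import time


class MasterSlaveTracker:
    """Hypothetical master-side bookkeeping of the last reply per slave."""

    def __init__(self, timeout_seconds, now=time.monotonic):
        # 'now' is injectable so the logic can be exercised without sleeping.
        self.timeout_seconds = timeout_seconds
        self._now = now
        self.last_reply = {}

    def register(self, slave_id):
        # A newly attached slave starts with a fresh timer.
        self.last_reply[slave_id] = self._now()

    def on_reply(self, slave_id):
        # Called when a slave answers a PING (assumes replies are enabled).
        if slave_id in self.last_reply:
            self.last_reply[slave_id] = self._now()

    def expired(self):
        # Slaves that have been silent for longer than the timeout.
        now = self._now()
        return sorted(
            slave_id
            for slave_id, seen in self.last_reply.items()
            if now - seen > self.timeout_seconds
        )

    def drop_expired(self):
        # Forget expired slaves; a real master would also close their sockets.
        dead = self.expired()
        for slave_id in dead:
            del self.last_reply[slave_id]
        return dead
```

With something like this, a slave that dies without producing a socket error would be dropped once its replies stop, instead of lingering until the output buffer limit is hit.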