Unexpected resync with master #841

Closed
wants to merge 1 commit into
from

Projects

None yet

4 participants

@charsyam
Contributor

This issue is based on "redis doesn't close replication fd", so it can call syncWithMaster with previous replication fd.

There are 2 approaches.

1] when client requests "slaveof" command
-> closeing replication fd.

2] when syncWithMaster call
-> checking fd is repl_transfer_s

and if repl_state == REDIS_REPL_CONNECTED , then redis doesn't close replication fd.

@charsyam
Contributor
@jokea
Contributor
jokea commented Jan 11, 2013

The fix is buggy, because server.repl_transfer_s has never been initilized.
The first time a SLAVEOF command is received, server.repl_transfer_s is some
random value, 0 in most cases, so in closeConnectionWithMaster, close(0) will be
called. Any incoming client which takes 0 as it's fd will not get any reply from redis(as
_installWriteEvent() refused to register write event for fd less or equal than 0).

697     +void closeConnectWithMaster() {
698     +    int fd = server.repl_transfer_s;
699     +    if (fd != -1) {
700     +        aeDeleteFileEvent(server.el,fd,AE_READABLE|AE_WRITABLE);
701     +        close(fd);   // Here's the problem
702     +        server.repl_transfer_s = -1;
703     +    }
704     +
@charsyam
Contributor

@jokea. Thank you for you advice. I think you're right. I will fix it soon.

@openbaas openbaas Unexpected resync with master
initializing repl_transfer_s to 0
19d8783
@charsyam
Contributor

@jokea I add to initialize repl_transfer_s to -1.

@antirez
Owner
antirez commented Jan 14, 2013

Why we don't just use undoConnectWithMaster() when SLAVEOF is called but the slave is in one the states CONNECTING or RECEIVE_PONG?

@antirez
Owner
antirez commented Jan 14, 2013

What about my proposed fix f8e40cd ?

@charsyam charsyam closed this Jan 19, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment