Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

min-slaves-to-write configuration on a slave breaks replication #1434

Closed
shanemadden opened this issue Dec 5, 2013 · 12 comments
Closed

min-slaves-to-write configuration on a slave breaks replication #1434

shanemadden opened this issue Dec 5, 2013 · 12 comments

Comments

@shanemadden
Copy link

@shanemadden shanemadden commented Dec 5, 2013

In Redis 2.8.2 slaves which have min-slaves-to-write set in their config file (I was putting it in there for when that slave gets promoted to master - I would guess that it'd be inert on a slave, but hopefully active once that slave became a master?) seem to have problems with accepting replication of writes from the master.

Master and two slaves set up and running..

root@redis3:~# redis-cli info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.99.82,port=6379,state=online,offset=2973,lag=1
slave1:ip=192.168.99.81,port=6379,state=online,offset=2973,lag=1
master_repl_offset:2973
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:2972

root@redis3:~# redis-cli info keyspace
# Keyspace
db0:keys=1,expires=0,avg_ttl=0

One slave, redis1, has min-slaves-to-write 1 in its config file. redis2 does not. Otherwise their configurations are identical.

On startup, both slaves do their initial sync great.. but redis1 has a slightly different output for info replication (with min_slaves_good_slaves:0):

root@redis2:~# redis-cli info replication
# Replication
role:slave
master_host:192.168.99.83
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:2973
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

root@redis2:~# redis-cli info keyspace
# Keyspace
db0:keys=1,expires=0,avg_ttl=0


root@redis1:~# redis-cli info replication
# Replication
role:slave
master_host:192.168.99.83
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:2973
slave_priority:100
slave_read_only:1
connected_slaves:0
min_slaves_good_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

root@redis1:~# redis-cli info keyspace
# Keyspace
db0:keys=1,expires=0,avg_ttl=0

A key change on the master is replicated to redis2 just fine, but not to redis1, despite it still being apparently synced:

root@redis3:~# redis-cli set foo 1
OK

root@redis2:~# redis-cli get foo
"1"

root@redis1:~# redis-cli get foo
(nil)

root@redis3:~# redis-cli info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.99.82,port=6379,state=online,offset=66350,lag=0
slave1:ip=192.168.99.81,port=6379,state=online,offset=66350,lag=0
master_repl_offset:66350
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:66349
root@redis3:~# redis-cli info keyspace
# Keyspace
db0:keys=2,expires=0,avg_ttl=0

root@redis2:~# redis-cli info replication
# Replication
role:slave
master_host:192.168.99.83
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:66350
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
root@redis2:~# redis-cli info keyspace
# Keyspace
db0:keys=2,expires=0,avg_ttl=0

root@redis1:~# redis-cli info replication
# Replication
role:slave
master_host:192.168.99.83
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:66488
slave_priority:100
slave_read_only:1
connected_slaves:0
min_slaves_good_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
root@redis1:~# redis-cli info keyspace
# Keyspace
db0:keys=1,expires=0,avg_ttl=0

root@redis3:~# redis-cli flushall

root@redis2:~# redis-cli info keyspace
# Keyspace

root@redis1:~# redis-cli info keyspace
# Keyspace
db0:keys=1,expires=0,avg_ttl=0

A config rewrite on the malfunctioning slave removes the min-slaves-to-write 1 line, then a restart of the process gets it back into a normal replication state.

@rhoml
Copy link
Contributor

@rhoml rhoml commented Jun 5, 2014

I have the same issue on redis 2.8.8, the weird escenario is that data replicates when you attach the nodes to the master node, and then sync stops.

A work around to fix this is to use the reconfigure script from sentinel so when promoting a new master it sets the 'min-slaves-to-write 1'.

@charsyam
Copy link
Contributor

@charsyam charsyam commented Jun 5, 2014

@rhoml @shanemadden
I just search code. and if repl_good_slaves_count < repl_min_slaves_to_write then
it rejects all write_command.

so. I might think. one of them is not good slave. so if 1 < 1 will not accept write_commands
it checks good slave

void refreshGoodSlavesCount(void) {
    listIter li;
    listNode *ln;
    int good = 0;

    if (!server.repl_min_slaves_to_write ||
        !server.repl_min_slaves_max_lag) return;

    listRewind(server.slaves,&li);
    while((ln = listNext(&li))) {
        redisClient *slave = ln->value;
        time_t lag = server.unixtime - slave->repl_ack_time;

        if (slave->replstate == REDIS_REPL_ONLINE &&
            lag <= server.repl_min_slaves_max_lag) good++;
    }
    server.repl_good_slaves_count = good;
}

@rhoml
Copy link
Contributor

@rhoml rhoml commented Jun 5, 2014

@charsyam that is exactly what is happening, if you have a cluster of 1 redis master and 2 redis slaves with a min-slaves-to-write 1 on each node, replication won't work because slaves doesn't comply with the min-slaves-to-write 1 so replication will fail.

Either this option is not meant to be written on the slaves so that why I'll do the workaround with sentinel or it shouldn't apply for slaves so the code must be aware of the redis node role when deciding of can replicate.

@antirez
Copy link
Contributor

@antirez antirez commented Jun 5, 2014

Horrible bug, fixing.

@rhoml
Copy link
Contributor

@rhoml rhoml commented Jun 5, 2014

@antirez thank you.

antirez added a commit that referenced this issue Jun 5, 2014
Replication is totally broken when a slave has this option, since it
stops accepting updates from masters.

This fixes issue #1434.
antirez added a commit that referenced this issue Jun 5, 2014
Replication is totally broken when a slave has this option, since it
stops accepting updates from masters.

This fixes issue #1434.
antirez added a commit that referenced this issue Jun 5, 2014
Replication is totally broken when a slave has this option, since it
stops accepting updates from masters.

This fixes issue #1434.
@antirez
Copy link
Contributor

@antirez antirez commented Jun 5, 2014

Fixed and added tests for min-slaves-to-write. It is very bad that I didn't noticed this issue when @shanemadden originally reported it 4 months ago. Redis 2.8.10 will go live today.

@antirez antirez closed this Jun 5, 2014
@antirez
Copy link
Contributor

@antirez antirez commented Jun 5, 2014

You are welcome, 2.8.10 is live if you want to upgrade.

@rhoml
Copy link
Contributor

@rhoml rhoml commented Jun 5, 2014

@antirez awesome, thanks this was fast.

@rhoml
Copy link
Contributor

@rhoml rhoml commented Jun 11, 2014

@antirez How will this behave when you have a cluster like M - S - S? Will the first slave will still be affected by the min-slaves-to-write?

@antirez
Copy link
Contributor

@antirez antirez commented Jun 11, 2014

No, it will not because, to start, slaves are by default read-only, so it is basically meaningless most of the times, and also, slaves totally ignore the option now, regardless of the fact they have additional slaves or not.

@rhoml
Copy link
Contributor

@rhoml rhoml commented Jun 11, 2014

@antirez awesome, that's what I expected. Thanks.

@Natarajan77
Copy link

@Natarajan77 Natarajan77 commented Apr 10, 2018

minslavestowrite
maxlagnumberofsecs

we set on master and slaves in the conf file. Do we need to set in sentinel nodes also?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants