Slave severely backed up #1021

Open
swarupe04 opened this Issue · 4 comments

2 participants

@swarupe04

Hi,
We're hitting issues where, for some reason, a slave gets severely backed up, causing the master host to run out of memory while trying to sync the slave.
Question 1) Why would the master take a lot of memory to sync the slave?
2) Is there a workaround to quickly get the slave back in sync with the master? Does copying the RDB file from the master and restarting the slave work?
3) What causes a slave to severely lag behind, other than network delay? Is there a way we can alert when a slave is behind? (A rough check is sketched below.)
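
For question 3, a minimal monitoring sketch, assuming the redis-py client and direct access to both nodes; the hosts, ports, and thresholds below are illustrative, not values from this thread. The idea is that a backed-up slave shows up on the master as a very long client output list, and on the slave as a stale link to the master:

    import redis

    # Illustrative hosts/ports; adjust for the real deployment.
    master = redis.Redis(host="master.example.com", port=8080)
    slave = redis.Redis(host="slave.example.com", port=8080)

    m_info = master.info()
    s_info = slave.info()

    # On the master, client_longest_output_list grows when a slave stops draining
    # its output buffer; on the slave, master_last_io_seconds_ago grows when it
    # stops hearing from the master.
    backlog = m_info.get("client_longest_output_list", 0)
    link_up = s_info.get("master_link_status") == "up"
    lag = s_info.get("master_last_io_seconds_ago", -1)

    # Example thresholds only: 100k queued replies, 30 seconds of silence.
    if backlog > 100000 or not link_up or lag > 30:
        print("ALERT: master backlog=%s, slave link=%s, last master I/O %ss ago"
              % (backlog, s_info.get("master_link_status"), lag))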

Here's the INFO from a fully synced slave, which uses less than 2 GB, compared to the master, which goes beyond 64 GB while syncing all six slaves and crashes.

There was a previous bug related to this:
#619 (comment)

Slave INFO (fully synced):

redis_version:2.4.14
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:27568
uptime_in_seconds:239347
uptime_in_days:2
lru_clock:107562
used_cpu_sys:1197.35
used_cpu_user:1216.59
used_cpu_sys_children:3569.42
used_cpu_user_children:13466.99
connected_clients:316
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:2061723840
used_memory_human:1.92G
used_memory_rss:2110783488
used_memory_peak:3349234528
used_memory_peak_human:3.12G
mem_fragmentation_ratio:1.02
mem_allocator:jemalloc-2.2.5
loading:0
aof_enabled:0
changes_since_last_save:36362
bgsave_in_progress:0
last_save_time:1364224410
bgrewriteaof_in_progress:0
total_connections_received:579554
total_commands_processed:197659530
expired_keys:0
evicted_keys:0
keyspace_hits:466202737
keyspace_misses:23082066
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:227644
vm_enabled:0
role:slave
master_host:xxxx.web.aol.com
master_port:8080
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
db0:keys=213296,expires=66
db1:keys=1433434,expires=781064
db2:keys=770,expires=769
db3:keys=29798,expires=29636
db4:keys=8603,expires=8076
db5:keys=497488,expires=483108
db6:keys=141984,expires=63844
db8:keys=17643,expires=17627

Master INFO:
redis_version:2.4.14
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:30151
uptime_in_seconds:5806
uptime_in_days:0
lru_clock:107058
used_cpu_sys:1116.05
used_cpu_user:439.48
used_cpu_sys_children:248.77
used_cpu_user_children:763.87
connected_clients:1389
connected_slaves:6
client_longest_output_list:983803
client_biggest_input_buf:13224
blocked_clients:0
used_memory:41542102896
used_memory_human:38.69G
used_memory_rss:41549139968
used_memory_peak:41541833200
used_memory_peak_human:38.69G
mem_fragmentation_ratio:1.00
mem_allocator:jemalloc-2.2.5
loading:0
aof_enabled:1
changes_since_last_save:27795
bgsave_in_progress:0
last_save_time:1364219378
bgrewriteaof_in_progress:0
total_connections_received:93830
total_commands_processed:32045872
expired_keys:1578896
evicted_keys:0
keyspace_hits:14679
keyspace_misses:501
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:595337
vm_enabled:0
role:master
aof_current_size:3410100559
aof_base_size:1817969465
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
slave0:172.29.24.161,18153,online
slave1:172.29.24.161,38085,online
slave2:172.29.108.145,55114,online
slave3:172.29.108.145,34798,online
slave4:172.29.108.145,59060,online
slave5:172.29.24.161,20446,online
db0:keys=213168,expires=66
db1:keys=1309439,expires=654815
db2:keys=1412,expires=1411
db3:keys=26689,expires=26527
db4:keys=7645,expires=7118
db5:keys=529818,expires=515117
db6:keys=140898,expires=62884
db8:keys=8902,expires=8886

@antirez
Owner

Hello, upgrading to the latest 2.6 will help resolve many related issues that could cause this problem.
Ultimately, for some reason your slave no longer accepts updates, probably because of a networking issue on your side. With 2.4 this silently creates all sorts of issues. With 2.6 the problem is detected by setting a suitable max output buffer limit for slaves, and the offending slave is disconnected.

Moreover, with 2.6 you can enable keep-alive on the slave socket connection, which will detect other disconnected-peer issues.

So my suggestions are:

  • Upgrade to 2.6 ASAP.
  • Enable keep-alive on the slave connection.
  • Set a max output buffer limit for slaves (a redis.conf sketch follows below).
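
A minimal redis.conf sketch of those two settings, assuming Redis 2.6; the limits shown are illustrative examples, not recommendations from this thread:

    # Disconnect a slave whose output buffer exceeds 256 MB, or stays above
    # 64 MB for 60 seconds, instead of letting it grow without bound.
    client-output-buffer-limit slave 256mb 64mb 60

    # Send TCP keepalives on client/slave sockets (seconds) so dead peers
    # are detected and their connections closed.
    tcp-keepalive 60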

Please report here what happens :-)
Salvatore

@swarupe04

Hi Salvatore,
If we have to sync a severely backed up slave, as a temporary workaround prior to upgrading to 2.6.11, or even after we upgrade, how can that be achieved?

If we take an RDB dump from the master and restart the slave with it, does that speed up the sync from the master?

Thanks,

Swarup
