redis-trib.rb may allocate the slave and master on the same host #2204

Closed
rj03hou opened this Issue Dec 10, 2014 · 21 comments

@rj03hou

rj03hou commented Dec 10, 2014

I found this in the following deployment:
192.168.1.100: two instances, 6379 and 6380
192.168.1.101: two instances, 6379 and 6380
192.168.1.102: two instances, 6379 and 6380

If I use the following command to create the cluster:

redis-trib.rb create --replicas 1 192.168.1.100:6379 192.168.1.100:6380 192.168.1.101:6379 ... 192.168.1.102:6380

the masters will be the 6379 instances on the three hosts, but the slave of 192.168.1.102:6379 is 192.168.1.102:6380, on the same host.

I read the code of redis-trib.rb, and I think that when we look for a slave for a master we can use each one time and reverse_each the next time to resolve this problem. I have tested my idea with more situations (e.g. three ports on each host) and I think the distribution is better.

    def alloc_slots
        # ...
        masters.each{|m|
            reverse = true
            if reverse
                ips.reverse_each{|ip,nodes_list|
                    next if nodes_list.length == 0
                    # Skip instances with the same IP as the master if we
                    # have some more IPs available.
                    next if ip == m.info[:host] && nodes_count > nodes_list.length
                    slave = nodes_list.shift
                    # ...
                }
            else
                ips.each{|ip,nodes_list|
                    # ...
                }
            end
            reverse = reverse ? false : true
        }
    end

@antirez


antirez commented Dec 16, 2014

Here I invoke @mattsta, who's the last one that touched this code 💃 and may have it fresher in mind than me.

Thanks for reporting.

@badboy


badboy commented Dec 17, 2014

The proposed solution doesn't work: ips is a hash and thus has no guaranteed order, so we can't simply reverse it (well, actually it does have an order, but we cannot rely on it).
We could force the hash into an array, then sort it and reverse it; that seems to work for the setup above. I haven't really understood the whole code yet, so I'll re-read it and try to come up with a good solution.
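
As an illustration only, here is a minimal sketch of that hash-to-array idea with made-up data (not the actual redis-trib structures):

    # ips maps each host to its remaining candidate nodes (hypothetical data).
    ips = {
      "192.168.1.100" => ["192.168.1.100:6380"],
      "192.168.1.101" => ["192.168.1.101:6380"],
      "192.168.1.102" => ["192.168.1.102:6380"],
    }

    # Force the hash into an array of [ip, nodes] pairs with a stable order,
    # then reverse it so a scan starts from the other end of the host list.
    ordered = ips.to_a.sort_by{|ip, _nodes| ip}
    ordered.reverse.each{|ip, nodes| puts "#{ip} -> #{nodes.inspect}"}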

@mattsta


mattsta commented Dec 17, 2014

Yeah, the current redis-trib code does work for cluster master->replica assignment.

The original message didn't specify whether redis-trib configured the cluster poorly or whether the cluster didn't follow the redis-trib plan (we'd need a copy of the full redis-trib output from creation).

There was a bug a while ago where redis-trib would make a cluster map, but the cluster would begin migrating slots before the setup was finished. Even though redis-trib tried to set up a perfect balance, the cluster re-arranged things incorrectly. (That happened a while ago and I think this commit fixed it, but I can't find the GitHub issue that originally reported the problem to verify this was the fix: 0d9bcb1)

@antirez


antirez commented Dec 17, 2014

Thanks, let's see if we get more info from @rj03hou before closing.

@badboy


badboy commented Dec 17, 2014

Current script produces this mapping:

Adding replica 192.168.1.101:6380 to 192.168.1.100:6379
Adding replica 192.168.1.100:6380 to 192.168.1.101:6379
Adding replica 192.168.1.102:6380 to 192.168.1.102:6379

As you can see: 102:6380 is assigned to 102:6379.
Better setup would be:

100:6379 ← 102:6380
101:6379 ← 100:6380
102:6379 ← 101:6380

This diff produces exactly that:

diff --git i/src/redis-trib.rb w/src/redis-trib.rb
index 4002f63..f99044f 100755
--- i/src/redis-trib.rb
+++ w/src/redis-trib.rb
@@ -596,6 +596,7 @@ class RedisTrib

         [:requested,:unused].each{|assign|
             masters.each{|m|
+                reverse = true
                 assigned_replicas = 0
                 while assigned_replicas < @replicas
                     break if nodes_count == 0
@@ -609,7 +610,9 @@ class RedisTrib
                                  "role too (#{nodes_count} remaining)."
                         end
                     end
-                    ips.each{|ip,nodes_list|
+
+                    cur_ips = reverse ? ips.to_a.reverse : ips.to_a
+                    cur_ips.each{|ip,nodes_list|
                         next if nodes_list.length == 0
                         # Skip instances with the same IP as the master if we
                         # have some more IPs available.
@@ -621,6 +624,8 @@ class RedisTrib
                         puts "Adding replica #{slave} to #{m}"
                         break
                     }
+
+                    reverse = !reverse
                 end
             }
         }

I did not test with another (larger) setup yet, and I'm not sure we can handle every possible case without writing confusing, hard-to-maintain code.

Extracted example code: https://gist.github.com/badboy/b59fb31d94909b6bf968

@mattsta


mattsta commented Dec 17, 2014

So, this actually feels like an NP-complete matching problem. We can fake a good solution since we only care about not matching IPs together.

The reverse case fails for a slightly longer test, since it just moves the failure case of "same nodes at the end" to "same nodes in the middle." The following configuration causes duplicate node assignment even though we can still match all nodes to non-same IPs.

  new_node("192.168.1.100", 1),
  new_node("192.168.1.100", 2),
  new_node("192.168.1.100", 3),
  new_node("192.168.1.100", 4),
  new_node("192.168.1.100", 5),

  new_node("192.168.1.101", 1),
  new_node("192.168.1.101", 2),
  new_node("192.168.1.101", 3),
  new_node("192.168.1.101", 4),

  new_node("192.168.1.102", 1),
  new_node("192.168.1.102", 2),
  new_node("192.168.1.102", 3),
  new_node("192.168.1.102", 4),
  new_node("192.168.1.102", 5),

Let's step back and see what we have.

What are we given? We have a hash of {IP, List of Nodes With That IP}. For a proper (simple) matching, what we really want is: a list of alternating nodes across all IPs.

The current processing method forms an in-order traversal:

IP1:0 -> IP1:1 -> IP2:0 -> IP2:1 -> IP3:0 -> IP3:1

The current method says: "if next IP is current IP, skip it and try again." That works (IP1:0 == IP1:1, so use IP1:0 => IP2:0. Now, IP1:1 => IP2:1), but by the end, all we have is IP3:0 and IP3:1 remaining.

What we need: traverse one node from each IP before using the same IP again:

IP1:0 -> IP2:0 -> IP3:0 -> IP1:1 -> IP2:1 -> IP3:1

Now all we have to do is say "use the next available IP in the list", since it won't match the current IP in the list.

Now, we only get master->slave on the same IP if one IP has an even number of nodes provided while all the other IPs have an odd number of nodes provided. We can't stop people from not having enough node diversity, but we can stop the average cases from over-assigning poorly.
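
As a standalone illustration of that traversal, a minimal sketch with made-up host:port strings (not the redis-trib code):

    # Take one node from each IP in turn until every per-IP list is empty.
    ips = {
      "192.168.1.100" => ["100:1", "100:2", "100:3"],
      "192.168.1.101" => ["101:1", "101:2"],
      "192.168.1.102" => ["102:1", "102:2", "102:3"],
    }

    interleaved = []
    until ips.values.all?(&:empty?)
      ips.each{|ip, nodes| interleaved << nodes.shift unless nodes.empty?}
    end
    p interleaved
    # => ["100:1", "101:1", "102:1", "100:2", "101:2", "102:2", "100:3", "102:3"]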

Double check it makes sense (diff based off of the great test utility at https://gist.github.com/badboy/b59fb31d94909b6bf968):

diff --git a/rclus.rb b/rclus.rb
index 258cd60..f13accd 100644
--- a/rclus.rb
+++ b/rclus.rb
@@ -40,16 +40,22 @@ def alloc_slots(nodes, replicas)

   # Select master instances
   puts "Using #{masters_count} masters:"
-  while masters.length < masters_count
-    ips.each{|ip,nodes_list|
-      next if nodes_list.length == 0
-      masters << nodes_list.shift
-      puts masters[-1]
-      nodes_count -= 1
-      break if masters.length == masters_count
-    }
+
+  interleaved = []
+  stop = false
+  while not stop do
+      # Take one node from each IP until we run out of nodes
+      # across every IP.
+      ips.each do |ip,nodes|
+          stop = nodes.empty? and next
+          interleaved.push nodes.shift
+      end
   end

+  masters = interleaved.slice!(0, masters_count)
+  nodes_count -= masters.length
+  puts masters
+
   # Alloc slots on masters
   slots_per_node = ClusterHashSlots.to_f / masters_count
   first = 0
@@ -78,6 +85,8 @@ def alloc_slots(nodes, replicas)
   assignment_verbose = false

   p ips
+  p nodes
+  p interleaved
   [:requested,:unused].each{|assign|
     masters.each{|m|
       reverse = true
@@ -95,21 +104,10 @@ def alloc_slots(nodes, replicas)
           end
         end

-        cur_ips = reverse ? ips.to_a.reverse : ips.to_a
-        cur_ips.each{|ip,nodes_list|
-          next if nodes_list.length == 0
-          # Skip instances with the same IP as the master if we
-          # have some more IPs available.
-          next if ip == m.info[:host] && nodes_count > nodes_list.length
-          slave = nodes_list.shift
-          #slave.set_as_replica(m.info[:name])
-          nodes_count -= 1
-          assigned_replicas += 1
-          puts "Adding replica #{slave} to #{m}"
-          break
-        }
-
-        reverse = !reverse
+        slave = interleaved.shift
+        nodes_count -= 1
+        assigned_replicas += 1
+        puts "Adding replica #{slave} to #{m}"
       end
     }
   }
@@ -117,14 +115,24 @@ end


 nodes = [
-  new_node("192.168.1.100", 6379),
-  new_node("192.168.1.100", 6380),
+  new_node("192.168.1.100", 1),
+  new_node("192.168.1.100", 2),
+  new_node("192.168.1.100", 3),
+  new_node("192.168.1.100", 4),
+  new_node("192.168.1.100", 5),

-  new_node("192.168.1.101", 6379),
-  new_node("192.168.1.101", 6380),
+  new_node("192.168.1.101", 1),
+  new_node("192.168.1.101", 2),
+  new_node("192.168.1.101", 3),
+  new_node("192.168.1.101", 4),
+  new_node("192.168.1.101", 5),
+#  new_node("192.168.1.101", 6),  # <-- that will cause overassignment of 101

-  new_node("192.168.1.102", 6379),
-  new_node("192.168.1.102", 6380),
+  new_node("192.168.1.102", 1),
+  new_node("192.168.1.102", 2),
+  new_node("192.168.1.102", 3),
+  new_node("192.168.1.102", 4),
+  new_node("192.168.1.102", 5),
 ]
 replicas = 1
-alloc_slots(nodes, replicas)
\ No newline at end of file
+alloc_slots(nodes, replicas)
@mattsta


mattsta commented Dec 17, 2014

(Though, that only works for a replica factor of 1. For a replica factor > 1, we do need to reject nodes at assignment time. But it still helps to have one list to search instead of a list per IP. The generalized solution for replica assignment becomes:

-        cur_ips = reverse ? ips.to_a.reverse : ips.to_a
-        cur_ips.each{|ip,nodes_list|
-          next if nodes_list.length == 0
-          # Skip instances with the same IP as the master if we
-          # have some more IPs available.
-          next if ip == m.info[:host] && nodes_count > nodes_list.length
-          slave = nodes_list.shift
-          #slave.set_as_replica(m.info[:name])
-          nodes_count -= 1
-          assigned_replicas += 1
-          puts "Adding replica #{slave} to #{m}"
-          break
-        }
-
-        reverse = !reverse
+        # Return the first node not matching our current master
+        node = interleaved.find{|n| n.info[:host] != m.info[:host]}
+
+        # If we found a node, use it as a best-first match.
+        # else, we didn't find a not-this-master-IP node, so just use
+        # the next node even though it's already on this master's IP.
+        if node
+            slave = node
+            interleaved.delete node
+        else
+            slave = interleaved.shift
+        end
+        nodes_count -= 1
+        assigned_replicas += 1
+        puts "Adding replica #{slave} to #{m}"

)
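
A minimal standalone sketch of that selection step, using made-up strings instead of the redis-trib node objects (assuming the host is the part before the colon):

    interleaved = ["B:6380", "A:6380", "B:6381", "C:6380"]
    master_host = "B"

    # Prefer the first candidate whose host differs from the master's host;
    # fall back to the head of the list if every remaining node shares it.
    node  = interleaved.find{|n| n.split(":").first != master_host}
    slave = node ? interleaved.delete(node) : interleaved.shift
    p slave        # => "A:6380"
    p interleaved  # => ["B:6380", "B:6381", "C:6380"]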

@rj03hou


rj03hou commented Dec 18, 2014

Thanks for your reply and the good work; we have deployed three Redis clusters in production.
Once Redis Cluster is released, we will deploy more clusters.

I don't have that info any more; we could add a test case for this situation. We suggested the reverse method because we think it is the smallest change to the code. mattsta's method is more complicated and I don't fully understand it, but I guess it's better than the reverse approach; if it is tested against the common situations, we can close this.

@mariano-perez-rodriguez


mariano-perez-rodriguez commented Dec 18, 2014

Just a quick note. If I understand correctly, the problem is to assign n slaves to each master (if this is not the case, please ignore what follows :-P).

This could be modelled as a min-cut/max-flow problem as follows:

- You're given m masters M1, M2, ..., Mm, and an integer number n < m.
- Build a (weighted and directed) graph G having 2m + 2 nodes: call one of them "source", another one "drain", label m of them M1, M2, ..., Mm, and call the last m of them M1', M2', ..., Mm'; connect the nodes thus:
  a. Place edges of weight n from "source" to each of M1, M2, ..., Mm.
  b. Place edges of infinite weight from each of M1', M2', ..., Mm' to "drain" (but see the note below!!!)
  c. Place edges of weight 1 from each Mi to each Mj' whenever i != j.
- Now run a (polynomial) min-cut/max-flow algorithm like Ford-Fulkerson/Edmonds-Karp and look at the total flow (call it f) and the edges selected between the nodes M1, M2, ..., Mm and M1', M2', ..., Mm': if f equals m·n (meaning the algorithm could assign m·n master/slave relations), then for each selected edge between Mi and Mj', have j be a slave of i (actually a replica or something; I didn't yet have the time to get the Redis Cluster terminology right).

NOTE: if the edges from each of M1', M2', ..., Mm' to "drain" instead had weight n, then each node would be the slave/replica of at most n masters.

This looks involved, but it's not terribly horrible to implement and I'm sure there are Ruby implementations of FFEK around :-)

Regards!
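
For illustration only, a rough sketch of how that flow network could be built (hypothetical Ruby with a plain capacity hash; the max-flow solver itself is omitted and none of this is part of redis-trib):

    masters = ["M1", "M2", "M3"]
    n = 1                                 # replicas wanted per master
    INF = Float::INFINITY

    capacity = Hash.new{|h, k| h[k] = {}}
    masters.each do |mi|
      capacity["source"][mi] = n          # source -> Mi, weight n
      capacity[mi + "'"]["drain"] = INF   # Mi' -> drain (see the note above)
      masters.each do |mj|
        next if mi == mj
        capacity[mi][mj + "'"] = 1        # Mi -> Mj', weight 1, whenever i != j
      end
    end
    # A max flow of masters.length * n means every master can be paired with
    # replicas hosted elsewhere; the saturated Mi -> Mj' edges give the pairing.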

mattsta added a commit to mattsta/redis that referenced this issue Dec 20, 2014

Improve redis-trib replica assignment
This tiny bit of code has gone through so many revisions.  Hopefully
it's more correct now.

Fixes antirez#2204

@mattsta


mattsta commented Dec 20, 2014

If I understand correctly, the problem is to assign n slaves to each master

The problem isn't counting the assignments, it's just not assigning replicas on the same master host (as best we can). There's no weight or score between the nodes here, so ideally we need a "global planner" to figure out the best options for all master-replica combinations. The approach I outlined above is a quick way to make sure adjacent assignments are as non-duplicate as possible.

I integrated my solution into redis-trib at #2227

@mariano-perez-rodriguez


mariano-perez-rodriguez commented Dec 20, 2014

I wasn't talking about counting the assignments; the procedure I gave above will give you one such assignment: by looking at the edges used, you can determine the assignment itself :)

@mattsta


mattsta commented Dec 20, 2014

Yup, but typical node assignment is between 3 and 100 nodes, so pulling in a larger method is a bit overkill. :)

Though—redis-trib.rb is a reference implementation of cluster management functions. If there are better ways, write them and people will use them. :shipit:

@mariano-perez-rodriguez


mariano-perez-rodriguez commented Dec 20, 2014

Agreed! Thanks for considering it, though :)

mattsta added a commit that referenced this issue Jan 8, 2015

Improve redis-trib replica assignment
This tiny bit of code has gone through so many revisions.  Hopefully
it's more correct now.

Fixes #2204
@Annapoornar


Annapoornar commented Nov 22, 2016

Hi, I have a similar problem. My Redis Cluster setup is like this:

10.2.2.161: two instances, 6379 and 6380
10.2.2.11: two instances, 6379 and 6380
10.2.2.149: two instances, 6379 and 6380

and I want the master/slave combination to be:
161:6379 ← 11:6380
11:6379 ← 149:6380
149:6379 ← 161:6380

I used the following command to create the cluster:
src/redis-trib.rb create --replicas 1 10.2.2.161:6379 10.2.2.11:6379 10.2.2.149:6379 10.2.2.11:6380 10.2.2.149:6380 10.2.2.161:6380

Then, when I check the CLUSTER NODES output, I see the cluster has been set up as below, which is not what I was expecting. Please correct me if the order of the nodes given to create is wrong.

16c0a014f91b72bcb4426285aea77b0c0e1aaeac 10.2.2.161:6379 myself,master - 0 0 1 connected 0-5460
42202165db4cc660496afb6d8c548782076bc60a 10.2.2.149:6379 master - 0 1479798899058 3 connected 10923-16383
a4776993f98ef476e8374a83ea1a0ab53b0593ad 10.2.2.11:6379 master - 0 1479798900059 2 connected 5461-10922
592cd006659dd6b549f55c8affcf65205739dccb 10.2.2.149:6380 slave 42202165db4cc660496afb6d8c548782076bc60a 0 1479798901061 5 connected
6b03fc8305ab15d2dc07b3fc51409cadb2e9fcb3 10.2.2.11:6380 slave 16c0a014f91b72bcb4426285aea77b0c0e1aaeac 0 1479798900361 4 connected
23c4b2486b8ef910bf4975582992e5647b9a53a5 10.2.2.161:6380 slave a4776993f98ef476e8374a83ea1a0ab53b0593ad 0 1479798902063 6 connected

@vladcenan


vladcenan commented Dec 4, 2016

I have the same issue; this was not solved for replica factor 1, as it creates the last master and its slave on the same node/IP.

@lahngp


lahngp commented Jan 19, 2017

I came across this problem too. I was going to set up a cluster with replicas set to 1 on 3 machines, each machine hosting two Redis instances (6 instances in total), hoping that the failure of any one of the 3 machines would not affect the cluster. But after the cluster was up, I found a master and its slave on the same machine. If that machine went down, the whole cluster would be down. So I chose to add the slaves manually, as shown below:

# [cluster] step by step (you may also create the cluster with the create-cluster script, which is useful for debugging)
# apt-get install ruby ruby-dev
# gem install redis
REDIS_HOME=/opt/redis-3.2.6
trib=$REDIS_HOME/src/redis-trib.rb
host1=192.168.1.61
host2=192.168.1.67
host3=192.168.1.108
comm='--daemonize yes --bind 0.0.0.0 --appendonly yes --cluster-enabled yes --cluster-node-timeout 5000'

# 1. Start the servers (run on every machine; each physical server acts as a master and as a slave of another master)
redis-server $comm --port 7001 --cluster-config-file nodes-7001.conf
redis-server $comm --port 7002 --cluster-config-file nodes-7002.conf

# 2. Create the cluster (run on one of the servers)
$trib create --replicas 0 $host1:7001 $host2:7001 $host3:7001
## $trib create --replicas 1 $host1:7001 $host2:7001 $host3:7001 $host1:7002 $host2:7002 $host3:7002
##   (this command may lead to a master and a slave on the same machine)

# 3. Check
redis-cli -c -p 7001 CLUSTER NODES
redis-cli -c -p 7001 SET foo bar   # the keys foo, aa and bb will span 3 different masters and help debugging
redis-cli -c -p 7001 SET aa 11
redis-cli -c -p 7001 SET bb 22

# 4. Add the slaves manually (run against one of the masters)
MEET=$host1:7001
master1=03521c832d7023f5d6461e1809f271b01b02ed52   # the master on host1
master2=80327e5dabeb8f6a1be7b91093a29f40ba2c11be   # the master on host2
master3=79428c877c1a835997fbe628fe2068197504c743   # the master on host3
$trib add-node --slave --master-id $master2 $host1:7002 $MEET
$trib add-node --slave --master-id $master3 $host2:7002 $MEET
$trib add-node --slave --master-id $master1 $host3:7002 $MEET

# 5. Check: shut down host3 first, then verify that all of the following commands still return a result
redis-cli -c -p 7001 get foo
redis-cli -c -p 7001 get aa
redis-cli -c -p 7001 get bb

@sidd081


sidd081 commented Jan 30, 2017

@vladcenan
I made a small change in redis-trib.rb and it worked fine for me in most cases. In def alloc_slots in the script, after masters.each{|m| puts m}, add one more line: interleaved.push interleaved.shift

    ....
    masters = interleaved.slice!(0, masters_count)
    nodes_count -= masters.length

    masters.each{|m| puts m}
    interleaved.push interleaved.shift
    ....
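
To illustrate what that one-line rotation does, a minimal sketch with made-up host:port strings (three hosts, replicas = 1):

    interleaved = ["A:1", "B:1", "C:1", "A:2", "B:2", "C:2"]
    masters = interleaved.slice!(0, 3)    # => ["A:1", "B:1", "C:1"]

    interleaved.push interleaved.shift    # ["A:2", "B:2", "C:2"] -> ["B:2", "C:2", "A:2"]
    # Replicas are now taken starting from "B:2", so the first master "A:1"
    # no longer picks up the replica that lives on its own host A.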

@vladcenan


vladcenan commented Jan 31, 2017

@sidd081 thanks for the solution; it worked for all my environments. Please create a pull request with this solution, it will help others too.

@sidd081


sidd081 commented Feb 1, 2017

@vladcenan Sure, will do that. Please also check out https://github.com/sidd081/Analyse-Redis-Cluster-nodes for analysing a Redis cluster.

@sidd081


sidd081 commented Feb 3, 2017

@spccold


spccold commented Mar 13, 2017

@antirez can redis-trib.rb provide some way to let us define the exact layout of a fresh Redis cluster?
If redis-trib.rb could use a configuration file to initialize a fresh cluster, that would be great!

configuration file example:

master1:slave1,slave2
master2:slave1
master3:slave1
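
Such a config file isn't supported by redis-trib.rb; purely as a sketch, the proposed format could be parsed along these lines (hypothetical file name and code):

    layout = {}
    File.readlines("cluster-layout.conf").each do |line|
      master, slaves = line.strip.split(":", 2)
      next if master.nil? || master.empty?
      layout[master] = slaves.to_s.split(",")
    end
    p layout
    # => {"master1"=>["slave1", "slave2"], "master2"=>["slave1"], "master3"=>["slave1"]}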