added cluster docs
Michał Siatkowski committed Sep 2, 2019
1 parent 6a91073 commit 1c54555
Showing 2 changed files with 39 additions and 21 deletions.
21 changes: 21 additions & 0 deletions README.md
@@ -234,6 +234,27 @@ def scatterGatherFirstWithList(opsPerClient: Int)(implicit clients: RedisClientP

See an example implementation using Akka at https://github.com/debasishg/akka-redis-pubsub.

## RedisCluster

`RedisCluster` uses data sharding (partitioning), which splits all data across the available Redis instances
so that every instance contains only a subset of the keys. This mitigates data growth: adding more
instances divides the data into smaller parts (shards or partitions).

`RedisCluster` lets the user pass a `KeyTag`, which helps distribute keys according to special
requirements. Otherwise the node is selected by hashing the whole key with the `CRC-32` function.
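
As a rough illustration of the idea, here is a minimal, self-contained sketch. The trait is repeated locally so the snippet compiles on its own; `BraceKeyTag`, `WholeKeyTag`, `hashOf` and `nodeIndex` are hypothetical names, and the plain modulo placement is an assumption made for brevity, so the library's actual node selection may differ.

```scala
import java.util.zip.CRC32

object KeyTagSketch {
  // Stand-in for the library's KeyTag trait, repeated so the sketch compiles on its own.
  trait KeyTag {
    def tag(key: Seq[Byte]): Option[Seq[Byte]]
  }

  // Hypothetical tag: hash only the bytes between the first '{' and the next '}'.
  object BraceKeyTag extends KeyTag {
    def tag(key: Seq[Byte]): Option[Seq[Byte]] = {
      val open = key.indexOf('{'.toByte)
      val close = key.indexOf('}'.toByte, open + 1)
      if (open >= 0 && close > open + 1) Some(key.slice(open + 1, close)) else None
    }
  }

  // Hypothetical tag that never matches, so the whole key is always hashed.
  object WholeKeyTag extends KeyTag {
    def tag(key: Seq[Byte]): Option[Seq[Byte]] = None
  }

  // CRC-32 of the tagged part, or of the whole key when tag returns None.
  def hashOf(key: String, keyTag: KeyTag): Long = {
    val bytes = key.getBytes("UTF-8").toSeq
    val payload: Array[Byte] = keyTag.tag(bytes).getOrElse(bytes).toArray
    val crc = new CRC32
    crc.update(payload)
    crc.getValue()
  }

  // Assumption: simple modulo placement over nodeCount nodes, used only for illustration.
  def nodeIndex(key: String, keyTag: KeyTag, nodeCount: Int): Int =
    (hashOf(key, keyTag) % nodeCount).toInt
}
```

Under these assumptions, `this{foo}key` and `another{foo}key` hash the same bytes (`foo`) with `BraceKeyTag`, so they get the same node index for any node count.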

`RedisCluster` also allows the set of nodes to be modified dynamically with the `addServer`, `replaceServer` and `removeServer`
methods. Note that data on a disconnected node is lost immediately.
Moreover, since modifying the cluster changes the key distribution, some of the data scattered
across the cluster could be lost as well.
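
To see why changing the node set affects keys that live on untouched nodes, consider the following sketch. It assumes a simple CRC-32 modulo placement as a stand-in; `RemapSketch`, `nodeIndex` and `movedKeys` are hypothetical names, and the library's own placement scheme may move a different share of keys.

```scala
import java.util.zip.CRC32

object RemapSketch {
  // CRC-32 of the whole key, as an unsigned 32-bit value held in a Long.
  private def crc32(key: String): Long = {
    val crc = new CRC32
    crc.update(key.getBytes("UTF-8"))
    crc.getValue()
  }

  // Assumption: modulo placement, chosen for brevity rather than taken from the library.
  def nodeIndex(key: String, nodeCount: Int): Int =
    (crc32(key) % nodeCount).toInt

  // Keys whose target node differs between the two cluster sizes.
  def movedKeys(keys: Seq[String], before: Int, after: Int): Seq[String] =
    keys.filter(k => nodeIndex(k, before) != nodeIndex(k, after))

  def main(args: Array[String]): Unit = {
    val keys = (0 until 1000).map(i => s"key-$i")
    val moved = movedKeys(keys, before = 3, after = 4)
    println(s"${moved.size} of ${keys.size} keys map to a different node after adding a fourth node")
  }
}
```

A key that now maps to a different node is looked up where it was never written, so from the client's point of view its value is gone even though the old node still holds it.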

To handle node downtime automatically, disconnecting an offline node and reconnecting it once it comes back up,
there is a `Reconnectable` trait. To enable this behaviour, mix it into a `RedisCluster` instance:
```
new RedisCluster(nodes, Some(NoOpKeyTag)) with Reconnectable
```
You can observe its behaviour in the `ReconnectableSpec` test.

## License

This software is licensed under the Apache 2 license, quoted below.
39 changes: 18 additions & 21 deletions src/main/scala/com/redis/cluster/KeyTag.scala
@@ -1,31 +1,28 @@
package com.redis.cluster

/**
* <p>
* Consistent hashing distributes keys across multiple servers. But there are situations
* like <i>sorting</i> or computing <i>set intersections</i> or operations like <tt>rpoplpush</tt>
* in redis that require all keys to be collocated on the same server.
* <p/>
* One of the techniques that redis encourages for such forced key locality is called
* <i>key tagging</i>. See <http://code.google.com/p/redis/wiki/FAQ> for reference.
* <p/>
* The trait <tt>KeyTag</tt> defines a method <tt>tag</tt> that takes a key and returns
* the part of the key on which we hash to determine the server on which it will be located.
* If it returns <tt>None</tt> then we hash on the whole key, otherwise we hash only on the
* returned part.
* <p/>
* redis-rb implements a regex based trick to achieve key-tagging. Here is the technique
* explained in redis FAQ:
* <i>
* A key tag is a special pattern inside a key that, if preset, is the only part of the key
* hashed in order to select the server for this key. For example in order to hash the key
* "foo" I simply perform the CRC32 checksum of the whole string, but if this key has a
* pattern in the form of the characters {...} I only hash this substring. So for example
* for the key "foo{bared}" the key hashing code will simply perform the CRC32 of "bared".
* This way using key tags you can ensure that related keys will be stored on the same Redis
* instance just using the same key tag for all this keys. Redis-rb already implements key tags.
* </i>
* </p>
* <p>
* One of the techniques that redis encourages for such forced key locality is called <i>key tagging</i>.
* See <a href="https://redis.io/topics/cluster-tutorial#redis-cluster-data-sharding">Redis Cluster data sharding</a>
* for reference.
* </p>
* <p><i>
* (...) but the gist is that if there is a substring between {} brackets in a key, only what is inside the string
* is hashed, so for example this{foo}key and another{foo}key are guaranteed to be in the same hash slot,
* and can be used together in a command with multiple keys as arguments.
* </i></p>
*/
trait KeyTag {

/**
* Takes a key and returns the part of the key on which we hash to determine the server on which it will be located.
* If it returns <tt>None</tt> then we hash on the whole key, otherwise we hash only on the returned part.
*/
def tag(key: Seq[Byte]): Option[Seq[Byte]]
}

@@ -46,7 +43,7 @@ object KeyTag {
}

object NoOpKeyTag extends KeyTag {
- def tag(key: Seq[Byte]) = Some(key)
+ def tag(key: Seq[Byte]): Option[Seq[Byte]] = None
}

}
