Support Sharding : Add a command to split dump.rdb #14

Open
sripathikrishnan opened this Issue Sep 20, 2012 · 1 comment

2 participants

@sripathikrishnan
Owner

Sharding an existing Redis database is a big pain. There isn't a way to do it without data loss. For example, exporting to JSON using any of the tools loses the TTL. There is also no good way to preserve set vs. list semantics. Writing a script to iterate over all keys is doable, but not appealing.

Using the rdbparser, we can split the dump file directly into several shards. This method will retain the data type, the TTL, and the internal representation. It will likely be faster than existing methods, and should also be safer.

For flexibility, we should allow sharding by database, by key, by datatype, or any combination thereof.

Proposed API -

redis-shard dump.rdb config.json
  1. dump.rdb is the input dump file
  2. config.json is a configuration file that declares how to shard the data
  3. Any keys not matching the configuration file will be moved to default-shard.rdb

Sample config file :

[
    {
        "shard-name" : "shard1",
        "db" : [0],
        "keys" : ["user:1.*", "user-friends:1.*"],
        "data-type" : ["dict", "list", "set"]
    },
    {
        "shard-name" : "shard2",
        "keys" : ["user:2.*", "user-friends:2.*"]
    }
]
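
To make the matching rules concrete, here is a rough sketch of how a key could be routed to a shard. Nothing here exists yet: `pick_shard` and `DEFAULT_SHARD` are hypothetical names, and the sketch assumes that the entries in "keys" are regular expressions that must match the whole key, that a field left out of a rule matches anything, and that the first matching rule wins. The actual tool may well decide these differently.

```python
import json
import re

# Assumed name for the fallback shard (default-shard.rdb in the proposal).
DEFAULT_SHARD = "default-shard"


def pick_shard(rules, db, key, data_type):
    """Return the shard name for one key, or DEFAULT_SHARD if no rule matches.

    Assumptions: a field missing from a rule matches anything, "keys" holds
    regular expressions that must match the entire key, first match wins.
    """
    for rule in rules:
        if "db" in rule and db not in rule["db"]:
            continue
        if "data-type" in rule and data_type not in rule["data-type"]:
            continue
        if "keys" in rule and not any(
            re.fullmatch(pattern, key) for pattern in rule["keys"]
        ):
            continue
        return rule["shard-name"]
    return DEFAULT_SHARD


# Same rules as the sample config, written as strict JSON.
rules = json.loads("""
[
    {
        "shard-name": "shard1",
        "db": [0],
        "keys": ["user:1.*", "user-friends:1.*"],
        "data-type": ["dict", "list", "set"]
    },
    {
        "shard-name": "shard2",
        "keys": ["user:2.*", "user-friends:2.*"]
    }
]
""")

print(pick_shard(rules, 0, "user:123", "dict"))
print(pick_shard(rules, 3, "user-friends:2", "set"))
print(pick_shard(rules, 1, "session:abc", "string"))
```

With these assumptions, the first call lands in shard1, the second in shard2 (that rule does not restrict db or data type), and the third falls through to the default shard.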
@yihuang
yihuang commented Aug 21, 2013

A more common situation is re-sharding an already sharded cluster. I don't even know how to do that right.