Sharding an existing Redis database is a big pain. There isn't a way to do it without data loss. For example, exporting to JSON with any of the existing tools loses the TTL. There is also no good way to preserve set vs. list semantics. Writing a script to iterate over all keys is doable, but not appealing.
Using the rdbparser, we can split the dump file directly into several shards. This method retains the data type, the TTL, and the internal representation. It will likely be faster than existing methods, and should also be safer.
For flexibility, we should allow sharding by database, by key, by datatype, or any combination thereof.
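To make "any combination thereof" concrete, here is a minimal Python sketch of how a record could be matched against shard rules. None of this exists in the tool today. The function names, the first-match-wins policy, the list form of "db", and the use of prefix-anchored regular expressions for "keys" are all assumptions made for illustration. A criterion that is left out of a rule is treated as "match everything", and a record must satisfy every criterion that is present.

```python
import re
from typing import Optional

def rule_matches(rule: dict, db: int, key: str, data_type: str) -> bool:
    """Return True when the record satisfies every criterion present in the rule.

    Assumed semantics: a missing criterion matches everything, values inside
    one criterion are OR-ed, and the criteria themselves are AND-ed.
    """
    if "db" in rule and db not in rule["db"]:
        return False
    if "data-type" in rule and data_type not in rule["data-type"]:
        return False
    # re.match anchors at the start of the key, so "user:1.*" matches "user:123".
    if "keys" in rule and not any(re.match(p, key) for p in rule["keys"]):
        return False
    return True

def pick_shard(rules: list, db: int, key: str, data_type: str) -> Optional[str]:
    """Return the name of the first rule that matches, or None if none does."""
    for rule in rules:
        if rule_matches(rule, db, key, data_type):
            return rule["shard-name"]
    return None

# Hypothetical rules; the "db" value [0] is an assumed example.
rules = [
    {"shard-name": "shard1", "db": [0],
     "keys": ["user:1.*", "user-friends:1.*"],
     "data-type": ["dict", "list", "set"]},
    {"shard-name": "shard2", "keys": ["user:2.*", "user-friends:2.*"]},
]

print(pick_shard(rules, 0, "user:123", "dict"))
print(pick_shard(rules, 5, "user-friends:27", "list"))
```

Keys that match no rule return None here. Whether such keys should be dropped, reported, or sent to a default shard is an open design question.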
Proposed API:
redis-shard dump.rdb config.json
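As a rough sketch of that command line, the two positional arguments could be parsed with Python's standard argparse module. The argument names dump_file and config_file and the function build_parser are hypothetical and used only for illustration.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed `redis-shard dump.rdb config.json` call."""
    parser = argparse.ArgumentParser(
        prog="redis-shard",
        description="Split a Redis dump file into several shards.",
    )
    parser.add_argument("dump_file", help="path to the source dump file")
    parser.add_argument("config_file", help="path to the JSON sharding config")
    return parser

# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["dump.rdb", "config.json"])
print(args.dump_file, args.config_file)
```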
Sample config file:
[
  {
    "shard-name" : "shard1",
    "db" : ,
    "keys" : ["user:1.*", "user-friends:1.*"],
    "data-type" : ["dict", "list", "set"]
  },
  {
    "shard-name" : "shard2",
    "keys" : ["user:2.*", "user-friends:2.*"]
  }
]
A more common situation is re-sharding an already sharded cluster, and I don't even know how to do that right.