Good Migration Plan? #151

Closed
dmcfaul opened this issue Oct 23, 2013 · 15 comments

@dmcfaul commented Oct 23, 2013

Hello, I've been trying to find a solid migration plan from my current single-server setup to twemproxy. Can anyone point me in the right direction? I currently have a master replicating to slaves, and I've been mucking about with rdbtools, trying to convert my rdb file to pipelined commands and then send them to my twemproxy setup through redis-cli. The results there have been pretty hit or miss, however.
After searching for several hours, I decided I'd ask. Hopefully this is the correct place.

Thanks!

@manjuraj (Collaborator)

What's the problem with bootstrapping a twemproxy + redis setup with an rdb file through redis-cli?

@dmcfaul (Author) commented Oct 24, 2013

I assume you're referencing this post: #33?
I was under the impression that approach wouldn't shard keys across the twemproxy setup.

@manjuraj (Collaborator)

twemproxy is just a proxy for redis. I believe redis-cli uses the redis protocol to bootstrap a redis instance from an rdb file. The same behavior should work if redis (or many redises) are fronted by twemproxy.

@manjuraj (Collaborator)

I haven't researched this fully yet, but the high-level idea is to convert the rdb file to redis commands and use redis-cli to populate the redis cluster through twemproxy.

This might require supporting --pipe behavior in twemproxy to figure out when the data upload has ended.
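A minimal sketch of that flow, assuming redis-rdb-tools (whose rdb command can emit a protocol stream) and hypothetical hostnames/ports:

```sh
# Convert the rdb file into a raw redis protocol stream and mass-insert
# it through twemproxy with redis-cli's pipe mode.
rdb --command protocol /var/redis/dump.rdb \
  | redis-cli -h nutcracker-host -p 22121 --pipe
```

Note that redis-cli --pipe detects the end of the upload by sending a final ECHO and waiting for its reply, which is the "--pipe behavior" twemproxy would need to pass through.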

@dmcfaul (Author) commented Oct 24, 2013

Ah right, that's what I was referring to with rdbtools. The problem with an rdb file is that you need to convert it to the redis protocol spec before you can import it using the pipeline command. Converting large rdb files is problematic when you encounter invalid chars, etc. I was just hoping for a better method.
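For reference, the wire format in question is the redis unified protocol (RESP); a single SET looks like the following (key and value here are made-up examples), where every length prefix must match its payload exactly, which is where invalid characters in a dump can bite:

```
*3
$3
SET
$5
mykey
$7
myvalue
```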

@charsyam (Contributor)

@dmcfaul how about these steps:
load the rdb, save an aof file (it is written in the redis protocol), and send it with the pipeline.
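A rough sketch of that route, with hypothetical hostnames; BGREWRITEAOF forces redis to dump the loaded dataset as an aof:

```sh
# 1. Load dump.rdb into a scratch redis instance, then force an aof rewrite.
redis-cli -h scratch-host BGREWRITEAOF

# 2. The resulting appendonly.aof is a redis protocol stream; pipe it
#    through twemproxy. (Any SELECT/MULTI entries would need stripping
#    first, since twemproxy does not support those commands.)
cat appendonly.aof | redis-cli -h nutcracker-host -p 22121 --pipe
```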

@dmcfaul (Author) commented Oct 25, 2013

Hey guys, sorry for the delayed response; I got caught up in other things. I tried loading the rdb and writing to aof today, but it isn't working: I'm running out of memory and getting the following error:
Can't rewrite append only file in background: fork: Cannot allocate memory
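(For reference: this fork failure is often the kernel's overcommit policy rather than true memory exhaustion; the Redis documentation suggests allowing overcommit on the host:)

```sh
# Let fork() succeed even when memory looks fully committed (per redis docs).
sysctl -w vm.overcommit_memory=1
```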

I'd prefer not to have to bump this up, as I'm already on an AWS m2.4xlarge (64 GB).

Thanks

@manjuraj (Collaborator)

Can't we fix rdbtools? I'm sure the fix would be helpful to a lot of folks in the redis community.

@dmcfaul (Author) commented Oct 25, 2013

That is my next step; I'll be looking into it over the next few days and will keep this thread up to date if I find a fix.

@dmcfaul (Author) commented Nov 4, 2013

Just to update this thread: in the end I decided to migrate the .rdb file from my current single-server setup and start up each of the redis nodes connected to twemproxy with the entire dataset. Although this is obviously overkill, it appears to be the simplest way to migrate a large dataset right now. I fixed several issues with rdbtools that were due to character encoding in my dataset (not actual issues with rdbtools) and was eventually able to output my entire dataset in the redis protocol. However, the final dataset was in excess of 60GB, which was cumbersome to import through a redis pipeline. I turned off all timeouts on twemproxy/nutcracker, but loading anything larger than a 1GB chunk at a time through nutcracker would result in the process being killed due to excess memory consumption on a machine with 64GB of memory.

So, in theory, if you have more patience and time than me, you could convert an .rdb file to the redis protocol and do a mass insert through the redis pipeline. In practice, loading each node with the entire dataset and trimming data as you go has turned out to be the best solution for me. One could also code up a script to read all datatypes from the original single redis server and re-set them through twemproxy (sketched below), but even that has its limitations and would be painfully slow on a large dataset.
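A minimal sketch of that key-by-key script, assuming a recent redis-py and hypothetical hostnames; scan_iter needs Redis >= 2.8, and TTLs are not copied:

```python
import redis

src = redis.Redis(host='old-redis', port=6379)    # original single server
dst = redis.Redis(host='nutcracker', port=22121)  # twemproxy listener

# Walk every key on the source and re-set it through the proxy so that
# twemproxy shards each key onto the right backend node.
for key in src.scan_iter(count=1000):
    t = src.type(key)
    if t == b'string':
        dst.set(key, src.get(key))
    elif t == b'hash':
        dst.hset(key, mapping=src.hgetall(key))
    elif t == b'list':
        dst.rpush(key, *src.lrange(key, 0, -1))
    elif t == b'set':
        dst.sadd(key, *src.smembers(key))
    elif t == b'zset':
        dst.zadd(key, dict(src.zrange(key, 0, -1, withscores=True)))
```

The writes here are all single-key commands that twemproxy can route, but a full pass like this is painfully slow on a large dataset.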

Thanks for the help manjuraj and charsyam.

dmcfaul closed this as completed Nov 4, 2013
@idning (Contributor) commented Mar 13, 2014

You can try this: https://github.com/idning/redis/tree/replay
It replays the old aof file:

  1. removes all SELECT / MULTI commands;
  2. changes MSET/MSETNX/DEL into many single-key SET/SETNX/DEL commands (illustrated after this list);
  3. --filter: filter keys by prefix;
  4. --orig, --rewrite: rewrite keys;
  5. follows aof modifications, like tail -f.
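As a hypothetical illustration of transformation 2 (not the tool's actual code), splitting one multi-key command into per-key commands lets each request hash to a single shard:

```python
# Split a parsed aof entry like ["MSET", "k1", "v1", "k2", "v2"]
# into single-key SETs that twemproxy can route independently.
def split_mset(args):
    assert args[0].upper() == "MSET"
    return [["SET", k, v] for k, v in zip(args[1::2], args[2::2])]

print(split_mset(["MSET", "a", "1", "b", "2"]))
# -> [['SET', 'a', '1'], ['SET', 'b', '2']]
```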

@yihuang commented Jun 23, 2014

@idning thanks very much, you are my lifesaver.

@bitthegeek

We did a migration from a single redis server to a twemproxy-managed pair of redis server nodes this way:

  1. Create another instance of redis server.
  2. Make it a slave of the original redis server. Wait until they are fully synced.
  3. Create a twemproxy instance with the two servers as the nodes.
  4. Change the config of the apps connecting to the redis server to point at the newly created twemproxy instance. Kill the apps for now.
  5. Make the slave a master as well, creating 2 copies of the original redis server.
  6. Start the apps for the changes to take effect.

(If concerned about memory usage, these steps can be done at a later time:)

  7. Create another twemproxy with the nodes' order reversed from the original one. Let's call it t' (a config sketch follows this list).
  8. Create a list of all the keys on the original server via "KEYS *", then create a script prepending DEL to each key.
  9. Execute the script on t'. Wait until memory use recedes again.
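A minimal sketch of the two pool definitions in nutcracker's YAML config, assuming an order-sensitive distribution such as modula (under ketama, reversing the server order would not change key placement); all names and addresses are hypothetical:

```yaml
main:                      # the proxy the apps use (steps 3-4)
  listen: 127.0.0.1:22121
  hash: fnv1a_64
  distribution: modula
  redis: true
  servers:
   - 10.0.0.1:6379:1
   - 10.0.0.2:6379:1

tprime:                    # t' from step 7: same nodes, reversed order
  listen: 127.0.0.1:22122
  hash: fnv1a_64
  distribution: modula
  redis: true
  servers:
   - 10.0.0.2:6379:1
   - 10.0.0.1:6379:1
```

Because each key hashes to the opposite node under tprime, a DEL issued through it (e.g. `redis-cli -h 10.0.0.1 KEYS '*' | xargs -n 1 redis-cli -p 22122 DEL` for step 8's script) removes exactly the copy that main never serves.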

Any ideas on a better plan if someone would like to use 64 nodes instead of just 2 nodes?

@idning (Contributor) commented Sep 22, 2014

@bitthegeek

  1. Create a new cluster (you can do this with redis-mgr).
  2. Enable aof on the old nodes (no matter how many nodes; see the sketch below).
  3. Use the aof-replay tool to replay data from the old nodes to the new cluster.
  4. When aof-replay catches up, switch the apps to the nutcracker of the new cluster.
     You can do this online at any time, because the replay is still running.
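For step 2, aof can be turned on for a running node without a restart; the hostname here is hypothetical:

```sh
# Enable the append-only file on a live old node (repeat per node).
redis-cli -h old-node-1 CONFIG SET appendonly yes
```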
