Permalink
Find file
60b63ef Feb 2, 2015
171 lines (119 sloc) 5.3 KB

Doom migration at ODS 2014

At ODS 2014, I gave a demo of live migration of doom running in a container. Several people have asked me how exactly this worked, so here are the steps to get it going.

Create some vms.

I used two vms on my laptop for the hosts. You can create these however you like, but I find uvtool to be quite nice:

sudo apt-get install uvtool
sudo uvt-simplestreams-libvirt sync release=utopic arch=amd64
uvt-kvm create --cpu 4 --memory 2000 --disk 10 --bridge virbr0 --password ubuntu host1 release=utopic arch=amd64
uvt-kvm create --cpu 4 --memory 2000 --disk 10 --bridge virbr0 --password ubuntu host2 release=utopic arch=amd64

Set up the vms

All of these instructions need to be performed on both vms. Here, you need to install the daily build of lxc, and a trunk build of criu. The criu trunk does break from time to time; I used a6e746ba17b for the demo.

sudo apt-add-repository ppa:ubuntu-lxc/daily
sudo apt-get update
sudo apt-get install lxc build-essential protobuf-c-compiler asciidoc linux-image-extra-`uname -r`
git clone https://github.com/xemul/criu && cd criu && sudo make install
sudo apt-get install avahi-daemon  # make hostname.local work

Alternatively, since asciidoc is huge, another option is to just not install the man pages, which means you don't need asciidoc. You can accomplish that via: http://paste.ubuntu.com/9823158/

Next, add a bridge to put the containers on. I just had the containers get an IP from libvirt's dhcp server, which means modifying each vm's files to look something like what's below. How you want to do this depends on exactly how you want to connect to your container. One other note is that the setup below causes the machines to hang (cloud-init waits for a network device) for about two minutes on boot. It is annoying, but once everything has started, it all seems to work just fine. If anyone knows what's going on here, please drop me a line.

criu:~ cat /etc/network/interfaces.d/eth0.cfg
# The primary network interface
auto eth0
iface eth0 inet manual

criu:~ cat /etc/network/interfaces.d/doombr.cfg
auto doombr
iface doombr inet dhcp
  bridge_ports eth0

Set up the container

Set up the container on host1:

sudo lxc-create -t download -n u1 -- -d ubuntu -r utopic -a amd64

criu/lxc-checkpoint still has several bugs relating to ttys. You need to:

cat | sudo tee -a /var/lib/lxc/u1/config << EOF
# hax for criu
lxc.console = none
lxc.tty = 0
lxc.cgroup.devices.deny = c 5:1 rwm
EOF

In addition, you also need to ensure the tty devices are regular files (another bug :):

sudo rm /var/lib/lxc/u1/rootfs/dev/tty*
for i in `seq 0 4`; do sudo touch /var/lib/lxc/u1/rootfs/dev/tty$i; done

Then, start the container and install doom:

sudo lxc-start -n u1
ssh ubuntu@$(sudo lxc-info -n u1 -H -i)
sudo apt-get install vnc4server prboom-plus doom-wad-shareware
vnc4server  # start the vnc server once to set the password

If you want to start a vnc server on the container when it boots (as I did for the demo, so I didn't have to start it manually during my talk), you can use the following upstart job inside the container:

description "vnc"
author "Tycho Andersen <tycho.andersen@example.com>"

start on runlevel [2345]
stop on starting rc RUNLEVEL=[016]

respawn

background
expect fork

env HOME="/home/ubuntu"

setuid ubuntu

exec vnc4server

Set up the client

Now, on the bare metal which is hosting the vms, you can set up a client:

sudo apt-get install xvnc4viewer vnc4server
vncpasswd  # type in the password you used above

Start your engines!

Start the container inside host1, find out the machine's ip (sudo lxc-info -n u1 should tell you), and then run:

 xvnc4viewer 192.168.122.186 -passwd ~/.vnc/passwd

You can start doom by running prboom-plus in the xterm that shows up in the vnc window. Now, you're ready to migrate.

Migrate the container

I used the lxc-checkpoint tool in the lxc suite to do the actual migration. Full support in lxd for all this is of course forthcoming. I shared ssh keys as root=>ubuntu on the two machines, and used the following script to migrate the machine:

#!/bin/sh
set -e

usage() {
  echo $0 container user@host.to.migrate.to
  exit 1
}

if [ "$(id -u)" != "0" ]; then
  echo "ERROR: Must run as root."
  usage
fi

if [ "$#" != "2" ]; then
  echo "Bad number of args."
  usage
fi

name=$1
host=$2

checkpoint_dir=/tmp/checkpoint

do_rsync() {
  sudo rsync -rltha --devices --rsync-path="sudo rsync" $1 $host:$1
}

# we assume the same lxcpath on both hosts, that is bad.
LXCPATH=$(sudo lxc-config lxc.lxcpath)

sudo lxc-checkpoint -n $name -D $checkpoint_dir -s -v

do_rsync $LXCPATH/$name/
do_rsync $checkpoint_dir/

ssh $host "sudo lxc-checkpoint -r -n $name -D $checkpoint_dir -v"

Of course, this is very dumb (i.e. it stops the world, then does an rsync of the entire thing, then starts the world). lxd will have much smarter migration (using the p.haul tool). But we're not quite there yet :). Anyway, once you have the above script, you can migrate the container:

sudo ./migrate u1 ubuntu@host2.local

Happy hacking!