Browse files

article: new article about network lab with KVM

Also, my first video.
  • Loading branch information...
1 parent f8bf00b commit dec96c87f1c526dc1756e6172afdc2323cad8c1c @vincentbernat committed Oct 20, 2012
@@ -0,0 +1,389 @@
+title: "Network lab with KVM"
+keywords: "network lab, network, kvm, 9p, gdb, kgdb, virtualization"
+uuid: aac3ecda-175b-11e2-99e8-0018f3034e06
+ "": "GitHub repository"
+ - network
+To experiment with network stuff, I was using
+[UML-based network labs][uml-lab]. Many alternatives exist, like
+[GNS3][], [Netkit][], [Marionnet][] or [Cloonix][]. All of them are
+great viable solutions but I still prefer to stick to my minimal
+home-made solution with UML virtual machines. Here is why:
+ - I didn't want to use **disk images**. They take a lot of space and
+ they have to be maintained. They also become cluttered, especially
+ if you try to reuse them across several labs. They are also
+ difficult to share.
+ - I want to be able to access my **home directory**. It contains the
+ important configuration files related to the lab and I can put them
+ in the right place thanks to symbolic links when the lab starts. It
+ also makes exchanging files with the lab quite easy.
+ - I don't want to **boot a complete system**. This allows me to be
+ cheap on memory and each virtual system should boot in a few
+ seconds.
+The use of UML had some drawbacks:
+ - It may be buggy. For example, it is currently not possible to use
+ [gdbserver inside UML][] without a patch. Sometimes, the kernel
+ won't even compile.
+ - It is slow.
+However, UML features [HostFS][], a filesystem providing access to any
+part of the host filesystem. This is the killer feature which allows
+me to not use any virtual disk image and to get access to my home
+directory right from the guest.
+I discovered recently that KVM provided [9P][], a similar filesystem
+on top of VirtIO, the paravirtualized IO framework.
+# Setting up the lab
+The setup of the lab is done with a
+[single self-contained shell file][setup]. The layout is similar to
+[what I have done with UML][uml-lab]. I will only highlight here the
+most interesting steps.
+## Booting KVM with a minimal kernel
+My initial goal was to experiment with
+[Nicolas Dichtel's IPv6 ECMP patch][ecmp-ipv6]. Therefore, I needed to
+configure a custom kernel. I have started from `make defconfig`,
+removed everything that was not necessary, added what I needed for my
+lab (mostly network stuff) and added the appropriate options for VirtIO
+ ::
+No modules. Grab the [complete configuration][.config] if you want to
+have a look.
+From here, you can start your kernel with the following command
+(`$LINUX` is the appropriate `bzImage`):
+ ::sh
+ kvm \
+ -m 256m \
+ -display none \
+ -nodefconfig -no-user-config -nodefaults \
+ \
+ -chardev stdio,id=charserial0,signal=off \
+ -device isa-serial,chardev=charserial0,id=serial0 \
+ \
+ -chardev socket,id=con0,path=$TMP/vm-$name-console.pipe,server,nowait \
+ -mon chardev=con0,mode=readline,default \
+ \
+ -kernel $LINUX \
+ -append "init=/bin/sh console=ttyS0"
+Of course, since there is no disk to boot from, the kernel will panic
+when trying to mount the root filesystem. KVM is configured to not
+display video output (`-display none`). A serial port is defined and
+uses `stdio` as a backend[^signal]. The kernel is configured to use
+this serial port as a console (`console=ttyS0`). A VirtIO console
+could have been used instead but it seems this is not possible to make
+it work early in the boot process.
+[^signal]: `stdio` is configured such that signals are not
+ enabled. KVM won't stop when receiving `SIGINT`. This is
+ important for the usage we want to have.
+The KVM monitor is setup to listen on an Unix socket. It is possible
+to connect to it with `socat UNIX:$TMP/vm-$name-console.pipe -`.
+## Initial ramdisk
+Theoretically, the root filesystem could be mounted from the command
+line but I was unable to do so[^mount]. A small initial ramdisk is
+used to bypass this difficulty. It is built by the following snippet:
+ ::sh
+ # Setup initrd
+ setup_initrd() {
+ info "Build initrd"
+ DESTDIR=$TMP/initrd
+ mkdir -p $DESTDIR
+ # Setup busybox
+ copy_exec $($WHICH busybox) /bin/busybox
+ for applet in $(${DESTDIR}/bin/busybox --list); do
+ ln -s busybox ${DESTDIR}/bin/${applet}
+ done
+ # Setup init
+ cp $PROGNAME ${DESTDIR}/init
+ cd "${DESTDIR}" && find . | \
+ cpio --quiet -R 0:0 -o -H newc | \
+ gzip > $TMP/initrd.gz
+ }
+[^mount]: I am using the following kernel command line:
+ `root=rootshare rootflags=trans=virtio,version=9p2000.u ro
+ rootfstype=9p rootdelay=2`. I get the following error on
+ boot: `9pnet_virtio: no channels available`.
+The `copy_exec` function is stolen from the `initramfs-tools` package
+in Debian. It will ensure that the appropriate libraries are also
+copied. Another solution would have been to use a static `busybox`.
+The setup script is copied as `/init` in the initial ramdisk. It will
+detect it has been invoked as such. If it was omitted, a shell would
+be spawned instead. Remove the `cp` call if you want to experiment
+The flag `-initrd` allows KVM to use this initial ramdisk.
+## Root filesystem
+Let's mount our root filesystem using _9P_. This is quite easy. First
+KVM needs to be configured to export the host filesystem to
+the guest:
+ ::sh
+ kvm \
+ -fsdev local,security_model=passthrough,id=fsdev-root,path=${ROOT},readonly \
+ -device virtio-9p-pci,id=fs-root,fsdev=fsdev-root,mount_tag=rootshare
+`${ROOT}` can either be `/` or any directory containing a complete
+filesystem. Mounting it from the guest is quite easy:
+ ::sh
+ mkdir -p /target/ro
+ mount -t 9p rootshare /target/ro -o trans=virtio,version=9p2000.u
+You should find a complete root filesystem inside `/target/ro`. I have
+used `version=9p2000.u` instead of `version=9p2000.L` because the
+later does not allow a program to `stat()` a host mount
+[^9pversion]: And `busybox` mount tries to call `stat()` before
+ mounting something on a directory. Therefore, it is not
+ possible to mound a fresh `/proc` on top of the existing
+ one. I have searched a bit but didn't find why. Any
+ comments on this is welcome.
+Now, you have a read-only root filesystem (because you don't want to
+mess with your existing root filesystem and moreover, you did not run
+this lab as root, did you?). Let's use an union filesystem. Debian
+comes with [AUFS][] while Ubuntu and OpenWRT have migrated to
+[overlayfs][]. I was previously using AUFS but got errors on some
+specific cases. It is still not clear
+[which one will end up in the kernel][lwn-overlayfs]. So, let's try
+I didn't find any patchset ready to be applied on top of my kernel
+tree. I was working with [David Miller's net-next tree][davem]. Here
+is how I have applied the _overlayfs_ patch on top of it:
+ ::console
+ $ git remote add torvalds git://
+ $ git fetch torvalds
+ $ git remote add overlayfs git://
+ $ git fetch overlayfs
+ $ git merge-base overlayfs.v15 v3.6
+ 4cbe5a555fa58a79b6ecbb6c531b8bab0650778d
+ $ git checkout -b net-next+overlayfs
+ $ git cherry-pick 4cbe5a555fa58a79b6ecbb6c531b8bab0650778d..overlayfs.v15
+Don't forget to enable `CONFIG_OVERLAYFS_FS` in `.config`. Here is how
+I configured the whole root filesystem:
+ ::sh
+ info "Setup overlayfs"
+ mkdir /target
+ mkdir /target/ro
+ mkdir /target/rw
+ mkdir /target/overlay
+ # Version 9p2000.u allows to access /dev, /sys and mount new
+ # partitions over them. This is not the case for 9p2000.L.
+ mount -t 9p rootshare /target/ro -o trans=virtio,version=9p2000.u
+ mount -t tmpfs tmpfs /target/rw -o rw
+ mount -t overlayfs overlayfs /target/overlay -o lowerdir=/target/ro,upperdir=/target/rw
+ mount -n -t proc proc /target/overlay/proc
+ mount -n -t sysfs sys /target/overlay/sys
+ info "Mount home directory on /root"
+ mount -t 9p homeshare /target/overlay/root -o trans=virtio,version=9p2000.L,access=0,rw
+ info "Mount lab directory on /lab"
+ mkdir /target/overlay/lab
+ mount -t 9p labshare /target/overlay/lab -o trans=virtio,version=9p2000.L,access=0,rw
+ info "Chroot"
+ export STATE=1
+ cp "$PROGNAME" /target/overlay
+ exec chroot /target/overlay "$PROGNAME"
+You have to export your `${HOME}` and the lab directory from host:
+ ::sh
+ kvm \
+ -fsdev local,security_model=passthrough,id=fsdev-root,path=${ROOT},readonly \
+ -device virtio-9p-pci,id=fs-root,fsdev=fsdev-root,mount_tag=rootshare \
+ -fsdev local,security_model=none,id=fsdev-home,path=${HOME} \
+ -device virtio-9p-pci,id=fs-home,fsdev=fsdev-home,mount_tag=homeshare \
+ -fsdev local,security_model=none,id=fsdev-lab,path=$(dirname "$PROGNAME") \
+ -device virtio-9p-pci,id=fs-lab,fsdev=fsdev-lab,mount_tag=labshare
+## Network
+You know what is missing from our network lab? Network setup. For each
+LAN that I will need, I spawn a VDE switch:
+ ::sh
+ # Setup a VDE switch
+ setup_switch() {
+ info "Setup switch $1"
+ screen -t "sw-$1" \
+ start-stop-daemon --make-pidfile --pidfile "$TMP/switch-$" \
+ --start --startas $($WHICH vde_switch) -- \
+ --sock "$TMP/switch-$1.sock"
+ screen -X select 0
+ }
+To attach an interface to the newly created LAN, I use:
+ ::sh
+ mac=$(echo $name-$net | sha1sum | \
+ awk '{print "52:54:" substr($1,0,2) ":" substr($1, 2, 2) ":" substr($1, 4, 2) ":" substr($1, 6, 2)}')
+ kvm \
+ -net nic,model=virtio,macaddr=$mac,vlan=$net \
+ -net vde,sock=$TMP/switch-$net.sock,vlan=$net
+The use of a VDE switch allows me to run the lab as a non-root
+user. It is possible to give Internet access to each VM, either by
+using `-net user` flag or using `slirpvde` on a special switch. I
+prefer the latest solution since it will allow the VM to speak to each
+# Debugging
+This lab was mostly done to debug both the kernel and Quagga. Each of
+them can be debugged remotely.
+## Kernel debugging
+While the kernel features [KGDB][], its own debugger, compatible with
+[GDB][], it is easier to use the _remote GDB server_ built inside
+ ::sh
+ kvm \
+ -gdb unix:$TMP/vm-$name-gdb.pipe,server,nowait
+To connect to the remote GDB server from the host, first locate the
+`vmlinux` file at the root of the source tree and run GDB on it. The
+kernel has to be compiled with `CONFIG_DEBUG_INFO=y` to get the
+appropriate debugging symbols. Then, use `socat` with the Unix socket
+to attach to the remote debugger:
+ ::console
+ $ gdb vmlinux
+ GNU gdb (GDB) 7.4.1-debian
+ Reading symbols from /home/bernat/src/linux/vmlinux...done.
+ (gdb) target remote | socat UNIX:$TMP/vm-$name-gdb.pipe -
+ Remote debugging using | socat UNIX:/tmp/tmp.W36qWnrCEj/vm-r1-gdb.pipe -
+ native_safe_halt () at /home/bernat/src/linux/arch/x86/include/asm/irqflags.h:50
+ 50 }
+ (gdb)
+You can now set breakpoints and resume the execution of the kernel.
+It is easier to debug the kernel if optimizations are not
+enabled. However, it is not possible to
+[disable them globally][gcc20217]. You can however disable them
+for some files. For example, to debug `net/ipv6/route.c`, just add
+`CFLAGS_route.o = -O0` to `net/ipv6/Makefile`, remove
+`net/ipv6/route.o` and type `make`.
+## Userland debugging
+To debug a program inside KVM, you can just use `gdb` as usual. Your
+`$HOME` directory is available and it should be therefore
+straightforward. However, if you want to perform some remote
+debugging, that's quite easy. Add a new serial port to KVM:
+ ::sh
+ kvm \
+ -chardev socket,id=charserial1,path=$TMP/vm-$name-serial.pipe,server,nowait \
+ -device isa-serial,chardev=charserial1,id=serial1
+Starts `gdbserver` in the guest:
+ ::console
+ $ libtool execute gdbserver /dev/ttyS1 zebra/zebra
+ Process /root/code/orange/quagga/build/zebra/.libs/lt-zebra created; pid = 800
+ Remote debugging using /dev/ttyS1
+And from the host, you can attach to the remote process:
+ ::console
+ $ libtool execute gdb zebra/zebra
+ GNU gdb (GDB) 7.4.1-debian
+ Reading symbols from /home/bernat/code/orange/quagga/build/zebra/.libs/lt-zebra...done.
+ (gdb) target remote | socat UNIX:/tmp/tmp.W36qWnrCEj/vm-r1-serial.pipe
+ Remote debugging using | socat UNIX:/tmp/tmp.W36qWnrCEj/vm-r1-serial.pipe
+ Reading symbols from /lib64/ debugging symbols found)...done.
+ Loaded symbols for /lib64/
+ 0x00007ffff7dddaf0 in ?? () from /lib64/
+ (gdb)
+# Demo
+For a demo, have a look at the following video (it is also available
+as an [Ogg Theora video][ogv]).
+[ogv]: [[!!videos/2012-network-lab-kvm.ogv]]
+*[UML]: User Mode Linux
+*[KVM]: Kernel-based Virtual Machine
+*[VM]: Virtual Machine
+[uml-lab]: [[en/blog/2011-uml-network-lab.html]] "Network lab with UML"
+[Cloonix]: "Cloonix: dynamical topology virtual networks"
+[Marionnet]: "Marionnet: a virtual network laboratory"
+[Netkit]: "Netkit: the poor man's system to experiment computer networking"
+[GNS3]: "GNS3: Graphical Network Simulator"
+[HostFS]: "UML: Host File Access"
+[gdbserver inside UML]: "Bug #676184: gdbserver not usable inside UML"
+[9P]: "Setting up 9P in KVM"
+[ecmp-ipv6]: "ipv6: add support of ECMP"
+[AUFS]: "AUFS: advanced multi-layered unification filesystem"
+[overlayfs]:;a=summary "overlayfs git repository"
+[lwn-overlayfs]: "LWN: Debating overlayfs"
+[davem]:;a=summary "net-next git repository"
+[.config]: "Minimal .config"
+[KGDB]: "KGDB: Linux Kernel Source Level Debugger"
+[GDB]: "GDB: The GNU Project Debugger"
+[gcc20217]: "Bug 20217 - Switching off the optimization triggers undefined reference at link time when building Linux kernel"
+{# Local Variables: #}
+{# mode: markdown #}
+{# indent-tabs-mode: nil #}
+{# End: #}
Oops, something went wrong.

0 comments on commit dec96c8

Please sign in to comment.