Migrating Bootable Gentoo on ZFS Root

This tutorial describes the process for migrating an existing Gentoo system to run on a root ZFS filesystem using the native ZFS module available from https://github.com/zfsonlinux/zfs.

This tutorial will give you a bootable ZFS root filesystem with GRUB booting directly from the ZFS pool. If you are installing a system from scratch, you may wish to follow Installing a fresh Gentoo on ZFS System instead.

Status of this Document

As of 24-Jul-2011, this document should be considered "beta" status. While best efforts have been made to ensure accuracy of the content, it has not yet been tested on a wide variety of system configurations. Any feedback, error corrections, typos, etc. would be greatly appreciated. Bug reports can be sent to pendorbound /a.t/ g mail.

Assumptions

Before starting, you should verify the following are true:

  • You have an existing bootable Gentoo system with a sane set of kernel, compiler tools, etc. available.
  • You know your way around Portage, are comfortable using overlays, repartitioning disks, and the like.
  • You don't mind losing everything on the system in question (in case something goes wrong).
  • You're comfortable running bleeding edge versions of software, in some cases where no "official" release is yet available.
  • You have good backups of anything of value on your system (and you have tested them).

Recommendations

The steps in this tutorial are as accurate as possible, but this being bleeding edge software, things will inevitably go wrong. Before trying this on a "real" system, it's recommended that you:

  • Try at least one "dry-run" of these instructions. Using VMWare, VirtualBox, or the like makes for an easy test bed and much easier recovery when things go wrong. (Though see the note about memory for VM systems below.)
  • Don't do this to production hardware! The ZFSonLinux kernel modules are still evolving software. While they are reasonably stable and in use by many people for real-world work loads, using this for a mission critical system isn't the sanest of things to do.
  • Ensure that you have stable, reliable, 64-bit capable hardware. Running ZFS on 32-bit hardware isn't recommended. Running it on a system that's unstable due to driver issues or similar problems is a recipe for data loss.
  • Have a livecd standing by. You'll probably need it...
  • And did I mention having some fresh backups? It's not just a GoodIdea(tm)...

Required Packages

Before getting started, you'll need the following packages emerged on your system. These are all available from standard Portage. You'll be adding additional packages which aren't in the standard tree a bit later.

  • app-portage/layman
  • dev-vcs/git
  • sys-kernel/gentoo-sources-2.6.38-r6 -- ZFS may work on later kernels by the time you read this, but please don't assume that it will without checking the commit history from the git repository. Kernel upgrades can break ZFS' interface into the kernel.
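If any of these are missing, something like the following should pull them in (the kernel version here matches the one listed above; adjust it if you've confirmed a newer kernel works with ZFS):

# Install the prerequisites from standard Portage:
$ emerge -v app-portage/layman dev-vcs/git =sys-kernel/gentoo-sources-2.6.38-r6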

Installing Packages

With the preliminaries out of the way, we'll start installing the ZFS-specific packages.

You'll likely need to add several package keywords to get all of the packages building. The following should be a good starting point:

$ FILE=/etc/portage/package.keywords ; [ -d $FILE ] && FILE=$FILE/zfs ; cat >> $FILE <<EOF
sys-kernel/spl **
sys-fs/zfs **
=sys-boot/grub-1.99_rc99 **
=sys-kernel/genkernel-9999 **
=sys-kernel/dracut-015 ~amd64
EOF

Adding the Layman Overlay

A Portage overlay is available which provides ebuilds for patched & customized versions of the source code. The need for this should diminish as patches are integrated in upstream repositories, but if you're living on the bleeding edge, you'll have to work a bit harder...

You should already have layman installed, so you can add the zfs overlay and sync it:

$ layman -o https://raw.github.com/pendor/gentoo-zfs-overlay/master/overlay.xml -f -a zfs

This overlay provides ebuilds for spl, zfs, genkernel, and grub.

Note that if you've not previously configured layman, you may also need to edit your /etc/make.conf to enable it. Make sure the following line is present near the bottom:

source /var/lib/layman/make.conf
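Once the overlay is added and synced, a quick way to confirm layman knows about it is to list your installed overlays:

# The zfs overlay should appear in the list of installed overlays:
$ layman -l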

Configuring the Kernel

You must ensure that your kernel configuration has the following options set:

CONFIG_KALLSYMS=y
CONFIG_PREEMPT_NONE=y

If your current kernel lacks either of these settings, you'll need to rebuild it and reboot before you'll be able to compile or load the ZFS module.
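A quick way to check is to grep either your running kernel's config (if CONFIG_IKCONFIG_PROC is enabled) or the source tree you built it from:

# Check the running kernel (only works if /proc/config.gz is available):
$ zcat /proc/config.gz | grep -E 'CONFIG_KALLSYMS=|CONFIG_PREEMPT'

# Or check the configured source tree:
$ grep -E 'CONFIG_KALLSYMS=|CONFIG_PREEMPT' /usr/src/linux/.config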

ZFS & SPL

SPL (Solaris Porting Layer) enables building the ZFS modules with minimal changes to the core source code. Instead of adapting the ZFS code directly to the Linux kernel (resulting in a maintainability nightmare when/if Oracle releases updated source), SPL provides a more Solaris-like kernel environment for the existing Solaris code to build and run against.

After SPL, we can install the ZFS module & usermode tools.

For both SPL & ZFS, we'll be installing the latest rc build from the maintainer. Presently all necessary patches have been integrated into upstream, so we'll be building the source directly with minimal modification in the ebuilds (only to configure & make parameters).

# Note that you'll probably need to add ** keywords for both packages to /etc/portage/package.keywords.
$ emerge -v =sys-kernel/spl-9999 =sys-fs/zfs-9999

# You'll need the zfs initscript loaded at boot:
$ rc-update add zfs boot

# You may also need to add udev.  Classic genkernel typically does this automatically, but the Dracut version does not.
$ rc-update add udev boot

Genkernel

In order to support booting with a ZFS root filesystem, the Dracut branch of Genkernel is used. This branch uses the cross-distro initramfs builder Dracut rather than relying on Genkernel's traditional initramfs logic. This allows us to leverage the Dracut module already present in the ZFS source and avoid re-inventing the wheel for each distro. The layman zfs overlay provides a forked version of Genkernel from https://github.com/pendor/genkernel-zfs (branch: dracut). Patches for Genkernel have been submitted to upstream and are awaiting review.

Before you can use Dracut (which is called by Genkernel automatically), you'll need to add the following to your /etc/make.conf:

DRACUT_MODULES="zfs"

You may wish to add additional modules (like gensplash), but configuring these is beyond the scope of these instructions. Note that you won't ordinarily need mdadm, dmraid, or similar RAID solutions in your initramfs.

You can install Dracut and Genkernel with:

# Genkernel needs ** keyword, dracut should be okay with ~amd64.
$ emerge -v =sys-kernel/dracut-015 =sys-kernel/genkernel-9999

Configuring ZFS

Now that all of the required software is installed, it's time to start building a ZFS pool and moving files to it. The examples below assume a system with two SCSI drives, one of which holds the current live operating system and the other which will hold ZFS. When the migration is completed, the original system drive will be added to the zpool as a mirror device.

Partitioning

Assume your two hard drives are /dev/sda and /dev/sdb. sda is the live device, sdb will be reformatted for ZFS. You should create two partitions on sdb. The first should be a small (say 64MB) utility partition for Grub. The second should take up the rest of the disk and will be used for ZFS.

Using GPT-based partitions is recommended. MBR can also work, but requires a bit more work to make Grub happy with it.

Using gptfdisk, the partition setup might look something like this:

# Install if necessary:
$ emerge -v sys-apps/gptfdisk

# Start out clean: (This WILL destroy everything on the disk...)
$ sgdisk --zap-all /dev/sdb

# Create a bios_grub partition (type EF02) starting at 1MB with size 64MB.  The
# 1MB offset is to ensure proper sector alignment for SSDs and other 4k sector drives.
# After the utility partition, the rest of the drive is set up for ZFS.  Repeat for all
# drives to be used in the pool.
$ sgdisk \
    --new=1:2048:133120 --typecode=1:EF02 --change-name=1:"grub" \
    --largest-new=2 --typecode=2:BF01 --change-name=2:"zfs" \
    /dev/sdb

A Note About Memory

Since we're about to load the ZFS kernel module for the first time, now would be a good time to talk about two critical settings you may need to make.

The first is a ZFS Evil Tuning parameter to control the amount of system memory used for the ZFS ARC cache. Default tuning for ZFS assumes that you have a Large System With Lots of RAM. If you're installing a test system in a VM, this may not be the case. If you attempt to use stock tuning parameters on a system with less than about 2GB of RAM, ZFS will probably lock up the system under even a moderate amount of I/O as the ARC will consume all available memory.

In order to limit the ARC size, you would load the ZFS module like:

$ modprobe zfs zfs_arc_max=0x20000000
# 128MB: 0x8000000
# 512MB: 0x20000000

The parameter is the number of bytes (in hex) which should be dedicated to the ARC, at most. You do want to make this value as large as your system RAM can afford since it serves as your disk cache. For a 512-768MB system, 128MB will leave the system useable, though quite slow. For a 1.5GB system, 512MB gives reasonable performance while still leaving RAM available.
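If you want a value other than the examples above, the hex figure is just the byte count, and the shell can do the conversion for you:

# Convert a size in MB to the hex byte count zfs_arc_max expects (512MB shown here):
$ printf '0x%x\n' $((512 * 1024 * 1024))
0x20000000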

Making Settings Permanent

If you do need to set ZFS tunables such as zfs_arc_max, you should add the settings in /etc/modprobe.d/zfs.conf like:

options zfs zfs_arc_max=0x20000000

The complete list of tunable parameters can be found by running modinfo zfs. Any changes to these settings will require rebuilding your initramfs and rebooting.
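Assuming the module exports its parameters through sysfs in the usual way, you can also check the value currently in effect once zfs is loaded:

# Shows the ARC limit in effect; 0 means the built-in default is being used.
$ cat /sys/module/zfs/parameters/zfs_arc_max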

A Note About HostIDs

The second parameter you should take care of is the system hostid which is stored in /etc/hostid. On many systems, simply running the command hostid may be enough to create this file if it doesn't already exist. Some versions of hostid won't persist the value into this file, so you may need to run hostid > /etc/hostid.

ZFS uses the hostid value to determine whether a zpool is in use by a different system than the current one (e.g. if a shared iSCSI LUN makes the pool's member devices visible to multiple hosts simultaneously). Absence of this file can cause system startup to fail as ZFS will refuse to automatically import pools that it cannot verify are owned by the current system.
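For example, to make sure the file exists before you build the initramfs:

# Create /etc/hostid if it's missing, then verify the value:
$ [ -f /etc/hostid ] || hostid > /etc/hostid
$ hostid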

Creating the zpool

With parameters out of the way, now we can load the ZFS modules and create a pool with one drive in it. The default mountpoint for the pool would be /rpool, which won't work for booting, so we'll unmount it, change some settings, and mount the root dataset at a temporary location so we can work on it.

# Load the module (see above for memory tuning parameters & check `dmesg` if anything goes wrong)
$ modprobe zfs

# Create a new pool with a single device.  The device node here matches the partition layout created above (sdb2 is the ZFS partition); adjust as needed.  Also feel free to use a different pool name. `tank` is a popular choice...
$ zpool create rpool sdb2

Creating Datasets (Filesystems)

Now it's time to create some datasets (AKA filesystems) to work with. First we unmount the rpool we just mounted and reset its mount parameters. We make the root dataset unmountable and set its mountpoint to / so the child datasets inherit their mount points directly.

ZFS uses a hierarchical approach for many attributes on datasets, and each sub-dataset inherits the beginning of its mountpoint from its ancestors. We set the root dataset unmountable since we're actually going to use a sub-dataset as the real root of the system.

$ zfs umount rpool
$ zfs set mountpoint=/ rpool
$ zfs set canmount=off rpool

Next we create the dataset that will actually be the root filesystem and give it a legacy mountpoint (since the initramfs will mount it manually). Why not use the rpool/ root dataset as the root filesystem? ZFS' attribute inheritance means that all datasets in the pool will inherit many of their attributes from rpool/. Attributes like quota often need to be set explicitly on the root filesystem but are best not inherited by the sub-datasets. Additionally, when snapshots are created, they're normally made recursively. If all of your data is in sub-datasets of root, then when you try to snap your root for system recovery, you'd get all of your user data as well. Ideally you should be able to roll back from a failed system upgrade but NOT roll back user data. Keeping the filesystem root separate from the physical root of the hierarchy enables this.

$ zfs create -o mountpoint=legacy rpool/ROOT
$ mount -t zfs rpool/ROOT /mnt/gentoo

Since we want to boot off this filesystem, we can set an attribute on the pool to indicate which dataset should be used as the root:

$ zpool set bootfs rpool/ROOT rpool

Note that this will only work for a RAID-1 or completely unmirrored pool. If you're using RAID-Z, you'll currently need to set the root dataset using the root=... kernel parameter (see Configuring GRUB below).
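As a rough sketch only (the exact syntax depends on the version of the Dracut zfs module you end up with, so verify it against that module's documentation), the kernel parameter would look something like:

# Hypothetical kernel command line entry for a RAID-Z pool:
root=ZFS=rpool/ROOT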

Now we'll create some additional datasets to break things up. The beauty of ZFS is that it allows creation of any number of datasets to split administrative and performance boundaries. For example, you might decide to turn on compression for log files and source code, set quotas on home, make sure a runaway /tmp can't fill up the filesystem, etc. (see the property examples after the dataset list below).

Feel free to experiment with how the datasets are laid out. The suggestions below are taken from common use on FreeBSD systems, but are by no means the only reasonable way to lay things out. The one restriction for the root system is that /bin, /boot, /sbin, /etc, /dev, /sys, and /proc must all be in the root dataset. These trees must all be available from the initrd early in the boot process, and the initrd can only mount the one single dataset that contains root (rpool/ROOT). Additional datasets for /home, /usr, etc. are all mounted by the system initscripts once the initrd has already switch_root'd to it.

$ zfs create rpool/home
$ zfs create rpool/usr
$ zfs create rpool/usr/local
$ zfs create rpool/usr/portage
$ zfs create rpool/usr/src
$ zfs create rpool/var
$ zfs create rpool/var/log
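As an illustration of the per-dataset policies mentioned above (the values here are only examples, adjust to taste):

# Compress logs & source, cap the size of home directories:
$ zfs set compression=on rpool/var/log
$ zfs set compression=on rpool/usr/src
$ zfs set quota=100G rpool/home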

Create Swap (Optional)

Important Note

Running swap on a ZVOL will currently deadlock when the system grows into swap. If you NEED swap, you must create separate partitions for it and set up mdadm if necessary.

Unsafe swap-on-ZVOL Setup:

Swap can be run from within the zpool as a zvol:

# Set size as desired.  Checksum'ing isn't necessary for swap and turning it off improves
# performance.
$ zfs create -V 2G rpool/swap
$ zfs set checksum=off rpool/swap
$ echo "/dev/rpool/swap    none    swap    sw    0 0" >> /mnt/gentoo/etc/fstab

The swap space will automatically use the level of redundancy configured for the pool.
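Note that the zvol still has to be initialized as swap space before the fstab entry can be used; a minimal sketch, assuming the device node appears at the path used in the fstab line above:

# Initialize the zvol as swap; it will be activated at boot via fstab:
$ mkswap /dev/rpool/swap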

Building the initramfs

In order to boot from ZFS, an initramfs with ZFS support is required. Gentoo's Genkernel tool will build this initramfs using the Dracut ramdisk builder. Before you start, you should make a backup of your existing ramdisk in case you need to boot into your current root for recovery or troubleshooting.

$ mount /boot
$ cp /boot/initramfs-${KV} /boot/initramfs-safe

To make Genkernel build a ZFS ramdisk, you need to edit /etc/genkernel.conf and set ZFS="yes". Note that if your genkernel.conf doesn't have a ZFS=... setting, you may need to run etc-update or dispatch-conf to merge in a new version from the emerge steps run above. If you still don't have the setting, you should double check that you have the correct -9999 version of genkernel installed and that the portage overlay took effect to pull the modified dracut branch of genkernel.
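To confirm the setting took effect:

# Should print ZFS="yes" once genkernel.conf has been updated:
$ grep '^ZFS=' /etc/genkernel.conf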

In order to create the initramfs, run:

$ genkernel --loglevel=5 initramfs

Loglevel=5 ensures that all output from Dracut is displayed. You should check the output to make sure the zfs Dracut module ran and that the zfs and zpool binaries, /etc/hostid, and /etc/zfs/zpool.cache are all included in the initramfs. Assuming everything worked, you should have a new initramfs in your /boot partition.
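If you want to double-check the image itself, Dracut's lsinitrd tool can list its contents (the filename convention for your initramfs may differ):

# Verify the important bits made it into the image:
$ lsinitrd /boot/initramfs-${KV} | grep -E 'zfs|zpool|hostid'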

Copying Root

Presumably you have some data on an existing system that you'd like to transfer to your new ZFS root. We're doing this dead last so that you'll have the latest version of all of your configuration changes copied over. rsync is probably the best tool for the job. With all of your datasets created and mounted (they're mounted automatically when created), you can copy everything in one step and the content will get split up among the various datasets in the process. You do need to take some care to exclude certain pseudo filesystems (like /dev and /proc).

$ rsync -rax --exclude /proc --exclude /dev --exclude /sys --exclude /tmp --exclude /mnt / /mnt/gentoo

You'll need to create the excluded directories when you're done:

$ cd /mnt/gentoo
$ mkdir -p dev proc sys mnt mnt/gentoo mnt/cdrom tmp

# Set perms on tmp
$ chmod 1777 tmp

# Populate dev (probably overkill)
$ cp -a /dev/* dev/

If your system locks up during the above rsync, you probably have your ARC max size set too large for your system's memory. See "A Note About Memory" above.

Double-check zpool.cache

The above rsync should have done it, but make sure that /mnt/gentoo/etc/zfs/zpool.cache and /mnt/gentoo/etc/hostid exist after the copy.
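A quick sanity check:

# Both files should exist and be non-empty:
$ ls -l /mnt/gentoo/etc/zfs/zpool.cache /mnt/gentoo/etc/hostid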

Fix /etc/fstab

Edit your /etc/fstab file to set up your new boot, swap, and root partitions. You'll want at least the following, and you may of course leave any other entries you have:

/dev/rpool/swap         none            swap            sw              0 0
rpool/ROOT              /               zfs             noatime         0 0

Remove any existing entries you have for root, swap, and boot.

Configuring GRUB

Configuring GRUB must be done from within a chroot environment just like during a normal Gentoo install. We'll also emerge the new Grub only in the chroot so your existing drive's bootloader isn't modified.

# Bind over pseudo-filesystems
$ mount -t proc none /mnt/gentoo/proc
$ mount --rbind /dev /mnt/gentoo/dev
$ chroot /mnt/gentoo /bin/bash
$ env-update && source /etc/profile
$ emerge -v =sys-boot/grub-1.99_rc99

# Setup the config file
$ cat /proc/mounts | grep -v rootfs > /etc/mtab
$ grub-mkconfig -o /boot/grub/grub.cfg

$ grub-install /dev/sdb

Assuming no errors are reported, you can exit the chroot and prepare to reboot:

$ exit
$ zfs umount -a
$ zfs set mountpoint=/ rpool
$ umount -l /mnt/gentoo/dev{/shm,/pts,}
$ umount -l /mnt/gentoo{/proc,}
$ zpool export rpool

Reboot!

Now cross your fingers, reboot, and tell your BIOS to boot off the new drive. If all went well, you should be running off your new Gentoo ZFS root.

Troubleshooting

If you run into problems, it's most likely to be when Dracut tries to mount and switch_root. The following recipe may help from the Dracut emergency holographic shell:

$ zfs umount -a
$ umount /sysroot
$ zpool export rpool

# One or more of the above might fail, but they're just to get things back to a known state.

$ zpool import -N rpool
$ mount -t zfs rpool/ROOT /sysroot
$ cp /etc/zfs/zpool.cache /sysroot/etc/zfs/
$ exit

# Dracut should try to mount root again and hopefully succeed.

The most common reason for failing on startup is that ZFS doesn't have a correct cached view of its device layout. The above incantation exports any imported pool, imports it again (creating zpool.cache in the process), mounts the root dataset, copies zpool.cache into the root dataset, then lets Dracut continue booting.

After running the above, you should regenerate your initramfs to ensure the updated zpool.cache is included.

Setup Mirroring - No Turning Back!

Assuming everything worked, now you can mirror everything over to your original drive and run on a pure ZFS mirrored system. Note that if you're using RAID-Z, you may just need to remove your original drive and call it good at this point.

Partitioning

First, you need to wipe & repartition your existing drive to match the new drive. If you used MBR, you might do something like:

$ sfdisk -d /dev/sdb | sfdisk /dev/sda

If you used GPT, you'll need to use sgdisk.
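One way to clone the layout with sgdisk is its backup/restore functions; a sketch only, so double-check your device names before running it:

# Copy the partition layout from the ZFS disk (sdb) onto the old system disk (sda),
# then give the copy new unique GUIDs so the two disks don't collide.
$ sgdisk --backup=/tmp/sdb.gpt /dev/sdb
$ sgdisk --load-backup=/tmp/sdb.gpt /dev/sda
$ sgdisk --randomize-guids /dev/sda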

Setting up ZFS Mirror

Now mirror the zpool:

# Make sure the **existing** device is listed first.
$ zpool attach rpool sdb2 sda2

You can keep an eye on the progress by running zpool status:

$ watch zpool status

The zpool should mirror relatively quickly since only used storage is copied. The resilver process doesn't bother to copy over unused portions of the drive.

Rebuilding initramfs

Reading through Solaris documentation, you may find the phrase "rebuild the boot archive". The Linux equivalent is the initramfs, and it too needs to be rebuilt after significant ZFS changes. Adding a new device to a pool causes changes to the /etc/zfs/zpool.cache file which should be available at boot time in the initramfs environment.

$ genkernel --loglevel=5 initramfs

Installing Grub (Again)

Repartitioning destroyed your GRUB boot sector, so:

$ grub-install /dev/sda

You're Done!

That's all there is to it (yeah. that's all...). Enjoy your new Gentoo ZFS system!

Need Help?

If you run into trouble, please stop by the zfs-discuss Google Group and ask a question.