
Wiki: Debian Stretch with ZFS boot, plus ZFS and LUKS root #6056

Closed
ghost opened this issue Apr 21, 2017 · 26 comments

@ghost

ghost commented Apr 21, 2017

This is a feature request for a HOWTO covering Debian Stretch with ZFS /boot and ZFS and LUKS root. Ideally, the HOWTO will include instructions for mirroring the /boot and root pools in case of drive failure, and perhaps also for mirroring the BIOS boot partition (even though the latter will not be using ZFS).
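
For illustration, a pool can be mirrored either at creation time or by attaching a second disk to an existing single-disk vdev; a minimal sketch with hypothetical pool names and device paths:

    # Create the boot pool as a two-way mirror from the start
    # (pool names and partition paths are placeholders):
    zpool create bpool mirror \
        /dev/disk/by-id/ata-DISK_A-part3 \
        /dev/disk/by-id/ata-DISK_B-part3

    # Or turn an existing single-disk root pool into a mirror later;
    # ZFS resilvers the newly attached device automatically:
    zpool attach rpool /dev/disk/by-id/ata-DISK_A-part4 \
        /dev/disk/by-id/ata-DISK_B-part4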

I have begun work on this here by forking Debian Jessie Root on ZFS per #5960. (You might find the "Drive layout" section especially interesting.)

It is a work in progress. Improvements would be welcome!

I would be grateful if this issue could be left open until the HOWTO is complete. Thanks :)

P.S. To answer some potential questions:

  • Why didn't I simply fork the whole ZoL wiki repo? Because it would have potentially breached contributors' copyright, as ZFS On Linux wiki content: no license specified #5958 is not yet fixed.

  • Why am I using GitLab instead of GitHub? I prefer GitLab for the reasons I gave in ZFS on Linux code hosting currently fails GNU Ethical Repository Criteria Evaluations #6046. You're welcome to fork my repo and submit bug reports or pull requests wherever you like, though, as long as you respect the licensing terms. (Obviously, I'll only receive those bug reports or PRs if I have an account on whichever site you use.)

  • Why am I using Org instead of Markdown? Because:

    • Exporting from Org to HTML with Emacs yields a nice table of contents at the top of the file.
    • Org markup is more powerful and expressive than most (all?) flavours of Markdown.
    • Both GitLab and GitHub render Org files decently, so there's no need to use Markdown.
@jwittlincohen
Copy link
Contributor

jwittlincohen commented Apr 22, 2017

SSDs may use 8K or 16K page sizes but they are optimized for 4K reads/writes. Manufacturer-provided IOPS estimates and professional reviews all cite 4K reads and presumably use filesystems with a 4K minimum block size. As such, benchmarks are necessary prior to making a general recommendation in favor of ashift=13 for SSDs. Higher ashift values also lead to particularly bad space efficiency with raidz. With ashift=12, the overhead for 4K and 8K blocks is 200%, essentially equivalent to 3-way mirrors. Using ashift=13 will presumably result in the same 200% overhead for 4K, 8K and 16K blocks.
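
(For readers following along: ashift is fixed per vdev at creation time, so this choice has to be made up front. A minimal sketch with hypothetical pool and device names:)

    # ashift is set when the pool (or a new vdev) is created and cannot
    # be changed afterwards; ashift=12 means 2^12 = 4096-byte minimum I/O.
    zpool create -o ashift=12 tank /dev/disk/by-id/ata-EXAMPLE-DISK

    # Check what the pool reports:
    zpool get ashift tank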

Note: I saw a several-minute outage on GitLab while reviewing your HOWTO (500 errors on multiple GitLab pages).

@ghost
Author

ghost commented Apr 22, 2017

@jwittlincohen wrote:

Higher ashift values also lead to particularly bad space efficiency with raidz.

Agreed :) This is already mentioned in the HOWTO (as internal fragmentation efficiency).

Benchmarks are necessary prior to making a general recommendation in favor of ashift=13 for SSDs.

Fair point. I'll suggest that people perform benchmarking before choosing ashift=13 over ashift=12.

I saw a several-minute outage on GitLab while reviewing your HOWTO (500 errors on multiple GitLab pages).

Ah, that was probably around the moment that GitLab upgraded to GitLab EE 9.1.0 RC6. I guess they need to improve their deployment process! Thanks for persevering despite the inconvenience :)

@jwittlincohen
Contributor

@sampablokuper The OpenZFS Hardware page says, "As of 2017, NAND-flash SSDs are tuned for 4096-byte IOs. Matching the flash page size is unnecessary and ashift=12 is usually the correct choice. Public documentation on flash page size is also nearly non-existent." As such, I would leave this out unless you have testing that suggests ashift=13 would yield a significant improvement.

@richardelling
Contributor

To be precise, what matters is the ratio of the block's volblocksize/recordsize to the physical block size reported by the storage (or as overridden by physical block size = 1 << ashift). The raidz overhead will be high for volblocksize=512 and physical blocksize=512, too. There is nothing special about 4k or 8k or 1k or 2k... what is important is the size ratio.
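
For illustration, here is a toy calculation of that ratio effect, using the commonly cited simplified model of raidz1 allocation (data sectors, plus parity, padded to a multiple of parity+1). This is a sketch of the idea only, not authoritative ZFS behaviour, and the function name is made up:

    #!/bin/bash
    # raidz1_sectors <block_bytes> <sector_bytes> <disks>
    raidz1_sectors() {
        block=$1 sector=$2 disks=$3 parity=1
        data=$(( (block + sector - 1) / sector ))    # data sectors needed
        par=$(( ((data + disks - parity - 1) / (disks - parity)) * parity ))
        total=$(( data + par ))
        padded=$(( ((total + parity) / (parity + 1)) * (parity + 1) ))
        echo "$block B blocks, $sector B sectors: $data data + $par parity + $(( padded - total )) pad"
    }
    raidz1_sectors 4096 512  5    # ratio 8:1 -> 8 data + 2 parity (25% overhead)
    raidz1_sectors 4096 4096 5    # ratio 1:1 -> 1 data + 1 parity (100% overhead)
    raidz1_sectors 8192 4096 5    # ratio 2:1 -> 2 data + 1 parity + 1 pad (100% overhead)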

@ryao
Contributor

ryao commented Apr 23, 2017

@richardelling Good point.

@ghost
Author

ghost commented Apr 24, 2017

@jwittlincohen wrote:

The OpenZFS Hardware page says, "As of 2017, NAND-flash SSDs are tuned for 4096-byte IOs. Matching the flash page size is unnecessary and ashift=12 is usually the correct choice. Public documentation on flash page size is also nearly non-existent." As such, I would leave this out unless you have testing that suggests ashift=13 would yield a significant improvement.

Thanks! How's this?

@jwittlincohen
Contributor

jwittlincohen commented Apr 24, 2017

The discussion of SSD ashift values looks good. However, I don't agree with the following statement or your chart:

For HDDs, if the filesystem block size equals the drive's physical sector size,
performance will be optimal. If the filesystem block size is less than the
physical sector size, performance will typically be low. If the filesystem block
size exceeds the physical sector size, performance will typically be
intermediate.

     | Drive  | Logical sector | Physical sector | ashift=9    | ashift=9   | ashift=12   | ashift=12  | ashift=13   | ashift=13  |
     | format | size (bytes)   | size (bytes)    | performance | frag. eff. | performance | frag. eff. | performance | frag. eff. |
     |--------|----------------|-----------------|-------------|------------|-------------|------------|-------------|------------|
     | 512n   |            512 |             512 | High        | High       | Medium      | Medium     | Medium      | Low        |
     | 512e   |            512 |            4096 | Medium      | High       | High        | Medium     | Medium      | Low        |
     | 4Kn    |           4096 |            4096 | Low         | High       | High        | Medium     | Medium      | Low        |
     | SSD    |            N/A |             N/A | Medium?     | High       | High?       | Medium     | High?       | Low        |

     (frag. eff. = internal fragmentation efficiency)

Why would using a block size larger than the physical sector size reduce performance, so long as you're using a multiple of the physical block size? For example, using ashift=12 to force a minimum 4K block size on a disk with 512-byte sectors should yield performance identical to using ashift=9. The benefit is that should you upgrade any of the drives in the vdev in the future, you can use either Advanced Format (4K) or 512-byte-sector disks. You'll get worse space efficiency, but this shouldn't impact performance, and given that virtually all new SATA disks are now Advanced Format, it's a good option for future-proofing. What do you mean by "performance"?
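
(As an aside: the logical and physical sector sizes a drive reports can be checked before choosing ashift. The device name below is a placeholder, and note that 512e drives will report 512/4096.)

    # Logical vs. physical sector size as seen by the kernel:
    lsblk -o NAME,LOG-SEC,PHY-SEC /dev/sda

    # The same values straight from sysfs:
    cat /sys/block/sda/queue/logical_block_size
    cat /sys/block/sda/queue/physical_block_size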

@ghost
Author

ghost commented Apr 24, 2017

@jwittlincohen, thanks for the feedback :)

Why would using a block size larger than the physical sector size reduce performance[...]?

I believe I read that it did, somewhere in the documentation of one of the operating systems that uses ZFS. Tracking down that source is not a top priority for me, but if I do locate it, I will mention it here.

My guess is that this occurs because using blocks larger than the physical sectors would cause more data to be sent to or read from the drive than necessary. For instance, using a 4 KiB filesystem block size with 512-byte physical sectors would mean that up to 8x more data would be sent to or read from the drive than necessary. Even though it would just be padding, while it is being read or written, that extra overhead would keep the storage bus and the HDD heads busy. This would potentially reduce the rate at which genuine data (not padding) could be stored to or retrieved from the drive, which is what I mean by performance.

Perhaps I am mistaken. Maybe some kind of compression and decompression occurs in the drive firmware and at the southbridge, respectively, on the way from the drive to the CPU (and likewise in the other direction), such that the padding imposes no cost on the storage bus or on the components at either end of it. Or perhaps that cost is avoided in some other way.

I would be glad to know if this is the case :)

@gmelikov
Member

Just my first thoughts.

It looks like LUKS is mandatory; I think it's better to make it optional.

About the performance recommendations and other background information - maybe it's better to just give some links? In my opinion, an instruction doesn't need to duplicate other wiki pages; it's better to have the small but exact set of information anybody needs to just install the system on ZFS.

And one last thing for now - scripts are great, but for me the best way is manual commands (I usually make many changes to them). Everybody just has different workloads. Maybe describe the manual way and give the scripts too? I have already done something similar here.

@jwittlincohen
Contributor

jwittlincohen commented Apr 24, 2017

@gmelikov

Wouldn't this Howto be duplicative if it didn't cover encryption? There is essentially no difference between installing a ZFS root on Debian Stretch versus doing so on Debian Jessie. The only change is that Stretch includes an appropriate version of Grub, which changes one command in the howto. This HOWTO already covers the general case of installing ZFS on Root in Debian Jessie (will be updated for Stretch when Stretch is released).

@sampablokuper

I agree with gmelikov regarding the use of scripts. I used the existing Debian on Root Howto for my own system and changed several commands to suit my needs. I think that presenting the commands to the user (at least the ZFS ones) also has the secondary benefit of demonstrating how easy it is to create and set properties for ZFS filesystems. In addition, the existing Howto describes this as an advanced topic. After all, an rpool can make a system unbootable if a user installs a new kernel but new ZFS/SPL modules are not generated. I'm not sure we want to make the process into a black box (just run X script) for simplicity because there's no automated script for fixing the system when it breaks. I'm somewhat ambivalent about this. What do you think?

@gmelikov
Member

Wouldn't this Howto be duplicative if it didn't cover encryption?

Honestly, I thought that @sampablokuper wanted to update it, not write an additional one.

Closer to the question - I think it would be better to have LUKS as an option, not as a separate instruction, because it just adds some steps and doesn't change the whole process from the ZFS side.

@jwittlincohen
Contributor

I agree. That makes more sense, but I was under the impression he was creating a new Howto given that he said he forked the old one and is putting the new one on Gitlab.

@ghost
Author

ghost commented Apr 24, 2017

@gmelikov wrote:

Just my first thoughts.

Thanks for the feedback :)

It looks like LUKS is mandatory; I think it's better to make it optional.

Are you saying you would prefer to see the HOWTO cover:

  • a different block device encryption mechanism (e.g. plain dm-crypt) instead of LUKS, or
  • a way to install Stretch on ZFS without block device encryption?

If the former, fair enough. LUKS seemed a good starting point because it is the system used by the Debian Installer, so should work with a kernel copied from an existing Stretch on LUKS installation. If you think there's a really good reason to perform the encryption and key setup with a different piece of free software, let me know and I'll look into it.
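
(For readers unfamiliar with the layering, a minimal sketch of putting a pool on top of LUKS, with hypothetical partition, mapping and pool names:)

    # Format the root partition as a LUKS container, then open it:
    cryptsetup luksFormat /dev/disk/by-id/ata-EXAMPLE-DISK-part4
    cryptsetup open --type luks /dev/disk/by-id/ata-EXAMPLE-DISK-part4 root_crypt

    # Create the root pool on the decrypted mapping, not on the raw partition:
    zpool create -o ashift=12 rpool /dev/mapper/root_crypt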

If instead you are asking for an encryption-free HOWTO, that's also fair enough, but I am aiming for this particular HOWTO to be secure by default. Once it is complete, it can always be forked to create a version for people who rely on physical security measures instead of device encryption.

About the performance recommendations and other background information - maybe it's better to just give some links?

When I attempted to follow the Debian Jessie Root on ZFS HOWTO, I found the contextual information in it very helpful, but not always sufficient. Sometimes I still found myself asking, "Why are they suggesting that setting?", etc.

My aim with the Stretch HOWTO was to forestall those questions by providing a comprehensive rationale starting with the end goals and linking these back to the installation procedure and settings. I want to minimise duplicating others' efforts though. If background information can be found elsewhere, my aim was to summarise it using the relevant keywords and links, such that the summary is probably sufficient but that the reader will otherwise be equipped to learn more elsewhere.

I agree that the "Considerations" section is long and might get in the way of a reader who wants to get straight to the installation procedure (hence the "TL;DR" near the top of the page). I may well split it out into a separate "Rationale" document, to make it less intrusive.

scripts are great, but for me the best way is manual commands (I usually make many changes to them). ... Maybe describe the manual way and give the scripts too?

I am starting with scripts because that speeds up development of the HOWTO. It lets me wipe the relevant parts of the drive and re-run the installation procedure with just a few commands, instead of having to do lots of slow and error-prone (for me, at least!) copying and pasting. I hope that if other people test the HOWTO, they will also benefit from this convenience, enabling them to provide feedback to me more quickly.

Once the scripts are working, then maybe I (or someone else: the HOWTO is a free cultural work, after all) will derive a manual tutorial from them. I hope this would involve little more work than marking up the comments as body text and the commands as code blocks. Thinking aloud: Org-mode might even be able to automate this, depending upon how I mark up the scripts...?

I have already done something similar here.

Cool, thanks for drawing my attention to it :)

@ghost
Author

ghost commented Apr 24, 2017

@jwittlincohen wrote:

I was under the impression he was creating a new Howto given that he said he forked the old one and is putting the new one on Gitlab.

Besides being for Stretch rather than Jessie, the HOWTO that I am working on has slightly different goals to the one it is forked from, e.g.:

  • Source drive: a Stretch install on a USB stick, instead of a Jessie LiveCD.
  • Block device encryption: root partition on LUKS, instead of none.
  • /boot on its own ZFS partition.
  • Installation of Debian onto target drive by copying from source drive instead of using debootstrap.
  • Explain rationale for installation decisions in slightly more depth.

Ideally, when the HOWTO I'm working on is more or less finished, it will be integrated into the ZoL wiki.

@gmelikov
Member

a way to install Stretch on ZFS without block device encryption?

This is what I meant, yes.

but I am aiming for this particular HOWTO to be secure by default

You create a separate /boot. IIRC, it's not 100% secure with unencrypted boot. In short - I don't see the reason to install ZFS-on-LUKS on every machine.

Explain rationale for installation decisions in slightly more depth.

The main point of all our install-root-on-ZFS instructions is that they are for experienced people only, because the process isn't 100% bulletproof yet. I won't mind an easier instruction, though, but bear that in mind.

Ideally, when the HOWTO I'm working on is more or less finished, it will be integrated into the ZoL wiki.

What's your main goal? If integration isn't it, then we will just rework your variant for our needs later, and you can do it as you like. Encryption isn't mandatory for root now, especially because we will have #5769 soon. Honestly, I just want to understand whether we are making an instruction which will be ready for our wiki here, or just a good basis for one.

@jwittlincohen
Contributor

jwittlincohen commented Apr 24, 2017

@gmelikov

I believe native encryption is still a ways off (it was tagged for 0.7.0-rc4, but that tag was removed, presumably indicating it has been pushed back). There is utility in offering a solution for encryption today, although I agree with you that it certainly isn't necessary.

@ghost
Author

ghost commented Apr 25, 2017

@gmelikov wrote:

it's not 100% secure with unencrypted boot.

Even with encrypted boot, it wouldn't be 100% secure.

I don't see the reason to install ZFS-on-LUKS on every machine.

Neither do I ;) There are some use-cases where LUKS makes sense: machines that have low physical security and might be stolen, like laptops, desktops, and maybe some home or office file servers or entertainment PCs. There are some where it makes less sense: racked servers in locked cabinets in locked rooms in locked data centres. The HOWTO I am working on is for the use-cases where it makes sense.

What's your main goal?

My primary goal is to (learn how to) achieve the goals described in the "Goal" section of the HOWTO, for the reasons given elsewhere in the HOWTO (which I will expand), and to document the necessary steps. In other words, it is to become more knowledgeable and capable about using ZFS with free software: to go from being someone who is reliant upon official OS installers (e.g. from FreeBSD or Ubuntu) in order to install an OS onto ZFS, to someone who understands and can manually perform all the steps involved.

My secondary goal is to help others. I hope that the HOWTO will be helpful for other people who are interested in learning how to install Stretch onto a LUKS-encrypted ZFS root, with a ZFS /boot, and for people who might not want to actually do that but who might want to read about the rationale behind doing it.

My tertiary goal is to have the HOWTO incorporated into the ZoL wiki. This will help to reduce the fragmentation of ZoL-related tutorials across the Web.

@jwittlincohen wrote:

I agree with gmelikov regarding the use of scripts. [...] I'm not sure we want to make the process into a black box (just run X script) for simplicity [...] What do you think?

I agree about not wanting the process to be a black box. The scripts are already extensively commented, and I want to keep them that way. Ultimately, I think the best approach is probably to use Org-mode's literate programming features to provide a HOWTO that looks like the Debian Jessie Root on ZFS one, but that can have scripts "tangled" from it (see literate programming link) at the push of a button, so that the user can easily choose whether to proceed step-by-step or via a script.
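
(For anyone unfamiliar with tangling: Org can extract the code blocks of a document into standalone script files. A minimal sketch of doing that non-interactively, with a hypothetical file name:)

    # "Tangle" the code blocks out of howto.org into script files,
    # without opening an interactive Emacs session:
    emacs --batch -l org --eval '(org-babel-tangle-file "howto.org")'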

@gmelikov
Member

gmelikov commented Apr 26, 2017

@sampablokuper, to make our discussion shorter - it would be great if your instruction had:

  • encryption as an option;
  • manual and automatic (script) methods;

so that it is ready for the ZoL wiki without any major modification.

If you are not ready to do that, the variant for the ZoL wiki will be based on your work (I'll change it for our needs).

Thank you =)

@gmelikov
Member

gmelikov commented Aug 17, 2017

Recently I updated the wiki for the Stretch release (https://github.com/zfsonlinux/zfs/wiki/Debian-Stretch-Root-on-ZFS), and native encryption is already in the master branch, so I prefer to close this issue as stale, @sampablokuper, if you don't mind. Thanks for your work!

@sampablokuper if you are ready to contribute your work, feel free to reopen this issue with patches.

@sjau

sjau commented Oct 23, 2017

@gmelikov I've seen your HOWTO and I tried to follow it using ZFS native encryption (getting ZFS from git master), but in the end I failed... it seems I could get encryption to run fine, but on boot-up there were multiple issues with the initramfs. I used #6044 (comment) to compile zfs-dkms from git.

I'm currently using encrypted root ZFS on a notebook with NixOS, and it's fairly easy there. However, I require a Debian server, and I'd love to employ encrypted root ZFS there instead of the mdadm RAID 1 + LUKS/dm-crypt + ZFS stack.

@gmelikov
Copy link
Member

@sjau see #5489; booting from native encryption is not implemented yet (the initramfs scripts must be modified).

@sjau

sjau commented Oct 23, 2017

Thanks for the quick answer. I hope this will be realized soon. As I said, on my notebook it's nice to just have ZFS handling RAID 1, encryption and the filesystem (with a small ext4 partition for /boot). It would be nice to have that on Debian too.

@gmelikov
Member

@sjau interesting - do I understand you correctly that you boot from a natively encrypted ZFS pool (i.e. use it as /)? It would be great if someone could backport any relevant patches from NixOS. Let's use #5489 for discussion.

@sjau

sjau commented Oct 23, 2017

Not really sure how NixOS manages it... the HOWTO is here: https://nixos.wiki/wiki/NixOS_on_ZFS#Encrypted_ZFS. It even contains a section on how to remotely unlock an encrypted root ZFS (which I have tested in a VM and on an old notebook, and it works).

I could upload the generated initrd somewhere if you want...

@sjau

sjau commented Oct 24, 2017

@sjau

sjau commented Oct 25, 2017

So, I managed to build an encrypted root ZFS on Debian. I created corresponding bash scripts for QEMU, if anyone wants to try them.

What I basically did is:

  1. Boot the Debian Live CD (I just used the Xfce image because it's the smallest)
  2. Add the sources list and install the dependencies for building ZFS
  3. Build ZFS from git master using jwittlincohen's instructions here: [Feature Request] Add support for building DKMS ZFS and SPL binaries for Debian/Ubuntu #6044 (comment)
  4. Then use sgdisk to auto-partition /dev/sda
  5. Create the pool, an encrypted dataset and then a child dataset for root, mostly following the instructions here: https://github.com/zfsonlinux/zfs/wiki/Debian-Stretch-Root-on-ZFS (see the sketch after this list)
  6. Then chroot into the new system and run the chroot-install script, which basically redoes adding sources.list, getting the dependencies and building ZFS from master again
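
A minimal sketch of step 5, using the pool/dataset names that appear below and the native-encryption properties available in the master branch at the time (the exact property values here are an assumption; check zfs(8) from your build):

    # Create the pool (device path is a placeholder):
    zpool create -o ashift=12 -O mountpoint=none tankie /dev/sda2

    # Encrypted parent dataset; children inherit the encryption,
    # so one passphrase unlocks everything below it:
    zfs create -o encryption=aes-256-gcm -o keyformat=passphrase \
        -o keylocation=prompt tankie/encZFS

    # Child dataset that will hold the root filesystem:
    zfs create -o mountpoint=legacy tankie/encZFS/Root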

However there are a few issues:

The initrd contains a zfs script which works nicely in general, but it does not unlock encrypted datasets, because the zfs load-key -a call is missing.

So I unpacked the generated initrd:

    # Extract the gzip-compressed cpio archive into a scratch directory:
    mkdir /tmp/initrd
    cd /tmp/initrd
    gzip -cd /boot/initrd.img-4.9.0-4-amd64 | cpio -i

I went to the scripts folder and edited the zfs script. Before this section:

    # ----------------------------------------------------------------
    # P R E P A R E   R O O T   F I L E S Y S T E M

I just added the zfs load-key -a command:

    zfs load-key -a

    # ----------------------------------------------------------------
    # P R E P A R E   R O O T   F I L E S Y S T E M

I saved it and created the initrd again (but named it differently, so that I still have the original one):

    # Repack the modified tree into a new gzip-compressed cpio archive:
    find . | cpio --dereference -o -H newc | gzip > /boot/initrd.new

However, grub still has two issues with the boot entry; upon reboot you need to edit the entry by hand. For me, the linux line looks like this by default:

    linux /vmlinuz-4.9.0-amd64 root=ZFS=/encZFS/Root ro quiet

Two things are wrong there: you need to add boot=zfs so that the zfs script is actually executed, and for some reason the zpool name was not added properly to the root= entry.

In the end I needed to change it manually to:

    linux /vmlinuz-4.9.0-amd64 boot=zfs root=ZFS=tankie/encZFS/Root ro quiet

Also, since I altered the zfs script in the original initrd and renamed the initrd, I had to adjust that entry as well to point at the new initrd:

    initrd /initrd.new
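
(One way to make at least the boot=zfs part stick, instead of editing the entry at every boot, would be Debian's usual mechanism; a sketch, assuming the root= problem is fixed separately:)

    # Append the missing parameter to the default kernel command line:
    sed -i 's/^GRUB_CMDLINE_LINUX="/&boot=zfs /' /etc/default/grub

    # Regenerate /boot/grub/grub.cfg with the new parameter:
    update-grub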

A video can be found here: https://youtu.be/JrT9dQFTtFA

Another thing I noticed: the /boot partition (/dev/sda1) didn't get automounted after mounting the encrypted root ZFS child dataset.
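
(A likely fix is a plain fstab entry for that partition; ext4 here is an assumption, adjust to whatever filesystem /dev/sda1 actually carries:)

    # Mount /boot at every boot via /etc/fstab:
    echo '/dev/sda1 /boot ext4 defaults 0 2' >> /etc/fstab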

The scripts used for creating the VM are here: https://github.com/sjau/bash-stuff/tree/master/ZFS-Debian

qemuSetup: this script just creates the QEMU disk and two bash launchers for the VM - one with the live DVD attached, one without.

install & chroot-install: both scripts need to go into the /root folder of the booted live DVD session.

After that, first run the install script; it will ask for a few things, but they should be easy to answer... at the end it copies the chroot-install script to /mnt/root/ and chroots into /mnt.

Then just change directory to /root and run the chroot-install script.

Once done, you need to unpack the generated initrd and make the above-mentioned change to the zfs script inside - I haven't automated this yet. Repack it and reboot.

Upon reboot, edit the grub entries as explained above and shown in the video.
