LUKS + BTRFS + unencrypted /boot partition made by manual partitioning = unbootable system #2281

Closed
ArrayBolt3 opened this issue Feb 6, 2024 · 20 comments

Comments

@ArrayBolt3
Contributor

ArrayBolt3 commented Feb 6, 2024

Please note that I am not using patches to enable automatic creation of an unencrypted /boot partition to trigger this bug. This should be reproducible using a vanilla Calamares and manual partitioning.

Describe the bug
When using manual partitioning on Lubuntu 24.04 Alpha, it is possible to set up a partition layout as follows:

  • Partition table: MBR
  • Partition 1: 4GB ext4 unencrypted, mounted at /boot
  • Partition 2: 100% BTRFS LUKS-encrypted, mounted at /

Installing with this partition layout succeeds, but the resulting system fails to boot. Instead, an error from cryptsetup appears on the Plymouth screen stating "bad passphrase or options?", soon followed by another error stating that the maximum number of passphrase input attempts has been reached. Hitting Esc reveals that the system is attempting to read from a non-existent file /crypto_keysetup.bin in a loop. After roughly thirty or forty seconds of this, the system drops to an initramfs prompt.

To Reproduce
Steps to reproduce the behavior:

  1. Download the latest Lubuntu 24.04 Alpha ISO, and boot it on a BIOS system (I used a virt-manager VM).
  2. Launch the installer.
  3. On the partitioning screen, choose manual partitioning.
  4. Click "Create Partition Table", and make an MBR partition table.
  5. Create a new partition, set the filesystem to ext4, set the size to 4 GiB, ensure that encryption is not enabled, set the mount point to /boot, and enable the boot flag.
  6. Create a new partition, set the filesystem to btrfs, leave the size at whatever default it's at, check the Encrypt box and type a password, and set the mount point to /.
  7. Proceed through the rest of the installer and allow the installation to complete.
  8. Attempt to boot into the freshly installed system. The behavior described above will occur.

Expected behavior
You should be asked on the Plymouth screen for the passphrase you typed earlier, and providing that passphrase should result in the system booting properly.

Additional context
This does not occur if you follow the reproduction steps but set the root partition's filesystem to ext4 rather than btrfs. Something about using btrfs triggers this.

It looks like the initramfs is looking for a crypto keyfile that was never generated and thus feeding a blank (and incorrect) password into cryptsetup. Indeed, the crypto keyfile should (and even must) be absent so as to avoid bypassing encryption entirely with an unencrypted /boot partition - the bug is that the system is looking for it and attempting to use it even though /boot was left unencrypted.

This was first encountered while attempting to install Kubuntu using a patched Calamares to allow unencrypted /boot to be used with automatic partitioning. However, the above steps do not require the extra patches to reproduce.

@dalto8
Contributor

dalto8 commented Feb 6, 2024

Can we see the full install log?

@ArrayBolt3
Contributor Author

https://termbin.com/ls5m

@dalto8
Contributor

dalto8 commented Feb 6, 2024

Hitting Esc reveals that the system is attempting to read from a non-existent file /crypto_keysetup.bin

Is this a typo because we create a file called crypto_keyfile.bin?

Assuming it is a typo, I am not sure I see anything obvious that would cause this.

What does the /etc/crypttab on the installed system look like?

@ArrayBolt3
Contributor Author

Oops, yes it is a typo. It's looking for /crypto_keyfile.bin. Sorry about that.

In case it's helpful, I can actually get the system to boot once I'm at the initramfs prompt: type exit so it reports which device is missing, run cryptsetup luksOpen /dev/vda2 luks-<UUID> (replacing <UUID> as appropriate), provide the passphrase, then type exit once more. The Plymouth screen reappears after this and the system boots to a desktop.

Here are the contents of /etc/crypttab: https://termbin.com/8mdm

@ArrayBolt3
Contributor Author

ArrayBolt3 commented Feb 6, 2024

Looks like the crypttab is looking for the keyfile. Looking at src/modules/fstab/main.py, it seems that the logic is there for disabling the keyfile field for / if the /boot partition is unencrypted though. I wonder if perhaps that logic is getting faked out by the default BTRFS subvolumes Calamares creates? The partition isn't being mounted at / (which is what the fstab module checks for), a subvolume of it is.

@ArrayBolt3
Contributor Author

OK, so I did a bit more digging by patching the fstab module's main.py like so:

def generate_crypttab(self):
    """ Create crypttab. """
    mkdir_p(os.path.join(self.root_mount_point, "etc"))
    crypttab_path = os.path.join(self.root_mount_point, "etc", "crypttab")

    libcalamares.utils.warning("!!! Partition dict !!! :" + str(self.partitions)) # I added this line
    ...

This gave me a very messy line of debug output that I formatted nicely:

[
    {   
        'claimed': True,
        'device': '/dev/vda1',
        'features': {}, 
        'fs': 'ext4',
        'fsName': 'ext4',
        'mountPoint': '/boot',
        'partattrs': 0,
        'partlabel': '', 
        'parttype': '', 
        'partuuid': '', 
        'uuid': 'b038ad74-4da8-4902-9dd2-e04caeaff9a9'
    },  
    {   
        'claimed': True,
        'device': '/dev/vda2',
        'features': {}, 
        'fs': 'btrfs',
        'fsName': 'luks',
        'luksMapperName': 'luks-1cad7028-37a0-425b-868d-2c090056fe51',
        'luksPassphrase': 'qwe',
        'luksUuid': '1cad7028-37a0-425b-868d-2c090056fe51',
        'mountPoint': '/swap',
        'partattrs': 0,
        'partlabel': '', 
        'parttype': '', 
        'partuuid': '', 
        'uuid': '1cad7028-37a0-425b-868d-2c090056fe51',
        'subvol': '/@swap'
    },  
    {   
        'fs': 'swap',
        'mountPoint': None,
        'claimed': True,
        'device': '/swap/swapfile',
        'uuid': None
    }   
]

Looks like the BTRFS partition isn't recognized as being mounted on /, but rather on /swap.

@ArrayBolt3
Contributor Author

Also, here's the VM's fstab: https://termbin.com/6dvt

@dalto8
Contributor

dalto8 commented Feb 6, 2024

The btrfs partition actually should have a mountpoint of /.

From your debug output, it looks like something in the fstab main.py is modifying the partitions object. Then later, something else is expecting it to be unmodified.

@iam-TJ

iam-TJ commented Feb 7, 2024

The issue here is that BTRFS subvolumes are akin to separate block devices for mount and fstab purposes but Calamares is not special-casing handling of multiple subvols in the same block device.

As @ArrayBolt3 shows in the additional debug output, the /dev/vda2 LUKS device is home to several pseudo-block devices (BTRFS subvolumes), and the partition dictionary entry for vda2 ends up being associated with whatever the last BTRFS subvol in fstab happens to be (in this case /swap).

That then breaks the tests in src/modules/fstab/main.py::generate_crypttab_line_info(), since the mountpoint is /swap, not /.
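
To illustrate why the wrong mountPoint matters, here is a minimal sketch of the kind of check described above. It is not the actual Calamares code: the function name crypttab_password_field and its simplified logic are assumptions made for illustration. Only the dictionary keys and the keyfile path are taken from the debug output and crypttab posted in this thread.

```python
KEYFILE = "/crypto_keyfile.bin"  # path seen in the generated crypttab


def crypttab_password_field(partition, partitions):
    """Hypothetical, simplified stand-in for the check in the fstab module.

    Returns "none" (prompt for a passphrase) for an encrypted root when a
    separate /boot is unencrypted, otherwise the keyfile path.
    """
    boot_unencrypted = any(
        p.get("mountPoint") == "/boot" and "luksMapperName" not in p
        for p in partitions
    )
    if partition.get("mountPoint") == "/" and boot_unencrypted:
        return "none"
    return KEYFILE


boot = {"device": "/dev/vda1", "mountPoint": "/boot"}
root = {"device": "/dev/vda2", "mountPoint": "/", "luksMapperName": "luks-example"}

expected = crypttab_password_field(root, [boot, root])

# Simulate the corruption: the same partition now claims to be mounted at /swap.
root["mountPoint"] = "/swap"
corrupted = crypttab_password_field(root, [boot, root])

print(expected, corrupted)
```

With the mount point intact, the sketch returns "none". Once the entry reports /swap, the root-specific branch is never taken and the keyfile path is written, which matches the symptom seen at boot.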

The attached patch should show what is happening
calamares.patch.txt

@dalto8
Contributor

dalto8 commented Feb 7, 2024

The issue here is that BTRFS subvolumes are akin to separate block devices for mount and fstab purposes but Calamares is not special-casing handling of multiple subvols in the same block device.

Not really. The way Calamares handles btrfs is that it applies the subvolume layout to the partition assigned to /. So that should work. The btrfs partition that contains the root subvolume should always have a mountpoint of / even though technically it is a subvolume that is mounted to /.

However, from the looks of the debug output, partitions is actually being modified by something since it now shows that it is mounted to /swap. We will have to look at the code to see what is changing it.

@dalto8
Contributor

dalto8 commented Feb 7, 2024

This is probably the issue:

https://github.com/calamares/calamares/blob/calamares/src/modules/fstab/main.py#L204

I can fix it this weekend unless someone wants to PR it before I get to it.

@iam-TJ

iam-TJ commented Feb 7, 2024

There is of course a logic problem in the code: the mount_entry = partition reference should be mount_entry = copy.deepcopy(partition). However, I am unable to reproduce the symptoms originally described by @ArrayBolt3.
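
As a rough illustration of the reference-versus-copy problem, the sketch below loops over subvolumes the way the log output suggests. The function build_mount_entries, its use_copy flag, and the subvolume dictionary keys are assumptions for this example and are not the real fstab module code.

```python
import copy


def build_mount_entries(partition, subvolumes, use_copy):
    """Hypothetical loop over BTRFS subvolumes, not the real fstab module."""
    entries = []
    for sv in subvolumes:
        # Plain assignment makes mount_entry an alias of partition, so the
        # writes below also change the caller's dictionary.
        mount_entry = copy.deepcopy(partition) if use_copy else partition
        mount_entry["mountPoint"] = sv["mountPoint"]
        mount_entry["subvol"] = sv["subvolume"]
        entries.append(mount_entry)
    return entries


subvolumes = [
    {"mountPoint": "/", "subvolume": "/@"},
    {"mountPoint": "/home", "subvolume": "/@home"},
    {"mountPoint": "/swap", "subvolume": "/@swap"},
]

aliased = {"device": "/dev/vda2", "mountPoint": "/"}
build_mount_entries(aliased, subvolumes, use_copy=False)

copied = {"device": "/dev/vda2", "mountPoint": "/"}
build_mount_entries(copied, subvolumes, use_copy=True)

print(aliased["mountPoint"], copied["mountPoint"])
```

Without the copy, the original dictionary ends up with the mount point and subvol of the last subvolume processed. With deepcopy() it keeps its mount point of /.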

Using the Lubuntu 24.04 amd64 installer ISO in a virtual machine I followed the reproducer instructions several times, with debug logic added to report the BTRFS items and the state of partition with the existing code, and again using deepcopy()

That shows the partition being modified by each of the BTRFS subvolumes and ending up with the last subvolume's mount point as its mountPoint, but this is not sufficient to cause generation of /crypto_keyfile.bin.

Extracts from the "bad" (mount_entry = partition) and "good" (mount_entry = copy.deepcopy(partition)) session logs below. Note how in both cases /boot/ is correctly detected as unencrypted.

"Bad":

2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "Entering generate_fstab() BTRFS" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  PARTITION" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  claimed = False" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  device = /dev/vda1" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  features = {}" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  fs = ext4" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  fsName = ext4" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  mountPoint = /boot" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  partattrs = 0" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  partlabel = " 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  parttype = " 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  partuuid = " 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  uuid = f676c4e0-4255-422a-8f6a-f32355fd60d3" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  BTRFS mountPoint / /" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  BTRFS subvolume /@" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  BTRFS mountPoint / /home" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  BTRFS subvolume /@home" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  BTRFS mountPoint /home /swap" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  BTRFS subvolume /@swap" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  PARTITION" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  claimed = True" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  device = /dev/vda2" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  features = {}" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  fs = btrfs" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  fsName = luks" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  luksMapperName = luks-8a0b4856-a624-44c8-b035-135988c7b380" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  luksPassphrase = letmein" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  luksUuid = 8a0b4856-a624-44c8-b035-135988c7b380" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  mountPoint = /swap" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  partattrs = 0" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  partlabel = " 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  parttype = " 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  partuuid = " 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  uuid = 8a0b4856-a624-44c8-b035-135988c7b380" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  subvol = /@swap" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  PARTITION" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  fs = swap" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  mountPoint = None" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  claimed = True" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  device = /swap/swapfile" 
2024-02-07 - 14:24:06 [6]: [PYTHON JOB]: "  uuid = None" 
2024-02-07 - 14:24:39 [6]:     Starting job "Configuring LUKS key file." ( 16 / 40 ) 
2024-02-07 - 14:24:39 [6]: virtual Calamares::JobResult LuksBootKeyFileJob::exec()
2024-02-07 - 14:24:39 [6]:     There are 1 LUKS partitions 
2024-02-07 - 14:24:39 [6]:     .. root "/dev/vda2" passphrase? true 
2024-02-07 - 14:24:39 [6]:     .. /boot partition is not encrypted, skipping keyfile creation.

And the "Good":

2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "Entering generate_fstab() BTRFS" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  PARTITION" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  claimed = False" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  device = /dev/vda1" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  features = {}" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  fs = ext4" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  fsName = ext4" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  mountPoint = /boot" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  partattrs = 0" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  partlabel = " 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  parttype = " 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  partuuid = " 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  uuid = c4c568a3-8efc-4ead-aebe-148bf170dc0c" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  BTRFS mountPoint / /" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  BTRFS subvolume /@" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  BTRFS mountPoint / /home" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  BTRFS subvolume /@home" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  BTRFS mountPoint / /swap" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  BTRFS subvolume /@swap" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  PARTITION" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  claimed = True" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  device = /dev/vda2" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  features = {}" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  fs = btrfs" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  fsName = luks" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  luksMapperName = luks-ca23147a-0e1e-4232-b7c7-d763728b6637" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  luksPassphrase = letmein" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  luksUuid = ca23147a-0e1e-4232-b7c7-d763728b6637" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  mountPoint = /" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  partattrs = 0" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  partlabel = " 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  parttype = " 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  partuuid = " 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  uuid = ca23147a-0e1e-4232-b7c7-d763728b6637" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  PARTITION" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  fs = swap" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  mountPoint = None" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  claimed = True" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  device = /swap/swapfile" 
2024-02-07 - 14:43:17 [6]: [PYTHON JOB]: "  uuid = None" 
2024-02-07 - 14:43:55 [6]:     Starting job "Configuring LUKS key file." ( 16 / 40 ) 
2024-02-07 - 14:43:55 [6]: virtual Calamares::JobResult LuksBootKeyFileJob::exec()
2024-02-07 - 14:43:55 [6]:     There are 1 LUKS partitions 
2024-02-07 - 14:43:55 [6]:     .. root "/dev/vda2" passphrase? true 
2024-02-07 - 14:43:55 [6]:     .. /boot partition is not encrypted, skipping keyfile creation.

Now, the only way I managed to reproduce @ArrayBolt3's scenario was when I re-ran the installer and, in the manual partitioning step, kept the existing partition #1 and forgot to set its mount point to /boot/. That did cause /crypto_keyfile.bin to be generated.
I didn't realise this initially and re-ran the installer several times, making the same mistake each time. I then studied the source code carefully to figure out how it could write that file if LuksBootKeyFileJob::hasUnencryptedSeparateBoot() was working correctly, saw that it was impossible, and only then reviewed what I was doing.

@ArrayBolt3
Contributor Author

@iam-TJ The bug isn't that the keyfile is not generated - the keyfile should not be generated because /boot is unencrypted. The bug is that the system looks for one anyway. I was able to reproduce this reliably in virt-manager, so I'm not sure what I'm leaving out by accident. I can try to take a video of the process.

Just to be doubly sure, you do see a picture of a numbat (looks like a brown stripey squirrel with a long face) on the desktop when you first boot the ISO, right? Just making sure you're booting the right ISO.

@iam-TJ

iam-TJ commented Feb 7, 2024

@iam-TJ The bug isn't that the keyfile is not generated - the keyfile should not be generated because /boot is unencrypted. The bug is that the system looks for one anyway.

Oh - good point! I've been so fixated on the installer-time logic I neglected the second issue - that of the crypttab content. Thanks for clearing my tunnel vision. I'll crack on with that now!

iam-TJ pushed a commit to iam-TJ/calamares that referenced this issue Feb 7, 2024
When using BTRFS multiple subvolumes exist and whilst iterating them the
partition["mountPoint"] is inadvertently changed due to taking a
reference rather than a copy.

Closes: issue calamares#2281
@iam-TJ

iam-TJ commented Feb 7, 2024

I'm about to run a "bad" reproducer to confirm that its crypttab does reference /crypto_keyfile.bin. Looking at the crypttab from the "good" run, though, the content is correct, so the use of deepcopy() solves it. I'll sort out the PR now:

# /etc/crypttab: mappings for encrypted partitions.
#
# Each mapped device will be created in /dev/mapper, so your /etc/fstab
# should use the /dev/mapper/<name> paths for encrypted devices.
#
# See crypttab(5) for the supported syntax.
#
# NOTE: You need not list your root (/) partition here, but it must be set up
#       beforehand by the initramfs (/etc/mkinitcpio.conf). The same applies
#       to encrypted swap, which should be set up with mkinitcpio-openswap
#       for resume support.
#
# <name>               <device>                         <password> <options>
luks-ca23147a-0e1e-4232-b7c7-d763728b6637 UUID=ca23147a-0e1e-4232-b7c7-d763728b6637     none 

@iam-TJ

iam-TJ commented Feb 7, 2024

Just noticed another issue, slightly cosmetic, that has caught me out several times now. When doing manual partitioning with an existing partition, if I choose to Edit it and change the format option from "Keep" to "Format", then save and return to edit it again, the formatting option shows "Keep" once more - making one think the previous change didn't save.

I did this several times before continuing and then saw in the Install summary actions list there were four entries for formatting partition #1 !

It would be good if saved changes were reflected in the radio button selection rather than defaulting to "Keep".

@dalto8
Contributor

dalto8 commented Feb 7, 2024

I did this several times before continuing and then saw in the Install summary actions list there were four entries for formatting partition #1 !

It would also be good if the action wasn't in the summary 4 times (and potentially executed 4 times).

We should probably track that in a different issue though.

@iam-TJ

iam-TJ commented Feb 7, 2024

We should probably track that in a different issue though.

Indeed - on my ToDo list but wanted to note it before I forget!

As for the "bad" run and the generated crypttab I can confirm the entry is there even though the file is not:

$ sudo qemu-nbd --connect=/dev/nbd0  /var/lib/libvirt/images/lubuntu-2403.qcow2
$ sudo cryptsetup open /dev/nbd0p2 lubuntu
$ sudo mkdir /mnt/lubuntu
$ sudo mount /dev/mapper/lubuntu /mnt/lubuntu/
$ ls /mnt/lubuntu/@/ ; tail -n 2 /mnt/lubuntu/@/etc/crypttab
bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  session.log  srv  swap  sys  tmp  usr  var
# <name>               <device>                         <password> <options>
luks-2f3344c4-03c3-4668-a0b3-ba52a8fda59c UUID=2f3344c4-03c3-4668-a0b3-ba52a8fda59c     /crypto_keyfile.bin luks,keyscript=/bin/cat
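
A check like the one above could also be scripted. The following is only a sketch: the helper missing_keyfiles is hypothetical, it assumes the four-column layout from crypttab(5) shown in the header comment, and it treats every key field other than "none" or "-" as a path relative to the mounted root. The sample lines are copied from the two crypttab files in this thread, and an empty temporary directory stands in for the mounted root subvolume.

```python
import os
import tempfile


def missing_keyfiles(crypttab_text, root):
    """Return (name, keyfile) pairs whose keyfile is absent under root.

    Hypothetical helper; fields are <name> <device> <password> <options>.
    """
    missing = []
    for line in crypttab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # no key field given, nothing to check
        name, keyfile = fields[0], fields[2]
        if keyfile in ("none", "-"):
            continue  # passphrase is prompted for, no keyfile involved
        path = os.path.join(root, keyfile.lstrip("/"))
        if not os.path.exists(path):
            missing.append((name, keyfile))
    return missing


sample = """\
# <name>               <device>                         <password> <options>
luks-2f3344c4-03c3-4668-a0b3-ba52a8fda59c UUID=2f3344c4-03c3-4668-a0b3-ba52a8fda59c     /crypto_keyfile.bin luks,keyscript=/bin/cat
luks-ca23147a-0e1e-4232-b7c7-d763728b6637 UUID=ca23147a-0e1e-4232-b7c7-d763728b6637     none
"""

# An empty directory plays the role of the mounted root (e.g. /mnt/lubuntu/@).
with tempfile.TemporaryDirectory() as empty_root:
    problems = missing_keyfiles(sample, empty_root)

print(problems)
```

Only the first entry is reported, since it names a keyfile that does not exist under the given root. Key fields that point at device nodes would need separate handling, which this sketch does not attempt.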

@kkofler
Contributor

kkofler commented Feb 8, 2024

I would even classify the formatting issue as a potential data loss bug! Just think of the use case where one accidentally selects "Format", then goes back to switch to "Keep", but the partition gets formatted anyway… bye bye data!

@ArrayBolt3
Contributor Author

Just verified that the patch from @iam-TJ does indeed fix this in production. Thank you!
