bootengine: some udev rules are in rootfs but missing in initramfs #2481

Closed
r7vme opened this Issue Jul 30, 2018 · 16 comments

r7vme commented Jul 30, 2018

Issue Report

A proper way to detect a disk in Azure was added here. The udev rules create symlinks like /dev/disk/azure/scsi1/lun0. Unfortunately, I cannot use them from Ignition.

The following config fails with "Timed out waiting for device dev-disk-azure-scsi1-lun0.device":

storage:
  filesystems:
    - name: docker
      mount:
        device: /dev/disk/azure/scsi1/lun0
        format: xfs
        wipe_filesystem: true
        label: docker
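
For context on the error message: systemd names a device unit by escaping the device path, which is how /dev/disk/azure/scsi1/lun0 turns into dev-disk-azure-scsi1-lun0.device. The following is a minimal Python sketch of that escaping, not systemd's own code; the function names are hypothetical, it skips the handling of "." and ".." components, and `systemd-escape --path` remains the authoritative tool.

```python
def systemd_escape_path(path: str) -> str:
    """Approximate systemd's path escaping for unit names (illustrative only)."""
    # Drop empty components so duplicate, leading and trailing slashes vanish.
    parts = [p for p in path.split("/") if p]
    if not parts:
        return "-"  # the root directory is special-cased as "-"
    out = []
    for i, byte in enumerate("/".join(parts).encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif byte < 128 and (ch.isalnum() or ch in ":_" or (ch == "." and i > 0)):
            out.append(ch)  # safe characters are kept as-is
        else:
            out.append("\\x%02x" % byte)  # everything else is hex-escaped
    return "".join(out)


def device_unit_for(path: str) -> str:
    """Name of the .device unit systemd would use for a device path."""
    return systemd_escape_path(path) + ".device"


if __name__ == "__main__":
    print(device_unit_for("/dev/disk/azure/scsi1/lun0"))
```

Note that a literal "-" in the path is escaped as \x2d, so by-id style paths produce noticeably longer unit names than the Azure one here.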

Inside the debug console, I see no rules for Azure (which is probably expected, as the initramfs does not supply them):

:/# ls /usr/lib/udev/rules.d/
10-dm.rules                  63-md-raid-arrays.rules    80-drivers.rules
13-dm-disk.rules             64-btrfs.rules             80-net-setup-link.rules
50-udev-default.rules        64-md-raid-assembly.rules  90-vconsole.rules
60-block.rules               71-seat.rules              95-dm-notify.rules
60-cdrom_id.rules            73-seat-late.rules         99-systemd.rules
60-persistent-storage.rules  75-net-description.rules
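
To spot gaps like this more systematically, one could compare the rule file names in the real root against those in the initramfs. This is a rough Python sketch under assumptions: `missing_rules` is a hypothetical helper, and it expects the initramfs rules to be available in an ordinary directory (for example from an already unpacked initramfs image); how that directory is obtained is outside the sketch.

```python
from pathlib import Path
from typing import List


def missing_rules(rootfs_rules: Path, initramfs_rules: Path) -> List[str]:
    """Return rule file names present in rootfs_rules but absent from initramfs_rules."""
    in_root = {p.name for p in rootfs_rules.glob("*.rules")}
    in_initramfs = {p.name for p in initramfs_rules.glob("*.rules")}
    return sorted(in_root - in_initramfs)


if __name__ == "__main__":
    import sys

    # Usage (paths are examples): script.py /usr/lib/udev/rules.d <unpacked-initramfs>/usr/lib/udev/rules.d
    if len(sys.argv) == 3:
        for name in missing_rules(Path(sys.argv[1]), Path(sys.argv[2])):
            print(name)
```

This only compares file names; a rule file that exists in both places but calls a helper binary missing from the initramfs would not be caught.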

I assume the same applies to Ignition on AWS (NVMe disks).

Would be really great to use ignition "filesystems" as it has many advantages over having systemd units to format and mount filesystems.

Bug

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1745.7.0
VERSION_ID=1745.7.0
BUILD_ID=2018-06-14-0909
PRETTY_NAME="Container Linux by CoreOS 1745.7.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Azure

Expected Behavior

Filesystem created on /dev/disk/azure/scsi1/lun0

Actual Behavior

Ignition fails to boot with timeout

Reproduction Steps

  1. Create an Azure VM and supply an Ignition config containing the filesystems snippet above
  2. The VM will fail with a provisioning timeout

Other Information

N/A

@r7vme r7vme changed the title Unable to use Azure disk (and AWS nvme disks) symlinks with ignition Unable to format Azure disk (and AWS nvme disks) with ignition Jul 30, 2018

r7vme pushed a commit to giantswarm/giantnetes-terraform that referenced this issue Jul 30, 2018

Roman Sokolkov
Identify Azure disk by LUN
Switch to systemd units. See coreos/bugs#2481

ajeddeloh commented Jul 30, 2018

Whoops. Looks like we forgot to pull them into the initramfs as well. Should grab coreos/init#268 while we're at it.

Author

r7vme commented Aug 7, 2018

Label "jira" probably means that this bug is already tracked internally. :)

Do you have some rough estimates?

Member

lucab commented Aug 7, 2018

@r7vme yes, but work has not started on it yet. You'll likely see a pingback as soon as it is picked up.

Author

r7vme commented Aug 28, 2018

Hi, is there any way I can help with this issue? Some high-level steps would help.

We (and I assume many others who use Ignition with AWS or Azure) are waiting for this fix. Bringing back systemd units that format disks is the last thing I want to do :)

Member

lucab commented Aug 28, 2018

@r7vme at a very high level, the problem is that bootengine is not installing any of the udev-related cloud bits that are used in the real rootfs. At a low level, it means setting up proper dependencies between the ebuilds, installing those bits via a dracut module, and testing the result in azure.

I haven't personally looked into this so I don't know if this was simply completely overlooked or if it is done somewhere already and just some bits are missing.

r7vme pushed a commit to r7vme/bootengine that referenced this issue Aug 30, 2018

Roman Sokolkov
Add udev rules for Azure disks
Fixes: coreos/bugs#2481

This change allows using proper disk paths in
ignition in Azure. Udev rules are taken from the coreos/init
repo.

coreos/init@3bd5cd1

Author

r7vme commented Aug 30, 2018

@lucab thanks for the instructions.

I've created a PR, but I'm not 100% sure about the ebuilds you mentioned, or how I can use ebuilds in dracut. For now I've just added the udev rule for Azure disks.

Member

lucab commented Aug 30, 2018

@r7 where did you source those udev rules from? Are they already in CL rootfs somewhere?

We should probably have them normally in CL root first, and then tell dracut to copy them to the initramfs. The ebuilds for the two components are in our overlay: bootengine and coreos-init. From those packages we assemble the content of the images.

Author

r7vme commented Aug 30, 2018

where did you source those udev rules from? Are they already in CL rootfs somewhere?

https://github.com/coreos/init/blob/master/udev/rules.d/66-azure-storage.rules

We should probably have them normally in CL root first, and then tell dracut to copy them to the initramfs.

Aha, i can do it.

r7vme pushed a commit to r7vme/bootengine that referenced this issue Aug 30, 2018

Roman Sokolkov
Add udev rules for Azure disks
Fixes: coreos/bugs#2481

This change allows to use proper Azure disk paths in ignition.

Author

r7vme commented Aug 30, 2018

@lucab updated my PR.

Author

r7vme commented Aug 31, 2018

Thanks for the review. I agree that this issue should only be closed after AWS is fixed too. Should it be reopened then?

@lucab lucab reopened this Aug 31, 2018

@lucab lucab changed the title Unable to format Azure disk (and AWS nvme disks) with ignition bootengine: some udev rules are in rootfs but missing in initramfs Aug 31, 2018

@lucab lucab removed the platform/azure label Aug 31, 2018

Member

lucab commented Aug 31, 2018

Ack, re-opened. While the specific Azure+SCSI use case should be unblocked, there may be a few more items missing (like AWS NVMe), and we should check that the relevant scripts and helpers/utilities are present in the initramfs.

Author

r7vme commented Aug 31, 2018

PR that bumps bootengine in coreos-overlay coreos/coreos-overlay#3396

r7vme pushed a commit to r7vme/bootengine that referenced this issue Sep 26, 2018

Roma Sokolkov
Add udev rules for cloud storage disks
Fixes: coreos/bugs#2481

This change allows to use proper AWS / GCE disk paths in ignition.

Author

r7vme commented Sep 26, 2018

PR for AWS disks coreos/bootengine#149

r7vme pushed a commit to r7vme/bootengine that referenced this issue Sep 26, 2018

Roma Sokolkov
Add udev rules for cloud storage disks
Fixes: coreos/bugs#2481

This change allows to use proper AWS / GCE disk paths in ignition.

r7vme pushed a commit to r7vme/bootengine that referenced this issue Oct 24, 2018

Roma Sokolkov
Add udev rules for cloud disks and nvme binary
Fixes: coreos/bugs#2481

This change allows to use proper AWS / GCE disk paths in ignition.

seh commented Dec 3, 2018

I find that even with the AWS EBS rules in place, I can use "storage.filesystems" with an unstable path like /dev/nvme1n1 for the "storage.filesystems.mount.device" field, but I can't use the more stable name I have to supply to AWS to attach the volume like /dev/sdf. Ignition times out consistently waiting for the corresponding systemd "device unit" to start, as noted over in coreos/coreos-overlay#3366 here.

In my EC2 instance, I can see the rules present that @r7vme had originally pointed out were missing:

% grep -C3 'Elastic Block' /usr/lib/udev/rules.d/90-cloud-storage.rules 
## AWS EBS NVMe names
## https://github.com/coreos/bugs/issues/2399
# NVMe devices
KERNEL=="nvme[0-9]*n[0-9]*", ENV{DEVTYPE}=="disk", ATTRS{model}=="Amazon Elastic Block Store", ATTRS{serial}=="?*", SYMLINK+="disk/by-id/nvme-$attr{model}_$attr{serial}-ns-%n", OPTIONS+="string_escape=replace"
KERNEL=="nvme[0-9]*n[0-9]*", ENV{DEVTYPE}=="disk", ATTRS{model}=="Amazon Elastic Block Store", ATTRS{serial}=="?*", PROGRAM="cloud_aws_ebs_nvme_id -d /dev/%k", SYMLINK+="%c"
# NVMe partitions
KERNEL=="nvme[0-9]*n[0-9]*p[0-9]*", ENV{DEVTYPE}=="partition", ATTRS{model}=="Amazon Elastic Block Store", ATTRS{serial}=="?*", IMPORT{program}="cloud_aws_ebs_nvme_id -n /dev/%k"
KERNEL=="nvme[0-9]*n[0-9]*p[0-9]*", ENV{DEVTYPE}=="partition", ATTRS{model}=="Amazon Elastic Block Store", ATTRS{serial}=="?*", ENV{_NS_ID}=="?*", SYMLINK+="disk/by-id/nvme-$attr{model}_$attr{serial}-ns-$env{_NS_ID}-part%n", OPTIONS+="string_escape=replace"
KERNEL=="nvme[0-9]*n[0-9]*p[0-9]*", ENV{DEVTYPE}=="partition", ATTRS{model}=="Amazon Elastic Block Store", ATTRS{serial}=="?*", ENV{_NS_ID}=="?*", PROGRAM="cloud_aws_ebs_nvme_id -d /dev/%k", SYMLINK+="%c%n"

# TODO: Anyone else support friendly names?

Those rules do indeed work, but is the problem that they wind up working too late for Ignition to use when creating the filesystems?

% ls -l /dev/sdf
lrwxrwxrwx. 1 root root 7 Dec  3 18:46 /dev/sdf -> nvme1n1
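
One rough way to probe the timing question is to measure how long a symlink such as /dev/sdf takes to appear. The Python sketch below only illustrates that polling logic; it is not how Ignition waits (per this thread, Ignition waits on the systemd device unit), `wait_for_device` is a hypothetical helper, and in an initramfs without Python the same loop would have to be ported to whatever is available there.

```python
import os
import time
from typing import Optional


def wait_for_device(path: str, timeout: float = 30.0, interval: float = 0.1) -> Optional[float]:
    """Poll until path exists; return seconds waited, or None on timeout."""
    start = time.monotonic()
    while True:
        if os.path.exists(path):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    device = "/dev/sdf"  # example path taken from the comment above
    waited = wait_for_device(device, timeout=5.0)
    if waited is None:
        print("%s did not appear in time" % device)
    else:
        # realpath resolves the udev symlink to the kernel device node
        print("%s -> %s after %.1fs" % (device, os.path.realpath(device), waited))
```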

seh commented Dec 4, 2018

Please see #2531 for related trouble.

r7vme pushed a commit to giantswarm/giantnetes-terraform that referenced this issue Dec 20, 2018

Roma Sokolkov
Use persistent paths for disks
Requires CoreOS 1995.0.0.

We can now use predictable disk names with ignition
as coreos/bugs#2481 was fixed.

TODO: Enable docker disk wipe

r7vme pushed a commit to giantswarm/giantnetes-terraform that referenced this issue Dec 24, 2018

Roma Sokolkov
Use persistent paths for disks (#187)
* Use persistent paths for disks

Requires CoreOS 1995.0.0.

We can now use predictable disk names with ignition
as coreos/bugs#2481 was fixed.

TODO: Enable docker disk wipe

* Wipe docker disks for master VMs

* Use 1995.0.0 in CI

Author

r7vme commented Dec 28, 2018

For everyone who ends up here: we finally switched to 1995.0.0 (currently in the alpha channel), which has the fixes for both AWS and Azure.
