Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify kernel as customization for Edge image types #1175

Merged
merged 6 commits into from Feb 16, 2021

Conversation

achilleas-k
Copy link
Member

@achilleas-k achilleas-k commented Jan 22, 2021

This pull request includes:

  • adequate testing for the new functionality or fixed issue
  • adequate documentation informing people about the change

Background

"When creating ostree commits, only one kernel package can be installed at a time, otherwise creating the commit will fail in rpm-ostree" - COMPOSER-691.


This PR removes the standard kernel from the list of packages for all image types, so the default package list does not specify a kernel. Instead the user (through the blueprint) can define a kernel by name and only the specified kernel will be included.

The name of the kernel to be included is specified in the Kernel customizations part of the Blueprint. If none is specified, it falls
back to the standard 'kernel' package.

Caveats

  1. Adding multiple kernels through package selection, or by specifying a kernel in the package list and a different one in the customization, will still break ostree.
  2. No restriction or validation of the kernel.name option coming from the blueprint (more on this below).

Tests

Includes simple unit tests for the blueprint GetPackages() method.

I also added a common distro test (distro_test_common.go) to ensure that ImageType implementations don't include kernels in their package list and especially that multiple kernels aren't added by default to OSTree type builds.

Future

We discussed some ideas that need more work in more places. I've saved these for followup PRs to keep changes smaller for now.

We could validate the blueprint kernel selection against a list of known kernel names. In the short term, this could be part of the image type specification (the same way they specify base and build packages). Longer term, it could be done automatically by using the dnf-json script to query for whatprovides a kernel, which would also support kernels from custom repositories.

The kernel name customization can be used in the pipeline to configure the final image. This should be useful for #1188.

ACKNOWLEDGEMENTS We would like, first of all, to thank the anonymous reviewers for their valuable input. Their feedback was fundamental in improving the quality of the manuscript.

@achilleas-k achilleas-k requested review from teg and gicmo February 1, 2021 16:47
@achilleas-k achilleas-k marked this pull request as ready for review February 1, 2021 16:47
@achilleas-k achilleas-k changed the title (POC) Specify kernel as customization for Edge image types Specify kernel as customization for Edge image types Feb 1, 2021
Copy link
Member

@teg teg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really, really nice! I like this approach a lot, much better than what I originally envisioned. I had a small nitpick and a suggestion on one of the patches, otherwise this looks ready to me.

internal/blueprint/blueprint.go Outdated Show resolved Hide resolved
internal/blueprint/blueprint.go Show resolved Hide resolved
@achilleas-k achilleas-k force-pushed the alt-kernel branch 2 times, most recently from f79a033 to ac5a129 Compare February 1, 2021 18:53
teg
teg previously approved these changes Feb 1, 2021
ondrejbudai
ondrejbudai previously approved these changes Feb 2, 2021
Copy link
Member

@ondrejbudai ondrejbudai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing, thanks!

@teg
Copy link
Member

teg commented Feb 2, 2021

The failing test case is unrelated to this PR, and is caused by the known flakiness of PSI OpenShift.
I do have one last request for this: could we add another integration test to verify that this feature works end-to-end? I.e., similar extend the test we currently have to also build a blueprint with the rt kernel, and verify that we end up with the right kernel running?

@achilleas-k
Copy link
Member Author

That's a good idea. Will get on that ASAP.

@ondrejbudai ondrejbudai marked this pull request as draft February 3, 2021 11:36
@ondrejbudai
Copy link
Member

Marking this as a draft until an integration test is implemented.

@henrywang
Copy link
Member

henrywang commented Feb 3, 2021

@achilleas-k, I'm running test on this feature on RHEL 8.4 (latest nightly). I found an issue. sshd service can't get started after rt kernel gets installed. The reason is due to private keys permisson in /etc/ssh.

And chmod 600 /etc/ssh/ssh_host_rsa_key /etc/ssh/ssh_host_ecdsa_key /etc/ssh/ssh_host_ed25519_key fixed this issue. Do we need some update on private key permisson when rt-kernel installed?

Can't find the same issue when general kernel gets installed.

journalctl log:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0640 for '/etc/ssh/ssh_host_ed25519_key' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Unable to load host key "/etc/ssh/ssh_host_ed25519_key": bad permissions
Unable to load host key: /etc/ssh/ssh_host_ed25519_key
sshd: no hostkeys available -- exiting.

@henrywang
Copy link
Member

@teg and @achilleas-k, I'm updating check_ostree.yaml and ostree.sh to add test for this feature. But I encountered repo issue:

  1. For rhel, we have internal repo https://download.devel.redhat.com/rhel-8/nightly/RHEL-8/latest-RHEL-8.4.0/compose/RT/x86_64/os/, but no rt-kernel can be found from https://rpmrepo.osbuild.org/v1/psi. Do I add this internal repo into rhel-8*.json or rebuild rt-kernel into rpmrepo.osbuild.org?
  2. For fedora, the only way to install rt-kernel is from planetccrma repo http://ccrma.stanford.edu/planetccrma/mirror/fedora/linux/planetcore/33/x86_64/. Do you have any security concerns?

Thanks.

@gicmo
Copy link
Contributor

gicmo commented Feb 3, 2021

@henrywang you trying to log in as root by any chance? And who owns the files in /etc/ssh?

@henrywang
Copy link
Member

henrywang commented Feb 4, 2021

@gicmo and @achilleas-k. Thanks for following this issue. 👍
I changed private key permission already, so the permission is not 640 any more.

[admin@vm ~]$ ll /etc/ssh
total 600
-rw-r--r--. 1 root root    577388 Feb  3 15:18 moduli
-rw-r--r--. 1 root root      1770 Feb  3 15:18 ssh_config
drwxr-xr-x. 2 root root        28 Feb  3 15:18 ssh_config.d
-rw-------. 1 root polkitd    492 Feb  3 15:11 ssh_host_ecdsa_key
-rw-r--r--. 1 root root       162 Feb  3 15:11 ssh_host_ecdsa_key.pub
-rw-------. 1 root polkitd    387 Feb  3 15:11 ssh_host_ed25519_key
-rw-r--r--. 1 root root        82 Feb  3 15:11 ssh_host_ed25519_key.pub
-rw-------. 1 root polkitd   2578 Feb  3 15:11 ssh_host_rsa_key
-rw-r--r--. 1 root root       554 Feb  3 15:11 ssh_host_rsa_key.pub
-rw-------. 1 root root      4269 Feb  3 15:18 sshd_config

@achilleas-k
Copy link
Member Author

Thanks, that helps. Still looking into this.

@teg
Copy link
Member

teg commented Feb 4, 2021

@teg and @achilleas-k, I'm updating check_ostree.yaml and ostree.sh to add test for this feature. But I encountered repo issue:

1. For rhel, we have internal repo `https://download.devel.redhat.com/rhel-8/nightly/RHEL-8/latest-RHEL-8.4.0/compose/RT/x86_64/os/`, but no rt-kernel can be found from `https://rpmrepo.osbuild.org/v1/psi`. Do I add this internal repo into `rhel-8*.json` or rebuild rt-kernel into `rpmrepo.osbuild.org`?

Ah, fun. I did not realise this was in a different repo. @dvdhrm could you snapshot that too?

2. For fedora, the only way to install rt-kernel is from planetccrma repo `http://ccrma.stanford.edu/planetccrma/mirror/fedora/linux/planetcore/33/x86_64/`. Do you have any security concerns?

No security concerns. But we should probably also include this in the rpmsnapshots.

@achilleas-k
Copy link
Member Author

Hi @henrywang. I haven't managed to reproduce this. It's possible it's something very specific to the build you tested. Any information you can share would help understand what exactly happened.
Can you share the blueprint, manifest, or the commit tar?

@achilleas-k
Copy link
Member Author

...

Ah, fun. I did not realise this was in a different repo. @dvdhrm could you snapshot that too?

...

No security concerns. But we should probably also include this in the rpmsnapshots.

Should I add these repos to our test cases or should I wait until they're added to our snapshots before pushing test cases?

Alternatively, we can add test cases with kernel-debug as an alternative to the main kernel to test the feature. It would work as a test for knowing that ostree commits with alt kernels can be built, but it wont be the exact use case we made this change for (kernel-rt).

@henrywang
Copy link
Member

@achilleas-k, Edge team has requirement to include kernel-rt into Edge image.

@achilleas-k
Copy link
Member Author

I've narrowed down the issue to a change in group IDs after applying the kernel-rt commit to an existing OS. The realtime kernel creates a new system group called realtime, which creates a different set (and order) of groups.

If we generate two commits, one with mainline and one with kernel-rt, use the first one during OS install and then upgrade using the second one, the system GIDs change, likely due to the addition of the realitme group. In all the cases I tested this (both @henrywang's scripts/blueprints and now my own), the groups ssh_keys and polkitd swap GIDs, which changes the group ownership of the host keys and creates the situation seen in original issue.

I originally thought that the issue occurs when creating a commit through OSBuild with kernel-rt selected, and using it to install a new OS, which is why I was failing to reproduce it. This use case doesn't present the issue since all the groups are created on install or first boot and no GID changing occurs.

I wanted to add this update to the comments before stopping for the day and will continue tomorrow. I'm still not sure exactly what is wrong here or how to work around it.

@gicmo
Copy link
Contributor

gicmo commented Feb 10, 2021

I had a quick moment to think about that, and I think the missing piece is that osbuild does not do the equivalent of preserve-passwd and check-passwd from rpm-ostree (see its treefile doc). Need to have a look at the rpm-ostree code to see what needs to be done to replicate the behaviour.

@teg
Copy link
Member

teg commented Feb 10, 2021

Oh my, thanks for figuring this out! If we manage, I think it would be great to try to test and land this PR independently of this bug, and fix it separately (would still be great to have that fixed of course).

@gicmo
Copy link
Contributor

gicmo commented Feb 10, 2021

Oh my, thanks for figuring this out! If we manage, I think it would be great to try to test and land this PR independently of this bug, and fix it separately (would still be great to have that fixed of course).

I agree, an upgrading adding a package that adds new groups can happen independently of the kernel-rt support. It is a bug and needs to be addressed. For now we should test upgrades where the groups don't change.

@gicmo
Copy link
Contributor

gicmo commented Feb 11, 2021

I had a quick moment to think about that, and I think the missing piece is that osbuild does not do the equivalent of preserve-passwd and check-passwd from rpm-ostree (see its treefile doc). Need to have a look at the rpm-ostree code to see what needs to be done to replicate the behaviour.

Ok, I think I found the place in rpm-ostree: /src/app/rpmostree-compose-builtin-tree.cxx#L440

Group is created in rt-setup.spec

Polkit and ssh-keys groups are added without specifying any gids.

@achilleas-k
Copy link
Member Author

Added two test cases to the format-request-map for testing the feature. Still need to generate the test cases when the kernel-rt package is added to our test mirrors.

@henrywang
Copy link
Member

@achilleas-k, looks rt-kernel still not work, so I run test with debug kernel as alt kernel (install with default kernel and upgrade with debug kernel). Everything works as what I expected.

@achilleas-k
Copy link
Member Author

Until we make changes to the way ostree commits are built and how GIDs are selected (issue #1222), the rt kernel will have this issue when upgrading from a system without it. It seems it's a more general issue however, which could occur with any upgrade that changes the number and order of system groups.

There should be no issues building a commit with the kernel-rt directly and using it on a new install without upgrading.

Blueprints can now be used to specify a kernel as part of the kernel
customizations.  Specifying a kernel adds it to the package list.

If no known kernel is specified (neither in the customizations nor the
package list), the default "kernel" is included automatically.

If kernels are specified in both the package list and the
customizations, both are added (even if they're duplicates).
Edit GetPackages test to expect automatic kernel inclusion.

New test for all combinations of adding a kernel to customizations and
package list.
@achilleas-k
Copy link
Member Author

achilleas-k commented Feb 15, 2021

Test cases added and generated. Thanks @dvdhrm for mirroring the kernel repo.

Marking PR as ready now.

@achilleas-k achilleas-k marked this pull request as ready for review February 15, 2021 15:58
@henrywang
Copy link
Member

Run rt kernel image test, everything works. No rpm-ostree upgrade test here.

  1. Build rt kernel image
  2. Install image
  3. Some checkings, like mount point and permission
  4. greenboot checking
  5. greenboot rollback checking

The kernel now comes from the blueprint packages even when it's not
specified.  Removing from the base packages of the image types avoids
duplication and allows for alternative kernels to be specified without
also including the default.

The latter is necessary for RHEL for Edge and Fedora IoT images (ostree
commits) that fail to build when multiple kernels are installed.

ImageType tests modified to fix expected package order.
Test that all defined image types return at least one kernel when given
an empty blueprint and exactly one kernel for ostree-commit types.
Two new test cases added to format-request-map and test cases are
generated.

1. kernel-rt for RHEL images:
Requires new package repositories for RHEL 8.3 and 8.4.
Creates an OSTree commit with the `kernel-rt` as a customization.

2. kernel-debug for Fedora images:
kernel-rt isn't included in the official fedora repositories.  Using
kernel-debug at least tests the feature with the fedora-iot-commit type.
Copy link
Member

@teg teg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 🥳

@teg teg merged commit d686abf into osbuild:main Feb 16, 2021
@achilleas-k achilleas-k deleted the alt-kernel branch February 16, 2021 13:52
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Feb 23, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Feb 23, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Feb 23, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Feb 25, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Feb 25, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Feb 25, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Mar 1, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
achilleas-k added a commit to achilleas-k/osbuild-composer that referenced this pull request Mar 4, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
msehnout pushed a commit that referenced this pull request Mar 8, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR #1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.
ondrejbudai pushed a commit to ondrejbudai/osbuild-composer that referenced this pull request Mar 16, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR osbuild#1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.

(cherry picked from commit 83d87a9)
teg pushed a commit that referenced this pull request Mar 17, 2021
A few recent changes in image type definitions haven't been reflected in
the test cases yet.  This also acts as a check to make sure that the
changes in composer don't affect the old behaviour.

Causes for (some) changes:
- Kernel modules added to package lists:
  Prior to PR #1175 image types defined the kernel package in their
  package list.  Some only included `kernel-core` and not the `kernel`
  metapackage.  Now images default to having the `kernel` metapackage
  included which also adds `kernel-modules` and `alsa-sof-firmware`.
- New package source for rt kernel.

(cherry picked from commit 83d87a9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants