Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create users at install time for image and edge installers #2375

Merged
merged 17 commits into from Mar 28, 2022

Conversation

achilleas-k
Copy link
Member

Rewrite of #1700 for RHEL 8.6 and RHEL 9.0 (and CentOS 8 and CentOS 9).

Instead of "baking in" users to the payload, create them at install time. For the edge-installer, this has the benefit that we no longer need to work around the ssh key creation on first boot.

For the image-installer, this is functionally equivalent.

The edge-installer previously did not allow any customizations at build time. Now users and groups are allowed.

Added test cases to generate manifests for the image- and edge-installer with users.

@teg
Copy link
Member

teg commented Mar 1, 2022

This is really great! Thanks for picking it up :)

@achilleas-k achilleas-k force-pushed the installer-user-creation branch 3 times, most recently from 1e6b0a1 to c9b4392 Compare March 2, 2022 12:17
Copy link
Member

@ondrejbudai ondrejbudai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs an exception I think.

@ondrejbudai ondrejbudai added the RHEL exception Change proposed for RHEL release as an exception label Mar 2, 2022
@henrywang
Copy link
Member

@achilleas-k Is this PR ready for testing? May I add edge-installer test in this PR? Thanks!

@achilleas-k
Copy link
Member Author

achilleas-k commented Mar 2, 2022

@achilleas-k Is this PR ready for testing? May I add edge-installer test in this PR? Thanks!

Yes, please do. I was trying to figure out where the failure is coming from, but maybe we should test the feature first and see how that goes.

Thanks

@achilleas-k
Copy link
Member Author

Pushed a rebase on main

@achilleas-k achilleas-k added the manifest changes Pull requests that affect the test manifests and image info label Mar 2, 2022
@henrywang
Copy link
Member

henrywang commented Mar 3, 2022

@achilleas-k @gicmo I added edge-installer test, it passed. But I found all installer tests failed. After installation, vm entered into emergency shell. Could you please take a look if you have time? Thanks.

[    1.774983] localhost dracut-mount[490]: Warning: Can't mount root filesystem
[    1.793486] localhost systemd[1]: Starting Dracut Emergency Shell...

Full boot log can be found from https://paste.centos.org/view/3e9f6764

+ cat /proc/self/mountinfo
1 1 0:2 / / rw shared:1 - rootfs rootfs rw,size=1453724k,nr_inodes=363431,inode64
22 1 0:21 / /proc rw,nosuid,nodev,noexec,relatime shared:2 - proc proc rw
23 1 0:22 / /sys rw,nosuid,nodev,noexec,relatime shared:3 - sysfs sysfs rw
24 1 0:5 / /dev rw,nosuid shared:8 - devtmpfs devtmpfs rw,size=1453744k,nr_inodes=363436,mode=755,inode64
25 23 0:6 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:4 - securityfs securityfs rw
26 24 0:23 / /dev/shm rw,nosuid,nodev shared:9 - tmpfs tmpfs rw,inode64
27 24 0:24 / /dev/pts rw,nosuid,noexec,relatime shared:10 - devpts devpts rw,gid=5,mode=620,ptmxmode=000
28 1 0:25 / /run rw,nosuid,nodev shared:11 - tmpfs tmpfs rw,size=595368k,nr_inodes=819200,mode=755,inode64
29 23 0:26 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime shared:5 - cgroup2 cgroup2 rw,nsdelegate,memory_recursiveprot
30 23 0:27 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:6 - pstore pstore rw
31 23 0:28 / /sys/fs/bpf rw,nosuid,nodev,noexec,relatime shared:7 - bpf none rw,mode=700
32 1 0:29 / /var/lib/nfs/rpc_pipefs rw,relatime shared:12 - rpc_pipefs rpc_pipefs rw
+ cat /proc/mounts
rootfs / rootfs rw,size=1453724k,nr_inodes=363431,inode64 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=1453744k,nr_inodes=363436,mode=755,inode64 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,size=595368k,nr_inodes=819200,mode=755,inode64 0 0
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
none /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0

@jrusz
Copy link
Contributor

jrusz commented Mar 8, 2022

I just rebased on top of main

@achilleas-k achilleas-k force-pushed the installer-user-creation branch 2 times, most recently from 1e0eec8 to 5e254af Compare March 18, 2022 11:43
@achilleas-k
Copy link
Member Author

Exceptions granted for 8.6 and 9.0.

Rebased on main.

@achilleas-k
Copy link
Member Author

@henrywang To confirm: We have issues with libvirt on CS9 because of the mixed versions in the compose, right?
Maybe we can hold off merging this until that's fixed since the tests are important.

Are we also hitting the bootloader bug?

@henrywang
Copy link
Member

@henrywang To confirm: We have issues with libvirt on CS9 because of the mixed versions in the compose, right? Maybe we can hold off merging this until that's fixed since the tests are important.

Are we also hitting the bootloader bug?

I filed a bug https://bugzilla.redhat.com/show_bug.cgi?id=2065708 to track this issue. This issue should block ostree-rebase.sh on CS9. But it's only on latest CS9 repo, I mean CentOS-Stream-9-20220315.0/ repo.
If our own mock repo is built by CentOS-Stream-9-20220310.0/ and before, test should pass.

@achilleas-k achilleas-k force-pushed the installer-user-creation branch 3 times, most recently from 2d180af to ed01242 Compare March 24, 2022 09:26
@achilleas-k
Copy link
Member Author

All the edge-installer tests are passing here. The only failure is the 8.6 installer test.

@achilleas-k
Copy link
Member Author

achilleas-k commented Mar 25, 2022

The latest commit works around the installer issue by adding restorecon to the end of the kickstart file. I think the issue we're facing might be related to the modifications we make to the kickstart file. When I run the installer interactively locally, the issue doesn't occur. I'm going to investigate further, but I think we can keep the workaround to avoid missing the exception deadline.

@achilleas-k achilleas-k force-pushed the installer-user-creation branch 2 times, most recently from ad9126f to 0a63c84 Compare March 26, 2022 13:51
@achilleas-k
Copy link
Member Author

The latest commit works around the installer issue by adding restorecon to the end of the kickstart file. I think the issue we're facing might be related to the modifications we make to the kickstart file. When I run the installer interactively locally, the issue doesn't occur. I'm going to investigate further, but I think we can keep the workaround to avoid missing the exception deadline.

Reverted this and added a proper, permanent workaround. See commit for details.

achilleas-k and others added 12 commits March 26, 2022 15:08
Use single NewUsersStageOptions() from osbuild2 instead of implementing
in each distro.
Use single NewGroupsStageOptions() from osbuild2 instead of implementing
in each distro.

The new function does not set the Group.Name field anymore.  The field
does not exist in the osbuild schema and was silently ignored.
The field in the stage has been marked 'omitempty' and the relevant
manifests have been updated.
Use single NewKickstartStageOptions() and replace image-type-specific
implementations from each distro.
Use single NewAnacondaStageOptions() from osbuild2 instead of
implementing in each distro.

The new function conditionally adds the user module when there are users
that need to be created at install time (image- and edge-installers).
Add two new test cases:
- image-installer-with-users
- edge-installer-with-users
New test cases for edge- and image-installer with users.
Enable the user module unconditionally for the image-installer:
- If users are specified for the kickstart file, the module is required
  to set up the users.
- If no users are specified, the module can be used at install time to
  create users.

Updated relevant test cases (manifests).
CI will not run RHEL 8.4 test any more, remove it.
If a home directory has a trailing slash, the `useradd` command fails to
set the correct selinux contexts for the home directory on creation.
This can lead to various issues, but the one that we came across was
that the ~/.ssh directory and authorized_keys file cannot be read by
sshd and we couldn't log in to the system.

This only manifests if the user is created through the kickstart file
because:
1. `useradd` does not set the selinux contexts when creating the
   directory
2. Anaconda runs `restorecon` on the home directory and authorized_keys
   file when it creates them, but uses the install-time mount path
   `/mnt/sysroot/...` for which selinux does not have contexts.

In most cases we get around this bug because we run `setfiles` on the
tree at the end of our pipelines.
For the ostree case, the relabeling in Anaconda is done correctly.
The nightly compose (currently) only has a 'latest' directory for 8.7.
Switching to the development composes which have 'latest' for 8.4
onwards.

Enable -x in script for easier troubleshooting.
Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff!

I added a few comments. None of them are critical, but I would like to wait for their resolution before approving.

I love the de-duplication of various functions and their move to the osbuild2 package. I would suggest to de-duplicate these functions also in older distro definitions. Although some of them would slightly change manifests, those changes have no effect on the functionality and backward compatibility of these images. This should be completely OK as an exception to our policy. The policy should IMO not prevent us from doing "the right thing" and cleaning up old code. If we do not want to maintain these definitions for the next 10 years as they are currently implemented, but want to eventually replace them with their equivalents using new code, then we should be cleaning them up and de-duplicating the code in them along the way.

In addition, I do not think that we should be blocking the merge of this PR due to waiting on RHEL exception. The reason is that we can selectively backport specific commits to RHEL or cherry-pick them on some other branch in upstream for that release. Blocking development changes on main for this reason IMO does not make sense.

internal/distro/rhel90/pipelines.go Show resolved Hide resolved
internal/osbuild2/groups_stage.go Show resolved Hide resolved
internal/osbuild2/kickstart_stage.go Show resolved Hide resolved
test/cases/ostree-ng.sh Show resolved Hide resolved
test/cases/ostree-ng.sh Show resolved Hide resolved
test/cases/ostree-ng.sh Show resolved Hide resolved
@achilleas-k
Copy link
Member Author

With respect to deduplicating calls and changing manifests in other distros, since this is specifically about an exception we want to get into 8.6 and 9.0, I wanted to minimise any side-effects to other image types and distributions. The priority here is getting the user-creation to be a bit more sane to fix the relevant bug and to minimise potential issues during the lifetime of the next version.

@thozza
Copy link
Member

thozza commented Mar 28, 2022

With respect to deduplicating calls and changing manifests in other distros, since this is specifically about an exception we want to get into 8.6 and 9.0, I wanted to minimise any side-effects to other image types and distributions. The priority here is getting the user-creation to be a bit more sane to fix the relevant bug and to minimise potential issues during the lifetime of the next version.

I understand why you didn't do it.

However I'm not sure about the implied consequences of your comment, since the deduplication could have been done in a separate commit, which would not be backported to 8.6 and 9.0.

Does this mean that you still plan to deduplicate the code in older distros eventually in a separate PR? Or not at all?

Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that there is a time pressure to get a working version of these changes in, so that they can be backported to 8.6 and 9.0.

As I mentioned, none of my comments are critical. So I'm approving this PR with the expectation that the raised points will be considered for a follow-up PR.

@ondrejbudai ondrejbudai dismissed their stale review March 28, 2022 12:08

exception granted

@ondrejbudai ondrejbudai merged commit 5261987 into osbuild:main Mar 28, 2022
@achilleas-k achilleas-k deleted the installer-user-creation branch March 28, 2022 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
manifest changes Pull requests that affect the test manifests and image info RHEL exception Change proposed for RHEL release as an exception
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants