Systemd update to 252 #3290

Merged: 2 commits merged on Aug 3, 2023

@arnaldo2792 arnaldo2792 commented Jul 24, 2023

Description of changes:
This updates systemd to v252. I'm using the latest release of the series as of today, since we need to include this commit to prevent this problem. I only saw the issue manifest with the 5.10 kernel and with systemd <=252.4.

There are two commits in this PR. The first commit removes the numbers from the existing local patches: a few patches had to be reworked, and with the numbering it was hard to tell which patches had actually been reworked. The second commit includes the changes for the update. All the back-ported patches were deleted, since the new version already includes their changes.

v252 introduces a new compact journal format, which isn't backwards-compatible. This feature is disabled with an environment variable set in the systemd-journald service.
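
One way to confirm that the setting took effect is to look at the flags recorded in the journal file headers. The sketch below is illustrative and not part of this PR. It assumes that `journalctl --header` prints an `Incompatible flags:` line for each journal file, and the helper names `incompatible_flags` and `journal_is_legacy_format` are made up for the example.

```shell
# Reads `journalctl --header` output on stdin and prints every incompatible
# flag it finds, one per line, de-duplicated across journal files.
# ASSUMPTION: each header has a line like "Incompatible flags: FLAG FLAG ...".
incompatible_flags() {
    awk '
        tolower($0) ~ /^[ \t]*incompatible flags:/ {
            sub(/^[^:]*:/, "")            # drop the label, keep the flag list
            for (i = 1; i <= NF; i++) print $i
        }
    ' | sort -u
}

# Succeeds only if no header on stdin carries the COMPACT or KEYED-HASH flag.
# KEYED-HASH is checked as well because the review thread disables keyed
# hashes alongside the compact format.
journal_is_legacy_format() {
    ! incompatible_flags | grep -q -x -e COMPACT -e KEYED-HASH
}
```

On a running host this could be used as `journalctl --header | journal_is_legacy_format && echo ok`.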

Testing done:

  • Heavy workload test to verify that the kubelet doesn't cause OOM, as in "High memory utilization on 1.13.4 release" (#3057)
  • Test the journal can be read with journalctl in the admin container
  • Test the journal can be read after a migration to the new systemd version
  • Test the journal lists all the boots
  • Test the journal can be read after a rollback
  • Test instances boot with unified partitioning
  • 10,000 instance launches for these instance types:
    • m5.large
    • m6g.medium
    • t3.nano
    • t4g.medium
    • m5.16xlarge
    • m6a.12xlarge
    • m6g.12xlarge

Systemd 252/251 NEWS summary

This summary only includes what caught my attention and I thought was worth calling out. Refer to the full NEWS for greater detail.

Call out

  • Upstream announced that support for cgroups v1 will be dropped in future versions of systemd (>252). Thus, we can’t move to any newer version until all the existing variants that use cgroups v1 are removed.

Compatibility Breaks

  • For this, we don’t use ConditionKernelVersion
  • For this, we don’t have SELinux labels in service units
  • For this, we are moving from 250.11, which technically already included the change to disable the feature described in the link.

New features

  • Nothing particularly appealing. Most of the new tools are related to Unified Kernel Images or other services like systemd-homed, which we don't use.

Changes in systemd itself

  • Systemd added a new compile option to do a “full preset” on first boot. This option defaults to false in this release, but will be enabled by default in upcoming releases. It doesn’t affect us for this update, but we need to keep an eye on it in future updates.

Changes in systemd-networkd

  • There are new features in v252 and v251. I’d like folks working with systemd-networkd to test that their current work doesn’t break with the update, and to check whether they could benefit from any of the new features (cc @zmrow / @yeazelm)

Changes in other components

  • Journalctl has a new feature that isn’t backwards-compatible with the journalctl version in the admin container. It was disabled as suggested in the linked section.
  • We could benefit from this feature in systemd-repart while building the disk partitions(?). Something to consider.
  • Folks working with systemd-networkd and the network stack should keep an eye on this
  • There is a new flag to set the default compression for the journal. Nothing is set by default, but it could be nice to use one of the supported algorithms to save some space. We will have to test how existing tools that collect logs behave if the logs are compressed.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@stmcginnis stmcginnis left a comment

Seems fine from a basic operations perspective.

I built and launched an instance. Everything came up OK and it was able to join the cluster.

Ran the following successfully:

systemctl status kubelet
systemctl restart kubelet
journalctl
journalctl -u kubelet

Ran sonobuoy run:

...
21:38:27             e2e                                         global   complete   passed   Passed:366, Failed:  0
21:38:27    systemd-logs   ip-192-168-24-231.us-east-2.compute.internal   complete   passed

Ran:

apiclient apply <<EOF
[settings.kubernetes]
image-gc-high-threshold-percent = 85
image-gc-low-threshold-percent = 80
EOF

to trigger a kubelet config change. Then ran systemctl status kubelet again to make sure the service was restarted successfully.

Active: active (running) since Tue 2023-07-25 21:44:54 UTC; 5s ago

@bcressey bcressey left a comment

Can you diff the features and options output for this with systemd 250? The build log will dump out a config like this:

#20 14.08 systemd 250  
#20 14.08
#20 14.08     build mode                     : release  
#20 14.08     split /usr                     : False  
#20 14.08     split bin-sbin                 : True 
...

I'd also recommend going through meson_options.txt and looking for new options that we may want to set explicitly, usually to turn off.

Comment on lines 1 to 3
[Service]
Environment=SYSTEMD_JOURNAL_COMPACT=0

Suggested change
-[Service]
-Environment=SYSTEMD_JOURNAL_COMPACT=0
+[Service]
+Environment=SYSTEMD_JOURNAL_KEYED_HASH=0
+Environment=SYSTEMD_JOURNAL_COMPACT=0

To drop patch 9007.

FWIW the reason I implemented it as a patch was because I wanted the patch to fail if the SYSTEMD_JOURNAL_KEYED_HASH option was ever removed upstream. That would be more obvious to a developer than an environment variable silently being ignored, though either regression should be caught by testing.

I'm fine with either approach but we should be consistent.

@@ -388,6 +365,7 @@ install -p -m 0644 %{S:4} %{buildroot}%{_cross_factorydir}%{_cross_sysconfdir}/i
%exclude %{_cross_factorydir}%{_cross_sysconfdir}/pam.d
%exclude %{_cross_factorydir}%{_cross_sysconfdir}/pam.d/other
%exclude %{_cross_factorydir}%{_cross_sysconfdir}/pam.d/system-auth
%exclude %{_cross_factorydir}%{_cross_sysconfdir}/locale.conf

This might be worth keeping, assuming it just contains LANG=C.UTF-8.

@bcressey

From v252 release notes:

Changes in systemd-networkd: the RapidCommit= is (re-)introduced to enable faster configuration via DHCPv6 (RFC 3315).
@zmrow this could be of interest in the move to networkd, but we should dig into why it was withdrawn.

@arnaldo2792

I updated the PR description and included a summary of what I thought was worth calling out from the NEWS file.

@etungsten etungsten linked an issue Aug 1, 2023 that may be closed by this pull request
@arnaldo2792

@bcressey: this is the diff:

--- 250 2023-08-02 23:03:30.172333928 +0000
+++ 252 2023-08-02 23:03:20.900340189 +0000
@@ -1,4 +1,4 @@
-   systemd 250
+   systemd 252

        build mode                     : release
        split /usr                     : False
@@ -19,8 +19,10 @@
        D-Bus policy directory         : /x86_64-bottlerocket-linux-gnu/sys-root/usr/share/dbus-1/system.d
        D-Bus session directory        : /x86_64-bottlerocket-linux-gnu/sys-root/usr/share/dbus-1/services
        D-Bus system directory         : /x86_64-bottlerocket-linux-gnu/sys-root/usr/share/dbus-1/system-services
+       D-Bus interfaces directory     : no
        bash completions directory     : no
        zsh completions directory      : no
+       private shared lib version tag : 252
        extra start script             : /etc/rc.local
        debug shell                    : /bin/sh @ /dev/tty9
        system UIDs                    : <=999 (alloc >=201)
@@ -35,6 +37,7 @@
        nobody user name               : nobody
        nobody group name              : nobody
        fallback hostname              : localhost
+       default compression method     : none
        default DNSSEC mode            : no
        default DNS-over-TLS mode      : no
        default mDNS mode              : no
@@ -55,13 +58,15 @@
        default net.naming-scheme value: latest
        default KillUserProcesses value: True
        default locale                 : C.UTF-8
+       default nspawn locale          : C.UTF-8
+       default status unit format     : description
        default user $PATH             : (same as system services)
        systemd service watchdog       : 3min
-       time epoch                     : 1676581145 (2023-02-16T20:59:05+00:00)
+       time epoch                     : 1689671493 (2023-07-18T09:11:33+00:00)

      Features
-       enabled                        : ACL, SECCOMP, SELinux, blkid, libfdisk, efi, networkd, pstore, randomseed, repart, systemd-analyze, sysusers, tmpfiles, kmod, ldconfig, gshadow, link-udev-shared, link-systemctl-shared
-       disabled                       : AUDIT, AppArmor, IMA, PAM, SMACK, elfutils, gcrypt, gnutls, libbpf, libcryptsetup, libcryptsetup-plugins, libcurl, libfido2, libidn, libidn2, libiptc, microhttpd, openssl, p11kit, pcre2, pwquality, qrencode, tpm2, xkbcommon, zstd, lz4, xz, zlib, bzip2, backlight, binfmt, bpf-framework, coredump, environment.d, gnu-efi, firstboot, hibernate, homed, hostnamed, hwdb, importd, initrd, kernel-install, localed, logind, machined, nss-myhostname, nss-mymachines, nss-resolve, nss-systemd, oomd, portabled, quotacheck, resolve, rfkill, sysext, timedated, timesyncd, userdb, vconsole, xdg-autostart, idn, polkit, nscd, legacy-pkla, dbus, glib, tpm, man pages, html pages, man page indices, SysV compat, compat-mutable-uid-boundaries, utmp, adm group, wheel group, debug hashmap, debug mmap cache, debug siphash, valgrind, trace logging, install tests, link-networkd-shared, link-timesyncd-shared, link-boot-shared, fexecve, standalone-binaries, static-libsystemd, static-libudev, cryptolib, DNS-over-TLS
+       enabled                        : ACL, SECCOMP, SELinux, blkid, libfdisk, efi, networkd, pstore, randomseed, repart, systemd-analyze, sysusers, tmpfiles, kmod, ldconfig, gshadow, link-udev-shared, link-systemctl-shared, link-journalctl-shared
+       disabled                       : AUDIT, AppArmor, IMA, PAM, SMACK, elfutils, gcrypt, gnutls, libbpf, libcryptsetup, libcryptsetup-plugins, libcurl, libfido2, libidn, libidn2, libiptc, microhttpd, openssl, p11kit, pcre2, pwquality, qrencode, tpm2, xkbcommon, zstd, lz4, xz, zlib, bzip2, backlight, binfmt, bpf-framework, coredump, environment.d, gnu-efi, firstboot, hibernate, homed, hostnamed, hwdb, importd, initrd, kernel-install, localed, logind, machined, nss-myhostname, nss-mymachines, nss-resolve, nss-systemd, oomd, portabled, quotacheck, resolve, rfkill, sysext, sysupdate, timedated, timesyncd, userdb, vconsole, xdg-autostart, idn, polkit, nscd, legacy-pkla, dbus, glib, tpm, man pages, html pages, man page indices, SysV compat, compat-mutable-uid-boundaries, utmp, adm group, wheel group, debug hashmap, debug mmap cache, debug siphash, valgrind, trace logging, install tests, link-networkd-shared, link-timesyncd-shared, link-boot-shared, first-boot-full-preset, fexecve, standalone-binaries, coverage, static-libsystemd, static-libudev, cryptolib, DNS-over-TLS

      User defined options
        Cross files                    : ./cross-compilation.conf
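
The enabled/disabled lines above are long enough that additions are easy to miss by eye. A small helper along the following lines could list only the features that appear in the newer summary. This is a sketch under assumptions: `new_features` is a hypothetical name, and each build log is assumed to contain one `enabled :` and one `disabled :` summary line shaped like the output above.

```shell
# Usage: new_features LABEL OLD_LOG NEW_LOG
# LABEL is "enabled" or "disabled". Prints each feature listed under LABEL in
# NEW_LOG that is not listed under LABEL in OLD_LOG, one per line.
new_features() {
    awk -v label="$1" '
        # Match the summary line for the label; the padding space lets the
        # pattern also match when the label starts the line.
        (" " $0) ~ ("[ \t]" label "[ \t]*:") {
            line = $0
            sub(".*" label "[ \t]*:[ \t]*", "", line)   # keep only the list
            n = split(line, items, /[ \t]*,[ \t]*/)
            for (i = 1; i <= n; i++) {
                item = items[i]
                gsub(/^[ \t]+|[ \t\r]+$/, "", item)     # trim stray whitespace
                if (item == "") continue
                if (NR == FNR) seen[item] = 1           # first file: remember
                else if (!(item in seen)) print item    # second file: report
            }
        }
    ' "$2" "$3"
}
```

With the two summaries saved to separate files (the names `250.log` and `252.log` are placeholders), `new_features enabled 250.log 252.log` would be expected to report `link-journalctl-shared`, and the `disabled` variant `sysupdate`, `first-boot-full-preset`, and `coverage`.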

@arnaldo2792
arnaldo2792 commented Aug 3, 2023

I'm working on a change to explicitly disable a few features that are already disabled by default, so that we don't get any surprises.

Signed-off-by: Arnaldo Garcia Rincon <agarrcia@amazon.com>
Patches 9002, 9003, and 9009 were reworked for this update.

Systemd added a new feature for the journal that changes its format by
default to save space. This feature isn't backwards-compatible with
older versions of systemd, thus it was disabled through environment
variables as the documentation suggested.

Signed-off-by: Arnaldo Garcia Rincon <agarrcia@amazon.com>
@arnaldo2792

Forced push includes:

  • Drop patch 9007, and use an environment variable instead
  • Include locale.conf
  • Explicitly disable these features: sysupdate, log-message-verification, efi-tpm-pcr-compat

@etungsten etungsten left a comment

Thanks for testing at scale.

@bcressey bcressey left a comment

🚀

@arnaldo2792 arnaldo2792 merged commit eba7e4e into bottlerocket-os:develop Aug 3, 2023
42 checks passed
@arnaldo2792 arnaldo2792 deleted the systemd-update branch January 29, 2024 16:45
Successfully merging this pull request may close these issues.

Update to systemd 251/252/253
5 participants