Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronize haveged instances during switch root #74

Merged
merged 2 commits into from
Nov 28, 2022

Conversation

OttoHollmann
Copy link
Contributor

We found that with some system configurations zypper and lsof were complaining about haveged using deleted file (haveged binary itself) after each reboot of the system. On kernels =< 5.6 haveged is started early in the initramfs. Before switching to the permanent rootfs, haveged was supposed to do a chdir to /sysroot and re-executes the haveged binary from /sysroot. This is supposed to be done by the haveged-switch-root service.

The haveged-switch-service failed due to a communication problem. There are 2 haveged instances running on the system when the haveged-switch-service is started - one haveged daemon instance and one switch instances telling the daemon to switch its root directory.

The two daemons communicate via abstract namespace unix socket. Depending on the scheduling of the process done by the kernel, some times the daemon instance is trying to read data from the socket which is not yet sent and here the problem happens. The problem is almost systematic on KVM VMs with one cpu.

This PR using a named semaphore to prevent reading from a socket until the second process has completed writing.

Thanks to @bkahla for finding this issue providing patch.

@jirka-h
Copy link
Owner

jirka-h commented Nov 15, 2022

Thanks for the report and for the PR!

I have run the checks and linking fails with:

/usr/bin/ld: havegecmd.o: in function `new_root':
/home/runner/work/haveged/haveged/src/havegecmd.c:100: undefined reference to `sem_close'
collect2: error: ld returned 1 exit status

From man page for man sem_close:
Link with -pthread.

I think we need to update configure.ac like this:

$git diff configure.ac
diff --git a/configure.ac b/configure.ac
index 99451c9..ff1cbaa 100644
--- a/configure.ac
+++ b/configure.ac
@@ -10,7 +10,7 @@ AC_CONFIG_HEADER([config.h])
 AM_INIT_AUTOMAKE([subdir-objects no-dependencies])
 AC_CONFIG_SRCDIR([src/haveged.c])
 AC_CHECK_TYPES([uint32_t, uint8_t])
-HA_LDFLAGS=""
+HA_LDFLAGS="-pthread"
 
 ##libtool_start##
 LT_INIT
@@ -73,7 +73,6 @@ AC_ARG_ENABLE(threads,
    , enable_threads="no")
 if test "x$enable_threads" = "xyes"; then
    AC_DEFINE(NUMBER_CORES, 4, [Define maxium number of collection threads])
-   HA_LDFLAGS="-pthread"
 else
    AC_DEFINE(NUMBER_CORES, 1, [Define to single collection thread])
 fi

It works fine for me on a laptop. Could you please test it on your end as well and update the PR?

Thanks a lot!
Jirka

@OttoHollmann
Copy link
Contributor Author

You are right, I forgot to link pthread. Originally I added pthread only into our spec file.

@jirka-h jirka-h merged commit 1c5eb6b into jirka-h:master Nov 28, 2022
@jirka-h
Copy link
Owner

jirka-h commented Nov 28, 2022

Thanks a lot for the fix!

All tests are passing, I have merged the PR.

Jirka

@VA1DER
Copy link

VA1DER commented Oct 3, 2024

This causes a fatal error "haveged: Couldn't create nammed semaphore haveged_sem error: Permission denied" in Debian
Command line being attempted is: /usr/sbin/haveged --Foreground --verbose=1 -T 6

@frostschutz
Copy link

I had to work around this error by issuing mkdir -p /dev/shm before running haveged (in Arch mkinitcpio initramfs, which apparently does not provide /dev/shm on its own accord).

I don't switch-root haveged, instead its terminated at end of initramfs, and a new instance started by real init (if it has such a service). So personally I don't need this handover functionality at all, and would rather not require a semaphore (not to mention, terminate outright over not being able to create one - this could be a warning instead of a fatal error?).

But I guess it's a case of can't make everyone happy.

@jirka-h
Copy link
Owner

jirka-h commented Oct 13, 2024

Hi all,

Thanks for the discussion and the workaround. I don't use this feature. I have reviewed the PRs contributed by users and merged them.

Looking at

https://github.com/jirka-h/haveged/blob/master/src/haveged.c#L495

I can change this to a warning.

Thanks
Jirka

@jirka-h
Copy link
Owner

jirka-h commented Oct 13, 2024

Could you please review and test this change?

6b30426

@VA1DER
Copy link

VA1DER commented Oct 15, 2024

Could you please review and test this change?

This successfully prevents the lack of a semaphor from terminating haveged.

I had to work around this error by issuing mkdir -p /dev/shm before running haveged (in Arch mkinitcpio initramfs, which apparently does not provide /dev/shm on its own accord).

It turns out the problem is an over-eager app armor definition that doesn't allow permissions to write on /dev/shm/**:

--- /etc/apparmor.d/usr.sbin.haveged.orig       2021-01-13 19:56:44.000000000 -0400
+++ /etc/apparmor.d/usr.sbin.haveged    2024-10-15 13:47:14.990523890 -0300
@@ -12,10 +12,11 @@

   @{PROC}/sys/kernel/osrelease r,
   @{PROC}/sys/kernel/random/poolsize r,
   @{PROC}/sys/kernel/random/write_wakeup_threshold w,
   /dev/random w,
+  /dev/shm/** rw,

   /sys/devices/system/cpu/ r,
   /sys/devices/system/cpu/cpu*/cache/ r,
   /sys/devices/system/cpu/cpu*/cache/index*/{type,size,level} r,
   /usr/sbin/haveged mr,

This will need to be changed in Debian's package. There isn't an active maintainer for this package in Debian any more, though. Both existing ones retired. I may volunteer to take this on myself, but I do need to make sure I understand its workings well first.

@jirka-h
Copy link
Owner

jirka-h commented Oct 15, 2024

Thanks for following up on this one!

I'm inclined to keep the changes in the code and only warn when semaphore creation fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants