Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: fixes hanging firstboot with kernel 6.1y #71

Merged
merged 2 commits into from
Mar 29, 2023

Conversation

KwadFan
Copy link
Contributor

@KwadFan KwadFan commented Mar 24, 2023

This fixes issue with preinstalled kernel 6.1y,
which leads to an hanging first boot in function fix_partuuid on Raspberry Pi Zero 2 W and Pi 3 series.

Instead of waiting for /dev/hwrng to produce enough symbols, force it to write to /dev/urandom and read from there.

This should fix #70

@MichaIng
Copy link
Contributor

Is there no rng-tools5 installed, so that one can get sufficient true random characters from /dev/random?

@KwadFan
Copy link
Contributor Author

KwadFan commented Mar 24, 2023

I have to check that, but what we basicly do is the same procedure as octopi and addings stuff due guysoft/custompios into the existing Raspberry Pi OS Image in a chroot env. So, if it is missing I think it should already a part of it.

@KwadFan
Copy link
Contributor Author

KwadFan commented Mar 24, 2023

I checked it and neither in our image nor raspi os lite.

@MichaIng
Copy link
Contributor

MichaIng commented Mar 24, 2023

Because what rng-tools5 does is exactly what you are doing manually here, just into /dev/random and on a continuous basis. It should assure that the true entropy pool, accessible via /dev/random, is always sufficiently filled. Using this does not start to create pseudo-random characters when the entropy pool is empty (although not really relevant here).

I'm just not sure whether the daemon runs already at this early boot stage. It has default dependencies, but no additional ordering:

root@micha:~# systemctl cat rngd
# /lib/systemd/system/rngd.service
[Unit]
Description=Start entropy gathering daemon (rngd)
Documentation=man:rngd(8)

[Service]
ExecStart=/usr/sbin/rngd -f

[Install]
WantedBy=multi-user.target

Reading from the hardware random generator device /dev/hwrng directly seems to be wrong, as, AFAIK it is not a pool like /dev/random, hence naturally empty quickly.

@KwadFan
Copy link
Contributor Author

KwadFan commented Mar 25, 2023

I really understand what you want to improve here and I appriciate that, but I simply modified to get it working what exists already. If that is unwanted, then Raspberry Pi OS has to overthink the whole concept at some point.

We dont build the base Image on our self, we use "of the shelf " what is available via download from the Pi Imager.
So, in the end it will hit the same wall on releases with 6.1y kernel. And your mentioned service will not run on firstboot because in cmdline.txt it will be overwritten due init= replacement of the real systemd init.

Dont want argue to strong here, yes possible solution for a running OS, but not on firstboot.

@MichaIng
Copy link
Contributor

And your mentioned service will not run on firstboot because in cmdline.txt it will be overwritten due init= replacement of the real systemd init.

Ah okay, then indeed it is not running and using /dev/urandom seems the right solution. If reading that 256 bytes from /dev/hwrng directly is assured to work, it seems to be the best that can be done. Otherwise I think for the PARTUUID pseudo-randomness is fine and hence what /dev/urandom gives even with an empty entropy pool should be sufficient.

@KwadFan
Copy link
Contributor Author

KwadFan commented Mar 25, 2023

Thats exactly what I thought also, a partuuid isnt that crutial as a hash for ssh. So, yes urandom should be sufficiant enough. I used 256 bytes only to "grant" some what of real randomness.

KwadFan added a commit to mainsail-crew/MainsailOS that referenced this pull request Mar 25, 2023
To avoid conflicts with non rpi images, moved to seperate modules

Can be removed if solved due merge of RPi-Distro/raspberrypi-sys-mods#71

Signed-off-by: Stephan Wendel <me@stephanwe.de>
@@ -63,7 +63,11 @@ get_variables () {
fix_partuuid() {
mount -o remount,rw "$ROOT_PART_DEV"
mount -o remount,rw "$BOOT_PART_DEV"
DISKID="$(tr -dc 'a-f0-9' < /dev/hwrng | dd bs=1 count=8 2>/dev/null)"
MAJOR="$(uname -r | cut -f1 -d.)"
if [[ "$MAJOR" -eq "6" ]] && [[ -c /dev/hwrng ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the major version matter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably because the slowdown appeared (and is now fixed) in 6.1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is saying if running 6.0 and newer, inject potentially slow hwrng into urandom. If the idea was to avoid the slowdown, then shouldn't that be negated? In either case, especially with the kernel fix, I can't think of a reason to have different behaviour in these two cases. So I'm tempted to just skip hwrng and only use urandom for the ID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My Idea behind that was to keep the original as much as possible, but I also dont wnated to get completly rid of some hardware generated symbols/chars. I never tested it with time measuring in mind. I only want to get rid of that 'hanging' behavior.
Therefor I used the major version because 5.15y wasnt affected by that.

@pelwell
Copy link
Member

pelwell commented Mar 27, 2023

FYI the kernel change that caused this catastrophic slowdown has already been reverted and is due to be replaced with a proper patch. This is likely to be a one-off problem, so I question the need for the change (and indeed the possibility of creating a system that produces cryptographically good random numbers quicker than the HWRNG block, or even at an acceptably slower rate).

@XECDesign
Copy link
Member

In this particular case I don't think the quality of the random numbers matters. AIUI, uranodom is a lot faster and will have more than enough entropy to avoid generating sd card with the same ID. So I'm thinking we should just use urandom directly without injecting anything extra from hwrng.

That seems to save 2 seconds on a pi 4.

Or would you still advise we keep it as is?

@MichaIng
Copy link
Contributor

MichaIng commented Mar 27, 2023

In this particular case I don't think the quality of the random numbers matters.

I think we all agree with this and the feeding from hwrng is only done to preserve the "true" randomness at least partly, in case it was done for some reason before.

to avoid generating sd card with the same ID

AFAIK, the chance that two UUIDs will be identical is exactly the same with hwrng, random and urandom. The only difference is that the latter uses sources which are known and hence the result is theoretically predictable and hence not suitable for cryptographic keys and such.

So yes, just replace /dev/hwrng with /dev/urandom and skip the feeding should be perfectly fine.

@pelwell
Copy link
Member

pelwell commented Mar 27, 2023

Isn't much of the time wasted in pre-reading data which is never used due to the way the pipeline is constructed? What's wrong with:

DISKID="$(dd if=/dev/hwrng bs=4 count=1 | od -An -tx4 | cut -c2-9)"

Feel free to substitute /dev/urandom if that is still preferred.

@MichaIng
Copy link
Contributor

MichaIng commented Mar 27, 2023

dd if=/dev/hwrng bs=4 count=1 produces only 4 characters, with high likely not a single usable hexadecimal number among them 🙂. The dd bs=1 count=8 at the end of the pipe terminates reading as fast as it has the needed 8 valid characters (as long as there is no large buffer created when piping?), so that should be fine.

@pelwell
Copy link
Member

pelwell commented Mar 27, 2023

The od converts those 4 bytes into 8 guaranteed hexadecimal characters. In the previous version, the dd at the end only receives some data after man, many bytes have been read from the hwrng. Real numbers imminent.

@XECDesign
Copy link
Member

Isn't much of the time wasted in pre-reading data which is never used due to the way the pipeline is constructed? What's wrong with:

DISKID="$(dd if=/dev/hwrng bs=4 count=1 | od -An -tx4 | cut -c2-9)"

Feel free to substitute /dev/urandom if that is still preferred.

Yeah, your version is much better. In this case there's no difference between the speed of urandom and hwrng, so if hwrng is a higher entropy source, then I see no reason to switch to urandom.

I didn't like it either throwing data away either. I blame whoever I stole it from on stackoverflow.

@MichaIng
Copy link
Contributor

MichaIng commented Mar 27, 2023

Oh wow, I should have tested the whole pipe. Nice tool od, never really worked with it. It is not suitable for cryptography, even when reading from /dev/random or /dev/hwrng, is it? At least that would explain why I also only found the "grab and throw away what does not suite" method for generating random passwords on stackoverflow and alike.

so if hwrng is a higher entropy source, then I see no reason to switch to urandom.

With only 4 characters it doesn't make a difference, but also using urandom doesn't make anything worse. As said, it is not worse in terms of producing results that differ from each other, it is only worse in terms of making it possible for attackers to predict or reconstruct what has been produced. But the UUID is not a secret and can be obtain by any unprivileged user anyway.

@pelwell
Copy link
Member

pelwell commented Mar 27, 2023

Logging the number of characters read from hwrng, here's the old version:

[   66.982977] 8192
[   67.103745] 8192
[   67.224523] 8192
[   67.345313] 8192
[   67.466111] 8192
[   67.586912] 8192
[   67.707687] 8192
[   67.828480] 8192
[   67.949282] 8192
[   68.070074] 8192
[   68.190876] 8192
[   68.311680] 8192
[   68.432468] 8192
[   68.553250] 8192
[   68.674048] 8192
[   68.794841] 8192

a total of 128KB, and the new one:

[  110.796166] 4

a total of 4.

@KwadFan
Copy link
Contributor Author

KwadFan commented Mar 27, 2023

Reading the entire discussion, sounds to me ( and I am no native english speaker ), that changes suggested by @pelwell , using od, is the prefered way?

If yes I would prepare a commit to change to that version.

Regards Kwad

@XECDesign
Copy link
Member

If yes I would prepare a commit to change to that version.

Yes, that would be much appreciated. Thank you!

KwadFan added a commit to KwadFan/raspberrypi-sys-mods that referenced this pull request Mar 28, 2023
As discussed in RPi-Distro#71 changed behavior to
mentioned code by pelwel.

See RPi-Distro#71 (comment)
for details.

Signed-off-by: Stephan Wendel <me@stephanwe.de>
@KwadFan
Copy link
Contributor Author

KwadFan commented Mar 28, 2023

@XECDesign
Done, I hope the commit message fits.

Regards Kwad

KwadFan and others added 2 commits March 29, 2023 07:38
As discussed in RPi-Distro#71 changed behavior to
mentioned code by pelwel.

See RPi-Distro#71 (comment)
for details.

Signed-off-by: Stephan Wendel <me@stephanwe.de>
@XECDesign XECDesign merged commit 346005c into RPi-Distro:master Mar 29, 2023
@XECDesign
Copy link
Member

Thanks!

KwadFan added a commit to mainsail-crew/MainsailOS that referenced this pull request Apr 12, 2023
This reverts changes made in #214 as
described in #213.

Fix is already merged in RPi-Distro/raspberrypi-sys-mods#71

Raspberry hasn't released updated image yet, but should work as we
upgrade during build ci.

Signed-off-by: Stephan Wendel <me@stephanwe.de>
KwadFan added a commit to mainsail-crew/MainsailOS that referenced this pull request Apr 15, 2023
This reverts changes made in #214 as
described in #213.

Fix is already merged in RPi-Distro/raspberrypi-sys-mods#71

Raspberry hasn't released updated image yet, but should work as we
upgrade during build ci.

Signed-off-by: Stephan Wendel <me@stephanwe.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

firstboot script hangs if image contains already kernel 6.1
4 participants