Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RDRAND on AMD CPUs does not work #11810

Closed
TheBeasT15 opened this issue Feb 23, 2019 · 111 comments · Fixed by #12536
Closed

RDRAND on AMD CPUs does not work #11810

TheBeasT15 opened this issue Feb 23, 2019 · 111 comments · Fixed by #12536

Comments

@TheBeasT15
Copy link

@TheBeasT15 TheBeasT15 commented Feb 23, 2019

After suspending the system once can't suspend again. Shutdown gives unmount failed errors and can't shutdown too.
System info
System: Host: Capsparrow Kernel: 4.20.10-1-MANJARO x86_64 bits: 64 compiler: gcc v: 8.2.1 Desktop: KDE Plasma 5.15.0 Distro: Manjaro Linux Machine: Type: Laptop System: HP product: HP 245 G4 Notebook PC v: Type1ProductConfigId serial: <filter> Mobo: HP model: 80C7 v: KBC Version 98.0E serial: <filter> UEFI: Insyde v: F.1C date: 10/29/2015 Battery: ID-1: BAT0 charge: 38.8 Wh condition: 39.7/39.7 Wh (100%) model: Hewlett-Packard Primary status: Charging CPU: Topology: Quad Core model: AMD A8-7410 APU with AMD Radeon R5 Graphics bits: 64 type: MCP arch: Puma rev: 1 L2 cache: 2048 KiB flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 17572 Speed: 998 MHz min/max: 1000/2200 MHz Core speeds (MHz): 1: 1006 2: 1087 3: 1177 4: 1288 Graphics: Device-1: AMD Mullins [Radeon R4/R5 Graphics] vendor: Hewlett-Packard driver: radeon v: kernel bus ID: 00:01.0 Display: x11 server: X.Org 1.20.3 driver: ati,radeon unloaded: modesetting resolution: 1366x768~60Hz OpenGL: renderer: AMD MULLINS (DRM 2.50.0 4.20.10-1-MANJARO LLVM 7.0.1) v: 4.5 Mesa 18.3.3 direct render: Yes Audio: Device-1: AMD Kabini HDMI/DP Audio vendor: Hewlett-Packard driver: snd_hda_intel v: kernel bus ID: 00:01.1 Device-2: AMD FCH Azalia vendor: Hewlett-Packard driver: snd_hda_intel v: kernel bus ID: 00:14.2 Sound Server: ALSA v: k4.20.10-1-MANJARO Network: Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Hewlett-Packard driver: r8168 v: 8.045.08-NAPI port: 2000 bus ID: 01:00.0 IF: eno1 state: down mac: <filter> Device-2: Broadcom and subsidiaries BCM43142 802.11b/g/n vendor: Hewlett-Packard driver: wl v: kernel port: 2000 bus ID: 05:00.0 Drives: Local Storage: total: 465.76 GiB used: 68.75 GiB (14.8%) ID-1: /dev/sda vendor: Toshiba model: MQ01ABF050 size: 465.76 GiB Partition: ID-1: / size: 147.39 GiB used: 14.49 GiB (9.8%) fs: ext4 dev: /dev/sda1 ID-2: /home size: 294.29 GiB used: 54.26 GiB (18.4%) fs: ext4 dev: /dev/sda2 ID-3: swap-1 size: 4.00 GiB used: 0 KiB (0.0%) fs: swap dev: /dev/sda3 Sensors: System Temperatures: cpu: 52.9 C mobo: 0.0 C gpu: radeon temp: 54 C Fan Speeds (RPM): N/A Info: Processes: 176 Uptime: 1h 07m Memory: 3.32 GiB used: 1.74 GiB (52.3%) Init: systemd Compilers: gcc: 8.2.1 Shell: zsh v: 5.7.1 inxi: 3.0.30
Journalctl :

`➜ ~ journalctl -xe -p3 -b

-- Subject: A start job for unit network-suspend.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- A start job for unit network-suspend.service has finished with a failure.

-- The job identifier is 1544 and the job result is failed.
Feb 22 20:06:49 Capsparrow systemd[1]: sleep.target: Failed to set invocation ID for unit: File exists
Feb 22 20:06:49 Capsparrow systemd[1]: Failed to start Sleep.
-- Subject: A start job for unit sleep.target has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- A start job for unit sleep.target has finished with a failure.

-- The job identifier is 1543 and the job result is failed.
Feb 22 20:06:49 Capsparrow systemd[1]: network-resume.service: Failed to set invocation ID for unit: Fi>
Feb 22 20:06:49 Capsparrow systemd[1]: Failed to start Network resume service.
-- Subject: A start job for unit network-resume.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- A start job for unit network-resume.service has finished with a failure.

-- The job identifier is 1628 and the job result is failed.`

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Feb 25, 2019

These failures almost surely are kernel or driver issues. Please contact your downstream distro about this first, and let them escalate issues to us if they are sure this is a systemd issue, which however I think is unlikely.

Also, when you file a bug here, please fill in the form supplied, i.e. provide the systemd version and such. We put that form up for a reason. Thank you for understanding.

@poettering poettering closed this Feb 25, 2019
@bl33pbl0p

This comment has been minimized.

Copy link
Contributor

@bl33pbl0p bl33pbl0p commented Feb 25, 2019

@poettering Someone has reported this in #systemd IRC before. The key thing here is that after the first sleep, systemd returns -EEXIST for unit_set_invocation_id (which uses hashmap operations entirely).

It might still not be a systemd bug, but the upshot is units end up failing due to this error, so neither does suspend work for the reporter, nor shutdown, in particular, their logs said that multiple jobs failed due to the "Failed to set invocation ID for unit: File exists".

In their case, writing the value to the sysfs file would work, but making systemd start the target up wouldn't.

@bl33pbl0p

This comment has been minimized.

Copy link
Contributor

@bl33pbl0p bl33pbl0p commented Feb 26, 2019

@poettering Some bug reports in other distributions which were filed very recently:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=921267

https://bbs.archlinux.org/viewtopic.php?id=244399

Looks it broke after upgrading to a newer systemd for them...

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Feb 26, 2019

ok

@poettering poettering reopened this Feb 26, 2019
@Experimenter

This comment has been minimized.

Copy link

@Experimenter Experimenter commented Mar 7, 2019

Facing a similar issue, downgrading systemd to v239 solves the problem. The same reappears on v240 & v241. Tested with Manjaro and Debian sid.

@TheBeasT15

This comment has been minimized.

Copy link
Author

@TheBeasT15 TheBeasT15 commented Mar 8, 2019

@Experimenter Yes I did that now its resolved.

@sandy-8925

This comment has been minimized.

Copy link

@sandy-8925 sandy-8925 commented Mar 10, 2019

I'm also facing this problem all the time, on a fresh Arch Linux install, on a laptop with an AMD CPU.

I have another laptop and desktop, each with Intel CPUs and neither face this problem. Any chance that this specifically happens on AMD CPUs?

@poettering poettering added the login label Mar 11, 2019
@TheBeasT15

This comment has been minimized.

Copy link
Author

@TheBeasT15 TheBeasT15 commented Mar 17, 2019

Yep I am on AMD too I tried on Intel CPU worked flawlessly.

@Experimenter

This comment has been minimized.

Copy link

@Experimenter Experimenter commented Mar 17, 2019

@thodnev

This comment has been minimized.

Copy link

@thodnev thodnev commented Mar 28, 2019

Also got AMD CPU and experienced the same problem after latest Arch upgrade.
Downgraded packages systemd, lib32-systemd and systemd-sysvcompat to version 239.6-1 and it started working like a charm.
The problem is that it got packaged and distributed to many users as stable update.
Probably distro maintainers should be more careful.
sudo lshw
journalctl -b -1

@oyvinds

This comment has been minimized.

Copy link

@oyvinds oyvinds commented Apr 5, 2019

got this issue on Fedora 30 with kernel 5.0.6 on "AMD E1-6010 APU with AMD Radeon R2 Graphics". systemd version systemd-241-4.gitcbf14c9.fc30.x86_64.

After suspend any and all services fail to start with "Failed to set invocation ID for unit: File exists" and the machine is essentially unusable.

Tried using both radeon and amdgpu for the gpu part of the machines apu, that makes no difference.

I'm not sure where to begin figuring out why systemd breaks after return from suspend. Let me know if/how I can provide more information.

@ChrisJAllan

This comment has been minimized.

Copy link

@ChrisJAllan ChrisJAllan commented Apr 6, 2019

I have this issue on my laptop with "AMD A6-6310 APU with AMD Radeon R4 Graphics", but not on my desktop with an AMD CPU and an nVidia GPU, so it's probably something on the GPU end.

@oyvinds

This comment has been minimized.

Copy link

@oyvinds oyvinds commented Apr 6, 2019

Just a little detail, suspend to disk doesn't have this problem on my machine - just suspend to RAM. Thus, using suspend to disk instead is a somewhat acceptable workaround.

@ranjan-purbey

This comment has been minimized.

Copy link

@ranjan-purbey ranjan-purbey commented Apr 25, 2019

In my case, the problem occurs not only with suspend but also after restarting any of the systemd services. For e.g., running the following:

$ sudo service systemd-resolved restart

causes any subsequent attempts to start/restart a service to fail.
Yes, I am also on AMD CPU + Radeon GPU

@ranjan-purbey

This comment has been minimized.

Copy link

@ranjan-purbey ranjan-purbey commented May 1, 2019

Even v142 has the same issue. So for the time being, downgrading to v139 seems to be the only work-around.

@mbiebl

This comment has been minimized.

Copy link
Contributor

@mbiebl mbiebl commented May 1, 2019

Apparently, this issue is easy to trigger, if you have the right hardware (and unfortunately none of the developers seem to have those).
So, if anyone who is affected by this issue can run a git bisect between v239 and v240 to find the first faulty commit, this would be super helpful.

@oyvinds

This comment has been minimized.

Copy link

@oyvinds oyvinds commented May 1, 2019

Triggering this really is super-easy on the AMD E1-6010, it's one of those happens-every-time bugs. Doing a git bisect on a 1.4 GHz dual-core isn't very tempting, though. That's the only hardware I have where this happens, Ryzen+RX570 doesn't have this problem. I could do if nobody else with hardware that doesn't spend days compiling systemd (or anything) does it.

@mbiebl

This comment has been minimized.

Copy link
Contributor

@mbiebl mbiebl commented May 1, 2019

@oyvinds That would be great, thanks! If the AMD E1-6010 is indeed that slow, I would consider compiling it on your faster Ryzen system and copying the binaries/rpm over.

@madhur4127

This comment has been minimized.

Copy link

@madhur4127 madhur4127 commented May 1, 2019

@oyvinds @mbiebl, I am also affected by the bug and I want to help, but I am new to Open Source and require guidance as this project is so big!

I have AMD A8-7410 quad-core 2.2Ghz, so I think it would suffice.

@sandy-8925

This comment has been minimized.

Copy link

@sandy-8925 sandy-8925 commented May 1, 2019

I haven't used that problematic system recently, will see if I can still reproduce, and if yes then I will try bisecting.

@jimy-byerley

This comment has been minimized.

Copy link

@jimy-byerley jimy-byerley commented May 4, 2019

Hello everyone.
I have the same problem with an AMD E1 and AMD radeon R2. My laptop can't suspend a second time and then can neither shutdown.
I got the same kind of message in journalctl:

mai 04 12:09:20 wopr systemd[1]: systemd-suspend.service: Failed to set invocation ID for unit: File exists
mai 04 12:09:20 wopr systemd[1]: Failed to start Suspend.
-- Subject: L'unité (unit) systemd-suspend.service a échoué
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- 
-- L'unité (unit) systemd-suspend.service a échoué, avec le résultat failed.
@mbiebl

This comment has been minimized.

Copy link
Contributor

@mbiebl mbiebl commented May 4, 2019

I guess at this point we don't need further "me too's", as this doesn't really help us to find the root cause. Instead we need someone with the appropriate hardware to find the first faulty commit.

@madhur4127

This comment has been minimized.

Copy link

@madhur4127 madhur4127 commented May 5, 2019

@mbiebl, I am currently bisecting between v239 and v240.

Using systemd-nspawn -bi image.raw to test images causes image to be loaded in terminal itself. How can I test whether the current commit is good/bad?

I thought of using systemctl suspend twice to test for image but the error displayed is: Failed to suspend system via logind: Sleep verb "suspend" not supported

@mbiebl

This comment has been minimized.

Copy link
Contributor

@mbiebl mbiebl commented May 5, 2019

@madhur4127 You'll need to run systemd no bare-metal and not inside a systemd-nspawn container. You can't suspend a container via systemctl suspend.

@jimy-byerley

This comment has been minimized.

Copy link

@jimy-byerley jimy-byerley commented May 6, 2019

I was trying to compile systemd to do a git bisect, but I went to an error during ./configure because my version of libmount (use by command mount) is to old (<= 2.30, I use a debian stable for the compilation)
I tried to recompile libmount myself, from util-linux, but after that pkgconfig continues to say my libmount version is 2.29. Is there a way to correct it ?

@vcaputo

This comment has been minimized.

Copy link
Member

@vcaputo vcaputo commented May 6, 2019

@jimy-byerley In my experience it's simpler to just modify the meson.build file where the libmount >= 2.30 dependency is specified. However Debian stable's version of Meson is also too old for the systemd meson.build syntax, so I also have a newer Meson version built from source in /usr/local.

With those two changes systemd builds on Debian stable for me, unless I'm forgetting other things it needed.

@keszybz

This comment was marked as off-topic.

Copy link
Member

@keszybz keszybz commented Jul 8, 2019

@forthy42 try reading more carefully. The call to rdrand() is immediately followed by a fallback to getrandom() and /dev/urandom.

@forthy42

This comment has been minimized.

Copy link

@forthy42 forthy42 commented Jul 8, 2019

Some invocations of rdrand() fall back to libc's rand()… instead of calling again. This sort of code is error-prone; don't do that.

BTW: If all you need is a non-repeating sequence of lousy quality pseudo random 64 bit numbers, read one single seed, feed it into an xorshift PRNG, and use that output for the rest. It's absolutely guaranteed to not repeat itself, regardless if that first seed is 0 or -1 or really random.

@mbiebl

This comment was marked as outdated.

Copy link
Contributor

@mbiebl mbiebl commented Jul 8, 2019

Some invocations of rdrand() fall back to libc's rand()… instead of calling again. This sort of code is error-prone; don't do that.

BTW: If all you need is a non-repeating sequence of lousy quality pseudo random 64 bit numbers, read one single seed, feed it into an xorshift PRNG, and use that output for the rest. It's absolutely guaranteed to not repeat itself, regardless if that first seed is 0 or -1 or really random.

@forthy42 This sounds like a reasonable request. Could you file it as RFE and post your input there?

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jul 8, 2019

Some invocations of rdrand() fall back to libc's rand()… instead of calling again. This sort of code is error-prone; don't do that.

Please try to understand the usecases for the fallback-to-rand() logic. It's documented in long detail here:

https://github.com/systemd/systemd/blob/master/src/basic/random-util.c#L36
https://github.com/systemd/systemd/blob/master/src/basic/random-util.c#L354

There's really no point in warming up this discussion over and over again. Instead, please just read the comments, it saves everyone time.

We fall back to rand() if rdrand() is only used where the quality really doesn't matter much (such as: a loop around temporary file generation which includes a random number, where low entropy just means we loop a bit longer, during early boot where the kernel's own pool is not filled yet, at which point we use that.).

BTW: If all you need is a non-repeating sequence of lousy quality pseudo random 64 bit numbers, read one single seed, feed it into an xorshift PRNG, and use that output for the rest. It's absolutely guaranteed to not repeat itself, regardless if that first seed is 0 or -1 or really random.

No, let's not roll our own PRNGs, we don't want to maintain that. it's fine to use the CPU RNG, the kernels or glibc's, but I don't want to maintain our own version in our codebase, sorry. Also, why even? glibc is fine for low quality randomness. If you think glibc's implementation sucks, please work with them to fix that.

@forthy42 This sounds like a reasonable request. Could you file it as RFE and post your input there?

That's just hiding a bug that is elsewhere. In our own code we handle the case fine too, and log about it, that's already good enough. I mean, if we'd just generate a series froever from a single constant value that means when we ever need low quality entropy (for cases such as temporary file name generation) we are forever guessable, which sucks...

@forthy42

This comment has been minimized.

Copy link

@forthy42 forthy42 commented Jul 8, 2019

No, let's not roll our own PRNGs, we don't want to maintain that. it's fine to use the CPU RNG, the kernels or glibc's, but I don't want to maintain our own version in our codebase, sorry. Also, why even? glibc is fine for low quality randomness. If you think glibc's implementation sucks, please work with them to fix that.

It used to be an LCG, and if this is good enough for you, just use it. glibc has already an improved random() function, and the current version uses that for rand(), too (the LCG is gone for good). The random() function passes some of those tests where the LCG fails, and has enough internal state for non-cryptographic use-cases. It's not in a state I would complain about. Given sufficient seeding, it's good. RdRand, if correctly implemented, is cryptographically good, so far better. But you can't rely on that.

The question would be rather why use rdrand() for anything else than seeding rand(), since you can't rely on it. You need to know it's good to actually prefer it.

All you need is to seed rand(); using rdrand for this purpose is good (as you do now), because it's just once, and if the rdrand seed is deterministic, it only fails security concerns (the attacker then can guess all the init states of the hash tables if it has some clues about the boot time). Logging the seed value would be a good idea, because people can check if it looks random. It won't crash if it isn't.

Essentially, this general approach (seed with monitored entropy source+deterministic, well-known PRNG that is never reseeded once it's running) is what DJB also suggests for cryptographic PRNGs:

http://blog.cr.yp.to/20140205-entropy.html

This approach is sane, easy to maintain and won't fail, and problems can be monitored. If you don't need high quality randomness, then your PRNG can indeed be simple (LCG probably a bit too bad, but random() is good).

Actually, you already do that, if rdrand() fails. All you need is to replace the for(;;) loop at line 179 with the fallback to pseudo_random_bytes() and not bother with rdrand() outside of initialize_srand(). There, it's only an entropy source, and you can use it (together with at least one other lousy entropy source, monitored). The only suggestion I would have for pseudo_random_bytes is to use the most significant 24 bits of rand() (>>23, >>15, >>7), not the least, if you compile for an old glibc version where rand() actually still is the LCG. With rand()=random(), it doesn't matter.

The security researcher problem with rdrand() is that it is a black box, and can fail in unexpected ways. You can't audit it, and you can't expect it to be reasonable, or that it sets the “I failed” flag when it failed, or that retry is actually a cure (the CF is a “please retry” flag in Intel's documentation). Ryzen's behavior proves that pessimistic assumption to be absolutely correct.

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jul 9, 2019

The question would be rather why use rdrand() for anything else than seeding rand(), since you can't rely on it. You need to know it's good to actually prefer it.

The comments explain that already.

We assign UUIDs for all service invocations, to make them recognizable in logs and stuff so that we can associate logs with their invocations properly. For UUIDs using rand() is not good enough, since the results are likely not going to be unique. OTOH we don't need cryptographic quality either, since these are not key material. We used to pull UUIDs from /dev/urandom, which sucks during early boot though, since the kernel complains if we do that while the pool isn't fully initialized. We can't use getrandom() for this usecase either, since it either fails entirely or blocks during early boot, and possibly for a long time. So we try to avoid using /dev/urandom for such stuff if we can, by using RDRAND, which is not superduper fast, but also not super dupler slow, doesn't result in log spew. When RDRAND doesn't work though we will fall back to /dev/urandom for thus, but never to rand().

In general: we have to trust that CPUs work. If they don't we are fucked. We don't have to trust them to the level to directly derive cryptographic keys from it (though we usually do now, most bigger distros trust RDRAND now for /dev/urandom filling), but we certainly have to trust them that the basics work. And no, on those AMD CPUs the basics don't work at all, apparently. I mean, if you can't even rely on AMD CPUs to give you a non-constant value on RDRAND, what's next? that it returns a constant value on ADD? I mean, we have to put the line somewhere...

And anyway, we detect the constant values these days, and log. It sucks hard, but it does what is needed to to do. If we'd just seed a systemd-private PRNG from RDRAND exactly zero is won, since the AMD CPUs don't fail when you query many RDRAND values, but alread yon the first. So what would you gain, besides having to maintain your own PRNG which we really don't want?

And anyway, random-util.c has pretty much more comments than code these days, explaining all our usecases, with a strong focus on the fact we need all this during early boot. Instead of warming that up again and again, why not just have a look there and read up? Because this stuff has been discussed before, and you are not really adding anything new to the discussion.

@forthy42

This comment has been minimized.

Copy link

@forthy42 forthy42 commented Jul 9, 2019

No, the comment doesn't. I've read it, and I'm not convinced.

First of all: If you need the UUIDs to be unique, either use a reliable nonce creating technique, or do something else to avoid collisions (e.g. re-get another one, and fall back to increment if that fails, too). You said you don't want to maintain your own PRNG, but a cryptographic random number generator definitely can't promise you to deliver unique numbers (it's just pretty unlikely that within a small bunch of 64 bit numbers, two are equal — but if you boot on billions of computers, and create many UUIDs, it will probably happen to someone, sometimes). LCGs the way you use them (lower 24 bits) will give you non-unique results with a much higher likelihood, which is why rand() on older glibcs can't be used, and I actually don't know how good the new random() is for that purpose (it's better, but is it good enough?). Unless you add the inevitable collision avoidance code, which is what you should have done in the first place, anyways.

/dev/urandom has a PRNG that passes all relevant tests (even if badly seeded), but it will give you the same number of random collisions as any other true random source (few, but not zero). You'll likely never see them in your lab, and if the user reboots if it fails to boot once, he also won't see it again. Don't treat a random number as predictable nonce; if you want a nonce, use a deliberate pedictable nonce function. If you want to be cryptographically hard, the only nonce function you can use is a counter encrypted with a secret key and a block cipher (take the full block then, it will not repeat as long as the counter doesn't repeat). If you can accept external predictability, just use a counter. If you can detect collisions, add avoidance code. Since systemd fails with a lot of error messages, you obviously can detect collisions, you then can also avoid them.

About trusting CPUs: We know that we can't trust RdRand, because it's such an easy attack vector. Nobody trusts it. AMD got away with their bug in their own lab, because nobody trusts it (apart from recent systemd, which trusts it even to deliver a nonce, which was never part of the promise). Don't trust it. We are not fucked up if we know where we can't trust the CPU. We are not fucked up if we can't trust the entropy, but then, we need to test the entropy before we use it.

Actually, cryptographic hard code does a lot of stuff to avoid those parts of CPUs where we know we can't trust them. We need particular code to minimize side channel leaks of secrets (conditional moves instead of branches, no memory indexed by secrets, etc.; side channel leaks are more than a decade older than Spectre), we need to work around over-optimization in C compilers to make sure our secret copies are erased after use (there was an hour-long talk on 35c3 about only that topic, a conceptual simple thing like predictable erase memory is already hard — because compiler writer don't understand that this is not something that can be optimized away). We are not fucked up if things don't quite work as they should.

@forthy42

This comment has been minimized.

Copy link

@forthy42 forthy42 commented Jul 9, 2019

BTW, one comment to improve the auditability of this code (it's going to be looked at by many people, anyhow):

This random generator has a number of different users (callers) that have clearly very different requirements. It satisfies those by having a bunch of flags, and different code paths for the different flags; part of the flags are obvious optimizations, i.e. for speed. This makes it pretty hard to audit.

It would be much cleaner if each of the different requirements had their own function including with spec documented, e.g. get_random_uuid(uuid * x, int (check_collision)(uuid x)) for the UUID stuff. The spec should say what you need, like

[x] unique
[x] with check function to verify that
[x] unpredictable
[ ] fast
[x] reliable (caller has no fallback)
[64] bits

something like that. Then auditors can look at it and quickly know what a particular function needs to do.

Do one thing and do that right is a good idea for functions in C programs, too.

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jul 10, 2019

No, the comment doesn't. I've read it, and I'm not convinced.

First of all: If you need the UUIDs to be unique, either use a reliable nonce creating technique, or do something else to avoid collisions (e.g. re-get another one, and fall back to increment if that fails, too). You said you don't want to maintain your own PRNG, but a cryptographic random number generator definitely can't promise you to deliver unique numbers (it's just pretty unlikely that within a small bunch of 64 bit numbers, two are equal — but if you boot on billions of computers, and create many UUIDs, it will probably happen to someone, sometimes).

Well, it's how uuids are generated these days. Pretty much everyone just uses type 4 uuids (i.e. randomized ones). If you think that the likeliness of collision is too high, by all means, fix the whole stack, but please start with the linux kernel, which for example generates uuids this way via /proc/sys/kernel/random/uuid or /proc/sys/kernel/random/boot_id.

I am pretty sure we are fine with this mode of operation as long as the Linux kernel is.

And no, the other ways to generate uuids don#t really work: maintaining global counters is either unsafe (because we need uuids in all kinds of processes, and we can't share state safely if they run with different privileges and shall not be able to trigger collisions easily). And things like uuidd doesn't work during early boot either.

LCGs the way you use them (lower 24 bits) will give you non-unique results with a much higher likelihood, which is why rand() on older glibcs can't be used, and I actually don't know how good the new random() is for that purpose (it's better, but is it good enough?). Unless you add the inevitable collision avoidance code, which is what you should have done in the first place, anyways.

We don't use glibc rand() for generating uuids, and that's expressly documented in our comments.

About trusting CPUs: We know that we can't trust RdRand, because it's such an easy attack vector. Nobody trusts it. AMD got away with their bug in their own lab, because nobody trusts it (apart from recent systemd, which trusts it even to deliver a nonce, which was never part of the promise). Don't trust it. We are not fucked up if we know where we can't trust the CPU. We are not fucked up if we can't trust the entropy, but then, we need to test the entropy before we use it.

Yes, but already discussed to length: Thing though is that a) we don't trust RDRAND for generating cryptographic keys. When we do that we go to /dev/urandom like everyone should. b) we trust it enough for generating uuids, which are not cryptographic key material, where RDRAND should be fine and c) The linux kernel on various prominent distros actually does trust RDRAND now, and will fill /dev/urandom with it crediting it for full entropy, so that we'll actually end up using RDRAND after all in the end... i.e. check this:

$ grep CONFIG_RANDOM_TRUST_CPU /usr/lib/modules/`uname -r`/config
CONFIG_RANDOM_TRUST_CPU=y

(that's from Fedora)

So, yes, in systemd we are not going to make the decision to rely on RDRAND for key generation, but your distro might have configured to /dev/urandom like that, so we might ultimately after all.

systemd uses RDRAND directly only for generating UUIDs, seeding hash tables, temporary file name starting seeds, unix UID acquisition statring seeds, where things generally don't matter that much.

Actually, cryptographic hard code does a lot of stuff to avoid those parts of CPUs where we know we can't trust them. We need particular code to minimize side channel leaks of secrets (conditional moves instead of branches, no memory indexed by secrets, etc.; side channel leaks are more than a decade older than Spectre), we need to work around over-optimization in C compilers to make sure our secret copies are erased after use (there was an hour-long talk on 35c3 about only that topic, a conceptual simple thing like predictable erase memory is already hard — because compiler writer don't understand that this is not something that can be optimized away). We are not fucked up if things don't quite work as they should.

Yeah, sure, for cryptographic stuff we use /dev/urandom, read the sources and its comments, it makes that clear.

BTW, one comment to improve the auditability of this code (it's going to be looked at by many people, anyhow):

This random generator has a number of different users (callers) that have clearly very different requirements. It satisfies those by having a bunch of flags, and different code paths for the different flags; part of the flags are obvious optimizations, i.e. for speed. This makes it pretty hard to audit.

It would be much cleaner if each of the different requirements had their own function including with spec documented, e.g. get_random_uuid(uuid * x, int (check_collision)(uuid x)) for the UUID stuff. The spec should say what you need, like

uh? callback functions? what's that supposed to do? also, anything with such callback functions probably makes things harder to read, not simpler. (I know, because I spend 90% of my days reviewing code, do you?)

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jul 12, 2019

BTW, AMD is preparing a firmware update now that fixes RDRAND for them (not sure of just one new Ryzen or on old Ryzen too):

https://www.heise.de/newsticker/meldung/AMD-beseitigt-Fehler-des-Ryzen-3000-ueber-BIOS-Updates-4468815.html

(sorry, German only)

@forthy42

This comment has been minimized.

Copy link

@forthy42 forthy42 commented Jul 12, 2019

The Linux kernel uses a (mostly) sane construction: It seeds an entropy pool and uses a cryptographic hard (chacha20-based) PRNG to emit a stream of randomness. With 16 bytes out of a chacha20-based PRNG with a central entropy pool (so no way to read the same value twice; there's a central counter besides the entropy pool), collision is clearly low enough to be tolerable.

Remaining sanity problems: The kernel uses rdrand() in some places (get_random_u64(), e.g. used as random ID for seccomp stuff; this is probably arcane enough). What it should do is using get_random_bytes() that does not block when there's not enough entropy in the pool. This still delivers a stream of cryptographic hard stream of randomness that is to some extend predictable (not enough entropy in the pool), but certainly won't provide collisions.

This is something the Linux kernel only partly delivers to userland, but is worth having. Implementation would be straight-forward, just another flag to getrandom() which returns a bunch of pseudo-random bytes without blocking, even when the pool is not yet filled with trustworthy entropy, but only with the non-trustworthy RDSEED entropy. This would relieve the systemd's problem that it needs entropy very early in the boot sequence.

Essentially, it's a one-line patch of random.c, line 2135:

-if (!crng_ready()) {
+if (!crng_ready() && !(flags & GRND_PREMATURE) {

BTW: Directly reading from /dev/urandom has the same effect, as unlike getrandom() it does never block, only warn when there's no entropy.

The point is that when you want collision-free stuff, you have about four options:

a) use a known cryptographic hard random source that has central entropy pool+state (entropy doesn't matter much, it matters that reads always give non-overlapping parts of the output stream)
b) a nonce generator that is guaranteed to deliver non-repeating numbers (which, BTW, even the LCG rand() does, as long as it follows the central entropy pool+state paradigm which is probably difficult if that are different processes, and the seed may fail); cryptographic hard nonce generators are e.g. block ciphers in counter mode with a random seed as key
c) a central counter (not worrying about predictability)
d) a entropy source that is good enough to pass the randomness test suite like dieharder (looks like AMD's entropy source actually is that, because RDRAND and RDSEED fail the same way — that indicates that they don't have a conditioning PRBG in RDRAND, just an entropy source with an inadequate power saving mode, and RDRAND+RDSEED are identical)

If you want to scale, and not rely on a centralized pool or state, you can use construct a) with per-process/thread PRBG (pseudo-random; if you a block cipher with a counter as tweak, a per-thread guaranteed non-overlapping starting point of the counter is sufficient); the seed for that can come from something like randombytes(), or you use construct d) with per-core entropy sources (non-deterministic NRBG).

The thing that is no option is a entropy source that is not either monitored for quality or by audit known good. The failure modes of hardware are not necessarily the ones you think they are. “Don't matter that much” assumes maybe a well-hidden backdoor (indistinguishable from randomness unless you have the key), not a rather hard failure mode, e.g. entropy source not powered up.

You don't trust RDRAND for cryptography, but you do trust it for non-collision. That works for the Intel case, where RDRAND is the output of a stream cipher (AES in counter mode with the key coming from the entropy source — if the entropy source fails, the key stream still will look indistinguishable from proper random). That doesn't work for the AMD case, where a failing entropy source also causes RDRAND to fail the same way as RDSEED.

A dubious quality random number source where you can know when it is failing (by simply analyzing it) is probably one of the better deals you can get in this situation. It helps if you can only destroy it, but can't make it predictable by someone who knows a secret.

BTW: You missed the main point in the last part: have separate functions for separate purposes, and clearly specify what they should deliver. The callback for checking for collisions was an example. The main thing is that you properly specify what you need, and then you can evaluate whether you call the right function. Linux has generate_random_uuid(), which calls get_random_bytes(), which warns only if the entropy is insufficient, but delivers a collision-unlikely PRNG output nonetheless.

This is something that is quite straight-forward to audit, even when it lacks comments. Anyways: with the exception of two or three cases where get_random_u64 is used in the Linux kernel, and the lack of a pseudo-random fallback mode for getrandom(), there's almost nothing to complain there: This uses a state of the art construction with entropy source+predictable expansion. The code there has clearly been written by someone who knows what he was doing, and has changed completely after 2013 (for good).

@owlstead

This comment has been minimized.

Copy link

@owlstead owlstead commented Jul 23, 2019

You don't trust RDRAND for cryptography, but you do trust it for non-collision. That works for the Intel case, where RDRAND is the output of a stream cipher (AES in counter mode with the key coming from the entropy source — if the entropy source fails, the key stream still will look indistinguishable from proper random). That doesn't work for the AMD case, where a failing entropy source also causes RDRAND to fail the same way as RDSEED.

Of course this doesn't work for Intel either. The key stream will look random on it's own but it would be indistinguishable from an earlier generated stream that uses the same seed material. Maybe that won't wrong-foot the mentioned systemd, but it would break any other system that relies on it to not collide with a previous stream. For instance, let's assume that you start by generating UUIDs: these are supposed to be unique across systems, not just the booting system.

@tytso

This comment has been minimized.

Copy link
Contributor

@tytso tytso commented Jul 23, 2019

So that's not quite rigth. RDRAND is fine for creating random cryptographic keys; it is NIST SP 800-90A complaint. However, if you need to seed another cryptographic generator, you should use RDSEED. For more details, see [1]

[1] https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed

All of this is of course if you trust Intel, of course. Of course, I'd probably trust Intel much more than I would trust an ARM chip manufactured by Huawei --- but someone working for the Ministry of State Security or the Chinese People's Liberation Army might have an opposite point of view on that score.

As far as GRND_PREMATURE is concerned, it is something I've considered, although it would probably be something like GRND_I_SOLEMNLY_SWEAR_I_UNDERSTAND_THIS_IS_INSECURE. Ultimately, it's about how much do you trust application programmers, and whether someone (perhaps a less sophisticated distribution packager/maintainer) might blindly apply a patch that adds GRND_PREMATURE to "fix" a hang during early boot problem, when it's being used to create long-term public/private keypairs. One could argue that silly/stupid programmers like that might very well use /dev/urandom, and that's a fair point. My preference would be to simply tell people if you really need something which is unique, but not necessarily cryptographically secure, to simply read a UUID from /proc/sys/kernel/random/uuid. Yes, today it will potentially issue a warning to dmesg since it calls get_random_bytes(), but I'd much rather fix the kernel to suppress a warning if someone fetches a UUID via /proc/sys/kernel/random/uuid than to add a new GRND_WATCH_ME_GET_ABUSED_BY_A_CLUELESS APPLICATION_PROGRAMMER flag. Because I can guarantee it will happen, and probably it will get baked into some trashy IOT device....

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jul 23, 2019

hmmm, but with a totally not initialized pool on a system that is entropy starved (i.e. no rdrand, no hwrng of any form), generating a uuid from /proc/sys/kernel/random/uuid wouldn't be solving our problem either, would it? i mean, if there's really no entropy than pulling a uuid from that would result in basically a constant value, which defeats the purpose of uuids...

@tytso

This comment has been minimized.

Copy link
Contributor

@tytso tytso commented Jul 23, 2019

It'll be at least as good as using /dev/urandom, and you're OK with that, right? We will seed the entropy pool with the hardware TOD clock, and with the ethernet addresses for any device drivers that are loaded (although if they are all modules, that'll be an ordering issue), etc. So if you're going for uniqueness but not necessarily cryptographically secure, in most cases it will hopefully be good enough. Now if the there is no hardware TOD clock (no batteries, and you're utterly reliant on ntp), and all of your networking devices are modules loaded very late in the boot process, this might no be all that helpful --- but then you're screwed if you use /dev/urandom. Again, /proc/sys/kernel/random/uuid is no worse than /dev/urandom. And today it will issue a kernel warnings, but that is something I'll suppress in the future, especially if we can advertise that the only safe things people should be using is /proc/sys/kernel/random/uuid or getrandom(2), and there may be times when the uuid won't be unique, but ultimately, if you're using crap designed hardware, we can't manufacture uniqueness (or randomness) where there is none.....

@forthy42

This comment has been minimized.

Copy link

@forthy42 forthy42 commented Jul 24, 2019

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jul 24, 2019

It'll be at least as good as using /dev/urandom, and you're OK with that, right?

No, we do not use /dev/urandom for UUID generation (at least not on systems that support getrandom() that is, on older stuff we do, simply because we don't have anything better and then use /dev/urandom as replacement for getrandom()). On modern systems, for UUID generation we'll try RDRAND first, and synchronous getrandom(flags=0) second, and that's where the story ends.

We will seed the entropy pool with the hardware TOD clock, and with the ethernet addresses for any device drivers that are loaded (although if they are all modules, that'll be an ordering issue), etc. So if you're going for uniqueness but not necessarily cryptographically secure, in most cases it will hopefully be good enough.

Typical general purpose systems are usually not linking ethernet drivers in, which leaves the clock as only source, but if you start multiple systems at once (which is likely enough in virtualized environments) there's a good chance they then all generate the same series of uuids, which we don't want...

Now if the there is no hardware TOD clock (no batteries, and you're utterly reliant on ntp), and all of your networking devices are modules loaded very late in the boot process, this might no be all that helpful --- but then you're screwed if you use /dev/urandom. Again, /proc/sys/kernel/random/uuid is no worse than /dev/urandom. And today it will issue a kernel warnings, but that is something I'll suppress in the future, especially if we can advertise that the only safe things people should be using is /proc/sys/kernel/random/uuid or getrandom(2), and there may be times when the uuid won't be unique, but ultimately, if you're using crap designed hardware, we can't manufacture uniqueness (or randomness) where there is none.....

I would prefer if the warning would stay for /proc/sys/kernel/random/uuid too... In virtualized environments the chance is too high that on general purpose distros this will generate a non-unique stream of uuids.

(BTW, related to this, in #13137 I prepped a PR that makes systemd credit entropy with data stored in a specific EFI variable to the kernel pool if its set, very early on. It also contains work that systemd-boot (which is an EFI boot loader included in our tree) reads a random seed file from the ESP, then generates two new seeds from it with SHA256 counter mode, rewriting the random seed file with one, and writing the other into the EFI variable. This should then give us a full pool from earliest userspace on, at least on EFI systems that use systemd-boot, and if the random seed file in the ESP was initialized with good entropy first. It takes inspiration from NetBSD's boot loader random seed stuff, but is a bit more careful to never reuse the same unmodified seed in multiple boots on the same or on different systems. Ideally the kernel would already read that EFI variable itself btw, and unset it. And ideally there'd be a non-EFI mechanism for this too.)

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jul 24, 2019

If the purpose is getting an UUID, it's OK. File interfaces are problematic, as they might not be mounted (e.g. early in the boot sequence, or in a jailed environment). That was the original point of providing a kernel call for this function.

Nah, systemd runs as PID 1, it always mounts /proc as first thing, and the first uuid it needs afterwards. We are totally OK with use /proc if we need to. That said, we don't want to generate uuids from an uninitialized pool, since the chance is too high for it not resulting in unique ids.

ashkitten added a commit to ashkitten/systemd that referenced this issue Jul 25, 2019
An ugly, ugly work-around for systemd#11810. And no, we shouldn't have to do
this. This is something for AMD, the firmware or the kernel to
fix/work-around, not us. But nonetheless, this should do it for now.

Fixes: systemd#11810
leo60228 added a commit to leo60228/systemd that referenced this issue Jul 27, 2019
An ugly, ugly work-around for systemd#11810. And no, we shouldn't have to do
this. This is something for AMD, the firmware or the kernel to
fix/work-around, not us. But nonetheless, this should do it for now.

Fixes: systemd#11810
Mic92 added a commit to NixOS/systemd that referenced this issue Jul 30, 2019
An ugly, ugly work-around for systemd#11810. And no, we shouldn't have to do
this. This is something for AMD, the firmware or the kernel to
fix/work-around, not us. But nonetheless, this should do it for now.

Fixes: systemd#11810
@poettering poettering changed the title Can't suspend again after suspending one time. RDRAND on AMD CPUs does not work Aug 15, 2019
@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Aug 15, 2019

Just to mention this: it appears AMD admits to the borkage now, and now posted this:

https://lore.kernel.org/patchwork/patch/1115413/

In times of most big distros defaulting to CONFIG_RANDOM_TRUST_CPU=y this sounds like a big CVE-worthy security vuln, but apparently noone cares enough for AMD CPUs?

Also, I put together this:

https://systemd.io/RANDOM_SEEDS

That document explains systemd logic behind random seed initialization and RDRAND usage.

dsahern pushed a commit to dsahern/systemd that referenced this issue Nov 4, 2019
An ugly, ugly work-around for systemd#11810. And no, we shouldn't have to do
this. This is something for AMD, the firmware or the kernel to
fix/work-around, not us. But nonetheless, this should do it for now.

Fixes: systemd#11810
(cherry picked from commit 1c53d4a)
@noloader

This comment has been minimized.

Copy link

@noloader noloader commented Jan 17, 2020

In case someone is interested in picking up a test rig... The dell inspiron 3180 is a 15h machine (cpu_family = 21) and should have the buggy instruction. A used one costs about $90 USD.

@noloader

This comment has been minimized.

Copy link

@noloader noloader commented Jan 18, 2020

AMD provides a secure RNG library at RNG Library. It was last updated Jan 15, 2020.

The latest secure RNG code is available in aocl-securerng-linux-aocc-2.1.tar.gz. The release notes and source code do not discuss the RDRAND bug:

amd-securerng$ egrep -iIR 'bug|resume'
...
ReleaseNotes_AMDSecureRNG.txt:-   Bug fixes and improvements
amd-securerng$

Looking at the source code in src_lib/secrng.c, it appears the library just uses _rdrand32_step (and friends), and checks the return value of the intrinsic. The return value is set by carry flag from the instruction.

It is not clear (to me) if all that is needed is to check the carry flag.

However, the release notes skip 15h and 16h processors, and moves on to 17h processors:

System Requirements
-------------------

Hardware - Processor should support RDRAND and RDSEED instructions. AMD Family 
17h processors and other products of 2016 and beyond support these instructions

It is a pretty shitty way to do things... don't discuss the bug, and don't tell folks how to use the instruction safely.

@poettering

This comment has been minimized.

Copy link
Member

@poettering poettering commented Jan 20, 2020

Looking at the source code in src_lib/secrng.c, it appears the library just uses _rdrand32_step (and friends), and checks the return value of the intrinsic. The return value is set by carry flag from the instruction.

It is not clear (to me) if all that is needed is to check the carry flag.

We check the carry flag. Checking the carry flag is documented, and hence we do it. However, on the affected AMD CPU flag they set the carry flag incorrectly and hand us constant data with no indication that it is rubbish

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.