Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rngd and the LRNG #173

Closed
smuellerDD opened this issue May 31, 2022 · 14 comments
Closed

rngd and the LRNG #173

smuellerDD opened this issue May 31, 2022 · 14 comments

Comments

@smuellerDD
Copy link
Contributor

When using the rngd on the LRNG patch set [1], the rngd starts an endless loop. The endless loop repeatedly shows "Added 648/1280 bits of entropy". The issue is the following

The architecture of the LRNG is such that it reports a pool size such as the following:

# cat /proc/sys/kernel/random/poolsize      
1288

which you can never fill by simply writing data into the kernel with the IOCTL RNDADDENTROPY. The reason is that the poolsize shows the maximum number of all entropy pools present in the LRNG. But when writing data to the kernel, you can only update one pool. I.e. after the RNDADDENTROPY IOCTL

# cat /proc/sys/kernel/random/entropy_avail 
456

may go up, but will reach a point where no additional IOCTL can change this value.

Thus when the rngd random_add_entropy performs RNDADDENTROPY followed by RNDGETENTCNT, eventually it will not see the value going up.

I see the following solutions which I would like to discuss:

  • the LRNG is changed such that it only reports the pool contents of the pool that is externally changeable

  • the rngd is changed such that the loop stops if the rngd finds it cannot update the entropy

IMHO the 2nd option sounds more like it because even in the upstream random.c implementation there could be an issue where you cannot update the pool and thus run into an endless loop.

I have tested option 1 and it makes the rngd working on the LRNG. But IMHO this is not a good solution as it does not reflect reality.

[1] https://github.com/smuellerDD/lrng

@nhorman
Copy link
Owner

nhorman commented May 31, 2022

I should point out that if the LRNG patch has altered /proc/sys/kernel/random/poolsize's meaning, that could be construed as a alteration of userspace ABI, and should likely be modified from a philosophical standpoint. This also goes for the write_wakeup_threshold value. It seems like the LRNG patch has changed the ABI such that the previous meaning of the value (the amount of entropy in the pool at which a process blocking on a POLLIN event will be scheduled) to something else, which is really not good.

How do you propose detecting a failure to add entropy to the pool? I see you are suggesting that we follow the call to RNDADDENTROPY with a call to GETENTCNT, terminating the loop on detection of failure to make the entropy count go up, but I'm a bit leery of that heuristic, as it would cause all our watermarking to be broken (i.e. if someone was draining entropy, we would just continually attempt to fill the pool, even if it was close to full already).

What would be better is if you re-introduced a mechanism into the kernel to emulate what write_wakeup_threshold did previously. From the admin guide:

  • write_wakeup_threshold: when the entropy count drops below this
    (as a number of bits), processes waiting to write to /dev/random
    are woken up. This file is writable for compatibility purposes, but
    writing to it has no effect on any RNG behavior.

From what I can see that sysctl has been disconnected, meaning that any applications using it to control when poll operations are triggered are broken, that's what really needs to be fixed.

Barring that an alternate method needs to be developed to allow for applications to detect when more entropy should be added.

@smuellerDD
Copy link
Contributor Author

smuellerDD commented May 31, 2022 via email

@nhorman
Copy link
Owner

nhorman commented May 31, 2022

Am Dienstag, 31. Mai 2022, 14:44:47 CEST schrieb Neil Horman: Hi Neil,
I should point out that if the LRNG patch has altered /proc/sys/kernel/random/poolsize's meaning, that could be construed as a alteration of userspace ABI, and should likely be modified from a philosophical standpoint. This also goes for the write_wakeup_threshold value. It seems like the LRNG patch has changed the ABI such that the previous meaning of the value (the amount of entropy in the pool at which a process blocking on a POLLIN event will be scheduled) to something else, which is really not good.
Well, IMHO it is still ABI compliant as the LRNG has multiple pools and only one of them is relevant for the user space interaction. As I said, I could trivially change the LRNG to be 100% identical to what random.c provides. But that would not reflect reality. But on the other hand, the LRNG has another interface where the different pool values are reflected.

I'm not sure of your reasoning here. I agree the LRNG introduces multiple pools and that the existing ABI for random.c is insufficient to allow for the expression of multiple pool entropy counts, but amalgamating them into a single value is causing behavioral issues, and exporting an entropy poolsize that can never be reached (when previously it could), strikes me as an ABI alteration.

How do you propose detecting a failure to add entropy to the pool?
The return code of the IOCTL should indicate a failure.
So, looking at the head of Linus' tree, I see that random_ioctl confirms that the entirety of a given entropy block was written, and returns -EFAULT if it wasn't. Thats potentially useable, but its aliased with fault resulting from a failed get_user call. If we can return a different error code here (E2BIG or ENOSPC perhaps), then we could use this as a detector for a full poll definitively

I see you are suggesting that we follow the call to RNDADDENTROPY with a call to GETENTCNT, terminating the loop on detection of failure to make the entropy count go up, but I'm a bit leery of that heuristic, as it would cause all our watermarking to be broken (i.e. if someone was draining entropy, we would just continually attempt to fill the pool, even if it was close to full already).
Agreed, but why are you not waiting for a wakeup from the kernel using select() and thus using the write_wakeup_threshold as mentioned below? The kernel would wake you up and you would not need to call GETENTCNT to check in user space whether to refill.

I am. Se calls to random_sleep in rng-tools. We block on POLLOUT events on the random_fd, executed whenever we detect that the pool is full. Waking up from this call is supposed to be controlled by the write_wakeup_threshold.

However, it seems that the mechanism by which poll triggers are made is now also significantly different from what it used to be. It appears that instead of waking up when the entropy pool is below the write_wakeup_threshold, pollout is triggered anytime a block of entropy is credited to the pool. write_wakeup_threshold appears to be entirely disconnected from any functionality at the moment, which strikes me as further ABI change.

If write_wakeup_threshold worked as documented, we could use that and ignore the full condition from GETENTCNT (or an error from ADDENTROPY, even though it would generate more syscalls), but as it is currently, since rngd is the process adding entropy to the pool, if a kernel isn't for whatever reason adding entropy on its own via add_timer_randomness, its entirely possible that rngd will deadlock on a call to poll(random_fd)

What would be better is if you re-introduced a mechanism into the kernel to emulate what write_wakeup_threshold did previously. From the admin guide: * write_wakeup_threshold: when the entropy count drops below this (as a number of bits), processes waiting to write to /dev/random are woken up. This file is writable for compatibility purposes, but writing to it has no effect on any RNG behavior.
This is exactly what the LRNG does. But it applies only to the user-space manageable entropy pool. The jitterentropy-rngd and the haveged rely on this behavior and they fare quite well with it.
How? How does LRNG introduce a machinsm that preforms what the write_wakeup_threshold does that isn't considered an ABI breakage?
I'm not saying I won't adapt to that mechanism, I will, but I'm not thrilled that the mechanism was changed without having user space backwards compatibility in place first.

From what I can see that sysctl has been disconnected, meaning that any applications using it to control when poll operations are triggered are broken, that's what really needs to be fixed.
I do not see that. What I see the rngd seems to be doing is while (avail_entropy < poolsize) { RNDADDENTROPY(new entropy); avail_entropy = GETENTCNT; } And if avail_entropy does not change, you enter an endless loop - the one I see. If I change the LRNG code such that poolsize reports only the user-space managable pool and GETENTCNT only returns the entropy level in that user-space managable pool, the rngd works well. Thus, why not simply call RNDADDENTROPY without any loop checking the available entropy and wait until the kernel tells you to insert new entropy using select() on the /dev/random FD?

Because I never had to do that before, this is why I'm asserting this is ABI breakage.

First to your point about how rngd behaves, it actually does 2 things:

  1. It adds entropy, then checks to see if the pool is full, exactly as you mention (this was done to aid in aggressive entropy refilling when its drained quickly - i.e. we wanted to be able to keep filling the pool without having to go through an extra kernel trap for the poll call, see rngd commit fa6f75d

  2. Once the pool is full, we sleep blocking on POLLOUT in random_sleep, waiting for the entropy pool to drop to a configured level (write_wakeup_threshold) at which time we begin filling the entropy pool again. This has been in place since 2017, commit 5e6f8f9

It seems to me that (1) no longer works, Regardless of what I do. I could certainly modify rngd to only use the poll check, but I'm opposed to that for three reasons:

  1. What I currently have should work, and it has in the past. I don't dispute that its sub-optimal code, but it worked, and from what I see, there no reason it should not have worked, and continue to work (that is to say, I should be able to do a comparison of poolsize to the value returned from GETENTCNT, and conclude weather or not the entropy pool is full). Because that is no longer the case, I am aserting that this is an ABI break. While I can make the modification to avoid this comparison, that's not how this should have worked. There should have been a new abi established, along with a deprecation of the old one, such that applications could make the switch without having to scramble to fix breakages that were revealed as a result of the change.

  2. Making the change you suggest re-introduces the problem I was trying to avoid in commit fa6f75d. That is to say, If I go to sleep in a poll after every write to /dev/random, rngd has to go through a scheduling cycle to get back on the CPU to add more entropy if it is being drained rapidly. That creates a significant performance impact. Perhaps its less of an issues now that /dev/random no longer blocks, I'm not sure, but I don't like the idea of relying on it, especially since we now need to straddle kernel versions where /dev/random may or may not block based on the kernel version we are running with.

  3. Polling on /dev/random no longer follows the semantics of write_wake_threshold (or more specifically, it may or may not, dependent on what kernel version is being used). In the latest kernel version write_wakeup_threshold appears to be ignored, instead waking any waiters when entropy Is added to the user pool. This implies that rngd will either wait forever to be activated when sleeping on a poll, creating a deadlock, or be woken at random times when the kernel adds to the pool, only for rngd to have to go through the potentially unneeded activity of having to add entropy to what may be an already full pool.

This really feels like a mess to me right now. It seems the existing ABI has been repruposed to a new behavioral model, which while perhaps superior to the old model, simply discarded any old users, in the assumption they would adapt, which is backwards from the way its supposed to be.

I'm not really sure what to do here. I think the "right" philosophical thing to do, is restore the behavior of write_wake_threshold as well as the behavior of the random_ioctl, so that applications continue to function as they previously expected, but that's likely impractical, as you've already started forcing adoption on other applications such that the new model is somewhat de-facto, which leaves me in a lousy position.

I think perhaps the best/only real course of action here that's even loosely palatable would be to:

  1. restore the functionality of write_wake_threshold, such that when the amount of entropy in a pool falls below that mark, any process waiting on poll for /dev/random wakes up. it may wake up more often than that, but it should at least wake up at that point.

  2. Have rngd check for the presence of /proc/lrng_type, and based on its presence or absence, using that as a toggle to determine if poolsize and the value of GENENTCNT can be safely used to determine pool full status. if we can't use that measurement, then we rely solely on write_wake_threshold to trigger refills (which underscores the need for it to function as documented) I really don't like having to do that, but given that its already out in the wild, and reverting this behavior will cause as many problems as it solves, I don't see a way around it.

Barring that an alternate method needs to be developed to allow for applications to detect when more entropy should be added.
Ciao Stephan

@smuellerDD
Copy link
Contributor Author

smuellerDD commented May 31, 2022 via email

@nhorman
Copy link
Owner

nhorman commented May 31, 2022

Are you referring to random.c or the LRNG? If you refer to the LRNG, I am not
sure why you say that. The wakeup happens at the right time when the user-
space managable pool entropy level falls below the threshold. It works
precisely as it is defined IMHO.
I'm looking at random_poll in random.c. Am I missing something about what would trigger a pullout event here?

I am surprised that you say that the LRNG deviates in this interface from
random.c. The LRNG DOES wake you up if it has insufficient entropy in its
entropy pool managable by external. It works exactly as defined in random(4).

On my test system:

$ cat /proc/sys/kernel/random/write_wakeup_threshold
448

Its possible that I'm missing something here. I'm looking at the random_poll function in random.c, which waits on the crng_init_wait queue, that is woken up only on calls to _credit_init_bits.

If the lrng patches do something different, and are still pending somewhere other than linus' tree, please point them out to me. This may be a non-issue if I'm looking at the wrong code

@smuellerDD
Copy link
Contributor Author

smuellerDD commented May 31, 2022 via email

@nhorman
Copy link
Owner

nhorman commented May 31, 2022

ok, that sounds good. Given that lrng is the new API here, I'm personally ok with that. The existing API needs to maintain functionality with applications that rely on it, even if its not an accurate reflection of reality. As long as there is an interface (in this case /proc/lrng_type) that does reflect reality, I think we're ok.

But more opinions never hurt, it certainly would be a bad idea to circulate this idea and get additional input

@smuellerDD
Copy link
Contributor Author

smuellerDD commented May 31, 2022 via email

@nhorman
Copy link
Owner

nhorman commented May 31, 2022

I'm not sure I understand why entropy_avail would be zero here. Shouldn't it reflect entropy gathered by the kernel through its various processes? Or is this because with the lrng patch random/entropy_avail is zero because those in-kerenel entropy sources fill other pools?

@smuellerDD
Copy link
Contributor Author

smuellerDD commented Jun 1, 2022 via email

@smuellerDD
Copy link
Contributor Author

smuellerDD commented Oct 11, 2022 via email

@nhorman
Copy link
Owner

nhorman commented Oct 11, 2022

Ok, I'm sorry, its been so long since we discussed this, I'm not sure where we left off. Your most recent post describes how the lrng wakes up in the kernel, but IIRC we were discussing how to make the proc interface behave in a backwards compatible way as I recall?

@smuellerDD
Copy link
Contributor Author

smuellerDD commented Oct 11, 2022 via email

@nhorman
Copy link
Owner

nhorman commented Oct 11, 2022

No worries, that feels right to me. Thanks for all your help!

@nhorman nhorman closed this as completed Oct 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants