Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Fix the DRBG reseed propagation [master] #12871

Closed

Conversation

mspncp
Copy link
Contributor

@mspncp mspncp commented Sep 13, 2020

This pull request continues work started on the 1.1.1 stable branch in #12759. For a more detailed explanation, see that pr.

Still work in progress, tests are failing.

The first two commits are from #12866, on which this pull request is currently based. They will disappear as soon as #12866 is merged.

@mspncp mspncp added this to In progress in 3.0 New Core + FIPS via automation Sep 13, 2020
@mspncp mspncp added this to the 3.0.0 beta1 milestone Sep 13, 2020
@mspncp mspncp force-pushed the pr-drbg-fix-reseed-propagation-3.0 branch from 1635d98 to 1954869 Compare September 14, 2020 04:42
@mspncp
Copy link
Contributor Author

mspncp commented Sep 14, 2020

Dropped the two commit from #12866.

Copy link
Contributor

@paulidale paulidale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've a concern about access parent fields (via OSSL_PARAM calls) without holding a lock. Parents are can be from any provider in any library context. They aren't your normal calls across the libcrypto <-> provider boundary, they are direct provider <-> provider ones. Hence the need for the locking/unlock etc.

I haven't convinced myself that what's here is broken, I'm just concerned that there might be ramifications we've not considered.

Otherwise, I'd approve.

goto err;
}
*params = OSSL_PARAM_construct_uint(OSSL_DRBG_PARAM_RESEED_COUNTER, &r);
/* we don't call drbg_lock_parent(drbg), because an atomic read is used */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I somewhat disagree here. The parent DRBG need not be any of our DRBGs. This one is deliberately permitted to be cross the lib context boundary. By assuming an atomic read, the semantics are weakened. Using a lock (for which a call back is specially supplied) is the safest option.

The ideal is a an optional local and an atomic read and a big comment here saying why :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ideal is a an optional local lock and an atomic read and a big comment here saying why :)

Maybe you are right. Using atomics instead of a lock was a natural choice and a simple performance improvement back in 1.1.1. (Actually, the optimal compromise between safety and performance would have been a reader-writer lock, but since they are not available, I went with atomics instead of using full locks.) But even in 1.1.1 it turned out that it's much harder to get lock-free synchronization right.

Now with the 3.0 refactoring, the simple read of an integer has turned into a function call to possibly another provider. In this case, it's even harder to get it right and a lock seems to be the safer solution, even if it has a performance impact (which we didn't measure).

Copy link
Contributor

@paulidale paulidale Sep 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any performance impact will be related to the number of contested reseeds. Longer chains, fewer contentions.

The last time I looked properly at the impact of RNG locking was for 1.0.2 and it did terrible things to connections per second. We won't have that problem here.

ERR_raise(ERR_LIB_PROV, PROV_R_UNABLE_TO_GET_RESEED_PROP_CTR);
goto err;
/* in case of error, return my own counter + 1 to enforce a reseed */
return drbg->reseed_counter + 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a tsan operation for consistency?

if (drbg->parent == NULL)
tsan_counter(&drbg->reseed_counter);
else
tsan_store(&drbg->reseed_counter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this would be safer with a lock as well.

tsan_counter(&drbg->reseed_counter);
else
tsan_store(&drbg->reseed_counter,
get_parent_reseed_count(drbg));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this pair.

&& get_parent_reseed_count(drbg) != drbg->parent_reseed_counter)
reseed_required = 1;
if (drbg->enable_reseed_propagation && drbg->parent != NULL) {
if (drbg->reseed_counter != get_parent_reseed_count(drbg))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

@paulidale
Copy link
Contributor

And the Cis don't like the change -- they are a little too tightly integrated into the "known" behaviour.

@mspncp
Copy link
Contributor Author

mspncp commented Sep 15, 2020

And the Cis don't like the change -- they are a little too tightly integrated into the "known" behaviour.

Yes I know, I mentioned it in the description. It's still a draft only and I did not find the time yet to figure out the problem.

Still work in progress, tests are failing.

@mattcaswell
Copy link
Member

This has the 3.0 beta1 milestone against it. Should it have?

@mspncp
Copy link
Contributor Author

mspncp commented Sep 17, 2020

This has the 3.0 beta1 milestone against it. Should it have?

I put the beta1 milestone against it because I wasn't sure whether I need to change some public API or not. Currently, I don't think it's necessary, but I need to check it once more. That will probably not happen before the weekend, but I'll remember to move the ticket from the beta1 milestone to the release milestone, if it's not urgent.

@mattcaswell
Copy link
Member

I put the beta1 milestone against it because I wasn't sure whether I need to change some public API or not. Currently, I don't think it's necessary, but I need to check it once more. That will probably not happen before the weekend, but I'll remember to move the ticket from the beta1 milestone to the release milestone, if it's not urgent.

So, the weekend has come and gone....any further thoughts on the milestone?

@mspncp
Copy link
Contributor Author

mspncp commented Sep 21, 2020

So, the weekend has come and gone....any further thoughts on the milestone?

I didn't have much time recently to work on this pull request. But looking at the diffstat below, it seems like the essential part of it, commit ab43ded3e7 does not affect the public API. It's only commit 1954869a15, beause of the renaming OSSL_DRBG_PARAM_RESEED_CTR -> OSSL_DRBG_PARAM_RESEED_COUNTER.

The two commits don't commute cleanly, but the merge conflict is simple and straigthforward to solve. So what I'm going to do is to swap the two commits and split off the renaming into separate pull request, because it is ready for review and the only part which needs to go in before beta.

1954869a15 Revert two renamings backported from master
doc/man3/EVP_RAND.pod                        |  2 +-
doc/man7/EVP_RAND-CTR-DRBG.pod               |  2 +-
doc/man7/EVP_RAND-HASH-DRBG.pod              |  2 +-
doc/man7/EVP_RAND-HMAC-DRBG.pod              |  2 +-
doc/man7/EVP_RAND-TEST-RAND.pod              |  2 +-
doc/man7/provider-rand.pod                   |  2 +-
include/openssl/core_names.h                 |  2 +-
providers/implementations/rands/drbg.c       | 14 +++++++-------
providers/implementations/rands/drbg_hash.c  |  2 +-
providers/implementations/rands/drbg_local.h |  4 ++--
providers/implementations/rands/test_rng.c   |  4 ++--
11 files changed, 19 insertions(+), 19 deletions(-)

ab43ded3e7 prov/drbg: fix the DRBG seed propagation
providers/implementations/rands/drbg.c       | 56 +++++++++++++++++++++-----------------------------------
providers/implementations/rands/drbg_local.h | 20 ++++++++++++--------
2 files changed, 33 insertions(+), 43 deletions(-)

@mspncp
Copy link
Contributor Author

mspncp commented Sep 21, 2020

The last force-push swapped the two commits without tree changes. The first one has been split off as pull request #12941.

@mspncp mspncp modified the milestones: 3.0.0 beta1, 3.0.0 Sep 21, 2020
@paulidale paulidale added the triaged: bug The issue/pr is/fixes a bug label Nov 10, 2020
@romen romen added the triaged: OTC evaluated This issue/pr was triaged by OTC label Nov 10, 2020
This commit is the port of commit 958fec7
from the 1.1.1 stable branch to master.

In a nutshell, reseed propagation is a compatibility feature with the sole
purpose to support the traditional way of (re-)seeding manually by calling
'RAND_add()' before 'RAND_bytes(). It ensures that the former has an immediate
effect on the latter *within the same thread*, but it does not care about
immediate reseed propagation to other threads. The implementation is lock-free,
i.e., it works without taking the lock of the primary DRBG.

Pull request openssl#7399 not only fixed the data race issue openssl#7394 but also changed
the original implementation of the seed propagation unnecessarily.
This commit reverts most of the changes of commit 1f98527 and intends to
fix the data race while retaining the original simplicity of the seed propagation.

- use atomics with relaxed semantics to load and store the seed counter
- add a new member drbg->enable_reseed_propagation to simplify the
  overflow treatment of the seed propagation counter
- don't handle races between different threads

This partially reverts commit a83dc59.
@mspncp mspncp force-pushed the pr-drbg-fix-reseed-propagation-3.0 branch from d2e3d41 to 0995c75 Compare November 17, 2020 21:17
@mspncp
Copy link
Contributor Author

mspncp commented Nov 17, 2020

Rebased on master only to get rid of the commit which was split off to PR #12941. This pull request is still stalled, no need to review at the moment.

@paulidale paulidale removed this from the 3.0.0 milestone May 24, 2021
@paulidale
Copy link
Contributor

Clearing the milestone. This not being done will not block the 3.0 release.
It would still be good to get it into the release, it's just not critical.

@mspncp mspncp closed this Dec 28, 2021
3.0 New Core + FIPS automation moved this from In progress to Done Dec 28, 2021
@mspncp mspncp deleted the pr-drbg-fix-reseed-propagation-3.0 branch October 20, 2023 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triaged: bug The issue/pr is/fixes a bug triaged: OTC evaluated This issue/pr was triaged by OTC
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants