New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Fix the DRBG reseed propagation [master] #12871
Conversation
1635d98
to
1954869
Compare
Dropped the two commit from #12866. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've a concern about access parent fields (via OSSL_PARAM calls) without holding a lock. Parents are can be from any provider in any library context. They aren't your normal calls across the libcrypto <-> provider boundary, they are direct provider <-> provider ones. Hence the need for the locking/unlock etc.
I haven't convinced myself that what's here is broken, I'm just concerned that there might be ramifications we've not considered.
Otherwise, I'd approve.
goto err; | ||
} | ||
*params = OSSL_PARAM_construct_uint(OSSL_DRBG_PARAM_RESEED_COUNTER, &r); | ||
/* we don't call drbg_lock_parent(drbg), because an atomic read is used */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I somewhat disagree here. The parent DRBG need not be any of our DRBGs. This one is deliberately permitted to be cross the lib context boundary. By assuming an atomic read, the semantics are weakened. Using a lock (for which a call back is specially supplied) is the safest option.
The ideal is a an optional local and an atomic read and a big comment here saying why :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ideal is a an optional
locallock and an atomic read and a big comment here saying why :)
Maybe you are right. Using atomics instead of a lock was a natural choice and a simple performance improvement back in 1.1.1. (Actually, the optimal compromise between safety and performance would have been a reader-writer lock, but since they are not available, I went with atomics instead of using full locks.) But even in 1.1.1 it turned out that it's much harder to get lock-free synchronization right.
Now with the 3.0 refactoring, the simple read of an integer has turned into a function call to possibly another provider. In this case, it's even harder to get it right and a lock seems to be the safer solution, even if it has a performance impact (which we didn't measure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any performance impact will be related to the number of contested reseeds. Longer chains, fewer contentions.
The last time I looked properly at the impact of RNG locking was for 1.0.2 and it did terrible things to connections per second. We won't have that problem here.
ERR_raise(ERR_LIB_PROV, PROV_R_UNABLE_TO_GET_RESEED_PROP_CTR); | ||
goto err; | ||
/* in case of error, return my own counter + 1 to enforce a reseed */ | ||
return drbg->reseed_counter + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a tsan operation for consistency?
if (drbg->parent == NULL) | ||
tsan_counter(&drbg->reseed_counter); | ||
else | ||
tsan_store(&drbg->reseed_counter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this would be safer with a lock as well.
tsan_counter(&drbg->reseed_counter); | ||
else | ||
tsan_store(&drbg->reseed_counter, | ||
get_parent_reseed_count(drbg)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this pair.
&& get_parent_reseed_count(drbg) != drbg->parent_reseed_counter) | ||
reseed_required = 1; | ||
if (drbg->enable_reseed_propagation && drbg->parent != NULL) { | ||
if (drbg->reseed_counter != get_parent_reseed_count(drbg)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
And the Cis don't like the change -- they are a little too tightly integrated into the "known" behaviour. |
Yes I know, I mentioned it in the description. It's still a draft only and I did not find the time yet to figure out the problem.
|
This has the 3.0 beta1 milestone against it. Should it have? |
I put the beta1 milestone against it because I wasn't sure whether I need to change some public API or not. Currently, I don't think it's necessary, but I need to check it once more. That will probably not happen before the weekend, but I'll remember to move the ticket from the beta1 milestone to the release milestone, if it's not urgent. |
So, the weekend has come and gone....any further thoughts on the milestone? |
I didn't have much time recently to work on this pull request. But looking at the diffstat below, it seems like the essential part of it, commit ab43ded3e7 does not affect the public API. It's only commit 1954869a15, beause of the renaming OSSL_DRBG_PARAM_RESEED_CTR -> OSSL_DRBG_PARAM_RESEED_COUNTER. The two commits don't commute cleanly, but the merge conflict is simple and straigthforward to solve. So what I'm going to do is to swap the two commits and split off the renaming into separate pull request, because it is ready for review and the only part which needs to go in before beta.
|
1954869
to
d2e3d41
Compare
The last force-push swapped the two commits without tree changes. The first one has been split off as pull request #12941. |
This commit is the port of commit 958fec7 from the 1.1.1 stable branch to master. In a nutshell, reseed propagation is a compatibility feature with the sole purpose to support the traditional way of (re-)seeding manually by calling 'RAND_add()' before 'RAND_bytes(). It ensures that the former has an immediate effect on the latter *within the same thread*, but it does not care about immediate reseed propagation to other threads. The implementation is lock-free, i.e., it works without taking the lock of the primary DRBG. Pull request openssl#7399 not only fixed the data race issue openssl#7394 but also changed the original implementation of the seed propagation unnecessarily. This commit reverts most of the changes of commit 1f98527 and intends to fix the data race while retaining the original simplicity of the seed propagation. - use atomics with relaxed semantics to load and store the seed counter - add a new member drbg->enable_reseed_propagation to simplify the overflow treatment of the seed propagation counter - don't handle races between different threads This partially reverts commit a83dc59.
d2e3d41
to
0995c75
Compare
Rebased on master only to get rid of the commit which was split off to PR #12941. This pull request is still stalled, no need to review at the moment. |
Clearing the milestone. This not being done will not block the 3.0 release. |
This pull request continues work started on the 1.1.1 stable branch in #12759. For a more detailed explanation, see that pr.
Still work in progress, tests are failing.
The first two commits are from #12866, on which this pull request is currently based. They will disappear as soon as #12866 is merged.