New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crypto/init.c: Fix a locking issue in OPENSSL_init_crypto() #13288
Conversation
OPENSSL_init_crypto() passes its `settings` argument to `ossl_init_config()` through a static variable called `conf_settings`, because ossl_init_config() is a run-once function which can't take any parameters. This workaround is thread-safe, because passing a non-null `settings` argument is well defined and allowed only during early initialization in single-threaded mode. Nevertheless, the 'init_lock' was added in commit bf24111 (Avoid a race condition in loading config settings, 2016-02-09) to protect this workaround. This probably happened to fix a thread sanitizer warning. I believe it was a false positive caused by unnecessary writes of null values to the `conf_settings` variable. Since commit e6c5461 (Load the default config file before working with default properties, 2020-07-31) the unnecessary lock is causing a lock-order-inversion, so it needs to be removed. Instead, we prevent concurrency warnings by avoiding to update `conf_settings` unnecessarily. Fixes openssl#12869
b58b2f7
to
9ffbf0f
Compare
@bernd-edlinger I'd be interested in your expert feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Wait for @bernd-edlinger's input though.
I think the thread sanitizer is simply not designed to look for cycles involving |
But would you say that the code is ok under the assumption that the application is only allowed to pass a non-zero pointer before using libcrypto resp. creating additional threads? |
I think the problem is when the application passes zero. |
Is there really a lock lurking inside |
See pthread_once.c. |
@bernd-edlinger I assume you are talking about the wait operations on the once_control? But those happen outside the init_routine, do you really think they could cause a lock order inversion? |
My understanding is that |
shrug If you want to, that's fine, but I honestly don't care about getting "credit" in the commit message. What analysis I've contributed is there in the issues / PRs regardless. I do appreciate your concern for properly acknowledging contributions, however. |
Yes, exactly, one thread sets __PTHREAD_ONCE_INPROGRESS |
Yes, but what happens if there is no call with settings != NULL, So while you are right that the mutex is unnecessary when the OPENSSL_init_crypto |
Is a TSAN build where there is locking around the atomic operations worthwhile? I.e. make TSAN more aware of what is happening. I don't see why it would help but it's easy to miss things when dealing with concurrency. |
@bernd-edlinger after digging more deeply into the issue I agree with your #13288 (comment) and #13288 (comment) that removing lock M38 around the run-once call does not fix the true issue, which is a potential race of different threads competing for a (first) call to OPENSSL_init_crypto(). Although my pull request might improve the situation because it reduces the likelihood of a deadlock (the lock in OPENSSL_init_crypto() effectively becomes a one-time lock), I still have a bad feeling about this until the problem is properly understood. I tried to untangle the lock cycles mentally, but I failed. So I had a call with @mattcaswell and explained the problem to him. He will try to help me figure it out in the next days. For his convenience, I post the results of my preliminary analysis. I won't merge this pull request, until the problem is fully understood. The two locking cyclesI reproduced issue #12869 with a recent copy of master. The TSAN report is still the same, it reports two cycles. I separated them into different files and added markers in the call stack at the interesting locations. If you look at the diff of those two reports, you can see that it's essentially the same cycle, only with M70 replaced by M89. The difference is caused by the fact that one lock is taken by an --- M38-M62-M70-M38.log 2020-11-03 01:18:53.837246012 +0100
+++ M38-M62-M89-M38.log 2020-11-02 23:41:47.972695248 +0100
@@ -1,5 +1,5 @@
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=26018)
- Cycle in lock order graph: M38 (0x7b10000005c0) => M62 (0x7b1000000800) => M70 (0x7b1000000740) => M38
+ Cycle in lock order graph: M38 (0x7b10000005c0) => M62 (0x7b1000000800) => M89 (0x7b1000000740) => M38
Mutex M62 acquired here while holding mutex M38 in main thread:
#0 pthread_rwlock_wrlock /var/tmp/portage/sys-devel/gcc-9.3.0-r1/work/gcc-9.3.0/libsanitizer/tsan/tsan_interceptors.cc:1324 (libtsan.so.0+0x2f2ba)
@@ -30,11 +30,11 @@
#4 apps_startup apps/openssl.c:66 (openssl+0x7a187)
#5 main apps/openssl.c:260 (openssl+0x7a2cf)
- Mutex M70 acquired here while holding mutex M62 in main thread:
+ Mutex M89 acquired here while holding mutex M62 in main thread:
#0 pthread_rwlock_wrlock /var/tmp/portage/sys-devel/gcc-9.3.0-r1/work/gcc-9.3.0/libsanitizer/tsan/tsan_interceptors.cc:1324 (libtsan.so.0+0x2f2ba)
#1 CRYPTO_THREAD_write_lock crypto/threads_pthread.c:86 (libcrypto.so.3+0x2da193)
#2 ossl_property_write_lock crypto/property/property.c:127 (libcrypto.so.3+0x31f360)
-==> #3 ossl_method_store_add crypto/property/property.c:249 (libcrypto.so.3+0x31f9b1) M62[provider store lock]=>M70[method store lock]
+==> #3 ossl_method_store_add crypto/property/property.c:249 (libcrypto.so.3+0x31f9b1) M62[provider store lock]=>M89[method store lock]
#4 put_evp_method_in_store crypto/evp/evp_fetch.c:168 (libcrypto.so.3+0x27bef5)
#5 ossl_method_construct_this crypto/core_fetch.c:95 (libcrypto.so.3+0x2c0214)
#6 algorithm_do_this crypto/core_algorithm.c:65 (libcrypto.so.3+0x2bfb51)
@@ -44,10 +44,14 @@
#10 ossl_method_construct crypto/core_fetch.c:124 (libcrypto.so.3+0x2c03e4)
#11 inner_evp_generic_fetch crypto/evp/evp_fetch.c:316 (libcrypto.so.3+0x27c57f)
#12 evp_generic_fetch crypto/evp/evp_fetch.c:363 (libcrypto.so.3+0x27c7e5)
- #13 EVP_MAC_fetch crypto/evp/mac_meth.c:155 (libcrypto.so.3+0x294423)
- #14 fipsinstall_main apps/fipsinstall.c:425 (openssl+0x6c502)
- #15 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
- #16 main apps/openssl.c:293 (openssl+0x7a458)
+ #13 EVP_MD_fetch crypto/evp/digest.c:961 (libcrypto.so.3+0x255de9)
+ #14 ossl_prov_digest_fetch providers/common/provider_util.c:131 (libcrypto.so.3+0x478880)
+ #15 ossl_prov_digest_load_from_params providers/common/provider_util.c:154 (libcrypto.so.3+0x4789cf)
+ #16 hmac_set_ctx_params providers/implementations/macs/hmac_prov.c:262 (libcrypto.so.3+0x461a20)
+ #17 EVP_MAC_CTX_set_params crypto/evp/mac_lib.c:156 (libcrypto.so.3+0x29362e)
+ #18 fipsinstall_main apps/fipsinstall.c:445 (openssl+0x6c5fc)
+ #19 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
+ #20 main apps/openssl.c:293 (openssl+0x7a458)
Mutex M62 previously acquired by the same thread here:
#0 pthread_rwlock_rdlock /var/tmp/portage/sys-devel/gcc-9.3.0-r1/work/gcc-9.3.0/libsanitizer/tsan/tsan_interceptors.cc:1294 (libtsan.so.0+0x2ef7a)
@@ -57,39 +61,50 @@
#4 ossl_method_construct crypto/core_fetch.c:124 (libcrypto.so.3+0x2c03e4)
#5 inner_evp_generic_fetch crypto/evp/evp_fetch.c:316 (libcrypto.so.3+0x27c57f)
#6 evp_generic_fetch crypto/evp/evp_fetch.c:363 (libcrypto.so.3+0x27c7e5)
- #7 EVP_MAC_fetch crypto/evp/mac_meth.c:155 (libcrypto.so.3+0x294423)
- #8 fipsinstall_main apps/fipsinstall.c:425 (openssl+0x6c502)
- #9 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
- #10 main apps/openssl.c:293 (openssl+0x7a458)
+ #7 EVP_MD_fetch crypto/evp/digest.c:961 (libcrypto.so.3+0x255de9)
+ #8 ossl_prov_digest_fetch providers/common/provider_util.c:131 (libcrypto.so.3+0x478880)
+ #9 ossl_prov_digest_load_from_params providers/common/provider_util.c:154 (libcrypto.so.3+0x4789cf)
+ #10 hmac_set_ctx_params providers/implementations/macs/hmac_prov.c:262 (libcrypto.so.3+0x461a20)
+ #11 EVP_MAC_CTX_set_params crypto/evp/mac_lib.c:156 (libcrypto.so.3+0x29362e)
+ #12 fipsinstall_main apps/fipsinstall.c:445 (openssl+0x6c5fc)
+ #13 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
+ #14 main apps/openssl.c:293 (openssl+0x7a458)
- Mutex M38 acquired here while holding mutex M70 in main thread:
+ Mutex M38 acquired here while holding mutex M89 in main thread:
#0 pthread_rwlock_wrlock /var/tmp/portage/sys-devel/gcc-9.3.0-r1/work/gcc-9.3.0/libsanitizer/tsan/tsan_interceptors.cc:1324 (libtsan.so.0+0x2f2ba)
#1 CRYPTO_THREAD_write_lock crypto/threads_pthread.c:86 (libcrypto.so.3+0x2da193)
-==> #2 OPENSSL_init_crypto crypto/init.c:542 (libcrypto.so.3+0x2c57f7) M70[method store lock]=>M38[init lock]
+==> #2 OPENSSL_init_crypto crypto/init.c:542 (libcrypto.so.3+0x2c57f7) M89[method store lock]=>M38[init lock]
#3 ossl_ctx_global_properties crypto/property/property.c:103 (libcrypto.so.3+0x31f1e5)
-==> #4 ossl_method_store_fetch crypto/property/property.c:360 (libcrypto.so.3+0x31ffc4) M70[method store lock]
+==> #4 ossl_method_store_fetch crypto/property/property.c:360 (libcrypto.so.3+0x31ffc4) M89[method store lock]
#5 get_evp_method_from_store crypto/evp/evp_fetch.c:130 (libcrypto.so.3+0x27bd69)
#6 ossl_method_construct crypto/core_fetch.c:130 (libcrypto.so.3+0x2c0424)
#7 inner_evp_generic_fetch crypto/evp/evp_fetch.c:316 (libcrypto.so.3+0x27c57f)
#8 evp_generic_fetch crypto/evp/evp_fetch.c:363 (libcrypto.so.3+0x27c7e5)
- #9 EVP_MAC_fetch crypto/evp/mac_meth.c:155 (libcrypto.so.3+0x294423)
- #10 fipsinstall_main apps/fipsinstall.c:425 (openssl+0x6c502)
- #11 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
- #12 main apps/openssl.c:293 (openssl+0x7a458)
+ #9 EVP_MD_fetch crypto/evp/digest.c:961 (libcrypto.so.3+0x255de9)
+ #10 ossl_prov_digest_fetch providers/common/provider_util.c:131 (libcrypto.so.3+0x478880)
+ #11 ossl_prov_digest_load_from_params providers/common/provider_util.c:154 (libcrypto.so.3+0x4789cf)
+ #12 hmac_set_ctx_params providers/implementations/macs/hmac_prov.c:262 (libcrypto.so.3+0x461a20)
+ #13 EVP_MAC_CTX_set_params crypto/evp/mac_lib.c:156 (libcrypto.so.3+0x29362e)
+ #14 fipsinstall_main apps/fipsinstall.c:445 (openssl+0x6c5fc)
+ #15 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
+ #16 main apps/openssl.c:293 (openssl+0x7a458)
- Mutex M70 previously acquired by the same thread here:
+ Mutex M89 previously acquired by the same thread here:
#0 pthread_rwlock_rdlock /var/tmp/portage/sys-devel/gcc-9.3.0-r1/work/gcc-9.3.0/libsanitizer/tsan/tsan_interceptors.cc:1294 (libtsan.so.0+0x2ef7a)
#1 CRYPTO_THREAD_read_lock crypto/threads_pthread.c:73 (libcrypto.so.3+0x2da150)
#2 ossl_property_read_lock crypto/property/property.c:122 (libcrypto.so.3+0x31f309)
-==> #3 ossl_method_store_fetch crypto/property/property.c:351 (libcrypto.so.3+0x31ff3d) M70[method store lock]
+==> #3 ossl_method_store_fetch crypto/property/property.c:351 (libcrypto.so.3+0x31ff3d) M89[method store lock]
#4 get_evp_method_from_store crypto/evp/evp_fetch.c:130 (libcrypto.so.3+0x27bd69)
#5 ossl_method_construct crypto/core_fetch.c:130 (libcrypto.so.3+0x2c0424)
#6 inner_evp_generic_fetch crypto/evp/evp_fetch.c:316 (libcrypto.so.3+0x27c57f)
#7 evp_generic_fetch crypto/evp/evp_fetch.c:363 (libcrypto.so.3+0x27c7e5)
- #8 EVP_MAC_fetch crypto/evp/mac_meth.c:155 (libcrypto.so.3+0x294423)
- #9 fipsinstall_main apps/fipsinstall.c:425 (openssl+0x6c502)
- #10 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
- #11 main apps/openssl.c:293 (openssl+0x7a458)
+ #8 EVP_MD_fetch crypto/evp/digest.c:961 (libcrypto.so.3+0x255de9)
+ #9 ossl_prov_digest_fetch providers/common/provider_util.c:131 (libcrypto.so.3+0x478880)
+ #10 ossl_prov_digest_load_from_params providers/common/provider_util.c:154 (libcrypto.so.3+0x4789cf)
+ #11 hmac_set_ctx_params providers/implementations/macs/hmac_prov.c:262 (libcrypto.so.3+0x461a20)
+ #12 EVP_MAC_CTX_set_params crypto/evp/mac_lib.c:156 (libcrypto.so.3+0x29362e)
+ #13 fipsinstall_main apps/fipsinstall.c:445 (openssl+0x6c5fc)
+ #14 do_cmd apps/openssl.c:411 (openssl+0x7aa88)
+ #15 main apps/openssl.c:293 (openssl+0x7a458)
SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) crypto/threads_pthread.c:86 in CRYPTO_THREAD_write_lock
-================== The locations of the locks in the source codeThe following patch marks the locations in the source code where the locks are taken: commit 97293ace93d24e29464fac830aaaa9d53411050e
Author: Dr. Matthias St. Pierre <matthias.st.pierre@ncp-e.com>
Date: Tue Nov 3 20:04:31 2020 +0100
Lock order inversion
diff --git a/crypto/init.c b/crypto/init.c
index cfd4eab9ed..dd04132abe 100644
--- a/crypto/init.c
+++ b/crypto/init.c
@@ -539,7 +539,7 @@ int OPENSSL_init_crypto(uint64_t opts, const OPENSSL_INIT_SETTINGS *settings)
if (opts & OPENSSL_INIT_LOAD_CONFIG) {
int ret;
- CRYPTO_THREAD_write_lock(init_lock);
+ CRYPTO_THREAD_write_lock(init_lock); /* M38[init_lock] */
conf_settings = settings;
ret = RUN_ONCE(&config, ossl_init_config);
conf_settings = NULL;
diff --git a/crypto/property/property.c b/crypto/property/property.c
index 9cfca81190..8be2de8f40 100644
--- a/crypto/property/property.c
+++ b/crypto/property/property.c
@@ -100,7 +100,7 @@ OSSL_PROPERTY_LIST **ossl_ctx_global_properties(OSSL_LIB_CTX *libctx,
int loadconfig)
{
#ifndef FIPS_MODULE
- if (loadconfig && !OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, NULL))
+ if (loadconfig && !OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, NULL)) /* M70,M89[method store lock]=>M38[init lock] */
return NULL;
#endif
return ossl_lib_ctx_get_data(libctx, OSSL_LIB_CTX_GLOBAL_PROPERTIES,
@@ -246,7 +246,7 @@ int ossl_method_store_add(OSSL_METHOD_STORE *store, const OSSL_PROVIDER *prov,
* A write lock is used unconditionally because we wend our way down to the
* property string code which isn't locking friendly.
*/
- ossl_property_write_lock(store);
+ ossl_property_write_lock(store); /* M62[provider store lock]=>M70,M89[method store lock] */
ossl_method_cache_flush(store, nid);
if ((impl->properties = ossl_prop_defn_get(store->ctx, properties)) == NULL) {
impl->properties = ossl_parse_property(store->ctx, properties);
@@ -348,7 +348,7 @@ int ossl_method_store_fetch(OSSL_METHOD_STORE *store, int nid,
* This only needs to be a read lock, because queries never create property
* names or value and thus don't modify any of the property string layer.
*/
- ossl_property_read_lock(store);
+ ossl_property_read_lock(store); /* M70,M89[method store lock] */
alg = ossl_method_store_retrieve(store, nid);
if (alg == NULL) {
ossl_property_unlock(store);
diff --git a/crypto/provider_core.c b/crypto/provider_core.c
index 52e68641ae..6348dfda4f 100644
--- a/crypto/provider_core.c
+++ b/crypto/provider_core.c
@@ -226,7 +226,7 @@ OSSL_PROVIDER *ossl_provider_find(OSSL_LIB_CTX *libctx, const char *name,
#endif
tmpl.name = (char *)name;
- CRYPTO_THREAD_write_lock(store->lock);
+ CRYPTO_THREAD_write_lock(store->lock); /* M38[init_lock]=>M62[provider store lock] */
if ((i = sk_OSSL_PROVIDER_find(store->providers, &tmpl)) == -1
|| (prov = sk_OSSL_PROVIDER_value(store->providers, i)) == NULL
|| !ossl_provider_up_ref(prov))
@@ -742,7 +742,7 @@ int ossl_provider_forall_loaded(OSSL_LIB_CTX *ctx,
#endif
if (store != NULL) {
- CRYPTO_THREAD_read_lock(store->lock);
+ CRYPTO_THREAD_read_lock(store->lock); /* M62[provider store lock] */
provider_activate_fallbacks(store);
|
I forgot to mention in my last post: the following change by @mattcaswell added the dependency
which closed the cycle.
It was introduced in commit e6c5461 in order to fix issue #12565. |
Closing in favour of #13321. |
OPENSSL_init_crypto() passes its
settings
argument toossl_init_config()
through a static variable calledconf_settings
, because ossl_init_config() is arun-once function which can't take any parameters.
This workaround is thread-safe, because passing a non-null
settings
argument is well defined and allowed only duringearly initialization in single-threaded mode.
Nevertheless, the 'init_lock' was added in commit bf24111
(Avoid a race condition in loading config settings, 2016-02-09)
to protect this workaround. This probably happened to fix a
thread sanitizer warning. I believe it was a
falsefalsepositive caused by unnecessary writes of null values to the
conf_settings
variable.Since commit e6c5461 (Load the default config file before
working with default properties, 2020-07-31) the unnecessary
lock is causing a lock-order-inversion, so it needs to be
removed. Instead, we prevent concurrency warnings by avoiding
to update
conf_settings
unnecessarily.Fixes #12869
Checklist