Bug (crash) when stopping provider #21023
Do you have the ability to run this in a debugger? Not sure how easy that is given the oqs-provider test framework. It would be really nice to be able to confirm that the write lock is indeed NULL in |
Alas, not really. :-( BTW, am I correct that (at least in several cases) OpenSSL chose not to follow a "safe programming" approach, and declined to validate pointers when they are "supposed to" be non- |
Since the same function appears to be located at two different addresses, |
Ah!! The question is how does it end up being loaded twice in this scenario? |
Very good observation, @bernd-edlinger !
Yes: Here's the high-level dependency list:
Now, one question to @mouse07410 : You once stated you are not building everything "together" as I (& our CI) do: Is it possible that you have linked Second, a proposal to @mouse07410 : Could you please build Edit/Add: On second thought: Shouldn't all |
See for yourself:
Two versions of OpenSSL present:
IMHO, no, not possible - see above for my reasons.
Absolutely! Will do and report here.
Maybe, if more than one provider (or something else?) does something like |
This shows us liboqs and oqsprovider - but what about the test application itself? Presumably this is also linked against libcrypto. Could it be that, of the libcrypto versions you have on your system, one is built with debug symbols and one without? I note that at this point in the stack trace...
We suddenly go from not having source filenames/line numbers to having them. |
I build Here's what's happening when I build
|
So, my assumption is that Python is picking up the 3.1.0 libcrypto version (no debug symbols) and then loading the oqsprovider, which is picking up the 3.2.0-dev libcrypto version. Chaos ensues. |
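One quick way to check the first half of that assumption is to ask the interpreter itself. This is only a sketch: it relies on the standard `ssl` module, whose `OPENSSL_VERSION` and `OPENSSL_VERSION_INFO` attributes report the OpenSSL library loaded by the interpreter. The helper name `python_openssl_version` is made up for illustration, and it says nothing about which libcrypto a provider loaded later will bind to.

```python
import ssl


def python_openssl_version():
    """Return (version string, version tuple) for the OpenSSL library
    that this interpreter's ssl module has loaded."""
    return ssl.OPENSSL_VERSION, ssl.OPENSSL_VERSION_INFO


if __name__ == "__main__":
    text, info = python_openssl_version()
    print("Python's ssl module reports:", text)
    print("major.minor.patch:", ".".join(str(n) for n in info[:3]))
```

If the reported version differs from the OpenSSL that oqsprovider was built against, that would be consistent with two libcrypto copies ending up in one process.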
Unfortunately I know nothing about python or how to influence what version of libcrypto it uses... |
Thus, the solution for me would be removing
I don't know much about Python - but am pretty sure that it would be impossible to influence, unless one builds it from source and can configure it appropriately. Since I'm using a system-wide binary distribution of Python - it's out of the question. |
I think we can close this. First, this workaround tested OK, and second - it doesn't look like there's a way to "fix" or otherwise address it in the code. |
BTW, speaking of . . .
openssl/crypto/engine/eng_table.c
205: OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, NULL);
openssl/crypto/engine/eng_all.c
15: OPENSSL_init_crypto(OPENSSL_INIT_ENGINE_ALL_BUILTIN, NULL);
openssl/crypto/objects/obj_dat.c
77: OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, NULL);
openssl/crypto/conf/conf_sap.c
40: OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, &settings);
openssl/crypto/rand/rand_lib.c
457: OPENSSL_init_crypto(OPENSSL_INIT_BASE_ONLY, NULL);
openssl/crypto/provider_core.c
415: OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, NULL);
1367: OPENSSL_init_crypto(OPENSSL_INIT_LOAD_CONFIG, NULL);
. . . |
I find this improbable: How should
This in turn looks much more like something worthwhile investigating: This list contains at least one function ( May I suggest not closing this issue, so that it can eventually be looked at when folks interested in making providers (and all their conceptual features) a "more regular" OpenSSL capability meet (there's a call for participation in something like this open, for all I know)? |
Sure. It looks like OpenSSL could do something to lower the probability of this issue rearing its ugly head. ;-) |
@baentsch - isn't oqsprovider linked against libcrypto? The output shown in this comment suggests that it is: And it is making calls to libcrypto API functions. So the dynamic linker will "pick up" libcrypto when it loads the oqsprovider. |
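To make the "which libcrypto is it linked against" question concrete, here is a small sketch that pulls the libcrypto entries out of `otool -L` (macOS) or `ldd` (Linux) style listings. The function name `linked_libcrypto` and every path in the sample are hypothetical, not taken from this thread; only the general line layout of those two tools is assumed.

```python
def linked_libcrypto(listing):
    """Extract libcrypto paths from `otool -L` or `ldd` style output."""
    found = []
    for line in listing.splitlines():
        line = line.strip()
        if "libcrypto" not in line:
            continue
        if "=>" in line:
            # ldd layout: "libcrypto.so.3 => /path/libcrypto.so.3 (0x...)"
            target = line.split("=>", 1)[1].strip()
        else:
            # otool -L layout: "/path/libcrypto.3.dylib (compatibility version ...)"
            target = line
        # Drop the trailing parenthesised address or version info.
        found.append(target.split(" (", 1)[0].strip())
    return found


# Hypothetical listings, for illustration only.
OTOOL_SAMPLE = """oqsprovider.dylib:
\t/hypothetical/openssl-3.2/lib/libcrypto.3.dylib (compatibility version 3.0.0, current version 3.0.0)
\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1.0.0)
"""
LDD_SAMPLE = "\tlibcrypto.so.3 => /hypothetical/lib/libcrypto.so.3 (0x00007f0000000000)\n"

if __name__ == "__main__":
    print(linked_libcrypto(OTOOL_SAMPLE))
    print(linked_libcrypto(LDD_SAMPLE))
```

Running the same extraction over the listing for the Python binary (or its `_ssl` extension module) and over the listing for the provider would show whether they name different libcrypto files.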
Yes. This is not what I was questioning. I just wonder how the same dynamic linker that loaded Further, when looking at the code of OPENSSL_init_crypto there are lots of code paths (and some comments) that make me think some review (with the background of providers loading components that need |
I don't think that oqsprovider and liboqs are ending up with different libcrypto versions. I think python and oqsprovider are using different libcrypto versions, i.e. python is using the system libcrypto, and oqsprovider is using a custom libcrypto.
Undoubtedly it would be desirable to move more stuff into OSSL_LIB_CTX. But I don't believe that is what is at the heart of this issue. |
Exactly. Because I build both oqsprovider and liboqs - so I can (and do) control which libcrypto they're linked against. And the loader obediently picks that libcrypto to resolve calls from these components. In this context, it's libcrypto from OpenSSL-3.2.0-dev. Since Python comes from a binary distribution and is linked against a different libcrypto - in this case it's libcrypto from OpenSSL-3.1.0 (also installed as a binary by MacPorts). That practically guarantees loading two different libcrypto versions when testing or running software built for/with one version of OpenSSL (like 3.2.0+) with tools that are built for/with the "released" version (in this case 3.1.0). |
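The situation described above can be checked from inside the running process. The sketch below lists the shared objects currently mapped into the interpreter and filters for libcrypto. It assumes `/proc/self/maps` on Linux and the dyld calls `_dyld_image_count` / `_dyld_get_image_name` (reached through `ctypes`) on macOS; the helper names and the sample paths are hypothetical.

```python
import ctypes
import os
import sys


def loaded_images():
    """Return the paths of shared objects mapped into this process."""
    if sys.platform == "darwin":
        # Assumption: the dyld introspection calls are reachable via dlopen(NULL).
        dyld = ctypes.CDLL(None)
        dyld._dyld_image_count.restype = ctypes.c_uint32
        dyld._dyld_get_image_name.restype = ctypes.c_char_p
        dyld._dyld_get_image_name.argtypes = [ctypes.c_uint32]
        names = (dyld._dyld_get_image_name(i)
                 for i in range(dyld._dyld_image_count()))
        return sorted({os.fsdecode(n) for n in names if n})
    if os.path.exists("/proc/self/maps"):
        paths = set()
        with open("/proc/self/maps") as maps:
            for line in maps:
                # Layout: address perms offset dev inode [pathname]
                parts = line.split(None, 5)
                if len(parts) == 6 and parts[5].startswith("/"):
                    paths.add(parts[5].strip())
        return sorted(paths)
    return []  # Platform not covered by this sketch.


def libcrypto_copies(paths):
    """Keep only entries whose file name starts with 'libcrypto'."""
    return [p for p in paths if os.path.basename(p).startswith("libcrypto")]


if __name__ == "__main__":
    import ssl  # Importing ssl normally pulls in the interpreter's libcrypto.
    for path in libcrypto_copies(loaded_images()):
        print(path)
```

Called after the provider has been loaded, more than one line of output would indicate that two libcrypto images share the process, which is the condition suspected here.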
Detailed story is in this issue, though I'd recommend starting here.
TL;DR
OpenSSL crashes after running `liboqs` tests, hitting `NULL` ptr when cleaning up/deallocating things. Great analysis is here. It occurs when several "additional" providers are defined in `openssl.cnf` - by "additional" I mean `pkcs11-provider` and `oqs-provider`, though `legacy` provider seems to also contribute to this problem when it's defined (I stopped enabling it, precisely because of that).
Quoting from the referring issue:
With both `pkcs11-provider` and `oqs-provider` enabled - `liboqs` tests will pass (reporting `== 464 passed, 220 skipped in 40.80s ==`), but crash in the end:
and crash report: