SIGSEGV: NSS + SoftHSMv2 == crash during atexit() #635
Comments
Putting breakpoints onto the constructor and destructor of the mutexes, we get a double destructor call:
This may already be fixed in the develop branch (the code is not in a release yet). To check: is this with a released version or the most recent develop branch?
Can you point to the patch that fixes this, if possible? I'd need to do a significant hack job on the machine to get the RPM out and the dev code in, which is not easy to do. I'm not currently working on the softhsm code itself; rather, the mere presence of softhsm causes unrelated code to crash.
The RPM as shipped by Redhat includes the following patch, which looks very similar to the patches you've listed above. The crash however is happening in NSS code, not OpenSSL - I suspect whatever problem was triggering the crash in OpenSSL (and subsequently fixed) is also triggering a crash in NSS.
Looking in more detail, there seems to be a pattern of memory management where objects are manually created with new and manually destroyed with delete inside destructors. It then looks like objects are being shallow copied, and C++ is destroying the copy during atexit(), which attempts to delete the referenced objects a second time. The double free then triggers a crash. What makes this problem really serious is that on many (all?) Linux distros p11-kit is auto-wired into NSS, and softhsm is auto-wired into p11-kit. As a result, the simple act of installing the softhsm driver is enough to crash an application that uses NSS, even when that application makes no attempt to know or care about softhsm.
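To illustrate the shallow-copy hazard described above, here is a minimal sketch. It is not SoftHSM's actual code: `FakeMutex`, `RawOwner`, `SafeOwner` and `liveAfterScope` are hypothetical names made up for this example. It contrasts a raw owning pointer, which the compiler will happily copy, with a `std::unique_ptr` member, which makes the owner non-copyable so that a double delete cannot be written by accident.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a mutex-like resource (not a SoftHSM class).
struct FakeMutex {
    static int live;            // number of currently constructed instances
    FakeMutex()  { ++live; }
    ~FakeMutex() { --live; }
};
int FakeMutex::live = 0;

// Hazardous shape: raw owning pointer plus compiler-generated copy operations.
// Copying a RawOwner duplicates the pointer, and each copy's destructor
// would call delete on the same object.
struct RawOwner {
    FakeMutex* m;
    RawOwner() : m(new FakeMutex) {}
    ~RawOwner() { delete m; }
};

// Safer shape: unique_ptr owns the resource, so the implicit copy
// operations are deleted and ownership can only be moved.
struct SafeOwner {
    std::unique_ptr<FakeMutex> m{new FakeMutex};
};

static_assert(std::is_copy_constructible<RawOwner>::value,
              "RawOwner can be shallow copied (the hazard)");
static_assert(!std::is_copy_constructible<SafeOwner>::value,
              "SafeOwner cannot be copied");

// Moves ownership once, lets both owners go out of scope, and reports
// how many FakeMutex instances are still alive afterwards.
int liveAfterScope() {
    {
        SafeOwner a;
        SafeOwner b = std::move(a);  // a.m is now null; b.m owns the object
    }
    return FakeMutex::live;
}
```

With this shape the resource is released exactly once, however many owners were moved from along the way. Whether the SoftHSM crash actually comes from a copy like this is only the suspicion raised above, not something the sketch confirms.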
Can you print a stack trace for the double destructor call you saw earlier on? (so do |
The two stacktraces are below.
Looking closer, it looks like we're stopping twice in the destructor. Another question to ask: why is the SoftHSM object still alive and wired into atexit? Once C_Finalize is called, SoftHSM should be completely gone. Placing a breakpoint on the SoftHSM destructor, we see the object is still alive during atexit:
More digging. I've discovered that in this case NSS was calling C_Initialize, but was not calling C_Finalize. The reason was the presence of the NSS_INIT_COOPERATE flag, which sets the NSS_INIT_NOPK11FINALIZE flag, which tells NSS not to run C_Finalize on anything it called C_Initialize on. Removing the NSS_INIT_COOPERATE flag when initialising avoids the crash. It looks like the destructors in softhsm shut down cleanly when C_Finalize is called, but if it is not called, the finaliser in the SoftHSM object causes softhsm to crash.
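As a sketch of what a teardown that tolerates a missing C_Finalize could look like: the model below is hypothetical (`Module`, `Resource` and `teardownCount` are invented names, and this is not how SoftHSM is implemented). The idea is that `finalize()` is idempotent and the destructor simply calls it, so the resource is released exactly once whether or not the caller finalised first.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource that counts how many times it has been destroyed.
struct Resource {
    static int destroyed;
    ~Resource() { ++destroyed; }
};
int Resource::destroyed = 0;

// Hypothetical model of a module with initialize/finalize entry points.
class Module {
public:
    void initialize() { if (!res_) res_.reset(new Resource); }
    void finalize()   { res_.reset(); }   // idempotent: a second call is a no-op
    bool initialized() const { return res_ != nullptr; }
    ~Module() { finalize(); }             // safe even if finalize() already ran
private:
    std::unique_ptr<Resource> res_;
};

// Runs one init/teardown cycle and returns how many times the resource
// was destroyed, with or without an explicit finalize() from the caller.
int teardownCount(bool callerFinalizes) {
    Resource::destroyed = 0;
    {
        Module m;
        m.initialize();
        if (callerFinalizes) m.finalize();
    }   // ~Module runs here in both cases
    return Resource::destroyed;
}
```

Both `teardownCount(true)` and `teardownCount(false)` report a single destruction, which is the property the exit-time path would need when an application skips C_Finalize.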
I use the OpenSSL 3 PKCS11 provider and everything works great until the end of the process, where I get a SIGSEGV during destruction. I'm currently using SoftHSM2 with Botan because it doesn't work well with OpenSSL 3. Any updates on this issue? I was able to use lldb and trace it down to destructors being called on the SoftHSM instance in C++. Maybe they are getting called more than once?
I have code that calls NSS on EL8.
When softhsm is installed and present, the code crashes on shutdown in the atexit handler as follows:
Placing a breakpoint on the destructor, we see the destructor is called twice.
The first time the destructor is called is during initialisation of NSS:
The second time the destructor is called is during atexit() shutdown:
Digging into the MutexFactory we get this:
Any ideas so far?