OpenSSL memory leak in error queue #964

Closed
nased0 opened this Issue Aug 17, 2016 · 28 comments

Projects

None yet

5 participants

@nased0
nased0 commented Aug 17, 2016

Hello
I have found memory leaks, increasing with every OpenSSL session.
I use SoftwareVerify Memory Validator and I have made screenshots:
http://images78.fotosik.pl/823/9e7658313231fd01.png
http://images76.fotosik.pl/823/a7587db189782949.png
My application uses OpenSSL certificates and private key in PEM files format.

As you can see on screenshots, leaking memory is allocated by OpenSSL's ERR_get_state() function, called by ERR_clear_error(), called by SSL_CTX_use_certificate_chain_file.
SSL_CTX_use_certificate_chain_file is called by cURL's cert_stuff function, used by ossl_connect_step1.

I wrote to Matt Caswell from openssl.org about this memleah, and he answered:
OpenSSL maintains a separate error queue for each thread. On each queue there can be
multiple errors. ERR_get_state() does not add any errors to the queue it
merely returns the ERR_STATE (i.e. the queue) for the current thread.
If the current thread has no queue then ERR_get_state() will create one.

ERR_clear_error() removes all the errors that are on the current
thread's queue. It does not deallocate the current thread's queue.

ERR_remove_thread_state() deallocates the specified thread's queue.

The mem leaks you are seeing are almost certainly because the queues for
your app's threads have not been deallocated.

I expected the following

Function ERR_remove_thread_state() should be called at the end of every OpenSSL session, because it deallocates memory allocated at the beginning of every OpenSSL Session.
I made a patch, which calls ERR_remove_thread_state() at Curl_ossl_close_all:
openssl_c_patch.txt

It solved the problem for me.

curl/libcurl version

curl 7.50.1 (i686-pc-mingw32) libcurl/7.50.1 OpenSSL/1.0.2h zlib/1.2.8 WinIDN libssh2/1.6.1_DEV
Protocols: dict file ftp ftps gopher http https imap imaps ldap pop3 pop3s rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS TrackMemory IDN IPv6 Largefile SSPI Kerberos SPNEGO NTLM SSL libz

curl_memleak1
curl_memleak2

operating system

Windows 7 Professional x64

@bagder
Member
bagder commented Aug 17, 2016

The endless hunt to fix the last OpenSSL leak...

Based on your explanation, this seems like a fine fix!

@jay
Member
jay commented Aug 18, 2016

If we free the thread error queue after each easy handle cleanup couldn't some other easy handle in the same thread have an openssl error and we're freeing that?

@bagder
Member
bagder commented Aug 18, 2016

couldn't some other easy handle in the same thread have an openssl error and we're freeing that?

(First, OpenSSL at least fixed this in 1.1 so we can stop worrying there.)

Good thinking! The error queue handling in OpenSSL is so strange and I wished I understood it better. Since it is global per-thread data, how does it handle multiple errors from different users within the same thread?

@nased0
nased0 commented Aug 18, 2016 edited

That's why I decided to put my fix in the function Curl_ossl_close_all, where also OpenSSL engine is being freed, not in the Curl_ossl_close function.
Besides, comment above Curl_ossl_close_all is self evident:
/*

  • This function is called when the 'data' struct is going away. Close
  • down everything and free all resources!
    */

In my opinion you should not use cURL and OpenSSL data structures after calling that.

@bagder
Member
bagder commented Aug 18, 2016

@nased0, users could still have many more easy handles in the same thread, like when using the multi interface and doing hundreds of parallel downloads. Maybe 22 of them all got SSL problems... if the first handle you close cleans out the error queue, won't that risk that it'll clear out errors for the other easy handles?

@nased0
nased0 commented Aug 18, 2016 edited

I don't know when Curl_ossl_close_all is being called when using a cURL multi interface, because I'm using only easy interface in my app.
My fix should obviously be inserted into the function that is called when all users finished their sessions in the same thread. But if many users use the OpenSSL engine, freeing it would break their sessions, so I assumed that Curl_ossl_close_all is called when it's appropriate.

@bagder
Member
bagder commented Aug 18, 2016

There is no such call. The libcurl API is centered around the handles and an application can create and close handles many times through-out its lifetime and there's no API call made when "it is all done in this thread". There's only the closing of handles and the global cleanup.

One way forward I can possibly think of right now to handle this, is to have a reference counter (protected with a mutex callback) on all easy handles and when that reaches zero (when the last easy handle is closed in the thread), clean up the error queue. But it would have to count only easy handles within the same thread, and we allow users to "pass" handles to other threads and they could get closed there so it would need a busload of trickery to work!

@nased0
nased0 commented Aug 18, 2016 edited

So maybe you should provide such function to the programmers using the multi interface, that they will call to free the engine and the OpenSSL error queue for the current thread?
It will require some simple changes in their apps, but it is better than the current situation!
Users of the easy interface should have it called automatically when curl_easy_cleanup is being called (assuming that they do not use the multi interface at the same time and in the same thread).
P.S.
OpenSSL engine operations are also important to me (especially PKCS#11), and it seems that the engine is being improperly freed when using multi interface!

@bagder
Member
bagder commented Aug 18, 2016

maybe you should provide such function to the programmers using the multi interface, that they will call to free the engine and the OpenSSL error queue for the current thread

A new function in libcurl to handle the existing (and flawed I'd say) OpenSSL version only? You may think of that as "better than the current situation" but I (as one of us who'll live with the maintenance and documentation burden) strongly disagree. I couldn't even write up simple explanation for application authors when or why such a function would be called.

Users of the easy interface should have it called automatically when curl_easy_cleanup is being called

Assuming they only create one easy handle at a time you mean? I know lots of applications that create many, even when using only the easy interface. And again, a multi-threading situation makes it very hard to even just document how it would be used (not to mention figure out how it would work code wise).

No, I think we need to keep searching for another fix.

OpenSSL engine operations are also important for me, and it seems that they are being improperly freed when using multi interface!

Yes it seems they suffer from the same problem. This has probably gone unnoticed due to not that many people are using the openssl engine stuff, and those who do might not have done it with multiple handles in the multi interface or something.

@nased0
nased0 commented Aug 18, 2016 edited

One remark about the engine from
https://www.openssl.org/docs/man1.0.1/crypto/engine.html
According to this documentation, OpenSSL ENGINE object is protected by reference counting.
"All structural references [(meaning pointers)] should be released by a corresponding to call to the ENGINE_free() function - the ENGINE object itself will only actually be cleaned up and deallocated when the last structural reference is released.|

So this engine freeing code in Curl_ossl_close_all may actially be partially correct.
But "All functional references are released by calling ENGINE_finish() (which removes the implicit structural reference as well)." So calling ENGINE_free() after ENGINE_finish() in Curl_ossl_close_all may not be necessary and even may cause a double free.
Edit:
ENGINE_finish in crypto\engine\eng_init.c calls engine_unlocked_finish and it calls engine_free_util.

/*
 * Free a functional reference to a engine type. This version is only used
 * internally.
 */
int engine_unlocked_finish(ENGINE *e, int unlock_for_handlers)
{
    int to_return = 1;

    /*
     * Reduce the functional reference count here so if it's the terminating
     * case, we can release the lock safely and call the finish() handler
     * without risk of a race. We get a race if we leave the count until
     * after and something else is calling "finish" at the same time -
     * there's a chance that both threads will together take the count from 2
     * to 0 without either calling finish().
     */
    e->funct_ref--;
    engine_ref_debug(e, 1, -1);
    if ((e->funct_ref == 0) && e->finish) {
        if (unlock_for_handlers)
            CRYPTO_w_unlock(CRYPTO_LOCK_ENGINE);
        to_return = e->finish(e);
        if (unlock_for_handlers)
            CRYPTO_w_lock(CRYPTO_LOCK_ENGINE);
        if (!to_return)
            return 0;
    }
#ifdef REF_CHECK
    if (e->funct_ref < 0) {
        fprintf(stderr, "ENGINE_finish, bad functional reference count\n");
        abort();
    }
#endif
    /* Release the structural reference too */
    if (!engine_free_util(e, 0)) {
        ENGINEerr(ENGINE_F_ENGINE_UNLOCKED_FINISH, ENGINE_R_FINISH_FAILED);
        return 0;
    }
    return to_return;
}

@bagder
Member
bagder commented Aug 18, 2016

calling ENGINE_free() after ENGINE_finish() in Curl_ossl_close_all may not be necessary and even may cause a double free

However, even the example code on OpenSSL's ENGINE page does ENGINE_finish + ENGINE_free which could be an indication that it is fine...

@nased0
nased0 commented Aug 18, 2016 edited

Ok. it seems that in function engine_free_util(ENGINE *e, int locked) pointer to the engine is also protected by another (structural) reference counter, so if it works, then it is fine.
http://docs.huihoo.com/doxygen/openssl/1.0.1c/eng__lib_8c_source.html
(This engine_free_util code in openssl-1.0.2h is identical)

@nased0
nased0 commented Aug 18, 2016 edited

I have decided to heed Matt Caswell advice and to call ERR_remove_thread_state(NULL) directly from my application just before the communications thread exit.
I know that it may not be convenient for other libcurl users, but it solved the problem with memleaks for me. Especially because I already have to include and link OpenSSL to control OpenSSL PKCS#11 engine directly, to provide it with PIN entered by the user.
So maybe you should mention in documentation, that curl_easy_cleanup does not remove OpenSSL 1,0.2 error queue for the current thread and you need to call ERR_remove_thread_state(NULL) directly before exiting this thread?

@mkauf
Contributor
mkauf commented Aug 18, 2016

The error queue handling in OpenSSL is so strange and I wished I understood it better. Since it is global per-thread data, how does it handle multiple errors from different users within the same thread?

OpenSSL's error queue should be inspected as soon as an OpenSSL function returns a value that indicates an error. I think that libcurl handles this pretty well.

Example: If SSL_CTX_use_certificate_chain_file() fails, the error stack is inspected immediately with ERR_get_error():

if(SSL_CTX_use_certificate_chain_file(ctx, cert_file) != 1) {
  failf(data, "could not load PEM client certificate, " OSSL_PACKAGE " error %s, "
    "(no key found, wrong pass phrase, or wrong file format?)",
  ERR_error_string(ERR_get_error(), NULL) );
  return 0;
}
@jay
Member
jay commented Aug 19, 2016

Right, that's a good point. It appears everywhere in openssl.c we are popping the error data we need as soon as it is added so specifically the ERR_remove_thread_state(NULL) on easy_cleanup is probably ok as long as we keep doing that. I can't think of a test we can write to enforce it though. Also I don't know why we're peeking here.

@bagder
Member
bagder commented Aug 19, 2016

The peek call originates from commit 0e9626b (it was later changed again in 61fc904 to the current call). Are you suggesting there's a better way for us to show the error string there?

@dwmw2
Contributor
dwmw2 commented Aug 19, 2016

Can we call pthread_cleanup_push() to ensure that the appropriate cleanup is called when the thread exits? Or should it be OpenSSL doing that?

Reading the above... note that you shouldn't actually need to mess around with the OpenSSL-specific ENGINE support. If you pass a certificate name that is actually a PKCS#11 URI then that should Just Work™. It does with GnuTLS, it will with NSS, and it can with OpenSSL+ENGINE too.

I suppose I should make curl do something like the patch in that last link...

@nased0
nased0 commented Aug 20, 2016 edited

But cURL and OpenSSL can use certificate saved on PKCS#11 card automatically (without passing a certificate name), you may pass a certificate ID to cURL (by setting CURLOPT_SSLCERT and CURLOPT_SSLKEY to that ID string), but you don't have to if you have only one certificate on card. In this case you only have to set CURLOPT_SSLCERTTYPE and CURLOPT_SSLKEYTYPE to "ENG".
It is very convenient to me and I don't want to pass any certificate name if I don't have to!

PIN attribute is specific to the engine_pkcs11, it can be specified in pkcs11 section of the openssl.cnf file (i.e. PIN=1234) or by the function ENGINE_ctrl_cmd_string(ENGINE *e, const char *cmd_name, const char *arg, int cmd_optional); where cmd_name="PIN and arg="1234".

P.S.
I use cURL on Windows platform, and pthread_cleanup_push() is specific to UNIX (and Windows pthreads implementations ), so in my opinion it should be used by OpenSSL to clean its own mess, not by cURL.
Besides, jay explained why my patch is good enough for cURL.

@dwmw2
Contributor
dwmw2 commented Aug 20, 2016

Obviously the facility to set CURLOPT_SSLCERTTYPE and CURLOPT_SSLKEYTYPE wouldn't be removed. And you can still provide an openssl.cnf file and manage the engine explicitly if you really want to.

But (using the command line as an example), this ought to work with OpenSSL just as it does with GnuTLS and (soon, when my patches land) NSS builds:

  curl --cert 'pkcs11:manufacturer=piv_II;id=%01;pin-value=1234' $URL

And to use the first cert it finds, equivalent to your use case, you could just use a bare pkcs11: as the URI. Or pkcs11:pin-value=1234. It's really a set of search/filter criteria, not a URI.

But we digress. I'll send patches...

@bagder
Member
bagder commented Aug 21, 2016 edited

Can we call pthread_cleanup_push()

Right now we can't since we don't require/use pthreads unconditionally in curl. And it seems like quite a drastic change just to fix this issue...

Or should it be OpenSSL doing that?

I think each lib should clean up its own mess as good as possible so if anyone wants a cleanup at thread-exit then I think that lib should call that function. I think that's also why OpenSSL 1.1.0 doesn't have this problem because they now A) mandate pthreads and B) takes better care of their cleanups.

@bagder
Member
bagder commented Aug 28, 2016

It appears everywhere in openssl.c we are popping the error data we need as soon as it is added so specifically the ERR_remove_thread_state(NULL) on easy_cleanup is probably ok as long as we keep doing that

That seems logical. If we just keep extracting the correct error from the error queue when OpenSSL has returned one I think we can clean the error queue as this patch suggests. Anyone objects?

@bagder bagder self-assigned this Aug 28, 2016
@jay
Member
jay commented Aug 28, 2016

I would not do it like the original patch because only ERR_remove_thread_state we know is thread-specific, the other one ERR_remove_state apparently isn't in earlier versions. If they're using a version older than 1.0.0 I don't know of a good solution

@nased0
nased0 commented Aug 28, 2016 edited

Well, I simply copied code called by curl_global_cleanup().
It is possible that ERR_remove_state() shouldn't be called by my patch, I really don't care about obsolete OpenSSL versions.

P.S.
There are much more memory leaks in OpenSSL 1.0.2h and cURL that occur during engine PKCS#11 operations. I detected them by Memory Validator despite using newest versions of libp11 and libpkcs11 libraries compiled by MSVC 2012.
I have to load my communications DLL library dynamically, to be able to free and reload it after every 100 sessions.
Otherwise cURL and OpenSSL cease to connect after about 600 communication sessions.
This problem does not occur when I'm using PEM certificates instead of PKCS#11 card.

@bagder
Member
bagder commented Aug 29, 2016

Here's my suggested patch. Thoughts?

--- a/lib/vtls/openssl.c
+++ b/lib/vtls/openssl.c
@@ -1048,10 +1048,18 @@ void Curl_ossl_close_all(struct Curl_easy *data)
     data->state.engine = NULL;
   }
 #else
   (void)data;
 #endif
+#if !defined(HAVE_ERR_REMOVE_THREAD_STATE_DEPRECATED) && \
+  defined(HAVE_ERR_REMOVE_THREAD_STATE)
+  /* OpenSSL 1.0.1 and 1.0.2 build an error queue that is stored per-thread
+     so we need to clean it here in case the thread will be killed. All OpenSSL
+     code should extract the error in association with the error so clearing
+     this queue here should be harmless at worst. */
+  ERR_remove_thread_state(NULL);
+#endif
 }

 /* ====================================================== */

@jay
Member
jay commented Aug 30, 2016 edited

Here's my suggested patch. Thoughts?

I'd go shorter in the comments and ref this issue like /* We have to clean the error queue here since we never know when the thread will be killed, see https://github.com/curl/curl/issues/964 */

There are much more memory leaks in OpenSSL 1.0.2h and cURL that occur during engine PKCS#11 operations. I detected them by Memory Validator despite using newest versions of libp11 and libpkcs11 libraries compiled by MSVC 2012.

I don't think a lot of people are using PKCS, so I wouldn't be surprised. If you have patches for pkcs memory leaks please open a separate issue or pr.

@nased0
nased0 commented Aug 30, 2016

I can test PKCS#11 communication with server only at work, and my evaluation version of Memory Validator has expired, so I have to request a licence.
Besides, it will take much of my time to patch all these leaks.
Some of them are quite compromising, like leaking a private key (or what PKCS#11 card presents instead of a private key that cannot leave the card, (at least if the protection bit is set)).
So I posted info about it here, hoping that someone else could look at this problem.
These cards and USB readers aren't expensive and you can use a TPM module to emulate a crypto card.

@dwmw2
Contributor
dwmw2 commented Aug 30, 2016

You don't need hardware to use PKCS#11. You can use it with software tokens — including GNOME keyring, the NSS softokens that Chrome/Firefox/etc. use, and SoftHSM.

@bagder bagder changed the title from OpenSSL Memory Leak in LibCurl.dll to OpenSSL memory leak in error queue Sep 14, 2016
@bagder
Member
bagder commented Sep 14, 2016

I clarified the name of this bug to make it about this particular memory leak only. If/when there are additional leaks, lets open specific issues for those.

@bagder bagder added a commit that closed this issue Sep 14, 2016
@bagder bagder openssl: fix per-thread memory leak usiong 1.0.1 or 1.0.2
OpenSSL 1.0.1 and 1.0.2 build an error queue that is stored per-thread
so we need to clean it when easy handles are freed, in case the thread
will be killed in which the easy handle was used. All OpenSSL code in
libcurl should extract the error in association with the error already
so clearing this queue here should be harmless at worst.

Fixes #964
d932156
@bagder bagder closed this in d932156 Sep 14, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment