Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plug mysql_thread_init memory leak, fixes #6231 #7509

Merged
merged 1 commit into from Mar 19, 2019

Conversation

Projects
None yet
6 participants
@Habbie
Copy link
Member

Habbie commented Feb 21, 2019

Short description

Whenever a thread uses MySQL, make sure to call mysql_thread_end() when the thread terminates.

NOTE: not PRed to master yet. #7517

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)
  • checked that this code was merged to master
@@ -38,6 +38,22 @@
typedef bool my_bool;
#endif

class MySQLThreadCloser

This comment has been minimized.

Copy link
@zeha

zeha Feb 21, 2019

Collaborator

this black magic code should have a comment on the "why" :)

This comment has been minimized.

Copy link
@Habbie

Habbie Feb 22, 2019

Author Member

Yes, agreed!

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 21, 2019

As pointed out by @rgacogne in conversation before I filed this PR:

You could do per-thread refcounting of SMySQL objects, but you can’t call mysql_thread_end() if the refcount goes to 0 because we could still create a new one (note that it might work and just cause a new allocation, but the documentation is not clear on that point so it’s a bit iffy) and if the refcount isn’t at 0 when your destructor is called you won’t get called again. I guess you could then set a flag so that the last SMySQL destructor knows we are exiting the thread, but what if your flag is destructed before that?

We need to check if this is a problem in practice. The best answer I can hope for is 'hide this cleanup trick behind a pdns.conf flag and we will get away with it'.

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 22, 2019

With https://gist.github.com/Habbie/92f0a5b389339c09d7d0931addf94329, and this PR cherry-picked on top of master (available at https://github.com/Habbie/pdns/tree/mysqlthreadcloser):

API call:

298088192 SMySQL
298088192 MySQLThreadCloser enabled
298088192 ~SMySQL
298088192 ~MySQLThreadCloser closing ...  mysql_thread_end()

AXFR:

298088192 SMySQL
298088192 MySQLThreadCloser enabled
323266304 SMySQL
323266304 MySQLThreadCloser enabled
306480896 SMySQL
314873600 SMySQL
306480896 MySQLThreadCloser enabled
314873600 MySQLThreadCloser enabled
314873600 ~SMySQL
314873600 ~SMySQL
323266304 ~SMySQL
314873600 ~MySQLThreadCloser closing ...  mysql_thread_end()
314873600 ~MySQLThreadCloser closing ...  mysql_thread_end()
323266304 ~MySQLThreadCloser closing ...  mysql_thread_end()
306480896 ~SMySQL
306480896 ~MySQLThreadCloser closing ...  mysql_thread_end()
298088192 ~SMySQL
298088192 ~MySQLThreadCloser closing ...  mysql_thread_end()
@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 22, 2019

With https://gist.github.com/Habbie/92f0a5b389339c09d7d0931addf94329, and this PR cherry-picked on top of master (available at https://github.com/Habbie/pdns/tree/mysqlthreadcloser)

This debugging output is not portable (works on el7, doesn't work on OSX):

  CXX      smysql.lo
smysql.cc:44:10: error: cannot initialize return object of type 'int' with an rvalue of type 'pthread_t _Nonnull' (aka '_opaque_pthread_t *')
  return pthread_self();
         ^~~~~~~~~~~~~~
1 error generated.
make: *** [smysql.lo] Error 1
@rgacogne

This comment has been minimized.

Copy link
Member

rgacogne commented Feb 22, 2019

I think this might work (with some caveats below) because:

  • destructors of thread_local variables are called before pthread thread-specific data is destroyed 1 ;
  • objects with an automatic storage duration will already have been destroyed when thread_local objects are destroyed.

But that means that we need to make sure that no UeberBackend object outlives the duration of the thread that created it. This means no global (static), thread-local (because then the order of destruction is very hard to predict), or static local objects. Also we need to be careful about dynamically allocated objects outliving their creator thread.
I already see one possible issue in 2, a static local UeberBackend that might be created in one thread and be used by another one after the initial thread has been destroyed, but there might be other. :-/

@rgacogne rgacogne added the auth label Feb 22, 2019

@omoerbeek

This comment has been minimized.

Copy link
Member

omoerbeek commented Feb 25, 2019

After reading the mysql 5.5 docs on mysql_thread_end() and related pages (e.g. https://dev.mysql.com/doc/refman/5.5/en/c-api-threaded-clients.html):

  • mysql_thread_init() should be called (directly or indirectly via mysql_connect() for every thread that uses mysql functions.
  • mysql_thread_end() should be called just before pthread_exit()
  • The case where a mysql_connect() is done by one thread but the resulting handle is used by another is covered as well. A thread using mysql functions should call mysql_thread_init() even if it din not create the mysql connection. And per consequence it should also call mysql_thread_end() on thread exit.

I suspect that sharing sql connections between threads could cause trouble if the thread that did not create the connection uses mysql functions and did not call mysql_thread_init(). But likely we should have seen this trouble already if that happens in the current code. If the thread that uses but not created the mysql connection exits at some point, it should call mysql_thread_end(). If it lives forever, no active leak occurs. In general I say that sharing mysql connection objects between threads is something that makes a hard problem even harder and should be avoided.

As for short lived threads, it looks like the mechanism provided is sufficient, provided that we keep the sql connection to the thread and no mysql functions are used by the thread exiting between the call to mysql_thread_end() and pthread_exit(). As we are already in thread teardown state when the C++ thread local destructor calls are done this abusive use of mysql functions can only be achieved by (other) thread local destructor calls, which can be avoided easily.

So I agree with Remi's analysis, we are likely safe (with some reservations).

We should probably make this thread local object mechanism something conditional. If we know exactly which client libs expose the problem, we could use https://dev.mysql.com/doc/refman/5.5/en/mysql-get-client-version.html to decide to enable the mechanism. Otherwise use a config flag.

@omoerbeek

This comment has been minimized.

Copy link
Member

omoerbeek commented Feb 25, 2019

314873600 ~MySQLThreadCloser closing ... mysql_thread_end()
314873600 ~MySQLThreadCloser closing ... mysql_thread_end()

Hmm, a double call, that should not happen, right?

I see for 314873600:

314873600 SMySQL
314873600 MySQLThreadCloser enabled
314873600 ~SMySQL
314873600 ~SMySQL
314873600 ~MySQLThreadCloser closing ...  mysql_thread_end()
314873600 ~MySQLThreadCloser closing ...  mysql_thread_end()

Only one create, but two dt calls.. Is this log complete?

Probably should add the value of this to the debug output to see which objects we're dealing with.
The double call to mysql_thread_end() should be avoided.

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 25, 2019

Only one create, but two dt calls.. Is this log complete?

Well spotted, this is weird. Will try to rerun that test.

@omoerbeek

This comment has been minimized.

Copy link
Member

omoerbeek commented Feb 25, 2019

To avoid the double call to mysql_thread_end() the bool should probably be a counter: increment on enable, decrement in dt and call mysql_thread_end() if zero.

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 25, 2019

To avoid the double call to mysql_thread_end() the bool should probably be a counter: increment on enable, decrement in dt and call mysql_thread_end() if zero.

First we need to figure out why a destructor got called twice..

@omoerbeek

This comment has been minimized.

Copy link
Member

omoerbeek commented Feb 25, 2019

Yes, that is one step further. First need to find out why there is a single construct with two dt calls in the log.

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 25, 2019

Yes, that is one step further. First need to find out why there is a single construct with two dt calls in the log.

That version does not log the construct, just the enable, but it indeed logs just one enable..

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 25, 2019

Hmm, a double call, that should not happen, right?

It usually doesn't. I've added this to the logging and retried until it happened again:

-1300248832 SMySQL
-1300248832 0x7fefb27fc6f9 MySQLThreadCloser construct
-1300248832 0x7fefb27fc6f9 MySQLThreadCloser enabled
-1325426944 SMySQL
-1325426944 0x7fefb0ff96f9 MySQLThreadCloser construct
-1325426944 0x7fefb0ff96f9 MySQLThreadCloser enabled
-1317034240 SMySQL
-1308641536 SMySQL
-1317034240 0x7fefb17fa6f9 MySQLThreadCloser construct
-1317034240 0x7fefb17fa6f9 MySQLThreadCloser enabled
-1308641536 0x7fefb1ffb6f9 MySQLThreadCloser construct
-1308641536 0x7fefb1ffb6f9 MySQLThreadCloser enabled
-1308641536 ~SMySQL
-1308641536 ~SMySQL
-1317034240 ~SMySQL
-1308641536 0x7fefb1ffb6f9 ~MySQLThreadCloser closing ...  mysql_thread_end()
-1308641536 0x7fefb1ffb6f9 ~MySQLThreadCloser closing ...  mysql_thread_end()
Feb 25 12:35:08 AXFR of domain 'example.com' to 127.0.0.1:46709 finished
-1317034240 0x7fefb17fa6f9 ~MySQLThreadCloser closing ...  mysql_thread_end()
-1325426944 ~SMySQL
-1325426944 0x7fefb0ff96f9 ~MySQLThreadCloser closing ...  mysql_thread_end()
-1300248832 ~SMySQL
-1300248832 0x7fefb27fc6f9 ~MySQLThreadCloser closing ...  mysql_thread_end()

4 constructs, 4 enables, 5 destruct+close

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 25, 2019

It turns out I'm using cerr in a thread unsafe way, which is duplicating output lines. With static std::atomic<int> s_unique; and (s_unique++) once in every cerr line, here's the smoking gun:

1518323456 SMySQL
1518323456 0x7f6a5a7fc6ec MySQLThreadCloser constructed, d_constructed=1, s_unique=32
1518323456 0x7f6a5a7fc6ec MySQLThreadCloser enabled, s_unique=33
1509930752 SMySQL
1509930752 0x7f6a59ffb6ec MySQLThreadCloser constructed, d_constructed=1, s_unique=34
1509930752 0x7f6a59ffb6ec MySQLThreadCloser enabled, s_unique=35
1493145344 SMySQL
1501538048 SMySQL
1493145344 0x7f6a58ff96ec MySQLThreadCloser constructed, d_constructed=1, s_unique=36
1493145344 0x7f6a58ff96ec MySQLThreadCloser enabled, s_unique=37
1501538048 0x7f6a597fa6ec MySQLThreadCloser constructed, d_constructed=1, s_unique=38
1501538048 0x7f6a597fa6ec MySQLThreadCloser enabled, s_unique=39
1493145344 ~SMySQL
1509930752 ~SMySQL
1493145344 0x7f6a58ff96ec ~MySQLThreadCloser closing ... , d_constructed=1, d_destructed=1, s_unique=40 mysql_thread_end()
1493145344 0x7f6a58ff96ec ~MySQLThreadCloser closing ... , d_constructed=1, d_destructed=1, s_unique=40 mysql_thread_end()
1501538048 ~SMySQL
1493145344 0x7f6a58ff96ec ~MySQLThreadCloser closing ... , d_constructed=1, d_destructed=1, s_unique=40 mysql_thread_end()
1501538048 ~SMySQL
1509930752 0x7f6a59ffb6ec ~MySQLThreadCloser closing ... , d_constructed=1, d_destructed=1, s_unique=41 mysql_thread_end()
1501538048 0x7f6a597fa6ec ~MySQLThreadCloser closing ... , d_constructed=1, d_destructed=1, s_unique=42 mysql_thread_end()
1518323456 ~SMySQL
1518323456 0x7f6a5a7fc6ec ~MySQLThreadCloser closing ... , d_constructed=1, d_destructed=1, s_unique=43 mysql_thread_end()

@Habbie Habbie referenced this pull request Feb 26, 2019

Merged

plug mysql_thread_init memory leak #7517

5 of 8 tasks complete
@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Feb 26, 2019

#7517 has a configurable version of this for master, please review

@Habbie Habbie force-pushed the Habbie:4.1.x-mysqlthreadcloser branch from 2c35018 to b4a7c77 Mar 1, 2019

@Habbie

This comment has been minimized.

Copy link
Member Author

Habbie commented Mar 1, 2019

Now identical to #7517, minus the documentation changes. Ready for review, merge.

@Habbie Habbie added this to the auth-4.1.x milestone Mar 1, 2019

@aerique aerique requested a review from zeha Mar 14, 2019

@pieterlexis pieterlexis merged commit 6b23ae3 into PowerDNS:rel/auth-4.1.x Mar 19, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@aerique aerique modified the milestones: auth-4.1.x, auth-4.1.8 Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.