MySQLOnRocksDB/mysql-5.6
forked from facebook/mysql-5.6

rocksdb.cardinality and other tests fail assert in rocksdb::port::PthreadCall #108
I will try it out.
I have run the following
./mysql-test/mtr --suite=rocksdb cardinality --mysqld=--default-storage-engine=rocksdb
and it has passed so far.
I will try to reproduce.
Tried on a clean install of Ubuntu 14.04.
Compiled the server as specified here: https://github.com/MySQLOnRocksDB/mysql-5.6/wiki/Build-steps
Ran the test case as pasted here: https://gist.github.com/spetrunia/fdf51050309b2b592f2a
and it reliably fails.
@rven1 , I can provide access to an AWS instance where this is observed.
Using the command line from above on my system, it still passed. Please look it over to see if something needs to be changed. Otherwise, I can look at your AWS instance.
rven@dev6965:~/local/mysql/5.6/_build-5.6-Debug/mysql-test (webscalesql-5.6.24.97)$ ./mysql-test-run --mysqld=--default-storage-engine=rocksdb --retry=1 --force --retry=1 --max-test-fail=100 --suite=rocksdb rocksdb.optimize_table
Logging: /home/rven/local/mysql/5.6/mysql-test/mysql-test-run.pl --mysqld=--default-storage-engine=rocksdb --retry=1 --force --retry=1 --max-test-fail=100 --suite=rocksdb rocksdb.optimize_table
[Note] Global variable initial value (from cmd line) @@datadir = '/tmp/UzUe5S3g4K'
[Note] Global variable initial value (from cmd line) @@lc_messages_dir = '/home/rven/mysql/5.6/_build-5.6-Debug/sql/share'
2015-09-09 09:39:26 0 [Note] /home/rven/mysql/5.6/_build-5.6-Debug/sql/mysqld (mysqld 5.6.24-fb-debug) starting as process 348218 ...
2015-09-09 09:39:26 348218 [Note] Flashcache bypass: disabled
2015-09-09 09:39:26 348218 [Note] Flashcache setup error is : open flash device failed
2015-09-09 09:39:26 348218 [Note] Plugin 'FEDERATED' is disabled.
2015-09-09 09:39:26 348218 [Note] Binlog end
2015-09-09 09:39:26 348218 [Note] Shutting down plugin 'CSV'
2015-09-09 09:39:26 348218 [Note] Shutting down plugin 'MyISAM'
MySQL Version 5.6.24
Using default engine 'rocksdb'
Checking supported features...
 - SSL connections supported
 - binaries are debug compiled
Tests will be run in sync (default) client mode
Collecting tests...
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/rven/mysql/5.6/_build-5.6-Debug/mysql-test/var'...
Installing system database...
TEST RESULT TIME (ms) or COMMENT
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
rocksdb.optimize_table [ pass ] 23808
The servers were restarted 0 times
Spent 23.808 of 36 seconds executing testcases
Completed: All 1 tests were successful.
rven@dev6965:~/local/mysql/5.6/_build-5.6-Debug/mysql-test (webscalesql-5.6.24.97)$
I built it using
../tools/mysqlbuild.sh --clean
tools/mysqlbuild.sh is a Facebook-internal script that Sergei doesn't have access to; it sets paths and links against libraries that are only accessible internally.
Could you give me access to an AWS instance where this issue is observed?
I can now reproduce the problem:
rven@dev6965:~/mysql/5.6/_build-5.6-Debug/mysql-test/var/log (webscalesql-5.6.23.71)$ m mysqltest.log
CURRENT_TEST: rocksdb.cardinality
CURRENT_TEST: rocksdb.cardinality
--- /home/rven/mysql/5.6/mysql-test/suite/rocksdb/r/cardinality.result 2015-08-19 00:38:08.167927000 +0300
+++ /home/rven/mysql/5.6/_build-5.6-Debug/mysql-test/var/log/cardinality.reject 2015-09-09 23:50:20.200210000 +0300
@@ -56,35 +56,35 @@
select cast(variable_value as unsigned) into @a from information_schema.global_status where variable_name='rocksdb_number_stat_computes';
show index in t1;
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
-t1 0 PRIMARY 1 id A 150000 NULL NULL BTREE
-t1 1 t1_1 1 id A 150000 NULL NULL BTREE
-t1 1 t1_1 2 i1 A 150000 NULL NULL YES BTREE
-t1 1 t1_2 1 i1 A 150000 NULL NULL YES BTREE
-t1 1 t1_2 2 i2 A 150000 NULL NULL YES BTREE
-t1 1 t1_3 1 i2 A 16666 NULL NULL YES BTREE
-t1 1 t1_3 2 i1 A 150000 NULL NULL YES BTREE
-t1 1 t1_4 1 c1 A 150000 NULL NULL YES BTREE
-t1 1 t1_4 2 c2 A 150000 NULL NULL YES BTREE
-t1 1 t1_5 1 c2 A 16666 NULL NULL YES BTREE
-t1 1 t1_5 2 c1 A 150000 NULL NULL YES BTREE
+t1 0 PRIMARY 1 id A 50000 NULL NULL BTREE
+t1 1 t1_1 1 id A 50000 NULL NULL BTREE
+t1 1 t1_1 2 i1 A 50000 NULL NULL YES BTREE
+t1 1 t1_2 1 i1 A 50000 NULL NULL YES BTREE
+t1 1 t1_2 2 i2 A 50000 NULL NULL YES BTREE
+t1 1 t1_3 1 i2 A 5555 NULL NULL YES BTREE
+t1 1 t1_3 2 i1 A 50000 NULL NULL YES BTREE
+t1 1 t1_4 1 c1 A 50000 NULL NULL YES BTREE
+t1 1 t1_4 2 c2 A 50000 NULL NULL YES BTREE
+t1 1 t1_5 1 c2 A 5555 NULL NULL YES BTREE
+t1 1 t1_5 2 c1 A 50000 NULL NULL YES BTREE
SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = DATABASE();
table_name table_rows
-t1 150000
+t1 50000
restarting...
show index in t1;
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
-t1 0 PRIMARY 1 id A 150000 NULL NULL BTREE
-t1 1 t1_1 1 id A 150000 NULL NULL BTREE
-t1 1 t1_1 2 i1 A 150000 NULL NULL YES BTREE
-t1 1 t1_2 1 i1 A 150000 NULL NULL YES BTREE
-t1 1 t1_2 2 i2 A 150000 NULL NULL YES BTREE
-t1 1 t1_3 1 i2 A 21428 NULL NULL YES BTREE
-t1 1 t1_3 2 i1 A 150000 NULL NULL YES BTREE
-t1 1 t1_4 1 c1 A 150000 NULL NULL YES BTREE
-t1 1 t1_4 2 c2 A 150000 NULL NULL YES BTREE
-t1 1 t1_5 1 c2 A 21428 NULL NULL YES BTREE
-t1 1 t1_5 2 c1 A 150000 NULL NULL YES BTREE
+t1 0 PRIMARY 1 id A 50000 NULL NULL BTREE
+t1 1 t1_1 1 id A 50000 NULL NULL BTREE
+t1 1 t1_1 2 i1 A 50000 NULL NULL YES BTREE
+t1 1 t1_2 1 i1 A 50000 NULL NULL YES BTREE
+t1 1 t1_2 2 i2 A 50000 NULL NULL YES BTREE
+t1 1 t1_3 1 i2 A 12500 NULL NULL YES BTREE
+t1 1 t1_3 2 i1 A 50000 NULL NULL YES BTREE
+t1 1 t1_4 1 c1 A 50000 NULL NULL YES BTREE
+t1 1 t1_4 2 c2 A 50000 NULL NULL YES BTREE
+t1 1 t1_5 1 c2 A 12500 NULL NULL YES BTREE
+t1 1 t1_5 2 c1 A 50000 NULL NULL YES BTREE
SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = DATABASE();
table_name table_rows
-t1 150000
+t1 50000
drop table t1;
mysqltest: Result content mismatch
(In case you still need access to the instance: access details sent over email)
I have been having difficulty reproducing this problem on the latest MyRocks.
Just spoke to Yoshi. He said this test is flaky because the statistics are collected by a background thread, which makes the results non-deterministic. Alexey is working on a fix in the optimizer.
Assigning to Alexey.
Some analysis.
It crashes because Thread 1 called abort() from here:
#6 0x0000000000e92036 in rocksdb::port::PthreadCall (label=0x15994dc "lock", result=22) at /home/ubuntu/mysql-5.6-look3/rocksdb/port/port_posix.cc:29
It aborted because pthread_mutex_lock() returned result=22, which is EINVAL.
The man page for pthread_mutex_lock() says:
The pthread_mutex_lock function returns the following error code on error:
EINVAL the mutex has not been properly initialized.
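For reference, this is roughly what that helper does (a paraphrased sketch of rocksdb/port/port_posix.cc, not the verbatim source): any nonzero return from a pthread_* call is treated as fatal.

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <pthread.h>

// Paraphrased sketch of rocksdb::port::PthreadCall: any nonzero
// result from a pthread_* call prints the error and aborts.
static void PthreadCall(const char* label, int result) {
  if (result != 0) {
    fprintf(stderr, "pthread %s: %s\n", label, strerror(result));
    abort();  // the abort() seen in the backtrace above
  }
}

// Mutex::Lock() is then a single fatal-on-error call; locking an
// already-destroyed mutex yields EINVAL (22), hence the abort().
void Lock(pthread_mutex_t* mu) { PthreadCall("lock", pthread_mutex_lock(mu)); }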
It was trying to lock this mutex:
port::Mutex ThreadLocalPtr::StaticMeta::mutex_
As far as I understand, this mutex is a static object: it is initialized in Mutex::Mutex() and destroyed in Mutex::~Mutex() when static destructors run.
Why was it destroyed? Because this thread had called exit(exit_code):
Thread 3 (Thread 0x7ff88df6d900 (LWP 13121)):
#0 0x00007ff88c8a966b in pthread_join (threadid=140704898590464, thread_return=0x0) at pthread_join.c:92
#1 0x0000000000eddeb5 in rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::JoinAllThreads (this=0x3b97860) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:1343
#2 0x0000000000edbaa1 in rocksdb::(anonymous namespace)::PosixEnv::~PosixEnv (this=0x1e27ae0 <rocksdb::Env::Default()::default_env>, __in_chrg=<optimized out>) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:819
#3 0x00007ff88bcf7259 in __run_exit_handlers (status=0, listp=0x7ff88c0796c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#4 0x00007ff88bcf72a5 in __GI_exit (status=<optimized out>) at exit.c:104
#5 0x00000000008adb65 in mysqld_exit (exit_code=0) at /home/ubuntu/mysql-5.6-look3/sql/mysqld.cc:1978
#6 0x00000000008b5784 in mysqld_main (argc=87, argv=0x3c1b808) at /home/ubuntu/mysql-5.6-look3/sql/mysqld.cc:6580
#7 0x00000000008ab19d in main (argc=13, argv=0x7fffb5599928) at /home/ubuntu/mysql-5.6-look3/sql/main.cc:25
The thread is inside the exit() call, which invokes PosixEnv::~PosixEnv(). As far as I understand, there is only one PosixEnv object, and it is statically allocated.
I guess it is possible that ThreadLocalPtr::StaticMeta::mutex_ was destroyed before the PosixEnv object was.
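To illustrate the suspected ordering problem, here is a minimal stand-alone C++ program (hypothetical, not MyRocks code) where a static mutex can be destroyed by exit() before a still-running thread's thread-local destructor locks it:

#include <chrono>
#include <mutex>
#include <thread>

static std::mutex static_mutex;  // destroyed by exit(), like StaticMeta::mutex_

struct TlsGuard {
  // Runs when the thread exits (cf. __nptl_deallocate_tsd ->
  // ThreadLocalPtr::StaticMeta::OnThreadExit in the backtrace).
  ~TlsGuard() { std::lock_guard<std::mutex> lock(static_mutex); }
};

int main() {
  std::thread worker([] {
    thread_local TlsGuard guard;  // destructor will lock static_mutex
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  });
  worker.detach();
  // Returning from main() runs static destructors while the worker is
  // still alive: locking the destroyed mutex is undefined behavior and
  // shows up in practice as EINVAL/abort. Joining the worker before
  // returning removes the race.
  return 0;
}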
Maybe it would help if rocksdb_done_func called this:
rocksdb::Env::Default()->WaitForJoin();
Your analysis is correct.
delete rdb; on line 2034 will close the DB.
It is then safe to call
rocksdb::Env::Default()->WaitForJoin();
to wait for the threads started by StartThread to terminate, so that the race condition of freeing the thread-local storage after the mutex has been destroyed does not occur.
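Putting that together, the intended shutdown order in rocksdb_done_func would be (a two-line sketch using the names from this thread):

// Sketch only: close the DB first, then wait for Env threads, so no
// RocksDB thread is still running when exit() destroys static objects.
delete rdb;                              // closes the DB (line 2034)
rocksdb::Env::Default()->WaitForJoin();  // join threads started via StartThread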
Could you give me access to the machine again so I can take another look?
Adding the WaitForJoin call didn't help. Looking at where the threads get stuck:
Thread 3 (Thread 0x7f05440e2900 (LWP 21824)):
#0 0x00007f0542a1e66b in pthread_join (threadid=139659837429504, thread_return=0x0) at pthread_join.c:92
#1 0x0000000000eddf07 in rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::JoinAllThreads (this=0x3737860) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:1343
#2 0x0000000000edbaf3 in rocksdb::(anonymous namespace)::PosixEnv::~PosixEnv (this=0x1e27ae0 <rocksdb::Env::Default()::default_env>, __in_chrg=<optimized out>) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:819
#3 0x00007f0541e6c259 in __run_exit_handlers (status=0, listp=0x7f05421ee6c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#4 0x00007f0541e6c2a5 in __GI_exit (status=<optimized out>) at exit.c:104
#5 0x00000000008adb65 in mysqld_exit (exit_code=0) at /home/ubuntu/mysql-5.6-look3/sql/mysqld.cc:1978
#6 0x00000000008b5793 in mysqld_main (argc=87, argv=0x37bb808) at /home/ubuntu/mysql-5.6-look3/sql/mysqld.cc:6581
#7 0x00000000008ab19d in main (argc=13, argv=0x7fff5136d968) at /home/ubuntu/mysql-5.6-look3/sql/main.cc:25
Thread 2 (Thread 0x7f05177fe700 (LWP 21853)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000ede0ab in rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::BGThread (this=0x3737940, thread_id=0) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:1396
#2 0x0000000000ede33f in rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::BGThreadWrapper (arg=0x7f050c07eac0) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:1473
#3 0x00007f0542a1d182 in start_thread (arg=0x7f05177fe700) at pthread_create.c:312
#4 0x00007f0541f2a47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Thread 1 (Thread 0x7f0516ffd700 (LWP 21854)):
#0 0x00007f0542a22621 in __pthread_kill (threadid=<optimized out>, signo=6) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:61
#1 0x0000000000d6efd9 in my_write_core (sig=6) at /home/ubuntu/mysql-5.6-look3/mysys/stacktrace.c:422
#2 0x00000000009cf6e6 in handle_fatal_signal (sig=6) at /home/ubuntu/mysql-5.6-look3/sql/signal_handler.cc:239
#3 <signal handler called>
#4 0x00007f0541e66cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00007f0541e6a0d8 in __GI_abort () at abort.c:89
#6 0x0000000000e92088 in rocksdb::port::PthreadCall (label=0x159953c "lock", result=22) at /home/ubuntu/mysql-5.6-look3/rocksdb/port/port_posix.cc:29
#7 0x0000000000e921a4 in rocksdb::port::Mutex::Lock (this=0x1e29160 <rocksdb::ThreadLocalPtr::StaticMeta::mutex_>) at /home/ubuntu/mysql-5.6-look3/rocksdb/port/port_posix.cc:56
#8 0x0000000000e85aa4 in rocksdb::MutexLock::MutexLock (this=0x7f0516ffcca0, mu=0x1e29160 <rocksdb::ThreadLocalPtr::StaticMeta::mutex_>) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/mutexlock.h:28
#9 0x0000000000f864b6 in rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit (ptr=0x7f04f002bbf0) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/thread_local.cc:118
#10 0x00007f0542a1cf82 in __nptl_deallocate_tsd () at pthread_create.c:158
#11 0x00007f0542a1d195 in start_thread (arg=0x7f0516ffd700) at pthread_create.c:325
#12 0x00007f0541f2a47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
one can notice these frames:
#1 0x0000000000ede0ab in rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::BGThread (this=0x3737940, thread_id=0) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:1396
#2 0x0000000000ede33f in rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::BGThreadWrapper (arg=0x7f050c07eac0) at /home/ubuntu/mysql-5.6-look3/rocksdb/util/env_posix.cc:1473
My hypothesis is that the WaitForJoin call doesn't stop RocksDB's internal thread pool (cont'd).
A proof-of-concept fix that made the crash go away:
At top level (i.e. in MyRocks):
diff --git a/storage/rocksdb/ha_rocksdb.cc b/storage/rocksdb/ha_rocksdb.cc
index 07c0f2e..9f014c3 100644
--- a/storage/rocksdb/ha_rocksdb.cc
+++ b/storage/rocksdb/ha_rocksdb.cc
@@ -2041,6 +2043,9 @@ static int rocksdb_done_func(void *p)
if (table_options.block_cache)
table_options.block_cache->DisownData();
#endif /* HAVE_purify */
+
+ // psergey
+ rocksdb::Env::Default()->WaitForJoin();
DBUG_RETURN(error);
}
Inside RocksDB: I've made PosixEnv::WaitForJoin() stop the thread pool threads, but then I was getting an assert (*) in PosixEnv::~PosixEnv when it tried to stop a thread pool that was already stopped. I solved this by emptying the thread_pools_ array (this fixes the crash; I'm not sure the approach is in line with RocksDB's style):
diff --git a/util/env_posix.cc b/util/env_posix.cc
index b856f1d..081f14b 100644
--- a/util/env_posix.cc
+++ b/util/env_posix.cc
@@ -815,8 +815,10 @@ class PosixEnv : public Env {
for (const auto tid : threads_to_join_) {
pthread_join(tid, nullptr);
}
- for (int pool_id = 0; pool_id < Env::Priority::TOTAL; ++pool_id) {
- thread_pools_[pool_id].JoinAllThreads();
+ //psergey: for (int pool_id = 0; pool_id < Env::Priority::TOTAL; ++pool_id) {
+ for (auto &it: thread_pools_) {
+ //psergey: thread_pools_[pool_id].JoinAllThreads();
+ it.JoinAllThreads();
}
// All threads must be joined before the deletion of
// thread_status_updater_.
@@ -1667,6 +1669,14 @@ void PosixEnv::WaitForJoin() {
for (const auto tid : threads_to_join_) {
pthread_join(tid, nullptr);
}
+//psergey:
+ //for (int pool_id = 0; pool_id < Env::Priority::TOTAL; ++pool_id) {
+ for (auto &it: thread_pools_) {
+ //thread_pools_[pool_id].JoinAllThreads();
+ it.JoinAllThreads();
+ }
+ thread_pools_.clear();
+// : psergey
threads_to_join_.clear();
}
EDIT: (*) The assert was in ThreadPool::JoinAllThreads(), this one: assert(!exit_all_threads_);
@rven1 , access key sent again over email.
Currently on the machine:
mysql-5.6-look3 - current tree.
Running
./mysql-test-run --mysqld=--default-storage-engine=rocksdb --retry=1 --force --retry=1 --max-test-fail=100 --suite=rocksdb rocksdb.cardinality
in this tree will crash.
mysql-5.6-look3/top-level.diff
mysql-5.6-look3/rocksdb-dir.diff
^ the above patches. Applying them to the top level and the rocksdb directory makes the test stop crashing.
Could you review these (or take over, whatever is easier)?
The thread in question, which crashes in rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit, was not started by RocksDB; it would have executed some SQL that touched RocksDB, so thread-local storage for RocksDB would have been created for it. I am debugging on the AWS machine. It is not clear why this doesn't reproduce on Facebook machines. The threads with BGThreadWrapper on the stack are the ones started by RocksDB.
From further analysis, it looks like the thread which caused the abort must have been a MyRocks background thread, and it should have exited prior to the return from "delete rdb;". The abort is caused by one of the MyRocks threads not exiting prior to ~PosixEnv; I think the RocksDB threads are correctly taken care of by the call to "thread_pools_[pool_id].JoinAllThreads();" in ~PosixEnv.
If you agree, I suspect that joining the MyRocks threads prior to the return from rocksdb_done_func should fix the problem. The threads which used RocksDB should all be done by then anyway.
If we wait for the MyRocks threads, no call to WaitForJoin for the RocksDB threads is needed.
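A hedged sketch of what joining the MyRocks background threads in rocksdb_done_func could look like; all names here (stop_background_thread, background_thread_handle, background_mutex, background_cond) are illustrative assumptions, not actual MyRocks identifiers:

#include <pthread.h>

static bool stop_background_thread = false;           // assumed stop flag
static pthread_t background_thread_handle;            // assumed thread handle
static pthread_mutex_t background_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t background_cond = PTHREAD_COND_INITIALIZER;

// Called from rocksdb_done_func() before returning: ask the background
// thread to stop, then join it so its thread-local destructors run
// before exit() tears down RocksDB's static objects.
static void join_myrocks_background_threads() {
  pthread_mutex_lock(&background_mutex);
  stop_background_thread = true;
  pthread_cond_signal(&background_cond);
  pthread_mutex_unlock(&background_mutex);
  pthread_join(background_thread_handle, nullptr);
}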
The test rocksdb.cardinality (and sometimes other tests) fails like this: