Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MyRocks: 1 hour stall in DelayWrite() during bulk load #348

Closed
RickPizzi opened this issue Oct 21, 2016 · 14 comments
Closed

MyRocks: 1 hour stall in DelayWrite() during bulk load #348

RickPizzi opened this issue Oct 21, 2016 · 14 comments

Comments

@RickPizzi
Copy link

Hi,

still trying to load our 830 tables stats server into MyRocks, we noticed a very long (> 1 hr) stall during the bulk load, when it was about to complete. Only 2 threads from myloader were left, but there has been no progress for more than one hour (SST files untouched, by looking at the files modification time).

I ran a pstack to check what was going on (output below). From what I can see, there were 2 threads trying to commit (do_bulk_commit) and a 3rd thread was a background thread trying to persist statistics (persist_stats). Out of the 2 committing thread, one was stuck in DelayWrite() and the other one waiting on a mutex.

After some more time everything magically resolved by itself (not sure if my pstack command was part of the black magic here).

Relevant output from pstack follows.

Thread 17 (Thread 0x7f9f735fe700 (LWP 30412)):
#0 0x00007fa102d4768c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007fa1015aa37c in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /usr/local/lib64/libstdc++.so.6
#2 0x0000000000ca221c in rocksdb::WriteThread::BlockingAwaitState(rocksdb::WriteThread::Writer*, unsigned char) ()
#3 0x0000000000ca2348 in rocksdb::WriteThread::AwaitState(rocksdb::WriteThread::Writer_, unsigned char, rocksdb::WriteThread::AdaptationContext_) ()
#4 0x0000000000ca28a4 in rocksdb::WriteThread::JoinBatchGroup(rocksdb::WriteThread::Writer*) ()
#5 0x0000000000c0927f in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch_, rocksdb::WriteCallback_, unsigned long*, unsigned long, bool) ()
#6 0x0000000000c0b13b in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) ()
#7 0x0000000000baa4cd in myrocks::Rdb_dict_manager::commit(rocksdb::WriteBatch*, bool) ()
#8 0x0000000000badc75 in myrocks::Rdb_ddl_manager::persist_stats(bool) ()
#9 0x0000000000b8803c in myrocks::Rdb_background_thread::run() ()
#10 0x0000000000bdef54 in myrocks::Rdb_thread::thread_func(void*) ()
#11 0x00007fa102d43aa1 in start_thread () from /lib64/libpthread.so.0
#12 0x00007fa100dacaad in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7fa0fc08f700 (LWP 6071)):
#0 0x00007fa100d70bdd in nanosleep () from /lib64/libc.so.6
#1 0x00007fa100da5d94 in usleep () from /lib64/libc.so.6
#2 0x0000000000be895f in rocksdb::DBImpl::DelayWrite(unsigned long) ()
#3 0x0000000000c0abfa in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch_, rocksdb::WriteCallback_, unsigned long*, unsigned long, bool) ()
#4 0x0000000000c0b13b in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) ()
#5 0x0000000000d37838 in rocksdb::TransactionImpl::Commit() ()
#6 0x0000000000b9ec6b in myrocks::Rdb_transaction_impl::commit_no_binlog() ()
#7 0x0000000000b8c25d in myrocks::ha_rocksdb::do_bulk_commit(myrocks::Rdb_transaction*) ()
#8 0x0000000000b92fac in myrocks::ha_rocksdb::update_write_row(unsigned char const_, unsigned char const_, bool) ()
#9 0x0000000000b9301d in myrocks::ha_rocksdb::write_row(unsigned char*) ()
#10 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()
#11 0x0000000000e47f98 in ha_partition::write_row(unsigned char*) ()
#12 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()
#13 0x000000000076daad in write_record(THD_, TABLE_, COPY_INFO_, COPY_INFO_) ()
#14 0x0000000000772f28 in mysql_insert(THD_, TABLE_LIST_, List&, List<List >&, List&, List&, enum_duplicates, bool) ()
#15 0x000000000078e420 in mysql_execute_command(THD_, unsigned long long_, unsigned long long*) ()
#16 0x0000000000793aef in mysql_parse(THD_, char_, unsigned int, Parser_state_, unsigned long long_, char_, std::cxx11::basic_string<char, std::char_traits, std::allocator > const&, char) ()
#17 0x0000000000795442 in dispatch_command(enum_server_command, THD
, char_, unsigned int) ()
#18 0x0000000000753adc in do_handle_one_connection(THD*) ()
#19 0x0000000000753d89 in handle_one_connection ()
#20 0x00007fa102d43aa1 in start_thread () from /lib64/libpthread.so.0
#21 0x00007fa100dacaad in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fa0f42b5700 (LWP 6085)):
#0 0x00007fa102d4768c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007fa1015aa37c in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /usr/local/lib64/libstdc++.so.6
#2 0x0000000000ca221c in rocksdb::WriteThread::BlockingAwaitState(rocksdb::WriteThread::Writer*, unsigned char) ()
#3 0x0000000000ca2348 in rocksdb::WriteThread::AwaitState(rocksdb::WriteThread::Writer_, unsigned char, rocksdb::WriteThread::AdaptationContext_) ()
#4 0x0000000000ca28a4 in rocksdb::WriteThread::JoinBatchGroup(rocksdb::WriteThread::Writer*) ()
#5 0x0000000000c0927f in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch_, rocksdb::WriteCallback_, unsigned long*, unsigned long, bool) ()
#6 0x0000000000c0b13b in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) ()
#7 0x0000000000d37838 in rocksdb::TransactionImpl::Commit() ()
#8 0x0000000000b9ec6b in myrocks::Rdb_transaction_impl::commit_no_binlog() ()
#9 0x0000000000b8c25d in myrocks::ha_rocksdb::do_bulk_commit(myrocks::Rdb_transaction*) ()
#10 0x0000000000b92fac in myrocks::ha_rocksdb::update_write_row(unsigned char const_, unsigned char const_, bool) ()
#11 0x0000000000b9301d in myrocks::ha_rocksdb::write_row(unsigned char*) ()
#12 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()
#13 0x0000000000e47f98 in ha_partition::write_row(unsigned char*) ()
#14 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()
#15 0x000000000076daad in write_record(THD_, TABLE_, COPY_INFO_, COPY_INFO_) ()
#16 0x0000000000772f28 in mysql_insert(THD_, TABLE_LIST_, List&, List<List >&, List&, List&, enum_duplicates, bool) ()
#17 0x000000000078e420 in mysql_execute_command(THD_, unsigned long long_, unsigned long long*) ()
#18 0x0000000000793aef in mysql_parse(THD_, char_, unsigned int, Parser_state_, unsigned long long_, char_, std::cxx11::basic_string<char, std::char_traits, std::allocator > const&, char) ()
#19 0x0000000000795442 in dispatch_command(enum_server_command, THD
, char_, unsigned int) ()
#20 0x0000000000753adc in do_handle_one_connection(THD*) ()
#21 0x0000000000753d89 in handle_one_connection ()
#22 0x00007fa102d43aa1 in start_thread () from /lib64/libpthread.so.0
#23 0x00007fa100dacaad in clone () from /lib64/libc.so.6

@RickPizzi
Copy link
Author

This just happened again, about 20 minutes later, with identical stack traces

@RickPizzi
Copy link
Author

Relevant config

rocksdb
skip-innodb
default-storage-engine=rocksdb
default-tmp-storage-engine=MyISAM
collation-server=latin1_bin
binlog_format = ROW

rocksdb_max_open_files=-1
rocksdb_base_background_compactions=1
rocksdb_max_total_wal_size=4G
rocksdb_block_size=16384
rocksdb_block_cache_size=16G
rocksdb_table_cache_numshardbits=6
rocksdb_skip_unique_check=1
rocksdb_commit_in_the_middle=1
rocksdb_write_disable_wal=1
rocksdb_max_background_flushes=40
rocksdb_max_background_compactions=40
rocksdb_default_cf_options=write_buffer_size=128m;level0_file_num_compaction_trigger=4;level0_slowdown_writes_trigger=256;level0_stop_writes_trigger=256;max_write_buffer_number=16;memtable=vector:1024

@RickPizzi
Copy link
Author

RickPizzi commented Oct 21, 2016

Can it perhaps be related to rocksdb_stats_dump_period_sec value? It is happening at regular intervals it seems. Mind you, MyRocks has 32100+ files open at this time (I opened a different issue about this)

@yoshinorim
Copy link
Contributor

Could you paste "show global status like '%hash%';" output? We had RocksDB stall bugs at DelayWrite() before, but newer rocksdb revision fixed known bugs.

@RickPizzi
Copy link
Author

RickPizzi commented Oct 21, 2016

Unfortunately my installation doesn't seem to have this info for some reason....

mysql> show global status like '%hash%';
+------------------+--------------------+
| Variable_name | Value |
+------------------+--------------------+
| git_hash | @MYSQL_GIT_HASH@ |
| rocksdb_git_hash | @ROCKSDB_GIT_HASH@ |
+------------------+--------------------+
2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?

@jkedgar
Copy link
Contributor

jkedgar commented Oct 21, 2016

The cmake process is supposed to assign values to MYSQL_GIT_HASH and ROCKSDB_GIT_HASH but that doesn't seem to have occurred in this case.

Yes - send the git hash from your repository and from the rocksdb submodule.

@RickPizzi
Copy link
Author

Dunno why, I followed instructions found on your wiki page here....
anyways, git rev-parse HEAD gives: 86587af
I'm not git expert so if I need to run some other command please let me know.

@mdcallag
Copy link
Contributor

The first commits from this would be useful
cd $mysql-src; git log
cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com
wrote:

Unfortunately my installation doesn't seem to have this info for some
reason....

mysql> show global status like '%hash%';
+------------------+--------------------+
| Variable_name | Value |
+------------------+--------------------+
| git_hash | @MYSQL_GIT_HASH@ |
| rocksdb_git_hash | @ROCKSDB_GIT_HASH@ |
+------------------+--------------------+
2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#348 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC
.

Mark Callaghan
mdcallag@gmail.com

@RickPizzi
Copy link
Author

Mark: by �first commits", you mean more recent ones?

Thanks

Rick

Riccardo Pizzi
pizzi@leopardus.com
http://www.vecchiflipper.it
http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:19, Mark Callaghan notifications@github.com wrote:

The first commits from this would be useful
cd $mysql-src; git log
cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com
wrote:

Unfortunately my installation doesn't seem to have this info for some
reason....

mysql> show global status like '%hash%';
+------------------+--------------------+
| Variable_name | Value |
+------------------+--------------------+
| git_hash | @MYSQL_GIT_HASH@ |
| rocksdb_git_hash | @ROCKSDB_GIT_HASH@ |
+------------------+--------------------+
2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#348 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC
.

Mark Callaghan
mdcallag@gmail.com

You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.

@mdcallag
Copy link
Contributor

Please share the first (top) commit it shows.

On Tue, Oct 25, 2016 at 11:23 AM, Rick Pizzi notifications@github.com
wrote:

Mark: by �first commits", you mean more recent ones?

Thanks

Rick

Riccardo Pizzi
pizzi@leopardus.com
http://www.vecchiflipper.it
http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:19, Mark Callaghan notifications@github.com wrote:

The first commits from this would be useful
cd $mysql-src; git log
cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com
wrote:

Unfortunately my installation doesn't seem to have this info for some
reason....

mysql> show global status like '%hash%';
+------------------+--------------------+
| Variable_name | Value |
+------------------+--------------------+
| git_hash | @MYSQL_GIT_HASH@ |
| rocksdb_git_hash | @ROCKSDB_GIT_HASH@ |
+------------------+--------------------+
2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#348
issuecomment-255435150>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/
ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC>
.

Mark Callaghan
mdcallag@gmail.com

You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#348 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABkKTZh-3XG9j4uyatvgvjrVOVyV5xT6ks5q3kkGgaJpZM4KdAAC
.

Mark Callaghan
mdcallag@gmail.com

@RickPizzi
Copy link
Author

Here you go.

pwd

/home/rpizzi/mysql-fb

git log | head -100

commit 86587af
Author: Tian Xia tianx@fb.com
Date: Tue Oct 4 10:01:52 2016 -0700

Allow filtering of show commands through admission control

Summary:
Added SHOW command filter in admission control.

Squash with https://reviews.facebook.net/D55185
Squash with https://reviews.facebook.net/D48405

Test Plan: mtr

Reviewers: santoshb, jkedgar

Reviewed By: jkedgar

Subscribers: webscalesql-eng

Differential Revision: https://reviews.facebook.net/D64629

commit 8959b43
Author: Anirban Rahut arahut@fb.com
was built with so we can skip tests that are known to fail without full regex support

Test Plan: Run rocksdb.collation on the regular build and see that it passes.  Then run it on the --gcc48 build and see that it gets skipped.

Reviewers: hermanlee4, mung, gunnarku

Reviewed By: mung

Subscribers: webscalesql-eng

Differential Revision: https://reviews.facebook.net/D64515

commit b0904cc
Author: Alex Yang alexyang@fb.com
Date: Fri Aug 5 13:29:40 2016 -0700

Add sstfilewriter api improvements to fast secondary index creation

Summary:
This diff utilizes the recently implemented SSTFileWriter API for fast secondary index creation.

Writing to SST files directly requires the records of each secondary key to be written in order, to do this we need to leverage an external sort as the index is often larger than what can be held memory.

We iterate through the primary key entries, writing the offsets of each entry into a balanced binary tree (std::set, which uses red-black tree under the hood).  The tree itself uses a custom comparator which does a byte-wise comparison of records during insertion, which it uses to maintain the ordering of the tree.

Once we have reached the limit of what our sort buffer can hold in memory (RDB_MERGE_BUF_SIZE), we write to a tmp file out to disk in sorted order by iterating through the sorted "offset tree".  We continue to do this until all entries in the primary key have been traversed.  At this point we have several sorted chunks in a tmp_file on disk.

pwd

/home/rpizzi/mysql-fb/rocksdb

git log | head -100

commit 6a17b07ca856e573fabd6345d70787d4e481f57b
Author: Islam AbdelRahman tec@fb.com
Date: Fri Aug 19 16:17:56 2016 -0700

Add TablePropertiesCollector support in SstFileWriter

Summary: Update SstFileWriter to use user TablePropertiesCollectors that are passed in Options

Test Plan: unittests

Reviewers: sdong

Reviewed By: sdong

Subscribers: jkedgar, andrewkr, hermanlee4, dhruba, yoshinorim

Differential Revision: https://reviews.facebook.net/D62253

  make all check -j32
  ROCKSDB_FBCODE_BUILD_WITH_481=1 make all check -j32
  ROCKSDB_NO_FBCODE=1 make all check -j32

  export TBB_BASE=/mnt/gvfs/third-party2/tbb/afa54b33cfcf93f1d90a3160cdb894d6d63d5dca/4.0_update2/gcc-4.9-glibc-2.20/e9936bf;
  ROCKSDB_NO_FBCODE=1 CFLAGS="-I $TBB_BASE/include" LDFLAGS="-L $TBB_BASE/lib -Wl,-rpath=$TBB_BASE/lib" make all check -j32

Reviewers: IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: kradhakrishnan, yhchiang, IslamAbdelRahman, andrewkr, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D56979

commit 49d88be021bb83ddc4b8fe44dbf8f28fc00868b5
Author: Jay busyjaylee@gmail.com
Date: Thu Aug 18 09:48:43 2016 +0800

c abi: allow compaction filter ignore snapshot (#1268)

close #1262

commit 0b63f51fbc4321bf67175be5c2b12f2884f23f90
Author: Alexander Jipa alexander.jipa@gmail.com
Date: Wed Aug 17 19:59:22 2016 -0400

fixes 1215: execute_process(COMMAND mkdir ${DIR}) fails to create a directory with cmake on Windows (#1219)

CMake - Use a platform-neutral mkdir function

commit 3981345be191dc676833341bac99372dc7940b86

Riccardo Pizzi
pizzi@leopardus.com
http://www.vecchiflipper.it
http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:29, Mark Callaghan notifications@github.com wrote:

Please share the first (top) commit it shows.

On Tue, Oct 25, 2016 at 11:23 AM, Rick Pizzi notifications@github.com
wrote:

Mark: by �first commits", you mean more recent ones?

Thanks

Rick

Riccardo Pizzi
pizzi@leopardus.com
http://www.vecchiflipper.it
http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:19, Mark Callaghan notifications@github.com wrote:

The first commits from this would be useful
cd $mysql-src; git log
cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com
wrote:

Unfortunately my installation doesn't seem to have this info for some
reason....

mysql> show global status like '%hash%';
+------------------+--------------------+
| Variable_name | Value |
+------------------+--------------------+
| git_hash | @MYSQL_GIT_HASH@ |
| rocksdb_git_hash | @ROCKSDB_GIT_HASH@ |
+------------------+--------------------+
2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#348
issuecomment-255435150>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/
ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC>
.

Mark Callaghan
mdcallag@gmail.com

You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#348 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABkKTZh-3XG9j4uyatvgvjrVOVyV5xT6ks5q3kkGgaJpZM4KdAAC
.

Mark Callaghan
mdcallag@gmail.com

You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.

@yoshinorim
Copy link
Contributor

@RickPizzi We added new (much more efficient) data loading feature and fixed all known bugs. Could you please pull all changes (git pull --rebase), build, and try loading data again (See load optimizations at https://github.com/facebook/mysql-5.6/wiki/data-loading)? DelayWrite() stall is expected to disappear with our new data loading feature.

@RickPizzi
Copy link
Author

Sure! Will do and report results back. Thank you!

Riccardo Pizzi
pizzi@leopardus.com
http://www.vecchiflipper.it
http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 31/ott/2016, at 22:55, Yoshinori Matsunobu notifications@github.com wrote:

@RickPizzi We added new (much more efficient) data loading feature and fixed all known bugs. Could you please pull all changes (git pull --rebase), build, and try loading data again (See load optimizations at https://github.com/facebook/mysql-5.6/wiki/data-loading)? DelayWrite() stall is expected to disappear with our new data loading feature.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

@gunnarku
Copy link

gunnarku commented Apr 6, 2017

Please re-open if you still the issue.

@gunnarku gunnarku closed this as completed Apr 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants