MySQLOnRocksDB/mysql-5.6
forked from facebook/mysql-5.6

rocksdb.rocksdb test asserts with message "Failed to get column family info from index id" #103
@spetrunia I'm aware of this issue, but I couldn't reproduce it on my machine. Could you please send me all the RocksDB data files (under $datadir/.rocksdb)?
This is the issue we mentioned at the last meeting. It is very important to fix, but we don't have any leads right now. Sergey, it would be great if you could share your data directory and any other pertinent information.
It is also tracked by t7984062
The abort started happening after I committed the drop_table refactoring diff -- https://reviews.facebook.net/D42261 . I still have no idea whether the diff itself has a bug, or whether a bug had existed for a long time and the diff started catching it and raising the assertion. I couldn't reproduce the error on my machine, so the sst/wal files from a failed run would be very helpful.
A log of mysql-test-run that caused the crash:
https://gist.github.com/spetrunia/f25066026c1c5c15f416
Data directory: http://s.petrunia.net/scratch/issue103-var-log-rocksdb-rocksdb.tgz
Whole mysql-test/var, just in case: http://s.petrunia.net/scratch/issue103-var.tgz
Thanks Sergey!
From error log:
2015-08-29 00:24:17 3207 [Note] RocksDB: Begin filtering dropped index 260
2015-08-29 00:24:17 3207 [Note] RocksDB: Finished filtering dropped index 260
2015-08-29 00:24:26 3242 [ERROR] RocksDB: Failed to get column family info from index id 260. MyRocks data dictionary may get corrupted.
So index 260 (0x104) caused the problem.
sst_dump showed that the dictionary index_cf_mapping (0x2) record was removed correctly, but the drop_index (0x5) record was not removed. This is definitely a data inconsistency.
'0000000200000104' @ 18 : 0 =>
'0000000500000104' @ 19 : 1 => 0001
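To read the two sst_dump keys above, here is a small sketch that splits a key into its two fields. It assumes, based only on this dump, that a dictionary key is a 4-byte big-endian record type followed by a 4-byte big-endian index id. DictKey and parse_dict_key are hypothetical names for illustration, not MyRocks functions.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of MyRocks. Assumes the key layout seen in
// the dump above: 8 hex digits of record type, then 8 hex digits of index id.
struct DictKey {
  uint32_t record_type;  // per this thread: 0x2 = index_cf_mapping, 0x5 = drop_index
  uint32_t index_id;     // the index number that the error log prints in decimal
};

// Parses a 16-hex-digit key exactly as sst_dump prints it (without quotes).
DictKey parse_dict_key(const std::string& hex) {
  if (hex.size() != 16) {
    throw std::invalid_argument("expected 16 hex digits");
  }
  DictKey key;
  key.record_type =
      static_cast<uint32_t>(std::stoul(hex.substr(0, 8), nullptr, 16));
  key.index_id =
      static_cast<uint32_t>(std::stoul(hex.substr(8, 8), nullptr, 16));
  return key;
}
```

Under that assumption, parse_dict_key("0000000500000104") gives record type 5 and index id 260, which matches the index number in the error log.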
Relevant code is here -- https://github.com/MySQLOnRocksDB/mysql-5.6/blob/webscalesql-5.6.24.97/storage/rocksdb/rdb_dropped_indices.cc#L165-L172
Looks like signal_drop_index_thread() has to be called after dictionary commit -- https://github.com/MySQLOnRocksDB/mysql-5.6/blob/webscalesql-5.6.24.97/storage/rocksdb/ha_rocksdb.cc#L5176-L5184
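A toy sketch of the ordering suggested above: make the dictionary change visible first, and only then wake the background thread. This is not the MyRocks code. ToyState and drop_then_signal are hypothetical names, and a std::set stands in for the persisted drop_index records.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <set>
#include <thread>

// Hypothetical stand-in for the shared state between the DDL thread and the
// background drop-index thread.
struct ToyState {
  std::mutex mu;
  std::condition_variable cv;
  bool signaled = false;
  std::set<uint32_t> committed_drops;  // stands in for committed dictionary records
};

// Returns true if the background thread, once woken, found the drop record.
bool drop_then_signal(uint32_t index_id) {
  ToyState st;
  bool seen = false;

  std::thread worker([&] {
    std::unique_lock<std::mutex> lk(st.mu);
    st.cv.wait(lk, [&] { return st.signaled; });
    // The worker reads the dictionary only after it has been signaled.
    seen = st.committed_drops.count(index_id) > 0;
  });

  {
    std::lock_guard<std::mutex> lk(st.mu);
    st.committed_drops.insert(index_id);  // step 1: "commit" the dictionary change
  }
  {
    std::lock_guard<std::mutex> lk(st.mu);
    st.signaled = true;  // step 2: signal only after the commit is visible
  }
  st.cv.notify_one();

  worker.join();
  return seen;
}
```

If steps 1 and 2 were swapped, the worker could wake up and read the dictionary before the record is there, which is the kind of window the comment above is pointing at.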
Found root cause
Every once in a while rocksdb.rocksdb test crashes like this:
The crash only happens as part of a big mysql-test-run run with --mem and --parallel=4 or more.
The crash seems to happen when mysqltest is executing these lines: