rocksdb.rocksdb test asserts with message "Failed to get column family info from index id" #103

Closed
spetrunia opened this Issue · 8 comments

4 participants

@spetrunia
Collaborator

Every once in a while the rocksdb.rocksdb test crashes like this:

2015-08-27 13:36:28 29451 [ERROR] RocksDB: Failed to get column family info from index id 278. MyRocks data dictionary may get corrupted.

The current stack traces:
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld(my_pstack+0xbb) [0xd504a0]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld(_ZN23Dropped_indices_manager10set_insertEjPKc+0x53) [0xd9d699]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld(_ZN23Dropped_indices_manager4initEP17Table_ddl_managerP12Dict_manager+0xd2) [0xd9d542]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld() [0xd64e35]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0xdd) [0x8d1cd7]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld() [0xa864fa]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld(_Z11plugin_initPiPPci+0x72b) [0xa86ee2]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld() [0x8b95b7]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld(_Z11mysqld_mainiPPc+0x34a) [0x8ba768]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld(main+0x20) [0x8b0a2d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f126006aec5]
/home/psergey/dev-git/mysql-5.6-rocksdb-new-locking-r7/sql/mysqld() [0x8b0949]

The crash only happens as part of a big mysql-test-run run with --mem and --parallel=4 or more.

The crash seems to happen when mysqltest is executing these lines:

insert into t1 values (null);
--source include/restart_mysqld.inc
@yoshinorim
Owner

@spetrunia I'm aware of this issue but I couldn't reproduce it on my machine. Could you please send me all rocksdb data files (under $datadir/.rocksdb)?

@maykov
Owner

This is the issue which we mentioned at the last meeting. It is very important to fix, but we don't have any leads right now. Sergey, it would be great if you could share your data directory and any other pertinent information.
It is also tracked by t7984062.

@hermanlee hermanlee was assigned by maykov
@yoshinorim
Owner

The abort started happening after I committed the drop_table refactoring diff -- https://reviews.facebook.net/D42261 . I still have no idea whether the diff itself has a bug, or whether a bug had existed for a long time and the diff started catching it and raising the assertion. I couldn't reproduce the error on my machine, so the sst/wal files from a failed run would be very helpful.

@spetrunia
Collaborator
@yoshinorim
Owner

Thanks Sergey!

From error log:
2015-08-29 00:24:17 3207 [Note] RocksDB: Begin filtering dropped index 260
2015-08-29 00:24:17 3207 [Note] RocksDB: Finished filtering dropped index 260
2015-08-29 00:24:26 3242 [ERROR] RocksDB: Failed to get column family info from index id 260. MyRocks data dictionary may get corrupted.

So index 260 (0x104 in hex) caused the problem.

sst_dump showed that the index_cf_mapping dictionary entry (0x2) was removed correctly, but the drop_index entry (0x5) was not removed. This is definitely a data inconsistency.
'0000000200000104' @ 18 : 0 =>
'0000000500000104' @ 19 : 1 => 0001
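
For anyone reading the dump, here is a small sketch of how the two keys above can be decoded. It assumes each key is two 4-byte big-endian integers (dictionary entry type, then index id); that layout is inferred from the dump itself, not taken from the MyRocks source. The helper name and the type table are made up for illustration, and the table only lists the two types named in this thread.

```python
import struct

# Only the two dictionary entry types mentioned in this thread (assumed labels).
DICT_TYPES = {0x2: "index_cf_mapping", 0x5: "drop_index"}


def decode_dict_key(hex_key):
    """Split a 16-hex-digit key into (entry type name, index id).

    Assumes two 4-byte big-endian unsigned integers, as the dump suggests.
    """
    dict_type, index_id = struct.unpack(">II", bytes.fromhex(hex_key))
    return DICT_TYPES.get(dict_type, hex(dict_type)), index_id


for key in ("0000000200000104", "0000000500000104"):
    print(key, decode_dict_key(key))
```

Under that assumption both keys refer to index id 0x104, which is 260 in decimal and matches the id in the error log.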

Relevant code is here -- https://github.com/MySQLOnRocksDB/mysql-5.6/blob/webscalesql-5.6.24.97/storage/rocksdb/rdb_dropped_indices.cc#L165-L172

@yoshinorim
Owner

Looks like signal_drop_index_thread() has to be called after the dictionary commit -- https://github.com/MySQLOnRocksDB/mysql-5.6/blob/webscalesql-5.6.24.97/storage/rocksdb/ha_rocksdb.cc#L5176-L5184
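
A toy model of why the ordering could matter. This is not MyRocks code: every name below is hypothetical, and it assumes one particular interleaving, namely that the signalled background thread finishes its cleanup before the DDL's own dictionary batch is committed. The background thread is simulated by a plain function call so the result is deterministic.

```python
INDEX_CF_MAPPING = 0x2  # entry types as named in the sst_dump comment
DROP_INDEX = 0x5


class ToyDict:
    """Stand-in for the data dictionary: a key/value store with batched writes."""

    def __init__(self):
        self.store = {}

    def commit(self, batch):
        for op, key, value in batch:
            if op == "put":
                self.store[key] = value
            else:  # "delete"
                self.store.pop(key, None)


def cleanup_dropped_index(d, index_id):
    """What the (hypothetical) background thread does once filtering is done."""
    d.commit([
        ("delete", (INDEX_CF_MAPPING, index_id), None),
        ("delete", (DROP_INDEX, index_id), None),
    ])


def drop_index(d, index_id, signal_before_commit):
    # The DDL records the ongoing drop in its own, not yet committed, batch.
    batch = [("put", (DROP_INDEX, index_id), b"\x00\x01")]
    if signal_before_commit:
        cleanup_dropped_index(d, index_id)  # assumed worst case: runs at once
        d.commit(batch)                     # drop_index entry reappears
    else:
        d.commit(batch)
        cleanup_dropped_index(d, index_id)  # sees and removes both entries


def leftover(signal_before_commit):
    d = ToyDict()
    d.commit([("put", (INDEX_CF_MAPPING, 260), b"cf")])
    drop_index(d, 260, signal_before_commit)
    return sorted(d.store)


print("signal before commit:", leftover(True))
print("signal after commit: ", leftover(False))
```

In this model, signalling first leaves a drop_index entry behind with no matching index_cf_mapping entry, which is the same shape as the sst_dump output above. Signalling after the commit leaves nothing behind.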

@hermanlee hermanlee was unassigned by yoshinorim
@yoshinorim yoshinorim self-assigned this
@yoshinorim
Owner

Found root cause

@yoshinorim yoshinorim closed this