
Server crashes in RDBSE_KEYDEF::get_primary_key_tuple #70

Closed
elenst opened this Issue · 5 comments

3 participants

@elenst

Stack trace below is from commit fe4820c, built with
cmake . -DCMAKE_BUILD_TYPE=Debug -DWITH_SSL:STRING=system -DWITH_ZLIB:STRING=system -DMYSQL_MAINTAINER_MODE=1

#3
#4 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:33
#5 0x0000000000ee0b76 in RDBSE_KEYDEF::get_primary_key_tuple (this=0x2348d60, pk_descr=0x2420360, key=0x7f442c6bd520, pk_buffer=0x245ab40 "") at storage/rocksdb/rdb_datadic.cc:353
#6 0x0000000000ebda5c in ha_rocksdb::index_read_map (this=0x242bd90, buf=0x242b9b8 "\373\002", key=0x23d31b0 "", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at storage/rocksdb/ha_rocksdb.cc:3163
#7 0x00000000007355cd in handler::ha_index_read_map (this=0x242bd90, buf=0x242b9b8 "\373\002", key=0x23d31b0 "", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at sql/handler.cc:2925
#8 0x0000000001066976 in ha_partition::handle_unordered_scan_next_partition (this=0x242b3c0, buf=0x242b9b8 "\373\002") at sql/ha_partition.cc:6061
#9 0x0000000001065780 in ha_partition::common_index_read (this=0x242b3c0, buf=0x242b9b8 "\373\002", have_start_key=true) at sql/ha_partition.cc:5442
#10 0x000000000106548d in ha_partition::index_read_map (this=0x242b3c0, buf=0x242b9b8 "\373\002", key=0x23d31b0 "", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at sql/ha_partition.cc:5358
#11 0x00000000007355cd in handler::ha_index_read_map (this=0x242b3c0, buf=0x242b9b8 "\373\002", key=0x23d31b0 "", keypart_map=1, find_flag=HA_READ_KEY_EXACT) at sql/handler.cc:2925
#12 0x000000000089ce7f in join_read_always_key (tab=0x23d2d18) at sql/sql_executor.cc:2244
#13 0x000000000089adc7 in sub_select (join=0x23eafb8, join_tab=0x23d2d18, end_of_records=false) at sql/sql_executor.cc:1294
#14 0x000000000089a6c5 in do_select (join=0x23eafb8) at sql/sql_executor.cc:950
#15 0x00000000008986cc in JOIN::exec (this=0x23eafb8) at sql/sql_executor.cc:207
#16 0x00000000008f5d05 in mysql_execute_select (thd=0x2383380, select_lex=0x2386ac8, free_join=true) at sql/sql_select.cc:1133
#17 0x00000000008f5fb3 in mysql_select (thd=0x2383380, tables=0x239b938, wild_num=0, fields=..., conds=0x23eae20, order=0x2386c90, group=0x2386bc8, having=0x0, select_options=2147748608, result=0x239d5f8, unit=0x2386470, select_lex=0x2386ac8) at sql/sql_select.cc:1254
#18 0x00000000008f4106 in handle_select (thd=0x2383380, result=0x239d5f8, setup_tables_done_option=0) at sql/sql_select.cc:126
#19 0x00000000008ceb22 in execute_sqlcom_select (thd=0x2383380, all_tables=0x239b938, last_timer=0x7f442c6bfbe0) at sql/sql_parse.cc:5721
#20 0x00000000008c7679 in mysql_execute_command (thd=0x2383380, statement_start_time=0x7f442c6bec18, post_parse=0x7f442c6bfbe0) at sql/sql_parse.cc:3129
#21 0x00000000008d167a in mysql_parse (thd=0x2383380, rawbuf=0x239b6b0 "SELECT f1 FROM t1 WHERE f2 = ( SELECT f1 FROM t2 WHERE pk = 2 )", length=63, parser_state=0x7f442c6bf540, last_timer=0x7f442c6bfbe0, async_commit=0x7f442c6bfbdf "") at sql/sql_parse.cc:7002
#22 0x00000000008c3a63 in dispatch_command (command=COM_QUERY, thd=0x2383380, packet=0x2396f41 "", packet_length=63) at sql/sql_parse.cc:1515
#23 0x00000000008c2733 in do_command (thd=0x2383380) at sql/sql_parse.cc:1065
#24 0x000000000088ca98 in do_handle_one_connection (thd_arg=0x2383380) at sql/sql_connect.cc:1021
#25 0x000000000088c582 in handle_one_connection (arg=0x2383380) at sql/sql_connect.cc:929
#26 0x00007f442f5d10a4 in start_thread (arg=0x7f442c6c0700) at pthread_create.c:309
#27 0x00007f442d40a04d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Test case:

--source include/have_partition.inc

CREATE TABLE t1 (pk INT PRIMARY KEY, f1 INT, f2 INT, KEY(f2)) ENGINE=RocksDB
PARTITION BY HASH(pk) PARTITIONS 2;
INSERT INTO t1 VALUES (1, 6, NULL), (2, NULL, 1);

CREATE TABLE t2 (pk INT PRIMARY KEY, f1 INT) ENGINE=RocksDB;
INSERT INTO t2 VALUES (1, 1), (2, 1);

SELECT f1 FROM t1 WHERE f2 = ( SELECT f1 FROM t2 WHERE pk = 2 );


If the test case does not cause a crash for you, try running it with valgrind. On one of my machines the test case itself does not crash, but the server crashes later, on shutdown, with this:

#5 0x00007f9bd459b83b in __GI_abort () at abort.c:91
#6 0x00007f9bd45d561e in __libc_message (do_abort=2, fmt=0x7f9bd46dfb40 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:201
#7 0x00007f9bd45dfe16 in malloc_printerr (action=3, str=0x7f9bd46dfd08 "free(): invalid next size (fast)", ptr=) at malloc.c:5047
#8 0x0000000000ba8a14 in my_free ()
#9 0x0000000000bcf062 in ha_rocksdb::close() () at storage/rocksdb/ha_rocksdb.cc:2374
#10 0x000000000072ba11 in handler::ha_close() () at sql/handler.cc:2717
#11 0x000000000108c1db in ha_partition::close() () at sql/ha_partition.cc:3638
#12 0x000000000072ba11 in handler::ha_close() () at sql/handler.cc:2717
#13 0x000000000099303d in closefrm(TABLE*, bool) () at sql/table.cc:2630
#14 0x000000000085a468 in intern_close_table(TABLE*) () at sql/sql_base.cc:821
#15 0x000000000099d4b2 in Table_cache::free_all_unused_tables() () at sql/table_cache.cc:159
#16 0x000000000099dc58 in Table_cache_manager::free_all_unused_tables() () at sql/table_cache.cc:394
#17 0x000000000085a78e in close_cached_tables(THD*, TABLE_LIST*, bool, unsigned long) () at sql/sql_base.cc:927
#18 0x00000000008595ba in table_def_start_shutdown() () at sql/sql_base.cc:408
#19 0x000000000070903a in clean_up(bool) () at sql/mysqld.cc:2030
#20 0x0000000000708c01 in unireg_end() () at sql/mysqld.cc:1885
#21 0x0000000000708b44 in kill_server(void*) () at sql/mysqld.cc:1813
#22 0x0000000000708b69 in kill_server_thread () at sql/mysqld.cc:1836
#23 0x00007f9bd51b8e9a in start_thread (arg=0x7f9bc97fa700) at pthread_create.c:308
#24 0x00007f9bd46558bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

But if the test is run with valgrind, it produces this and a lot more:

==24895== Thread 23:
==24895== Conditional jump or move depends on uninitialised value(s)
==24895== at 0x4C2CF3A: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24895== by 0xBF2897: RDBSE_KEYDEF::get_primary_key_tuple(RDBSE_KEYDEF*, rocksdb::Slice const*, char*) (rdb_datadic.cc:353)
==24895== by 0xBD0FEB: ha_rocksdb::index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function) (ha_rocksdb.cc:3163)
==24895== by 0x72C1C0: handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function) (handler.cc:2925)
==24895== by 0x1091007: ha_partition::handle_unordered_scan_next_partition(unsigned char*) (ha_partition.cc:6061)
==24895== by 0x108FD95: ha_partition::common_index_read(unsigned char*, bool) (ha_partition.cc:5442)
==24895== by 0x108FA9C: ha_partition::index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function) (ha_partition.cc:5358)
==24895== by 0x72C1C0: handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function) (handler.cc:2925)
==24895== by 0x89F819: join_read_always_key(st_join_table*) (sql_executor.cc:2244)
==24895== by 0x89D71C: sub_select(JOIN*, st_join_table*, bool) (sql_executor.cc:1294)
==24895== by 0x89CFD3: do_select(JOIN*) (sql_executor.cc:950)
==24895== by 0x89AF2E: JOIN::exec() (sql_executor.cc:207)
==24895== by 0x8FC83E: mysql_execute_select(THD*, st_select_lex*, bool) (sql_select.cc:1133)
==24895== by 0x8FCB22: mysql_select(THD*, TABLE_LIST*, unsigned int, List&, Item*, SQL_I_List*, SQL_I_List*, Item*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:1254)
==24895== by 0x8FA993: handle_select(THD*, select_result*, unsigned long) (sql_select.cc:126)
==24895== by 0x8D3C98: execute_sqlcom_select(THD*, TABLE_LIST*, unsigned long long*) (sql_parse.cc:5721)

@elenst

More stack traces that very similar or identical test cases sometimes produce (the choice of crash seems somewhat random):

#3
#4 __GI___libc_free (mem=0x2801ee1000007f01) at malloc.c:2970
#5 0x0000000000bfe3be in Apply_changes_iter::~Apply_changes_iter() () at storage/rocksdb/rdb_applyiter.cc:39
#6 0x0000000000bd4183 in ha_rocksdb::index_end() () at storage/rocksdb/ha_rocksdb.cc:4307
#7 0x000000000072bd4b in handler::ha_index_end() () at sql/handler.cc:2791
#8 0x000000000108f995 in ha_partition::index_end() () at sql/ha_partition.cc:5315
#9 0x000000000072bd4b in handler::ha_index_end() () at sql/handler.cc:2791
#10 0x0000000000736925 in handler::ha_index_or_rnd_end() () at sql/handler.h:2098
#11 0x0000000000a56992 in QUICK_RANGE_SELECT::range_end() () at sql/opt_range.cc:1440
#12 0x0000000000a569f9 in QUICK_RANGE_SELECT::~QUICK_RANGE_SELECT() () at sql/opt_range.cc:1452
#13 0x0000000000a56b50 in QUICK_RANGE_SELECT::~QUICK_RANGE_SELECT() () at sql/opt_range.cc:1468
#14 0x00000000008a5da7 in SQL_SELECT::set_quick(QUICK_SELECT_I*) () at sql/opt_range.h:933
#15 0x0000000000a56595 in SQL_SELECT::cleanup() () at sql/opt_range.cc:1354
#16 0x0000000000a5661c in SQL_SELECT::~SQL_SELECT() () at sql/opt_range.cc:1368
#17 0x0000000000a9ac9c in mysql_delete(THD*, TABLE_LIST*, Item*, SQL_I_List*, unsigned long long, unsigned long long) () at sql/sql_delete.cc:313
#18 0x00000000008ceb67 in mysql_execute_command(THD*, unsigned long long*, unsigned long long*) () at sql/sql_parse.cc:4076
#19 0x00000000008d68da in mysql_parse(THD*, char*, unsigned int, Parser_state*, unsigned long long*, char*) () at sql/sql_parse.cc:7002
#20 0x00000000008c86fb in dispatch_command(enum_server_command, THD*, char*, unsigned int) () at sql/sql_parse.cc:1515
#21 0x00000000008c7345 in do_command(THD*) () at sql/sql_parse.cc:1065
#22 0x000000000088ea96 in do_handle_one_connection(THD*) () at sql/sql_connect.cc:1021
#23 0x000000000088e549 in handle_one_connection () at sql/sql_connect.cc:929
#24 0x00007fc566705e9a in start_thread (arg=0x7fc55c529700) at pthread_create.c:308
#25 0x00007fc565ba28bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112


storage/rocksdb/ha_rocksdb.cc:2156: int ha_rocksdb::convert_record_from_storage_format(uchar*):
Assertion `reader.remaining_bytes() == 0' failed.
21:21:24 UTC - mysqld got signal 6 ;

#6 0x00007fa3b5f84d9e in __assert_fail_base (fmt=, assertion=0x12c8ebd "reader.remaining_bytes() == 0", file=0x12c7748 "storage/rocksdb/ha_rocksdb.cc", line=, function=) at assert.c:94
#7 0x00007fa3b5f84e42 in __GI___assert_fail (assertion=0x12c8ebd "reader.remaining_bytes() == 0", file=0x12c7748 "storage/rocksdb/ha_rocksdb.cc", line=2156, function=0x12caa00 "int ha_rocksdb::convert_record_from_storage_format(uchar*)") at assert.c:103
#8 0x0000000000bce463 in ha_rocksdb::convert_record_from_storage_format(unsigned char*) () at storage/rocksdb/ha_rocksdb.cc:2156
#9 0x0000000000bd3f8c in ha_rocksdb::rnd_next_with_direction(unsigned char*, bool) () at storage/rocksdb/ha_rocksdb.cc:4265
#10 0x0000000000bd3bac in ha_rocksdb::rnd_next(unsigned char*) () at storage/rocksdb/ha_rocksdb.cc:4175
#11 0x000000000072c031 in handler::ha_rnd_next(unsigned char*) () at sql/handler.cc:2860
#12 0x000000000108eaeb in ha_partition::rnd_next(unsigned char*) () at sql/ha_partition.cc:4865
#13 0x000000000072c031 in handler::ha_rnd_next(unsigned char*) () at sql/handler.cc:2860
#14 0x0000000000a248d5 in find_all_keys(Sort_param*, SQL_SELECT*, Filesort_info*, st_io_cache*, st_io_cache*, Bounded_queue*, unsigned long long*) () at sql/filesort.cc:793
#15 0x0000000000a2311d in filesort(THD*, TABLE*, Filesort*, bool, unsigned long long*, unsigned long long*) () at sql/filesort.cc:339
#16 0x0000000000962436 in mysql_update(THD*, TABLE_LIST*, List&, List&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bool, unsigned long long*, unsigned long long*) () at sql/sql_update.cc:572
#17 0x00000000008cdffe in mysql_execute_command(THD*, unsigned long long*, unsigned long long*) () at sql/sql_parse.cc:3810
#18 0x00000000008d68da in mysql_parse(THD*, char*, unsigned int, Parser_state*, unsigned long long*, char*) () at sql/sql_parse.cc:7002
#19 0x00000000008c86fb in dispatch_command(enum_server_command, THD*, char*, unsigned int) () at sql/sql_parse.cc:1515
#20 0x00000000008c7345 in do_command(THD*) () at sql/sql_parse.cc:1065
#21 0x000000000088ea96 in do_handle_one_connection(THD*) () at sql/sql_connect.cc:1021
#22 0x000000000088e549 in handle_one_connection () at sql/sql_connect.cc:929
#23 0x00007fa3b6bace9a in start_thread (arg=0x7fa3b42d7700) at pthread_create.c:308
#24 0x00007fa3b60498bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

@yoshinorim
Owner

Thanks for the bug report. Does your test case reproduce without partitioning? We have not added partitioning support to MyRocks yet, so some crash bugs are not surprising.

@elenst

No, when I tried it without partitioning, it did not crash. And yes, I suspected it was not fully supported, but then I would expect it to be prohibited at table creation time.

@spetrunia spetrunia self-assigned this
@spetrunia
Collaborator

Ok, I investigated it. The cause is as follows:

When partitioning is not used, the SQL layer provides "extended keys". That is, when ha_rocksdb::open() examines the key definitions in TABLE*, it finds that every (non-unique) secondary index has a suffix of PK columns. The MyRocks code depends on this property.

When the table is partitioned, the SQL layer doesn't provide "extended keys". This causes MyRocks to crash when it tries to extract the PK columns from a secondary key value.
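
To illustrate the failure mode described above, here is a minimal sketch. It is not the MyRocks code: IndexEntry, make_entry and extract_pk are hypothetical names, and the byte layout (secondary columns followed by an optional PK suffix) is a simplifying assumption. It only shows why reading a PK suffix that was never stored runs past the end of the key unless the length is checked first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified model of a secondary-index entry: the encoded
// secondary columns, optionally followed by the encoded PK columns.
struct IndexEntry {
  std::string bytes;    // the stored key bytes
  std::size_t sec_len;  // length of the secondary-column part
};

// Build an entry. pk_suffix is empty when nobody extended the key,
// which is the partitioned-table situation described above.
inline IndexEntry make_entry(const std::string& sec_cols,
                             const std::string& pk_suffix) {
  return IndexEntry{sec_cols + pk_suffix, sec_cols.size()};
}

// Copy pk_len bytes of PK suffix into *out. Returns false instead of
// reading past the end of the entry when the suffix is missing or short.
inline bool extract_pk(const IndexEntry& e, std::size_t pk_len,
                       std::string* out) {
  assert(out != nullptr);
  if (e.sec_len > e.bytes.size() || e.bytes.size() - e.sec_len < pk_len) {
    return false;  // an unchecked copy here would read out of bounds
  }
  *out = e.bytes.substr(e.sec_len, pk_len);
  return true;
}
```

With an extended key, extract_pk(make_entry("f2=1", "pk=2"), 4, &pk) succeeds and yields "pk=2". With an empty suffix the same call returns false, which is the case where an unchecked copy would touch memory outside the key.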

@spetrunia spetrunia referenced this issue from a commit
@spetrunia spetrunia MyRocks Issue #70: Server crashes in RDBSE_KEYDEF::get_primary_key_tuple
Summary:
SQL layer doesn't provide "index extensions" of PK columns when the
table is partitioned.
Resolve this in the same way as we did for unique secondary keys: attach
suffix of PK columns ourselves.

Since this changes on-disk format, bumped the version#.

Test Plan: Run mtr

Reviewers: maykov, jtolmer, hermanlee4, yoshinorim

Reviewed By: hermanlee4

Differential Revision: https://reviews.facebook.net/D39027
77aee0e
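
The approach in the commit summary, attaching the PK columns to the secondary key inside the storage engine, could be sketched roughly as follows. This is an illustration only: with_pk_suffix is a hypothetical helper, columns are modelled as plain names, and skipping PK columns that the key already contains is an assumption of this sketch rather than something the commit message states.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the column list the engine would actually
// store for a secondary key, i.e. the user-defined key columns followed by
// every PK column that is not already part of the key.
inline std::vector<std::string> with_pk_suffix(
    std::vector<std::string> key_cols,
    const std::vector<std::string>& pk_cols) {
  // This sketch assumes the table has an explicit primary key.
  assert(!pk_cols.empty());
  for (const std::string& pk : pk_cols) {
    // Assumption: a PK column already present in the key is not repeated.
    if (std::find(key_cols.begin(), key_cols.end(), pk) == key_cols.end()) {
      key_cols.push_back(pk);
    }
  }
  return key_cols;
}
```

For the test case in this issue, KEY(f2) on a table with PRIMARY KEY(pk) would be stored as (f2, pk) whether or not the SQL layer extended the key. Because the stored key layout changes, data written with the old layout no longer matches, which is why the commit bumps the on-disk format version.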
@spetrunia spetrunia closed this
@spetrunia spetrunia referenced this issue from a commit
@spetrunia spetrunia MyRocks Issue #70: Server crashes in RDBSE_KEYDEF::get_primary_key_tuple
8ccc9c9