PS-7865 Dropping a table with discarded tablespace crashes the server #2

voodam · 2021-12-09T11:18:59Z

This is how the bug occurs:

If table have a lot of data, these conditions are satisfied and btr_search_update_block_hash_info returns true: https://github.com/percona/percona-server/blob/8.0/storage/innobase/btr/btr0sea.cc#L588 on one of LOAD DATA commands execution. This causes the creation of a hash index.
On discarding table this line is reached: https://github.com/percona/percona-server/blob/8.0/storage/innobase/row/row0mysql.cc#L3994. This will be the problematic index on which assert fails because of index->space == FIL_NULL but block->page.id.space() stays the same.
This erros occurs only on a lot amount of data, otherwise a hash index will not be created and hence will not be droped.

I suppose that solution can be in widening assert condition to include index->space == FIL_NULL case. But I'am not sure, is it okay that we drop the reseted index in btr_search_drop_page_hash_index? But early returning from the function probably is not an option, because in that case assert_block_ahi_empty(block) is failed:

  if (block->index == nullptr ||
      block->index->space == FIL_NULL) {
    rw_lock_s_unlock(latch);
    assert_block_ahi_empty(block);
    return;
  }

Assert was originally added in percona@195760a57028 commit. Probably it's not good idea to widen other same asserts because other cases are not related with dropping index, but with creating, updating etc. But I'am also not sure.

I added a little faster version of a testcase. It differs from the original one, but works in the same way.

…h_drop_page_hash_index

…ILER WARNINGS Remove some stringop-truncation warning using cstrbuf. Change-Id: I3ab43f6dd8c8b0b784d919211b041ac3ad4fad40

Patch #1 caused several problems in mysql-trunk related to ndbinfo initialization and upgrade, including the failure of the test ndb_76_inplace_upgrade and the failure of all NDB MTR tests in Pushbuild on Windows. This patch fixes these issues, including fixes for bug#33726826 and bug#33730799. In ndbinfo, revert the removal of ndb$blocks and ndb$index_stats and the change of blocks and index_stats from views to tables. Improve the ndbinfo schema upgrade & initialization logic to better handle such a change in the future. This logic now runs in two passes: first it drops the known tables and views from current and previous versions, then it creates the tables and views for the current version. Add a new class method NdbDictionary::printColumnTypeDescription(). This is needed for the ndbinfo.columns table in patch #2 but was missing from patch #1. Add boilerplate index lookup initialization code that was also missing. Fix ndbinfo prefix determination on Windows. Change-Id: I422856bcad4baf5ae9b14c1e3a1f2871bd6c5f59

A subset of binlog encryption tests was crashing with: * thread percona#39, stop reason = signal SIGSTOP frame #0: 0x00007fff56063b66 libsystem_kernel.dylib`__pthread_kill + 10 frame #1: 0x00007fff5622e080 libsystem_pthread.dylib`pthread_kill + 333 frame #2: 0x000000010657442b mysqld-debug`my_write_core(sig=11) at stacktrace.cc:278 frame #3: 0x0000000104d84334 mysqld-debug`::handle_fatal_signal(sig=11) at signal_handler.cc:254 frame #4: 0x00007fff56221f5a libsystem_platform.dylib`_sigtramp + 26 frame #5: 0x00007fff5622934d libsystem_pthread.dylib`pthread_mutex_lock + 1 frame #6: 0x0000000106578d05 mysqld-debug`native_mutex_lock(mutex=0x0000000000000000) at thr_mutex.h:93 frame #7: 0x0000000106578a57 mysqld-debug`safe_mutex_lock(mp=0x0000000000000000, try_lock=false, file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", line=113) at thr_mutex.cc:70 frame #8: 0x000000010653cd3a mysqld-debug`my_mutex_lock(mp=0x00007ffb6b215038, file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", line=113) at thr_mutex.h:180 frame #9: 0x000000010653b2cc mysqld-debug`inline_mysql_mutex_lock(that=0x00007ffb6b215038, src_file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", src_line=113) at mysql_mutex.h:267 * frame #10: 0x000000010653b0d8 mysqld-debug`my_b_append_tell(info=0x00007ffb6b214fd8) at mf_iocache2.cc:113 frame #11: 0x0000000105ed6a96 mysqld-debug`MYSQL_BIN_LOG::write_buffer(this=0x00007ffb6b214cb8, buf="", len=47, mi=0x00007ffb6b1f6a00) at binlog.cc:7128 frame #12: 0x0000000105f4d54b mysqld-debug`queue_event(mi=0x00007ffb6b1f6a00, buf="", event_len=47, do_flush_mi=true) at rpl_slave.cc:7756 frame percona#13: 0x0000000105f3a243 mysqld-debug`::handle_slave_io(arg=0x00007ffb6b1f6a00) at rpl_slave.cc:5382 frame percona#14: 0x00000001065b87a5 mysqld-debug`pfs_spawn_thread(arg=0x00007ffb6a543af0) at pfs.cc:2836 frame percona#15: 0x00007fff5622b661 libsystem_pthread.dylib`_pthread_body + 340 frame percona#16: 0x00007fff5622b50d libsystem_pthread.dylib`_pthread_start + 377 frame percona#17: 0x00007fff5622abf9 libsystem_pthread.dylib`thread_start + 13 This was caused by my_b_append_tell trying to lock a nullptr IO_CACHE::append_buffer_lock. The lock was nullptr, because it's only initialized for SEQ_READ_APPEND IO_CACHEs, whereas we have WRITE_CACHE. This mismatch was introduced by WL#8599 [1] changing the IO_CACHE type from the former to the latter. Fix by using the correct API for the new IO_CACHE type: my_b_tell instead of my_b_append_tell. [1]: commit dbd2ca2 Author: Joao Gramacho <joao.gramacho@oracle.com> Date: Tue Nov 1 06:45:39 2016 +0000 WL#8599: Reduce contention in IO and SQL threads (...)

create_table_info_t::create_table_def leaked memory in the case enable_encryption(table) call failed: worker[5] Sanitizer report from /tmp/results/PS/mysql-test/var/5/log/mysqld.2.err after tests: binlog_encryption.binlog_encryption_without_keyring group_replication.gr_change_master_hidden group_replication.gr_server_uuid_matches_group_name group_replication.gr_perfschema_connect_status group_replication.gr_single_primary_and_leader_election_on_error group_replication.gr_without_perfschema rpl.rpl_key_rotation -------------------------------------------------------------------------- ==14131==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1136 byte(s) in 1 object(s) allocated from: #0 0x7fe9233f1602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602) #1 0xc692483 in ut_allocator<unsigned char>::allocate(unsigned long, unsigned char const*, unsigned int, bool, bool) storage/innobase/include/ut0new.h:608 #2 0xc692483 in mem_heap_create_block_func(mem_block_info_t*, unsigned long, unsigned long) storage/innobase/mem/memory.cc:281 #3 0xb99ff96 in mem_heap_create_func storage/innobase/include/mem0mem.ic:464 #4 0xbae8604 in create_table_info_t::create_table_def(dd::Table const*) storage/innobase/handler/ha_innodb.cc:10349 #5 0xbaee018 in create_table_info_t::create_table(dd::Table const*) storage/innobase/handler/ha_innodb.cc:12420 #6 0xbaf1aba in int innobase_basic_ddl::create_impl<dd::Table>(THD*, char const*, TABLE*, HA_CREATE_INFO*, dd::Table*, bool, bool, bool, unsigned long, unsigned long) storage/innobase/handler/ha_innodb.cc:12805 #7 0xbaf7e6a in ha_innobase::create(char const*, TABLE*, HA_CREATE_INFO*, dd::Table*) storage/innobase/handler/ha_innodb.cc:13756 #8 0x2857f7a in ha_create_table(THD*, char const*, char const*, char const*, HA_CREATE_INFO*, List<Create_field> const*, bool, bool, dd::Table*) sql/handler.cc:5156 #9 0x19d0d9f in rea_create_base_table sql/sql_table.cc:991 #10 0x19d0d9f in create_table_impl sql/sql_table.cc:7118 #11 0x19d37cf in mysql_create_table_no_lock(THD*, char const*, char const*, HA_CREATE_INFO*, Alter_info*, unsigned int, bool, bool*, handlerton**) sql/sql_table.cc:7200 #12 0x19dffb2 in mysql_create_table(THD*, TABLE_LIST*, HA_CREATE_INFO*, Alter_info*) sql/sql_table.cc:7950 percona#13 0x3b58b9b in Sql_cmd_create_table::execute(THD*) sql/sql_cmd_ddl_table.cc:319 percona#14 0x15917c1 in mysql_execute_command(THD*, bool) sql/sql_parse.cc:4417 percona#15 0x15b086e in mysql_parse(THD*, Parser_state*, bool) sql/sql_parse.cc:5139 percona#16 0x8efc7fd in Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned long) sql/log_event.cc:5295 percona#17 0x8f7ea48 in Log_event::apply_event(Relay_log_info*) sql/log_event.cc:3882 percona#18 0x91cb682 in apply_event_and_update_pos sql/rpl_slave.cc:4352 percona#19 0x9215e69 in exec_relay_log_event sql/rpl_slave.cc:4812 percona#20 0x9254685 in handle_slave_sql sql/rpl_slave.cc:6912 percona#21 0xb1913a3 in pfs_spawn_thread storage/perfschema/pfs.cc:2836 percona#22 0x7fe9231436b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) Fix by adding the missing mem_heap_free(heap) call.

Avoid undefined behavior in audit_log_update_thd_local by avoiding passing NULL as source pointer to memcpy, even with zero length. The UBSan report fixed is /usr/include/x86_64-linux-gnu/bits/string3.h:53:71: runtime error: null pointer passed as argument 2, which is declared to never be null #0 0x7fe5aad56fb1 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53 #1 0x7fe5aad56fb1 in audit_log_update_thd_local plugin/audit_log/audit_log.cc:987 #2 0x7fe5aad56fb1 in audit_log_notify plugin/audit_log/audit_log.cc:1105 #3 0x1ecac37 in plugins_dispatch sql/sql_audit.cc:1284 #4 0x1ecac37 in event_class_dispatch sql/sql_audit.cc:1322 #5 0x1ecb311 in event_class_dispatch_error sql/sql_audit.cc:1340 #6 0x1ed21b1 in mysql_audit_notify(THD*, mysql_event_connection_subclass_t, char const*, int) sql/sql_audit.cc:438 #7 0x1350071 in check_connection sql/sql_connect.cc:868 #8 0x1350071 in login_connection sql/sql_connect.cc:929 #9 0x1357881 in thd_prepare_connection(THD*, bool) sql/sql_connect.cc:1084 #10 0x1e66347 in handle_connection sql/conn_handler/connection_handler_per_thread.cc:313 #11 0xb1913a3 in pfs_spawn_thread storage/perfschema/pfs.cc:2836 #12 0x7fe5d352f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) percona#13 0x7fe5d0bd741c in clone (/lib/x86_64-linux-gnu/libc.so.6+0x10741c)