PS-7144 : incorrect number of doublewrite files created if innodb_buffer_pool_size > 1G #4

satya-bodapati · 2020-06-30T07:45:26Z

Problem:
-------
According to documentation, default value of innodb_doublewrite_files is
innodb_buffer_pool_instances * 2

But when you start InnoDB with buffer pool size > 1G (the number of instances
is 8), it still creates 2 files only. Expected files is 16 (8*2)

Fix:
----
If buffer pool size is >= 1G and the explicit number of dblwr files not
specified, create 2*innodb_buffer_pool_instances files

…fer_pool_size > 1G Problem: ------- According to documentation, default value of innodb_doublewrite_files is innodb_buffer_pool_instances * 2 But when you start InnoDB with buffer pool size > 1G (the number of instances is 8), it still creates 2 files only. Expected files is 16 (8*2) Fix: ---- If buffer pool size is >= 1G and the explicit number of dblwr files not specified, create 2*innodb_buffer_pool_instances files

…TH VS 2019 [#4] [noclose] storage\ndb\src\ndbapi\ObjectMap.hpp(168,1): warning C4302: 'type cast': truncation from 'void *' to 'long' Change-Id: I11ffba127bc19db15e9a50307b50532941f9fdb2

…close] Make the range optimizer return AccessPaths instead of TABLE_READ_PLAN. This is the first step of getting rid of TABLE_READ_PLAN and moving everything into AccessPath; currently, it's just a very thin shell: 1. TRPs are still used internally, and AccessPath is created at the very end. 2. Child TRPs are still child TRPs (ie., there are no child AccessPaths). 3. All returned AccessPaths are still of the type INDEX_RANGE_SCAN, wrapping a TRP. 4. Some callers still reach directly into the TRP, assuming #3. Most callers (save for the aforemented #4) use a set of simple wrapper functions to access TRP-derived properties from AccessPaths; as we continue the transformation, this is the main place we'll change the interaction (ie., most of the calling code will remain unchanged). Change-Id: I3d9dc9e33c53d1e5124ea9c47b7d6d9270cd1906

This error happens for queries such as: SELECT ( SELECT 1 FROM t1 ) AS a, ( SELECT a FROM ( SELECT x FROM t1 ORDER BY a ) AS d1 ); Query_block::prepare() for query block #4 (corresponding to the 4th SELECT in the query above) calls setup_order() which again calls find_order_in_list(). That function replaces an Item_ident for 'a' in Query_block.order_list with an Item_ref pointing to query block #2. Then Query_block::merge_derived() merges query block #4 into query block #3. The Item_ref mentioned above is then moved to the order_list of query block #3. In the next step, find_order_in_list() is called for query block #3. At this point, 'a' in the select list has been resolved to another Item_ref, also pointing to query block #2. find_order_in_list() detects that the Item_ref in the order_list is equivalent to the Item_ref in the select list, and therefore decides to replace the former with the latter. Then find_order_in_list() calls Item::clean_up_after_removal() recursively (via Item::walk()) for the order_list Item_ref (since that is no longer needed). When calling clean_up_after_removal(), no Cleanup_after_removal_context object is passed. This is the actual error, as there should be a context pointing to query block #3 that ensures that clean_up_after_removal() only purge Item_subselect.unit if both of the following conditions hold: 1) The Item_subselect should not be in any of the Item trees in the select list of query block #3. 2) Item_subselect.unit should be a descendant of query block #3. These conditions ensure that we only purge Item_subselect.unit if we are sure that it is not needed elsewhere. But without the right context, query block #2 gets purged even if it is used in the select lists of query blocks #1 and #3. The fix is to pass a context (for query block #3) to clean_up_after_removal(). Both of the above conditions then become false, and Item_subselect.unit is not purged. As an additional shortcut, find_order_in_list() will not call clean_up_after_removal() if real_item() of the order item and the select list item are identical. In addition, this commit changes clean_up_after_removal() so that it requires the context to be non-null, to prevent similar errors. It also simplifies Item_sum::clean_up_after_removal() by removing window functions unconditionally (and adds a corresponding test case). Change-Id: I449be15d369dba97b23900d1a9742e9f6bad4355

*Problem:* ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`: ``` ==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478 WRITE of size 24 at 0x7fe746d06d14 thread T16777215 Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62 This frame has 4 object(s): [48, 56) 'result' (line 66) [80, 112) '_db_stack_frame_' (line 63) [144, 200) 'tm_tmp' (line 67) [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported) Thread T26 created by T25 here: #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216 #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104 #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148 #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279 #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279 #5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664 #6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160 #7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952 percona#8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544 percona#9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065 percona#10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325 percona#11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198 percona#12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473 ``` The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above. This is a benign error as all the variables are on the stack. *Solution*: Finish the thread in orderly way by using a signalling variable.

Change-Id: I579c2691165865101559915239b2cd027c10ab56

**Problem:** The tests fail under ASAN: ``` ==470513==ERROR: AddressSanitizer: heap-use-after-free on address 0x632000054e20 at pc 0x556599b68016 bp 0x7ffc630afb30 sp 0x7ffc630afb20 READ of size 8 at 0x632000054e20 thread T0 #0 0x556599b68015 in destroy_rwlock(PFS_rwlock*) /tmp/ps/storage/perfschema/pfs_instr.cc:430 #1 0x556599b30b82 in pfs_destroy_rwlock_v2(PSI_rwlock*) /tmp/ps/storage/perfschema/pfs.cc:2596 #2 0x7fa44336d62e in inline_mysql_rwlock_destroy /tmp/ps/include/mysql/psi/mysql_rwlock.h:289 #3 0x7fa44336da39 in vtoken_lock_cleanup::~vtoken_lock_cleanup() /tmp/ps/plugin/version_token/version_token.cc:517 #4 0x7fa46a7188a6 in __run_exit_handlers /build/glibc-SzIz7B/glibc-2.31/stdlib/exit.c:108 #5 0x7fa46a718a5f in __GI_exit /build/glibc-SzIz7B/glibc-2.31/stdlib/exit.c:139 #6 0x556596531da2 in mysqld_exit /tmp/ps/sql/mysqld.cc:2512 #7 0x55659655d579 in mysqld_main(int, char**) /tmp/ps/sql/mysqld.cc:8505 percona#8 0x55659609c5b5 in main /tmp/ps/sql/main.cc:25 percona#9 0x7fa46a6f6082 in __libc_start_main ../csu/libc-start.c:308 percona#10 0x55659609c4ed in _start (/tmp/results/PS/runtime_output_directory/mysqld+0x3c1b4ed) 0x632000054e20 is located 50720 bytes inside of 90112-byte region [0x632000048800,0x63200005e800) freed by thread T0 here: #0 0x7fa46b5f940f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:122 #1 0x556599b617eb in pfs_free(PFS_builtin_memory_class*, unsigned long, void*) /tmp/ps/storage/perfschema/pfs_global.cc:113 #2 0x556599b61a15 in pfs_free_array(PFS_builtin_memory_class*, unsigned long, unsigned long, void*) /tmp/ps/storage/perfschema/pfs_global.cc:177 #3 0x556599b6f28b in PFS_buffer_default_allocator<PFS_rwlock>::free_array(PFS_buffer_default_array<PFS_rwlock>*) /tmp/ps/storage/perfschema/pfs_buffer_container.h:172 #4 0x556599b75628 in PFS_buffer_scalable_container<PFS_rwlock, 1024, 1024, PFS_buffer_default_array<PFS_rwlock>, PFS_buffer_default_allocator<PFS_rwlock> >::cleanup() /tmp/ps/storage/perfschema/pfs_buffer_container.h:452 #5 0x556599b6d591 in cleanup_instruments() /tmp/ps/storage/perfschema/pfs_instr.cc:231 #6 0x556599b8c3f1 in cleanup_performance_schema /tmp/ps/storage/perfschema/pfs_server.cc:343 #7 0x556599b8dcfc in shutdown_performance_schema() /tmp/ps/storage/perfschema/pfs_server.cc:374 percona#8 0x556596531d96 in mysqld_exit /tmp/ps/sql/mysqld.cc:2500 percona#9 0x55659655d579 in mysqld_main(int, char**) /tmp/ps/sql/mysqld.cc:8505 percona#10 0x55659609c5b5 in main /tmp/ps/sql/main.cc:25 percona#11 0x7fa46a6f6082 in __libc_start_main ../csu/libc-start.c:308 previously allocated by thread T0 here: #0 0x7fa46b5fa6e5 in __interceptor_posix_memalign ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:217 #1 0x556599b6167e in pfs_malloc(PFS_builtin_memory_class*, unsigned long, int) /tmp/ps/storage/perfschema/pfs_global.cc:68 #2 0x556599b6187a in pfs_malloc_array(PFS_builtin_memory_class*, unsigned long, unsigned long, int) /tmp/ps/storage/perfschema/pfs_global.cc:155 #3 0x556599b6fa9e in PFS_buffer_default_allocator<PFS_rwlock>::alloc_array(PFS_buffer_default_array<PFS_rwlock>*) /tmp/ps/storage/perfschema/pfs_buffer_container.h:159 #4 0x556599b6ff12 in PFS_buffer_scalable_container<PFS_rwlock, 1024, 1024, PFS_buffer_default_array<PFS_rwlock>, PFS_buffer_default_allocator<PFS_rwlock> >::allocate(pfs_dirty_state*) /tmp/ps/storage/perfschema/pfs_buffer_container.h:602 #5 0x556599b69abc in create_rwlock(PFS_rwlock_class*, void const*) /tmp/ps/storage/perfschema/pfs_instr.cc:402 #6 0x556599b341f5 in pfs_init_rwlock_v2(unsigned int, void const*) /tmp/ps/storage/perfschema/pfs.cc:2578 #7 0x556599b9487b in inline_mysql_rwlock_init /tmp/ps/include/mysql/psi/mysql_rwlock.h:261 percona#8 0x556599b94ba7 in init_pfs_tls_channels_instrumentation() /tmp/ps/storage/perfschema/pfs_tls_channel.cc:209 percona#9 0x556599b8ca44 in initialize_performance_schema(PFS_global_param*, PSI_thread_bootstrap**, PSI_mutex_bootstrap**, PSI_rwlock_bootstrap**, PSI_cond_bootstrap**, PSI_file_bootstrap**, PSI_socket_bootstrap**, PSI_table_bootstrap**, PSI_mdl_bootstrap**, PSI_idle_bootstrap**, PSI_stage_bootstrap**, PSI_statement_bootstrap**, PSI_transaction_bootstrap**, PSI_memory_bootstrap**, PSI_error_bootstrap**, PSI_data_lock_bootstrap**, PSI_system_bootstrap**, PSI_tls_channel_bootstrap**) /tmp/ps/storage/perfschema/pfs_server.cc:266 percona#10 0x55659655a585 in mysqld_main(int, char**) /tmp/ps/sql/mysqld.cc:7497 percona#11 0x55659609c5b5 in main /tmp/ps/sql/main.cc:25 percona#12 0x7fa46a6f6082 in __libc_start_main ../csu/libc-start.c:308 SUMMARY: AddressSanitizer: heap-use-after-free /tmp/ps/storage/perfschema/pfs_instr.cc:430 in destroy_rwlock(PFS_rwlock*) Shadow bytes around the buggy address: 0x0c6480002970: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c6480002980: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c6480002990: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c64800029a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c64800029b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c64800029c0: fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd 0x0c64800029d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c64800029e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c64800029f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c6480002a00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c6480002a10: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==470513==ABORTING ``` The reason of the error is Percona's change on 5ae4d27 which causes the static variables of the plugin not to be deallocated. This causes `void cleanup_instruments()` to be called before `vtoken_lock_cleanup::~vtoken_lock_cleanup()`, which finds the memory of the object to have been deallocated. **Solution:** Do not run the tests under ASAN or Valgrind.

**Problem:** The following leak is detected when running the test `encryption.upgrade_crypt_data_57_v1`: ``` ==388399==ERROR: LeakSanitizer: detected memory leaks Direct leak of 70 byte(s) in 1 object(s) allocated from: #0 0x7f5f87812808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144 #1 0x55f098875d2c in ut::detail::malloc(unsigned long) /home/ldonoso/src/release-8.0.29-20/storage/innobase/include/detail/ut/allocator_traits.h:71 #2 0x55f098875db5 in ut::detail::Alloc_fn::malloc(unsigned long) /home/ldonoso/src/release-8.0.29-20/storage/innobase/include/detail/ut/allocator_traits.h:88 #3 0x55f0988aa4b9 in void* ut::detail::Alloc_fn::alloc<false>(unsigned long) /home/ldonoso/src/release-8.0.29-20/storage/innobase/include/detail/ut/allocator_traits.h:97 #4 0x55f09889b7a3 in void* ut::detail::Alloc_pfs::alloc<false>(unsigned long, unsigned int) /home/ldonoso/src/release-8.0.29-20/storage/innobase/include/detail/ut/alloc.h:275 #5 0x55f09889bb9a in std::enable_if<ut::detail::Alloc_pfs::is_pfs_instrumented_v, void*>::type ut::detail::Alloc_<ut::detail::Alloc_pfs>::alloc<false, ut::detail::Alloc_pfs>(unsigned long, unsigned int) /home/ldonoso/src/release-8.0.29-20/storage/innobase/include/detail/ut/alloc.h:438 #6 0x55f0988767dd in ut::malloc_withkey(ut::PSI_memory_key_t, unsigned long) /home/ldonoso/src/release-8.0.29-20/storage/innobase/include/ut0new.h:604 #7 0x55f09937dd3c in rec_copy_prefix_to_buf_old /home/ldonoso/src/release-8.0.29-20/storage/innobase/rem/rem0rec.cc:1206 percona#8 0x55f09937dfd3 in rec_copy_prefix_to_buf(unsigned char const*, dict_index_t const*, unsigned long, unsigned char**, unsigned long*) /home/ldonoso/src/release-8.0.29-20/storage/innobase/rem/rem0rec.cc:1233 percona#9 0x55f098ae0ae3 in dict_index_copy_rec_order_prefix(dict_index_t const*, unsigned char const*, unsigned long*, unsigned char**, unsigned long*) /home/ldonoso/src/release-8.0.29-20/storage/innobase/dict/dict0dict.cc:3764 percona#10 0x55f098c3d0ba in btr_pcur_t::store_position(mtr_t*) /home/ldonoso/src/release-8.0.29-20/storage/innobase/btr/btr0pcur.cc:141 percona#11 0x55f098c027b6 in dict_getnext_system_low /home/ldonoso/src/release-8.0.29-20/storage/innobase/dict/dict0load.cc:256 percona#12 0x55f098c02933 in dict_getnext_system(btr_pcur_t*, mtr_t*) /home/ldonoso/src/release-8.0.29-20/storage/innobase/dict/dict0load.cc:298 percona#13 0x55f098c0c05b in dict_check_sys_tables /home/ldonoso/src/release-8.0.29-20/storage/innobase/dict/dict0load.cc:1573 percona#14 0x55f098c1770d in dict_load_tablespaces_for_upgrade() /home/ldonoso/src/release-8.0.29-20/storage/innobase/dict/dict0load.cc:3233 percona#15 0x55f0987e9ed1 in innobase_init_files /home/ldonoso/src/release-8.0.29-20/storage/innobase/handler/ha_innodb.cc:6072 percona#16 0x55f098819ed3 in innobase_ddse_dict_init /home/ldonoso/src/release-8.0.29-20/storage/innobase/handler/ha_innodb.cc:13985 percona#17 0x55f097fa5c10 in dd::bootstrap::DDSE_dict_init(THD*, dict_init_mode_t, unsigned int) /home/ldonoso/src/release-8.0.29-20/sql/dd/impl/bootstrap/bootstrapper.cc:742 percona#18 0x55f0986696a6 in dd::upgrade_57::do_pre_checks_and_initialize_dd(THD*) /home/ldonoso/src/release-8.0.29-20/sql/dd/upgrade_57/upgrade.cc:922 percona#19 0x55f09550e082 in handle_bootstrap /home/ldonoso/src/release-8.0.29-20/sql/bootstrap.cc:327 percona#20 0x55f0997416e7 in pfs_spawn_thread /home/ldonoso/src/release-8.0.29-20/storage/perfschema/pfs.cc:2943 percona#21 0x7f5f876a1608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477 SUMMARY: AddressSanitizer: 70 byte(s) leaked in 1 allocation(s). ``` **Solution:** The cause of the leak raises from the traversing of `pcur`. When traversing is exhausted `pcur.close()` is automatically called and all `pcur` resources are deallocated. Percona adds some early returns to the traverse, hence sometimes the traversing is not exhausted and `pcur.close()` is not called. The solution is calling `pcur.close()` explicitly. `close()` is an idempotent function so it is not a bug if it is called several times as a result of this change.

The NdbEventOperationImpl owns a NdbDictionary::Event by having a pointer to the Event's implementation. Make sure the pointer is always initialized in constructor and mark the pointer const. Also mark NdbEventImpl's pointer to the Event as const. Thus the m_eventImpl->m_facade pointer path should now be enforced by compiler. Change-Id: I469b39d66a4d83daf08307d980d555da1ab79827

-- Patch #1: Persist secondary load information -- Problem: We need a way of knowing which tables were loaded to HeatWave after MySQL restarts due to a crash or a planned shutdown. Solution: Add a new "secondary_load" flag to the `options` column of mysql.tables. This flag is toggled after a successful secondary load or unload. The information about this flag is also reflected in INFORMATION_SCHEMA.TABLES.CREATE_OPTIONS. -- Patch #2 -- The second patch in this worklog triggers the table reload from InnoDB after MySQL restart. The recovery framework recognizes that the system restarted by checking whether tables are present in the Global State. If there are no tables present, the framework will access the Data Dictionary and find which tables were loaded before the restart. This patch introduces the "Data Dictionary Worker" - a MySQL service recovery worker whose task is to query the INFORMATION_SCHEMA.TABLES table from a separate thread and find all tables whose secondary_load flag is set to 1. All tables that were found in the Data Dictionary will be appended to the list of tables that have to be reloaded by the framework from InnoDB. If an error occurs during restart recovery we will not mark the recovery as failed. This is done because the types of failures that can occur when the tables are reloaded after a restart are less critical compared to previously existing recovery situations. Additionally, this code will soon have to be adapted for the next worklog in this area so we are proceeding with the simplest solution that makes sense. A Global Context variable m_globalStateEmpty is added which indicates whether the Global State should be recovered from an external source. -- Patch #3 -- This patch adds the "rapid_reload_on_restart" system variable. This variable is used to control whether tables should be reloaded after a restart of mysqld or the HeatWave plugin. This variable is persistable (i.e., SET PERSIST RAPID_RELOAD_ON_RESTART = TRUE/FALSE). The default value of this variable is set to false. The variable can be modified in OFF, IDLE, and SUSPENDED states. -- Patch #4 -- This patch refactors the recovery code by removing all recovery-related code from ha_rpd.cc and moving it to separate files: - ha_rpd_session_factory.h/cc: These files contain the MySQLAdminSessionFactory class, which is used to create admin sessions in separate threads that can be used to issue SQL queries. - ha_rpd_recovery.h/cc: These files contain the MySQLServiceRecoveryWorker, MySQLServiceRecoveryJob and ObjectStoreRecoveryJob classes which were previously defined in ha_rpd.cc. This file also contains a function that creates the RecoveryWorkerFactory object. This object is passed to the constructor of the Recovery Framework and is used to communicate with the other section of the code located in rpdrecoveryfwk.h/cc. This patch also renames rpdrecvryfwk to rpdrecoveryfwk for better readability. The include relationship between the files is shown on the following diagram: rpdrecoveryfwk.h◄──────────────rpdrecoveryfwk.cc ▲ ▲ │ │ │ │ │ └──────────────────────────┐ │ │ ha_rpd_recovery.h◄─────────────ha_rpd_recovery.cc──┐ ▲ │ │ │ │ │ │ │ │ │ ▼ │ ha_rpd.cc───────────────────────►ha_rpd.h │ ▲ │ │ │ ┌───────────────────────────────┘ │ │ ▼ ha_rpd_session_factory.cc──────►ha_rpd_session_factory.h Other changes: - In agreement with Control Plane, the external Global State is now invalidated during recovery framework startup if: 1) Recovery framework recognizes that it should load the Global State from an external source AND, 2) rapid_reload_on_restart is set to OFF. - Addressed review comments for Patch #3, rapid_reload_on_restart is now also settable while plugin is ON. - Provide a single entry point for processing external Global State before starting the recovery framework loop. - Change when the Data Dictionary is read. Now we will no longer wait for the HeatWave nodes to connect before querying the Data Dictionary. We will query it when the recovery framework starts, before accepting any actions in the recovery loop. - Change the reload flow by inserting fake global state entries for tables that need to be reloaded instead of manually adding them to a list of tables scheduled for reload. This method will be used for the next phase where we will recover from Object Storage so both recovery methods will now follow the same flow. - Update secondary_load_dd_flag added in Patch #1. - Increase timeout in wait_for_server_bootup to 300s to account for long MySQL version upgrades. - Add reload_on_restart and reload_on_restart_dbg tests to the rapid suite. - Add PLUGIN_VAR_PERSIST_AS_READ_ONLY flag to "rapid_net_orma_port" and "rapid_reload_on_restart" definitions, enabling their initialization from persisted values along with "rapid_bootstrap" when it is persisted as ON. - Fix numerous clang-tidy warnings in recovery code. - Prevent suspended_basic and secondary_load_dd_flag tests to run on ASAN builds due to an existing issue when reinstalling the RAPID plugin. -- Bug#33752387 -- Problem: A shutdown of MySQL causes a crash in queries fired by DD worker. Solution: Prevent MySQL from killing DD worker's queries by instantiating a DD_kill_immunizer before the queries are fired. -- Patch #5 -- Problem: A table can be loaded before the DD Worker queries the Data Dictionary. This means that table will be wrongly processed as part of the external global state. Solution: If the table is present in the current in-memory global state we will not consider it as part of the external global state and we will not process it by the recovery framework. -- Bug#34197659 -- Problem: If a table reload after restart causes OOM the cluster will go into RECOVERYFAILED state. Solution: Recognize when the tables are being reloaded after restart and do not move the cluster into RECOVERYFAILED. In that case only the current reload will fail and the reload for other tables will be attempted. Change-Id: Ic0c2a763bc338ea1ae6a7121ff3d55b456271bf0

https://jira.percona.com/browse/PS-8592 Description ----------- GR suffered from problems caused by the security probes and network scanner processes connecting to the group replication communication port. This usually is not a problem, but poses a serious threat when another member tries to join the cluster by initialting a connection to the member which is affected by external processes using the port dedicated for group communication for longer durations. On such activites by external processes, the SSL enabled server stalled forever on the SSL_accept() call waiting for handshake data. Below is the stacktrace: Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)): #0 in read () #1 in sock_read () #2 in BIO_read () #3 in ssl23_read_bytes () #4 in ssl23_get_client_hello () #5 in ssl23_accept () #6 in xcom_tcp_server_startup(Xcom_network_provider*) () When the server stalled in the above path forever, it prohibited other members to join the cluster resulting in the following messages on the joiner server's logs. [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group' [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.' Solution -------- This patch adds two new variables 1. group_replication_xcom_ssl_socket_timeout It is a file-descriptor level timeout in seconds for both accept() and SSL_accept() calls when group replication is listening on the xcom port. When set to a valid value, say for example 5 seconds, both accept() and SSL_accept() return after 5 seconds. The default value has been set to 0 (waits infinitely) for backward compatibility. This variable is effective only when GR is configred with SSL. 2. group_replication_xcom_ssl_accept_retries It defines the number of retries to be performed before closing the socket. For each retry the server thread calls SSL_accept() with timeout defined by the group_replication_xcom_ssl_socket_timeout for the SSL handshake process once the connection has been accepted by the first accept() call. The default value has been set to 10. This variable is effective only when GR is configred with SSL. Note: - Both of the above variables are dynamically configurable, but will become effective only on START GROUP_REPLICATION.

…ocal DDL executed https://perconadev.atlassian.net/browse/PS-9018 Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- #1 __GI___lll_lock_wait #2 ___pthread_mutex_lock #3 native_mutex_lock <= waits for commit lock while holding sidno lock #4 Commit_stage_manager::enroll_for #5 MYSQL_BIN_LOG::change_stage #6 MYSQL_BIN_LOG::ordered_commit #7 MYSQL_BIN_LOG::commit percona#8 ha_commit_trans percona#9 trans_commit_implicit percona#10 mysql_create_like_table percona#11 Sql_cmd_create_table::execute percona#12 mysql_execute_command percona#13 dispatch_sql_command Applier thread -------------- #1 ___pthread_mutex_lock #2 native_mutex_lock #3 safe_mutex_lock #4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock #5 Gtid_state::update_commit_group #6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here #7 Commit_order_manager::finish percona#8 Commit_order_manager::wait_and_finish percona#9 ha_commit_low percona#10 trx_coordinator::commit_in_engines percona#11 MYSQL_BIN_LOG::commit percona#12 ha_commit_trans percona#13 trans_commit percona#14 Xid_log_event::do_commit percona#15 Xid_apply_log_event::do_apply_event_worker percona#16 Slave_worker::slave_worker_exec_event percona#17 slave_worker_exec_job_group percona#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.

Part of WL#15135 Certificate Architecture In Ndb_cluster_connection, this patch provides a new top-level method configure_tls(). It also implements TLS initialization in connect(), calling down through the TransporterFacade layer to TransporterRegistry. In the MySQL server this adds the new read-only configuration option ndb-tls-search-path, with a compile-time default that is configurable in CMake, WITH_NDB_TLS_SEARCH_PATH. Unmodified API nodes that do not call into configure_tls() will still be able to make TLS connections if keys are found somewhere in the default search path. Change-Id: Id1d046ff3c5a48a30131c3d15274f5ed625933a9

On Win32, every binary needs one instance of openssl/applink.c. MySQL has one in client_authentication.cc, and this one is present in the server and in the client library. This patch adds instances to ndb_sign_keys, ndb_mgmd, ndbd, and testNodeCertificate-t, and includes fixes for other assorted compiler errors and warnings on Win32. Change-Id: I2d7f940b92ddac7d860d2c6fc2d98ead23e195b2

In class Transporter, add two new member variables: m_require_tls is a boolean TLS requirement m_encrypted is true only when TLS is actually in use A corresponding change in struct TransporterConfiguration also adds authMode. Some application logic is added in IPConfig.cpp to configure the new variables. On the server side, TransporterRegistry always uses a TLS authenticator. On the client side, all Transporter clients initialize a SocketAuthSimple authenticator, but then TCP Transporter clients delete this in the TCP_Transporter constructor and replace it with a TLS authenticator. Change-Id: I6392eecfc712f8a8f500697f34324eea01d29a8c

In the NDB configuration, add boolen options RequireTls and RequireCertificate to the [MGM] section. Both options default to false. Add a new test testMgmd -n MgmdWithoutCertificate In NdbStdOpt, add the --ndb-mgm-tls command-line option. The allowed values are "relaxed" and "strict". The default is "relaxed". This option will be used for utility programs, allowing the user to specify the TLS-related behavior of MGM clients. Change-Id: Id32bb8805ca19a8cf8b52f45c54a7be4d912c5e4

…ocal DDL executed https://perconadev.atlassian.net/browse/PS-9018 Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- #1 __GI___lll_lock_wait #2 ___pthread_mutex_lock #3 native_mutex_lock <= waits for commit lock while holding sidno lock #4 Commit_stage_manager::enroll_for #5 MYSQL_BIN_LOG::change_stage #6 MYSQL_BIN_LOG::ordered_commit #7 MYSQL_BIN_LOG::commit percona#8 ha_commit_trans percona#9 trans_commit_implicit percona#10 mysql_create_like_table percona#11 Sql_cmd_create_table::execute percona#12 mysql_execute_command percona#13 dispatch_sql_command Applier thread -------------- #1 ___pthread_mutex_lock #2 native_mutex_lock #3 safe_mutex_lock #4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock #5 Gtid_state::update_commit_group #6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here #7 Commit_order_manager::finish percona#8 Commit_order_manager::wait_and_finish percona#9 ha_commit_low percona#10 trx_coordinator::commit_in_engines percona#11 MYSQL_BIN_LOG::commit percona#12 ha_commit_trans percona#13 trans_commit percona#14 Xid_log_event::do_commit percona#15 Xid_apply_log_event::do_apply_event_worker percona#16 Slave_worker::slave_worker_exec_event percona#17 slave_worker_exec_job_group percona#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.

…ocal DDL executed https://perconadev.atlassian.net/browse/PS-9018 Merge remote-tracking branch 'venki/PS-9018-8.0-gca' into HEAD Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- #1 __GI___lll_lock_wait #2 ___pthread_mutex_lock #3 native_mutex_lock <= waits for commit lock while holding sidno lock #4 Commit_stage_manager::enroll_for #5 MYSQL_BIN_LOG::change_stage #6 MYSQL_BIN_LOG::ordered_commit #7 MYSQL_BIN_LOG::commit percona#8 ha_commit_trans percona#9 trans_commit_implicit percona#10 mysql_create_like_table percona#11 Sql_cmd_create_table::execute percona#12 mysql_execute_command percona#13 dispatch_sql_command Applier thread -------------- #1 ___pthread_mutex_lock #2 native_mutex_lock #3 safe_mutex_lock #4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock #5 Gtid_state::update_commit_group #6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here #7 Commit_order_manager::finish percona#8 Commit_order_manager::wait_and_finish percona#9 ha_commit_low percona#10 trx_coordinator::commit_in_engines percona#11 MYSQL_BIN_LOG::commit percona#12 ha_commit_trans percona#13 trans_commit percona#14 Xid_log_event::do_commit percona#15 Xid_apply_log_event::do_apply_event_worker percona#16 Slave_worker::slave_worker_exec_event percona#17 slave_worker_exec_job_group percona#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.

When built with ASAN, a use-after-free is reported for the TcpPortPool. AddressSanitizer: heap-use-after-free on address 0x60200019f190 at pc 0x00000076a18d bp 0x7fff51e7d1d0 sp 0x7fff51e7d1c0 #4 0x770b73 in UniqueId::ProcessUniqueIds::erase(unsigned int) ../router/tests/helpers/tcp_port_pool.h:112 #5 0x770c48 in UniqueId::~UniqueId() ../router/tests/helpers/tcp_port_pool.cc:234 ... percona#12 0x82faa3 in testing::UnitTest::~UnitTest() ../extra/googletest/googletest-release-1.12.0/googletest/src/gtest.cc:5496 percona#13 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8) 0x60200019f190 is located 0 bytes inside of 16-byte region [0x60200019f190,0x60200019f1a0) freed by thread T0 here: #0 0x7f5fe3cbd10f in operator delete(void*, unsigned long) (/lib64/libasan.so.6+0xb710f) #1 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8) Background ========== __run_exit_handlers destroys "static" and "global" variables in reverse order of their creation. googletest's unit-tests are a static, and the TcpPortPool also has ProcessUniqueId's which contains the process-wide unique-ids. At construct: unittest -> tcp-port-pool -> proces-unique-ids At destruct : process-unique-ids -> tcp-port-pool -> 💥 The use-after-free happens as the process-unique-ids static is destructed before the tcp-port-pool which tries to its Ids from the process-unique-ids. Change ====== - extend the lifetime of the process-unique-ids to after the last use of the tcp-port-pool via a std::shared_ptr<> Change-Id: I75b8b781e1d240f18ca72f2c86182639a7699f06

…nt on Windows and posix [#4] Introduce quoting functions suitable for POSIX shell (sh) and running C/C++ programs on Windows via CMD.EXE. Use them when running a program via ssh. A simple heuristic to guess the kind of quoting needed on remote host is. If a \ appears in any argument use the quoting function for Windows. If / appears in any argument use the quoting function for POSIX. Change-Id: I851eb3da22d716d181319e825e888631cd16aeb7

Problem: Starting ´ndb_mgmd --bind-address´ may potentially cause abnormal program termination in MgmtSrvr destructor when ndb_mgmd restart itself. Core was generated by `ndb_mgmd --defa'. Program terminated with signal SIGABRT, Aborted. #0 0x00007f8ce4066b8f in raise () from /lib64/libc.so.6 #1 0x00007f8ce4039ea5 in abort () from /lib64/libc.so.6 #2 0x00007f8ce40a7d97 in __libc_message () from /lib64/libc.so.6 #3 0x00007f8ce40af08c in malloc_printerr () from /lib64/libc.so.6 #4 0x00007f8ce40b132d in _int_free () from /lib64/libc.so.6 #5 0x00000000006e9ffe in MgmtSrvr::~MgmtSrvr (this=0x28de4b0) at mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp: 890 #6 0x00000000006ea09e in MgmtSrvr::~MgmtSrvr (this=0x2) at mysql/8.0/ storage/ndb/src/mgmsrv/MgmtSrvr.cpp:849 #7 0x0000000000700d94 in mgmd_run () at mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:260 percona#8 0x0000000000700775 in mgmd_main (argc=<optimized out>, argv=0x28041d0) at mysql/8.0/storage/ndb/src/ mgmsrv/main.cpp:479 Analysis: While starting up, the ndb_mgmd will allocate memory for bind_address in order to potentially rewrite the parameter. When ndb_mgmd restart itself the memory will be released and dangling pointer causing double free. Fix: Drop support for bind_address=[::], it is not documented anywhere, is not useful and doesn't work. This means the need to rewrite bind_address is gone and bind_address argument need neither alloc or free. Change-Id: I7797109b9d8391394587188d64d4b1f398887e94

satya-bodapati force-pushed the ps-8.0.20-dblwr_files branch from c435065 to d24bca6 Compare June 30, 2020 07:58

satya-bodapati changed the title ~~Bug#99935 : innodb_doublewrite_files is not correct when innodb_buffe…~~ PS-7144 : incorrect number of doublewrite files created if innodb_buffer_pool_size > 1G Jun 30, 2020

percona-ysorokin pushed a commit that referenced this pull request Jul 5, 2022

Bug#32233543 [#4] Remove suppression of -Wstringop-truncation warnings.

c00dc20

Change-Id: I579c2691165865101559915239b2cd027c10ab56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PS-7144 : incorrect number of doublewrite files created if innodb_buffer_pool_size > 1G #4

PS-7144 : incorrect number of doublewrite files created if innodb_buffer_pool_size > 1G #4

satya-bodapati commented Jun 30, 2020 •

edited

Loading

PS-7144 : incorrect number of doublewrite files created if innodb_buffer_pool_size > 1G #4

Are you sure you want to change the base?

PS-7144 : incorrect number of doublewrite files created if innodb_buffer_pool_size > 1G #4

Conversation

satya-bodapati commented Jun 30, 2020 • edited Loading

satya-bodapati commented Jun 30, 2020 •

edited

Loading