Bug1418996 percona 5.6 #5

laurynas-biveinis · 2015-02-20T16:10:33Z

No description provided.

requested). Currently lock_print_info_all_transactions prints lock info if either innodb lock monitor is on, either innodb_show_locks_held is greater than zero (the default value is ten). It should be printed when both conditions hold instead.

Bug1418996 percona 5.6

Bug1418996 percona 5.6 (cherry picked from commit 57d8749)

ha_rocksdb::write_row() had a wrong check which caused its default behavior to work like its bulk-insert mode: commit the changes every @@bulk_load_size rows.

…hutdown) On several testcases (i.e. rpl_gtid_mode), LeakSanitizer diagnoses missed memory deallocation: ================================================================= ==16675==ERROR: LeakSanitizer: detected memory leaks Direct leak of 21 byte(s) in 1 object(s) allocated from: #0 0x7f17748fa54a in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9854a) #1 0xff7f7f in my_malloc /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/mysys/my_malloc.c:38 #2 0x1634b83 in add_pfs_instr_to_array(char const*, char const*) /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/storage/perfschema/pfs_server.cc:251 #3 0x58cccf in mysqld_get_one_option /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/mysqld.cc:9198 #4 0x10256c6 in my_handle_options /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/mysys_ssl/my_getopt.cc:817 #5 0x1025c63 in handle_options /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/mysys_ssl/my_getopt.cc:308 #6 0x5963e5 in handle_early_options() /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/mysqld.cc:7263 #7 0x5a35a3 in mysqld_main(int, char**) /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/mysqld.cc:5613 #8 0x586aae in main /mnt/workspace/percona-server-5.6-asan-param/BUILD_TYPE/debug-asan/Host/ubuntu-xenial-64bit/sql/main.cc:25 #9 0x7f17726cc82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) This class of errors is already attempted to suppress in valgrind.supp. But these suppressions have been added to work around a bug of racy PFS shutdown, which is not required anymore as pfs_instr_config_array is deallocated exactly once since [1]. Thus, free the elements of this array and remove related suppressions instead.

Patch #5 Remove special purify implementations of the uint3korr macro.

THREE BYTES ON X86 Analysis: ========= The macro uint3korr reads 4 bytes of data instead of 3 on on x86 machines. Multiple definitions were created for this macro for optimization in WIN32. The idea was to optimize reading of 3 byte ints by reading an ordinary int and masking away the unused byte. However this is an undefined behavior. It will be an issue unless users are aware of allocating an extra byte for using this macro. Fix: ==== Removing the definition which reads 4 bytes of data. The only definition of this macro would now read just 3 bytes of data thus prohibiting the usage of an extra byte. Note: ===== This is a backport of Patches #5 and #6 for Bug#17922198.

In WL-included builds ASAN run witnessed missed ~Query_log_event invocation. The destruct-or was not called due to the WL's changes in the error propagation that specifically affect LC MTS. The failure is exposed in particular by rpl_trigger as the following stack: #0 0x9ecd98 in __interceptor_malloc (/export/home/pb2/test/sb_2-22611026-1489061390.32/mysql-commercial-8.0.1-dmr-linux-x86_64-asan/bin/mysqld+0x9ecd98) #1 0x2b1a245 in my_raw_malloc(unsigned long, int) obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:209:12 #2 0x2b1a245 in my_malloc obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:72 #3 0x2940590 in Query_log_event::Query_log_event(char const*, unsigned int, binary_log::Format_description_event const*, binary_log::Log_event_type) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:4343:46 #4 0x293d235 in Log_event::read_log_event(char const*, unsigned int, char const**, Format_description_log_event const*, bool) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:1686:17 #5 0x293b96f in Log_event::read_log_event() #6 0x2a2a1c9 in next_event(Relay_log_info*) Previously before the WL Mts_submode_logical_clock::wait_for_workers_to_finish() had not returned any error even when Coordinator thread is killed. The WL patch needed to refine such behavior, but at doing so it also had to attend log_event.cc::schedule_next_event() to register an error to follow an existing pattern. While my_error() does not take place the killed Coordinator continued scheduling, ineffectively though - no Worker gets engaged (legal case of deferred scheduling), and without noticing its killed status up to a point when it resets the event pointer in apply_event_and_update_pos(): *ptr_ev= NULL; // announcing the event is passed to w-worker The reset was intended for an assigned Worker to perform the event destruction or by Coordinator itself when the event is deferred. As neither is the current case the event gets unattended for its termination. In contrast in the pre-WL sources the killed Coordinator does find a Worker. However such Worker could be already down (errored out and exited), in which case apply_event_and_update_pos() reasonably returns an error and executes delete ev in exec_relay_log_event() error branch. **Fixed** with deploying my_error() call in log_event.cc::schedule_next_event() error branch which fits to the existing pattern. THD::is_error() has been always checked by Coordinator before any attempt to reset *ptr_ev= NULL. In the errored case Coordinator does not reset and destroys the event itself in the exec_relay_log_event() error branch pretty similarly to how the pre-WL sources do. Tested against rpl_trigger and rpl suites to pass. Approved on rb#15667.

Summary: In the xa transation 'XA END' phase(thd_sql_command is SQLCOM_END), TokuDB slave will create both transaction for trx->sp_level and trx->stmt, this will cause the toku_xids_can_create_child abort since the trx->sp_level->xids is 0x00. How to reproduce: With tokudb_debug=32, do the queries on master: create table t1(a int)engine=tokudb; xa start 'x1'; insert into t1 values(1); xa end 'x1'; xa prepare 'x1'; xa commit 'x1'; xa start 'x2'; insert into t1 values(2); xa end 'x2'; xa prepare 'x2'; xa commit 'x2'; Slave debug info: xa start 'x1'; insert into t1 values(1); xa end 'x1'; xa prepare 'x1'; xa commit 'x1'; 2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6533 ha_tokudb::external_lock trx (nil) (nil) (nil) (nil) 0 0 2123 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn (nil) 0x7ff2d44a3000 67108864 r=0 2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6426 ha_tokudb::create_txn created master 0x7ff2d44a3000 2123 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn 0x7ff2d44a3000 0x7ff2d44a3100 1 r=0 2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6468 ha_tokudb::create_txn created stmt 0x7ff2d44a3000 sp_level 0x7ff2d44a3100 2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7ff2d44a3100 2123 /u01/tokudb/storage/tokudb/hatoku_hton.cc:942 tokudb_commit commit trx 0 txn 0x7ff2d44a3100 syncflag 512 xa start 'x2'; insert into t1 values(2); xa end 'x2'; xa prepare 'x2'; xa commit 'x2'; 2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6533 ha_tokudb::external_lock trx 0x7ff2d44a3000 (nil) 0x7ff2d44a3000 (nil) 0 0 2123 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn 0x7ff2d44a3000 0x7ff2d44a3000 1 r=0 2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6468 ha_tokudb::create_txn created stmt 0x7ff2d44a3000 sp_level 0x7ff2d44a3000 2123 0x7ff2d44c5830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7ff2d44a3000 2017-12-24T08:36:45.347405Z 11 [ERROR] TokuDB: toku_db_put: Transaction cannot do work when child exists 2017-12-24T08:36:45.347444Z 11 [Warning] Slave: Got error 22 from storage engine Error_code: 1030 2017-12-24T08:36:45.347448Z 11 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with 'SLAVE START'. We stopped at log 'mysql-bin.000001' position 1007 2123 /u01/tokudb/storage/tokudb/hatoku_hton.cc:972 tokudb_rollback rollback 0 txn 0x7ff2d44a3000 Segmentation fault (core dumped) This crash caused by the parent->xid is 0x00. The core statck info: (gdb) bt #0 __pthread_kill (threadid=<optimized out>, signo=signo@entry=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62 #1 0x0000000000f6b647 in my_write_core (sig=sig@entry=11) at /u01/tokudb/mysys/stacktrace.c:249 #2 0x000000000086b945 in handle_fatal_signal (sig=11) at /u01/tokudb/sql/signal_handler.cc:223 #3 <signal handler called> #4 toku_xids_can_create_child (xids=0x0) at /u01/tokudb/storage/tokudb/PerconaFT/ft/txn/xids.cc:93 #5 0x000000000080531f in toku_txn_begin_with_xid (parent=0x7f0bf501c280, txnp=0x7f0bf50a3490, logger=0x7f0c415e66c0, xid=..., snapshot_type=TXN_SNAPSHOT_CHILD, container_db_txn=0x7f0bf50a3400, for_recovery=false, read_only=false) at /u01/tokudb/storage/tokudb/PerconaFT/ft/txn/txn.cc:137 #6 0x00000000007aa6a2 in toku_txn_begin (env=0x7f0c819fde00, stxn=0x7f0bf50a3300, txn=0x7f0bf500dca8, flags=<optimized out>) at /u01/tokudb/storage/tokudb/PerconaFT/src/ydb_txn.cc:579 #7 0x0000000000f99323 in txn_begin (thd=0x7f0bf504bfc0, flags=1, txn=0x7f0bf500dca8, parent=0x7f0bf50a3300, env=<optimized out>) at /u01/tokudb/storage/tokudb/tokudb_txn.h:116 #8 ha_tokudb::create_txn (this=0x7f0bf50c8830, thd=0x7f0bf504bfc0, trx=0x7f0bf500dca0) at /u01/tokudb/storage/tokudb/ha_tokudb.cc:6458 #9 0x0000000000fa48f9 in ha_tokudb::external_lock (this=0x7f0bf50c8830, thd=0x7f0bf504bfc0, lock_type=1) at /u01/tokudb/storage/tokudb/ha_tokudb.cc:6544 #10 0x00000000008d46eb in handler::ha_external_lock (this=0x7f0bf50c8830, thd=thd@entry=0x7f0bf504bfc0, lock_type=lock_type@entry=1) at /u01/tokudb/sql/handler.cc:8352 #11 0x0000000000e4f3b4 in lock_external (count=1, tables=0x7f0bf5050688, thd=0x7f0bf504bfc0) at /u01/tokudb/sql/lock.cc:389 #12 mysql_lock_tables (thd=thd@entry=0x7f0bf504bfc0, tables=<optimized out>, count=<optimized out>, flags=0) at /u01/tokudb/sql/lock.cc:325 #13 0x0000000000cd0b6d in lock_tables (thd=thd@entry=0x7f0bf504bfc0, tables=0x7f0bf4d11020, count=<optimized out>, flags=flags@entry=0) at /u01/tokudb/sql/sql_base.cc:6705 #14 0x0000000000cd61f2 in open_and_lock_tables (thd=0x7f0bf504bfc0, tables=0x7f0bf4d11020, flags=flags@entry=0, prelocking_strategy=prelocking_strategy@entry=0x7f0c89629680) at /u01/tokudb/sql/sql_base.cc:6523 percona#15 0x0000000000ee09eb in open_and_lock_tables (flags=0, tables=<optimized out>, thd=<optimized out>) at /u01/tokudb/sql/sql_base.h:484 percona#16 Rows_log_event::do_apply_event (this=0x7f0bf50ab4a0, rli=0x7f0c87762800) at /u01/tokudb/sql/log_event.cc:10911 percona#17 0x0000000000ed71c0 in Log_event::apply_event (this=this@entry=0x7f0bf50ab4a0, rli=rli@entry=0x7f0c87762800) at /u01/tokudb/sql/log_event.cc:3329 percona#18 0x0000000000f1d233 in apply_event_and_update_pos (ptr_ev=ptr_ev@entry=0x7f0c89629940, thd=thd@entry=0x7f0bf504bfc0, rli=rli@entry=0x7f0c87762800) at /u01/tokudb/sql/rpl_slave.cc:4761 percona#19 0x0000000000f280a8 in exec_relay_log_event (rli=0x7f0c87762800, thd=0x7f0bf504bfc0) at /u01/tokudb/sql/rpl_slave.cc:5276 percona#20 handle_slave_sql (arg=<optimized out>) at /u01/tokudb/sql/rpl_slave.cc:7491 percona#21 0x00000000013c6184 in pfs_spawn_thread (arg=0x7f0bf5bea820) at /u01/tokudb/storage/perfschema/pfs.cc:2185 percona#22 0x00007f0c885126ba in start_thread (arg=0x7f0c8962a700) at pthread_create.c:333 percona#23 0x00007f0c87d293dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 (gdb) f 10 #10 0x00000000008d46eb in handler::ha_external_lock (this=0x7f0bf50c8830, thd=thd@entry=0x7f0bf504bfc0, lock_type=lock_type@entry=1) at /u01/tokudb/sql/handler.cc:8352 8352 /u01/tokudb/sql/handler.cc: No such file or directory. (gdb) p thd->lex->sql_command = SQLCOM_END With the fixed patch, the debug info is: xa start 'x1'; insert into t1 values(1); xa end 'x1'; xa prepare 'x1'; xa commit 'x1'; 24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6534 ha_tokudb::external_lock trx (nil) (nil) (nil) (nil) 0 0 24111 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn (nil) 0x7f4aba689000 67108864 r=0 24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6469 ha_tokudb::create_txn created stmt (nil) sp_level 0x7f4aba689000 24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7f4aba689000 24111 /u01/tokudb/storage/tokudb/hatoku_hton.cc:942 tokudb_commit commit trx 0 txn 0x7f4aba689000 syncflag 512 xa start 'x2'; insert into t1 values(2); xa end 'x2'; xa prepare 'x2'; xa commit 'x2'; 24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6534 ha_tokudb::external_lock trx (nil) (nil) (nil) (nil) 0 0 24111 /u01/tokudb/storage/tokudb/tokudb_txn.h:127 txn_begin begin txn (nil) 0x7f4aba689000 67108864 r=0 24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:6469 ha_tokudb::create_txn created stmt (nil) sp_level 0x7f4aba689000 24111 0x7f4aba6c4830 /u01/tokudb/storage/tokudb/ha_tokudb.cc:4120 ha_tokudb::write_row txn 0x7f4aba689000 24111 /u01/tokudb/storage/tokudb/hatoku_hton.cc:942 tokudb_commit commit trx 0 txn 0x7f4aba689000 syncflag 512 Test: mtr --suite=tokudb xa Reviewed by: Rik

…ecuting global constructors (before main gets called). the assert catches global mutex initialization which seems to be problematic. since tokudb has a global mutex (open_tables_mutex) and tokudb is statically linked into mysqld (plugin type mandatory), the assert fires. we could: 1. remove the assert, or 2. rewrite tokudb to remove the open_tables_mutex, or 3. compile tokudb into a shared library. this commit compiles tokudb into a shared library for debug builds. GNU gdb (Ubuntu 8.0.1-0ubuntu1) 8.0.1 Copyright (C) 2017 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from bin/mysqld...done. [New LWP 24606] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `bin/mysqld --initialize'. Program terminated with signal SIGABRT, Aborted. #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x00007f4498d8df5d in __GI_abort () at abort.c:90 #2 0x00007f4498d83f17 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x5581f567ed23 "safe_mutex_inited", file=file@entry=0x5581f567ecf0 "/home/rfp/projects/xelabs-server/mysys/thr_mutex.c", line=line@entry=39, function=function@entry=0x5581f567f040 <__PRETTY_FUNCTION__.6977> "safe_mutex_init") at assert.c:92 #3 0x00007f4498d83fc2 in __GI___assert_fail (assertion=0x5581f567ed23 "safe_mutex_inited", file=0x5581f567ecf0 "/home/rfp/projects/xelabs-server/mysys/thr_mutex.c", line=39, function=0x5581f567f040 <__PRETTY_FUNCTION__.6977> "safe_mutex_init") at assert.c:101 #4 0x00005581f4ab1a40 in safe_mutex_init (mp=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, attr=0x5581f6336550 <my_fast_mutexattr>, file=0x5581f57ac188 "/home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h", line=207) at /home/rfp/projects/xelabs-server/mysys/thr_mutex.c:39 #5 0x00005581f500aa9f in my_mutex_init (mp=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, attr=0x5581f6336550 <my_fast_mutexattr>, file=0x5581f57ac188 "/home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h", line=207) at /home/rfp/projects/xelabs-server/include/thr_mutex.h:167 #6 0x00005581f500ac25 in inline_mysql_mutex_init (key=4294967295, that=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, attr=0x5581f6336550 <my_fast_mutexattr>, src_file=0x5581f57ac188 "/home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h", src_line=207) at /home/rfp/projects/xelabs-server/include/mysql/psi/mysql_thread.h:668 #7 0x00005581f5043ee5 in tokudb::thread::mutex_t::mutex_t (this=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>, key=4294967295) at /home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h:207 #8 0x00005581f5043e68 in tokudb::thread::mutex_t::mutex_t (this=0x5581f632b8e0 <TOKUDB_SHARE::_open_tables_mutex>) at /home/rfp/projects/xelabs-server/storage/tokudb/tokudb_thread.h:45 #9 0x00005581f50433a4 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /home/rfp/projects/xelabs-server/storage/tokudb/ha_tokudb.cc:38 #10 0x00005581f5043936 in _GLOBAL__sub_I__ZN6tokudb8metadata4readEP9__toku_dbP13__toku_db_txnyPvmPm () at /home/rfp/projects/xelabs-server/storage/tokudb/ha_tokudb.cc:9061 #11 0x00005581f529aefd in __libc_csu_init () #12 0x00007f4498d76150 in __libc_start_main (main=0x5581f4025aea <main(int, char**)>, argc=2, argv=0x7ffe9dcd7118, init=0x5581f529aeb0 <__libc_csu_init>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe9dcd7108) at ../csu/libc-start.c:264 #13 0x00005581f4025a0a in _start () (gdb) q

TokuDB will be crashed during shutdown due to PFS key double-free. (gdb) bt #0 __pthread_kill (threadid=<optimized out>, signo=signo@entry=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62 #1 0x0000000000f6b3e7 in my_write_core (sig=sig@entry=11) at /u01/tokudb/mysys/stacktrace.c:249 #2 0x000000000086b6e5 in handle_fatal_signal (sig=11) at /u01/tokudb/sql/signal_handler.cc:223 #3 <signal handler called> #4 destroy_mutex (pfs=0x7f8fb71c9900) at /u01/tokudb/storage/perfschema/pfs_instr.cc:327 #5 0x00000000013c574a in pfs_destroy_mutex_v1 (mutex=<optimized out>) at /u01/tokudb/storage/perfschema/pfs.cc:1833 #6 0x0000000000fc350a in inline_mysql_mutex_destroy (that=0x1f84ea0 <tokudb_map_mutex>) at /u01/tokudb/include/mysql/psi/mysql_thread.h:681 #7 tokudb::thread::mutex_t::~mutex_t (this=0x1f84ea0 <tokudb_map_mutex>, __in_chrg=<optimized out>) at /u01/tokudb/storage/tokudb/tokudb_thread.h:214 #8 0x00007f8fbb38cff8 in _run_exit_handlers (status=status@entry=0, listp=0x7f8fbb7175f8 <_exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82 #9 0x00007f8fbb38d045 in __GI_exit (status=status@entry=0) at exit.c:104 #10 0x000000000085b7b5 in mysqld_exit (exit_code=exit_code@entry=0) at /u01/tokudb/sql/mysqld.cc:1205 #11 0x0000000000865fa6 in mysqld_main (argc=46, argv=0x7f8fbaeb3088) at /u01/tokudb/sql/mysqld.cc:5430 #12 0x00007f8fbb373830 in __libc_start_main (main=0x78d700 <main(int, char**)>, argc=10, argv=0x7ffd36ab1a28, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd36ab1a18) at ../csu/libc-start.c:291 #13 0x00000000007a7b79 in _start () And the AddressSanitizer errors: ==27219==ERROR: AddressSanitizer: heap-use-after-free on address 0x7f009b12d118 at pc 0x00000265d80b bp 0x7ffd3bb7ffb0 sp 0x7ffd3bb7ffa0 READ of size 8 at 0x7f009b12d118 thread T0 #0 0x265d80a in destroy_mutex(PFS_mutex*) /u01/tokudb/storage/perfschema/pfs_instr.cc:323 #1 0x1cfaef3 in inline_mysql_mutex_destroy /u01/tokudb/include/mysql/psi/mysql_thread.h:681 #2 0x1cfaef3 in tokudb::thread::mutex_t::~mutex_t() /u01/tokudb/storage/tokudb/tokudb_thread.h:214 #3 0x7f009d512ff7 (/lib/x86_64-linux-gnu/libc.so.6+0x39ff7) #4 0x7f009d513044 in exit (/lib/x86_64-linux-gnu/libc.so.6+0x3a044) #5 0x9a8239 in mysqld_exit /u01/tokudb/sql/mysqld.cc:1205 #6 0x9b210f in unireg_abort /u01/tokudb/sql/mysqld.cc:1175 #7 0x9b629e in init_server_components /u01/tokudb/sql/mysqld.cc:4509 #8 0x9b8aca in mysqld_main(int, char**) /u01/tokudb/sql/mysqld.cc:5001 #9 0x7f009d4f982f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #10 0x7e8308 in _start (/home/ubuntu/mysql_20161216/bin/mysqld+0x7e8308)

…TABLE_UPGRADE_GUARD To repeat: cmake -DWITH_ASAN=1 -DWITH_ASAN_SCOPE=1 ./mtr --mem --sanitize main.dd_upgrade_error A few dd tests fail with: ==26861==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7000063bf5e8 at pc 0x00010d4dbe8b bp 0x7000063bda40 sp 0x7000063bda38 READ of size 8 at 0x7000063bf5e8 thread T2 #0 0x10d4dbe8a in Prealloced_array<st_plugin_int**, 16ul>::empty() const prealloced_array.h:186 #1 0x10d406a8b in lex_end(LEX*) sql_lex.cc:560 percona#2 0x10dae4b6d in dd::upgrade::Table_upgrade_guard::~Table_upgrade_guard() (mysqld:x86_64+0x100f87b6d) percona#3 0x10dadc557 in dd::upgrade::migrate_table_to_dd(THD*, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, bool) (mysqld:x86_64+0x100f7f557) percona#4 0x10dad7e85 in dd::upgrade::migrate_plugin_table_to_dd(THD*) (mysqld:x86_64+0x100f7ae85) percona#5 0x10daec6a1 in dd::upgrade::do_pre_checks_and_initialize_dd(THD*) upgrade.cc:1216 percona#6 0x10cd0a5c0 in bootstrap::handle_bootstrap(void*) bootstrap.cc:336 Change-Id: I265ec6dd97ee8076aaf03763840c0cdf9e20325b Fix: increase lifetime of 'LEX lex;' which is used by 'table_guard'

A subset of binlog encryption tests was crashing with: * thread #39, stop reason = signal SIGSTOP frame #0: 0x00007fff56063b66 libsystem_kernel.dylib`__pthread_kill + 10 frame #1: 0x00007fff5622e080 libsystem_pthread.dylib`pthread_kill + 333 frame #2: 0x000000010657442b mysqld-debug`my_write_core(sig=11) at stacktrace.cc:278 frame #3: 0x0000000104d84334 mysqld-debug`::handle_fatal_signal(sig=11) at signal_handler.cc:254 frame #4: 0x00007fff56221f5a libsystem_platform.dylib`_sigtramp + 26 frame #5: 0x00007fff5622934d libsystem_pthread.dylib`pthread_mutex_lock + 1 frame #6: 0x0000000106578d05 mysqld-debug`native_mutex_lock(mutex=0x0000000000000000) at thr_mutex.h:93 frame #7: 0x0000000106578a57 mysqld-debug`safe_mutex_lock(mp=0x0000000000000000, try_lock=false, file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", line=113) at thr_mutex.cc:70 frame #8: 0x000000010653cd3a mysqld-debug`my_mutex_lock(mp=0x00007ffb6b215038, file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", line=113) at thr_mutex.h:180 frame #9: 0x000000010653b2cc mysqld-debug`inline_mysql_mutex_lock(that=0x00007ffb6b215038, src_file="/Users/laurynas/percona/mysql-server/mysys/mf_iocache2.cc", src_line=113) at mysql_mutex.h:267 * frame #10: 0x000000010653b0d8 mysqld-debug`my_b_append_tell(info=0x00007ffb6b214fd8) at mf_iocache2.cc:113 frame #11: 0x0000000105ed6a96 mysqld-debug`MYSQL_BIN_LOG::write_buffer(this=0x00007ffb6b214cb8, buf="", len=47, mi=0x00007ffb6b1f6a00) at binlog.cc:7128 frame #12: 0x0000000105f4d54b mysqld-debug`queue_event(mi=0x00007ffb6b1f6a00, buf="", event_len=47, do_flush_mi=true) at rpl_slave.cc:7756 frame #13: 0x0000000105f3a243 mysqld-debug`::handle_slave_io(arg=0x00007ffb6b1f6a00) at rpl_slave.cc:5382 frame #14: 0x00000001065b87a5 mysqld-debug`pfs_spawn_thread(arg=0x00007ffb6a543af0) at pfs.cc:2836 frame #15: 0x00007fff5622b661 libsystem_pthread.dylib`_pthread_body + 340 frame #16: 0x00007fff5622b50d libsystem_pthread.dylib`_pthread_start + 377 frame #17: 0x00007fff5622abf9 libsystem_pthread.dylib`thread_start + 13 This was caused by my_b_append_tell trying to lock a nullptr IO_CACHE::append_buffer_lock. The lock was nullptr, because it's only initialized for SEQ_READ_APPEND IO_CACHEs, whereas we have WRITE_CACHE. This mismatch was introduced by WL#8599 [1] changing the IO_CACHE type from the former to the latter. Fix by using the correct API for the new IO_CACHE type: my_b_tell instead of my_b_append_tell. [1]: commit dbd2ca2 Author: Joao Gramacho <joao.gramacho@oracle.com> Date: Tue Nov 1 06:45:39 2016 +0000 WL#8599: Reduce contention in IO and SQL threads (...)

create_table_info_t::create_table_def leaked memory in the case enable_encryption(table) call failed: worker[5] Sanitizer report from /tmp/results/PS/mysql-test/var/5/log/mysqld.2.err after tests: binlog_encryption.binlog_encryption_without_keyring group_replication.gr_change_master_hidden group_replication.gr_server_uuid_matches_group_name group_replication.gr_perfschema_connect_status group_replication.gr_single_primary_and_leader_election_on_error group_replication.gr_without_perfschema rpl.rpl_key_rotation -------------------------------------------------------------------------- ==14131==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1136 byte(s) in 1 object(s) allocated from: #0 0x7fe9233f1602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602) #1 0xc692483 in ut_allocator<unsigned char>::allocate(unsigned long, unsigned char const*, unsigned int, bool, bool) storage/innobase/include/ut0new.h:608 #2 0xc692483 in mem_heap_create_block_func(mem_block_info_t*, unsigned long, unsigned long) storage/innobase/mem/memory.cc:281 #3 0xb99ff96 in mem_heap_create_func storage/innobase/include/mem0mem.ic:464 #4 0xbae8604 in create_table_info_t::create_table_def(dd::Table const*) storage/innobase/handler/ha_innodb.cc:10349 #5 0xbaee018 in create_table_info_t::create_table(dd::Table const*) storage/innobase/handler/ha_innodb.cc:12420 #6 0xbaf1aba in int innobase_basic_ddl::create_impl<dd::Table>(THD*, char const*, TABLE*, HA_CREATE_INFO*, dd::Table*, bool, bool, bool, unsigned long, unsigned long) storage/innobase/handler/ha_innodb.cc:12805 #7 0xbaf7e6a in ha_innobase::create(char const*, TABLE*, HA_CREATE_INFO*, dd::Table*) storage/innobase/handler/ha_innodb.cc:13756 #8 0x2857f7a in ha_create_table(THD*, char const*, char const*, char const*, HA_CREATE_INFO*, List<Create_field> const*, bool, bool, dd::Table*) sql/handler.cc:5156 #9 0x19d0d9f in rea_create_base_table sql/sql_table.cc:991 #10 0x19d0d9f in create_table_impl sql/sql_table.cc:7118 #11 0x19d37cf in mysql_create_table_no_lock(THD*, char const*, char const*, HA_CREATE_INFO*, Alter_info*, unsigned int, bool, bool*, handlerton**) sql/sql_table.cc:7200 #12 0x19dffb2 in mysql_create_table(THD*, TABLE_LIST*, HA_CREATE_INFO*, Alter_info*) sql/sql_table.cc:7950 #13 0x3b58b9b in Sql_cmd_create_table::execute(THD*) sql/sql_cmd_ddl_table.cc:319 #14 0x15917c1 in mysql_execute_command(THD*, bool) sql/sql_parse.cc:4417 #15 0x15b086e in mysql_parse(THD*, Parser_state*, bool) sql/sql_parse.cc:5139 #16 0x8efc7fd in Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned long) sql/log_event.cc:5295 #17 0x8f7ea48 in Log_event::apply_event(Relay_log_info*) sql/log_event.cc:3882 #18 0x91cb682 in apply_event_and_update_pos sql/rpl_slave.cc:4352 #19 0x9215e69 in exec_relay_log_event sql/rpl_slave.cc:4812 #20 0x9254685 in handle_slave_sql sql/rpl_slave.cc:6912 #21 0xb1913a3 in pfs_spawn_thread storage/perfschema/pfs.cc:2836 #22 0x7fe9231436b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) Fix by adding the missing mem_heap_free(heap) call.

Avoid undefined behavior in audit_log_update_thd_local by avoiding passing NULL as source pointer to memcpy, even with zero length. The UBSan report fixed is /usr/include/x86_64-linux-gnu/bits/string3.h:53:71: runtime error: null pointer passed as argument 2, which is declared to never be null #0 0x7fe5aad56fb1 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53 #1 0x7fe5aad56fb1 in audit_log_update_thd_local plugin/audit_log/audit_log.cc:987 #2 0x7fe5aad56fb1 in audit_log_notify plugin/audit_log/audit_log.cc:1105 #3 0x1ecac37 in plugins_dispatch sql/sql_audit.cc:1284 #4 0x1ecac37 in event_class_dispatch sql/sql_audit.cc:1322 #5 0x1ecb311 in event_class_dispatch_error sql/sql_audit.cc:1340 #6 0x1ed21b1 in mysql_audit_notify(THD*, mysql_event_connection_subclass_t, char const*, int) sql/sql_audit.cc:438 #7 0x1350071 in check_connection sql/sql_connect.cc:868 #8 0x1350071 in login_connection sql/sql_connect.cc:929 #9 0x1357881 in thd_prepare_connection(THD*, bool) sql/sql_connect.cc:1084 #10 0x1e66347 in handle_connection sql/conn_handler/connection_handler_per_thread.cc:313 #11 0xb1913a3 in pfs_spawn_thread storage/perfschema/pfs.cc:2836 #12 0x7fe5d352f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9) #13 0x7fe5d0bd741c in clone (/lib/x86_64-linux-gnu/libc.so.6+0x10741c)

Upstream commit ID : fb-mysql-5.6.35/8cb1dc836b68f1f13e8b2655b2b8cb2d57f400b3 PS-5217 : Merge fb-prod201803 Summary: Original report: https://jira.mariadb.org/browse/MDEV-15816 To reproduce this bug just following below steps, client 1: USE test; CREATE TABLE t1 (i INT) ENGINE=MyISAM; HANDLER t1 OPEN h; CREATE TABLE t2 (i INT) ENGINE=RocksDB; LOCK TABLES t2 WRITE; client 2: FLUSH TABLES WITH READ LOCK; client 1: INSERT INTO t2 VALUES (1); So client 1 acquired the lock and set m_lock_rows = RDB_LOCK_WRITE. Then client 2 calls store_lock(TL_IGNORE) and m_lock_rows was wrongly set to RDB_LOCK_NONE, as below ``` #0 myrocks::ha_rocksdb::store_lock (this=0x7fffbc03c7c8, thd=0x7fffc0000ba0, to=0x7fffc0011220, lock_type=TL_IGNORE) percona#1 get_lock_data (thd=0x7fffc0000ba0, table_ptr=0x7fffe84b7d20, count=1, flags=2) percona#2 mysql_lock_abort_for_thread (thd=0x7fffc0000ba0, table=0x7fffbc03bbc0) percona#3 THD::notify_shared_lock (this=0x7fffc0000ba0, ctx_in_use=0x7fffbc000bd8, needs_thr_lock_abort=true) percona#4 MDL_lock::notify_conflicting_locks (this=0x555557a82380, ctx=0x7fffc0000cc8) percona#5 MDL_context::acquire_lock (this=0x7fffc0000cc8, mdl_request=0x7fffe84b8350, lock_wait_timeout=2) percona#6 Global_read_lock::lock_global_read_lock (this=0x7fffc0003fe0, thd=0x7fffc0000ba0) ``` Finally, client 1 "INSERT INTO..." hits the Assertion 'm_lock_rows == RDB_LOCK_WRITE' failed in myrocks::ha_rocksdb::write_row() Fix this bug by not setting m_locks_rows if lock_type == TL_IGNORE. Closes facebook/mysql-5.6#838 Pull Request resolved: facebook/mysql-5.6#871 Differential Revision: D9417382 Pulled By: lth fbshipit-source-id: c36c164e06c

…s=0 and a local DDL executed https://perconadev.atlassian.net/browse/PS-9018 Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- #1 __GI___lll_lock_wait #2 ___pthread_mutex_lock #3 native_mutex_lock <= waits for commit lock while holding sidno lock percona#4 Commit_stage_manager::enroll_for percona#5 MYSQL_BIN_LOG::change_stage percona#6 MYSQL_BIN_LOG::ordered_commit percona#7 MYSQL_BIN_LOG::commit percona#8 ha_commit_trans percona#9 trans_commit_implicit percona#10 mysql_create_like_table percona#11 Sql_cmd_create_table::execute percona#12 mysql_execute_command percona#13 dispatch_sql_command Applier thread -------------- #1 ___pthread_mutex_lock #2 native_mutex_lock #3 safe_mutex_lock percona#4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock percona#5 Gtid_state::update_commit_group percona#6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here percona#7 Commit_order_manager::finish percona#8 Commit_order_manager::wait_and_finish percona#9 ha_commit_low percona#10 trx_coordinator::commit_in_engines percona#11 MYSQL_BIN_LOG::commit percona#12 ha_commit_trans percona#13 trans_commit percona#14 Xid_log_event::do_commit percona#15 Xid_apply_log_event::do_apply_event_worker percona#16 Slave_worker::slave_worker_exec_event percona#17 slave_worker_exec_job_group percona#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.

PS-5741: Incorrect use of memset_s in keyring_vault. Fixed the usage of memset_s. The arguments should be: void memset_s(void *dest, size_t dest_max, int c, size_t n) where the 2nd argument is size of buffer and the 3rd is argument is character to fill. --------------------------------------------------------------------------- PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate --- *Problem:* `st_mysql_value::val_str` might return a pointer to `buf` which after the function called is deleted. Therefore the value in `save`, after reuturnin from the function, is invalid. In this particular case, the error is not manifesting as val_str` returns memory allocated with `thd_strmake` and it does not use `buf`. *Solution:* Allocate memory with `thd_strmake` so the memory in `save` is not local. --------------------------------------------------------------------------- Fix test main.bug12969156 when WITH_ASAN=ON *Problem:* ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`: ``` ==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478 WRITE of size 24 at 0x7fe746d06d14 thread T16777215 Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62 This frame has 4 object(s): [48, 56) 'result' (line 66) [80, 112) '_db_stack_frame_' (line 63) [144, 200) 'tm_tmp' (line 67) [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported) Thread T26 created by T25 here: #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216 #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104 #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148 #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279 percona#4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279 percona#5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664 percona#6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160 percona#7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952 percona#8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544 percona#9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065 percona#10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325 percona#11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198 percona#12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473 ``` The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above. This is a benign error as all the variables are on the stack. *Solution*: Finish the thread in orderly way by using a signalling variable. --------------------------------------------------------------------------- PS-8204: Fix XML escape rules for audit plugin https://jira.percona.com/browse/PS-8204 There was a wrong length specified for some XML escape rules. As a result of this terminating null symbol from replacement rule was copied into resulting string. This lead to quer text truncation in audit log file. In addition added empty replacement rules for '\b' and 'f' symbols which just remove them from resulting string. These symboles are not supported in XML 1.0. --------------------------------------------------------------------------- PS-8854: Add main.percona_udf MTR test Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.

…n read() syscall over network https://jira.percona.com/browse/PS-8592 Description ----------- GR suffered from problems caused by the security probes and network scanner processes connecting to the group replication communication port. This usually is not a problem, but poses a serious threat when another member tries to join the cluster by initialting a connection to the member which is affected by external processes using the port dedicated for group communication for longer durations. On such activites by external processes, the SSL enabled server stalled forever on the SSL_accept() call waiting for handshake data. Below is the stacktrace: Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)): #0 in read () #1 in sock_read () #2 in BIO_read () #3 in ssl23_read_bytes () percona#4 in ssl23_get_client_hello () percona#5 in ssl23_accept () percona#6 in xcom_tcp_server_startup(Xcom_network_provider*) () When the server stalled in the above path forever, it prohibited other members to join the cluster resulting in the following messages on the joiner server's logs. [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group' [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.' Solution -------- This patch adds two new variables 1. group_replication_xcom_ssl_socket_timeout It is a file-descriptor level timeout in seconds for both accept() and SSL_accept() calls when group replication is listening on the xcom port. When set to a valid value, say for example 5 seconds, both accept() and SSL_accept() return after 5 seconds. The default value has been set to 0 (waits infinitely) for backward compatibility. This variable is effective only when GR is configred with SSL. 2. group_replication_xcom_ssl_accept_retries It defines the number of retries to be performed before closing the socket. For each retry the server thread calls SSL_accept() with timeout defined by the group_replication_xcom_ssl_socket_timeout for the SSL handshake process once the connection has been accepted by the first accept() call. The default value has been set to 10. This variable is effective only when GR is configred with SSL. Note: - Both of the above variables are dynamically configurable, but will become effective only on START GROUP_REPLICATION. ------------------------------------------------------------------------- PS-8844: Fix the failing main.mysqldump_gtid_purged https://jira.percona.com/browse/PS-8844 This patch fixes the test failure of main.mysqldump_gtid_purged that failed due to the uninitialized variable $redirect_stderr in the start_proc_in_background.inc.

…ocal DDL executed https://perconadev.atlassian.net/browse/PS-9018 Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- #1 __GI___lll_lock_wait #2 ___pthread_mutex_lock #3 native_mutex_lock <= waits for commit lock while holding sidno lock #4 Commit_stage_manager::enroll_for percona#5 MYSQL_BIN_LOG::change_stage percona#6 MYSQL_BIN_LOG::ordered_commit percona#7 MYSQL_BIN_LOG::commit percona#8 ha_commit_trans percona#9 trans_commit_implicit percona#10 mysql_create_like_table percona#11 Sql_cmd_create_table::execute percona#12 mysql_execute_command percona#13 dispatch_sql_command Applier thread -------------- #1 ___pthread_mutex_lock #2 native_mutex_lock #3 safe_mutex_lock #4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock percona#5 Gtid_state::update_commit_group percona#6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here percona#7 Commit_order_manager::finish percona#8 Commit_order_manager::wait_and_finish percona#9 ha_commit_low percona#10 trx_coordinator::commit_in_engines percona#11 MYSQL_BIN_LOG::commit percona#12 ha_commit_trans percona#13 trans_commit percona#14 Xid_log_event::do_commit percona#15 Xid_apply_log_event::do_apply_event_worker percona#16 Slave_worker::slave_worker_exec_event percona#17 slave_worker_exec_job_group percona#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.

PS-5741: Incorrect use of memset_s in keyring_vault. Fixed the usage of memset_s. The arguments should be: void memset_s(void *dest, size_t dest_max, int c, size_t n) where the 2nd argument is size of buffer and the 3rd is argument is character to fill. --------------------------------------------------------------------------- PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate --- *Problem:* `st_mysql_value::val_str` might return a pointer to `buf` which after the function called is deleted. Therefore the value in `save`, after reuturnin from the function, is invalid. In this particular case, the error is not manifesting as val_str` returns memory allocated with `thd_strmake` and it does not use `buf`. *Solution:* Allocate memory with `thd_strmake` so the memory in `save` is not local. --------------------------------------------------------------------------- Fix test main.bug12969156 when WITH_ASAN=ON *Problem:* ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`: ``` ==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478 WRITE of size 24 at 0x7fe746d06d14 thread T16777215 Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62 This frame has 4 object(s): [48, 56) 'result' (line 66) [80, 112) '_db_stack_frame_' (line 63) [144, 200) 'tm_tmp' (line 67) [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported) Thread T26 created by T25 here: #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216 #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104 #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148 #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279 percona#4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279 percona#5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664 percona#6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160 percona#7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952 percona#8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544 percona#9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065 percona#10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325 percona#11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198 percona#12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473 ``` The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above. This is a benign error as all the variables are on the stack. *Solution*: Finish the thread in orderly way by using a signalling variable. --------------------------------------------------------------------------- PS-8204: Fix XML escape rules for audit plugin https://jira.percona.com/browse/PS-8204 There was a wrong length specified for some XML escape rules. As a result of this terminating null symbol from replacement rule was copied into resulting string. This lead to quer text truncation in audit log file. In addition added empty replacement rules for '\b' and 'f' symbols which just remove them from resulting string. These symboles are not supported in XML 1.0. --------------------------------------------------------------------------- PS-8854: Add main.percona_udf MTR test Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.

…n read() syscall over network https://jira.percona.com/browse/PS-8592 Description ----------- GR suffered from problems caused by the security probes and network scanner processes connecting to the group replication communication port. This usually is not a problem, but poses a serious threat when another member tries to join the cluster by initialting a connection to the member which is affected by external processes using the port dedicated for group communication for longer durations. On such activites by external processes, the SSL enabled server stalled forever on the SSL_accept() call waiting for handshake data. Below is the stacktrace: Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)): #0 in read () #1 in sock_read () #2 in BIO_read () #3 in ssl23_read_bytes () percona#4 in ssl23_get_client_hello () percona#5 in ssl23_accept () percona#6 in xcom_tcp_server_startup(Xcom_network_provider*) () When the server stalled in the above path forever, it prohibited other members to join the cluster resulting in the following messages on the joiner server's logs. [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group' [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.' Solution -------- This patch adds two new variables 1. group_replication_xcom_ssl_socket_timeout It is a file-descriptor level timeout in seconds for both accept() and SSL_accept() calls when group replication is listening on the xcom port. When set to a valid value, say for example 5 seconds, both accept() and SSL_accept() return after 5 seconds. The default value has been set to 0 (waits infinitely) for backward compatibility. This variable is effective only when GR is configred with SSL. 2. group_replication_xcom_ssl_accept_retries It defines the number of retries to be performed before closing the socket. For each retry the server thread calls SSL_accept() with timeout defined by the group_replication_xcom_ssl_socket_timeout for the SSL handshake process once the connection has been accepted by the first accept() call. The default value has been set to 10. This variable is effective only when GR is configred with SSL. Note: - Both of the above variables are dynamically configurable, but will become effective only on START GROUP_REPLICATION. ------------------------------------------------------------------------- PS-8844: Fix the failing main.mysqldump_gtid_purged https://jira.percona.com/browse/PS-8844 This patch fixes the test failure of main.mysqldump_gtid_purged that failed due to the uninitialized variable $redirect_stderr in the start_proc_in_background.inc.

When built with ASAN, a use-after-free is reported for the TcpPortPool. AddressSanitizer: heap-use-after-free on address 0x60200019f190 at pc 0x00000076a18d bp 0x7fff51e7d1d0 sp 0x7fff51e7d1c0 #4 0x770b73 in UniqueId::ProcessUniqueIds::erase(unsigned int) ../router/tests/helpers/tcp_port_pool.h:112 #5 0x770c48 in UniqueId::~UniqueId() ../router/tests/helpers/tcp_port_pool.cc:234 ... #12 0x82faa3 in testing::UnitTest::~UnitTest() ../extra/googletest/googletest-release-1.12.0/googletest/src/gtest.cc:5496 #13 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8) 0x60200019f190 is located 0 bytes inside of 16-byte region [0x60200019f190,0x60200019f1a0) freed by thread T0 here: #0 0x7f5fe3cbd10f in operator delete(void*, unsigned long) (/lib64/libasan.so.6+0xb710f) #1 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8) Background ========== __run_exit_handlers destroys "static" and "global" variables in reverse order of their creation. googletest's unit-tests are a static, and the TcpPortPool also has ProcessUniqueId's which contains the process-wide unique-ids. At construct: unittest -> tcp-port-pool -> proces-unique-ids At destruct : process-unique-ids -> tcp-port-pool -> 💥 The use-after-free happens as the process-unique-ids static is destructed before the tcp-port-pool which tries to its Ids from the process-unique-ids. Change ====== - extend the lifetime of the process-unique-ids to after the last use of the tcp-port-pool via a std::shared_ptr<> Change-Id: I75b8b781e1d240f18ca72f2c86182639a7699f06

…nt on Windows and posix [#5] A test program NdbProcess-t for testing argument quoting. Change-Id: I37842d1109f76f684636eb7c763d2d5a46114aae

Problem: Starting ´ndb_mgmd --bind-address´ may potentially cause abnormal program termination in MgmtSrvr destructor when ndb_mgmd restart itself. Core was generated by `ndb_mgmd --defa'. Program terminated with signal SIGABRT, Aborted. #0 0x00007f8ce4066b8f in raise () from /lib64/libc.so.6 #1 0x00007f8ce4039ea5 in abort () from /lib64/libc.so.6 #2 0x00007f8ce40a7d97 in __libc_message () from /lib64/libc.so.6 #3 0x00007f8ce40af08c in malloc_printerr () from /lib64/libc.so.6 #4 0x00007f8ce40b132d in _int_free () from /lib64/libc.so.6 #5 0x00000000006e9ffe in MgmtSrvr::~MgmtSrvr (this=0x28de4b0) at mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp: 890 #6 0x00000000006ea09e in MgmtSrvr::~MgmtSrvr (this=0x2) at mysql/8.0/ storage/ndb/src/mgmsrv/MgmtSrvr.cpp:849 #7 0x0000000000700d94 in mgmd_run () at mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:260 #8 0x0000000000700775 in mgmd_main (argc=<optimized out>, argv=0x28041d0) at mysql/8.0/storage/ndb/src/ mgmsrv/main.cpp:479 Analysis: While starting up, the ndb_mgmd will allocate memory for bind_address in order to potentially rewrite the parameter. When ndb_mgmd restart itself the memory will be released and dangling pointer causing double free. Fix: Drop support for bind_address=[::], it is not documented anywhere, is not useful and doesn't work. This means the need to rewrite bind_address is gone and bind_address argument need neither alloc or free. Change-Id: I7797109b9d8391394587188d64d4b1f398887e94

Upstream commit ID : fb-mysql-5.6.35/8cb1dc836b68f1f13e8b2655b2b8cb2d57f400b3 PS-5217 : Merge fb-prod201803 Summary: Original report: https://jira.mariadb.org/browse/MDEV-15816 To reproduce this bug just following below steps, client 1: USE test; CREATE TABLE t1 (i INT) ENGINE=MyISAM; HANDLER t1 OPEN h; CREATE TABLE t2 (i INT) ENGINE=RocksDB; LOCK TABLES t2 WRITE; client 2: FLUSH TABLES WITH READ LOCK; client 1: INSERT INTO t2 VALUES (1); So client 1 acquired the lock and set m_lock_rows = RDB_LOCK_WRITE. Then client 2 calls store_lock(TL_IGNORE) and m_lock_rows was wrongly set to RDB_LOCK_NONE, as below ``` #0 myrocks::ha_rocksdb::store_lock (this=0x7fffbc03c7c8, thd=0x7fffc0000ba0, to=0x7fffc0011220, lock_type=TL_IGNORE) #1 get_lock_data (thd=0x7fffc0000ba0, table_ptr=0x7fffe84b7d20, count=1, flags=2) #2 mysql_lock_abort_for_thread (thd=0x7fffc0000ba0, table=0x7fffbc03bbc0) #3 THD::notify_shared_lock (this=0x7fffc0000ba0, ctx_in_use=0x7fffbc000bd8, needs_thr_lock_abort=true) #4 MDL_lock::notify_conflicting_locks (this=0x555557a82380, ctx=0x7fffc0000cc8) percona#5 MDL_context::acquire_lock (this=0x7fffc0000cc8, mdl_request=0x7fffe84b8350, lock_wait_timeout=2) percona#6 Global_read_lock::lock_global_read_lock (this=0x7fffc0003fe0, thd=0x7fffc0000ba0) ``` Finally, client 1 "INSERT INTO..." hits the Assertion 'm_lock_rows == RDB_LOCK_WRITE' failed in myrocks::ha_rocksdb::write_row() Fix this bug by not setting m_locks_rows if lock_type == TL_IGNORE. Closes facebook/mysql-5.6#838 Pull Request resolved: facebook/mysql-5.6#871 Differential Revision: D9417382 Pulled By: lth fbshipit-source-id: c36c164e06c

PS-5741: Incorrect use of memset_s in keyring_vault. Fixed the usage of memset_s. The arguments should be: void memset_s(void *dest, size_t dest_max, int c, size_t n) where the 2nd argument is size of buffer and the 3rd is argument is character to fill. --------------------------------------------------------------------------- PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate --- *Problem:* `st_mysql_value::val_str` might return a pointer to `buf` which after the function called is deleted. Therefore the value in `save`, after reuturnin from the function, is invalid. In this particular case, the error is not manifesting as val_str` returns memory allocated with `thd_strmake` and it does not use `buf`. *Solution:* Allocate memory with `thd_strmake` so the memory in `save` is not local. --------------------------------------------------------------------------- Fix test main.bug12969156 when WITH_ASAN=ON *Problem:* ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`: ``` ==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478 WRITE of size 24 at 0x7fe746d06d14 thread T16777215 Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62 This frame has 4 object(s): [48, 56) 'result' (line 66) [80, 112) '_db_stack_frame_' (line 63) [144, 200) 'tm_tmp' (line 67) [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported) Thread T26 created by T25 here: #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216 #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104 #2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148 #3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279 #4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279 percona#5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664 percona#6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160 percona#7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952 percona#8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544 percona#9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065 percona#10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325 percona#11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198 percona#12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473 ``` The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above. This is a benign error as all the variables are on the stack. *Solution*: Finish the thread in orderly way by using a signalling variable. --------------------------------------------------------------------------- PS-8204: Fix XML escape rules for audit plugin https://jira.percona.com/browse/PS-8204 There was a wrong length specified for some XML escape rules. As a result of this terminating null symbol from replacement rule was copied into resulting string. This lead to quer text truncation in audit log file. In addition added empty replacement rules for '\b' and 'f' symbols which just remove them from resulting string. These symboles are not supported in XML 1.0. --------------------------------------------------------------------------- PS-8854: Add main.percona_udf MTR test Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions.

…n read() syscall over network https://jira.percona.com/browse/PS-8592 Description ----------- GR suffered from problems caused by the security probes and network scanner processes connecting to the group replication communication port. This usually is not a problem, but poses a serious threat when another member tries to join the cluster by initialting a connection to the member which is affected by external processes using the port dedicated for group communication for longer durations. On such activites by external processes, the SSL enabled server stalled forever on the SSL_accept() call waiting for handshake data. Below is the stacktrace: Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)): #0 in read () #1 in sock_read () #2 in BIO_read () #3 in ssl23_read_bytes () #4 in ssl23_get_client_hello () percona#5 in ssl23_accept () percona#6 in xcom_tcp_server_startup(Xcom_network_provider*) () When the server stalled in the above path forever, it prohibited other members to join the cluster resulting in the following messages on the joiner server's logs. [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group' [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.' Solution -------- This patch adds two new variables 1. group_replication_xcom_ssl_socket_timeout It is a file-descriptor level timeout in seconds for both accept() and SSL_accept() calls when group replication is listening on the xcom port. When set to a valid value, say for example 5 seconds, both accept() and SSL_accept() return after 5 seconds. The default value has been set to 0 (waits infinitely) for backward compatibility. This variable is effective only when GR is configred with SSL. 2. group_replication_xcom_ssl_accept_retries It defines the number of retries to be performed before closing the socket. For each retry the server thread calls SSL_accept() with timeout defined by the group_replication_xcom_ssl_socket_timeout for the SSL handshake process once the connection has been accepted by the first accept() call. The default value has been set to 10. This variable is effective only when GR is configred with SSL. Note: - Both of the above variables are dynamically configurable, but will become effective only on START GROUP_REPLICATION. ------------------------------------------------------------------------------- PS-8844: Fix the failing main.mysqldump_gtid_purged https://jira.percona.com/browse/PS-8844 This patch fixes the test failure of main.mysqldump_gtid_purged that failed due to the uninitialized variable $redirect_stderr in the start_proc_in_background.inc.