Merge 5.5.42 #10

laurynas-biveinis · 2015-02-23T10:52:25Z

http://jenkins.percona.com/job/percona-server-5.5-param/1094/

G2

Description: Using correct length when moving to next field in cmp_ref. The store length already includes the length bytes of blobs, which is already considered earlier for blob types. Approved by Mattias, Jimmy [rb-7088]

…TING PERL "HOSTNAMES" When building RPMs and using the "rpmbuild" automatic scanning for Perl dependencies, it might interpret "use hostnames" in the "mysql_config.pl" script usage comment as a Perl "use" statement. And then makes the resulting RPMs depends on a non-existing module Perl "hostnames". The comment was changed to prevent this.

Fix broken gcc 4.9.1 debug build by removing end of line noise. In 5.6+ this issue was already fixed by: ------------------------------------------------------------ revno: 3091 committer: Davi Arnaut <davi.arnaut@oracle.com> branch nick: mysql-trunk timestamp: Mon 2011-05-16 11:30:54 -0300 message: Fix warnings emitted by Clang.

COMMUNICATION PACKETS; FEDERATED TABLE Description:- Execution of FLUSH TABLES on a federated table which has been idle for wait_timeout (on the remote server) + tcp_keepalive_time, fails with an error, "ERROR 1160 (08S01): Got an error writing communication packets." Analysis:- During FLUSH TABLE execution the federated table is closed which will inturn close the federated connection. While closing the connection, federated server tries to communincate with the remote server. Since the connection was idle for wait_timeout(on the remote server)+ tcp_keepalive_time, the socket gets closed. So this communication fails because of broken pipe and the error is thrown. But federated connections are expected to reconnect silently. And also it cannot reconnect because the "auto_reconnect" variable is set to 0 in "mysql_close()". Fix:- Before closing the federated connection, in "ha_federated_close()", a check is added which will verify wheather the connection is alive or not. If the connection is not alive, then "mysql->net.error" is set to 2 which will indicate that the connetion is broken. Also the setting of "auto_reconnect" variable to 0 is delayed and is done after "COM_QUIT" command. NOTE:- For reproducing this issue, "tcp_keepalive_time" has to be set to a smaller value. This value is set in the "/proc/sys/net/ipv4/tcp_keepalive_time" file in Unix systems. So we need root permission for changing it, which can't be done through mtr test. So submitting the patch without mtr test.

… REPO For 'make dist': only use 'bzr export' if bzr root == ${CMAKE_SOURCE_DIR} Same thing for git.

…G TO GIT Use 'git log -1; git branch' rather than 'bzr version-info'

Use 'git ls-files' to find source files for etags.

Change the format of 'git log' to produce INFO_SRC: commit: <commit hash> date: 2014-11-12 11:11:10 +0100 build-date: 2014-11-17 15:24:16 +0100 short: <abbreviated commit hash> branch: mysql-5.5

Analysis: -------- Certain queries using intrinsic temporary tables may fail due to name clashes in the file name for the temporary table when the 'temp-pool' enabled. 'temp-pool' tries to reduce the number of different filenames used for temp tables by allocating them from small pool in order to avoid problems in the Linux kernel by using a three part filename: <tmp_file_prefix>_<pid>_<temp_pool_slot_num>. The bit corresponding to the temp_pool_slot_num is set in the bit map maintained for the temp-pool when it used for the file name. It is cleared after the temp table is deleted for re-use. The 'create_tmp_table()' function call under error condition tries to clear the same bit twice by calling 'free_tmp_table()' and 'bitmap_lock_clear_bit()'. 'free_tmp_table()' does a delete of the table/file and clears the bit by calling the same function 'bitmap_lock_clear_bit()'. The issue reported can be triggered under the timing window mentioned below for an error condition while creating the temp table: a) THD1: Due to an error clears the temp pool slot number used by it by calling 'free_tmp_table'. b) THD2: In the process of creating the temp table by using an unused slot number in the bit map. c) THD1: Clears the slot number used THD2 by calling 'bitmap_lock_clear_bit()' after completing the call 'free_tmp_table'. d) THD3: Uses the slot number used the THD2 since it is freed by THD1. When it tries to create the temp file using that slot number, an error is reported since it is currently in use by THD2. [The error: Error 'Can't create/write to file '/tmp/#sql_277e_0.MYD' (Errcode: 17)'] Another issue which may occur in 5.6 and trunk is that: When the open temporary table fails after its creation(due to ulimit or OOM error), the file is not deleted. Thus further attempts to use the same slot number in the 'temp-pool' results in failure. Fix: --- a) Under the error condition calling the 'bitmap_lock_clear_bit()' function to clear the bit is unnecessary since 'free_tmp_table()' deletes the table/file and clears the bit. Hence removed the redundant call 'bitmap_lock_clear_bit()' in 'create_tmp_table()' This prevents the timing window under which the issue reported can be seen. b) If open of the temporary table fails, then the file is deleted thus allowing the temp-pool slot number to be utilized for the subsequent temporary table creation. c) Also if the attempt to create temp table fails since it already exists, the temp-pool slot for it is marked as used, to avoid the problem from re-appearing.

…NED_TABLES I Description: When querying a subset of columns from the information_schema.TABLES Analysis: When information about tables is collected for statements like "SELECT ENGINE FROM I_S.TABLES" we do not perform full-blown table opens in SE, instead we only use information from table shares from the Table Definition Cache or .FRMs. Still in order to simplify I_S implementation mock TABLE objects are created from TABLE_SHARE during this process. This is done by calling open_table_from_share() function with special arguments. Since this function always increments "Opened_tables" counter, calls to it can be mistakingly interpreted as full-blown table opens in SE. Note that claim that "'SELECT ENGINE FROM I_S.TABLES' statement doesn't use Table Cache" is nevertheless factually correct. But it misses the point, since such statements a) don't use full-blown TABLE objects and therefore don't do table opens b) still use Table Definition Cache. Fix: We are now incrementing the counter when db_stat(i.e open flags for ha_open( we have considered an optimization which would use TABLE objects from Table Cache when available instead of constructing mock TABLE objects, but found it too intrusive for stable releases.

CODE Problem: UDF doesn't handle the arguments properly when they are of string type due to a misplaced break. The length of arguments is also not set properly when the argument is NULL. Solution: Fixed the code by putting the break at right place and setting the argument length to zero when the argument is NULL.

Generated new certificates with validity upto 2029.

CODE Fixed a failure on pb2 caused by the patch previously pushed.

Patch for 5.5

- Moving the test case to correct place.

CRASHES ON EVERY START ATTEMPT Description: ------------ push_warning_printf function is used to print the warning message to the client. So this function should not invoke while recovering the server. Moreover current_thd is NULL while starting the server. Solution: --------- - Avoiding the warning to be printed while recovery. This patch already pushed in mysql-5.6.

Fix: === Backport Bug#11756194 to mysql-5.5. slave breaks if 'drop database' fails on master and mismatched tables on slave. 'DROP TABLE <deleted tables>' was binlogged when 'DROP DATABASE' failed and at least one table was deleted from the database. The log event would lead slave SQL thread stop if some of the tables did not exist on slave. After this patch, It is always binlogged with 'IF EXISTS' option.

Upgrading YaSSL from 2.3.5 to 2.3.7 Reviewed-by : Kristofer Pettersson <kristofer.pettersson@oracle.com> Reviewed-by : Vamsikrishna Bhagi <vamsikrishna.bhagi@oracle.com>

Explicitly disable weaker SSL protocols.

https://blueprints.launchpad.net/percona-server/+spec/merge-5.5.42. Import man pages from mysql-5.5.42.tar.gz.

The fix is to cherry-pick mysql/mysql-server@432078d from 5.6, and take mysql-test/include/linux.inc.

This reverts commit e7391e4.

george-lorch · 2015-02-26T16:56:22Z

+1 G2, for our first git based release this looks pretty good. Seems there are still some upstream tweaks being done to remove vestiges of bzr/lp.

Merge 5.5.42

…s=0 and a local DDL executed https://perconadev.atlassian.net/browse/PS-9018 Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- #1 __GI___lll_lock_wait percona#2 ___pthread_mutex_lock percona#3 native_mutex_lock <= waits for commit lock while holding sidno lock percona#4 Commit_stage_manager::enroll_for percona#5 MYSQL_BIN_LOG::change_stage percona#6 MYSQL_BIN_LOG::ordered_commit percona#7 MYSQL_BIN_LOG::commit percona#8 ha_commit_trans percona#9 trans_commit_implicit percona#10 mysql_create_like_table percona#11 Sql_cmd_create_table::execute percona#12 mysql_execute_command percona#13 dispatch_sql_command Applier thread -------------- #1 ___pthread_mutex_lock percona#2 native_mutex_lock percona#3 safe_mutex_lock percona#4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock percona#5 Gtid_state::update_commit_group percona#6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here percona#7 Commit_order_manager::finish percona#8 Commit_order_manager::wait_and_finish percona#9 ha_commit_low percona#10 trx_coordinator::commit_in_engines percona#11 MYSQL_BIN_LOG::commit percona#12 ha_commit_trans percona#13 trans_commit percona#14 Xid_log_event::do_commit percona#15 Xid_apply_log_event::do_apply_event_worker percona#16 Slave_worker::slave_worker_exec_event percona#17 slave_worker_exec_job_group percona#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). In debug builds the above case hits the assertion `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the replica applier thread when it tries to re-acquire the lock. Solution -------- In the above problematic example, when seen from each thread individually, we can conclude that there is no problem in the order of lock acquisition, thus there is no need to change the lock order. However, the root cause for this problem is that multiple threads can concurrently access to the array `Gtid_state::commit_group_sidnos`. In its initial implementation, it was expected that threads should hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired when the client thread (binlog flush leader) when it tries to perform GTID update on behalf of threads waiting in "Commit Order" queue, thus providing a guarantee that `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.

PS-5741: Incorrect use of memset_s in keyring_vault. Fixed the usage of memset_s. The arguments should be: void memset_s(void *dest, size_t dest_max, int c, size_t n) where the 2nd argument is size of buffer and the 3rd is argument is character to fill. --------------------------------------------------------------------------- PS-7769 - Fix use-after-return error in audit_log_exclude_accounts_validate --- *Problem:* `st_mysql_value::val_str` might return a pointer to `buf` which after the function called is deleted. Therefore the value in `save`, after reuturnin from the function, is invalid. In this particular case, the error is not manifesting as val_str` returns memory allocated with `thd_strmake` and it does not use `buf`. *Solution:* Allocate memory with `thd_strmake` so the memory in `save` is not local. --------------------------------------------------------------------------- Fix test main.bug12969156 when WITH_ASAN=ON *Problem:* ASAN complains about stack-buffer-overflow on function `mysql_heartbeat`: ``` ==90890==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fe746d06d14 at pc 0x7fe760f5b017 bp 0x7fe746d06cd0 sp 0x7fe746d06478 WRITE of size 24 at 0x7fe746d06d14 thread T16777215 Address 0x7fe746d06d14 is located in stack of thread T26 at offset 340 in frame #0 0x7fe746d0a55c in mysql_heartbeat(void*) /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:62 This frame has 4 object(s): [48, 56) 'result' (line 66) [80, 112) '_db_stack_frame_' (line 63) [144, 200) 'tm_tmp' (line 67) [240, 340) 'buffer' (line 65) <== Memory access at offset 340 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported) Thread T26 created by T25 here: #0 0x7fe760f5f6d5 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216 #1 0x557ccbbcb857 in my_thread_create /home/yura/ws/percona-server/mysys/my_thread.c:104 percona#2 0x7fe746d0b21a in daemon_example_plugin_init /home/yura/ws/percona-server/plugin/daemon_example/daemon_example.cc:148 percona#3 0x557ccb4c69c7 in plugin_initialize /home/yura/ws/percona-server/sql/sql_plugin.cc:1279 percona#4 0x557ccb4d19cd in mysql_install_plugin /home/yura/ws/percona-server/sql/sql_plugin.cc:2279 percona#5 0x557ccb4d218f in Sql_cmd_install_plugin::execute(THD*) /home/yura/ws/percona-server/sql/sql_plugin.cc:4664 percona#6 0x557ccb47695e in mysql_execute_command(THD*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5160 percona#7 0x557ccb47977c in mysql_parse(THD*, Parser_state*, bool) /home/yura/ws/percona-server/sql/sql_parse.cc:5952 percona#8 0x557ccb47b6c2 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/yura/ws/percona-server/sql/sql_parse.cc:1544 percona#9 0x557ccb47de1d in do_command(THD*) /home/yura/ws/percona-server/sql/sql_parse.cc:1065 percona#10 0x557ccb6ac294 in handle_connection /home/yura/ws/percona-server/sql/conn_handler/connection_handler_per_thread.cc:325 percona#11 0x557ccbbfabb0 in pfs_spawn_thread /home/yura/ws/percona-server/storage/perfschema/pfs.cc:2198 percona#12 0x7fe760ab544f in start_thread nptl/pthread_create.c:473 ``` The reason is that `my_thread_cancel` is used to finish the daemon thread. This is not and orderly way of finishing the thread. ASAN does not register the stack variables are not used anymore which generates the error above. This is a benign error as all the variables are on the stack. *Solution*: Finish the thread in orderly way by using a signalling variable. --------------------------------------------------------------------------- PS-8204: Fix XML escape rules for audit plugin https://jira.percona.com/browse/PS-8204 There was a wrong length specified for some XML escape rules. As a result of this terminating null symbol from replacement rule was copied into resulting string. This lead to quer text truncation in audit log file. In addition added empty replacement rules for '\b' and 'f' symbols which just remove them from resulting string. These symboles are not supported in XML 1.0. --------------------------------------------------------------------------- PS-8854: Add main.percona_udf MTR test Add a test to check FNV1A_64, FNV_64, and MURMUR_HASH user-defined functions. --------------------------------------------------------------------------- PS-9218: Merge MySQL 8.4.0 (fix gcc-14 build) https://perconadev.atlassian.net/browse/PS-9218