Commit
WL#8599: Reduce contention in IO and SQL threads
(Step 1)

This patch introduces the worklog changes that make the slave applier read from the relay log the same way the Binlog_sender reads from the binary log: using a non-shared IO_CACHE and not relying on relay_log->LOCK_log, even when reading from the "hot" relay log file.

Made binlog_end_pos atomic
--------------------------

MYSQL_BIN_LOG::binlog_end_pos was refactored to be atomic. From the Binlog_sender perspective, this reduces the number of acquisitions of the binary log LOCK_binlog_end_pos.

With this change, readers of both binary and relay log files no longer need to acquire LOCK_binlog_end_pos to check whether they have reached the end of the "hot" log file. They only need to acquire it if they are actually going to wait for updates.

@ sql/binlog.h
  Renamed binlog_end_pos to atomic_binlog_end_pos and made it atomic. In MYSQL_BIN_LOG::get_binlog_end_pos(), removed the assertion of LOCK_binlog_end_pos ownership, which is no longer necessary now that the variable is atomic.

@ sql/rpl_binlog_sender.cc
  Refactored Binlog_sender::wait_new_events() to first check whether waiting is really needed (that is, if the binary log was not updated before LOCK_binlog_end_pos was acquired) and, only if the Binlog_sender really needs to wait, to enter the stage_master_has_sent_all_binlog_to_slave stage and wait for updates on the binary log.

Removed the relay_log->LOCK_log usage from next_event()
-------------------------------------------------------

The slave applier was refactored to not use relay_log->LOCK_log when reading events from the "hot" relay log file.

A new PSI mutex key (MYSQL_RELAY_LOG::LOCK_log_end_pos) was introduced to instrument the LOCK_binlog_end_pos on relay log files.

@ mysql-test/suite/perfschema/r/relaylog.test
  The test case had to be recorded again after the addition of the new PSI mutex key.
@ sql/mysqld.(cc|h)
  Introduced the new "MYSQL_RELAY_LOG::LOCK_log_end_pos" PSI mutex key.

So that the slave applier does not need to acquire relay_log->LOCK_log when reading from the "hot" relay log, the slave receiver now opens the relay log with the same flags used to open binary log files: O_WRONLY. This led to many changes in the slave code.

The rli->ign_master_log_* variables, which relied on relay_log->LOCK_log, are now protected by relay_log->LOCK_binlog_end_pos. This change was needed to guarantee that the updates generated by events ignored by the receiver thread are properly handled by the applier regardless of relay_log->LOCK_log.

@ sql/binlog.h
  MYSQL_BIN_LOG::update_binlog_end_pos() is now also used for the relay log. The function was refactored to remove the relay log specific code. It also has a new parameter that tells the function that LOCK_binlog_end_pos was already acquired by the caller.

  MYSQL_BIN_LOG::after_append_to_relay_log(), MYSQL_BIN_LOG::append_event() and MYSQL_BIN_LOG::append_buffer() were renamed to MYSQL_BIN_LOG::after_write_to_relay_log(), MYSQL_BIN_LOG::write_event() and MYSQL_BIN_LOG::write_buffer() respectively.

@ sql/binlog.cc
  In MYSQL_BIN_LOG::open(), there is no longer any distinction between binary and relay logs with respect to the flags used to open the IO_CACHE.

  In MYSQL_BIN_LOG::open_binlog(), replaced a relay log check that relied on the io_cache_type with a check of whether the log actually is a relay log.

  In MYSQL_BIN_LOG::after_write_to_relay_log(), replaced my_b_append_tell() with my_b_tell() as the function used to get the actual file position. Also, instead of just signaling the update of the log file, this function now also cleans up rli->ign_master_log_name_end.

  MYSQL_BIN_LOG::write_event() now asserts that log_file.type is WRITE_CACHE.

  MYSQL_BIN_LOG::write_buffer() now asserts that log_file.type is WRITE_CACHE.
  It also calls my_b_write() to write the buffer into the relay log IO_CACHE.

  MYSQL_BIN_LOG::wait_for_update_relay_log() was refactored to rely on LOCK_binlog_end_pos instead of LOCK_log and was moved to sql/rpl_slave.cc as wait_new_relaylog_events().

  In MYSQL_BIN_LOG::close, replaced a relay log check that relied on the io_cache_type with a check of whether the log actually is a relay log.

@ sql/log_event.cc
  Log_event::write_header() now calculates the event common_header->log_pos using my_b_tell(), as there is no longer any IO_CACHE of type SEQ_READ_APPEND.

@ sql/rpl_rli.h
  Removed the IO_CACHE *cur_log, as it is not needed anymore. Also removed the cur_log_old_open_count variable.

@ sql/rpl_rli.cc
  Relay_log_info::Relay_log_info() now initializes the relay_log using the WRITE_CACHE cache type. Added the initialization of key_RELAYLOG_LOCK_log_end_pos, which is now used by the relay log.

  Removed every reference to relay_log->LOCK_log from the Relay_log_info::init_relay_log_pos() function.

@ sql/rpl_slave.cc
  The write_ignored_events_info_to_relay_log() function now relies on LOCK_binlog_end_pos instead of LOCK_log.

  In queue_event(), there is a rli->relay_log.lock_binlog_end_pos() call every time the rli->ign_master_log_* variables are going to be handled.

  Created the relay_log_space_verification() static function with all the code related to relay log space verification that was inside the next_event() function.

  The major changes in this step were made in the next_event() static function. It no longer uses relay_log->LOCK_log, and relies on relay_log->LOCK_binlog_end_pos when reaching the "hot" relay log file boundaries. The function now only reads an event from the relay log file if the log is not "hot" or if the current reading position is less than the binlog_end_pos.

  Introduced the wait_new_relaylog_events() function.
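The reader-side pattern that Step 1 describes (check the end position without a lock, and take the mutex only when there is really something to wait for) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the server code: the HotLog class and its member names are hypothetical stand-ins, and standard C++ primitives replace the server's own mutex and condition wrappers.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a "hot" log file; not the MYSQL_BIN_LOG API.
class HotLog {
 public:
  // Writer side: publish the new end position, then wake waiting readers.
  // The store happens under the mutex so a reader that is about to wait
  // cannot miss the update.
  void update_end_pos(std::uint64_t pos) {
    {
      std::lock_guard<std::mutex> guard(lock_end_pos_);
      end_pos_.store(pos, std::memory_order_release);
    }
    cond_.notify_all();
  }

  // Reader side, fast path: a plain atomic load, no mutex.
  std::uint64_t get_end_pos() const {
    return end_pos_.load(std::memory_order_acquire);
  }

  // Reader side, slow path. Returns true if data beyond read_pos is
  // available, false if the timeout expired first.
  bool wait_for_update(std::uint64_t read_pos,
                       std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    // Check without the lock first: most calls should return here.
    if (read_pos < get_end_pos()) return true;
    // Only now acquire the mutex, re-check, and wait if still needed.
    std::unique_lock<std::mutex> guard(lock_end_pos_);
    return cond_.wait_for(guard, timeout, [this, read_pos] {
      return read_pos < get_end_pos();
    });
  }

 private:
  std::atomic<std::uint64_t> end_pos_{0};
  std::mutex lock_end_pos_;
  std::condition_variable cond_;
};
```

A reader would call get_end_pos() before every read and fall back to wait_for_update() only after catching up with the writer, which is the case where the description above says the lock is still needed.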
@ mysql-test/suite/rpl/t/rpl_relay_log_locking(.test|.result)
  Created a test case that relies on debug instrumentation to block the receiver thread while it is queuing an event and to ensure that the applier thread is capable of reading from the relay log up to the last queued event.

Other references
----------------

This patch also fixed:

BUG#25321231: TUNING THE LOG_LOCK CONTENTION FOR IO_THREAD AND SQL_THREAD IN RPL_SLAVE.CC

(Step 2)

This patch made the channels' retrieved_gtid_sets use their own sid_map/sid_lock and created a class, to be used by Master_info and Binlog_sender, that avoids locking when checking the current server GTID_MODE.

Gtid_mode_copy class
--------------------

Any operation that needed to check the current server GTID_MODE would acquire the global_sid_lock in order to read it. This is a very fast operation (just an access to a server global variable), but when done by many concurrent threads it can have an impact, mostly on commit operations that acquire the global_sid_lock exclusively. Also, while the server is committing a group of transactions, the global_sid_lock is held for writing, so any operation trying to check the server GTID_MODE has to wait.

GTID_MODE is a global variable that should not be changed often, but access to it is protected by any of the four locks described at enum_gtid_mode_lock.

Every time a channel receiver thread connects to a master, and every time a Gtid_log_event or an Anonymous_gtid_log_event is queued by a receiver thread or is going to be sent by the Binlog_sender to a receiver, it must be checked whether the current GTID_MODE is compatible with the operation.
In some places the verification is performed while already holding one of the above-mentioned locks, but other places hold no specific lock and therefore rely on the global_sid_lock, blocking any other GTID operation relying on the global_sid_map for writing (like a group of transactions being committed).

In order to avoid acquiring a lock to check a variable that is not changed often, we introduced a global (atomic) counter of how many times the GTID_MODE was changed since server startup.

The Gtid_mode_copy class was implemented to hold a copy of the last GTID_MODE, which is returned without acquiring any lock if the local GTID mode counter has the same value as the global atomic counter.

@ sql/mysqld.cc
  Introduced the new PSI_rwlock_key key_rwlock_receiver_sid_lock.

@ sql/rpl_gtid_misc.cc
  Declared the global atomic _gtid_mode_counter.

@ sql/rpl_gtid.h
  Declared the external atomic _gtid_mode_counter. Defined DEFAULT_GTID_MODE as GTID_MODE_OFF. Introduced the Gtid_mode_copy class.

@ sql/rpl_binlog_sender.h
  Made the Binlog_sender class inherit from Gtid_mode_copy.

@ sql/rpl_binlog_sender.cc
  Replaced the calls to get_gtid_mode() with get_gtid_mode_from_copy().

@ sql/rpl_slave.cc
  In recover_relay_log(), init_recovery(), start_slave_threads(), get_master_version_and_clock() and queue_event(), replaced the call to get_gtid_mode() with get_gtid_mode_from_copy().

@ sql/sys_vars.cc
  Incremented the _gtid_mode_counter when GTID_MODE is changed. Also, made the GTID_MODE global variable have DEFAULT_GTID_MODE as its default value.
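The counter-based caching behind Gtid_mode_copy can be illustrated with a small sketch. It is written under assumptions and is not the class from the patch: the names Gtid_mode_copy_sketch, g_mode_lock, g_gtid_mode and set_gtid_mode() are hypothetical, a plain std::mutex stands in for the read/write locks that protect GTID_MODE, and the slow_path_reads() counter exists only to make the behavior observable.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Simplified stand-in for the server's GTID mode values.
enum Gtid_mode {
  GTID_MODE_OFF,
  GTID_MODE_OFF_PERMISSIVE,
  GTID_MODE_ON_PERMISSIVE,
  GTID_MODE_ON
};

// Hypothetical globals: the lock stands in for global_sid_lock.
std::mutex g_mode_lock;
Gtid_mode g_gtid_mode = GTID_MODE_OFF;
std::atomic<unsigned long> g_gtid_mode_counter{1};

// Changing the mode bumps the counter while the lock is held.
void set_gtid_mode(Gtid_mode mode) {
  assert(mode >= GTID_MODE_OFF && mode <= GTID_MODE_ON);
  std::lock_guard<std::mutex> guard(g_mode_lock);
  g_gtid_mode = mode;
  g_gtid_mode_counter.fetch_add(1);
}

// Each user (for example one sender or one receiver) owns one copy.
class Gtid_mode_copy_sketch {
 public:
  Gtid_mode get() {
    // Fast path: counters match, so the cached mode is still current.
    if (g_gtid_mode_counter.load() != local_counter_) {
      // Slow path: refresh mode and counter together under the lock.
      std::lock_guard<std::mutex> guard(g_mode_lock);
      cached_mode_ = g_gtid_mode;
      local_counter_ = g_gtid_mode_counter.load();
      ++slow_path_reads_;
    }
    return cached_mode_;
  }

  // Number of times the lock had to be taken (for illustration only).
  unsigned slow_path_reads() const { return slow_path_reads_; }

 private:
  Gtid_mode cached_mode_ = GTID_MODE_OFF;
  unsigned long local_counter_ = 0;  // 0 forces one initial refresh
  unsigned slow_path_reads_ = 0;
};
```

Repeated calls to get() between mode changes never touch the lock, so only the first read after a change pays for synchronization.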
Retrieved_gtid_sets with their own SID maps/SID locks
-----------------------------------------------------

Any GTID set operation relying on a given SID map (and its respective lock) will be blocked by any other operation (on any other GTID set) holding the SID lock for writing.

All server GTID state sets (lost_gtids, executed_gtids, gtids_only_in_table, previous_gtids_logged and owned_gtids) rely on the global SID map (and on the global SID lock).

So, when GTIDs are committed in the server, the updates to the GTID state lock the SID map for writing to prevent other threads from updating the GTID state (or reading from it while it is being updated).

The side effect of keeping other threads from reading or updating a GTID set in this way is that any other GTID activity on other GTID sets relying on the same SID map/SID lock is blocked as well.

So, before this patch, the replication receiver threads had their Retrieved_Gtid_Set relying on the global SID map/lock. As a result, when a group commit was updating the GTIDs of the committed transactions, any replication receiver trying to queue a Gtid_log_event, or finishing queuing a GTID transaction, had to wait for the group commit to unlock the global SID lock. Also, a group commit trying to lock the global SID lock for writing had to wait for all receiver threads queuing GTIDs to finish before being granted ownership of the lock.

In the cases described above, the global SID lock is taken for small operations, and there is no significant impact on server performance in a slave server replicating through a single replication channel with medium to large transactions and without MTS. But when the slave is scaled to have many replication channels, and/or replicates many small transactions using MTS, the impact of contention on the global SID lock becomes noticeable.

This patch makes all receiver threads rely on their own (individual) SID maps and locks.
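The effect of giving each receiver its own SID map and lock can be shown with a toy model. Everything here is an assumption made for illustration: Sid_map_sketch is a hypothetical class, not the server's Sid_map, and a std::mutex replaces the read/write SID lock.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical SID map that owns its lock, so locking one map never
// blocks operations on another map.
class Sid_map_sketch {
 public:
  // Returns the numeric id for uuid, registering it if it is new.
  int add_sid(const std::string &uuid) {
    assert(!uuid.empty());
    std::lock_guard<std::mutex> guard(lock_);
    auto it = ids_.find(uuid);
    if (it != ids_.end()) return it->second;
    const int id = static_cast<int>(ids_.size()) + 1;
    ids_.emplace(uuid, id);
    return id;
  }

  // Returns true if uuid has already been registered in this map.
  bool contains(const std::string &uuid) {
    std::lock_guard<std::mutex> guard(lock_);
    return ids_.count(uuid) != 0;
  }

  // Exposed so a caller can hold the lock across several operations,
  // the way a group commit holds the global SID lock.
  std::mutex &lock() { return lock_; }

 private:
  std::mutex lock_;
  std::map<std::string, int> ids_;
};
```

With one such map for the server state and one per receiver, a thread holding the server map's lock does not delay a receiver registering a UUID in its own map, and that UUID does not have to exist in the server map.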
@ sql/binlog.cc
  The MYSQL_BIN_LOG::init_gtid_sets(), MYSQL_BIN_LOG::open_binlog() and MYSQL_BIN_LOG::reset_logs() functions were refactored to use the global_sid_lock when dealing with the binary log and the relay log sid_lock when dealing with the relay log.

  MYSQL_BIN_LOG::after_write_to_relay_log() now uses only the relay log sid_lock.

@ sql/log_event.cc
  Previous_log_event should assert that the SID map of the GTID set passed as parameter is locked (it is not the global_sid_lock for relay log events).

@ sql/mysqld.h
  Declared the new PSI_rwlock_key key_rwlock_receiver_sid_lock.

@ sql/mysqld.cc
  In gtid_server_init(), initialized the global _gtid_mode_counter. Introduced the new PSI_rwlock_key key_rwlock_receiver_sid_lock.

@ sql/rpl_channel_service_interface.cc
  The channel_get_last_delivered_gno() function now uses the relay log sid_lock.

  The channel_wait_until_apply_queue_applied() function was refactored to avoid blocking both the relay log sid_lock and the global_sid_lock while waiting for the condition.

@ sql/rpl_gtid.h
  Enabled the declaration of Sid_map::clear() regardless of compiler directives.

  Declared a new static function Sid_map::get_new_sid_map() to retrieve a new empty SID map with its own SID lock.

  Declared the Gtid_set::clear_set_and_sid_map() function, which takes care of cleaning the SID map after cleaning the GTID set.

@ sql/rpl_gtid_set.cc
  Introduced the Gtid_set::clear_set_and_sid_map() function, which takes care of cleaning the SID map after cleaning the GTID set.

@ sql/rpl_gtid_sid_map.cc
  Removed the compiler directives preventing the compilation of Sid_map::clear().

@ sql/rpl_rli.h
  Made the (retrieved) gtid_set a pointer.
  Added functions to get the GTID set SID map (get_sid_map()) and SID lock (get_sid_lock()).

  Changed the add_logged_gtid() function to use the relay log SID map and lock.

  Declared a new wait_for_gtid_set() function receiving a char* parameter instead of a String*.

@ sql/rpl_rli.cc
  Refactored the gtid_set initialization in the Relay_log_info constructor, and cleaned up the GTID set, SID map and lock in the destructor.

  Introduced the new wait_for_gtid_set() function receiving a char* parameter instead of a String*, and refactored the wait_for_gtid_set() that receives a String* to call the newly introduced one.

  Added some assertions to Relay_log_info::wait_for_gtid_set() to ensure that the GTID set to wait for relies on the global_sid_map or has no SID map.

  Relay_log_info::purge_relay_logs() now uses the relay log SID lock and also clears the relay log SID map when cleaning the retrieved GTID set.

  Relay_log_info::rli_init_info now uses the relay log SID lock.

  Relay_log_info::add_gtid_set() now uses the relay log SID lock.

@ sql/rpl_slave.cc
  The recover_relay_log() function now uses the relay log SID lock and also clears the relay log SID map when cleaning the retrieved GTID set.

  The show_slave_status() functions were refactored to use the relay log SID lock when dealing with the retrieved GTID sets.

  The request_dump() function now uses the relay log SID lock when dealing with the retrieved GTID set.

  The queue_event() function now uses the relay log SID lock when dealing with GTIDs of received Gtid_log_events.

@ storage/perfschema/table_replication_connection_status.cc
  The table_replication_connection_status::make_row() function was refactored to use the relay log SID lock when dealing with the retrieved GTID sets.

(Step 3)

This patch moved the call to flush_master_info(), which the I/O thread made after a successful call to queue_event(), to inside the queue_event() function, in order to reuse the already acquired mi->data_lock and relay_log->LOCK_log.
This avoids acquiring the above-mentioned locks twice for every successfully queued event.

It also added a new parameter to flush_master_info() that makes flushing the relay log optional. The previous approach led to flushing the relay log twice per event.

@ sql/rpl_channel_service_interface.cc
  Specified the new queue_event() parameter to not flush master info after queuing the event.

@ sql/rpl_slave.h
  Changed the flush_master_info() declaration by adding a new parameter that tells the function whether it needs to acquire the required locks or the locks are already acquired, and a new parameter that tells the function whether it needs to flush the relay log.

  Declared the QUEUE_EVENT_RESULT enum with the possible results of the queue_event() function.

  Changed the queue_event() declaration to return QUEUE_EVENT_RESULT and to accept a new parameter that tells the function to also flush master info after an event is successfully queued.

@ sql/rpl_slave.cc
  The flush_master_info() function was changed to no longer always acquire relay_log->LOCK_log, but to rely on the need_lock parameter to do so. It was also changed to flush the relay log only when the new flush_relay_log parameter requests it. This prevents flushing the relay log twice when queuing events.

  In handle_slave_io(), refactored the calls to queue_event() and flush_master_info() to use the newly implemented parameters.

  Refactored the queue_event() function to return QUEUE_EVENT_RESULT, and to flush master info without flushing the relay log when an event is successfully queued.

Added test cases to improve code coverage:
- rpl_write_ignored_events: ensures that ignored events not yet consumed by the slave are taken into account by the SQL thread if the I/O thread is stopped before the SQL thread has consumed the ignored events info.
- rpl_write_ignored_events_fail_writing_rotate: ensures correct I/O thread behavior when failures happen while writing the ignored events info to the relay log.

Also commented out an unreachable piece of code to make gcov happy.
Added test cases to verify that the receiver threads' GTID sets no longer rely on the global SID map:
- rpl_multi_source_block_receiver: checks that a receiver thread receiving GTIDs (and adding them to its retrieved GTID set) can apply GTIDs from a server UUID that doesn't belong to the global SID map yet.
- rpl_line_topology_receiver_block: checks that a receiver thread on a slave receiving GTIDs (and adding them to its retrieved GTID set) can have other servers replicating from it.
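The lock reuse introduced in Step 3 can be pictured with the following sketch. It is only an illustration under assumptions: Channel_sketch, flush_info(), queue_event_sketch() and the Queue_result values are hypothetical names, a single std::mutex stands in for mi->data_lock and relay_log->LOCK_log, and counters replace the real I/O so the number of lock acquisitions and flushes can be observed.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical per-channel state with counters instead of real I/O.
struct Channel_sketch {
  std::mutex data_lock;
  int lock_acquisitions = 0;
  int relay_log_flushes = 0;
  int info_flushes = 0;
};

// Hypothetical result values modeled on the idea of QUEUE_EVENT_RESULT.
enum class Queue_result { ok, error_queuing, error_flushing_info };

// need_lock: false when the caller already holds data_lock.
// flush_relay_log: false when the relay log was just flushed.
bool flush_info(Channel_sketch &ch, bool need_lock, bool flush_relay_log) {
  std::unique_lock<std::mutex> guard(ch.data_lock, std::defer_lock);
  if (need_lock) {
    guard.lock();
    ++ch.lock_acquisitions;
  }
  if (flush_relay_log) ++ch.relay_log_flushes;
  ++ch.info_flushes;
  return true;
}

// Queues one event and, on success, flushes the info while still holding
// the lock, so the lock is taken once and the relay log is flushed once.
Queue_result queue_event_sketch(Channel_sketch &ch, bool event_ok,
                                bool do_flush_info) {
  std::lock_guard<std::mutex> guard(ch.data_lock);
  ++ch.lock_acquisitions;
  if (!event_ok) return Queue_result::error_queuing;
  ++ch.relay_log_flushes;  // writing the event flushes the relay log
  if (do_flush_info && !flush_info(ch, /*need_lock=*/false,
                                   /*flush_relay_log=*/false))
    return Queue_result::error_flushing_info;
  return Queue_result::ok;
}
```

A caller that queues events through some other path can still call flush_info(ch, true, true) on its own, which corresponds to the old behavior of acquiring the locks and flushing the relay log inside the flush function.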