Latency glitch fix #558
Conversation
Analysis: Tables with the redundant row format that have indexes larger than 767 bytes cannot be accessed after a successful upgrade to 8.0. Prior to version 5.7.35, an index larger than 767 bytes could be created on a table with the redundant row format. When such tables are upgraded to 8.0, they become inaccessible: during table open, i.e. dd_open_table(), the index size was validated, resulting in an error, since the redundant row format does not support indexes larger than 767 bytes.

Fix: Mark such indexes of invalid size as corrupt during dd_open_table(). This involved moving the index size check to a later point in dd_open_table_one(), i.e. after dict_table_add_to_cache(). All operations that use the index will error out with "index corrupted" until the index is dropped.

Change-Id: Iff3915028f1d3631af2fe85669891144ab048856
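A hedged sketch of the ordering change described above (hypothetical names and structures, not the actual dd_open_table_one() / dict cache code): the table is added to the cache first, and an oversized index is then only flagged as corrupted, so the open itself succeeds.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified structures for illustration only.
struct Index {
  std::string name;
  unsigned key_length;
  bool corrupted;
};

struct Table {
  std::vector<Index> indexes;
  bool cached = false;
};

constexpr unsigned kMaxRedundantKeyLength = 767;

void add_to_cache(Table &t) { t.cached = true; }

// After the fix: cache the table first, then mark oversized indexes as
// corrupted instead of failing the whole open. Operations that use a
// corrupted index will error out until the index is dropped.
bool open_table(Table &t) {
  add_to_cache(t);
  for (auto &idx : t.indexes)
    if (idx.key_length > kMaxRedundantKeyLength) idx.corrupted = true;
  return true;
}

int main() {
  Table t;
  t.indexes.push_back({"idx_big", 1000, false});  // oversized for redundant format
  t.indexes.push_back({"idx_ok", 100, false});
  open_table(t);
}
```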
Follow-up patch to remove an unwanted zip file. Change-Id: Iff3915028f1d3631af2fe85669891144ab048856
Fixed the build failure due to an unused variable (used only in a debug execute-if statement). Change-Id: Iff3915028f1d3631af2fe85669891144ab048856
fields_with_changed_order did not handle tables with prefix_len keys correctly. col->get_phy_pos() returns the raw prefix phy_pos; we need to use field->get_phy_pos() to get the actual phy_pos here. Change-Id: If13449f9e6e6191cd0f2e7102f62ac28024727f8
…xcom/gcs_xcom_networking

Description: Fixed a memory leak in Gcs_ip_allowlist_entry_hostname::get_value. Previously, the function returned nullptr on error without freeing the memory allocated for a vector. This commit modifies Gcs_ip_allowlist_entry_hostname::get_value to properly delete the vector before returning nullptr in error cases.

Change-Id: Iab0076839cccb6c4af7feb3640788ee977f0e9ce
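As a hedged illustration of the pattern described above (hypothetical function name, not the actual Gcs_ip_allowlist_entry_hostname implementation): every error path must release the heap-allocated vector before returning nullptr; holding it in a std::unique_ptr makes early returns leak-free by construction.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical sketch of the leak pattern: the result vector is allocated on
// the heap up front, so every early "return nullptr" must free it first.
std::vector<std::string> *get_value_sketch(bool resolve_ok, bool convert_ok) {
  auto result = std::make_unique<std::vector<std::string>>();

  if (!resolve_ok) {
    // Before the fix, a plain "return nullptr" here leaked the vector.
    return nullptr;  // unique_ptr releases the allocation automatically
  }
  if (!convert_ok) {
    return nullptr;  // same: no leak on this error path
  }

  result->push_back("192.0.2.1");  // placeholder data
  return result.release();         // caller takes ownership only on success
}

int main() {
  std::unique_ptr<std::vector<std::string>> ok(get_value_sketch(true, true));
  std::unique_ptr<std::vector<std::string>> fail(get_value_sketch(false, true));
}
```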
Follow-up fix for a mismatched result due to file path separators. Change-Id: Iec5b01659e12840f30748beb81dbed2dd33471b6
Removing dead code. Contribution by Huaxiong Song. Change-Id: I2965e659b715d5f0b3f920a8a016429a33e07f25
The mysql_server_mock sometimes doesn't shut down when it receives a SIGTERM. Stacktraces show that the mock-server waits in: server_mock::MySQLServerMock::run () at mock_server/src/mysql_server_mock.cc:318 which is a condition variable that waits for all connections to be closed.

Background
==========
The hang appears as:
1. a connection is held open by the client
2. the SIGTERM leads to close_all_connections being called, which calls cancel() on each connection, aborting any waiting IO, expecting that this leads to a connection close on each connection.

If the connection is _not_ waiting on IO, it does NOT notice the request for shutdown and waits for the client to send something, while the client waits for a shutdown of the mock-server. A deadlock.

Change
======
- add an is_terminated_(true) flag to track that a connection shall be closed
- check is_terminated_ before entering an async_read()

Change-Id: Idc0cb32fb5b0d67a42ac30644f4a80abbd4a2117
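A minimal standalone sketch of the shutdown flag described above (simplified, hypothetical class; the real mock-server uses its own connection and IO classes): the terminate request is recorded in an atomic flag, and the connection checks it before arming another read, so a connection that is not currently waiting on IO still notices the shutdown.

```cpp
#include <atomic>
#include <functional>
#include <iostream>

// Hypothetical, simplified connection: names and structure are illustrative,
// not the mysql_server_mock implementation.
class Connection {
 public:
  void cancel() {
    is_terminated_.store(true);  // remember the shutdown request ...
    // ... and abort any IO the connection is currently waiting on here.
  }

  void async_read(std::function<void()> on_data) {
    // Checking the flag *before* arming the next read closes the race where
    // a connection that was not waiting on IO never saw the cancel().
    if (is_terminated_.load()) {
      close();
      return;
    }
    on_data();  // placeholder for the real asynchronous read
  }

  void close() { std::cout << "connection closed\n"; }

 private:
  std::atomic<bool> is_terminated_{false};
};

int main() {
  Connection c;
  c.cancel();           // SIGTERM path: request shutdown
  c.async_read([] {});  // next read attempt closes instead of waiting forever
}
```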
function json_extract

Description:
------------
When the source has a table featuring a stored generated column populated using a JSON function, and subsequent update or deletion operations are executed on the underlying base column, replication fails with the error message: "Invalid JSON text in argument 1 to function json_extract: 'The document is empty.'".

Analysis:
---------
Issue scenario: The source table includes:
- a stored generated column populated using a JSON function
- binlog_row_image=MINIMAL
Replication encounters a failure upon update/delete operations on the underlying base column.

Root cause:
- The replica attempts to re-evaluate the generated column while processing the before image (BI) of the update/delete.
- Re-evaluation of the generated column fails with the error "document is empty" while executing the JSON function, as the underlying base column is not passed as part of the BI.

Fix:
----
Do not re-evaluate stored generated columns when the base columns are unavailable.

Change-Id: I6e41c29a6d87024f59a645eb6e4596e731087c90
from Optimizer

During optimization, MySQL chooses an initial plan based on different parameters such as access cost, index usability, coverage, etc. During this phase, ORDER BY and LIMIT are not considered for plan creation and access choice. Later, during ORDER BY optimization, test_if_cheaper_ordering() chooses an index scan as a replacement for sorting to get the results in order. This change in the plan is not reflected in the cost data present in the plan. As a result, a simple query with ORDER BY and LIMIT will carry the cost of sorting the entire table just to fetch a few LIMIT rows. This cost value is compared against secondary_engine_cost_threshold to decide HeatWave offloading. This could lead to offloading of queries even when they would perform much better on the InnoDB engine.

Fix: Update the table access cost if an index is chosen in test_if_cheaper_ordering() to provide order, and update the query cost accordingly for single-table queries when a secondary engine is present.

Change-Id: Ica649c9db687ef1162964d865a2b5e4ec4c0d069
from Optimizer Post-push test fix for hypergraph optimizer. Change-Id: Ia0155170e54c4644471e4da1c86925d781617225
…during MySQL 8 upgrade

Summary
Upgrading a table CREATE TABLE t1 (c1 int GENERATED ALWAYS AS (0), c2 int) ENGINE=MyISAM; from 5.7 to 8.x would fail. This would only occur for a mix of regular columns and generated columns, and only when using MyISAM. The reason for this was that during that particular type of upgrade for use with the data dictionary, we would align NULL info one way, and the other way everywhere else. This was caught by our sanity checks.

Analysis
Using our above table as an example: We always use two loops to create columns; the first loop processes regular columns, while the second loop processes generated columns. Therefore, we'd process c2 (a natural column) before c1 (a generated column). In older code, we'd count the NULL position up in order of column creation, so that c2 (which is the second column in the CREATE statement, but the first column to be processed on account of being the first regular column) is processed first and gets the lower NULL position; c1 is processed second (on account of being a generated column and getting created in the second loop) and gets the higher NULL position. In short, we process all regular columns first, then all generated columns, and count up the NULL position the entire time.

In fill_columns_from_dd(), we processed columns in the same order (first all natural columns, then all generated columns), but would increase the NULL position even when no field was processed (e.g. when skipping a generated column in the first loop that only processes regular columns). Therefore, the NULL positions ended up wrong and tripped our sanity check.

This patch aligns the NULL ordering so that the positions generated in fill_columns_from_dd() are compatible with the ordering created by earlier servers and indeed with the behavior elsewhere in contemporary servers.

Change-Id: I7bd18e0822c1d86b9dd322a5d994200078ebf84c
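A minimal sketch of the counting rule described above (hypothetical structures, not the server's dd upgrade code): the NULL position is advanced only when a column is actually processed in the current pass, never when a column is merely skipped.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical column description for illustration only.
struct Column {
  std::string name;
  bool is_generated;
  int null_pos;
};

// Two passes, matching the creation order described above: regular columns
// first, then generated columns. The key point of the fix: null_pos is only
// incremented when a column is processed, not when it is skipped.
void assign_null_positions(std::vector<Column> &cols) {
  int null_pos = 0;
  for (auto &c : cols)
    if (!c.is_generated) c.null_pos = null_pos++;  // pass 1: regular columns
  for (auto &c : cols)
    if (c.is_generated) c.null_pos = null_pos++;   // pass 2: generated columns
}

int main() {
  // CREATE TABLE t1 (c1 int GENERATED ALWAYS AS (0), c2 int)
  std::vector<Column> cols = {{"c1", true, -1}, {"c2", false, -1}};
  assign_null_positions(cols);
  for (const auto &c : cols)
    std::cout << c.name << " -> NULL position " << c.null_pos << "\n";
  // c2 gets position 0, c1 gets position 1, matching the pre-8.0 layout.
}
```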
Issue: In the fix for bug#35183686, we started logging the fields whose column order changed, but while calculating the size needed in the redo log to log index information, these columns were missing from the calculation.

Fix: Make sure these columns are also considered while calculating the size needed to log the index entry.

Change-Id: Ic8752c72a8f5beddfc5739688068b9c32b02a700
Extend testing of the ndb_optimized_node_selection variable in MySQL Server, and fix a few issues.

The ndb_optimized_node_selection MTR test is extended to give some coverage of:
- Table types: Read Primary, Read Backup, Fully Replicated
- O_N_S variants: 0, 1, 2, 3

Surfaced issues:
- Reading and setting the optimized_node_selection option on (one of) the ndb_cluster_connection objects and/or on the session's Ndb object, with implications for thread safety, testability and behaviour
- Two round-robin unhinted transaction issues:
  - Proximity-ignoring round-robin algorithm logic error resulting in imbalance.
  - Proximity info used in some cases even when optimized_node_selection is disabled.

Changes:
- Avoid setting/clearing the ndb_cluster_connection optimized_node_selection value
- Rename the variable to indicate that it's a connection default
- Modify NdbApi level code to consistently use the Ndb session object's value for decisions
- Modify ha_ndbcluster code to consistently set the Ndb object's optimized_node_selection value prior to starting a transaction
- Fix round-robin bugs.

Change-Id: I7205837bc3672110490d23c2470a95a963237e3d
Unhinted transactions, or hintable transactions where hinting is disabled, will use a round-robin (RR) algorithm, balancing the transaction load across a set of candidate data nodes. The ndb_optimized_node_selection option controls which nodes are candidates:

ndb_optimized_node_selection & 1:
  0 : All data nodes
  1 : Local data nodes

Locality is defined or inferred by a number of NdbApi level concepts, such as:
  Config   : LocationDomain
  Api      : DataNodeNeighbour
  Internal : tryBind remote address->Group
  Config   : Group

In both cases there can be multiple candidates, so NdbApi attempts to spread the load across the candidates. However, the state memory required to do this is part of the Ndb (session) object, and is reset (to the same 'next' position) in every new Ndb object. In some use cases new Ndb objects are always used, such as when a SQL user connects, issues an unhinted query and then disconnects. In this case there is no RR balance, resulting in resource usage imbalances and limiting system scalability and capacity.

This is fixed so that in these cases the first transaction issued by a connection is also handled in a round-robin way, avoiding persistent imbalance. The MTR testcase ndb_optimized_node_selection is extended to cover this scenario.

Change-Id: Ia491b3bc4ae96e5e57a27baa375d2d84b3068d60
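The imbalance described above can be illustrated with a small standalone sketch (hypothetical types; the real fix lives in NdbApi): if each short-lived session starts its round-robin cursor at the same position, every one-shot connection picks the same node; seeding the per-session cursor from a shared, atomically advanced counter restores balance.

```cpp
#include <atomic>
#include <iostream>
#include <vector>

// Hypothetical sketch: a shared cursor on the "connection" advances for every
// new "session", so even sessions that issue a single unhinted transaction
// spread across the candidate data nodes instead of all picking the first one.
struct Connection {
  std::atomic<unsigned> next_start{0};
};

struct Session {
  explicit Session(Connection &c) : rr_cursor(c.next_start.fetch_add(1)) {}
  int pick_node(const std::vector<int> &candidates) {
    return candidates[rr_cursor++ % candidates.size()];
  }
  unsigned rr_cursor;
};

int main() {
  Connection conn;
  std::vector<int> nodes = {1, 2, 3};
  // Each iteration models "connect, run one unhinted query, disconnect".
  for (int i = 0; i < 6; ++i) {
    Session s(conn);
    std::cout << "transaction " << i << " -> node " << s.pick_node(nodes) << "\n";
  }
}
```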
Testcase ndb_read_balance aims to capture some of the behaviour of the different table distribution types (READ_PRIMARY, READ_BACKUP, FULLY_REPLICATED) in test output. However, it was noted that it was not stable. Some reasons for this:
- Read Backup + Fully Replicated tables may read a primary or backup fragment replica depending on:
  - the lock mode (long-locking reads must go to the primary)
  - whether the chosen TC is local to the primary replica or not
- Unhinted transactions are non-deterministic, other than that over some number of iterations they should exhibit round-robin behaviour over relatively 'local' data nodes
- Hinted transactions for read-backup and fully replicated tables are also liable to read primary or backup nodes depending on locality.

The test is modified to:
- show that for READ PRIMARY tables, the primary replicas are effectively always chosen
- show that for READ BACKUP + FULLY REPLICATED tables, the primary replicas are always chosen for locking reads
- show that for non-locking reads, a balance of PRIMARY + BACKUP replicas is chosen.

This modification should remove the test's dependencies on preceding transactions, endianness, etc.

Change-Id: I5f6e41ae888a0279f0b9b316e214e8643996410f
Change-Id: Ieb72bf9e2112184467652926d9536ef5bfbffb97
innodb_directories to datadir after ALTER TABLE

In the case of ALTER TABLE .. FORCE and ALTER TABLE .. ENGINE, tables are recreated. The default location for recreated tables is the default data dir. If the table was created with a DATA DIRECTORY clause, the table is recreated in the same source directory. Now, if the source table was moved to a different location and then the above ALTER statements are run, the table gets recreated in the default data dir. This does not seem correct, as the user would expect it to be recreated in the original directory. Furthermore, the ALTER TABLE statement does not allow a change in data directory, so it is important for the table to be recreated at the same location.

The solution is to update the data directory flag for the corresponding dd_table during restart after the ibd file was just moved to a new location, get the current path of that ibd, and reuse this path during recreation.

Also, for those ibd files which were moved during previous versions of the server, we recognize them as moved as well by checking the following conditions in fil_tablespace_path_equals():
1. The file must have the .ibd extension
2. old_path and new_path must point to the same location (because in the previous version, when the file is just moved, old_path is already updated to the new location)
3. new_path must be different from the default data directory
4. dd_table should already exist for this tablespace
5. The dd_table data directory flag is not set (because in previous versions, before this fix, the dd_table data directory flag is not updated for those ibd files that are moved. This check is added because for those tables that are created with a DATA DIRECTORY clause, this flag is true. We want to eliminate such cases and be sure that the ibd file was moved)

For such ibd files which satisfy the above conditions, we recognize them as moved and populate their tablespace information separately in m_moved_previously_before_90000, which is later used to update dd_table. Note that the variable is named with version 9.0.0 as this is needed for data directories before 9.0.0. We can remove this code in 10.0.0.

We added a common method dd_update_table_and_partitions_after_dir_change() which updates the dd_table data directory flag for moved ibd files in the above cases.

Change-Id: I5d4fc2e1f7d23f6a8a9c152449d9d5ecf2344833
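The five conditions listed above can be summarized in a small hedged sketch (hypothetical signature and fields; the real check inspects fil/dd metadata inside fil_tablespace_path_equals() rather than plain strings and booleans):

```cpp
#include <iostream>
#include <string>

// Hypothetical inputs for illustration only.
struct MovedCheckInput {
  std::string file_path;        // path recorded for the tablespace file
  std::string old_path;         // path stored before the restart
  std::string new_path;         // path discovered during the restart scan
  std::string default_datadir;  // server's default data directory
  bool dd_table_exists;         // dd_table already exists for the tablespace
  bool dd_data_dir_flag;        // DATA DIRECTORY flag already set on dd_table
};

static bool has_ibd_suffix(const std::string &p) {
  return p.size() >= 4 && p.compare(p.size() - 4, 4, ".ibd") == 0;
}

// Returns true when the file should be treated as moved before 9.0.0,
// mirroring the five conditions described above.
bool moved_before_90000(const MovedCheckInput &in) {
  return has_ibd_suffix(in.file_path)          // 1. .ibd extension
         && in.old_path == in.new_path         // 2. old and new path agree
         && in.new_path != in.default_datadir  // 3. not under default datadir
         && in.dd_table_exists                 // 4. dd_table already present
         && !in.dd_data_dir_flag;              // 5. flag not set yet
}

int main() {
  MovedCheckInput in{"/new/dir/t1.ibd", "/new/dir/t1.ibd", "/new/dir/t1.ibd",
                     "/var/lib/mysql/", true, false};
  std::cout << (moved_before_90000(in) ? "treated as moved" : "not moved") << "\n";
}
```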
…N_SIZE_INLINE' Limit the scale used when computing averages of decimal numbers. Simplified patch backported. Change-Id: If32143ad7ed55f841e3973e78fa8e98cc245e775
Bug#36347647 Contribution by Tencent: Resiliency issue in fts_sync_commit

Symptoms: During various operations on tables containing FTS indexes, the state of FTS as committed to the database may become inconsistent, affecting the following scenarios:
- the server terminates when synchronizing the FTS cache,
- synchronization of the FTS cache occurs concurrently with another FTS operation.
This inconsistency may lead to various negative effects, including incorrect query results. An example operation which forces the synchronization of the FTS cache is OPTIMIZE TABLE with innodb_optimize_fulltext_only set to ON.

Root cause: The functions 'fts_cmp_set_sync_doc_id' and 'fts_sql_commit' use different trx_t objects in function 'fts_sync_commit'. This causes a scenario where 'synced_doc_id' in the config table is already committed, but the remaining FTS data isn't yet, leading to issues in the scenarios described above - the server terminating between the commits, or concurrent access observing the intermediate state.

Fix: When 'fts_cmp_set_sync_doc_id' is called from 'fts_sync_commit', it will use the transaction provided by the caller.

Patch based on contribution by Tencent.

Change-Id: I65fa5702db5e7b6b2004a7311a6b0aa97449034f
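A hedged sketch of the single-transaction rule described in the fix (hypothetical types; the real code passes an InnoDB trx_t into fts_cmp_set_sync_doc_id): both the synced_doc_id update and the rest of the FTS data must be written under the caller's transaction so they commit, or roll back, together.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical transaction object used only to illustrate a single commit
// scope; this is not InnoDB's trx_t.
struct Txn {
  std::vector<std::string> pending;
  void write(const std::string &what) { pending.push_back(what); }
  void commit() {
    for (const auto &w : pending) std::cout << "committed: " << w << "\n";
    pending.clear();
  }
};

// Before the fix, this step effectively used its own transaction and could
// commit ahead of the rest of the sync; after the fix it writes through the
// caller-provided transaction.
void set_sync_doc_id(Txn &trx, long doc_id) {
  trx.write("config.synced_doc_id = " + std::to_string(doc_id));
}

void fts_sync_commit_sketch(long doc_id) {
  Txn trx;                       // one transaction for the whole sync
  set_sync_doc_id(trx, doc_id);  // uses the caller's transaction
  trx.write("FTS cache data");   // remaining FTS data in the same transaction
  trx.commit();                  // all or nothing
}

int main() { fts_sync_commit_sketch(42); }
```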
Bug#36347647 Contribution by Tencent: Resiliency issue in fts_sync_commit Post-push fix: Add doxygen documentation to function parameter. Simplified patch backported. Change-Id: Ie3dbf78b4e84ae22c85d1ce873b58ee926566ff8
Change-Id: Ie99c337b7d407db70bf1480df99f5116737556e1
Change-Id: I86408ecad4a991e217b8a07be62cbba3907cd505
Change-Id: Ic57173150c9c9d9069882f09d8c07152e5752475
Change-Id: Ib9dafeb2c373496f2ff1e86b81df3fea7923a26b
ON SHUTDOWN

Note: This commit in 8.0 is different from 8.4 and trunk due to the testcase.

Background:
-----------
The trx_sys->mysql_trx_list maintains a list of trx_t objects created for user transactions. When the `USE <database>` statement is run, a trx_t object is created for the THD running this query and added to this list. When the connection is closed, the trx_t object of the THD is freed and removed from this list. During shutdown, all the connections are closed before trx_sys is closed. When closing trx_sys, it is expected that all trx_t objects in trx_sys->mysql_trx_list are freed and the list is empty.

In replication, when the applier sees an `XA START`, it detaches the THD from the handlertons by storing a backup in the engine's ha_data. In the case of InnoDB, the trx_t created for the THD is stored in ha_data and is set to nullptr, but not removed from the mysql_trx_list. In replication, the detached engine ha_data is reattached after the XA transaction is processed. During the reattach, the trx_t is restored from the ha_data back to the THD.

By design, the detach happens when `XA START` is encountered by the applier. When the applier sees an `XA PREPARE`, the reattach is done via `applier_reset_xa_trans` calling `attach_native_trx`. When the applier sees an `XA COMMIT .. ONE PHASE`, the reattach is done in `ha_commit_low`. If the server is shut down, the applier performs the reattach in `ha_rollback_low`. The binlog applier can be simulated by enabling `pseudo_replica_mode` and running a `BINLOG '0'` statement.

Issue:
------
When the XA transaction is empty, the ha_list is nullptr. The reattach logic is performed only when ha_list is non-empty. Hence, if the server is shut down after the applier detaches the engine (i.e., after running an `XA START`), the applier would never reattach the ha_data, which leads to the assert when InnoDB closes trx_sys. Due to the design, the problem is seen only with the reattach in `ha_commit_low` and `ha_rollback_low`.

Fix:
----
Ensure that the reattach is done even if ha_list is empty.

Change-Id: Ida617d067d112914c9d55ef4cde9f91956d19a08
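The fix can be illustrated with a small sketch (hypothetical names; the real logic lives in the reattach paths of ha_commit_low()/ha_rollback_low()): the reattach of the detached engine data must not be gated on ha_list being non-empty, otherwise an empty XA transaction leaves the trx_t parked in ha_data forever.

```cpp
#include <iostream>
#include <vector>

// Hypothetical stand-ins for the THD's engine list and detached ha_data.
struct EngineData {
  bool detached = true;
};

void attach_native_trx(EngineData &d) {
  d.detached = false;
  std::cout << "engine ha_data reattached\n";
}

// Sketch of the corrected reattach: previously the body only ran when
// ha_list was non-empty, so an empty XA transaction never reattached.
void reattach_on_commit_or_rollback(const std::vector<int> &ha_list,
                                    EngineData &detached_data) {
  if (!ha_list.empty()) {
    // normal path: per-engine processing happens here
  }
  // Fix: reattach unconditionally, even when ha_list is empty.
  if (detached_data.detached) attach_native_trx(detached_data);
}

int main() {
  EngineData d;
  reattach_on_commit_or_rollback({}, d);  // empty XA transaction at shutdown
}
```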
Follow-up to disable the ndb_optimized_node_selection test when running with the query cache, as the cache avoids communication with Ndb altogether. Approved by: Magnus Blaudd <magnus.blaudd@oracle.com> Change-Id: Ia9f2dee674f88e15a0b19da88fcc049f29311cb0
Change-Id: I1713d7edb3fa0bd64f47847dcd65e88c4f33091d
Post-push fix for memory leak. Change-Id: I034bb20c71dfe3ae467e762b2dd7c7f95fb0679b
Approved by: Erlend Dahl <erlend.dahl@oracle.com>
Approved by: Erlend Dahl <erlend.dahl@oracle.com>
Approved by: Erlend Dahl <erlend.dahl@oracle.com>
…is repeated

Follow-up patch. Build breakage on OpenSUSE 15 and Ubuntu 20.04 when compiling with GCC 9:

AsyncFile.cpp:1172:52: error: use of deleted function 'AsyncFile::odirect_set_log_state::odirect_set_log_state()'
 1172 | AsyncFile::odirect_set_log_bp[FsOpenReq::BP_MAX];
      |                                                  ^
In file included from AsyncFile.cpp:32:
AsyncFile.hpp:161:10: note: 'AsyncFile::odirect_set_log_state::odirect_set_log_state()' is implicitly deleted because the default definition would be ill-formed:
  161 | struct odirect_set_log_state {
      |        ^~~~~~~~~~~~~~~~~~~~~
AsyncFile.hpp:161:10: error: use of deleted function 'std::atomic<_Tp>::atomic() [with _Tp = NDB_TICKS]'
In file included from AsyncFile.hpp:29,
                 from AsyncFile.cpp:32:
/usr/include/c++/9/atomic:198:7: note: 'std::atomic<_Tp>::atomic() noexcept [with _Tp = NDB_TICKS]' is implicitly deleted because its exception-specification does not match the implicit exception-specification ''
  198 |       atomic() noexcept = default;
      |       ^~~~~~

Change-Id: I111c041825f193dec145bec00dd6428c554e6785
(cherry picked from commit 5e99e2ddd92874fcba6b6bcd8f4ddc6254baa1d9)
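The diagnostic above is GCC 9 rejecting std::atomic<T> when T's default constructor is not noexcept: atomic's defaulted constructor is declared noexcept, and if that does not match the implicit exception specification computed from T, the constructor is deleted. A minimal reproduction and one typical remedy (purely illustrative names, not the actual NDB_TICKS/AsyncFile change) look like this:

```cpp
#include <atomic>

// Ticks models a type whose user-provided default constructor is not
// declared noexcept (illustrative only).
struct TicksBad {
  TicksBad() {}  // potentially-throwing -> std::atomic<TicksBad>'s defaulted
  long value;    // constructor is implicitly deleted under GCC 9
};

struct TicksGood {
  TicksGood() noexcept {}  // matching noexcept restores the defaulted
  long value;              // std::atomic<TicksGood>::atomic()
};

struct LogStateBad {
  // std::atomic<TicksBad> last;  // would make LogStateBad not default-constructible
  std::atomic<long> dummy{0};
};

struct LogStateGood {
  std::atomic<TicksGood> last{};  // fine: the default constructor is usable again
  std::atomic<long> count{0};
};

int main() {
  LogStateGood states[4];  // an array of the state struct, as in the failing code
  (void)states;
}
```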
When sending to another node and not being able to send all the data, it is important to make sure the remainder gets sent anyway. Currently this relies on the wakeup timer in the send thread. However, this might delay the next send on a ready transporter by as much as 1 millisecond, and possibly even 10 milliseconds, depending on how the OS handles timer interrupts. A better approach is to immediately wake up the responsible send thread and have this thread take care of the sending, thus ensuring the data is sent as soon as possible by the send thread.
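A minimal standalone sketch of the idea (hypothetical class, not the submitted NDB patch): instead of waiting for the periodic wakeup timer, the thread that could not send everything immediately signals the responsible send thread through a condition variable.

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

// Hypothetical send-thread sketch: a pending-send flag plus a condition
// variable replaces "wait for the next timer tick" with an immediate wakeup.
class SendThread {
 public:
  void wakeup() {  // called when a partial send leaves data behind
    {
      std::lock_guard<std::mutex> lk(m_);
      pending_ = true;
    }
    cv_.notify_one();  // wake immediately, no 1-10 ms timer delay
  }

  void run_once() {  // body of the send thread's wait loop
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [this] { return pending_; });
    pending_ = false;
    std::cout << "send thread: flushing remaining data\n";
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  bool pending_ = false;
};

int main() {
  SendThread st;
  std::thread t([&] { st.run_once(); });
  st.wakeup();  // partial send -> immediate wakeup of the send thread
  t.join();
}
```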
The patch is against MySQL 8.0; it became mysql-trunk by mistake.
Hi, thank you for your contribution. Please confirm this code is submitted under the terms of the OCA (Oracle's Contribution Agreement) you have previously signed by cutting and pasting the following text as a comment:
"I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it."
Hi, thank you for your contribution. Your code has been assigned to an internal queue. Please follow