Skip to content

Commit 65a9cf1

Browse files
author
Tobias Christiani
committed
Bug#36186180 Updating histograms on offloaded tables can deadlock
When running ANALYZE TABLE UPDATE HISTOGRAM on a table that has been loaded into the secondary engine the UPDATE HISTOGRAM command updates both the primary and secondary table share with a snapshot of the histograms. If, while histograms are being updated, another thread runs a query that is offloaded to the secondary engine and this triggers the creation of the secondary table share, then we can end up with a deadlock where: 1) the histogram update thread is waiting on the secondary share to be constructed before it can release histogram MDL, and 2) the offloaded query is waiting for histogram MDL to be relased before it can finish constructing the secondary share because part of the work involved in constructing the table share involves retrieving the histograms on the table from the dictionary. This problem does not happen for tables that exist only in the primary engine since we lock and open such tables before acquiring MDL on the histograms. However, for tables that also exist in the secondary engine we attempt to retrieve the secondary share at the point where we update the primary share, so after having acquired MDL on the histograms, and this creates the possibility of a deadlock. This patch removes histograms from the secondary engine table share altogether, thus avoiding the need the need to update both the primary and secondary share when updating histograms. When optimizing an offloaded query we are still able to use the histograms on the primary table share. This saves memory by avoiding keeping additional copies of the histograms on the secondary table share and avoids the type of lock order violation causing the bug. Change-Id: I36c22d217837a5c6e3e21151c9d1b353cdd86f77
1 parent 8277d62 commit 65a9cf1

File tree

11 files changed

+80
-82
lines changed

11 files changed

+80
-82
lines changed
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
#
2+
# Bug#36186180 Updating histograms on offloaded tables can deadlock
3+
#
4+
CREATE TABLE t1(x INT) SECONDARY_ENGINE MOCK;
5+
INSERT INTO t1 VALUES (1), (2), (3);
6+
ALTER TABLE t1 SECONDARY_LOAD;
7+
connect con1,localhost,root;
8+
SET DEBUG_SYNC="histogram_update_mdl_acquired SIGNAL histogram_update_ongoing WAIT_FOR secondary_engine_share_open_in_progress";
9+
ANALYZE TABLE t1 UPDATE HISTOGRAM ON x;
10+
connection default;
11+
SET DEBUG_SYNC="now WAIT_FOR histogram_update_ongoing";
12+
SET DEBUG_SYNC="table_share_open_in_progress SIGNAL secondary_engine_share_open_in_progress";
13+
SELECT * FROM t1;
14+
x
15+
connection con1;
16+
Table Op Msg_type Msg_text
17+
test.t1 histogram status Histogram statistics created for column 'x'.
18+
DROP TABLE t1;
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
--source include/have_debug_sync.inc
2+
--disable_query_log
3+
eval INSTALL PLUGIN mock SONAME '$MOCK_PLUGIN';
4+
--enable_query_log
5+
6+
--echo #
7+
--echo # Bug#36186180 Updating histograms on offloaded tables can deadlock
8+
--echo #
9+
10+
CREATE TABLE t1(x INT) SECONDARY_ENGINE MOCK;
11+
INSERT INTO t1 VALUES (1), (2), (3);
12+
ALTER TABLE t1 SECONDARY_LOAD;
13+
14+
--enable_connect_log
15+
--connect (con1,localhost,root)
16+
SET DEBUG_SYNC="histogram_update_mdl_acquired SIGNAL histogram_update_ongoing WAIT_FOR secondary_engine_share_open_in_progress";
17+
--send ANALYZE TABLE t1 UPDATE HISTOGRAM ON x
18+
19+
--connection default
20+
SET DEBUG_SYNC="now WAIT_FOR histogram_update_ongoing";
21+
SET DEBUG_SYNC="table_share_open_in_progress SIGNAL secondary_engine_share_open_in_progress";
22+
# Without the fix this query will deadlock because it gets stuck waiting for UPDATE HISTOGRAM
23+
# to release histogram MDL while opening the secondary share, and at the same time the histogram update
24+
# is stuck waiting for the secondary share to be opened.
25+
SELECT * FROM t1;
26+
27+
--connection con1
28+
--reap;
29+
30+
DROP TABLE t1;
31+
32+
--disable_query_log
33+
UNINSTALL PLUGIN mock;
34+
--enable_query_log

sql/dd_table_share.cc

+1-1
Original file line numberDiff line numberDiff line change
@@ -274,7 +274,7 @@ static bool prepare_share(THD *thd, TABLE_SHARE *share,
274274

275275
// Setup other fields =====================================================
276276

277-
if (share->tmp_table == NO_TMP_TABLE) {
277+
if (share->tmp_table == NO_TMP_TABLE && share->is_primary_engine()) {
278278
share->m_histograms = new (&share->mem_root) Table_histograms_collection();
279279
if (share->m_histograms == nullptr) return true; // OOM.
280280
}

sql/handler.h

+1
Original file line numberDiff line numberDiff line change
@@ -5147,6 +5147,7 @@ class handler {
51475147
table_share = share;
51485148
}
51495149
const TABLE_SHARE *get_table_share() const { return table_share; }
5150+
const TABLE *get_table() const { return table; }
51505151

51515152
/* Estimates calculation */
51525153

sql/histograms/histogram.cc

+7-69
Original file line numberDiff line numberDiff line change
@@ -1486,53 +1486,15 @@ bool update_share_histograms(THD *thd, Table_ref *table) {
14861486
MDL_key::TABLE, table->db, table->table_name, MDL_SHARED_READ));
14871487
assert(table->table != nullptr);
14881488

1489-
// If the table has a shadow copy in a secondary engine we must retrieve the
1490-
// TABLE_SHARE for the secondary engine as well.
14911489
TABLE_SHARE *share = table->table->s;
1492-
TABLE_SHARE *secondary_share = nullptr;
1493-
if (share->has_secondary_engine()) {
1494-
mysql_mutex_lock(&LOCK_open);
1495-
std::string secondary_key =
1496-
create_table_def_key_secondary(table->db, table->table_name);
1497-
secondary_share = get_table_share(
1498-
thd, table->db, table->table_name, secondary_key.c_str(),
1499-
secondary_key.length(), /*open_view=*/false, /*open_secondary=*/true);
1500-
mysql_mutex_unlock(&LOCK_open);
1501-
}
1502-
if (share->has_secondary_engine() && secondary_share == nullptr) return true;
1503-
1504-
auto share_guard = create_scope_guard([secondary_share]() {
1505-
if (secondary_share != nullptr) {
1506-
mysql_mutex_lock(&LOCK_open);
1507-
release_table_share(secondary_share);
1508-
mysql_mutex_unlock(&LOCK_open);
1509-
}
1510-
});
1511-
1512-
// Create Table_histograms objects for the primary and secondary share (if it
1513-
// exists) together with scope guards to clean up in case of failure.
15141490
Table_histograms *table_histograms =
15151491
Table_histograms::create(key_memory_table_share);
15161492
if (table_histograms == nullptr) return true;
15171493
auto table_histograms_guard =
15181494
create_scope_guard([table_histograms]() { table_histograms->destroy(); });
15191495

1520-
Table_histograms *table_histograms_secondary = nullptr;
1521-
if (secondary_share != nullptr) {
1522-
table_histograms_secondary =
1523-
Table_histograms::create(key_memory_table_share);
1524-
if (table_histograms_secondary == nullptr) return true;
1525-
}
1526-
1527-
auto table_histograms_secondary_guard =
1528-
create_scope_guard([table_histograms_secondary]() {
1529-
if (table_histograms_secondary != nullptr) {
1530-
table_histograms_secondary->destroy();
1531-
}
1532-
});
1533-
15341496
// Retrieve histograms from the data dictionary and add them to the
1535-
// TABLE_SHARE.
1497+
// set of table_histograms that is to be inserted into the TABLE_SHARE.
15361498
for (size_t i = 0; i < share->fields; ++i) {
15371499
const Field *field = share->field[i];
15381500
if (field->is_hidden_by_system()) continue;
@@ -1543,44 +1505,20 @@ bool update_share_histograms(THD *thd, Table_ref *table) {
15431505
return true;
15441506
}
15451507

1546-
if (histogram != nullptr) {
1547-
if (table_histograms->insert_histogram(field->field_index(), histogram)) {
1548-
return true;
1549-
}
1550-
if (table_histograms_secondary &&
1551-
table_histograms_secondary->insert_histogram(field->field_index(),
1552-
histogram)) {
1553-
return true;
1554-
}
1508+
if (histogram != nullptr &&
1509+
table_histograms->insert_histogram(field->field_index(), histogram)) {
1510+
return true;
15551511
}
15561512
}
15571513

1558-
// Disable the scope guard that would release the secondary share and attempt
1559-
// to insert the new histogram snapshots and release the secondary share if it
1560-
// was acquired. Since acquiring/releasing shares and modifying the collection
1561-
// of histograms on the share is protected by LOCK_open we attempt to reduce
1562-
// the number of lock/unlock pairs by grouping these operations together.
1563-
share_guard.commit();
1564-
1565-
bool error = false;
15661514
mysql_mutex_lock(&LOCK_open);
1567-
if (share->m_histograms->insert(table_histograms)) {
1568-
error = true;
1569-
} else {
1515+
const bool error = share->m_histograms->insert(table_histograms);
1516+
mysql_mutex_unlock(&LOCK_open);
1517+
if (!error) {
15701518
// If the insertion succeeded ownership responsibility was passed on, so we
15711519
// can disable the scope guard that would free the Table_histograms object.
15721520
table_histograms_guard.commit();
15731521
}
1574-
1575-
if (secondary_share != nullptr) {
1576-
if (secondary_share->m_histograms->insert(table_histograms_secondary)) {
1577-
error = true;
1578-
} else {
1579-
table_histograms_secondary_guard.commit();
1580-
}
1581-
release_table_share(secondary_share);
1582-
}
1583-
mysql_mutex_unlock(&LOCK_open);
15841522
return error;
15851523
}
15861524

sql/histograms/histogram.h

+3-7
Original file line numberDiff line numberDiff line change
@@ -805,20 +805,16 @@ bool auto_update_table_histograms(THD *thd, Table_ref *table);
805805
/**
806806
Retrieve an updated snapshot of the histograms on a table directly from the
807807
dictionary (in an inefficient manner, querying all columns) and inserts this
808-
snapshot in the Table_histograms_collection on the TABLE_SHARE. If the table
809-
has a secondary engine we also insert a new snapshot on the secondary share.
808+
snapshot in the Table_histograms_collection on the TABLE_SHARE.
810809
811810
@param thd The current thread.
812811
@param table The table to retrieve updated histograms for.
813812
814813
@note This function assumes that the table is opened and generally depends on
815814
the surrounding context. It also locks/unlocks LOCK_OPEN.
816815
817-
@return False on success. Returns true if an error occurred in which case it
818-
can have happened that none of the shares were updated, or that only one of
819-
the shares (primary and secondary) were updated, even though we intended to
820-
update both. In other words if this function returns true we do not know to
821-
what extent the share(s) reflect the dictionary state.
816+
@return False on success. Returns true if an error occurred in which case the
817+
TABLE_SHARE will not have been updated.
822818
*/
823819
bool update_share_histograms(THD *thd, Table_ref *table);
824820

sql/histograms/table_histograms.h

+1-1
Original file line numberDiff line numberDiff line change
@@ -100,7 +100,7 @@ object, because the appropriate protection should already be in place. For
100100
example, for the insert() in sql_base.cc:get_table_share() we do not use mutex
101101
protection since we are in the process of constructing the TABLE_SHARE.
102102
103-
-- insert() in sql_admin.cc:update_share_histograms(): protected by LOCK_open.
103+
-- insert() in histogram.cc:update_share_histograms(): protected by LOCK_open.
104104
105105
-- acquire() in table.cc:open_table_from_share(): protected by LOCK_open.
106106

sql/sql_admin.cc

+1
Original file line numberDiff line numberDiff line change
@@ -1785,6 +1785,7 @@ bool Sql_cmd_analyze_table::handle_histogram_command_inner(
17851785
});
17861786

17871787
if (open_table_and_lock_histograms(thd, table, results)) return true;
1788+
DEBUG_SYNC(thd, "histogram_update_mdl_acquired");
17881789

17891790
// UPDATE/DROP histograms. Commit on success. Rollback on error.
17901791
switch (get_histogram_command()) {

sql/sql_base.cc

+3-2
Original file line numberDiff line numberDiff line change
@@ -830,6 +830,7 @@ TABLE_SHARE *get_table_share(THD *thd, const char *db, const char *table_name,
830830
*/
831831
share->increment_ref_count(); // Mark in use
832832
share->m_open_in_progress = true; // Mark being opened
833+
DEBUG_SYNC(thd, "table_share_open_in_progress");
833834

834835
/*
835836
Temporarily release LOCK_open before opening the table definition,
@@ -891,8 +892,8 @@ TABLE_SHARE *get_table_share(THD *thd, const char *db, const char *table_name,
891892
/*
892893
Read any existing histogram statistics from the data dictionary and
893894
store a copy of them in the TABLE_SHARE. We only perform this step for
894-
non-temporary tables, since temporary tables have share->m_histograms
895-
set to nullptr.
895+
non-temporary and primary engine tables. When these conditions are not
896+
met m_histograms is nullptr.
896897

897898
We need to do this outside the protection of LOCK_open, since the data
898899
dictionary might have to open tables in order to read histogram data

sql/table.cc

+5-2
Original file line numberDiff line numberDiff line change
@@ -7850,8 +7850,11 @@ void TABLE::disable_logical_diffs_for_current_row(const Field *field) const {
78507850
}
78517851

78527852
const histograms::Histogram *TABLE::find_histogram(uint field_index) const {
7853-
if (histograms == nullptr) return nullptr;
7854-
return histograms->find_histogram(field_index);
7853+
const handler *primary = get_primary_handler();
7854+
if (primary == nullptr) return nullptr;
7855+
const TABLE *table = primary->get_table();
7856+
if (table == nullptr || table->histograms == nullptr) return nullptr;
7857+
return table->histograms->find_histogram(field_index);
78557858
}
78567859

78577860
//////////////////////////////////////////////////////////////////////////

sql/table.h

+6
Original file line numberDiff line numberDiff line change
@@ -2469,6 +2469,12 @@ struct TABLE {
24692469
/**
24702470
Find the histogram for the given field index.
24712471

2472+
@note If this is called on a TABLE object that belongs to a secondary
2473+
engine, it will take a round-trip through the handler in order to obtain the
2474+
histogram from the TABLE object associated with the primary engine. This is
2475+
done to avoid storing histograms on both the primary and secondary
2476+
TABLE_SHARE.
2477+
24722478
@param field_index The index of the field we want to find a histogram for.
24732479

24742480
@retval nullptr if no histogram is found.

0 commit comments

Comments
 (0)