WL#14936 - Automatic Table Reload from InnoDB after MySQL Restart
-- Patch #1: Persist secondary load information --

Problem:
We need a way of knowing which tables were loaded to HeatWave after
MySQL restarts due to a crash or a planned shutdown.

Solution:
Add a new "secondary_load" flag to the `options` column of mysql.tables.
This flag is toggled after a successful secondary load or unload. The
information about this flag is also reflected in
INFORMATION_SCHEMA.TABLES.CREATE_OPTIONS.
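
The lifecycle of the flag can be modelled outside the server. The
following is a minimal standalone sketch, not the server code: the names
TableOptions, on_secondary_load_or_unload and render_create_options are
hypothetical, and a std::map stands in for the `options` column of
mysql.tables. Storing the boolean as "1"/"0" is an assumption of the
model.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the `options` column of mysql.tables.
using TableOptions = std::map<std::string, std::string>;

// Record the outcome of a successful secondary load (is_load == true) or
// unload (is_load == false). Booleans are stored as "1"/"0" in this model.
void on_secondary_load_or_unload(TableOptions &options, bool is_load) {
  // The flag is only meaningful for tables that have a secondary engine.
  assert(options.count("secondary_engine") == 1);
  options["secondary_load"] = is_load ? "1" : "0";
}

// Render the two options this sketch knows about in the style of
// INFORMATION_SCHEMA.TABLES.CREATE_OPTIONS. Empty values are skipped.
std::string render_create_options(const TableOptions &options) {
  std::string out;
  auto append = [&](const char *key, const char *label) {
    auto it = options.find(key);
    if (it == options.end() || it->second.empty()) return;
    if (!out.empty()) out += ' ';
    out += label;
    out += "=\"" + it->second + "\"";
  };
  append("secondary_engine", "SECONDARY_ENGINE");
  append("secondary_load", "SECONDARY_LOAD");
  return out;
}
```

In this model a table created with a secondary engine starts with the
flag at "0", and only a successful load flips it to "1".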

-- Patch #2 --

The second patch in this worklog triggers the table reload from InnoDB
after MySQL restart.

The recovery framework recognizes that the system restarted by checking
whether tables are present in the Global State. If there are no tables
present, the framework will access the Data Dictionary and find which
tables were loaded before the restart.

This patch introduces the "Data Dictionary Worker" - a MySQL service
recovery worker whose task is to query the INFORMATION_SCHEMA.TABLES
table from a separate thread and find all tables whose secondary_load
flag is set to 1.

All tables that were found in the Data Dictionary will be appended to
the list of tables that have to be reloaded by the framework from
InnoDB.
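
The selection step can be sketched as follows. This is an illustration
under stated assumptions, not the worker's actual code: DdRow,
is_marked_loaded and append_tables_to_reload are hypothetical names, and
matching on the CREATE_OPTIONS text is only one way the flag could be
detected.

```cpp
#include <string>
#include <utility>
#include <vector>

// One row of a hypothetical result set: (table name, CREATE_OPTIONS text).
using DdRow = std::pair<std::string, std::string>;

// True if the CREATE_OPTIONS text carries SECONDARY_LOAD="1".
bool is_marked_loaded(const std::string &create_options) {
  return create_options.find("SECONDARY_LOAD=\"1\"") != std::string::npos;
}

// Append every table marked as loaded to the list of tables to reload.
// Rows without the flag, or with the flag at "0", are skipped.
void append_tables_to_reload(const std::vector<DdRow> &rows,
                             std::vector<std::string> &reload_list) {
  for (const auto &row : rows) {
    if (is_marked_loaded(row.second)) reload_list.push_back(row.first);
  }
}
```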

If an error occurs during restart recovery, we do not mark the recovery
as failed, because the failures that can occur when tables are reloaded
after a restart are less critical than those in previously existing
recovery situations. Additionally, this code will soon have to be
adapted for the next worklog in this area, so we are proceeding with the
simplest solution that makes sense.

A Global Context variable m_globalStateEmpty is added which indicates
whether the Global State should be recovered from an external source.

-- Patch #3 --

This patch adds the "rapid_reload_on_restart" system variable. This
variable is used to control whether tables should be reloaded after a
restart of mysqld or the HeatWave plugin. This variable is persistable
(i.e., SET PERSIST RAPID_RELOAD_ON_RESTART = TRUE/FALSE).

The default value of this variable is set to false.

The variable can be modified in OFF, IDLE, and SUSPENDED states.

-- Patch #4 --

This patch refactors the recovery code by removing all recovery-related
code from ha_rpd.cc and moving it to separate files:

  - ha_rpd_session_factory.h/cc:
  These files contain the MySQLAdminSessionFactory class, which is used
to create admin sessions in separate threads that can be used to issue
SQL queries.

  - ha_rpd_recovery.h/cc:
  These files contain the MySQLServiceRecoveryWorker,
MySQLServiceRecoveryJob and ObjectStoreRecoveryJob classes which were
previously defined in ha_rpd.cc. This file also contains a function that
creates the RecoveryWorkerFactory object. This object is passed to the
constructor of the Recovery Framework and is used to communicate with
the other section of the code located in rpdrecoveryfwk.h/cc.

This patch also renames rpdrecvryfwk to rpdrecoveryfwk for better
readability.

The include relationship between the files is shown in the following
diagram:

        rpdrecoveryfwk.h◄──────────────rpdrecoveryfwk.cc
            ▲    ▲
            │    │
            │    │
            │    └──────────────────────────┐
            │                               │
        ha_rpd_recovery.h◄─────────────ha_rpd_recovery.cc──┐
            ▲                               │           │
            │                               │           │
            │                               │           │
            │                               ▼           │
        ha_rpd.cc───────────────────────►ha_rpd.h       │
                                            ▲           │
                                            │           │
            ┌───────────────────────────────┘           │
            │                                           ▼
    ha_rpd_session_factory.cc──────►ha_rpd_session_factory.h

Other changes:
  - In agreement with Control Plane, the external Global State is now
  invalidated during recovery framework startup if:
    1) Recovery framework recognizes that it should load the Global
    State from an external source AND,
    2) rapid_reload_on_restart is set to OFF.

  - Addressed review comments for Patch #3, rapid_reload_on_restart is
  now also settable while plugin is ON.

  - Provide a single entry point for processing external Global State
  before starting the recovery framework loop.

  - Change when the Data Dictionary is read. Now we will no longer wait
  for the HeatWave nodes to connect before querying the Data Dictionary.
  We will query it when the recovery framework starts, before accepting
  any actions in the recovery loop.

  - Change the reload flow by inserting fake global state entries for
  tables that need to be reloaded instead of manually adding them to a
  list of tables scheduled for reload. This method will be used for the
  next phase where we will recover from Object Storage so both recovery
  methods will now follow the same flow.

  - Update secondary_load_dd_flag added in Patch #1.

  - Increase timeout in wait_for_server_bootup to 300s to account for
  long MySQL version upgrades.

  - Add reload_on_restart and reload_on_restart_dbg tests to the rapid
  suite.

  - Add PLUGIN_VAR_PERSIST_AS_READ_ONLY flag to "rapid_net_orma_port"
  and "rapid_reload_on_restart" definitions, enabling their
  initialization from persisted values along with "rapid_bootstrap" when
  it is persisted as ON.

  - Fix numerous clang-tidy warnings in recovery code.

  - Prevent the suspended_basic and secondary_load_dd_flag tests from
  running on ASAN builds due to an existing issue when reinstalling the
  RAPID plugin.
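
The invalidation rule in the first item above reduces to a small
decision function. The sketch below is an assumption-laden model:
ExternalStateAction and decide_external_state_action are hypothetical
names, the first parameter mirrors the role of m_globalStateEmpty, and
the kReload branch is inferred from the description of Patch #3 rather
than stated in the rule itself.

```cpp
// What to do with the external Global State at recovery framework startup.
enum class ExternalStateAction { kNone, kInvalidate, kReload };

// load_from_external: the framework recognized that the Global State
//                     should be loaded from an external source.
// reload_on_restart:  the value of rapid_reload_on_restart.
ExternalStateAction decide_external_state_action(bool load_from_external,
                                                 bool reload_on_restart) {
  // Nothing to process when the in-memory Global State is already usable.
  if (!load_from_external) return ExternalStateAction::kNone;
  // Both conditions of the rule hold: invalidate the external state.
  if (!reload_on_restart) return ExternalStateAction::kInvalidate;
  // Assumed complement of the rule: reload the tables found externally.
  return ExternalStateAction::kReload;
}
```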

-- Bug#33752387 --

Problem:
A shutdown of MySQL causes a crash in queries fired by DD worker.

Solution:
Prevent MySQL from killing DD worker's queries by instantiating a
DD_kill_immunizer before the queries are fired.
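
The general pattern is a scope guard around the queries. The sketch
below is a generic RAII model, not the real DD_kill_immunizer: Session
and KillImmunityGuard are hypothetical, and deferring a kill until the
guard is destroyed is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical session model: only what the sketch needs.
struct Session {
  bool killed = false;        // a kill has taken effect
  bool immune = false;        // kills are currently deferred
  bool kill_pending = false;  // a kill arrived while immune

  void request_kill() {
    if (immune)
      kill_pending = true;
    else
      killed = true;
  }
};

// RAII guard: while it is alive, kill requests are deferred, not applied.
class KillImmunityGuard {
 public:
  explicit KillImmunityGuard(Session &session) : m_session(session) {
    assert(!m_session.immune);  // nested guards are not modelled here
    m_session.immune = true;
  }
  ~KillImmunityGuard() {
    m_session.immune = false;
    // Apply a kill that arrived while the guard was active.
    if (m_session.kill_pending) {
      m_session.kill_pending = false;
      m_session.killed = true;
    }
  }
  KillImmunityGuard(const KillImmunityGuard &) = delete;
  KillImmunityGuard &operator=(const KillImmunityGuard &) = delete;

 private:
  Session &m_session;
};
```

Constructing the guard before the first query is fired means a shutdown
cannot interrupt a query halfway through in this model.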

-- Patch #5 --

Problem:
A table can be loaded before the DD Worker queries the Data Dictionary.
In that case the table is wrongly processed as part of the external
global state.

Solution:
If the table is present in the current in-memory global state we will
not consider it as part of the external global state and we will not
process it by the recovery framework.
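
The check amounts to a set difference. A minimal sketch follows, with
hypothetical names (tables_needing_recovery and both parameters) and
tables identified by plain strings:

```cpp
#include <set>
#include <string>
#include <vector>

// Return the tables from the external global state that are not already
// present in the current in-memory global state, preserving their order.
std::vector<std::string> tables_needing_recovery(
    const std::vector<std::string> &external_state,
    const std::set<std::string> &in_memory_state) {
  std::vector<std::string> result;
  for (const auto &table : external_state) {
    // A table loaded before the Data Dictionary was queried is skipped.
    if (in_memory_state.count(table) == 0) result.push_back(table);
  }
  return result;
}
```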

-- Bug#34197659 --

Problem:
If a table reload after restart causes an OOM, the cluster goes into the
RECOVERYFAILED state.

Solution:
Recognize when the tables are being reloaded after restart and do not
move the cluster into RECOVERYFAILED. In that case only the current
reload will fail and the reload for other tables will be attempted.
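
The difference between the two behaviours can be sketched as a reload
loop. All names here (ClusterState, ReloadOutcome, reload_tables) are
hypothetical, and stopping at the first failure outside the
reload-after-restart case is an assumption of the model.

```cpp
#include <functional>
#include <string>
#include <vector>

enum class ClusterState { kOn, kRecoveryFailed };

struct ReloadOutcome {
  ClusterState state;
  std::vector<std::string> failed_tables;
};

// Reload each table with `reload_one`, which returns true on success.
// During a reload after restart a failure only affects the current table;
// otherwise the first failure moves the cluster to kRecoveryFailed.
ReloadOutcome reload_tables(
    const std::vector<std::string> &tables,
    const std::function<bool(const std::string &)> &reload_one,
    bool reload_after_restart) {
  ReloadOutcome outcome{ClusterState::kOn, {}};
  for (const auto &table : tables) {
    if (reload_one(table)) continue;
    outcome.failed_tables.push_back(table);
    if (!reload_after_restart) {
      outcome.state = ClusterState::kRecoveryFailed;
      break;
    }
  }
  return outcome;
}
```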

Change-Id: Ic0c2a763bc338ea1ae6a7121ff3d55b456271bf0
stojadin2701 committed Jun 1, 2022
1 parent d8cbbf8 commit 32fdb82
Showing 8 changed files with 29 additions and 11 deletions.
4 changes: 2 additions & 2 deletions mysql-test/r/information_schema_ci.result
@@ -2886,7 +2886,7 @@ SELECT TABLE_NAME, CREATE_OPTIONS
 FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_NAME = 't1';
 TABLE_NAME CREATE_OPTIONS
-t1 SECONDARY_ENGINE="myisam"
+t1 SECONDARY_ENGINE="myisam" SECONDARY_LOAD="0"
 DROP TABLE t1;
 CREATE TABLE t1 (f1 INT);
 SELECT TABLE_NAME, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES
@@ -2897,7 +2897,7 @@ ALTER TABLE t1 SECONDARY_ENGINE=myisam;
 SELECT TABLE_NAME, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_NAME = 't1';
 TABLE_NAME CREATE_OPTIONS
-t1 SECONDARY_ENGINE="myisam"
+t1 SECONDARY_ENGINE="myisam" SECONDARY_LOAD="0"
 DROP TABLE t1;
 #
 # BUG#29406053: OPTIMIZER_SWITCH DERIVED_MERGE=OFF CAUSES TABLE COMMENTS
4 changes: 2 additions & 2 deletions mysql-test/r/information_schema_cs.result
@@ -2886,7 +2886,7 @@ SELECT TABLE_NAME, CREATE_OPTIONS
 FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_NAME = 't1';
 TABLE_NAME CREATE_OPTIONS
-t1 SECONDARY_ENGINE="myisam"
+t1 SECONDARY_ENGINE="myisam" SECONDARY_LOAD="0"
 DROP TABLE t1;
 CREATE TABLE t1 (f1 INT);
 SELECT TABLE_NAME, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES
@@ -2897,7 +2897,7 @@ ALTER TABLE t1 SECONDARY_ENGINE=myisam;
 SELECT TABLE_NAME, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_NAME = 't1';
 TABLE_NAME CREATE_OPTIONS
-t1 SECONDARY_ENGINE="myisam"
+t1 SECONDARY_ENGINE="myisam" SECONDARY_LOAD="0"
 DROP TABLE t1;
 #
 # BUG#29406053: OPTIMIZER_SWITCH DERIVED_MERGE=OFF CAUSES TABLE COMMENTS
2 changes: 1 addition & 1 deletion sql/dd/cache/dictionary_client.h
@@ -1199,7 +1199,7 @@ class Dictionary_client {
 verifying that an object with the same id already exists. The old object,
 which may be present in the shared dictionary cache, is not modified. To
 make the changes visible in the shared cache, please call
-remove_uncommuitted_objects().
+remove_uncommitted_objects().
 @note A precondition is that the object has been acquired from the
 shared cache indirectly by acquire_for_modification(). For storing
6 changes: 4 additions & 2 deletions sql/dd/dd_table.cc
@@ -2136,10 +2136,12 @@ static bool fill_dd_table_from_create_info(
   assert(create_info->default_table_charset);
   tab_obj->set_collation_id(create_info->default_table_charset->number);

-  // Secondary engine.
-  if (create_info->secondary_engine.str != nullptr)
+  // Secondary engine and secondary load.
+  if (create_info->secondary_engine.str != nullptr) {
     table_options->set("secondary_engine",
                        make_string_type(create_info->secondary_engine));
+    table_options->set("secondary_load", false);
+  }

   tab_obj->set_engine_attribute(create_info->engine_attribute);
   tab_obj->set_secondary_engine_attribute(
1 change: 1 addition & 0 deletions sql/dd/impl/types/abstract_table_impl.cc
@@ -78,6 +78,7 @@ static const std::set<String_type> default_valid_option_keys = {
     "plugin_version",
     "row_type",
     "secondary_engine",
+    "secondary_load",
     "server_i_s_table",
     "server_p_s_table",
     "stats_auto_recalc",
10 changes: 10 additions & 0 deletions sql/item_strfunc.cc
@@ -4620,6 +4620,16 @@ String *Item_func_get_dd_create_options::val_str(String *str) {
       }
     }

+    if (p->exists("secondary_load")) {
+      dd::String_type opt_value;
+      p->get("secondary_load", &opt_value);
+      if (!opt_value.empty()) {
+        ptr = my_stpcpy(ptr, " SECONDARY_LOAD=\"");
+        ptr = my_stpcpy(ptr, opt_value.c_str());
+        ptr = my_stpcpy(ptr, "\"");
+      }
+    }
+
     if (ptr == option_buff)
       oss << "";
     else
2 changes: 1 addition & 1 deletion sql/mysqld.h
@@ -158,7 +158,7 @@ extern CHARSET_INFO *character_set_filesystem;
 enum enum_server_operational_state {
   SERVER_BOOTING, /* Server is not operational. It is starting */
   SERVER_OPERATING, /* Server is fully initialized and operating */
-  SERVER_SHUTTING_DOWN /* erver is shutting down */
+  SERVER_SHUTTING_DOWN /* Server is shutting down */
 };
 enum_server_operational_state get_server_state();

11 changes: 8 additions & 3 deletions sql/sql_table.cc
@@ -11464,9 +11464,9 @@ bool Sql_cmd_secondary_load_unload::mysql_secondary_load_or_unload(
          hton->post_ddl != nullptr);

   dd::cache::Dictionary_client::Auto_releaser releaser(thd->dd_client());
-  const dd::Table *table_def = nullptr;
-  if (thd->dd_client()->acquire(table_list->db, table_list->table_name,
-                                &table_def))
+  dd::Table *table_def = nullptr;
+  if (thd->dd_client()->acquire_for_modification(
+          table_list->db, table_list->table_name, &table_def))
     return true;

   // Cleanup that must be done regardless of commit or rollback.
@@ -11514,6 +11514,11 @@ bool Sql_cmd_secondary_load_unload::mysql_secondary_load_or_unload(
                                thd->variables.lock_wait_timeout))
     return true;

+  // Update the secondary_load flag based on the current operation.
+  if (table_def->options().set("secondary_load", is_load) ||
+      thd->dd_client()->update(table_def))
+    return true;
+
   // Close primary table.
   close_all_tables_for_name(thd, table_list->table->s, false, nullptr);
   table_list->table = nullptr;
