SHOW SLAVE STATUS shows real error
Summary: With parallel replication, SHOW SLAVE STATUS doesn't show the actual error. The worker creates the real error (Slave_reporting_capability::va_report) and sets m_last_error right after setting the coordinator error, so we just need to reverse the order of the two and show the real error in the coordinator. Access to the last error is protected by err_lock, and we fall back to the previous generic message if the error number is 0 and the stored message can't be trusted.
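
For illustration only, the ordering described in the summary can be sketched outside the server. This is not the code in sql/rpl_rli_pdb.cc: SketchWorker, report() and coordinator_message() are hypothetical names, std::mutex stands in for err_lock, and the message text is shortened. The sketch assumes the worker records its error first, and that the coordinator message is then built under the lock from that error, falling back to the generic text when the error number is 0.

```cpp
#include <cstdio>
#include <mutex>
#include <string>

// Hypothetical stand-in for the worker's last-error record (m_last_error).
struct SketchLastError {
  int number = 0;
  std::string message;
};

// Hypothetical stand-in for a replication worker; not the real Slave_worker.
struct SketchWorker {
  std::mutex err_lock;  // plays the role of err_lock in the summary
  SketchLastError last_error;

  // Plays the role of va_report(): records the real error on the worker.
  void report(int err_code, const std::string &detail) {
    std::lock_guard<std::mutex> guard(err_lock);
    last_error.number = err_code;
    last_error.message = detail;
  }

  // Builds the coordinator message. It must run after report() so that the
  // real error is available; otherwise the generic text is used.
  std::string coordinator_message(const std::string &generic) {
    std::lock_guard<std::mutex> guard(err_lock);
    const std::string &recent =
        last_error.number != 0 ? last_error.message : generic;
    char buf[512];
    std::snprintf(buf, sizeof(buf),
                  "Coordinator stopped because there were error(s) in the "
                  "worker(s). The most recent failure being: %s; "
                  "See error log for more details.",
                  recent.c_str());
    return std::string(buf);
  }
};
```

Calling coordinator_message() before report() returns the generic text, which mirrors the fallback path; calling it afterwards returns the recorded error instead.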

Reviewed By: hermanlee, Pushapgl

Differential Revision: D28925760

fbshipit-source-id: f351b8ca3e6
yizhang82 authored and facebook-github-bot committed Jun 10, 2021
1 parent 5473299 commit 1deffd7
Showing 1 changed file with 33 additions and 23 deletions.
56 changes: 33 additions & 23 deletions sql/rpl_rli_pdb.cc
@@ -1692,33 +1692,60 @@ void Slave_worker::do_report(loglevel level, int err_code, const char *msg,

   gtid_next->to_string(global_sid_map, buff_gtid, true);

+  if (is_group_replication_applier_channel) {
+    snprintf(buff_coord, sizeof(buff_coord),
+             "Worker %u failed executing transaction '%s'", internal_id,
+             buff_gtid);
+  } else {
+    snprintf(buff_coord, sizeof(buff_coord),
+             "Worker %u failed executing transaction '%s' at "
+             "master log %s, end_log_pos %llu",
+             internal_id, buff_gtid, log_name, log_pos);
+  }
+
+  /*
+    Error reporting by the worker. The worker updates its error fields as well
+    as reports the error in the log.
+  */
+  this->va_report(level, err_code, buff_coord, msg, args);
+
   if (level == ERROR_LEVEL && (!has_temporary_error(thd, err_code) ||
                                thd->get_transaction()->cannot_safely_rollback(
                                    Transaction_ctx::SESSION))) {
     char coordinator_errmsg[MAX_SLAVE_ERRMSG];
+    const char *err_msg = nullptr;
+
+    mysql_mutex_lock(&err_lock);
+    if (m_last_error.number) {
+      /* We know for sure there is an valid error */
+      err_msg = m_last_error.message;
+    } else {
+      /* Fallback to generic error message just in case */
+      err_msg = buff_coord;
+    }

     if (is_group_replication_applier_channel) {
       snprintf(coordinator_errmsg, MAX_SLAVE_ERRMSG,
                "Coordinator stopped because there were error(s) in the "
                "worker(s). "
-               "The most recent failure being: Worker %u failed executing "
-               "transaction '%s'. See error log and/or "
+               "The most recent failure being: %s; "
+               "See error log and/or "
                "performance_schema.replication_applier_status_by_worker "
                "table for "
                "more details about this failure or others, if any.",
-               internal_id, buff_gtid);
+               err_msg);
     } else {
       snprintf(coordinator_errmsg, MAX_SLAVE_ERRMSG,
                "Coordinator stopped because there were error(s) in the "
                "worker(s). "
-               "The most recent failure being: Worker %u failed executing "
-               "transaction '%s' at master log %s, end_log_pos %llu. "
+               "The most recent failure being: %s; "
+               "See error log and/or "
                "performance_schema.replication_applier_status_by_worker "
                "table for "
                "more details about this failure or others, if any.",
-               internal_id, buff_gtid, log_name, log_pos);
+               err_msg);
     }
+    mysql_mutex_unlock(&err_lock);

     /*
       We want to update the errors in coordinator as well as worker.
@@ -1730,23 +1757,6 @@ void Slave_worker::do_report(loglevel level, int err_code, const char *msg,
     */
     c_rli->fill_coord_err_buf(level, err_code, coordinator_errmsg);
   }
-
-  if (is_group_replication_applier_channel) {
-    snprintf(buff_coord, sizeof(buff_coord),
-             "Worker %u failed executing transaction '%s'", internal_id,
-             buff_gtid);
-  } else {
-    snprintf(buff_coord, sizeof(buff_coord),
-             "Worker %u failed executing transaction '%s' at "
-             "master log %s, end_log_pos %llu",
-             internal_id, buff_gtid, log_name, log_pos);
-  }
-
-  /*
-    Error reporting by the worker. The worker updates its error fields as well
-    as reports the error in the log.
-  */
-  this->va_report(level, err_code, buff_coord, msg, args);
 }

 #ifndef DBUG_OFF