MDEV-12699 Improve crash recovery of corrupted data pages
InnoDB crash recovery used to read every data page for which redo log exists. This is unnecessary for those pages that are initialized by the redo log. If a newly created page is corrupted, recovery could unnecessarily fail. It would suffice to reinitialize the page based on the redo log records.

To add insult to injury, InnoDB crash recovery could hang if it encountered a corrupted page. We will also fix that problem. InnoDB would normally refuse to start up if it encounters a corrupted page on recovery, but that can be overridden by setting innodb_force_recovery=1.

Data pages are completely initialized by the records MLOG_INIT_FILE_PAGE2 and MLOG_ZIP_PAGE_COMPRESS. MariaDB 10.4 additionally recognizes MLOG_INIT_FREE_PAGE, which notifies that a page has been freed and its contents can be discarded (filled with zeroes).

The record MLOG_INDEX_LOAD notifies that redo logging has been re-enabled after being disabled. We can avoid loading the page if all buffered redo log records predate the MLOG_INDEX_LOAD record.

For the internal tables of FULLTEXT INDEX, no MLOG_INDEX_LOAD records were written before commit aa3f7a1. Hence, we will skip these optimizations for tables whose name starts with FTS_.

This is joint work with Thirunarayanan Balathandayuthapani.

fil_space_t::enable_lsn, file_name_t::enable_lsn: The LSN of the latest recovered MLOG_INDEX_LOAD record for a tablespace.

mlog_init: Page initialization operations discovered during redo log scanning. FIXME: This really belongs in recv_sys->addr_hash, and should be removed in MDEV-19176.

recv_addr_state: Add the new state RECV_WILL_NOT_READ to indicate that according to mlog_init, the page will be initialized based on redo log record contents.

recv_add_to_hash_table(): Set the RECV_WILL_NOT_READ state if appropriate. For now, we do not treat MLOG_ZIP_PAGE_COMPRESS as page initialization. This works around bugs in the crash recovery of ROW_FORMAT=COMPRESSED tables.

recv_mark_log_index_load(): Process a MLOG_INDEX_LOAD record by resetting the state to RECV_NOT_PROCESSED and by updating the file_name_t::enable_lsn.

recv_init_crash_recovery_spaces(): Copy file_name_t::enable_lsn to fil_space_t::enable_lsn.

recv_recover_page(): Add the parameter init_lsn, to ignore any log records that precede the page initialization. Add DBUG output about skipped operations.

buf_page_create(): Initialize FIL_PAGE_LSN, so that recv_recover_page() will not wrongly skip applying the page-initialization record due to the field containing some newer LSN as a leftover from a different page. Do not invoke ibuf_merge_or_delete_for_page() during crash recovery.

recv_apply_hashed_log_recs(): Remove some unnecessary lookups. Note if a corrupted page was found during recovery. After invoking buf_page_create(), do invoke ibuf_merge_or_delete_for_page() via mlog_init.ibuf_merge() in the last recovery batch.

ibuf_merge_or_delete_for_page(): Relax a debug assertion.

innobase_start_or_create_for_mysql(): Abort startup if a corrupted page was found during recovery. Corrupted pages will not be flagged if innodb_force_recovery is set. However, the recv_sys->found_corrupt_fs flag can be set regardless of innodb_force_recovery if file names are found to be incorrect (for example, multiple files with the same tablespace ID).
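
The page-initialization bookkeeping described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the InnoDB code: `RecoveryModel`, `PageId`, `LogRecord` and the member names are hypothetical, `RecType::InitPage` stands in for MLOG_INIT_FILE_PAGE2, and MLOG_INDEX_LOAD handling is left out entirely. It shows the two ideas from the message: a page with a buffered initialization record does not have to be read from the data file, and records older than that initialization can be ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using lsn_t = std::uint64_t;
// (tablespace ID, page number): a stand-in for InnoDB's page identifier.
using PageId = std::pair<std::uint32_t, std::uint32_t>;

enum class RecType { InitPage, Other };  // InitPage models MLOG_INIT_FILE_PAGE2
enum class PageState { NotProcessed, WillNotRead };

struct LogRecord {
  RecType type;
  lsn_t lsn;
};

// Hypothetical model of the per-page bookkeeping; not InnoDB's data structures.
class RecoveryModel {
 public:
  // Buffer one parsed redo record for a page. Records must arrive in LSN order.
  void add(const PageId& id, const LogRecord& rec) {
    Page& page = pages_[id];
    assert(page.records.empty() || page.records.back().lsn <= rec.lsn);
    if (rec.type == RecType::InitPage) {
      // The log rebuilds the page from scratch, so the (possibly
      // corrupted) copy in the data file never has to be read.
      page.init_lsn = rec.lsn;
      page.state = PageState::WillNotRead;
    }
    page.records.push_back(rec);
  }

  // True if the page can be recreated from the log without reading it.
  bool will_not_read(const PageId& id) const {
    auto it = pages_.find(id);
    return it != pages_.end() && it->second.state == PageState::WillNotRead;
  }

  // Records to replay: everything older than the latest initialization
  // is skipped, because the initialization overwrites its effects.
  std::vector<LogRecord> records_to_apply(const PageId& id) const {
    std::vector<LogRecord> out;
    auto it = pages_.find(id);
    if (it == pages_.end()) return out;
    for (const LogRecord& rec : it->second.records) {
      if (rec.lsn >= it->second.init_lsn) out.push_back(rec);
    }
    return out;
  }

 private:
  struct Page {
    PageState state = PageState::NotProcessed;
    lsn_t init_lsn = 0;  // 0 means "no initialization record seen"
    std::vector<LogRecord> records;
  };
  std::map<PageId, Page> pages_;
};
```

In this model, a page that received records at LSN 10, an initialization at LSN 20 and another record at LSN 30 is never read, and only the last two records are replayed. A page without an initialization record keeps the old behaviour: it is read and every buffered record is applied.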
Showing 11 changed files with 459 additions and 25 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| CREATE TABLE t1(a BIGINT PRIMARY KEY) ENGINE=InnoDB, ENCRYPTED=YES; | ||
| INSERT INTO t1 VALUES(1); | ||
| CREATE TABLE t2(a BIGINT PRIMARY KEY) ENGINE=InnoDB, ENCRYPTED=YES; | ||
| INSERT INTO t1 VALUES(2); | ||
| SET GLOBAL innodb_flush_log_at_trx_commit=1; | ||
| INSERT INTO t2 VALUES(2); | ||
| # Kill the server | ||
| # Corrupt the pages | ||
| SELECT * FROM t1; | ||
| ERROR 42000: Unknown storage engine 'InnoDB' | ||
| SELECT * FROM t1; | ||
| a | ||
| 1 | ||
| 2 | ||
| SELECT * FROM t2; | ||
| a | ||
| 2 | ||
| CHECK TABLE t1,t2; | ||
| Table Op Msg_type Msg_text | ||
| test.t1 check status OK | ||
| test.t2 check status OK | ||
| DROP TABLE t1, t2; |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| --source include/have_innodb.inc | ||
| --source include/have_file_key_management_plugin.inc | ||
| | ||
| --disable_query_log | ||
| call mtr.add_suppression("InnoDB: Plugin initialization aborted"); | ||
| call mtr.add_suppression("Plugin 'InnoDB' init function returned error"); | ||
| call mtr.add_suppression("Plugin 'InnoDB' registration as a STORAGE ENGINE failed"); | ||
| call mtr.add_suppression("InnoDB: Database page corruption on disk or a failed file read of tablespace test/t1 page"); | ||
| call mtr.add_suppression("InnoDB: Failed to read file '.*test.t1\\.ibd' at offset 3: Table is encrypted but decrypt failed"); | ||
| call mtr.add_suppression("InnoDB: The page \\[page id: space=\\d+, page number=3\\] in file '.*test.t1\\.ibd' cannot be decrypted"); | ||
| call mtr.add_suppression("InnoDB: Table in tablespace \\d+ encrypted. However key management plugin or used key_version \\d+ is not found or used encryption algorithm or method does not match. Can't continue opening the table."); | ||
| --enable_query_log | ||
| | ||
| let INNODB_PAGE_SIZE=`select @@innodb_page_size`; | ||
| CREATE TABLE t1(a BIGINT PRIMARY KEY) ENGINE=InnoDB, ENCRYPTED=YES; | ||
| INSERT INTO t1 VALUES(1); | ||
| # Force a redo log checkpoint. | ||
| --source include/restart_mysqld.inc | ||
| --source ../../suite/innodb/include/no_checkpoint_start.inc | ||
| CREATE TABLE t2(a BIGINT PRIMARY KEY) ENGINE=InnoDB, ENCRYPTED=YES; | ||
| INSERT INTO t1 VALUES(2); | ||
| SET GLOBAL innodb_flush_log_at_trx_commit=1; | ||
| INSERT INTO t2 VALUES(2); | ||
| | ||
| --let CLEANUP_IF_CHECKPOINT=DROP TABLE t1,t2; | ||
| --source ../../suite/innodb/include/no_checkpoint_end.inc | ||
| | ||
| --echo # Corrupt the pages | ||
| | ||
| perl; | ||
| my $ps = $ENV{INNODB_PAGE_SIZE}; | ||
| | ||
| my $file = "$ENV{MYSQLD_DATADIR}/test/t1.ibd"; | ||
| open(FILE, "+<$file") || die "Unable to open $file"; | ||
| binmode FILE; | ||
| seek (FILE, $ENV{INNODB_PAGE_SIZE} * 3, SEEK_SET) or die "seek"; | ||
| print FILE "junk"; | ||
| close FILE or die "close"; | ||
| | ||
| $file = "$ENV{MYSQLD_DATADIR}/test/t2.ibd"; | ||
| open(FILE, "+<$file") || die "Unable to open $file"; | ||
| binmode FILE; | ||
| # Corrupt pages 1 to 3. MLOG_INIT_FILE_PAGE2 should protect us! | ||
| # Unfortunately, we are not immune to page 0 corruption. | ||
| seek (FILE, $ps, SEEK_SET) or die "seek"; | ||
| print FILE chr(0xff) x ($ps * 3); | ||
| close FILE or die "close"; | ||
| EOF | ||
| | ||
| --source include/start_mysqld.inc | ||
| --error ER_UNKNOWN_STORAGE_ENGINE | ||
| SELECT * FROM t1; | ||
| let $restart_parameters=--innodb_force_recovery=1; | ||
| --source include/restart_mysqld.inc | ||
| | ||
| SELECT * FROM t1; | ||
| SELECT * FROM t2; | ||
| CHECK TABLE t1,t2; | ||
| | ||
| DROP TABLE t1, t2; |
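
The Perl step in this test addresses pages purely by byte offset. As a minimal sketch of that arithmetic, assuming an uncompressed file-per-table .ibd file in which fixed-size pages are stored back to back (the helper `page_range` and the `ByteRange` type are hypothetical names, not part of the test suite):

```cpp
#include <cstdint>

// Byte range [offset, offset + length) covered by consecutive pages.
struct ByteRange {
  std::uint64_t offset;
  std::uint64_t length;
};

// Page N starts at byte N * page_size, so a run of pages maps to one
// contiguous byte range. (Hypothetical helper for illustration only.)
constexpr ByteRange page_range(std::uint64_t page_size,
                               std::uint64_t first_page,
                               std::uint64_t n_pages) {
  return ByteRange{first_page * page_size, n_pages * page_size};
}
```

With a 16384-byte page size, `page_range(16384, 3, 1)` starts at byte 49152, which is where the test writes "junk" into t1.ibd, and `page_range(16384, 1, 3)` covers bytes 16384 to 65535, the range that is filled with 0xff in t2.ibd. Page 0 lies outside that range, matching the comment that the test does not protect against page 0 corruption.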
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| CREATE TABLE t1(a BIGINT PRIMARY KEY) ENGINE=InnoDB; | ||
| INSERT INTO t1 VALUES(1); | ||
| CREATE TABLE t2(a BIGINT PRIMARY KEY) ENGINE=InnoDB; | ||
| INSERT INTO t1 VALUES(2); | ||
| SET GLOBAL innodb_flush_log_at_trx_commit=1; | ||
| INSERT INTO t2 VALUES(1); | ||
| # Kill the server | ||
| # Corrupt the pages | ||
| SELECT * FROM t1; | ||
| ERROR 42000: Unknown storage engine 'InnoDB' | ||
| SELECT * FROM t1; | ||
| a | ||
| 0 | ||
| 2 | ||
| SELECT * FROM t2; | ||
| a | ||
| 1 | ||
| CHECK TABLE t1,t2; | ||
| Table Op Msg_type Msg_text | ||
| test.t1 check status OK | ||
| test.t2 check status OK | ||
| DROP TABLE t1, t2; |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| --innodb_doublewrite=0 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| --source include/have_innodb.inc | ||
| | ||
| --disable_query_log | ||
| call mtr.add_suppression("InnoDB: Plugin initialization aborted"); | ||
| call mtr.add_suppression("Plugin 'InnoDB' init function returned error"); | ||
| call mtr.add_suppression("Plugin 'InnoDB' registration as a STORAGE ENGINE failed"); | ||
| call mtr.add_suppression("InnoDB: Database page corruption on disk or a failed file read of tablespace test/t1 page"); | ||
| call mtr.add_suppression("InnoDB: Failed to read file '.*test.t1\\.ibd' at offset 3: Page read from tablespace is corrupted."); | ||
| --enable_query_log | ||
| | ||
| let INNODB_PAGE_SIZE=`select @@innodb_page_size`; | ||
| CREATE TABLE t1(a BIGINT PRIMARY KEY) ENGINE=InnoDB; | ||
| INSERT INTO t1 VALUES(1); | ||
| # Force a redo log checkpoint. | ||
| --source include/restart_mysqld.inc | ||
| --source ../include/no_checkpoint_start.inc | ||
| CREATE TABLE t2(a BIGINT PRIMARY KEY) ENGINE=InnoDB; | ||
| INSERT INTO t1 VALUES(2); | ||
| SET GLOBAL innodb_flush_log_at_trx_commit=1; | ||
| INSERT INTO t2 VALUES(1); | ||
| | ||
| --let CLEANUP_IF_CHECKPOINT=DROP TABLE t1,t2; | ||
| --source ../include/no_checkpoint_end.inc | ||
| | ||
| --echo # Corrupt the pages | ||
| | ||
| perl; | ||
| my $ps = $ENV{INNODB_PAGE_SIZE}; | ||
| | ||
| my $file = "$ENV{MYSQLD_DATADIR}/test/t1.ibd"; | ||
| open(FILE, "+<$file") || die "Unable to open $file"; | ||
| binmode FILE; | ||
| sysseek(FILE, 3*$ps, 0) || die "Unable to seek $file\n"; | ||
| die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps; | ||
| # Replace the a=1 with a=0. | ||
| $page =~ s/\x80\x0\x0\x0\x0\x0\x0\x1/\x80\x0\x0\x0\x0\x0\x0\x0/; | ||
| sysseek(FILE, 3*$ps, 0) || die "Unable to seek $file\n"; | ||
| syswrite(FILE, $page, $ps)==$ps || die "Unable to write $file\n"; | ||
| close FILE or die "close"; | ||
| | ||
| $file = "$ENV{MYSQLD_DATADIR}/test/t2.ibd"; | ||
| open(FILE, "+<$file") || die "Unable to open $file"; | ||
| binmode FILE; | ||
| # Corrupt pages 1 to 3. MLOG_INIT_FILE_PAGE2 should protect us! | ||
| # Unfortunately, we are not immune to page 0 corruption. | ||
| seek (FILE, $ps, SEEK_SET) or die "seek"; | ||
| print FILE chr(0xff) x ($ps * 3); | ||
| close FILE or die "close"; | ||
| EOF | ||
| | ||
| --source include/start_mysqld.inc | ||
| --error ER_UNKNOWN_STORAGE_ENGINE | ||
| SELECT * FROM t1; | ||
| let $restart_parameters=--innodb_force_recovery=1; | ||
| --source include/restart_mysqld.inc | ||
| SELECT * FROM t1; | ||
| SELECT * FROM t2; | ||
| CHECK TABLE t1,t2; | ||
| | ||
| DROP TABLE t1, t2; |
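
The byte pattern that this test rewrites in t1.ibd (`\x80\x0\x0\x0\x0\x0\x0\x1` for a=1) follows from how InnoDB stores signed integer columns: most significant byte first, with the sign bit inverted so that byte-wise comparison agrees with numeric order. A minimal sketch of that encoding, with the hypothetical function name `encode_bigint` (this is an illustration, not InnoDB's own routine):

```cpp
#include <array>
#include <cstdint>

// Encode a signed 64-bit key the way the pattern in the test expects:
// flip the sign bit, then store the bytes most significant first.
// (Hypothetical name; an illustration, not InnoDB's own routine.)
std::array<std::uint8_t, 8> encode_bigint(std::int64_t value) {
  const std::uint64_t biased =
      static_cast<std::uint64_t>(value) ^ (std::uint64_t{1} << 63);
  std::array<std::uint8_t, 8> out{};
  for (int i = 0; i < 8; ++i) {
    // Shift the i-th most significant byte down into the low 8 bits.
    out[i] = static_cast<std::uint8_t>(biased >> (56 - 8 * i));
  }
  return out;
}
```

Under this encoding the value 1 becomes 80 00 00 00 00 00 00 01 and the value 0 becomes 80 00 00 00 00 00 00 00, which is exactly the substitution the Perl regular expression performs. That is why the expected result for t1 under innodb_force_recovery=1 lists the rows 0 and 2.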