Skip to content

Commit

Permalink
MDEV-17899: Fix a regression from MDEV-17793
Browse files Browse the repository at this point in the history
The fix of MDEV-17793 was updating SYS_INDEXES.TABLE_ID in order
to make the table invisible to purge (lazily delete old undo log
records).

By design of InnoDB, an update of TABLE_ID cannot be rolled back,
because the rollback would effectively drop all indexes of the table
due to the internal 'trigger' on SYS_INDEXES modifications.

So, we revert the code change of MDEV-17793 and instead fix
MDEV-17793 in a different way: by tweaking the undo log parsing
during purge.

The MDEV-17793 bug scenario is that a table becomes empty and
a third instant ALTER TABLE is executed before purge processes
the undo log record for the second instant ALTER TABLE. After
this point, when purge sees the record, the table could have
a mismatching number of rows.

The test case works with this alternative fix. But what about
a scenario where a fourth instant ALTER TABLE arrives before
purge processes the second one? Could anything bad happen?

Purge is only doing two things: First, free any BLOBs that
were affected by the update record, and then, reset the
DB_TRX_ID,DB_ROLL_PTR if a matching record is found.
For the hidden metadata record, the only BLOB that we update
is the hidden metadata BLOB that was introduced by MDEV-15562.
Any other BLOBs (for the initial default values of instantly
added columns) are never updated.

So, in our scenario, the metadata BLOB that was created by
the first instant ALTER TABLE (if it involved dropping or
permuting columns) would be freed by purge when it is processing
the undo record of the second ALTER TABLE. The BLOB value that
was written by the second ALTER TABLE should be freed when
the table is emptied. This is currently not done: MDEV-17383
should fix that. There is no possibility of double-free, because
purge would only free old values of BLOBs.

What about MVCC and other callers of trx_undo_update_rec_get_update()?
The answer is simple: they should never be accessing the hidden
metadata record in the first place.

dict_table_t::reassign_id(): Remove.

btr_cur_pessimistic_delete(): Clarify a comment.

row_mysql_table_id_reassign(), row_discard_tablespace_for_mysql():
Add comments explaining that the operation cannot be rolled back.

trx_undo_update_rec_get_update(): Avoid out-of-bounds access when
parsing a metadata record. Avoid unnecessary memory allocation when
filtering out fields from the update vector.
  • Loading branch information
dr-m committed Dec 11, 2018
1 parent 1c53aef commit 1b31d88
Show file tree
Hide file tree
Showing 5 changed files with 60 additions and 98 deletions.
3 changes: 2 additions & 1 deletion storage/innobase/btr/btr0cur.cc
Expand Up @@ -5836,7 +5836,8 @@ btr_cur_pessimistic_delete(
only user record, also delete the metadata record
if one exists for instant ADD COLUMN
(not generic ALTER TABLE).
If we are deleting the metadata record and the
If we are deleting the metadata record
(in the rollback of instant ALTER TABLE) and the
table becomes empty, clean up the whole page. */

const rec_t* first_rec = page_rec_get_next_const(
Expand Down
60 changes: 0 additions & 60 deletions storage/innobase/handler/handler0alter.cc
Expand Up @@ -5277,53 +5277,6 @@ dict_index_t::instant_metadata(const dtuple_t& row, mem_heap_t* heap) const
return entry;
}

/** Assign a new id to invalidate old undo log records, so
that purge will be unable to refer to fields that used to be
instantly added to the end of the index. This is only to be
used during ALTER TABLE when the table is empty, before
invoking dict_index_t::clear_instant_alter().
@param[in,out] trx dictionary transaction
@return error code */
inline dberr_t dict_table_t::reassign_id(trx_t* trx)
{
DBUG_ASSERT(instant);
ut_ad(magic_n == DICT_TABLE_MAGIC_N);
ut_ad(!is_temporary());

table_id_t new_id;
dict_hdr_get_new_id(&new_id, NULL, NULL);
pars_info_t* pinfo = pars_info_create();

pars_info_add_ull_literal(pinfo, "old", id);
pars_info_add_ull_literal(pinfo, "new", new_id);

ut_ad(mutex_own(&dict_sys->mutex));
ut_ad(rw_lock_own(dict_operation_lock, RW_LOCK_X));
ut_ad(trx->dict_operation_lock_mode == RW_X_LATCH);

dberr_t err = que_eval_sql(
pinfo,
"PROCEDURE RENUMBER_TABLE_ID_PROC () IS\n"
"BEGIN\n"
"UPDATE SYS_TABLES SET ID=:new WHERE ID=:old;\n"
"UPDATE SYS_COLUMNS SET TABLE_ID=:new WHERE TABLE_ID=:old;\n"
"UPDATE SYS_INDEXES SET TABLE_ID=:new WHERE TABLE_ID=:old;\n"
"UPDATE SYS_VIRTUAL SET TABLE_ID=:new WHERE TABLE_ID=:old;\n"
"END;\n"
, FALSE, trx);
if (err == DB_SUCCESS) {
auto fold = ut_fold_ull(id);
HASH_DELETE(dict_table_t, id_hash, dict_sys->table_id_hash,
fold, this);
id = new_id;
fold = ut_fold_ull(id);
HASH_INSERT(dict_table_t, id_hash, dict_sys->table_id_hash,
fold, this);
}

return err;
}

/** Insert or update SYS_COLUMNS and the hidden metadata record
for instant ALTER TABLE.
@param[in] ha_alter_info ALTER TABLE context
Expand Down Expand Up @@ -5612,19 +5565,6 @@ static bool innobase_instant_try(
empty_table:
/* The table is empty. */
ut_ad(page_is_root(block->frame));
if (index->table->instant) {
/* Assign a new dict_table_t::id
to invalidate old undo log records in purge,
so that they cannot refer to fields that were
instantly added to the end of the index,
instead of using the canonical positions
that will be replaced below
by index->clear_instant_alter(). */
err = index->table->reassign_id(trx);
if (err != DB_SUCCESS) {
goto func_exit;
}
}
/* MDEV-17383: free metadata BLOBs! */
btr_page_empty(block, NULL, index, 0, &mtr);
index->clear_instant_alter();
Expand Down
9 changes: 0 additions & 9 deletions storage/innobase/include/dict0mem.h
Expand Up @@ -1676,15 +1676,6 @@ struct dict_table_t {
const char* old_v_col_names,
const ulint* col_map);

/** Assign a new id to invalidate old undo log records, so
that purge will be unable to refer to fields that used to be
instantly added to the end of the index. This is only to be
used during ALTER TABLE when the table is empty, before
invoking dict_index_t::clear_instant_alter().
@param[in,out] trx dictionary transaction
@return error code */
inline dberr_t reassign_id(trx_t* trx);

/** Add the table definition to the data dictionary cache */
void add_to_cache();

Expand Down
10 changes: 10 additions & 0 deletions storage/innobase/row/row0mysql.cc
Expand Up @@ -2847,6 +2847,10 @@ row_mysql_table_id_reassign(
pars_info_add_ull_literal(info, "old_id", table->id);
pars_info_add_ull_literal(info, "new_id", *new_id);

/* Note: This cannot be rolled back. Rollback would see the
UPDATE SYS_INDEXES as two operations: DELETE and INSERT.
It would invoke btr_free_if_exists() when rolling back the
INSERT, effectively dropping all indexes of the table. */
err = que_eval_sql(
info,
"PROCEDURE RENUMBER_TABLE_PROC () IS\n"
Expand Down Expand Up @@ -3135,6 +3139,12 @@ row_discard_tablespace_for_mysql(
err = row_discard_tablespace_foreign_key_checks(trx, table);

if (err == DB_SUCCESS) {
/* Note: This cannot be rolled back.
Rollback would see the UPDATE SYS_INDEXES
as two operations: DELETE and INSERT.
It would invoke btr_free_if_exists()
when rolling back the INSERT, effectively
dropping all indexes of the table. */
err = row_discard_tablespace(trx, table);
}
}
Expand Down
76 changes: 48 additions & 28 deletions storage/innobase/trx/trx0rec.cc
Expand Up @@ -1548,7 +1548,6 @@ trx_undo_update_rec_get_update(
upd_t* update;
ulint n_fields;
byte* buf;
ulint i;
bool first_v_col = true;
bool is_undo_log = true;
ulint n_skip_field = 0;
Expand All @@ -1561,7 +1560,7 @@ trx_undo_update_rec_get_update(
n_fields = 0;
}

update = upd_create(n_fields + 2, heap);
*upd = update = upd_create(n_fields + 2, heap);

update->info_bits = info_bits;

Expand All @@ -1587,8 +1586,7 @@ trx_undo_update_rec_get_update(

/* Store then the updated ordinary columns to the update vector */

for (i = 0; i < n_fields; i++) {

for (ulint i = 0; i < n_fields; i++) {
const byte* field;
ulint len;
ulint orig_len;
Expand Down Expand Up @@ -1621,23 +1619,53 @@ trx_undo_update_rec_get_update(
upd_field_set_v_field_no(upd_field, field_no, index);
} else if (UNIV_UNLIKELY((update->info_bits
& ~REC_INFO_DELETED_FLAG)
== REC_INFO_MIN_REC_FLAG)
&& index->is_instant()) {
== REC_INFO_MIN_REC_FLAG)) {
ut_ad(type == TRX_UNDO_UPD_EXIST_REC);
const ulint uf = index->first_user_field();
ut_ad(field_no >= uf);

if (update->info_bits != REC_INFO_MIN_REC_FLAG) {
/* Generic instant ALTER TABLE */
if (field_no == uf) {
upd_field->new_val.type
.metadata_blob_init();
} else if (field_no >= index->n_fields) {
/* This is reachable during
purge if the table was emptied
and converted to the canonical
format on a later ALTER TABLE.
In this case,
row_purge_upd_exist_or_extern()
would only be interested in
freeing any BLOBs that were
updated, that is, the metadata
BLOB above. Other BLOBs in
the metadata record are never
updated; they are for the
initial DEFAULT values of the
instantly added columns, and
they will never change.
Note: if the table becomes
empty during ROLLBACK or is
empty during subsequent ALTER
TABLE, and btr_page_empty() is
called to re-create the root
page without the metadata
record, in that case we should
only free the latest version
of BLOBs in the record,
which purge would never touch. */
field_no = REC_MAX_N_FIELDS;
n_skip_field++;
} else {
ut_ad(field_no > uf);
dict_col_copy_type(
dict_index_get_nth_col(
index, field_no - 1),
&upd_field->new_val.type);
}
} else {
/* Instant ADD COLUMN...LAST */
dict_col_copy_type(
dict_index_get_nth_col(index,
field_no),
Expand Down Expand Up @@ -1701,31 +1729,23 @@ trx_undo_update_rec_get_update(
}
}

/* In rare scenario, we could have skipped virtual column (as they
are dropped. We will regenerate a update vector and skip them */
if (n_skip_field > 0) {
ulint n = 0;
ut_ad(n_skip_field <= n_fields);
/* We may have to skip dropped indexed virtual columns.
Also, we may have to trim the update vector of a metadata record
if dict_index_t::clear_instant_alter() was invoked on the table
later, and the number of fields no longer matches. */

upd_t* new_update = upd_create(
n_fields + 2 - n_skip_field, heap);
if (n_skip_field) {
upd_field_t* d = upd_get_nth_field(update, 0);
const upd_field_t* const end = d + n_fields + 2;

for (i = 0; i < n_fields + 2; i++) {
upd_field = upd_get_nth_field(update, i);

if (upd_field->field_no == REC_MAX_N_FIELDS) {
continue;
for (const upd_field_t* s = d; s != end; s++) {
if (s->field_no != REC_MAX_N_FIELDS) {
*d++ = *s;
}

upd_field_t* new_upd_field
= upd_get_nth_field(new_update, n);
*new_upd_field = *upd_field;
n++;
}
ut_ad(n == n_fields + 2 - n_skip_field);
*upd = new_update;
} else {
*upd = update;

ut_ad(d + n_skip_field == end);
update->n_fields = d - upd_get_nth_field(update, 0);
}

return(const_cast<byte*>(ptr));
Expand Down

0 comments on commit 1b31d88

Please sign in to comment.