fix data-integrity problem when add unique key #912

cenkore · 2021-01-08T09:51:40Z

Hi,
Add a PR for #485 #477 #167 .

Insert syntax in gh-ost will be rewritten with insert ignore into (row-copy) and replace into(binlog apply goroutine) . if the data in the original table or data after apply would cause conflicts, means does not meet the conditions for adding a unique key, data will be lost in current version. This PR can detect data conflicts and exit, including conflicts in the original table data, and the data after binlog apply. PTAL.

The error looks like the following and is very easy to reproduce：

original conflict

2021-01-08 16:58:15 ERROR Error 1062: Duplicate entry 'cenkore' for key 'name'

apply confict

2021-01-08 16:59:37 ERROR Error 1062: Duplicate entry 'cenkore' for key 'name'; query=
insert /* gh-ost adetestdb._testuk_gho */ into
adetestdb._testuk_gho
(id, name, location, inserttime)
values
(?, ?, ?, ?)
; args=[3 cenkore cn 2021-01-08 16:59:30]

thanks.

shlomi-noach · 2021-01-08T10:17:36Z

Adding a UNIQUE KEY while table is being written to, is by definition a race condition, unless you can guarantee the application never writes duplicates.

Can you please clarify what is the exact use case (possibly with a test) that this PR solves?

Thank you!

cenkore · 2021-01-08T15:15:47Z

Thanks for your quick reply. @shlomi-noach

In this PR, if duplicates are on the newly added unique key , it will fail instead of ignore.

Background

Gh-ost contains two threads to process the data of the published table, one thread copies the data of the original table, using Insert ignore into, and the other thread is used to process incremental data, by parsing and applying the binlog, using replace into; Same data is likely to be processed twice because of two threads are processed concurrently. In order to avoid conflicts leading to failure, insert ignore into and replace into are designed. However, when adding a unique key will cause the actual duplicate data on the newly added unique key would be ignored, because duplicate data will be ignored or replaced. This will cause data loss. In extreme cases, if you add a new column and add a unique key to this column at the same time, then there will be only one record left after the table is published. It may cause serious problems.

Proposal

Based on the above background, we can do special processing for the scene. For data copy, only copy the data that primary key value is not in _xx_gho, which can be rewritten as insert into _xx_gho select * from xx a left join _xx_gho b on a.pk = b.pk where b.pk is null; for binlog apply, You can rewrite replace into as delete + insert.

Rationale

The data copy, only the primary key value is not in the gho table is copied, but we don’t need use ignore , so we only ignore the conflict of the primary key value, but the conflict on the newly added unique key would return error; and the incremental part, assuming that the data has not been copied, delete is a dry-run. If there is a conflict with the newly added unique key, it will fail, but replace cannot do this. If the data has been copied, delete the primary key value and execute insert again. Similarly, if the data conflicts with the newly added unique key, it will also fail.

Thank you!

cenkore · 2021-01-21T04:16:27Z

Add 2 test cases

original table definition: create table uk(id int, name varchar(10),inser_time datetime, primary key(id))
alter: add unique key(name)

Test case 1:

prepare: insert into uk values(1,'cenkore',now())

execute: ./gh-ost --host=10.0.0.1 --database=adetestdb --table=uk --verbose --alter="add unique key(name)" --concurrent-rowcount --conf="/usr/local/gh-ost.cnf" --port=66666 --nice-ratio=0.1 --execute --initially-drop-ghost-table --initially-drop-old-table --postpone-cut-over-flag-file="/tmp/uk.postpone"

We can insert a duplicate row before cutover table by popstpone flag, insert sql: insert into uk values(2,'cenkore',now())

result:

current version: handle success, BUT loss one row which id is 1, new table value as follows:

+----+---------+---------------------+
| id | name | inser_time |
+----+---------+---------------------+
| 2 | cenkore | 2021-01-21 11:06:45 |
+----+---------+---------------------+
this pr : handle failure with error:

2021-01-21 11:59:31 FATAL Error 1062: Duplicate entry 'cenkore' for key 'name'; query=
insert /* gh-ost adetestdb._uk_gho */ into
adetestdb._uk_gho
(id, name, inser_time)
values
(?, ?, ?)
; args=[2 cenkore 2021-01-21 11:58:31]

Test case 2:

prepare: insert into uk values (1,'cenkore',now()),(2,'cenkore',now())

execute: ./gh-ost --host=10.0.0.1 --database=adetestdb --table=uk --verbose --alter="add unique key(name)" --concurrent-rowcount --conf="/usr/local/gh-ost.cnf" --port=66666 --nice-ratio=0.1 --execute --initially-drop-ghost-table --initially-drop-old-table

result:

current version: handle success, BUT loss one row which id is 2, new table value as follows:

+----+---------+---------------------+
| id | name | inser_time |
+----+---------+---------------------+
| 1 | cenkore | 2021-01-21 11:51:01 |
+----+---------+---------------------+
this pr : handle failure with error:

2021-01-21 11:56:38 FATAL Error 1062: Duplicate entry 'cenkore' for key 'name'

shlomi-noach · 2021-02-18T07:01:48Z

@cenkore thank you for the PR and for the explanation.

I've been thinking, possibly we don't need to change the insert ignore. We can keep that as it is, without the left join. It may be enough to replace the replace into with (delete where pk=? + insert into). I think the logic is still correct.
I'm not sure yet. Can you see a problem with that?

shlomi-noach · 2021-02-18T07:03:16Z

Oh, strike that. The trivial case when the table is pre-populated with conflicting values + zero traffic in the binary logs is enough to prove my thinking was wrong.

shlomi-noach · 2021-02-21T11:58:17Z

@cenkore would you mind submitting an identical PR on https://github.com/openark/gh-ost/, where I can take a closer look please?

cenkore · 2021-02-22T11:09:37Z

@shlomi-noach ok, I will submit later. Thank you.

cenkore · 2021-02-23T05:33:40Z

@shlomi-noach I submitted to openark/gh-ost, but I can’t guarantee the accuracy of my operation, because it's difficult for me which merge is between two forked sibling projects, and your project is ahead of github/gh-ost. :D

shlomi-noach · 2021-02-23T06:25:53Z

@cenkore thank you so much for the effort! I will double check.

jiangnanora · 2021-09-16T03:42:59Z

When will this pr be merged

AndrewHuangMiao · 2025-02-18T10:51:09Z

When this will be fixed?

meiji163 · 2025-03-22T03:07:42Z

This situation is addressed by #1500

iss485 fix

6058707

pomelo2016 added 2 commits January 14, 2021 15:18

gofmt

d0610c9

gotest

9785e11

cenkore mentioned this pull request Feb 23, 2021

fix data-integrity problem when add unique key openark/gh-ost#16

Open

timvaillancourt added the data-integrity label May 3, 2021

timvaillancourt added this to the v1.2.0 milestone May 3, 2021

timvaillancourt added the bug label May 4, 2021

timvaillancourt modified the milestones: v1.1.2, v1.2.0, v1.1.3 Jun 17, 2021

sjjian mentioned this pull request Feb 22, 2022

support gh-ost precheck actiontech/sqle#317

Open

timvaillancourt modified the milestones: v1.1.5, v1.1.6 Jun 23, 2022

timvaillancourt modified the milestones: v1.1.6, v1.1.7 Dec 7, 2023

timvaillancourt modified the milestones: v1.1.7, v1.1.8 Apr 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix data-integrity problem when add unique key #912

fix data-integrity problem when add unique key #912

Uh oh!

cenkore commented Jan 8, 2021

Uh oh!

shlomi-noach commented Jan 8, 2021 •

edited

Loading

Uh oh!

cenkore commented Jan 8, 2021

Uh oh!

cenkore commented Jan 21, 2021

Uh oh!

shlomi-noach commented Feb 18, 2021

Uh oh!

shlomi-noach commented Feb 18, 2021

Uh oh!

shlomi-noach commented Feb 21, 2021

Uh oh!

cenkore commented Feb 22, 2021 •

edited

Loading

Uh oh!

cenkore commented Feb 23, 2021

Uh oh!

shlomi-noach commented Feb 23, 2021

Uh oh!

jiangnanora commented Sep 16, 2021

Uh oh!

AndrewHuangMiao commented Feb 18, 2025

Uh oh!

meiji163 commented Mar 22, 2025

Uh oh!

Uh oh!

fix data-integrity problem when add unique key #912

Are you sure you want to change the base?

fix data-integrity problem when add unique key #912

Uh oh!

Conversation

cenkore commented Jan 8, 2021

Uh oh!

shlomi-noach commented Jan 8, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cenkore commented Jan 8, 2021

Background

Proposal

Rationale

Uh oh!

cenkore commented Jan 21, 2021

Test case 1:

Test case 2:

Uh oh!

shlomi-noach commented Feb 18, 2021

Uh oh!

shlomi-noach commented Feb 18, 2021

Uh oh!

shlomi-noach commented Feb 21, 2021

Uh oh!

cenkore commented Feb 22, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cenkore commented Feb 23, 2021

Uh oh!

shlomi-noach commented Feb 23, 2021

Uh oh!

jiangnanora commented Sep 16, 2021

Uh oh!

AndrewHuangMiao commented Feb 18, 2025

Uh oh!

meiji163 commented Mar 22, 2025

Uh oh!

Uh oh!

shlomi-noach commented Jan 8, 2021 •

edited

Loading

cenkore commented Feb 22, 2021 •

edited

Loading