Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*(ticdc): split old update kv entry after restarting changefeed #10919

Merged
merged 34 commits into from
May 6, 2024

Conversation

lidezhu
Copy link
Collaborator

@lidezhu lidezhu commented Apr 17, 2024

What problem does this PR solve?

Issue Number: close #10918

What is changed and how it works?

When the downstream is mysql-compatiable, we change the logic to handle update events at restart

  1. When start changefeed, we always get the current timestamp from pd(represented as thresholdTs later);
  2. Previous behavior: when sink module receive an update event which commit ts is before thresholdTs, it split it into a delete event and a replace event, and then send them downstream;
  3. Current behavior: when puller module receive an update kv entry which commit ts is before thresholdTs, it split into a delete kv entry and a insert kv entry, and then write them into sorter;
  4. This change makes sure that inside a transaction which commit ts is before thresholdTs, all delete events will send downstream before insert events;

When the downstream is mysql-compatiable, We also change the logic to handle update events after the previous restart stage finishes

Previously, when meet a transaction with multiple update events which change the primary key or the not null unique key inside sink module, we always split them into delete events and replace events; This may cause data inconsistency problem as the following example:
Suppose a table t has the schema create table t(a int primary key), and it have two rows a=1 and a=2;
If a transaction contains two update events:

1. update t set a = 3 where a = 2;
2. update t set a = 2 where a = 1;

In the ideal scenario, we expect these two events to be splite into the following events:

1. delete from t where a = 2;
2. replace into t values(3);
3. delete from t where a = 1;
4. replace into t values(2);

After the transaction, table t have two rows a=2 and a=3;

But inside cdc, we cannot get the original order of these two update events, so these two update events may be split into the following events:

1. delete from t where a = 1;
2. replace into t values(2);
3. delete from t where a = 2;
4. replace into t values(3);

After the transaction, table t have only one row a=3;(Data inconsistency happens!)

So we do not split any update events inside sink module when the downstream is mysql, this may cause duplicate key entry error when the order to execute update events inside a transaction is wrong;
This error will cause changefeed to restart and enter the previous restart change, the update events will be split inside puller, and the delete events will be send before insert events;

When apply redo log

When apply redo log, split update events which update handle key to delete events and insert events, and cache the insert events until all delete events in the same transaction are emitted. If the insert events is too many(larger than 50), events will be written to a temp local file;

Check List

Tests

  • Integration test
  • Unit test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Fix potential risk of data inconsistency when there are dependencies between update statements in the same transaction.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/needs-triage-completed and removed do-not-merge/needs-linked-issue labels Apr 17, 2024
Copy link

codecov bot commented Apr 17, 2024

Codecov Report

Attention: Patch coverage is 48.50498% with 155 lines in your changes are missing coverage. Please review.

Project coverage is 57.4095%. Comparing base (be15534) to head (76ad8e6).

Additional details and impacted files
Components Coverage Δ
cdc 61.6882% <48.5049%> (-0.0717%) ⬇️
dm 51.2335% <ø> (+0.0040%) ⬆️
engine 63.4161% <ø> (+0.0141%) ⬆️
Flag Coverage Δ
unit 57.4095% <48.5049%> (-0.0233%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@@               Coverage Diff                @@
##             master     #10919        +/-   ##
================================================
- Coverage   57.4327%   57.4095%   -0.0233%     
================================================
  Files           851        851                
  Lines        125230     125467       +237     
================================================
+ Hits          71923      72030       +107     
- Misses        47914      48021       +107     
- Partials       5393       5416        +23     

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 17, 2024
@lidezhu
Copy link
Collaborator Author

lidezhu commented Apr 18, 2024

/retest

3 similar comments
@lidezhu
Copy link
Collaborator Author

lidezhu commented Apr 18, 2024

/retest

@lidezhu
Copy link
Collaborator Author

lidezhu commented Apr 19, 2024

/retest

@lidezhu
Copy link
Collaborator Author

lidezhu commented Apr 20, 2024

/retest

@lidezhu lidezhu force-pushed the fix-duplicate-key branch 2 times, most recently from 9892dcb to ea9216f Compare April 20, 2024 01:33
@lidezhu
Copy link
Collaborator Author

lidezhu commented Apr 20, 2024

/retest

@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 20, 2024
@asddongmen
Copy link
Contributor

asddongmen commented Apr 20, 2024

/review default

@ti-chi-bot ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 23, 2024
@sdojjy
Copy link
Member

sdojjy commented Apr 23, 2024

/retest-required

@lidezhu
Copy link
Collaborator Author

lidezhu commented May 5, 2024

/test dm-integration-test

@asddongmen
Copy link
Contributor

/lgtm

Copy link
Contributor

ti-chi-bot bot commented May 6, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: asddongmen, sdojjy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lidezhu
Copy link
Collaborator Author

lidezhu commented May 6, 2024

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 6, 2024
@ti-chi-bot ti-chi-bot bot merged commit c710066 into pingcap:master May 6, 2024
28 checks passed
@lidezhu lidezhu deleted the fix-duplicate-key branch May 6, 2024 10:52
@lidezhu lidezhu added needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. labels May 7, 2024
@lidezhu
Copy link
Collaborator Author

lidezhu commented May 7, 2024

/run-cherry-picker

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request May 7, 2024
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-8.1: #11027.

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #11028.

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #11029.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request May 7, 2024
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request May 7, 2024
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-7.5: #11030.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. needs-update-docs Should update docs after this PR is merged. Remove this label once the docs are updated release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TiCDC meet error: Duplicate entry xxx for key xxx
6 participants