Skip to content

[Mysql] Fix duplicate split which cause duplicate data when open scanNewlyAddedTableEnabled#2096

Merged
ruanhang1993 merged 3 commits intoapache:masterfrom
EMsnap:2095
Jul 21, 2023
Merged

[Mysql] Fix duplicate split which cause duplicate data when open scanNewlyAddedTableEnabled#2096
ruanhang1993 merged 3 commits intoapache:masterfrom
EMsnap:2095

Conversation

@EMsnap
Copy link
Copy Markdown
Member

@EMsnap EMsnap commented Apr 20, 2023

Fix #2095

I carefully debugged the bug and there is the analysis:

1、In MySqlSnapshotSplitAssigner, split table happens asynchronously
2、After the finish of every table split the table will be moved from remaining table
3、After tm calls getNext() from MySqlSnapshotSplitAssigner, the table will be added to alreadyProcessedTables
4、However when job restarts and open scanNewlyAddedTableEnabled and discover newly added table, it gets into the following logic

image

The alreadyProcessedTables + remainingTables won't always be euqals to the tables splitted in the last run.

This happens when table is so large and tm can't process split on time.

For example:
1、The task reads from A and B, and each of them split into 1000 chunks
2、The remaingTable is empty since they are already splitted, but alreadyProcessedTables only contains A since B is not fetch by tm now
3、 job restarts and B will be spliited again.

@EMsnap
Copy link
Copy Markdown
Member Author

EMsnap commented Apr 20, 2023

@leonardBang @lzshlzsh PLAL

@ruanhang1993
Copy link
Copy Markdown
Contributor

@EMsnap Thanks.
Please rebase the master branch.

@EMsnap
Copy link
Copy Markdown
Member Author

EMsnap commented Jul 6, 2023

@EMsnap Thanks. Please rebase the master branch.

Sure, done

Thanks for the update

@ruanhang1993
Copy link
Copy Markdown
Contributor

@EMsnap Thanks. Please rebase the master branch.

Sure, done

Thanks for the update

Thanks @EMsnap . The CI failed. Please take a look at it.

@EMsnap
Copy link
Copy Markdown
Member Author

EMsnap commented Jul 11, 2023

@EMsnap Thanks. Please rebase the master branch.

Sure, done
Thanks for the update

Thanks @EMsnap . The CI failed. Please take a look at it.

image

I guess the failed UT is not related to the pr since the pr works on mysql-cdc, could u help rerun the UT ?

@ruanhang1993
Copy link
Copy Markdown
Contributor

@EMsnap I think this PR should be modified. Actually all captured tables are not equals to remainingTables + alreadyProcessedTables. The method computeTablesPendingSnapshot may be helpful to understand it.
图片 1

remainingSplits.addAll(schemaLessSnapshotSplits);
if (!chunkSplitter.hasNextChunk()) {
remainingTables.remove(nextTable);
addAlreadyProcessedTablesIfNotExists(nextTable);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should change the code in captureNewlyAddedTables instead of changing the location of addAlreadyProcessedTablesIfNotExists.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted, I'll change the implementation asap

@EMsnap
Copy link
Copy Markdown
Member Author

EMsnap commented Jul 14, 2023

@EMsnap I think this PR should be modified. Actually all captured tables are not equals to remainingTables + alreadyProcessedTables. The method computeTablesPendingSnapshot may be helpful to understand it. 图片 1

Thanks for the reply, I'll take a look.
But where is this computeTablesPendingSnapshot method located? I can't find it on the master branch

@ruanhang1993
Copy link
Copy Markdown
Contributor

computeTablesPendingSnapshot

Sorry, the computeTablesPendingSnapshot are not provided before the Flink 1.18. Please ignore that.

@ruanhang1993
Copy link
Copy Markdown
Contributor

@EMsnap Thanks for the work. We try to fix this in the version 2.4.1. So I provide a new changes to fix this.

Copy link
Copy Markdown
Contributor

@ruanhang1993 ruanhang1993 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ruanhang1993 ruanhang1993 merged commit 33a3940 into apache:master Jul 21, 2023
@EMsnap
Copy link
Copy Markdown
Member Author

EMsnap commented Jul 21, 2023

@EMsnap Thanks for the work. We try to fix this in the version 2.4.1. So I provide a new changes to fix this.

Thanks for your reply and your new changes, great job !

ChaomingZhangCN pushed a commit to ChaomingZhangCN/flink-cdc that referenced this pull request Jan 13, 2025
…y added tables (apache#2096)


Co-authored-by: Hang Ruan <ruanhang1993@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Duplicate Split which cause duplicate data when open scanNewlyAddedTableEnabled

2 participants