Skip to content

[cdc-common] Introduce CollectionFilter to filter collection from given TableId pattern#2673

Merged
leonardBang merged 1 commit intoapache:masterfrom
GOODBOY008:master-filter
Nov 25, 2023
Merged

[cdc-common] Introduce CollectionFilter to filter collection from given TableId pattern#2673
leonardBang merged 1 commit intoapache:masterfrom
GOODBOY008:master-filter

Conversation

@GOODBOY008
Copy link
Copy Markdown
Member

@GOODBOY008 GOODBOY008 commented Nov 9, 2023

Close #2669
[cdc-common] Introduce CollectionFilter to filter collection from given TableId pattern

Copy link
Copy Markdown
Contributor

@leonardBang leonardBang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GOODBOY008 for the contribution ,could you add some implementation classes and corresponding tests as well?

Copy link
Copy Markdown
Contributor

@Jiabao-Sun Jiabao-Sun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GOODBOY008 for this great work.
I left some comments.

Comment thread flink-cdc-common/src/main/java/com/ververica/cdc/common/schema/Selectors.java Outdated
Comment thread flink-cdc-common/src/main/java/com/ververica/cdc/common/schema/Selectors.java Outdated
assertNotAllowed(filter, "db2", "sc1", "A2");

// schemaName, tableName
filter = Selectors.tableSelector().includeTables("sc1\\.A[0-9]+,sc2\\.B[0-1]+").build();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GOODBOY008 @Jiabao-Sun Could we use "sc1.A[0-9]+,sc2.B[0-1]+" here? I mean if . is part of regex pattern and then we use sc1.A\\.*,sc2.B\\.*.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most common case for entire database sync should be mydb.users, mydb.orders, which is better than mydb\\.users, mydb\\.orders

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sc1.A[0-9]+ can be matched to sc1.a1 and sc1.a2, because . can match any character.

Write mydb.users and mydb.orders directly that can match the corresponding table, but if there is no db constraint, there may be side effects: mydb_orders, mydb-orders ...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use \. if . is one part of regex pattern just like we treats ,, , can be part of a regex pattern

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean we can follow same rule and offer a more elegant table id pattern

private static TableId getTableId(String nameSpace, String schemaName, String tableName) {
TableId id;
if (nameSpace == null && schemaName == null) {
id = TableId.tableId(tableName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we also add test for this kind of table.

Copy link
Copy Markdown
Contributor

@leonardBang leonardBang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GOODBOY008 for the update, the updated PR especially the tests looks good to me.

@leonardBang leonardBang merged commit 9c3c7fa into apache:master Nov 25, 2023
@leonardBang leonardBang added this to the V3.0.0 milestone Nov 25, 2023
e-mhui pushed a commit to e-mhui/flink-cdc-connectors that referenced this pull request Dec 2, 2023
ChaomingZhangCN pushed a commit to ChaomingZhangCN/flink-cdc that referenced this pull request Jan 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[cdc-common] Introduce CollectionFilter to filter collection from given TableId pattern

3 participants