Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pipeline-connector] Ability to provide a schema replacement #2908

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

WholeWorld-Timothy
Copy link
Contributor

@WholeWorld-Timothy WholeWorld-Timothy commented Dec 21, 2023

In our usage scenario, the source schema of mysql and the target schema of starrocks are often different, but the table names and column names are consistent, and this pr provides this capability.
Maintain a route configuration as follows:

route:
  - source-table: test.[\S]*
    sink-table: test02.<>

Represents the table below the test will be transferred to the test02, where the symbol <> the representative table name is unchanged, just change the schema. For example, the test.table would be replaced with the test02.table.
And maintain a route configuration as follows:

route:
  - source-table: test.[\S]*
    sink-table: test02.s02_<>

Represents the table below the test will be transferred to the test02, Replace the original table name with the <> symbol. For example, the test.table would be replaced with the test02.s02_table.
And the user can define the replacement symbols by themselves as follows:

route:
  - source-table: mrtdb.[\S]*
    sink-table: mrtdb_sd.sd_s<>
    replace-symbol: s<>

Represents the table below the test will be transferred to the test02, Replace the original table name with the s<> symbol.

@WholeWorld-Timothy WholeWorld-Timothy changed the title Ability to provide a schema replacement [pipeline-connector][starrocks] Ability to provide a schema replacement Dec 21, 2023
@WholeWorld-Timothy
Copy link
Contributor Author

PTAL @banmoy

@lvyanquan
Copy link
Contributor

Hi, @WholeWorld-Timothy this idea looks good to me.
Can we support adding table name prefix and table name suffix based on this placeholder, And can we expand this to modifying the schema?

@WholeWorld-Timothy
Copy link
Contributor Author

The current mode is a support for modification mode, and I think it is a good idea to add a table name prefix or a table name suffix based on this placeholder, so I can try it out.

@WholeWorld-Timothy WholeWorld-Timothy changed the title [pipeline-connector][starrocks] Ability to provide a schema replacement [pipeline-connector] Ability to provide a schema replacement Jan 4, 2024
@WholeWorld-Timothy
Copy link
Contributor Author

PTAL @lvyanquan

TableId.parse(
replaceBy.getNamespace(),
replaceBy.getSchemaName(),
replaceBy.getTableName().replace("<>", tableId.getTableName()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to extract this literal into a static constant, and determine a value that will not be accidentally used in the table name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed a version and put the definition of the <> symbol into the configuration file, just like this form:

route:
  - source-table: mrtdb.[\S]*
    sink-table: mrtdb_sd.sd_s<>
    replace-symbol: s<>

In this way, the user can define the replacement symbols by themselves. Is it any better?

Copy link
Contributor

@lvyanquan lvyanquan Jan 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Please add some tests for these scenes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two methods, RouteFunctionTest#testSchemaChangeRouting And RouteFunctionTest#testTableNameChangeRouting were modified to test this scenes.

Copy link
Contributor

@lvyanquan lvyanquan Jan 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To expose this parameter to the users, please add description to cdc-pipeline.md.
And add this configuration to testParsingFullDefinition.

By the way, fix this typo, thanks😊.
截屏2024-01-05 11 21 47

Signed-off-by: 张田 <zhang_tian@inspur.com>
…y themselves.

Signed-off-by: 张田 <zhang_tian@inspur.com>
@github-actions github-actions bot added the docs Improvements or additions to documentation label Jan 6, 2024
Copy link
Contributor

@lvyanquan lvyanquan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

everhopingandwaiting

This comment was marked as duplicate.

Copy link

@everhopingandwaiting everhopingandwaiting left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@WholeWorld-Timothy
Copy link
Contributor Author

My local compilation and testing are correct, so the error in check may not be caused by the code? How can rerun these check ? I click rerun in github desktop doesn t work

@whhe
Copy link
Member

whhe commented Jan 30, 2024

Thanks for your contribution, please rebase the master branch and I'll take a look on the CI failure later.

Besides, there are some doc modifications which seems to be unnecessary, could you please revert them back?

@yuxiqian
Copy link
Contributor

Hi @WholeWorld-Timothy, could you please rebase this PR with latest master branch and address the comments above?

cc @whhe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli common composer docs Improvements or additions to documentation runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants