
Towards Flink CDC 3.0 #2600

Closed · 57 tasks done
leonardBang opened this issue Nov 2, 2023 · 17 comments · Fixed by #2610
Assignee: leonardBang
Labels: enhancement (New feature or request), umbrella (umbrella issue), 【3.0】 (task for Flink CDC 3.0)
Milestone: V3.0.0

leonardBang (Contributor) commented Nov 2, 2023

Motivation

Provide a user-friendly, end-to-end streaming ELT framework for data integration users.

Solution

The design docs written by the Flink CDC maintainers are as follows:

Umbrella tasks for each module

[prepare and misc]

[module] flink-cdc-common

[module] flink-cdc-cli

[module] flink-cdc-composer

[module] flink-cdc-runtime

[module] flink-cdc-dist

[module] flink-cdc-connect/flink-cdc-pipeline-connectors

  • Kafka Sink
  • Paimon data lake (optional in 3.0, @lvyanquan)

[module] Docs for CDC 3.0 pipeline

Shared Utilities
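
To make the target user experience concrete, here is a minimal sketch of the kind of YAML pipeline definition the flink-cdc-cli module is meant to accept, wiring a MySQL source to a Doris sink. All host names, credentials, and the Doris option name fenodes are illustrative assumptions, not values taken from this issue:

source:
  type: mysql                # resolved to the MySQL pipeline connector on the classpath
  name: MySQL Source
  hostname: localhost
  port: 3306
  username: root
  password: "****"
  tables: app_db.\.*         # capture every table in the app_db database

sink:
  type: doris                # resolved to the Doris pipeline connector on the classpath
  name: Doris Sink
  fenodes: 127.0.0.1:8030    # Doris FE HTTP address (assumed option name)
  username: root
  password: "****"

pipeline:
  name: Sync app_db to Doris
  parallelism: 2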

@leonardBang leonardBang added the enhancement New feature or request label Nov 2, 2023
@leonardBang leonardBang self-assigned this Nov 2, 2023
@leonardBang leonardBang added this to the V3.0.0 milestone Nov 2, 2023
@leonardBang leonardBang pinned this issue Nov 2, 2023
@leonardBang leonardBang added the umbrella umbrella issue label Nov 2, 2023
@leonardBang leonardBang linked a pull request Nov 2, 2023 that will close this issue
@leonardBang leonardBang reopened this Nov 3, 2023
@leonardBang leonardBang added the 【3.0】 task for Flink CDC 3.0 label Nov 3, 2023
GOODBOY008 (Member) commented Nov 3, 2023

Hi @PatrickRen, I can collaborate with you on the subtasks of flink-cdc-composer.

maver1ck (Contributor) commented Nov 9, 2023

Hi All,
Are we planning to bump Debezium as part of this task?

leonardBang (Contributor, Author) commented:

> Hi All, are we planning to bump Debezium as part of this task?

Hi maver1ck, we don't have a plan to bump the Debezium version yet, but this should happen soon. Currently we're busy with the 3.0 version, which is positioned as an end-to-end streaming ELT framework.

maver1ck (Contributor) commented Nov 9, 2023

Hi @leonardBang,
As Debezium 2.x is not fully compatible with 1.9.7, I think the upgrade also needs a major version bump in this repo, so the migration to 3.0 is a great opportunity for this to happen.

leonardBang (Contributor, Author) commented:

> Hi @leonardBang, as Debezium 2.x is not fully compatible with 1.9.7, I think the upgrade also needs a major version bump in this repo, so the migration to 3.0 is a great opportunity for this to happen.

Hey maver1ck,
As we plan to keep compatibility between the 2.4.2 CDC connectors and the 3.0 CDC connectors, the version-bump work will be scheduled later. Bumping to DBZ 2.x is a big but necessary piece of work from my side.

maver1ck (Contributor) commented Nov 13, 2023

Hi @leonardBang,
Thank you for the answer. I went through the documentation and I'm excited. I have been waiting years for a Flink-based Kafka Connect alternative. You may have one of your first users here :)

leonardBang (Contributor, Author) commented Dec 13, 2023

Closing this issue as all subtasks are finished; we will kick off the 3.1 version soon.

sunyaf commented Dec 20, 2023

sink type mysql error

Exception in thread "main" java.lang.RuntimeException: Cannot find factory with identifier "mysql" in the classpath.

Available factory classes are:

com.ververica.cdc.connectors.mysql.factory.MySqlDataSourceFactory
at com.ververica.cdc.composer.utils.FactoryDiscoveryUtils.getFactoryByIdentifier(FactoryDiscoveryUtils.java:60)
at com.ververica.cdc.composer.flink.FlinkPipelineComposer.createDataSink(FlinkPipelineComposer.java:144)
at com.ververica.cdc.composer.flink.FlinkPipelineComposer.compose(FlinkPipelineComposer.java:108)
at com.ververica.cdc.cli.CliExecutor.run(CliExecutor.java:65)
at com.ververica.cdc.cli.CliFrontend.main(CliFrontend.java:62)

Jiang-HT commented:

sink mysql demo error: Cannot find factory with identifier "mysql" in the classpath

@leonardBang leonardBang unpinned this issue Jan 2, 2024
KiwiGeorge commented:

> sink type mysql error
>
> Exception in thread "main" java.lang.RuntimeException: Cannot find factory with identifier "mysql" in the classpath.
>
> Available factory classes are:
>
> com.ververica.cdc.connectors.mysql.factory.MySqlDataSourceFactory
> at com.ververica.cdc.composer.utils.FactoryDiscoveryUtils.getFactoryByIdentifier(FactoryDiscoveryUtils.java:60)
> at com.ververica.cdc.composer.flink.FlinkPipelineComposer.createDataSink(FlinkPipelineComposer.java:144)
> at com.ververica.cdc.composer.flink.FlinkPipelineComposer.compose(FlinkPipelineComposer.java:108)
> at com.ververica.cdc.cli.CliExecutor.run(CliExecutor.java:65)
> at com.ververica.cdc.cli.CliFrontend.main(CliFrontend.java:62)

Has this issue been resolved? I encountered it too.

lvyanquan (Contributor) commented:

> sink mysql demo error: Cannot find factory with identifier "mysql" in the classpath

Hi @Jiang-HT @KiwiGeorge, we don't support MySQL as a data sink for now; the currently supported data sinks are Doris and StarRocks.
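
For anyone hitting this error, the sink block in the YAML would need to point at one of the supported sinks instead. A minimal sketch using StarRocks is shown below; the option names (jdbc-url, load-url) and the addresses are assumptions based on the StarRocks pipeline connector, not values confirmed in this thread:

sink:
  type: starrocks                          # must resolve to a DataSinkFactory on the classpath
  name: StarRocks Sink
  jdbc-url: jdbc:mysql://127.0.0.1:9030    # assumed option: StarRocks FE query port
  load-url: 127.0.0.1:8030                 # assumed option: FE HTTP port used for Stream Load
  username: root
  password: "****"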

KiwiGeorge commented:

> sink mysql demo error: Cannot find factory with identifier "mysql" in the classpath
>
> Hi @Jiang-HT @KiwiGeorge, we don't support MySQL as a data sink for now; the currently supported data sinks are Doris and StarRocks.

I see there is source code in the project, so... when will you support MySQL as a data sink?

shreelatak commented:

Do you support Kafka as a data sink?
Per https://nightlies.apache.org/flink/flink-cdc-docs-master/docs/connectors/kafka/#pipeline-connector-options I see a Kafka sink, but I seem to face the error:
Exception in thread "main" java.lang.RuntimeException: Cannot find factory with identifier "kafka" in the classpath.

Available factory classes are:

com.ververica.cdc.connectors.mysql.factory.MySqlDataSourceFactory
at com.ververica.cdc.composer.utils.FactoryDiscoveryUtils.getFactoryByIdentifier(FactoryDiscoveryUtils.java:60)
at com.ververica.cdc.composer.flink.FlinkPipelineComposer.createDataSink(FlinkPipelineComposer.java:144)
at com.ververica.cdc.composer.flink.FlinkPipelineComposer.compose(FlinkPipelineComposer.java:108)
at com.ververica.cdc.cli.CliExecutor.run(CliExecutor.java:65)
at com.ververica.cdc.cli.CliFrontend.main(CliFrontend.java:62)

yuxiqian (Contributor) commented:

Hi @shreelatak, the Kafka pipeline connector is supported in Flink CDC 3.1. However, it's a breaking update and package names like com.ververica.cdc have been renamed to org.apache.flink.cdc, so it won't work with any previous version. You may download the latest version of Flink CDC here and the Kafka connectors here.
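
As a concrete illustration of that package rename, using the factory class from the stack traces above (the 3.1 class name is an assumption that simply applies the new package prefix):

Flink CDC 3.0.x and earlier:  com.ververica.cdc.connectors.mysql.factory.MySqlDataSourceFactory
Flink CDC 3.1 and later:      org.apache.flink.cdc.connectors.mysql.factory.MySqlDataSourceFactory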

shreelatak commented May 21, 2024

Hello @yuxiqian, thank you so much for the reply! Do you know which mysql-cdc connector works with Flink CDC version 3.1, and where I can find it?

My YAML file has the below contents:

source:
   type: mysql
   name: MySQL Source
   hostname: 127.0.0.1
   port: 3306
   username: ****
   password: ****
   tables: patient.\.*

sink:
  type: kafka
  name: Kafka Sink
  properties.bootstrap.servers: PLAINTEXT://localhost:9092
  topic: quickstart_events
  

pipeline:
  name: MySQL to Kafka Pipeline
  parallelism: 2

With the previous Flink CDC version (3.0.1) and the connector flink-cdc-pipeline-connector-mysql-3.0.0.jar, I encountered no issues. Now, having upgraded Flink CDC to 3.1, I am facing:

Exception in thread "main" java.lang.RuntimeException: Cannot find factory with identifier "mysql" in the classpath.
Available factory classes are:

at org.apache.flink.cdc.composer.utils.FactoryDiscoveryUtils.getFactoryByIdentifier(FactoryDiscoveryUtils.java:62)
at org.apache.flink.cdc.composer.flink.translator.DataSourceTranslator.translate(DataSourceTranslator.java:47)
at org.apache.flink.cdc.composer.flink.FlinkPipelineComposer.compose(FlinkPipelineComposer.java:101)
at org.apache.flink.cdc.cli.CliExecutor.run(CliExecutor.java:71)
at org.apache.flink.cdc.cli.CliFrontend.main(CliFrontend.java:71)

yuxiqian (Contributor) commented May 21, 2024

Connectors are now released under Group ID org.apache.flink. Here are all available connector jars: https://central.sonatype.com/search?q=g:org.apache.flink%20cdc&smo=true

P.S. It is preferred to share Flink CDC-related issues on GitHub Discussions or the Flink JIRA Kanban.

wangrenjun-vs commented:

> Do you support Kafka as a data sink? Per https://nightlies.apache.org/flink/flink-cdc-docs-master/docs/connectors/kafka/#pipeline-connector-options I see a Kafka sink, but I seem to face the error:
>
> Exception in thread "main" java.lang.RuntimeException: Cannot find factory with identifier "kafka" in the classpath.
>
> Available factory classes are:
>
> com.ververica.cdc.connectors.mysql.factory.MySqlDataSourceFactory
> at com.ververica.cdc.composer.utils.FactoryDiscoveryUtils.getFactoryByIdentifier(FactoryDiscoveryUtils.java:60)
> at com.ververica.cdc.composer.flink.FlinkPipelineComposer.createDataSink(FlinkPipelineComposer.java:144)
> at com.ververica.cdc.composer.flink.FlinkPipelineComposer.compose(FlinkPipelineComposer.java:108)
> at com.ververica.cdc.cli.CliExecutor.run(CliExecutor.java:65)
> at com.ververica.cdc.cli.CliFrontend.main(CliFrontend.java:62)

Have you solved this problem? I encountered it too.
