Skip to content

[ISSUES #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT#247

Merged
sunxiaojian merged 1 commit into
apache:masterfrom
oudb:rocketmq-connect-kafka-connector-adapter
Aug 19, 2022
Merged

[ISSUES #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT#247
sunxiaojian merged 1 commit into
apache:masterfrom
oudb:rocketmq-connect-kafka-connector-adapter

Conversation

@oudb
Copy link
Copy Markdown
Contributor

@oudb oudb commented Aug 11, 2022

What is the purpose of the change

需要一个通用适配器connector,快速地让现存的大量的kafka connector运行在rocketmq-connect,使得数据在rocketmq导入导出。

Brief changelog

增加2个Connector,一个是SourceConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSourceConnector,用来适配Kafka Source Connector。一个是SinkConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSinkConnector,用来适配Kafka Sink Connector

Verifying this change

参考ReadMe的快速开始,运行kafka-file-connector

Follow this checklist to help us incorporate your contribution quickly and easily. Notice, it would be helpful if you could finish the following 5 checklist(the last one is not necessary)before request the community to review your PR.

  • Make sure there is a Github issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Format the pull request title like [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write necessary unit-test(over 80% coverage) to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add integration-test in test module.
  • Run mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle to make sure basic checks pass. Run mvn clean install -DskipITs to make sure unit-test pass. Run mvn clean test-compile failsafe:integration-test to make sure integration-test pass.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@sunxiaojian sunxiaojian reopened this Aug 12, 2022
@sunxiaojian
Copy link
Copy Markdown
Contributor

What is the purpose of the change

需要一个通用适配器connector,快速地让现存的大量的kafka connector运行在rocketmq-connect,使得数据在rocketmq导入导出。

Brief changelog

增加2个Connector,一个是SourceConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSourceConnector,用来适配Kafka Source Connector。一个是SinkConnector:org.apache.rocketmq.connect.kafka.connector.KafkaRocketmqSinkConnector,用来适配Kafka Sink Connector

Verifying this change

参考ReadMe的快速开始,运行kafka-file-connector

Follow this checklist to help us incorporate your contribution quickly and easily. Notice, it would be helpful if you could finish the following 5 checklist(the last one is not necessary)before request the community to review your PR.

  • Make sure there is a Github issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Format the pull request title like [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write necessary unit-test(over 80% coverage) to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add integration-test in test module.
  • Run mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle to make sure basic checks pass. Run mvn clean install -DskipITs to make sure unit-test pass. Run mvn clean test-compile failsafe:integration-test to make sure integration-test pass.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

先创建个issue吧,用于纪录和讨论

@odbozhou odbozhou added the enhancement New feature or request label Aug 12, 2022
@odbozhou
Copy link
Copy Markdown
Contributor

@sunxiaojian debezium适配的时候好像有适配kafka source端,看一下能不能融合到一起。
@oudb 非常好的想法,期待这个特性可以尽快合入,一起把这个特性做的更好。

@sunxiaojian
Copy link
Copy Markdown
Contributor

@sunxiaojian debezium适配的时候好像有适配kafka source端,看一下能不能融合到一起。 @oudb 非常好的想法,期待这个特性可以尽快合入,一起把这个特性做的更好。

ok , 大概review了一下功能,非常不错的能力,和debezium中的source 和 sink 兼容实现有所不同;
当前PR实现,应该是为了满足 kafka source -> rocketmqConnect -> kafka sink 这一种场景, source和sink限制在必须是kafka connect的插件,debezium中的实现是 希望做到 kafka connect source -> RocketMQ Connect -> kafka connect sink 或者 kafka connect source -> RocketMQ Connect -> RocketMQ Connect sink 或者 RocketMQ connect source -> RocketMQ Connect ->Kafka connect sink 这三种形式兼容,当然,这完全可以作为两种能力来提供,只为了满足用户可以无缝将kafka connect的插件迁移到 rocketmq connect而无需做任何适配, 持续兼容和维护下去,会是一个突出的功能

@oudb
Copy link
Copy Markdown
Contributor Author

oudb commented Aug 12, 2022

是的,该PR专注于 kafka source -> rocketmqConnect -> kafka sink。与其他 RocketMQ Connector交互,我希望是通过transform来提供, 名字暂定为KafkaRocketmqTransformation, 比如kafka connect source -> KafkaRocketmqTransformation ->RocketMQ Connect-> RocketMQ Connect sink 。通过KafkaRocketmqTransformation还可以达到Kafka的transform和RocketMQ的transform混合使用。

@sunxiaojian
Copy link
Copy Markdown
Contributor

是的,该PR专注于 kafka source -> rocketmqConnect -> kafka sink。与其他 RocketMQ Connector交互,我希望是通过transform来提供, 名字暂定为KafkaRocketmqTransformation, 比如kafka connect source -> KafkaRocketmqTransformation ->RocketMQ Connect-> RocketMQ Connect sink 。通过KafkaRocketmqTransformation还可以达到Kafka的transform和RocketMQ的transform混合使用。

可以看一下debezium 本项目中debezium下面的kafka connect adaptor, 整体来看

@oudb
Copy link
Copy Markdown
Contributor Author

oudb commented Aug 15, 2022

我提交该pr的之前看过debezium的kafka connect adaptor,有以下几点我还是想提交一个通用的灵活的适配器:
(1)debezium的kafka connect adaptor专注于debezium的适配,缺少通过kafka插件组件来提供通用和灵活
(2)直接适配其他的kafka connector可能还有其他地方没有考虑到,比如source connector的位点序列再反序列化的类型丢失,如Long序列化再反序列化变为Integer。
(3)debezium的kafka connect adaptor为了和其他rokectmq原生connector和插件交互,scheme之间做个映射。 但 我想的是大多数用户使用一个通用的适配器都是kafka source connector和kafka sink connector配对使用的,很少会和rokectmq原生connector和其他插件混合使用,这不但引入开销而且必须保证scheme能做到一一映射,可能会后期升级有影响。

关于scheme和transforms适配,我想的是通过在rokectmq connect的transforms层面去做,这样用户觉得必要的时候(比如和rokectmq原生connector和其插件配合的时候)才去做这个配置。

@sunxiaojian
Copy link
Copy Markdown
Contributor

我提交该pr的之前看过debezium的kafka connect adaptor,有以下几点我还是想提交一个通用的灵活的适配器: (1)debezium的kafka connect adaptor专注于debezium的适配,缺少通过kafka插件组件来提供通用和灵活 (2)直接适配其他的kafka connector可能还有其他地方没有考虑到,比如source connector的位点序列再反序列化的类型丢失,如Long序列化再反序列化变为Integer。 (3)debezium的kafka connect adaptor为了和其他rokectmq原生connector和插件交互,scheme之间做个映射。 但 我想的是大多数用户使用一个通用的适配器都是kafka source connector和kafka sink connector配对使用的,很少会和rokectmq原生connector和其他插件混合使用,这不但引入开销而且必须保证scheme能做到一一映射,可能会后期升级有影响。

关于scheme和transforms适配,我想的是通过在rokectmq connect的transforms层面去做,这样用户觉得必要的时候(比如和rokectmq原生connector和其插件配合的时候)才去做这个配置。

通用性比较认同,这个也是计划要去支持;
丢失类型问题这个好像不存在,source 的 offset 无论是rocketmq 还是kafka 定义的都是 object类型,不是固定类型,所以在返回offset不会直接拿固定类型,具体什么类型的是有包含业务逻辑的插件内部来决定的, 这个是在转换过程中遇到问题了吗?
关于升级问题,应该是个逃不开的问题,无论是运行在 rocketmq 上面的 kafka connect,还是直接运行在kafka 上的kafka connect 都面临这个问题,还是但是好在都是api层面的变更,不会非常频繁;并且保证api变更时项目修改最小就可以了,
其实无论在transform层面做schema转换还是直接转换逻辑应该差别不大,在做transform 转换时可以考虑是否把debezium的 adaptor 中 schema的转换直接提成transform就可以了

@odbozhou odbozhou changed the title rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT [issues #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT Aug 19, 2022
@odbozhou odbozhou changed the title [issues #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT [Issues #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT Aug 19, 2022
@odbozhou odbozhou changed the title [Issues #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT [ISSUES #261] rocketmq-connect-kafka-connector-adapter 0.0.1-SNAPSHOT Aug 19, 2022
@sunxiaojian sunxiaojian merged commit 5ddf386 into apache:master Aug 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

discuss enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants