
[Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema #2371 #2720

Closed
Wants to merge 1 commit.

Conversation

@eyys (Contributor) commented Sep 12, 2022

Purpose of this pull request

[Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema #2371

Check list

- Add e2e test
- Add example test

@@ -0,0 +1,51 @@
/*
Member: The seatunnel-examples module is only used by developers, so test cases should not be submitted there.

-public class KafkaSource implements SeaTunnelSource<SeaTunnelRow, KafkaSourceSplit, KafkaSourceState> {

     private static final String DEFAULT_CONSUMER_GROUP = "SeaTunnel-Consumer-Group";
+public class KafkaSource<T> implements SeaTunnelSource<T, KafkaSourceSplit, KafkaSourceState> {
Member: Why change the generic type?

Contributor (author): Refer to the Pulsar implementation.
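For context, the change discussed here makes the source's produced type a parameter T instead of fixing it to the row type, as the Pulsar connector does. A minimal sketch of that pattern, using hypothetical stand-in interfaces (not the real SeaTunnel or Pulsar API):

```java
// Hypothetical stand-ins, for illustration only -- not the real SeaTunnel API.
import java.util.function.Supplier;

public class GenericSourceDemo {
    interface Source<T> {
        T poll();
    }

    // Generic source, as in the PR: the produced type T is a type parameter
    // instead of being hard-coded to a row type, mirroring the Pulsar connector.
    static final class KafkaLikeSource<T> implements Source<T> {
        private final Supplier<T> deserializer;

        KafkaLikeSource(Supplier<T> deserializer) {
            this.deserializer = deserializer;
        }

        @Override
        public T poll() {
            // Delegate the record-to-T conversion to the pluggable deserializer.
            return deserializer.get();
        }
    }

    public static void main(String[] args) {
        Source<String> source = new KafkaLikeSource<>(() -> "value");
        System.out.println(source.poll());
    }
}
```

The produced type then follows whatever deserialization schema the user configures, rather than always being a row.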

}

 @Override
-public SeaTunnelRowType getProducedType() {
-    return this.typeInfo;
+public SeaTunnelDataType<T> getProducedType() {
Member: The same as above.

}

 @Override
-public SourceReader<SeaTunnelRow, KafkaSourceSplit> createReader(SourceReader.Context readerContext) throws Exception {
-    return new KafkaSourceReader(this.metadata, this.typeInfo, readerContext);
+public SourceReader<T, KafkaSourceSplit> createReader(SourceReader.Context readerContext) throws Exception {
Member: The same as above.

} else {
rowType = SeaTunnelSchema.buildSimpleTextSchema();
}
deserializationSchema = (DeserializationSchema<T>) new JsonDeserializationSchema(false, false, rowType);
Member: T should be replaced by SeaTunnelRow.
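The reviewer's point can be illustrated: JsonDeserializationSchema produces SeaTunnelRow, so casting it to DeserializationSchema&lt;T&gt; only compiles because of type erasure, and is safe only when T is bound to SeaTunnelRow. A minimal sketch with hypothetical stand-in types (not the real SeaTunnel classes):

```java
// Hypothetical stand-ins for the SeaTunnel interfaces, for illustration only.
public class CastDemo {
    interface DeserializationSchema<T> {
        T deserialize(byte[] message);
    }

    static final class SeaTunnelRow {
        final Object[] fields;
        SeaTunnelRow(Object[] fields) { this.fields = fields; }
    }

    // Produces SeaTunnelRow, like JsonDeserializationSchema in the PR.
    static final class JsonLikeSchema implements DeserializationSchema<SeaTunnelRow> {
        @Override
        public SeaTunnelRow deserialize(byte[] message) {
            return new SeaTunnelRow(new Object[]{new String(message)});
        }
    }

    // The PR's unchecked cast: it compiles due to erasure, but would fail at
    // use time if T were anything other than SeaTunnelRow.
    @SuppressWarnings("unchecked")
    static <T> DeserializationSchema<T> unsafeBind() {
        return (DeserializationSchema<T>) new JsonLikeSchema();
    }

    public static void main(String[] args) {
        // Safe here only because T is bound to SeaTunnelRow -- which is
        // exactly the reviewer's suggestion: fix T to SeaTunnelRow.
        DeserializationSchema<SeaTunnelRow> schema = unsafeBind();
        SeaTunnelRow row = schema.deserialize("x".getBytes());
        System.out.println(row.fields.length);
    }
}
```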

@@ -46,7 +47,7 @@
import java.util.Set;
import java.util.stream.Collectors;

-public class KafkaSourceReader implements SourceReader<SeaTunnelRow, KafkaSourceSplit> {
+public class KafkaSourceReader<T> implements SourceReader<T, KafkaSourceSplit> {
Member: The same as above.

-    // TODO support user custom type
-    private SeaTunnelRowType typeInfo;
+    private final DeserializationSchema<T> deserializationSchema;
Member: The same as above.

@@ -87,7 +89,7 @@ public void close() throws IOException {
}

 @Override
-public void pollNext(Collector<SeaTunnelRow> output) throws Exception {
+public void pollNext(Collector<T> output) throws Exception {
Member: The same as above.

} else {
String v = stringDeserializer.deserialize(partition.topic(), record.value());
String t = partition.topic();
output.collect((T) new SeaTunnelRow(new Object[]{t, v}));
Member: If the user does not assign a schema, it will generate a simple schema as shown below:

content
data

So the row only has one field, named content, but you generate a row with two fields.
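The mismatch the reviewer describes can be shown directly: under the default simple text schema the whole record value becomes the single content field, whereas the PR emits a (topic, value) pair. A sketch with a hypothetical stand-in row type (not the real SeaTunnelRow class):

```java
// Hypothetical stand-in for SeaTunnel's row type, for illustration only.
public class SimpleSchemaDemo {
    static final class Row {
        final Object[] fields;
        Row(Object[] fields) { this.fields = fields; }
    }

    // Matches the default simple text schema: one field named "content"
    // holding the raw record value; the topic is not part of the row.
    static Row schemaRow(String topic, String value) {
        return new Row(new Object[]{value});
    }

    // What the PR actually emits: two fields (topic, value), which does not
    // match the one-field schema above.
    static Row prRow(String topic, String value) {
        return new Row(new Object[]{topic, value});
    }

    public static void main(String[] args) {
        System.out.println(schemaRow("t", "v").fields.length); // one field
        System.out.println(prRow("t", "v").fields.length);     // two fields
    }
}
```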

@TyrantLucifer (Member): BTW, you should add spark-e2e test cases and update the doc of the kafka source connector too.

@ashulin (Member) left a comment: You need to check the code style and add the license header.

@hailin0 (Member) left a comment: Can you add docs & e2e test cases (spark)?

Comment on lines +33 to +34
name = "string"
age = "int"
Member: Test all data types?
