[cdc-pipeline-connector][doris] Introduce Doris cdc pipeline DataSink #2729
JNSimba wants to merge 26 commits into apache:master
private void applyAddColumnEvent(AddColumnEvent event) throws IOException, IllegalArgumentException {
    TableId tableId = event.tableId();
    List<AddColumnEvent.ColumnWithPosition> addedColumns = event.getAddedColumns();
    for (AddColumnEvent.ColumnWithPosition col : addedColumns) {
There are several kinds of ColumnPosition. Does this code handle only the LAST type?
Yes, because Doris does not support adding a value column in the middle of multiple key columns.
Also, data is imported in JSON format by default, so the column order does not affect data quality and the new value column can simply be appended to the end.
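The append-only behavior described in this reply can be illustrated with a minimal sketch. It uses hypothetical stand-in types (`Position`, `AddedColumn`) rather than the actual Flink CDC or Doris connector classes, and it assumes the requested position is simply ignored, which the thread does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

class AppendOnlyAddColumnSketch {
    /** Hypothetical stand-in for the requested column position. */
    enum Position { FIRST, LAST, BEFORE, AFTER }

    /** Hypothetical stand-in for an added column and its requested position. */
    static final class AddedColumn {
        final String name;
        final Position position;

        AddedColumn(String name, Position position) {
            this.name = name;
            this.position = position;
        }
    }

    /** Returns a new column list with every added column appended at the end. */
    static List<String> appendColumns(List<String> existing, List<AddedColumn> added) {
        List<String> result = new ArrayList<>(existing);
        for (AddedColumn col : added) {
            // Assumption: the requested position is ignored and the column is appended.
            result.add(col.name);
        }
        return result;
    }
}
```

Because rows written as JSON are matched to columns by name, the resulting order in the sink table does not need to mirror the order in the source table.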
# Conflicts: # flink-cdc-common/src/main/java/com/ververica/cdc/common/schema/Schema.java
Addressed the comments. PTAL @lvyanquan

@JNSimba Run 'mvn spotless:apply' to fix these violations.
/** A serializer for Event to Tuple2<String, byte[]> */
public class DorisEventSerializer implements DorisRecordSerializer<Event> {
    private ObjectMapper objectMapper = new ObjectMapper();
    private Map<TableId, Schema> schemaMaps = new HashMap<>();
One troublesome problem is that we need to keep schemaMaps in state in order to recover from failure, so we would need to add a subclass of DorisWriter or DorisBatchWriter that overrides the snapshotState method.
What do you think?
Sorry for this comment. As discussed before, a CreateTableEvent will always be sent before any DataChangeEvent, so we don't need to consider this situation.
Yes, my understanding is that the CreateTableEvent is resent even after a restart.
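To illustrate why the in-memory schema map does not need to be checkpointed under that guarantee, here is a minimal sketch. The types `CreateTable` and `DataChange` are hypothetical stand-ins, not the Flink CDC event classes, and the sketch assumes a schema event for a table always arrives before that table's data events, including after a restart.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SchemaCacheSketch {
    /** Hypothetical stand-in for a schema event carrying the table's column names. */
    static final class CreateTable {
        final String tableId;
        final List<String> columns;

        CreateTable(String tableId, List<String> columns) {
            this.tableId = tableId;
            this.columns = columns;
        }
    }

    /** Hypothetical stand-in for a data event carrying positional values. */
    static final class DataChange {
        final String tableId;
        final List<Object> values;

        DataChange(String tableId, List<Object> values) {
            this.tableId = tableId;
            this.values = values;
        }
    }

    // In-memory only: rebuilt from schema events, never written to state.
    private final Map<String, List<String>> schemas = new HashMap<>();

    void onCreateTable(CreateTable event) {
        schemas.put(event.tableId, event.columns);
    }

    /** Maps positional values to column names using the cached schema. */
    Map<String, Object> onDataChange(DataChange event) {
        List<String> columns = schemas.get(event.tableId);
        if (columns == null) {
            // Would only happen if the ordering assumption were violated.
            throw new IllegalStateException("No schema seen for table " + event.tableId);
        }
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            row.put(columns.get(i), event.values.get(i));
        }
        return row;
    }
}
```

A fresh instance created after a restart starts with an empty map and is repopulated as soon as the schema event is replayed, which is the situation the reviewers agree on.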
# Conflicts: # flink-cdc-common/src/main/java/com/ververica/cdc/common/data/GenericStringData.java
# Conflicts: # flink-cdc-common/src/main/java/com/ververica/cdc/common/utils/SchemaUtils.java
Addressed the comments. PTAL @lvyanquan

Thanks for your contribution. Overall it looks good to me. I left some comments. Could you also clean up the commit message?

Addressed the comments. PTAL @lvyanquan

Addressed the comments. PTAL @lvyanquan

Resolved by 4abd86a
This closes #2646
Add a DorisDataSink that implements the DataSink interface to build a pipeline.