feat(connector): support hudi sink #8824
license-eye has totally checked 3147 files.
| Valid | Invalid | Ignored | Fixed |
|---|---|---|---|
| 1460 | 8 | 1679 | 0 |

Invalid file list:
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HoodieRisingWaveMergeOnReadTable.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HoodieRisingWaveWriter.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HudiSink.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HudiSinkFactory.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/RisingWaveCreateHandleFactory.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/RisingWaveInsertCommitActionExecutor.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/RisingWaveUpsertPreppedDeltaCommitActionExecutor.java
- java/connector-node/risingwave-sink-hudi/src/test/java/com/risingwave/connector/HudiSinkFactoryTest.java
license-eye has totally checked 3246 files.
| Valid | Invalid | Ignored | Fixed |
|---|---|---|---|
| 1497 | 9 | 1740 | 0 |

Invalid file list:
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HoodieRisingWaveMergeOnReadTable.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HoodieRisingWaveWriter.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HudiSink.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HudiSinkConfig.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/HudiSinkFactory.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/RisingWaveCreateHandleFactory.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/RisingWaveInsertCommitActionExecutor.java
- java/connector-node/risingwave-sink-hudi/src/main/java/com/risingwave/connector/RisingWaveUpsertPreppedDeltaCommitActionExecutor.java
- java/connector-node/risingwave-sink-hudi/src/test/java/com/risingwave/connector/HudiSinkFactoryTest.java
Codecov Report
@@ Coverage Diff @@
## main #8824 +/- ##
==========================================
- Coverage 70.98% 70.97% -0.01%
==========================================
Files 1241 1241
Lines 207549 207549
==========================================
- Hits 147329 147314 -15
- Misses 60220 60235 +15
... and 7 files with indirect coverage changes
A simple implementation, generally LGTM, but maybe we should add some test cases?
 * limitations under the License.
 */

package com.risingwave.connector;
Suggested change:
- package com.risingwave.connector;
+ package com.risingwave.connector.hudi;
if (delete != null) {
    map.put(key, HudiSinkRowOp.deleteOp(key));
} else {
    map.remove(key);
In what case can this be null?
KeyGenUtils.getRecordKey(rec, this.recordKeyField, false), "");
switch (row.getOp()) {
    case INSERT:
        sinkRowMap.insert(
In such a simple case, I think we may not need this row map. Hudi's log files can be used to append change logs, so we could append records as they arrive rather than buffering them and writing everything at once.
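For context on the row map discussed in this thread, here is a minimal sketch of how a per-key buffer could coalesce operations within one batch. It is an assumption about the general pattern, not the PR's actual code: `RowOpBuffer`, `Op`, and `pending` are hypothetical names, and rows are plain strings. Only the final branch of `delete` mirrors the snippet quoted above: if the buffered op still carries a pending delete, keep just that delete; otherwise the row was inserted in this same batch and the entry can be dropped. Under that reading, the pending delete is null exactly when the key was first inserted within the current batch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a per-key operation buffer; not the PR's row map. */
final class RowOpBuffer {
    /** Net effect for one key: a pending delete, a pending insert, or both (an update). */
    static final class Op {
        final String delete; // key of a previously written row to remove, or null
        final String insert; // new row value to write, or null

        Op(String delete, String insert) {
            this.delete = delete;
            this.insert = insert;
        }
    }

    private final Map<String, Op> ops = new LinkedHashMap<>();

    void insert(String key, String row) {
        Op prev = ops.get(key);
        if (prev == null) {
            ops.put(key, new Op(null, row));        // plain insert
        } else if (prev.insert == null) {
            ops.put(key, new Op(prev.delete, row)); // delete then insert: an update
        } else {
            throw new IllegalStateException("duplicate insert for key " + key);
        }
    }

    void delete(String key) {
        Op prev = ops.get(key);
        if (prev == null) {
            ops.put(key, new Op(key, null));        // delete of a row written earlier
        } else if (prev.insert == null) {
            throw new IllegalStateException("double delete for key " + key);
        } else if (prev.delete != null) {
            ops.put(key, new Op(prev.delete, null)); // update then delete: keep the delete
        } else {
            ops.remove(key);                         // insert then delete: nothing to write
        }
    }

    Map<String, Op> pending() {
        return ops;
    }

    public static void main(String[] args) {
        RowOpBuffer buffer = new RowOpBuffer();
        buffer.insert("k1", "v1");
        buffer.delete("k1"); // cancels the insert made in the same batch
        System.out.println("pending ops: " + buffer.pending().size());
    }
}
```

The alternative raised in the comment, appending each change to the log file as it arrives, would avoid holding this buffer in memory, at the cost of writing operations that later cancel out.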
Some(s) if s == "iceberg" || s == "hudi" => {
    // iceberg with multiple parallelism will fail easily with concurrent commit
    // on metadata
    // TODO: reset iceberg sink to have multiple parallelism
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
In this PR, we support a very simple hudi sink. The hudi sink supports only hudi MOR (merge-on-read) tables, runs with a parallelism of one, and sinks to a single file group whose file id is `risingwave-file-id`. For this file group, the initial write is treated as an insert and writes the initial base parquet file; all subsequent writes are treated as updates and are written to the log file. An example SQL to create the hudi sink is
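Separately from the SQL example, the write path described above can be illustrated with a small Java sketch. It only models the routing decision (first write to the base parquet file, later writes to the log file) and calls no Hudi APIs. `SingleFileGroupRouter`, `Target`, and `route` are hypothetical names introduced for illustration; only the file id `risingwave-file-id` comes from the description.

```java
/** Hypothetical sketch of the single-file-group routing; it does not use Hudi. */
final class SingleFileGroupRouter {
    /** The one file group every write goes to, as named in the PR description. */
    static final String FILE_ID = "risingwave-file-id";

    /** Where a batch of records ends up. */
    enum Target { BASE_PARQUET_FILE, LOG_FILE }

    private boolean baseFileWritten = false;

    /** The first call is treated as an insert; every later call as an update. */
    Target route() {
        if (!baseFileWritten) {
            baseFileWritten = true;
            return Target.BASE_PARQUET_FILE;
        }
        return Target.LOG_FILE;
    }

    public static void main(String[] args) {
        SingleFileGroupRouter router = new SingleFileGroupRouter();
        for (int batch = 1; batch <= 3; batch++) {
            System.out.println("batch " + batch + " -> " + FILE_ID + " / " + router.route());
        }
    }
}
```

Because the sink runs with a parallelism of one, a single boolean is enough in this sketch; a real writer would presumably derive the same decision from the table's state rather than from in-memory state.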
Checklist For Contributors
./risedev check (or alias, ./risedev c)
Checklist For Reviewers
Documentation
Types of user-facing changes
Please keep the types that apply to your changes, and remove the others.
Release note