feat(sink): support doris sink #12336

Merged
merged 26 commits into from
Sep 21, 2023

Conversation

xxhZs
Contributor

@xxhZs xxhZs commented Sep 15, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR adds a Doris sink built on Doris Stream Load: https://doris.apache.org/docs/dev/data-operate/import/import-way/stream-load-manual.

Currently, the sink cannot automatically align the field names and types of a struct with Doris, so users must make sure that the struct definitions in RW (RisingWave) and in Doris are identical. If they differ, the insert into Doris may fail (i.e., the row's struct will be null or set to default values).
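
Since the alignment has to be done by hand, a small pre-flight check can catch mismatches before the sink is created. The following is a minimal sketch, not code from this PR: the `Field` alias and `struct_mismatches` are hypothetical names, fields are compared positionally, and type names are compared as plain strings, which ignores any RisingWave-to-Doris type mapping.

```rust
/// Hypothetical, simplified description of one struct field: (name, type name).
type Field = (String, String);

/// Returns human-readable mismatches between a RisingWave struct and a Doris
/// struct. An empty list means the two definitions line up field by field.
/// Assumption: fields are matched by position and types by name only.
fn struct_mismatches(rw: &[Field], doris: &[Field]) -> Vec<String> {
    let mut problems = Vec::new();
    if rw.len() != doris.len() {
        problems.push(format!(
            "field count differs: {} in RW vs {} in Doris",
            rw.len(),
            doris.len()
        ));
    }
    // zip stops at the shorter list; the count check above reports the rest.
    for (i, ((rw_name, rw_ty), (d_name, d_ty))) in rw.iter().zip(doris.iter()).enumerate() {
        if rw_name != d_name {
            problems.push(format!("field {}: name `{}` != `{}`", i, rw_name, d_name));
        }
        if !rw_ty.eq_ignore_ascii_case(d_ty) {
            problems.push(format!("field {}: type `{}` != `{}`", i, rw_ty, d_ty));
        }
    }
    problems
}
```

An empty result only says the two definitions look identical under these simplified rules; it does not guarantee that Doris accepts the row.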

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

SQL Example:

append-only:

CREATE SINK bhv_doris_sink
FROM bhv_mv
WITH (
    connector = 'doris',
    type = 'append-only',
    doris.url = 'http://fe:8030',
    doris.user = 'xxxx',
    doris.password = 'xxxx',
    doris.database = 'demo',
    doris.table = 'demo_bhv_table',
    force_append_only = 'true'
);

upsert:

CREATE SINK bhv_doris_sink
FROM bhv_mv
WITH (
    connector = 'doris',
    type = 'upsert',
    doris.url = 'http://fe:8030',
    doris.user = 'xxxx',
    doris.password = 'xxxx',
    doris.database = 'demo',
    doris.table = 'demo_bhv_table',
    primary_key = 'user_id'
);

WITH options:

connector: required. Must be doris.
doris.url: required. The HTTP address of the Doris FE (the HTTP port, not the MySQL port).
doris.user: optional. User name for the Doris credential. The user must already exist in Doris.
doris.password: optional. Password for the Doris credential.
doris.database: optional. The target Doris database.
doris.table: optional. The target Doris table.
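
To make the option semantics concrete, here is a minimal sketch of validating a WITH clause that has already been parsed into a string map. It is not the PR's implementation: `DorisSinkOptions` and `parse_doris_options` are hypothetical names, the required/optional split follows the list above, and the rule that an upsert sink needs `primary_key` is an assumption based on the upsert example.

```rust
use std::collections::HashMap;

/// Hypothetical holder for the Doris sink options listed above.
#[derive(Debug, PartialEq)]
struct DorisSinkOptions {
    url: String,
    user: Option<String>,
    password: Option<String>,
    database: Option<String>,
    table: Option<String>,
    is_upsert: bool,
}

/// Validates a WITH clause that was already parsed into key/value strings.
fn parse_doris_options(with: &HashMap<String, String>) -> Result<DorisSinkOptions, String> {
    match with.get("connector").map(String::as_str) {
        Some("doris") => {}
        other => return Err(format!("expected connector = 'doris', got {:?}", other)),
    }
    let url = with
        .get("doris.url")
        .cloned()
        .ok_or_else(|| "doris.url is required".to_string())?;
    let is_upsert = match with.get("type").map(String::as_str) {
        Some("append-only") => false,
        Some("upsert") => true,
        other => return Err(format!("unsupported sink type: {:?}", other)),
    };
    // Assumption: an upsert sink must name its primary key, as in the example.
    if is_upsert && !with.contains_key("primary_key") {
        return Err("type = 'upsert' requires primary_key".to_string());
    }
    Ok(DorisSinkOptions {
        url,
        user: with.get("doris.user").cloned(),
        password: with.get("doris.password").cloned(),
        database: with.get("doris.database").cloned(),
        table: with.get("doris.table").cloned(),
        is_upsert,
    })
}
```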

Notes:

  1. Please ensure that the fields of a struct in Doris have the same names and types as those of the struct in RW. Otherwise, the struct may fail to be inserted.
  2. Please ensure that RW can reach the network where the Doris BE and FE nodes are located. See https://doris.apache.org/docs/dev/data-operate/import/import-scenes/external-table-load/
  3. Upsert is supported only if the Doris table uses the UNIQUE KEY model.
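
Note 2 follows from how Stream Load works: the client sends the request to the FE HTTP port, and the FE redirects it to a BE node, so both must be reachable. As a rough sketch of what such a request looks like (not taken from this PR), the helpers below build the endpoint URL and a set of Stream Load headers. `stream_load_url` and `stream_load_headers` are hypothetical names, authentication is left out, and whether the sink sends exactly these headers is an assumption.

```rust
/// Builds the Stream Load endpoint on the Doris FE for one table.
fn stream_load_url(fe_url: &str, database: &str, table: &str) -> String {
    format!(
        "{}/api/{}/{}/_stream_load",
        fe_url.trim_end_matches('/'),
        database,
        table
    )
}

/// Headers for a line-delimited JSON Stream Load request (authentication omitted).
/// Assumption: upsert sinks expose the hidden delete-sign column so that
/// deletes can be expressed as ordinary rows.
fn stream_load_headers(label: &str, upsert: bool) -> Vec<(&'static str, String)> {
    let mut headers = vec![
        ("Expect", "100-continue".to_string()),
        ("label", label.to_string()),
        ("format", "json".to_string()),
        ("read_json_by_line", "true".to_string()),
    ];
    if upsert {
        headers.push(("hidden_columns", "__DORIS_DELETE_SIGN__".to_string()));
    }
    headers
}
```

With the values from the SQL examples, `stream_load_url("http://fe:8030", "demo", "demo_bhv_table")` yields `http://fe:8030/api/demo/demo_bhv_table/_stream_load`.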

fmt

fmt
@xxhZs xxhZs marked this pull request as ready for review September 18, 2023 05:56
@xxhZs xxhZs requested a review from a team as a code owner September 18, 2023 05:56
@xxhZs xxhZs requested review from hzxa21 and wenym1 September 18, 2023 05:56
.write(row_json_string.into())
.await?;
}
Op::UpdateDelete => {}
Collaborator

Is it safe to ignore UpdateDelete here? Does Doris support upsert semantics?

Contributor Author

Yes. We only support upsert when the Doris table has UNIQUE KEYs. In that case, an insert overwrites the existing row with the same key.
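
The reasoning in this thread can be sketched as a mapping from each change-log operation to what gets written. This is an illustration rather than the merged code: the `Op` enum is redefined locally, `delete_sign` is a hypothetical helper, and using the Doris hidden column `__DORIS_DELETE_SIGN__` ("1" for delete, "0" otherwise) to express deletes is an assumption about the implementation.

```rust
/// Local stand-in for the stream operation type matched on in the snippet above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    Insert,
    Delete,
    UpdateDelete,
    UpdateInsert,
}

/// Returns the value of the delete-sign column to send with the row,
/// or None when the row does not need to be sent at all.
fn delete_sign(op: Op) -> Option<&'static str> {
    match op {
        // On a UNIQUE KEY table, writing a row with an existing key overwrites it.
        Op::Insert | Op::UpdateInsert => Some("0"),
        // A delete is sent as a row flagged for removal.
        Op::Delete => Some("1"),
        // Skipped: the UpdateInsert that follows overwrites the old row.
        Op::UpdateDelete => None,
    }
}
```

Skipping `UpdateDelete` is only safe under the assumption that the paired `UpdateInsert` carries the same key, which is why the table must use the UNIQUE KEY model.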

@codecov

codecov bot commented Sep 20, 2023

Codecov Report

Merging #12336 (b933e79) into main (1625a65) will decrease coverage by 0.15%.
Report is 1 commits behind head on main.
The diff coverage is 15.71%.

@@            Coverage Diff             @@
##             main   #12336      +/-   ##
==========================================
- Coverage   69.71%   69.57%   -0.15%     
==========================================
  Files        1427     1429       +2     
  Lines      237449   238100     +651     
==========================================
+ Hits       165531   165651     +120     
- Misses      71918    72449     +531     
Flag Coverage Δ
rust 69.57% <15.71%> (-0.15%) ⬇️

Files Changed Coverage Δ
src/connector/src/common.rs 1.73% <0.00%> (-0.07%) ⬇️
src/connector/src/sink/boxed.rs 0.00% <0.00%> (ø)
src/connector/src/sink/clickhouse.rs 0.00% <0.00%> (ø)
src/connector/src/sink/doris.rs 0.00% <0.00%> (ø)
src/connector/src/sink/doris_connector.rs 0.00% <0.00%> (ø)
src/connector/src/sink/encoder/mod.rs 35.71% <0.00%> (-2.75%) ⬇️
src/connector/src/sink/iceberg.rs 16.56% <0.00%> (ø)
src/connector/src/sink/kafka.rs 28.06% <0.00%> (-1.94%) ⬇️
src/connector/src/sink/kinesis.rs 0.00% <0.00%> (ø)
src/connector/src/sink/nats.rs 0.00% <0.00%> (ø)
... and 9 more

... and 7 files with indirect coverage changes

Collaborator

@hzxa21 hzxa21 left a comment

In my local testing, RW timestamp values are written as null in Doris. Please fix this issue.

Rest LGTM! Great job!

Collaborator

@hzxa21 hzxa21 left a comment

Also, please help to write a simple release note about doris sink by describing the SQL syntax, the WITH options and some extra notes (e.g. the type mapping and the unsupported types). You can refer to this PR's description as an example.

@xxhZs xxhZs added this pull request to the merge queue Sep 21, 2023
@xxhZs xxhZs removed this pull request from the merge queue due to a manual request Sep 21, 2023
@xxhZs xxhZs added the user-facing-changes Contains changes that are visible to users label Sep 21, 2023
@xxhZs xxhZs added this pull request to the merge queue Sep 21, 2023
Merged via the queue into main with commit d8ec952 Sep 21, 2023
32 of 33 checks passed
@xxhZs xxhZs deleted the xxh/doris_sink branch September 21, 2023 12:27
@hzxa21 hzxa21 mentioned this pull request Oct 10, 2023
Labels
breaking-change type/feature user-facing-changes Contains changes that are visible to users
3 participants