[opt](partial update) Allow to only specify key columns in partial update #40736
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
TPC-H: Total hot run time: 42969 ms
TeamCity be ut coverage result:
TPC-DS: Total hot run time: 199885 ms
ClickBench: Total hot run time: 30.72 s
5e784fd to a8d47dd
run buildall
TPC-H: Total hot run time: 42573 ms
TeamCity be ut coverage result:
TPC-DS: Total hot run time: 199827 ms
ClickBench: Total hot run time: 31.28 s
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
LGTM
…key columns in partial update apache#40736" (apache#40863) picks apache#40736
…update apache#39619

pick [opt](partial update) Remove unnecessary lock and refactor some code for partial update (apache#40062)

1. apache#34112 let partial update fetch rowsets in the initialization of RowsetBuilder rather than flush phase. So we can remove that tablet header lock.
2. refactor some partial update code

fix compile

pick [Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update apache#40272

picks apache#40272

pick [Cherry-pick](branch-2.1) Pick "[Featrue](default value) Support bitmap_empty default value (apache#40364)" (apache#40487)

Pick apache#40364

pick [Feature](partial update) Support flexible partial update in stream load with json files (apache#39756)

This PR add the ability to update different columns for each row in one stream load

Doc: apache/doris-website#1140

```sql
MySQL root@127.1:d1> CREATE TABLE t1 (
                  -> `k` int(11) NULL,
                  -> `v1` BIGINT NULL,
                  -> `v2` BIGINT NULL DEFAULT "9876",
                  -> `v3` BIGINT NOT NULL,
                  -> `v4` BIGINT NOT NULL DEFAULT "1234",
                  -> `v5` BIGINT NULL
                  -> ) UNIQUE KEY(`k`) DISTRIBUTED BY HASH(`k`) BUCKETS 1
                  -> PROPERTIES(
                  -> "replication_num" = "1",
                  -> "enable_unique_key_merge_on_write" = "true");
Query OK, 0 rows affected
Time: 0.013s
MySQL root@127.1:d1> insert into t1 select number, number, number, number, number, number from numbers("number" = "6");
Query OK, 6 rows affected
Time: 0.107s
MySQL root@127.1:d1> select * from t1;
+---+----+----+----+----+----+
| k | v1 | v2 | v3 | v4 | v5 |
+---+----+----+----+----+----+
| 0 | 0  | 0  | 0  | 0  | 0  |
| 1 | 1  | 1  | 1  | 1  | 1  |
| 2 | 2  | 2  | 2  | 2  | 2  |
| 3 | 3  | 3  | 3  | 3  | 3  |
| 4 | 4  | 4  | 4  | 4  | 4  |
| 5 | 5  | 5  | 5  | 5  | 5  |
+---+----+----+----+----+----+
```

test1.json:

```json
{"k": 1, "v1": 10}
{"k": 2, "v2": 20, "v5": 25}
{"k": 3, "v3": 30}
{"k": 4, "v4": 20, "v1": 43, "v3": 99}
{"k": 5, "v5": null}
{"k": 6, "v1": 999, "v3": 777}
{"k": 2, "v4": 222}
{"k": 1, "v2": 111, "v3": 111}
```

```bash
curl --location-trusted -u root: \
    -H "strict_mode:false" \
    -H "format:json" \
    -H "read_json_by_line:true" \
    -H "unique_key_update_mode:UPDATE_FLEXIBLE_COLUMNS" \
    -T test1.json \
    -XPUT http://<host>:<http_port>/api/d1/t1/_stream_load
```

```sql
MySQL root@127.1:d1> select * from t1;
+---+-----+------+-----+------+--------+
| k | v1  | v2   | v3  | v4   | v5     |
+---+-----+------+-----+------+--------+
| 0 | 0   | 0    | 0   | 0    | 0      |
| 1 | 10  | 111  | 111 | 1    | 1      |
| 2 | 2   | 20   | 2   | 222  | 25     |
| 3 | 3   | 3    | 30  | 3    | 3      |
| 4 | 43  | 4    | 99  | 20   | 4      |
| 5 | 5   | 5    | 5   | 5    | <null> |
| 6 | 999 | 9876 | 777 | 1234 | <null> |
+---+-----+------+-----+------+--------+
```

fix compile

pick [branch-2.1] Picks "[opt](partial update) Allow to only specify key columns in partial update apache#40736" (apache#40863)

picks apache#40736

fix
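To make the row-by-row semantics of the example above easier to follow, here is a minimal Python sketch that reproduces the final `select * from t1` result from `test1.json`. It is only a model of the observable behaviour shown in this commit message, not the Doris implementation. The names `COLUMNS`, `fill_new_row` and `apply_flexible_update` are hypothetical, and raising an error for a missing NOT NULL column without a default is an assumption (the example never exercises that case).

```python
# Column metadata copied from the CREATE TABLE statement in the example:
# name -> (nullable, default). NO_DEFAULT marks columns without a DEFAULT clause.
NO_DEFAULT = object()
COLUMNS = {
    "v1": (True, NO_DEFAULT),
    "v2": (True, 9876),
    "v3": (False, NO_DEFAULT),
    "v4": (False, 1234),
    "v5": (True, NO_DEFAULT),
}


def fill_new_row(key, updates):
    """Build a full row for a key that does not exist yet."""
    row = {"k": key}
    for name, (nullable, default) in COLUMNS.items():
        if name in updates:
            row[name] = updates[name]
        elif default is not NO_DEFAULT:
            row[name] = default
        elif nullable:
            row[name] = None
        else:
            # Assumption: not covered by the example in the commit message.
            raise ValueError(f"column {name} is NOT NULL and has no default")
    return row


def apply_flexible_update(table, lines):
    """Apply JSON lines in order; each line may touch a different set of columns."""
    for line in lines:
        updates = dict(line)
        key = updates.pop("k")
        if key in table:
            # Existing key: overwrite only the columns present in this line.
            table[key].update(updates)
        else:
            table[key] = fill_new_row(key, updates)
    return table


# Initial state: rows 0..5 with every column equal to the key.
table = {n: {"k": n, **{c: n for c in COLUMNS}} for n in range(6)}

# Contents of test1.json, one dict per line.
UPDATES = [
    {"k": 1, "v1": 10},
    {"k": 2, "v2": 20, "v5": 25},
    {"k": 3, "v3": 30},
    {"k": 4, "v4": 20, "v1": 43, "v3": 99},
    {"k": 5, "v5": None},
    {"k": 6, "v1": 999, "v3": 777},
    {"k": 2, "v4": 222},
    {"k": 1, "v2": 111, "v3": 111},
]

apply_flexible_update(table, UPDATES)
for key in sorted(table):
    print(table[key])
```

In this model the two lines for `k = 1` and the two lines for `k = 2` are merged in file order, and the new key `6` picks up the declared defaults for `v2` and `v4`, which matches the result table above.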
pick the following PRs:

other PR:
- apache#39619
- apache#40062
- apache#40272
- apache#40364
- apache#40736
- apache#41439

main PR:
- apache#39756
- apache#41950
- apache#41701
branch-2.1-pick: #40863
branch-2.0-pick: #40864