[Enhancement] Optimize mem usage of partial update #14187
Conversation
run starrocks_admit_test
clang-tidy review says "All clean, LGTM! 👍"
run starrocks_be_unittest
clang-tidy review says "All clean, LGTM! 👍"
clang-tidy made some suggestions
270e0c6 to e0ead98 (compare)
run starrocks_be_unittest
clang-tidy review says "All clean, LGTM! 👍"
run starrocks_admit_test
run starrocks_admit_test
clang-tidy review says "All clean, LGTM! 👍"
SonarCloud Quality Gate failed. 0 Bugs. No Coverage information.
@mergify backport branch-2.5
(cherry picked from commit 545b7be)

# Conflicts:
#	be/src/storage/memtable.h
#	be/src/storage/rowset_update_state.cpp
#	be/src/storage/rowset_update_state.h
#	be/src/storage/tablet_updates.cpp
✅ Backports have been created
What type of PR is this:

Which issues of this PR fixes:

Fixes #11269

Problem Summary (Required):

We partially optimized the memory usage of large imports into the primary key model in PR #12068, but that enhancement does not apply when the load is a partial update. A large number of partial updates in one transaction also requires a lot of memory, so this PR tries to reduce the memory usage of large partial updates.

There are two reasons for large memory usage during partial column updates:

1. Updating a few columns may still increase the segment file size, and we need to load all of a segment's data into memory, which costs a lot of memory.
2. A partial update requires reading the data of the other (non-updated) columns back into memory, which can take up a lot of memory if the table has many columns.

In order to reduce memory usage, the following two adjustments are made:

1. Estimate the length of the updated partial columns in each row when importing data, thus reducing the size of the segment files.
2. Do not load all the data of the rowset into memory at once; instead, load and process it segment by segment (a minimal sketch of this idea follows the test results below).

In my test environment (one BE with two HDDs, using Stream Load), I created a table with 65 columns and 20 buckets:

```
CREATE TABLE `partial_test` (
  `col_1` bigint(20) NOT NULL COMMENT "",
  `col_2` bigint(20) NOT NULL COMMENT "",
  `col_3` bigint(20) NOT NULL COMMENT "",
  `col_4` varchar(150) NOT NULL COMMENT "",
  `col_5` varchar(150) NOT NULL COMMENT "",
  `col_6` varchar(150) NULL COMMENT "",
  `col_7` varchar(150) NULL COMMENT "",
  `col_8` varchar(1024) NULL COMMENT "",
  `col_9` varchar(120) NULL COMMENT "",
  `col_10` varchar(60) NULL COMMENT "",
  `col_11` varchar(10) NULL COMMENT "",
  `col_12` varchar(120) NULL COMMENT "",
  `col_13` varchar(524) NULL COMMENT "",
  `col_14` varchar(100) NULL COMMENT "",
  `col_15` varchar(150) NULL COMMENT "",
  `col_16` varchar(150) NULL COMMENT "",
  `col_17` varchar(150) NULL COMMENT "",
  `col_18` bigint(20) NULL COMMENT "",
  `col_19` varchar(500) NULL COMMENT "",
  `col_20` varchar(150) NULL COMMENT "",
  `col_21` tinyint(4) NULL COMMENT "",
  `col_22` int(11) NULL COMMENT "",
  `col_23` varchar(524) NULL COMMENT "",
  `col_24` bigint(20) NULL COMMENT "",
  `col_25` bigint(20) NULL COMMENT "",
  `col_26` varchar(8) NULL COMMENT "",
  `col_27` decimal64(18, 6) NULL COMMENT "",
  `col_28` decimal64(18, 6) NULL COMMENT "",
  `col_29` decimal64(18, 6) NULL COMMENT "",
  `col_30` decimal64(18, 6) NULL COMMENT "",
  `col_31` decimal64(18, 6) NULL COMMENT "",
  `col_32` decimal64(18, 6) NULL COMMENT "",
  `col_33` bigint(20) NULL COMMENT "",
  `col_34` decimal64(18, 6) NULL COMMENT "",
  `col_35` varchar(8) NULL COMMENT "",
  `col_36` decimal64(18, 6) NULL COMMENT "",
  `col_37` decimal64(18, 6) NULL COMMENT "",
  `col_38` varchar(8) NULL COMMENT "",
  `col_39` decimal64(18, 6) NULL COMMENT "",
  `col_40` decimal64(18, 6) NULL COMMENT "",
  `col_41` varchar(8) NULL COMMENT "",
  `col_42` decimal64(18, 6) NULL COMMENT "",
  `col_43` decimal64(18, 6) NULL COMMENT "",
  `col_44` decimal64(18, 6) NULL COMMENT "",
  `col_45` decimal64(18, 6) NULL COMMENT "",
  `col_46` int(11) NULL COMMENT "",
  `col_47` int(11) NOT NULL COMMENT "",
  `col_48` tinyint(4) NULL COMMENT "",
  `col_49` varchar(200) NULL COMMENT "",
  `col_50` tinyint(4) NULL COMMENT "",
  `col_51` varchar(200) NULL COMMENT "",
  `col_52` varchar(10) NULL COMMENT "",
  `col_53` tinyint(4) NULL COMMENT "",
  `col_54` tinyint(4) NULL COMMENT "",
  `col_55` varchar(150) NULL COMMENT "",
  `col_56` varchar(150) NULL COMMENT "",
  `col_57` varchar(500) NULL COMMENT "",
  `col_58` tinyint(4) NULL COMMENT "",
  `col_59` varchar(100) NULL COMMENT "",
  `col_60` varchar(150) NULL COMMENT "",
  `col_61` varchar(150) NULL COMMENT "",
  `col_62` varchar(150) NULL COMMENT "",
  `col_63` varchar(150) NULL COMMENT "",
  `col_64` datetime NULL COMMENT "",
  `col_65` datetime NULL COMMENT ""
) ENGINE=OLAP
PRIMARY KEY(`col_1`, `col_2`, `col_3`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`col_1`, `col_2`) BUCKETS 20
PROPERTIES (
  "replication_num" = "1",
  "in_memory" = "false",
  "storage_format" = "V2",
  "enable_persistent_index" = "true",
  "compression" = "LZ4"
);
```

| PrimaryKey Length | RowNum | BucketNum | Column Num | Partial ColumnNum | PartialUpdate RowsNum | Load time (s) | Apply time (ms) | Peak UpdateMemory usage | Note |
|---|---|---|---|---|---|---|---|---|---|
| 12 Bytes | 300M | 20 | 65 | 5 | 100M | 135261 | 106693 | 78.9G | branch-main |
| 12 Bytes | 300M | 20 | 65 | 5 | 100M | 166449 | 149870 | 10.3G | branch-opt |
| 12 Bytes | 300M | 20 | 65 | 5 | 100K | 2078 | 529 | 60.1M | branch-main |
| 12 Bytes | 300M | 20 | 65 | 5 | 100K | 2211 | 541 | 60.2M | branch-opt |
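For illustration only, here is a minimal, self-contained C++ sketch of the second adjustment. The types and functions (`Rowset`, `Segment`, `Chunk`, `load_segment_columns`, `apply_partial_update_for_segment`) are hypothetical stand-ins, not the actual StarRocks BE API; the point is only that peak memory drops from "whole rowset" to "one segment" when the read-back data is loaded and released per segment.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// NOTE: hypothetical stand-ins for the real StarRocks types and functions,
// used only to make this sketch self-contained; not the actual BE API.
struct Chunk { std::vector<int> column_data; };
struct Segment { int id; };
struct Rowset { std::vector<Segment> segments; };

// Pretend to read the non-updated columns of a single segment from disk.
Chunk load_segment_columns(const Segment& seg) {
    return Chunk{std::vector<int>(1024, seg.id)};
}

// Pretend to merge the read-back columns with the partial-update rows.
void apply_partial_update_for_segment(const Segment& seg, const Chunk& chunk) {
    std::printf("segment %d: merged %zu values\n", seg.id, chunk.column_data.size());
}

// Before: every segment of the rowset is materialized up front, so peak
// memory is proportional to the size of the whole rowset.
void apply_all_at_once(const Rowset& rowset) {
    std::vector<Chunk> all;
    all.reserve(rowset.segments.size());
    for (const auto& seg : rowset.segments) {
        all.push_back(load_segment_columns(seg));   // everything stays resident
    }
    for (std::size_t i = 0; i < rowset.segments.size(); ++i) {
        apply_partial_update_for_segment(rowset.segments[i], all[i]);
    }
}

// After: load, apply, and release one segment at a time, so peak memory is
// bounded by the largest single segment instead of the whole rowset.
void apply_segment_by_segment(const Rowset& rowset) {
    for (const auto& seg : rowset.segments) {
        Chunk chunk = load_segment_columns(seg);    // only this segment in memory
        apply_partial_update_for_segment(seg, chunk);
    }   // chunk is destroyed here, before the next segment is read
}

int main() {
    Rowset rowset{{{0}, {1}, {2}}};
    apply_all_at_once(rowset);         // old behavior
    apply_segment_by_segment(rowset);  // optimized behavior
    return 0;
}
```

The trade-off visible in the results above is a somewhat longer load/apply time on branch-opt in exchange for a much lower memory peak in the 100M-row case.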
Checklist:
Bugfix cherry-pick branch check: