[fix](load) fix that load channel failed to be released in time #14119

Merged
merged 1 commit into apache:master on Nov 9, 2022

Conversation

@liaoxin01 (Contributor) commented on Nov 9, 2022

Proposed changes

Issue Number: close #xxx

Problem summary

Problem

After an import failed, the load channel was only released after a long delay:

W1107 22:39:40.250975 6352 tablet_sink.cpp:158] VNodeChannel[11200-10003], load_id=bfc1f85bd4d743db-8804da3331d2cc84, txn_id=2038581140244480, node=172.16.11.186:8060, tablet writer failed to reduce mem consumption by flushing memtable, tablet_id=11291, txn_id=2038581117976576, err=6, errcode=-238, msg: 0# doris::Status::ConstructErrorStatus(short) at /disk0/be/src/common/status.cpp:78
I1108 02:40:38.573726 6308 load_channel_mgr.cpp:188] erase timeout load channel: bfc1f85bd4d743db-8804da3331d2cc84

Why

When data transmission on a node channel fails, the node channel is marked as cancelled, and subsequent add-batch requests skip it.
A normal node channel sends its last data with an EOS flag; the destination processes the EOS flag and actively erases the corresponding load channel.
However, a cancelled node channel sends neither an EOS nor a cancel request to the destination, so the destination's load channel is never released (until the timeout sweep).

How

When closing, a failed node channel now initiates a cancel operation to the destination, so the corresponding load channel can be released immediately.
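
The sketch below is a minimal illustration of this close path, not the actual Doris code; the NodeChannel type and the is_cancelled(), cancel(), and close_wait() members are hypothetical stand-ins for the logic in be/src/exec/tablet_sink.cpp.

```
// Minimal sketch (hypothetical types): on close, a node channel that already
// failed sends an explicit cancel to the destination BE so its load channel is
// released immediately instead of waiting for the timeout sweep in LoadChannelMgr.
#include <string>
#include <vector>

struct NodeChannel {
    bool failed = false;  // set when a previous add_batch on this channel failed
    bool is_cancelled() const { return failed; }
    void cancel(const std::string& reason) {
        (void)reason;  // placeholder: the real code would send a cancel RPC,
                       // and the destination erases its load channel
    }
    void close_wait() {
        // placeholder: send the last batch with the EOS flag and wait for the ack
    }
};

void close_channels(std::vector<NodeChannel>& channels) {
    for (NodeChannel& ch : channels) {
        if (ch.is_cancelled()) {
            ch.cancel("canceled because an earlier add_batch failed");
        } else {
            ch.close_wait();
        }
    }
}
```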

Checklist (Required)

  1. Does it affect the original behavior:
    • Yes
    • No
    • I don't know
  2. Have unit tests been added:
    • Yes
    • No
    • No Need
  3. Has documentation been added or modified:
    • Yes
    • No
    • No Need
  4. Does it need to update dependencies:
    • Yes
    • No
  5. Are there any changes that cannot be rolled back:
    • Yes (If Yes, please explain WHY)
    • No

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

zhannngchen previously approved these changes on Nov 9, 2022

@zhannngchen (Contributor) commented:

LGTM

be/src/exec/tablet_sink.cpp (review comment resolved)
@github-actions bot commented on Nov 9, 2022:

PR approved by anyone and no changes requested.

@hello-stephen (Contributor) commented on Nov 9, 2022:

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 34.78 seconds
load time: 448 seconds
storage size: 17180315880 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221109143215_clickbench_pr_42855.html

@dataroaring (Contributor) commented:

LGTM

@dataroaring dataroaring merged commit 3690c4d into apache:master Nov 9, 2022
liaoxin01 added a commit to liaoxin01/doris that referenced this pull request Nov 12, 2022
…led to be released in time (apache#14119)"

commit 3690c4d
Author: Xin Liao <liaoxinbit@126.com>
Date:   Wed Nov 9 22:38:08 2022 +0800
    [fix](load) fix that load channel failed to be released in time (apache#14119)
liaoxin01 added a commit to liaoxin01/doris that referenced this pull request Nov 23, 2022
luwei16 pushed a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
…led to be released in time (apache#14119)"

commit 3690c4d
Author: Xin Liao <liaoxinbit@126.com>
Date:   Wed Nov 9 22:38:08 2022 +0800
    [fix](load) fix that load channel failed to be released in time (apache#14119)
luwei16 added a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
…electdb-cloud-dev (20221130 23a144c) (apache#1199)

* [feature](selectdb-cloud) Fix file cache metrics nullptr error (apache#1060)

* [feature](selectdb-cloud) Fix abort copy when -235 (apache#1039)

* [feature](selectdb-cloud) Replace libfdb_c.so to make it compatible with different OS (apache#925)

* [feature](selectdb-cloud) Optimize RPC retry in cloud_meta_mgr (apache#1027)

* Optimize RETRY_RPC in cloud_meta_mgr
* Add random sleep for RETRY_RPC
* Add a simple backoff strategy for rpc retry

* [feature](selectdb-cloud) Copy into support select by column name (apache#1055)

* Copy into support select by column name
* Fix broker load core dump due to mis-match of number of columns between remote and schema

* [feature](selectdb-cloud) Fix test_dup_mv_schema_change case (apache#1022)

* [feature](selectdb-cloud) Make the broker execute on the specified cluster (apache#1043)

* Make the broker execute on the specified cluster
* Pass the cluster parameter

* [feature](selectdb-cloud) Support concurrent BaseCompaction and CumuCompaction on a tablet (apache#1059)

* [feature](selectdb-cloud) Reduce meta-service log (apache#1067)

* Quote string in the tagged log
* Add template to enable customized log for RPC requests

* [feature](selectdb-cloud) Use read-only txn + read-write txn for `commit_txn` (apache#1065)

* [feature](selectdb-cloud) Pick "[fix](load) fix that load channel failed to be released in time (apache#14119)"

commit 3690c4d
Author: Xin Liao <liaoxinbit@126.com>
Date:   Wed Nov 9 22:38:08 2022 +0800
    [fix](load) fix that load channel failed to be released in time (apache#14119)

* [feature](selectdb-cloud) Add compaction profile log (apache#1072)

* [feature](selectdb-cloud) Fix abort txn fail when copy job `getAllFileStatus` exception (apache#1066)

* Revert "[feature](selectdb-cloud) Copy into support select by column name (apache#1055)"

This reverts commit f1a543e.

* [feature](selectdb-cloud) Pick"[fix](metric) fix the bug of not updating the query latency metric apache#14172 (apache#1076)"

* [feature](selectdb-cloud) Distinguish KV_TXN_COMMIT_ERR or KV_TXN_CONFLICT while commit failed (apache#1082)

* [feature](selectdb-cloud) Support configuring base compaction concurrency (apache#1080)

* [feature](selectdb-cloud) Enhance start.sh/stop.sh for selectdb_cloud (apache#1079)

* [feature](selectdb-cloud) Add smoke testing (apache#1056)

Add smoke tests: 1. upload/query via the HTTP data API; 2. internal and external stages; 3. select and insert.

* [feature](selectdb-cloud) Disable admin stmt in cloud mode (apache#1064)

Disable the following stmt.

* AdminRebalanceDiskStmt/AdminCancelRebalanceDiskStmt
* AdminRepairTableStmt/AdminCancelRepairTableStmt
* AdminCheckTabletsStmt
* AdminCleanTrashStmt
* AdminCompactTableStmt
* AdminCopyTabletStmt
* AdminDiagnoseTabletStmt
* AdminSetConfigStmt
* AdminSetReplicaStatusStmt
* AdminShowConfigStmt
* AdminShowReplicaDistributionStmt
* AdminShowReplicaStatusStmt
* AdminShowTabletStorageFormatStmt

Leaving a backdoor for the user root:

* AdminSetConfigStmt
* AdminShowConfigStmt
* AdminShowReplicaDistributionStmt
* AdminShowReplicaStatusStmt
* AdminDiagnoseTabletStmt

* [feature](selectdb-cloud) Update copy into doc (apache#1063)

* [feature](selectdb-cloud) Fix AdminSetConfigStmt cannot work with root (apache#1085)

* [feature](selectdb-cloud) Fix userid null lead to checkpoint error (apache#1083)

* [feature](selectdb-cloud) Support controlling the space used for upload (apache#1091)

* [feature](selectdb-cloud) Pick "[fix](sequence) fix that update table core dump with sequence column (apache#13847)" (apache#1092)

* [Fix](memory-leak) Fix boost::stacktrace memory leak (1097)

* [Fix](selectdb-cloud) Several picks to fix memtracker  (apache#1087)

* [enhancement](memtracker)  Add independent and unique scanner mem tracker for each query (apache#13262)

* [enhancement](memory) Print memory usage log when memory allocation fails (apache#13301)

* [enhancement](memtracker) Print query memory usage log every second when `memory_verbose_track` is enabled (apache#13302)

* [fix](memory) Fix USE_JEMALLOC=true UBSAN compilation error apache#13398

* [enhancement](memtracker) Fix bthread local consume mem tracker (apache#13368)

    Previously, bthread_getspecific was called every time bthread local was used. In the test at apache#10823, it was found 
    that frequent calls to bthread_getspecific had performance problems.

    So a cache is implemented on pthread local based on the btls key, but the btls key cannot correctly sense bthread switching.

    So the cache is now implemented based on the bthread id obtained from bthread_self.

* [enhancement](memtracker) Fix brpc causing query mem tracker to be inaccurate apache#13401

* [fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe apache#13528

* [enhancement](memtracker) Fix Brpc mem count and refactored thread context macro  (apache#13469)

* [fix](memtracker) Fix the usage of bthread mem tracker  (apache#13708)

    bthread context init has a performance cost, so it is temporarily removed first; it will be completely refactored in apache#13585.

* [enhancement](memtracker) Refactor load channel + memtable mem tracker (apache#13795)

* [fix](load) Fix load channel mgr lock (apache#13960)

    hot fix load channel mgr lock

* [fix](memtracker) Fix DCHECK !std::count(_consumer_tracker_stack.begin(), _consumer_tracker_stack.end(), tracker)

* [tempfix][memtracker] wait pick 0b945fe

Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>

* [feature](selectdb-cloud) Add more recycler case (apache#1094)

* [feature](selectdb-cloud) Pick "[improvement](load) some simple optimization for reduce load memory policy (apache#14215)" (apache#1096)

* [feature](selectdb-cloud) Reduce unnecessary get rowset rpc when prepare compaction (apache#1099)

* [feature](selectdb-cloud) Pick "[improvement](load) reduce memory in batch for small load channels (apache#14214)" (apache#1100)

* [feature](selectdb-cloud) Pick "[improvement](load) release load channel actively when error occurs (apache#14218)" (apache#1102)

* [feature](selectdb-cloud) Print build info of ms/recycler to stdout when launch (apache#1105)

* [feature](selectdb-cloud) copy into support select by column name and load with partial columns (apache#1104)

e.g.
```
COPY INTO test_table FROM (SELECT col1, col2, col3 FROM @ext_stage('1.parquet'))

COPY INTO test_table (id, name) FROM (SELECT col1, col2 FROM @ext_stage('1.parquet'))
```

* [fix](selectdb-cloud) Pick "[Fix](array-type) bugfix for array column with delete condition (apache#13361)" (apache#1109)

Fix for SQL with array column:
delete from tbl where c_array is null;

more info please refer to apache#13360

Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [feature](selectdb-cloud) Copy into support force (apache#1081)

* [feature](selectdb-cloud) Add abort txn, abort tablet job http api (apache#1101)

Abort load txn by txn_id:
```
curl "{meta_sevice_ip}:{brpc_port}/MetaService/http/abort_txn?token=greedisgood9999" -d '{
"cloud_unique_id": string,
"txn_id": int64
}'
```

Abort load txn by db_id and label:
```
curl "{meta_sevice_ip}:{brpc_port}/MetaService/http/abort_txn?token=greedisgood9999" -d '{
"cloud_unique_id": string,
"db_id": int64,
"label": string
}'
```

Only support abort compaction job currently:
```
curl "{meta_sevice_ip}:{brpc_port}/MetaService/http/abort_tablet_job?token=greedisgood9999" -d '{
"cloud_unique_id": string,
"job" : {
  "idx": {"tablet_id": int64},
  "compaction": [{"id": string}]
}
}'
```

* [feature](selectdb-cloud) Fix external stage data for smoke test and retry to create stage (apache#1119)

* [feature](selectdb-cloud) Fix data leaks when truncating table (apache#1114)

* Drop cloud partition when truncating table
* Add retry strategy for dropCloudMaterializedIndex

* [feature](selectdb-cloud) Fix missing library when compiling unit test (apache#1128)

* [feature](selectdb-cloud) Validate the object storage when create stage (apache#1115)

* [feature](selectdb-cloud) Fix incorrectly setting cumulative point when committing base compaction (apache#1127)

* [feature](selectdb-cloud) Fix missing lease when preparing cumulative compaction (apache#1131)

* [feature](selectdb-cloud) Fix unbalanced tablet distribution (apache#1121)

* Fix the bug of unbalanced tablet distribution
* Use replica index hash to BE

* [feature](selectdb-cloud) Fix core dump when get tablets info by BE web page (apache#1113)

* [feature](selectdb-cloud) Fix start_fe.sh --version (apache#1106)

* [feature](selectdb-cloud) Print tablet stats before and after compaction (apache#1132)

* Log num rowsets before and after compaction
* Print tablet stats after committing compaction

* [feature](selectdb-cloud) Allow root user execute AlterSystemStmt (apache#1143)

* [feature](selectdb-cloud) Fix BE UT (apache#1141)

* [feature](selectdb-cloud) Select BE for the first bucket of every partition randomly (apache#1136)

* [feature](selectdb-cloud) Fix query_limit int -> int64 (apache#1154)

* [feature](selectdb-cloud) Add more cloud recycler case (apache#1116)

* add more cloud recycler case
* modify cloud recycler case dataset from sf0.1 to sf1

* [feature](selectdb-cloud) Fix misuse of aws transfer which may delete tmp file prematurely (apache#1160)

* [feature](selectdb-cloud) Add test for copy into http data api and userId (apache#1044)

* Add test for copy into http data api and userId
* Add external and internal stage cross use regression case.

* [feature](selectdb-cloud)  Pass the cloud compaction regression test (apache#1173)

* [feature](selectdb-cloud) Modify max_bytes_per_broker_scanner default value to 150G (apache#1184)

* [feature](selectdb-cloud) Fix missing lock when calling Tablet::delete_predicates (apache#1182)

* [improvement](config)change default remote_fragment_exec_timeout_ms to 30 seconds

* [improvement](config) change default value of broker_load_default_timeout_second to 12 hours

* [feature](selectdb-cloud) Fix replay copy into (apache#1167)

* Add stage ddl regression
* fix replay copy into
* remove unused log
* fix user name

* [feature](selectdb-cloud) Fix FE --version option not work after fe started (apache#1161)

* [feature](selectdb-cloud) BE accesses object store using HTTP (apache#1111)

* [feature](selectdb-cloud) Refactor recycle copy jobs (apache#1062)

* [fix](FE) Pick fix from doris master (apache#1177) (apache#1178)

Commit: 53e5f39
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Mon Oct 31 2022 10:19:32 GMT+0800 (China Standard Time)
fix result exprs should be substituted in the same way as agg exprs (apache#13744)

Commit: a4a9912
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Thu Nov 03 2022 10:26:59 GMT+0800 (China Standard Time)
fix group by constant value bug (apache#13827)

Commit: 84b969a
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Thu Nov 10 2022 11:10:42 GMT+0800 (China Standard Time)
fix the grouping expr should check col name from base table first, then alias (apache#14077)

Commit: ae4f4b9
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Thu Nov 24 2022 10:31:58 GMT+0800 (China Standard Time)
fix having clause should use column name first then alias (apache#14408)

* [feature](selectdb-cloud) Deal with getNextTransactionId rpc exception (apache#1181)

Before this fix, getNextTransactionId would return -1 if there was an RPC exception,
which could cause a schema change and the previous load task to execute in parallel unexpectedly.

* [feature](selectdb-cloud) Throw exception for unsupported operations in CloudGlobalTransactionMgr (apache#1180)

* [improvement](load) Add more log on RPC error (apache#1183)

* [feature](selectdb-cloud) Add copy_into case(json, parquet, orc) and tpch_sf1 to smoke test (apache#1140)

* [feature](selectdb-cloud) Recycle dropped stage (apache#1071)

* log s3 response code
* add log in S3Accessor::delete_objects_by_prefix
* Fix show copy
* remove empty line

* [feature](selectdb-cloud) Support bthread for new scanner (apache#1117)

* Support bthread for new scanner
* Keep the number of remote threads same as local threads

* [feature](selectdb-cloud) Implement self-explained cloud unique id for instance id searching (apache#1089)

1. Implement self-explained cloud unique id for instance id searching
2. Fix register core when metaservice start error
3. Fix drop_instance not set mtime
4. Add HTTP API to get instance info

```
curl "127.0.0.1:5008/MetaService/http/get_instance?token=greedisgood9999&cloud_unique_id=regression-cloud-unique-id-fe-1"

curl "127.0.0.1:5008/MetaService/http/get_instance?token=greedisgood9999&cloud_unique_id=1:regression_instance0:regression-cloud-unique-id-fe-1"

curl "127.0.0.1:5008/MetaService/http/get_instance?token=greedisgood9999&instance_id=regression_instance0"
```

* [improvement](memory) simplify memory config related to tcmalloc  and add gc (apache#1191)

* [improvement](memory) simplify memory config related to tcmalloc

There are several configs related to tcmalloc, and users do not know how to configure them. In practice users just want two modes, performance or compact: in performance mode they want Doris to run queries and loads quickly, while in compact mode they want Doris to run with less memory usage.

If we want to configure tcmalloc individually, we can use the environment variables supported by tcmalloc.

* [improvement](tcmalloc) add moderate mode and avoid oom  with a lot of cache (apache#14374)

Call ReleaseToSystem aggressively when there is little free memory.

* [feature](selectdb-cloud) Pick "[fix](hashjoin) fix coredump of hash join in ubsan build apache#13479" (apache#1190)

commit b5cd167
Author: TengJianPing <18241664+jacktengg@users.noreply.github.com>
Date:   Thu Oct 20 10:16:19 2022 +0800
    [fix](hashjoin) fix coredump of hash join in ubsan build (apache#13479)

* [feature](selectdb-cloud) Support close FileWriter without forcing sync data to storage medium (apache#1134)

* Trace accumulated time
* Support close FileWriter without forcing sync data to storage medium
* Avoid trace overhead when disable trace

* [feature](selectdb-cloud) Pick "[BugFix](function) fix reverse function dynamic buffer overflow due to illegal character apache#13671" (apache#1146)

* pick [opt](exec) Replace get_utf8_byte_length function by array (apache#13664)
* pick [BugFix](function) fix reverse function dynamic buffer overflow due to illegal character apache#13671
Co-authored-by: HappenLee <happenlee@hotmail.com>

* [feature](selectdb-cloud) Pick "[fix](fe) Inconsistent behavior for string comparison in FE and BE (apache#13604)" (apache#1150)

Co-authored-by: xueweizhang <zxw520blue1@163.com>

* [feature](selectdb-cloud) Copy into support delete_on condition (apache#1148)

* [feature](selectdb-cloud) Pick "[fix](agg)fix group by constant value bug (apache#13827)" (apache#1152)

* [fix](agg)fix group by constant value bug

* keep only one const grouping exprs if no agg exprs

Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>

* [feature](selectdb-cloud) Pick "[fix](join)the build and probe expr should be calculated before converting input block to nullable (apache#13436)" (apache#1155)

* [fix](join)the build and probe expr should be calculated before converting input block to nullable

* remove_nullable can be called on const column

Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>

* [feature](selectdb-cloud) Pick "[Bug](predicate) fix core dump on bool type runtime filter (apache#13417)" (apache#1156)

fix core dump on bool type runtime filter

Co-authored-by: Pxl <pxl290@qq.com>

* [feature](selectdb-cloud) Pick "[Fix](agg) fix bitmap agg core dump when phmap pointer assert alignment (apache#13381)" (apache#1157)

Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>

* [feature](selectdb-cloud) Pick "[Bug](function) fix core dump on case when have 1000 condition apache#13315" (apache#1158)

Co-authored-by: Pxl <pxl290@qq.com>

* [feature](selectdb-cloud) Pick "[fix](sort)the sort expr nullable info is wrong in some case (apache#12003)"

* [feature](selectdb-cloud) Pick "[Improvement](decimal) print decimal according to the real precision and scale (apache#13437)"

* [feature](selectdb-cloud) Pick "[bugfix](VecDateTimeValue) eat the value of microsecond in function from_date_format_str (apache#13446)"

* [bugfix](VecDateTimeValue) eat the value of microsecond in function from_date_format_str

* add sql based regression test

Co-authored-by: xiaojunjie <xiaojunjie@baidu.com>

Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
Co-authored-by: meiyi <myimeiyi@gmail.com>
Co-authored-by: Xiaocc <598887962@qq.com>
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: Luwei <814383175@qq.com>
Co-authored-by: plat1ko <platonekosama@gmail.com>
Co-authored-by: deardeng <565620795@qq.com>
Co-authored-by: Kidd <107781942+k-i-d-d@users.noreply.github.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: zhannngchen <48427519+zhannngchen@users.noreply.github.com>
Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: AlexYue <yj976240184@qq.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: xiaojunjie <971308896@qq.com>
Co-authored-by: xiaojunjie <xiaojunjie@baidu.com>
luwei16 added a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
```
 branch                                   date

                            20221115          20221205  20221210
                            9de1fec6c
                                v                 v         v
doris-1.2-lts                ---o-o-o-o-o-o-o---  .         .
                                 \                .         .
selectdb-cloud-dev-merge          o--o--o---o-o-o .         .
                                  /              \.         .
selectdb-cloud-dev-20221205       |               o-o--o--o-.
                                 /                /        \.
selectdb-cloud-dev-20221210     |                /          o--o (final)
                                /               /          /    \
selectdb-cloud-dev           ---o---o---o-o-o---o--o----o--o-----X-----
                                ^               ^          ^
                            67875dd2b       7cf9fb0ab   479d081f8
```


* Revert "[enhancement](compaction) opt compaction task producer and quick compaction (#13495)" (#13833)

This reverts commit 4f2ea0776ca3fe5315ab5ef7e00eefabfb5771a0.

* [feature](Nereids): add rule for matching plan into HyperGraph. (#13805)

* [fix](analytic) fix coredump cause by empty analytic parameter types (#13808)



* fix fe compile error

* [Bugfix](upgrade) Fix 1.1 upgrade 1.2 coredump when schema change (#13822)

When upgrading from 1.1 to 1.2, the FE version will not match the BE version for a period of time. After upgrading a BE and doing a schema change, the BE will use the desc_tbl field that was added in the 1.2 FE. The BE will core dump because the desc_tbl field is nullptr, so it needs to refuse such requests.

* [feature](nereids) add rule for semi/anti join exploration, when there is project between them (#13756)

* [feature](syntax) support SELECT * EXCEPT (#13844)



* [feature](syntax) support SELECT * EXCEPT: add regression test

* [enhancement](Nereids) add merge project rule to column prune rule set (#13835)

When we do column pruning, we add a project on the child plan. If the child plan is already a Project, we need to merge them.

* [fix](nereids) map literal to double in FilterSelectivityCalculator (#13776)

Fix the literal-to-double bug: all literal types implement the getDouble() function.

* [enhancement](Nereids) use join estimation v2 only when stats derive v2 is enable (#13845)

join estimation V2 should be invoked when enableNereidsStatsDeriveV2=true

* [javaudf](string) Fix string format in java udf (#13854)

* [improvement](memory) simplify memory config related to tcmalloc (#13781)

There are several configs related to tcmalloc, and users do not know how to configure them. In practice users just want two modes, performance or compact: in performance mode they want Doris to run queries and loads quickly, while in compact mode they want Doris to run with less memory usage.

If we want to configure tcmalloc individually, we can use the environment variables supported by tcmalloc.

* [minor](load) Improve error message for string type in loading process (#13718)

* [fix](spark load)The where condition does not take effect when spark load loads the file (#13803)

* [enhancement](olap scanner) Scanner row bytes buffer is too small bug (#13874)

* [enhancement](olap scanner) Scanner row bytes buffer is too small, please try to increase be config

Co-authored-by: yiguolei <yiguolei@gmail.com>

* [minor](log) remove some e.printStackTrace() (#13870)

* [enhancement](test) retry start be or fe when port has been bind. (#13860)



Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>

* [docs](tablet-docs) fix the tablet-repair-and-balance.md doucument. (#13853)

Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>

* [doc](spark-doris-connetor)Add spark Doris connector to support streamload documentation #13834

* [fix](join)ColumnNullable need handle const column with nullable const value (#13866)

* [enhancement](profile) add profile to show column predicates (#13862)

* [community](collaborators) add more collaborators (#13880)

* [fix](dynamic-partition) fix wrong check of replication num (#13755)

* [regression](join) add right anti join with other predicate regression case (#13815)

* [meta](recover) change dropInfo and RecoverInfo to GSON (#13830)

* [chore](macOS) Fix compilation errors caused by the deprecated function (#13890)

* [enhancement](Nereids) add eliminate unnecessary project rule (#13886)

This rule eliminates a project whose output set is the same as its child's. If the project is the root of the plan, the elimination condition is that the project's output is exactly the same as its child's.

The reason to add this rule: when we do join reorder during optimization, the root of the transformed plan may be a Project whose output set is the same as the root of the plan before the transformation. If we also had a Project on top of the root with the same output set as the root, we would end up with two exactly identical projects in the memo, one the parent of the other. After MergeProjects, we would get a new Project exactly like the child that has to be added to the parent's group, which triggers a group merge. Since the merge would produce a cycle, it is denied and we get a final plan with two consecutive projects.

## for example:
**BEFORE OPTIMIZATION**
```
LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1]
+--LogicalJoin(type=LEFT_SEMI_JOIN)                [GroupId#2]
   |--LogicalProject(...)
   |  +--LogicalJoin(type=INNER_JOIN)
   |  ...
   +--LogicalOlapScan(...)
```
**AFTER APPLY RULE: LOGICAL_SEMI_JOIN_LOGICAL_JOIN_TRANSPOSE_PROJECT**
```
LogicalProject1( projects=[c_custkey#0, c_name#1])    [GroupId#1]
+--LogicalProject2( projects=[c_custkey#0, c_name#1]) [GroupId#2]
   +--LogicalJoin(type=INNER_JOIN)                    [GroupId#10]
      |--LogicalProject(...)
      |  +--LogicalJoin(type=LEFT_SEMI_JOIN)
      |  ...
      +--LogicalOlapScan(...)
```
**AFTER APPLY RULE: MERGE_PROJECTS**
```
LogicalProject3( projects=[c_custkey#0, c_name#1])  [should be in GroupId#1, but in GroupId#2 in fact]
+--LogicalJoin(type=INNER_JOIN)                     [GroupId#10]
   |--LogicalProject(...)
   |  +--LogicalJoin(type=LEFT_SEMI_JOIN)
   |  ...
   +--LogicalOlapScan(...)
```
Since we have exactly the same GroupExpression (LogicalProject3 and LogicalProject2) in GroupId#1 and GroupId#2, we need to do MergeGroup(GroupId#1, GroupId#2). But the child of GroupId#1 is in GroupId#2, so the merge is denied.
If the best GroupExpression in GroupId#2 is LogicalProject3, we get two consecutive projects in the final plan.

* [fix](fe) Inconsistent behavior for string comparison in FE and BE (#13604)

* [enhancement](Nereids) generate correct distribution spec after project (#13725)

After the project, some slots may be projected to other ones, so we need to replace the ExprId in DistributionSpecHash with the new one. If we do a projection other than Alias, we need to return DistributionSpecAny instead of the child's DistributionSpec.

* [fix](Nereids) throw NPE when call getOutputExprIds in LogicalProperties (#13898)

* [Improve](Nereids): refactor eliminate outer join (#13402)

Refactor eliminate outer join #12985

Evaluate the expression with ConstantFoldRule. If the evaluation result is NULL or FALSE, then the elimination condition is satisfied.

* [feature](Nereids) Support lots of scalar function and fix some bug (#13764)

Proposed changes
1. Function interfaces that can search for the matched signature, namely ComputeSignature. They correspond to Function.CompareMode:
   - IdenticalSignature: equal to Function.CompareMode.IS_IDENTICAL
   - NullOrIdenticalSignature: equal to Function.CompareMode.IS_INDISTINGUISHABLE
   - ImplicitlyCastableSignature: equal to Function.CompareMode.IS_SUPERTYPE_OF
   - ExplicitlyCastableSignature: equal to Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF
2. Generate lots of scalar functions.
3. Bug fix: the disassembled avg function computed a wrong result because of a wrong input type; AggregateParam.inputTypesBeforeDissemble is used to save the original input type and is passed to the backend to find the correct global aggregate function.
4. Bug fix: a subquery with OneRowRelation crashed because of a wrong nullable property.


Note:
1. Currently there are no unit tests/regression tests for the scalar functions; I will add tests when migrating the aggregate functions for unified processing.
2. A known problem is that variable-length functions cannot be invoked; I will fix it later.

* [fix](rpc) The proxy removed when rpc exception occurs is not an abnormal proxy (#13836)

`BackendServiceProxy.getInstance()` uses a round-robin strategy to obtain the proxy,
so when the current RPC request is abnormal, the proxy removed by
`BackendServiceProxy.getInstance().removeProxy(...)` is not the abnormal proxy.
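
A small illustrative sketch of that pitfall (the real code is Java in the FE's BackendServiceProxy; the ProxyPool class and names below are hypothetical):

```
// Hypothetical round-robin pool, used only to illustrate the bug above.
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class ProxyPool {
public:
    explicit ProxyPool(std::vector<std::string> proxies) : _proxies(std::move(proxies)) {}
    // Round robin: every call advances the cursor and returns the next proxy.
    std::string next() { return _proxies[_idx++ % _proxies.size()]; }
    void remove(const std::string& p) {
        for (std::size_t i = 0; i < _proxies.size(); ++i) {
            if (_proxies[i] == p) {
                _proxies.erase(_proxies.begin() + i);
                return;
            }
        }
    }
private:
    std::vector<std::string> _proxies;
    std::size_t _idx = 0;
};

// Buggy pattern: next() has advanced past the proxy whose RPC just failed,
// so this removes some other, healthy proxy:
//     pool.remove(pool.next());
// Fixed pattern: remember the proxy actually used for the failed RPC and
// remove exactly that one:
//     std::string used = pool.next();
//     /* ... RPC with `used` fails ... */
//     pool.remove(used);
```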

* [Vectorized](function) support topn_array function (#13869)

* [Enhancement](Nereids)optimize merge group in memo #13900

* [improvement](scan) speed up inserting strings into ColumnString (#13397)

* [Opt](function) opt the function of ndv (#13887)

* [fix](keyword) add BIN as keyword (#13907)

* [feature](function)add regexp functions: regexp_replace_one, regexp_extract_all (#13766)

* [feature](nereids) support common table expression (#12742)

Support common table expression(CTE) in Nereids:
- Just implemented inline CTE, which means we will copy the logicalPlan of CTE everywhere it is referenced;
- If the name of CTE is the same as an existing table or view, we will choose CTE first;

* [Load](Sink) remove validate the column data when data is NULL (#13919)

* [feature](new-scan) support transactional insert in new scan framework (#13858)


Support running transactional insert operation with new scan framework. eg:

admin set frontend config("enable_new_load_scan_node" = "true");
begin;
insert into tbl1 values(1,2);
insert into tbl1 values(3,4);
insert into tbl1 values(5,6);
commit;
Add some limitations to transactional insert:

- Non-literal values are not supported in the insert stmt.

Fix some issues about array type:

- Forbid casting other non-array types to a NESTED array type; it may cause a BE crash.
- Add a getStringValueForArray() method to Expr, to get a valid string-formatted array type value.

Add useLocalSessionState=true to the regression-test jdbc url:
without this config, the jdbc driver sends some init commands each time it connects to the server, such as
select @@session.tx_read_only.
But when we use transactional insert, after the begin command Doris does not support any other type of
stmt except insert, commit or rollback.
So this config is added so that the jdbc driver does not send those commands when connecting.

* [fix](doc) fix 404 link (#13908)

* [regression-test](query) Add the regression case of the query under the large wide table. #13897

Co-authored-by: smallhibiscus <844981280>

* [fix](storage) evaluate_and of ComparisonPredicateBase has logical error (#13895)

* [fix](unique-key-merge-on-write) Types don't match when calling IndexedColumnIterator::seek_at_or_after (#13885)

* [fix](sequence) fix that update table core dump with sequence column (#13847)

* [fix](sequence) fix that update table core dump with sequence column

* update

* [Bugfix](MV) Fixed load negative values into bitmap type materialized views successfully under non-vectorization (#13719)

* [Bugfix](MV) Fixed load negative values into bitmap type materialized views successfully under non-vectorization

* [enhancement](memtracker) Refactor load channel + memtable mem tracker (#13795)

* [fix](function) fix coredump cause by return type mismatch of vectorized repeat function (#13868)


Will not support repeat function during upgrade in vectorized engine.

* [fix](agg)fix group by constant value bug (#13827)

* [fix](agg)fix group by constant value bug

* keep only one const grouping exprs if no agg exprs

* [fix](Nereids) finalize local aggregate should not turn on stream pre agg (#13922)

* [feature](nereids) Support authentication (#13434)

Add a rule to check the permission of a user who are executing a query. Forbid users who don't have SELECT_PRIV on some tables from executing queries on these tables.

* [Feature](join) Support null aware left anti join (#13871)

* [Fix](Nereids) add comments to CostAndEnforcerJob and fix view test case (#13046)

1. add comments to cost and enforcer job as some code is too hard to understand
2. fix nereids_syntax_p0/view.groovy's multi-answer bug.

* [Vectorized](function) support bitmap_to_array function (#13926)

* [docs](round) complement round function documentation (#13838)

* [fix](typo) check catalog enable exception message spelling mistake (#13925)

* [Enhancement](function) optimize the `upper` and `lower` functions using the simd instruction. (#13326)

optimize the `upper` and `lower` functions using the simd instruction.

* [revert](Nereids): revert GroupExpression Children ImmutableList. (#13918)

* [optimization](array-type) update the exception message when create table with array column (#13731)

This pr is used to update the exception message when create table with array column.
Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [Bug](array-type) Fix array product calculate decimal type return wrong result (#13794)

* [enhancement](chore) remove debug log which is really too frequent #13909

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [doc](jsonb type)add documents for JSONB datatype (#13792)

* [BugFix](Concat) output of string concat function exceeds UINT makes crash (#13916)

* [Improvement](javaudf) support different date argument for date/datetime type (#13920)

* [refactor](crossjoin) refactor cross join (#13896)

* [fix](meta)(recover) fix recover info persist bug (#13948)

introduced from #13830

* [improvement](exec) add more debug info on fragment exec error (#13899)

* [feature-wip][refactor](multi-catalog) Persist external catalog related metadata. (#13746)

Persist external catalog/db/table, including the columns of external tables.
After this change, external objects could have their own uniq ID through their lifetime,
this is required for the statistic information collection.

* [fix](runtime-filter) build thread destruct first may cause probe thread coredump (#13911)

* [enhancment](Nereids) enable push down filter through aggregation (#13938)

* [enhancement](Nereids) remove unnecessary int cast (#13881)

* [enhancement](Nereids) remove unnecessary string cast (#13730)

convert string like literal to the cast type instead of run cast in runtime

* [minor](error msg) Fix wrong error message (#13950)

* [enhancement](compaction) introduce segment compaction (#12609) (#12866)

## Design

### Trigger

Every time a rowset writer produces more than N (e.g. 10) segments, we trigger segment compaction. Note that only one segment compaction job runs for a single rowset at a time, to avoid any recursing/queuing nightmare.

### Target Selection

We collect segments on every trigger. We skip big segments whose row num > M (e.g. 10000) because we get little benefit from compacting them compared to the effort. Hence, we only pick the "Longest Consecutive Small" segment group to do the actual compaction.

### Compaction Process

A new thread pool is introduced to help do the job. We submit the above-mentioned "Longest Consecutive Small" segment group to the pool. Then the worker thread does the following:

- build a MergeIterator from the target segments
- create a new segment writer
- for each block read from the MergeIterator, the writer appends it

### SegID handling

SegID must remain consecutive after segment compaction. 

If a rowset has small segments named seg_0, seg_1, seg_2, seg_3 and a big segment seg_4:

- we create a segment named "seg_0-3" to save compacted data for seg_0, seg_1, seg_2 and seg_3
- delete seg_0, seg_1, seg_2 and seg_3
- rename seg_0-3 to seg_0
- rename seg_4 to seg_1

It is worth noting that we should wait for in-flight segment compaction tasks to finish before building the rowset meta and committing this txn.
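
As a rough illustration of the "Longest Consecutive Small" selection step (a hypothetical helper, not the actual implementation; `small_row_limit` stands in for the M threshold above):

```
// Pick the longest run of consecutive "small" segments (row count below
// small_row_limit). Big segments break the run and are left untouched.
// Returns [begin, end) indices of the chosen run; an empty run if none qualifies.
#include <cstddef>
#include <utility>
#include <vector>

std::pair<std::size_t, std::size_t> longest_small_run(
        const std::vector<std::size_t>& rows_per_segment, std::size_t small_row_limit) {
    std::size_t best_begin = 0, best_len = 0;
    std::size_t cur_begin = 0, cur_len = 0;
    for (std::size_t i = 0; i < rows_per_segment.size(); ++i) {
        if (rows_per_segment[i] < small_row_limit) {
            if (cur_len == 0) cur_begin = i;
            ++cur_len;
            if (cur_len > best_len) {
                best_len = cur_len;
                best_begin = cur_begin;
            }
        } else {
            cur_len = 0;  // a big segment ends the current run
        }
    }
    return {best_begin, best_begin + best_len};
}
```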

* [fix](Nerieds) fix tpch and support trace plan's change event (#13957)

This pr fixes some bugs found when running TPC-H:
1. fix avg(decimal) crashing the backend. The fix is in `Avg.getFinalType()` and every child class of `ComputeSignature`
2. fix the ReorderJoin dead loop. The fix is in `ReorderJoin.findInnerJoin()`
3. fix that TimestampArithmetic cannot bind the functions in its child. The fix is in `BindFunction.FunctionBinder.visitTimestampArithmetic()`

New feature: support tracing the plan's change events. You can `set enable_nereids_trace=true` to turn on the trace log and see output like this:
```
2022-11-03 21:07:38,391 INFO (mysql-nio-pool-0|208) [Job.printTraceLog():128] ========== RewriteBottomUpJob ANALYZE_FILTER_SUBQUERY ==========
before:
LogicalProject ( projects=[S_ACCTBAL#17, S_NAME#13, N_NAME#4, P_PARTKEY#19, P_MFGR#21, S_ADDRESS#14, S_PHONE#16, S_COMMENT#18] )
+--LogicalFilter ( predicates=((((((((P_PARTKEY#19 = PS_PARTKEY#7) AND (S_SUPPKEY#12 = PS_SUPPKEY#8)) AND (P_SIZE#24 = 15)) AND (P_TYPE#23 like '%BRASS')) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (R_NAME#1 = 'EUROPE')) AND (PS_SUPPLYCOST#10 =  (SCALARSUBQUERY) (QueryPlan: LogicalAggregate ( phase=LOCAL, outputExpr=[min(PS_SUPPLYCOST#31) AS `min(PS_SUPPLYCOST)`#33], groupByExpr=[] )), (CorrelatedSlots: [P_PARTKEY#19, S_SUPPKEY#12, S_NATIONKEY#15, N_NATIONKEY#3, N_REGIONKEY#5, R_REGIONKEY#0, R_NAME#1]))) )
   +--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |  |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |  |  |  |--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.part, output=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27], candidateIndexIds=[], selectedIndexId=11076, preAgg=ON )
      |  |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.supplier, output=[S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18], candidateIndexIds=[], selectedIndexId=11124, preAgg=ON )
      |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON )
      |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.nation, output=[N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6], candidateIndexIds=[], selectedIndexId=11044, preAgg=ON )
      +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.region, output=[R_REGIONKEY#0, R_NAME#1, R_COMMENT#2], candidateIndexIds=[], selectedIndexId=11108, preAgg=ON )

after:
LogicalProject ( projects=[S_ACCTBAL#17, S_NAME#13, N_NAME#4, P_PARTKEY#19, P_MFGR#21, S_ADDRESS#14, S_PHONE#16, S_COMMENT#18] )
+--LogicalFilter ( predicates=((((((((P_PARTKEY#19 = PS_PARTKEY#7) AND (S_SUPPKEY#12 = PS_SUPPKEY#8)) AND (P_SIZE#24 = 15)) AND (P_TYPE#23 like '%BRASS')) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (R_NAME#1 = 'EUROPE')) AND (PS_SUPPLYCOST#10 = min(PS_SUPPLYCOST)#33)) )
   +--LogicalProject ( projects=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27, S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18, PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11, N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6, R_REGIONKEY#0, R_NAME#1, R_COMMENT#2, min(PS_SUPPLYCOST)#33] )
      +--LogicalApply ( correlationSlot=[P_PARTKEY#19, S_SUPPKEY#12, S_NATIONKEY#15, N_NATIONKEY#3, N_REGIONKEY#5, R_REGIONKEY#0, R_NAME#1], correlationFilter=Optional.empty )
         |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |  |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |  |  |  |--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.part, output=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27], candidateIndexIds=[], selectedIndexId=11076, preAgg=ON )
         |  |  |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.supplier, output=[S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18], candidateIndexIds=[], selectedIndexId=11124, preAgg=ON )
         |  |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON )
         |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.nation, output=[N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6], candidateIndexIds=[], selectedIndexId=11044, preAgg=ON )
         |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.region, output=[R_REGIONKEY#0, R_NAME#1, R_COMMENT#2], candidateIndexIds=[], selectedIndexId=11108, preAgg=ON )
         +--LogicalAggregate ( phase=LOCAL, outputExpr=[min(PS_SUPPLYCOST#31) AS `min(PS_SUPPLYCOST)`#33], groupByExpr=[] )
            +--LogicalFilter ( predicates=(((((P_PARTKEY#19 = PS_PARTKEY#28) AND (S_SUPPKEY#12 = PS_SUPPKEY#29)) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (CAST(R_NAME AS STRING) = CAST(EUROPE AS STRING))) )
               +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#28, PS_SUPPKEY#29, PS_AVAILQTY#30, PS_SUPPLYCOST#31, PS_COMMENT#32], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON )

```

* [chore](be web ui)upgrade jquery version to 3.6.0 (#13942)

* upgrade jquery version to 3.6.0

* update license dist

* [fix](load) Fix load channel mgr lock (#13960)

hot fix load channel mgr lock

* [fix](tablet sink) fallback to non-vectorized interface in tablet_sink if is in progress of upgrding from 1.1-lts to 1.2-lts (#13966)

* [Improvement](javaudf) improve java loader usage (#13962)

* [typo](doc) fixed spelling errors (#13974)

* [doc](routineload)Common mistakes in adding routine load #13975

* [enhancement](test) support tablet repair and balance process in ut (#13940)

* [refactor](iceberg-hudi) disable iceberg and hudi table by default (#13932)

* [test](java-udf)add java udf RegressionTest about the currently supported data types #13972

* [fix](storage) rm unacessary check (#13986) (#13988)

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* [feature-wip](dlf) prepare to support aliyun dlf (#13969)

[What is DLF](https://www.alibabacloud.com/product/datalake-formation)

This PR is a preparation for support DLF, with some changes of multi catalog

1. Add RuntimeException for most of hive meta store or es client visit operation.
2. Add DLF related dependencies.
3. Move the checks of es catalog properties to the analysis phase of creating es catalog

TODO(in next PR):

1. Refactor the `getSplit` method to support not only hdfs, but s3-compatible object storage.
2. Finish the implementation of supporting DLF

* [feature](table-valued-function) Support S3 tvf (#13959)

This pr does three things:

1. Modified the framework of table-valued-function(tvf).
2. be support `fetch_table_schema` rpc.
3. Implemented `S3(path, AK, SK, format)` table-valued-function.

* [fix](memtracker) Fix DCHECK !std::count(_consumer_tracker_stack.begin(), _consumer_tracker_stack.end(), tracker)

* [feature-array](array-type) Add array function array_popback (#13641)

Remove the last element from array.

```
mysql> select array_popback(['test', NULL, 'value']);
+-----------------------------------------------------+
| array_popback(ARRAY('test', NULL, 'value')) |
+-----------------------------------------------------+
| [test, NULL]                                        |
+-----------------------------------------------------+
```

* [feature](function)add search functions: multi_search_all_positions & multi_match_any (#13763)


Co-authored-by: yiliang qiu <yiliang.qiu@qq.com>

* [chore](gutil) remove some gutil macros and solve some macro conflict with brpc (#13954)


Co-authored-by: yiguolei <yiguolei@gmail.com>

* [security](fe jar) upgrade commons-codec:commons-codec to 1.13 #13951

* [typo](docs) fix docs,delete redundant words #13849

* [fix](repeat)remove unmaterialized expr from repeat node (#13953)

* [typo](docs)fix config doc #14010

* [feature](Nereids) support statement having aggregate function in order by list (#13976)

1. add a feature that support statement having aggregate function in order by list. such as:
    SELECT COUNT(*) FROM t GROUP BY c1 ORDER BY COUNT(*) DESC;
2. add clickbench analyze unit tests

* [feat](Nereids) add graph simplifier (#14007)

* [enhancement](Nereids) remove unnecessary decimal cast (#13745)

* [Bug](udf) Make UDF's type always nullable (#14002)

* [typo](doc) fix get_start doc (#14001)

* [fix](load) fix a bug that reduce memory work on hard limit might be triggered twice (#13967)

When the load mem hard limit is reached, all load channels should wait on the lock of LoadChannelMgr until the current reduce-memory work has finished. In the current implementation there is a bug that might cause some threads to be woken up before the reduce-memory work has finished:

1. Thread A finds that the soft limit is reached, picks a load channel, and waits for the reduce-memory work to finish.
2. The memory keeps increasing.
3. Thread B finds that the hard limit is reached (either the load mem hard limit or the process soft limit), picks a load channel to reduce memory, and sets the variable _should_wait_flush to true.
4. Thread C finds that _should_wait_flush is true and waits on _wait_flush_cond.
5. Thread A finishes its reduce-memory work, finds that _should_wait_flush is true, sets it to false, and notifies all threads.
6. Thread C is woken up and picks a load channel to do the reduce-memory work, while thread B's work is not yet finished.

So two threads end up doing reduce-memory work when the hard limit is reached, which is quite confusing.
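
An illustrative sketch of the intended synchronization (not the Doris implementation; the class and member names below are hypothetical): only the thread that set the flag clears it and notifies, and waiters use the flag itself as the wake-up predicate, so a thread like C cannot start a second reduction while B's is still running.

```
// Hypothetical coordinator: waiters stay blocked until the thread that started
// the hard-limit reduction clears the flag.
#include <condition_variable>
#include <mutex>

class LoadMemCoordinator {
public:
    // Returns true if the caller becomes the reducer; other callers should wait.
    bool try_begin_hard_limit_reduce() {
        std::lock_guard<std::mutex> lk(_mu);
        if (_should_wait_flush) return false;
        _should_wait_flush = true;
        return true;
    }
    // Only the thread that started the reduction clears the flag and notifies.
    void end_hard_limit_reduce() {
        {
            std::lock_guard<std::mutex> lk(_mu);
            _should_wait_flush = false;
        }
        _cv.notify_all();
    }
    // Other load threads block here until the reduction has really finished.
    void wait_if_reducing() {
        std::unique_lock<std::mutex> lk(_mu);
        _cv.wait(lk, [this] { return !_should_wait_flush; });
    }
private:
    std::mutex _mu;
    std::condition_variable _cv;
    bool _should_wait_flush = false;
};
```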

* [enhancement](Nereids) support otherJoinConjuncts in cascades join reorder (#13681)

* [refactor](cv)wait on condition variable more gently (#12620)

* [enhancement](profile) add instanceNum, tableIds to profile. (#13985)

* [bug](like function)fix like '' (empty string) get wrong result with all rows #14035

* [Enhancement](function) add to_bitmap() function with int type (#13973)

The to_bitmap function only supported string parameters; add a to_bitmap() function with an int type, which avoids converting the int to a string and then the string back to an int.

* [enhancement](memtracker) Refactor mem tracker hierarchy (#13585)

mem tracker can be logically divided into 4 layers: 1) process, 2) type, 3) query/load/compaction task etc., 4) exec node etc.

type includes

enum Type {
        GLOBAL = 0,        // Life cycle is the same as the process, e.g. Cache and default Orphan
        QUERY = 1,         // Count the memory consumption of all Query tasks.
        LOAD = 2,          // Count the memory consumption of all Load tasks.
        COMPACTION = 3,    // Count the memory consumption of all Base and Cumulative tasks.
        SCHEMA_CHANGE = 4, // Count the memory consumption of all SchemaChange tasks.
        CLONE = 5, // Count the memory consumption of all EngineCloneTask. Note: Memory that does not contain make/release snapshots.
        BATCHLOAD = 6,  // Count the memory consumption of all EngineBatchLoadTask.
        CONSISTENCY = 7 // Count the memory consumption of all EngineChecksumTask.
    }
Object pointers are no longer saved between each layer, and the values of process and each type are periodically aggregated.

other fix:

In [fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe #13528, I tried to separate the memory that was manually abandoned in the query from the orphan mem tracker. But in the actual test, the accuracy of this part of the memory cannot be guaranteed, so put it back to the orphan mem tracker again.

* [fix](priv) fix meta replay bug when upgrading from 1.1.x to 1.2.x (#14046)

* [Enhancement](Dictionary-codec) update dict once on same segment (#13936)

update dict once on same segment

* [feature](Nereids) support query that group by use alias generated in aggregate output (#14030)

support query having alias in group by list, such as:
SELECT c1 AS a, SUM(c2) FROM t GROUP BY a;

* [thirdpart](lib) Add lock free queue of concurrentqueue (#14045)

* [feature-wip](multi-catalog) fix page index filter bug (#14015)

Fix page index filter not take effect when multiple columns
Co-authored-by: jinzhe <jinzhe@selectdb.com>

* [fix](Nereids) Use simple cost to calculate benefit and avoid unuseless calculation  (#14056)

In GraphSimplifier, we can use simple cost to calculate the benefit.
And only when the best neighbor of the apply step is the processing edge, we need to update recursively.

* [feature](multi-catalog) Support data on s3-compatible oss and support aliyun DLF (#13994)

Support Aliyun DLF
Support data on s3-compatible object storage, such as aliyun oss.
Refactor some interface of catalog, to make it more tidy.
Fix bug that the default text format field delimiter of hive should be \x01
Add a new class PooledHiveMetaStoreClient to wrap the IMetaStoreClient.

* [Bug](Bitmap) fix sub_bitmap calculate wrong result to return null (#13978)

fix sub_bitmap calculate wrong result to return null

* [fix](build) fix compile fail on Segment::open (#14058)

* [regression](Nereids) add back tpch regression test cases (#13826)

1. add back TPC-H regression test cases
2. fix decimal problem on aggregate function sum and agg introduced by #13764 
3. fix memo merge group NPE introduced by #13900

* [enhancement](Nereids) tpch q21 anti and semi join reorder (#14037)

Estimation of anti and semi join needs rework; for now we just let tpch q21 pass.

* [chore](bin) do not set heap limit for tcmalloc until doris does not allocates large unused memory (#13761)

We set a heap limit for tcmalloc to avoid OOM introduced by tcmalloc, which allocates memory for its cache even when a machine has little free memory. However, Doris allocates large amounts of unused memory in some cases, so tcmalloc would throw an OOM exception even when there is a lot of free memory on the machine.

We can set the limit again after we fix that problem.

* [fix](statistics) ColumnStatistics was changed unexpectedly when show stats (#14068)

The logic of show stats would change the internal collected ColumnStat unexpectedly which would cause inaccurate cost and inefficient plan

* [improvement](profile) support ordinary user to get query profile via http api (#14016)

* [Nereids][Improve] infer predicate after push down predicate (#12996)

This PR implements the function of predicate inference

For example:

``` sql
select * from student left join score on student.id = score.sid where score.sid > 1
```
transformed logical plan tree:

                    left join
             /                    \
       filter(sid >1)     filter(id > 1) <---- inferred predicate
         |                           |
      scan                      scan  

See `InferPredicatesTest`  for more cases

 The logic is as follows:
  1. pull up the bottom predicate, then infer additional predicates
    for example:
    select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id
    1. pull up the bottom predicate
       select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1
    2. infer
       select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 and t2.id = 1
    finally transformed sql:
       select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t2.id = 1
  2. put these predicates into `otherJoinConjuncts` , these predicates are processed in the next
    round of predicate push-down


Now only support infer `ComparisonPredicate`.

TODO: We should determine whether `expression` satisfies the condition for replacement
             eg: Satisfy `expression` is non-deterministic

* [fix](keyranges) fix the split error of keyranges (#14049)

fix the split error of keyranges

* use extern template to date_time_add (#13970)

* [feature](information_schema) add `backends` information_schema table (#13086)

* [feature](inverted index)WIP inverted index api: SQL syntax and metadata (#13430)

Introduce a SQL syntax for creating inverted index and related metadata changes.

```
-- create table with INVERTED index 

CREATE TABLE httplogs (
  ts datetime,
  clientip varchar(20),
  request string,
  status smallint,
  size int,
  INDEX idx_size (size) USING INVERTED,
  INDEX idx_status (status) USING INVERTED,
  INDEX idx_clientip (clientip) USING INVERTED PROPERTIES("parser"="none")
)
DUPLICATE KEY(ts)
DISTRIBUTED BY RANDOM BUCKETS 10

-- add an INVERTED index  to a table

CREATE INDEX idx_request ON httplogs(request) USING INVERTED PROPERTIES("parser"="english");
```

* [opt](ssb) Add query hint for the SSB queries (#14089)

* [refactor](new-scan) remove old vectorized scan node (#14029)

* [docs](odbc) fix docs for sqlserver odbc table (#14017)

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

* [enhancement](load) shrink reserved buffer for page builder (#14012) (#14014)

* [enhancement](load) shrink reserved buffer for page builder (#14012)

For a table with hundreds of text-type columns, flushing its memtable may cost huge amounts of memory.
This memory is consumed when initializing the page builders, as 1MB is reserved for each column.
So memory consumption grows in proportion to the column number. Shrinking the reservation may
reduce memory substantially in the load process.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* response to the review

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* Update binary_plain_page.h

* Update binary_dict_page.cpp

* Update binary_plain_page.h

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* [typo](docs)update array type doc #14057

* [fix](JSON) Fail to parse JSONPath (libc++) (#13941)

* [fix](ctas) text column type len = 1 when create table as select (#13906)

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

* [bug](ColumnDecimal)call set_decimalv2_type when cloning ColumnDecimal (#14061)

* call set_decimalv2_type when cloning ColumnDecimal

* clang format

* [fix](Vectorized)fix json_object and json_array function return wrong result on vectorized engine (#13775)

Issue Number: close #13598

* [Compile](join) Boost compiling and linking (#14081)

* [docs](array-type) update the docs to specify how to use array function when import data (#13995)

Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [feature](Nereids) binding slot in order by that not show in project (#14042)

1. binding slot in order by that not show in project, such as:
SELECT c1 FROM t WHERE c2 > 0 ORDER BY c3

2. not check unbound when bind slot reference. Instead, do it in analysis check.

* [improve](Nereids): remove redundant code, add annotation in Memo. (#14083)

* [fix](Nereids) aggregate disassemble generate error output list on GLOBAL phase aggregate (#14079)

We must use localAggregateFunction as the key of globalOutputSMap, because we use the local output exprs to generate the global output in disassembleDistinct.

* [performance-wip] (vectorization) Opt HashJoin Performance  (#12390)

* [fix](compile) fix compile error #14103

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [feature](table-valued-function) Support `desc from s3()` and modify the syntax of tvf (#14047)

This PR does two things:

1. support `desc function s3()`
2. modify the syntax of TVF

* [enhancement](Nereids) use post-order to generate runtime filter in RuntimeFilterGenerator (#13949)

Change the runtime filter generator from pre-order to post-order traversal; this may change the number of generated runtime filters, and the UTs are corrected accordingly.

* [refactor](array) refactor DataTypeArray from_string (#13905)

Refactor DataTypeArray from_string to make it clearer;
support ',' and ']' inside string elements, for example: ['hello,,,', 'world][]'];
support empty elements, such as [,] ==> [0,0]
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [Enhancement][fix](profile)() modify some profiles (#14074)

1. add RemainedDownPredicates
2. fix core dump when _scan_ranges is empty
3. fix invalid memory access on vLiteral's debug_string()
4. enlarge mv test wait time

* [fix](load) fix that load channel failed to be released in time (#14119)

* [typo](docs)add udf doc and optimize udf regression test (#14000)

* [Bug](udf) fix java-udaf process string type error and add some tests (#14106)

* [improvement](join) Share hash table in fragments for broadcast join (#13921)

* [feature](table-valued-function)S3 table valued function supports parquet/orc/json file format #14130

S3 table valued function supports the parquet/orc/json file formats, for example the parquet format.
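
A hedged sketch of such a query; the property names ("uri", "access_key", "secret_key", "format") and the endpoint are assumptions for illustration only:

``` sql
-- read a parquet file directly from object storage via the s3() table-valued function
SELECT *
FROM s3(
    "uri" = "https://bucket.s3.example.com/path/1.parquet",
    "access_key" = "ak",
    "secret_key" = "sk",
    "format" = "parquet"
)
LIMIT 10;
```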

* [Bug](outfile) Fix wrong decimal format for ORC (#14124)

* [fix](nereids) cannot collect decimal column stats (#13961)

When executing ANALYZE TABLE, Doris fails on decimal columns.
The root cause is that the scale in decimalV2 is 9, but 2 in the schema.
There is no need to check the scale for decimalV2, since it is not a floating point type.

* [fix](grouping)the grouping expr should check col name from base table first, then alias (#14077)

* [fix](grouping)the grouping expr should check col name from base table first, then alias

* fix fe ut, the behavior would be same as mysql

* [Fix] add hll param for if function (#12366)

* [Fix] add hll param for if function

* add ut

Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>

* [feature](nereids) let user define right deep tree penalty by session variable (#14040)

It is hard to find a single penalty factor that is proper for all queries, so it is made configurable via a session variable.
The default is 0.7.

* [fix](ctas) fix wrong string column length after executing ctas from external table  (#14090)

* [feature](Nereids): InnerJoinLeftAssociate, InnerJoinRightAssociate and JoinExchange. (#14051)

* [feature](function) add new function uuid() (#14092)
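
A quick usage sketch (the value in the comment is illustrative, not an actual output):

``` sql
-- returns a random UUID string per invocation, e.g. '6ce4766f-6783-4b30-b357-bba1c7600348'
SELECT uuid();
```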

* [enhance](Nereids): add missing hypergraph rule. (#14087)

* [fix](memtracker) Fix scanner thread ending after fragment thread causing mem tracker null pointer #14143

* [Enhancement](runtime-filter) enlarge runtime filter in predicate threshold (#13581)

enlarge runtime filter in predicate threshold

* [feature](Nereids) support circle graph (#14082)

* [fix](schemeChange) fe oom because replicas too many when schema change (#12850)

* [chore][build] add instructions to build version string (#14067)

* [feature-wip](multi-catalog) lazy read for ParquetReader (#13917)

Read the predicate columns first, and use VExprContext (push-down predicates)
to generate the select vector, which is then applied when reading the non-predicate columns.
Data in the non-predicate columns may be skipped by the select vector, so value-decode time can be reduced.
If a whole page can be skipped, decompress time is also reduced.

* [fix](nereids) column stats min/max missing (#14091)

In the result of SHOW COLUMN STATS tbl, the min/max values are not displayed.

* [enhancement](Nereids) analyze check input slots must in child's output (#14107)

* [Improvement](join) Support nested loop outer join (#13965)

* [docs](recover) modify recover doc (#13904)

* [feature-wip](statistic) persistence table statistics into olap table (#13883)

1. Support persisting collected statistics to a pre-built OLAP table named `column_statistics`.
2. Use a much simpler mechanism to collect statistics: all the gauges are collected in a single SQL statement for each partition and then for the whole column, as defined in class `AnalysisJob`.
3. Implement a cache to manage the statistics records in FE.

TODO:

1. Use opentelemetry to monitor the execution time of each job.
2. Format the internal analysis SQL.
3. Split the SQL so that the IN expression's child count does not exceed the FE limit for the generated SQL that deletes expired records.
4. Implement SHOW statements.

* [fix](doc): remove incubator. (#14159)

* [UDF](java udf)  using config to enable java udf instead of macro at compile time (#14062)

* [UDF](java udf) using config to enable java udf instead of macro at compile time

* [enhancement](plugin) import audit logs for slow queries into a separate table (#14100)

* import audit logs for slow queries into a separate table

* [docs](outfile) Add ORC to outfile document (#14153)

* [Bugfix] Fix upgrade from 1.1 coredump (#14163)

When upgrading from 1.1 to master, then rolling back to 1.1, and upgrading to master again, BE will coredump because some rowsets have a schema and some rowsets have none. On the first upgrade from 1.1, BE flushes the schema into all rowsets; after rolling back to 1.1, BE does compaction and creates some new rowsets without a schema. On the second upgrade from 1.1, BE coredumps because some code paths assume that either all or none of the rowsets carry a schema.

* [Improvement](profile) Improve readability for runtime filters in profile string (#14165)

* [Improvement](profile) Improve readability for runtime filters in profile string

* update

* [fix](metric) fix the bug of not updating the query latency metric #14172

* [Docs](README)Update the README.md (#14156)

Add the new release in Readme.md

* [fix](decimal) change log fatal to log warning to avoid code dump on decimal type (#14150)

* [fix](cast)fix cast to char(N) error (#14168)

* [chore](build) Optimize the compilation time (#14170)

Currently, it takes too much time to build BE from source in workflow environments (P0/P1), which affects the efficiency of daily development.

We can measure the time by executing the following command:

time EXTRA_CXX_FLAGS='-O3' BUILD_TYPE=ASAN ./build.sh --be --fe --clean -j "$(nproc)"

This PR optimizes the compilation time by exploiting the following methods:

1. Reduce the codegen by removing some useless std::visit.
2. Disable the optimization for some template functions which are instantiated by std::visit conditionally (except for the RELEASE build).

* [feature](Nereids) prune runtime filters which cannot reduce the tuple number of probe table (#13990)

1. Add a post processor: the runtime filter pruner
Doris generates RFs (runtime filters) on the Join node to reduce the probe table at the scan stage, but some RFs have no effect because their selectivity is 100%. This PR removes them.
An RF is effective if
a. the build column value range covers part of that of the probe column, OR
b. the build column ndv is less than that of the probe column, OR
c. the build column's ColumnStats.selectivity < 1, OR
d. the build column is reduced by another RF which satisfies the above criteria.

2. Explain graph
a. add RF info to the Join and Scan nodes
b. add a predicate count to the Scan node

3. Rename session variable
rename `enable_remove_no_conjuncts_runtime_filter_policy` to `enable_runtime_filter_prune` (see the usage note after this list)

4. Fix a min/max column stats derivation bug
for `select max(A) as X from T group by B`, X.min is A.min, not A.max
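
A minimal usage note for the renamed session variable from item 3 above:

``` sql
-- turn runtime filter pruning on for the current session
SET enable_runtime_filter_prune = true;
```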

* [feature](Nereids) replace order by keys by child output if possible (#14108)

To support queries like:
SELECT c1 + 1 as a, sum(c2) FROM t GROUP BY c1 + 1 ORDER BY c1 + 1

After the rewrite, the plan is equivalent to:
SELECT c1 + 1 as a, sum(c2) FROM t GROUP BY c1 + 1 ORDER BY a

* [Feature](Sequence) Support sequence_match and sequence_count functions (#13785)
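
A hedged usage sketch for these functions, assuming a hypothetical `events(dt DATETIME, action VARCHAR)` table; the `(?N)` placeholders refer to the condition arguments in order:

``` sql
-- does a 'login' event followed by a 'purchase' event occur, and how many times?
SELECT
    sequence_match('(?1)(?2)', dt, action = 'login', action = 'purchase') AS matched,
    sequence_count('(?1)(?2)', dt, action = 'login', action = 'purchase') AS match_count
FROM events;
```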

* [refactor](Nereids) remove DecimalType, use DecimalV2Type instead (#14166)

* [Bug](runtimefilter) Fix concurrent bug in runtime filter #14177

For a runtime filter, signal may be called by a thread different from the awaiting thread, so there is a potential race on the variable is_ready.

* [enhancement](thirdparty) support create stripe reader by column names (#14184)

The ORC NextStripeReader previously only supported reading columns by indices, but it is hard to get column indices for complex types.
We patch the ORC adapter to support reading columns by column names.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [Opt](exec) prevent the scan key split whole range (#14088)

prevent the scan key split whole range

* [feature](docs) add docs for SHOW-CATALOG-RECYCLE-BIN (#14185)

* [Bug](nljoin) Keep compatibility for nljoin (#14182)

* [Enhancement](Nerieds) Support numbers TableValuedFunction and some bitmap/hll aggregate function (#14169)

## Problem summary
This PR supports:
1. the `numbers` TableValuedFunction for Nereids tests, e.g. `select * from numbers(number = 10, backend_num = 1)` (see the sketch after this list)
2. bitmap/hll aggregate functions
3. finding variable-length functions in the function registry, e.g. `coalesce`
4. a fix for a bug where printing the Nereids trace threw an exception because RewriteRule was used in ApplyRuleJob (e.g. `AggregateDisassemble`), introduced by #13957
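
A hedged sketch combining points 1 and 2; the output column name `number` is an assumption about the TVF's schema:

``` sql
-- count distinct values produced by the numbers() TVF via a bitmap aggregate
SELECT bitmap_union_count(to_bitmap(number))
FROM numbers(number = 10, backend_num = 1);
```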

* [test](array function)add array_range function test (#14123)

* add array_range function test

* add array_range function test

* [enhancement](load) Increase batch size of node channel to improve import performance (#13912)

* [chore](cmake) Fix wrong statements (#14187)

* [feature](running_difference) support running_difference function (#13737)
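
A minimal sketch, assuming a hypothetical `metrics(ts DATETIME, value INT)` table; the result depends on row order, so an ordered subquery is used for illustration:

``` sql
-- difference between each row's value and the previous row's value
SELECT ts, value, running_difference(value) AS delta
FROM (SELECT ts, value FROM metrics ORDER BY ts) AS ordered_metrics;
```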

* [feature-array](array-type) Add array function array_with_constant (#14115)

Returns an array of length num filled with the given constant.

```
mysql> select array_with_constant(4, 1223);
+------------------------------+
| array_with_constant(4, 1223) |
+------------------------------+
| [1223, 1223, 1223, 1223]     |
+------------------------------+
1 row in set (0.01 sec)
```
co-authored-by @eldenmoon

* [fix](chore) read max_map_count from proc and make notice much more understandable (#14137)

Some users cannot use sysctl as non-root on Linux, so we read max_map_count from /proc.
Notify users that they can change max_map_count as root.

* [regression-test] sleep longer to avoid error (#14186)

* [typo](comment) Fix a lot of spell errors in be comments (#14208)

fix typos.

* [test](jdbc external table) add jdbc regression test case (#14086)

* [test](jdbc postgresql case)add jdbc test case for postgresql  (#14162)

* [fix](scankey) fix extended scan key errors. (#14200)

Issue Number: close #14199

* [feature](partition) support new create partition syntax (#13772)

Create partitions use :
```
PARTITION BY RANGE(event_day)(
        FROM ("2000-11-14") TO ("2021-11-14") INTERVAL 1 YEAR,
        FROM ("2021-11-14") TO ("2022-11-14") INTERVAL 1 MONTH,
        FROM ("2022-11-14") TO ("2023-01-03") INTERVAL 1 WEEK,
        FROM ("2023-01-03") TO ("2023-01-14") INTERVAL 1 DAY,
        PARTITION p_20230114 VALUES [('2023-01-14'), ('2023-01-15'))
)

PARTITION BY RANGE(event_time)(
        FROM ("2023-01-03 12") TO ("2023-01-14 22") INTERVAL 1 HOUR
)
```
This can create yearly/monthly/weekly/daily/hourly date partitions in a batch,
and it is also compatible with the single-partition method.

* [improvement](load) reduce memory in batch for small load channels (#14214)

* [enhancement](memory) Support try catch bad alloc (#14135)

* [improvement](load) release load channel actively when error occurs (#14218)

* update (#14215)

* [hotfix](memtracker) Fix expired `DCHECK(_limit != -1);` and segment_meta_mem_tracker inelegant end (#14223)

* [fix](schema) Release memory of TabletSchemaPB in RowsetMetaPB #13993

* [fix](ctas) use json_object in CTAS get wrong result (#14173)

* [fix](ctas) use json_object in CTAS get wrong result

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

* [enhancement](be)close ExecNode ASAP to release resource earlier (#14203)

* [fix](compaction) segcompaction coredump if the rowset starts with a big segment (#14174) (#14176)

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* [feature](remote)Only query can use local cache when reading remote files. (#13865)

When calling select on remote files, download cache files to local disk.
When calling alter table on remote files, read files directly from remote storage. So even if the tablet is very large, it will not take up too much local disk space by creating local cache files.

* [chore](build) Split the compilation units to build them in parallel (#14232)

* [enhancement](Nereids) add output set and output exprid set cache (#14151)

* [BugFix](file cache) don't clean clone dir when doing _gc_unused_file_caches (#14194)

* use another file_size overload for noexcept

* don't gc clone dir

* use better status

* [improvement](log) print info of error replicas (#14220)

* (fix)(multi-catalog)(es) Fix error result because not used fields_context (#14229)

Fix incorrect results caused by fields_context not being used.

* [feature](Nereids) add circle detector and avoid overlap (#14164)

* [test](multi-catalog)Regression test for external hive parquet table (#13611)

* [feature-wip](multi-catalog) Support hive partition cache (#14134)

* [multi-catalog](fix) the eof of lazy read columns may be not equal to the eof of predicate columns (#14212)

Fix three bugs:
1. The EOF of lazy read columns may not be equal to the EOF of predicate columns.
(For example: if the predicate column has 3 pages with 400 rows each, but the last page
is filtered by the page index, then with batch_size=992 the EOF of the predicate column is true.
However, we should set batch_size=800 for the lazy read columns, so their EOF may be false.)
2. The array column does not count the number of nulls
3. Generate wrong NullMap for array column

* [temp](statistics) disable statistic tables

* [feature](selectdb-cloud) Fix txn manager conflict with branch-1.2-lts (#1118)

* [feature](selectdb-cloud) Fix sql_parser.cup

* [fix] fix conflict in SetOperationStmt.java (#1125)

* [feature](selectdb-cloud) Move some files from io to cloud (#1129)

* [feature](selectdb-cloud) Modify header file and macro some file (#1133)

* [feature](selectdb-cloud) Modify header file and macro some file

* tmp

* Fix FE implict merge conflict

* [feature](selectdb-cloud) remove master file cache (#1137)

* [chore-fix-merge](dynamic-table) fix some code conflicts about dynamic table

* Fix memtracker 1.2 conflict (#1147)

* [chore-fix-merge](selectdb-cloud) Fix write path conflicts (#1142)

* replace FileSystemPtr to FileSystemSPtr
* unify create_rowset_writer
* remove cache_path in Segment::open

* Fix some compilation error due to merge

* [chore-fix-compile](topn-opt) fix header file circular reference (#1162)

* [feature](selectdb-cloud) Fix conflict of blocking_priority_queue (#1171)

* [feature](selectdb-cloud) Fix bug when merging blocking_priority_queue (#1174)

* [chore-fix-merge](selectdb-cloud)  Fix conflict in BetaRowsetWriter (#1179)

* Fix compile error FileSystemSPtr and schema_change.cpp

* [feature](selectdb-cloud) rm io_ctx (#1187)

* [chore-fix-merge](selectdb-cloud) Fix FileSystem related codes (#1204)

* [chore-fix-merge](selectdb-cloud)  Make FileSystem ctor no public to avoid stack-allocate or make unique (#1207)

* [feature](selectdb-cloud) Fix compilation error caused by merge

        modified:   be/src/cloud/cloud_base_compaction.cpp
        modified:   be/src/cloud/io/local_file_system.cpp
        modified:   be/src/cloud/io/local_file_system.h
        modified:   be/src/cloud/io/s3_file_system.h
        modified:   be/src/cloud/olap/beta_rowset_writer.cpp
        modified:   be/src/cloud/olap/olap_server.cpp
        modified:   be/src/cloud/olap/segment.cpp
        modified:   be/src/olap/data_dir.cpp
        modified:   be/src/olap/rowset/segment_v2/segment_iterator.cpp
        modified:   be/src/olap/task/engine_alter_tablet_task.cpp
        modified:   be/src/runtime/exec_env_init.cpp
        modified:   be/src/runtime/fragment_mgr.cpp
        modified:   be/src/service/internal_service.cpp
        modified:   be/src/vec/common/sort/vsort_exec_exprs.h
        modified:   be/src/vec/exec/scan/new_olap_scan_node.h
        modified:   be/src/vec/exprs/vectorized_fn_call.cpp
        modified:   be/src/vec/functions/match.cpp

* [chore-fix-merge](selectdb-cloud) Fix inverted index cache limit related exec_env_init.cpp (#1228)

* [chore-fix-merge](topn-rpc-service) add `be_exec_version` to fetch rpc for compability (#1229)

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: TengJianPing <18241664+jacktengg@users.noreply.github.com>
Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
Co-authored-by: minghong <englefly@gmail.com>
Co-authored-by: qiye <jianliang5669@gmail.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: wxy <dut.xiangyu@gmail.com>
Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>
Co-authored-by: caoliang-web <71004656+caoliang-web@users.noreply.github.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: luozenglin <37725793+luozenglin@users.noreply.github.com>
Co-authored-by: Adonis Ling <adonis0147@gmail.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
Co-authored-by: shee <13843187+qzsee@users.noreply.github.com>
Co-authored-by: 924060929 <924060929@qq.com>
Co-authored-by: ZenoYang <cookie.yz@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: mch_ucchi <41606806+sohardforaname@users.noreply.github.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Fy <fuyu0824@pku.edu.cn>
Co-authored-by: gnehil <adamlee489@gmail.com>
Co-authored-by: Hong Liu <844981280@qq.com>
Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: carlvinhust2012 <huchenghappy@126.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: Kang <kxiao.tiger@gmail.com>
Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: zhannngchen <48427519+zhannngchen@users.noreply.github.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: yinzhijian <373141588@qq.com>
Co-authored-by: zhengyu <freeman.zhang1992@gmail.com>
Co-authored-by: lihaijian <bigmudhaijian@gmail.com>
Co-authored-by: Liqf <109049295+LemonLiTree@users.noreply.github.com>
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: lihangyu <15605149486@163.com>
Co-authored-by: Yiliang Qiu <68439848+qqIsAProgrammer@users.noreply.github.com>
Co-authored-by: yiliang qiu <yiliang.qiu@qq.com>
Co-authored-by: zhoumengyks <111965739+zhoumengyks@users.noreply.github.com>
Co-authored-by: Wanghuan <imnu2054wh@126.com>
Co-authored-by: zy-kkk <zhongykk@qq.com>
Co-authored-by: 谢健 <jianxie0@gmail.com>
Co-authored-by: TaoZex <45089228+TaoZex@users.noreply.github.com>
Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
Co-authored-by: minghong <minghong.zhou@163.com>
Co-authored-by: Kikyou1997 <33112463+Kikyou1997@users.noreply.github.com>
Co-authored-by: ChPi <chjie93@gmail.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
Co-authored-by: WenYao <729673078@qq.com>
Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>
Co-authored-by: yongjinhou <109586248+yongjinhou@users.noreply.github.com>
Co-authored-by: Luwei <814383175@qq.com>
Co-authored-by: Luzhijing <82810928+luzhijing@users.noreply.github.com>
Co-authored-by: abmdocrt <Yukang.Lian2022@gmail.com>
Co-authored-by: Yixi Zhang <83794882+ZhangYiXi-dev@users.noreply.github.com>
Co-authored-by: lsy3993 <110876560+lsy3993@users.noreply.github.com>
Co-authored-by: catpineapple <42031973+catpineapple@users.noreply.github.com>
Co-authored-by: plat1ko <platonekosama@gmail.com>
Co-authored-by: pengxiangyu <diablowcg@163.com>
Co-authored-by: Stalary <stalary@163.com>
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
Co-authored-by: YueW <45946325+Tanya-W@users.noreply.github.com>
Co-authored-by: Kidd <107781942+k-i-d-d@users.noreply.github.com>
Co-authored-by: Xiaocc <598887962@qq.com>
luwei16 added a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
…-dev (561fddc 20221228) (apache#1304)

```
                              20211227      20221228
                             db04150a8d    cd65d15ede
                                  v            v
selectdb-cloud-release-2.0  --o---.-----o------o-----o--o------------------
                                  .             \
                                  .              \
selectdb-cloud-release-2.1  --o---o               \
                                   \               \
                                    \___________    \
                                                \    \  
selectdb-cloud-merge-2.0-2.1(tmp)                o----o---o
                                                /          \
selectdb-cloud-dev          ----o-----o--------o-----o--o---o---------------
                                               ^
                                           561fddc
                                            20221228
```


* [feature](selectdb-cloud) Fix file cache metrics nullptr error (apache#1060)

* [feature](selectdb-cloud) Fix abort copy when -235 (apache#1039)

* [feature](selectdb-cloud) Replace libfdb_c.so to make it compatible with different OS (apache#925)

* [feature](selectdb-cloud) Optimize RPC retry in cloud_meta_mgr (apache#1027)

* Optimize RETRY_RPC in cloud_meta_mgr
* Add random sleep for RETRY_RPC
* Add a simple backoff strategy for rpc retry

* [feature](selectdb-cloud) Copy into support select by column name (apache#1055)

* Copy into support select by column name
* Fix broker load core dump due to mis-match of number of columns between remote and schema

* [feature](selectdb-cloud) Fix test_dup_mv_schema_change case (apache#1022)

* [feature](selectdb-cloud) Make the broker execute on the specified cluster (apache#1043)

* Make the broker execute on the specified cluster
* Pass the cluster parameter

* [feature](selectdb-cloud) Support concurrent BaseCompaction and CumuCompaction on a tablet (apache#1059)

* [feature](selectdb-cloud) Reduce meta-service log (apache#1067)

* Quote string in the tagged log
* Add template to enable customized log for RPC requests

* [feature](selectdb-cloud) Use read-only txn + read-write txn for `commit_txn` (apache#1065)

* [feature](selectdb-cloud) Pick "[fix](load) fix that load channel failed to be released in time (apache#14119)"

commit 3690c4d
Author: Xin Liao <liaoxinbit@126.com>
Date:   Wed Nov 9 22:38:08 2022 +0800
    [fix](load) fix that load channel failed to be released in time (apache#14119)

* [feature](selectdb-cloud) Add compaction profile log (apache#1072)

* [feature](selectdb-cloud) Fix abort txn fail when copy job `getAllFileStatus` exception (apache#1066)

* Revert "[feature](selectdb-cloud) Copy into support select by column name (apache#1055)"

This reverts commit f1a543e.

* [feature](selectdb-cloud) Pick"[fix](metric) fix the bug of not updating the query latency metric apache#14172 (apache#1076)"

* [feature](selectdb-cloud) Distinguish KV_TXN_COMMIT_ERR or KV_TXN_CONFLICT while commit failed (apache#1082)

* [feature](selectdb-cloud) Support configuring base compaction concurrency (apache#1080)

* [feature](selectdb-cloud) Enhance start.sh/stop.sh for selectdb_cloud (apache#1079)

* [feature](selectdb-cloud) Add smoke testing (apache#1056)

Add smoke tests: 1. upload/query HTTP data API; 2. internal and external stages; 3. select and insert.

* [feature](selectdb-cloud) Disable admin stmt in cloud mode (apache#1064)

Disable the following stmt.

* AdminRebalanceDiskStmt/AdminCancelRebalanceDiskStmt
* AdminRepairTableStmt/AdminCancelRepairTableStmt
* AdminCheckTabletsStmt
* AdminCleanTrashStmt
* AdminCompactTableStmt
* AdminCopyTabletStmt
* AdminDiagnoseTabletStmt
* AdminSetConfigStmt
* AdminSetReplicaStatusStmt
* AdminShowConfigStmt
* AdminShowReplicaDistributionStmt
* AdminShowReplicaStatusStmt
* AdminShowTabletStorageFormatStmt

Leaving a backdoor for the user root:

* AdminSetConfigStmt
* AdminShowConfigStmt
* AdminShowReplicaDistributionStmt
* AdminShowReplicaStatusStmt
* AdminDiagnoseTabletStmt

* [feature](selectdb-cloud) Update copy into doc (apache#1063)

* [feature](selectdb-cloud) Fix AdminSetConfigStmt cannot work with root (apache#1085)

* [feature](selectdb-cloud) Fix userid null lead to checkpoint error (apache#1083)

* [feature](selectdb-cloud) Support controlling the space used for upload (apache#1091)

* [feature](selectdb-cloud) Pick "[fix](sequence) fix that update table core dump with sequence column (apache#13847)" (apache#1092)

* [Fix](memory-leak) Fix boost::stacktrace memory leak (1097)

* [Fix](selectdb-cloud) Several picks to fix memtracker  (apache#1087)

* [enhancement](memtracker)  Add independent and unique scanner mem tracker for each query (apache#13262)

* [enhancement](memory) Print memory usage log when memory allocation fails (apache#13301)

* [enhancement](memtracker) Print query memory usage log every second when `memory_verbose_track` is enabled (apache#13302)

* [fix](memory) Fix USE_JEMALLOC=true UBSAN compilation error apache#13398

* [enhancement](memtracker) Fix bthread local consume mem tracker (apache#13368)

    Previously, bthread_getspecific was called every time bthread local was used. In the test at apache#10823, it was found
    that frequent calls to bthread_getspecific had performance problems.

    So a cache was implemented in pthread local based on the btls key, but the btls key cannot correctly sense bthread switching.

    So the cache is instead keyed by the bthread id obtained from bthread_self.

* [enhancement](memtracker) Fix brpc causing query mem tracker to be inaccurate apache#13401

* [fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe apache#13528

* [enhancement](memtracker) Fix Brpc mem count and refactored thread context macro  (apache#13469)

* [fix](memtracker) Fix the usage of bthread mem tracker  (apache#13708)

    bthread context init has a performance cost; it is temporarily removed first and will be completely refactored in apache#13585.

* [enhancement](memtracker) Refactor load channel + memtable mem tracker (apache#13795)

* [fix](load) Fix load channel mgr lock (apache#13960)

    hot fix load channel mgr lock

* [fix](memtracker) Fix DCHECK !std::count(_consumer_tracker_stack.begin(), _consumer_tracker_stack.end(), tracker)

* [tempfix][memtracker] wait pick 0b945fe

Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>

* [feature](selectdb-cloud) Add more recycler case (apache#1094)

* [feature](selectdb-cloud) Pick "[improvement](load) some simple optimization for reduce load memory policy (apache#14215)" (apache#1096)

* [feature](selectdb-cloud) Reduce unnecessary get rowset rpc when prepare compaction (apache#1099)

* [feature](selectdb-cloud) Pick "[improvement](load) reduce memory in batch for small load channels (apache#14214)" (apache#1100)

* [feature](selectdb-cloud) Pick "[improvement](load) release load channel actively when error occurs (apache#14218)" (apache#1102)

* [feature](selectdb-cloud) Print build info of ms/recycler to stdout when launch (apache#1105)

* [feature](selectdb-cloud) copy into support select by column name and load with partial columns (apache#1104)

e.g.
```
COPY INTO test_table FROM (SELECT col1, col2, col3 FROM @ext_stage('1.parquet'))

COPY INTO test_table (id, name) FROM (SELECT col1, col2 FROM @ext_stage('1.parquet'))
```

* [fix](selectdb-cloud) Pick "[Fix](array-type) bugfix for array column with delete condition (apache#13361)" (apache#1109)

Fix for SQL with array column:
delete from tbl where c_array is null;

more info please refer to apache#13360

Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [feature](selectdb-cloud) Copy into support force (apache#1081)

* [feature](selectdb-cloud) Add abort txn, abort tablet job http api (apache#1101)

Abort load txn by txn_id:
```
curl "{meta_sevice_ip}:{brpc_port}/MetaService/http/abort_txn?token=greedisgood9999" -d '{
"cloud_unique_id": string,
"txn_id": int64
}'
```

Abort load txn by db_id and label:
```
curl "{meta_sevice_ip}:{brpc_port}/MetaService/http/abort_txn?token=greedisgood9999" -d '{
"cloud_unique_id": string,
"db_id": int64,
"label": string
}'
```

Only support abort compaction job currently:
```
curl "{meta_sevice_ip}:{brpc_port}/MetaService/http/abort_tablet_job?token=greedisgood9999" -d '{
"cloud_unique_id": string,
"job" : {
  "idx": {"tablet_id": int64},
  "compaction": [{"id": string}]
}
}'
```

* [feature](selectdb-cloud) Fix external stage data for smoke test and retry to create stage (apache#1119)

* [feature](selectdb-cloud) Fix data leaks when truncating table (apache#1114)

* Drop cloud partition when truncating table
* Add retry strategy for dropCloudMaterializedIndex

* [feature](selectdb-cloud) Fix missing library when compiling unit test (apache#1128)

* [feature](selectdb-cloud) Validate the object storage when create stage (apache#1115)

* [feature](selectdb-cloud) Fix incorrectly setting cumulative point when committing base compaction (apache#1127)

* [feature](selectdb-cloud) Fix missing lease when preparing cumulative compaction (apache#1131)

* [feature](selectdb-cloud) Fix unbalanced tablet distribution (apache#1121)

* Fix the bug of unbalanced tablet distribution
* Use replica index hash to BE

* [feature](selectdb-cloud) Fix core dump when get tablets info by BE web page (apache#1113)

* [feature](selectdb-cloud) Fix start_fe.sh --version (apache#1106)

* [feature](selectdb-cloud) Print tablet stats before and after compaction (apache#1132)

* Log num rowsets before and after compaction
* Print tablet stats after committing compaction

* [feature](selectdb-cloud) Allow root user execute AlterSystemStmt (apache#1143)

* [feature](selectdb-cloud) Fix BE UT (apache#1141)

* [feature](selectdb-cloud) Select BE for the first bucket of every partition randomly (apache#1136)

* [feature](selectdb-cloud) Fix query_limit int -> int64 (apache#1154)

* [feature](selectdb-cloud) Add more cloud recycler case (apache#1116)

* add more cloud recycler case
* modify cloud recycler case dateset from sf0.1 to sf1

* [feature](selectdb-cloud) Fix misuse of aws transfer which may delete tmp file prematurely (apache#1160)

* [feature](selectdb-cloud) Add test for copy into http data api and userId (apache#1044)

* Add test for copy into http data api and userId
* Add external and internal stage cross use regression case.

* [feature](selectdb-cloud)  Pass the cloud compaction regression test (apache#1173)

* [feature](selectdb-cloud) Modify max_bytes_per_broker_scanner default value to 150G (apache#1184)

* [feature](selectdb-cloud) Fix missing lock when calling Tablet::delete_predicates (apache#1182)

* [improvement](config)change default remote_fragment_exec_timeout_ms to 30 seconds

* [improvement](config) change default value of broker_load_default_timeout_second to 12 hours

* [feature](selectdb-cloud) Fix replay copy into (apache#1167)

* Add stage ddl regression
* fix replay copy into
* remove unused log
* fix user name

* [feature](selectdb-cloud) Fix FE --version option not work after fe started (apache#1161)

* [feature](selectdb-cloud) BE accesses object store using HTTP (apache#1111)

* [feature](selectdb-cloud) Refactor recycle copy jobs (apache#1062)

* [fix](FE) Pick fix from doris master (apache#1177) (apache#1178)

Commit: 53e5f39
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Mon Oct 31 2022 10:19:32 GMT+0800 (China Standard Time)
fix result exprs should be substituted in the same way as agg exprs (apache#13744)

Commit: a4a9912
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Thu Nov 03 2022 10:26:59 GMT+0800 (China Standard Time)
fix group by constant value bug (apache#13827)

Commit: 84b969a
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Thu Nov 10 2022 11:10:42 GMT+0800 (China Standard Time)
fix the grouping expr should check col name from base table first, then alias (apache#14077)

Commit: ae4f4b9
Author: starocean999 <40539150+starocean999@users.noreply.github.com>
Committer: GitHub <noreply@github.com>
Date: Thu Nov 24 2022 10:31:58 GMT+0800 (China Standard Time)
fix having clause should use column name first then alias (apache#14408)

* [feature](selectdb-cloud) Deal with getNextTransactionId rpc exception (apache#1181)

Before the fix, getNextTransactionId would return -1 if there was an RPC exception,
which could cause a schema change and the previous load task to execute in parallel unexpectedly.

* [feature](selectdb-cloud) Throw exception for unsupported operations in CloudGlobalTransactionMgr (apache#1180)

* [improvement](load) Add more log on RPC error (apache#1183)

* [feature](selectdb-cloud) Add copy_into case(json, parquet, orc) and tpch_sf1 to smoke test (apache#1140)

* [feature](selectdb-cloud) Recycle dropped stage (apache#1071)

* log s3 response code
* add log in S3Accessor::delete_objects_by_prefix
* Fix show copy
* remove empty line

* [feature](selectdb-cloud) Support bthread for new scanner (apache#1117)

* Support bthread for new scanner
* Keep the number of remote threads same as local threads

* [feature](selectdb-cloud) Implement self-explained cloud unique id for instance id searching (apache#1089)

1. Implement self-explained cloud unique id for instance id searching
2. Fix register core when metaservice start error
3. Fix drop_instance not set mtime
4. Add HTTP API to get instance info

```
curl "127.0.0.1:5008/MetaService/http/get_instance?token=greedisgood9999&cloud_unique_id=regression-cloud-unique-id-fe-1"

curl "127.0.0.1:5008/MetaService/http/get_instance?token=greedisgood9999&cloud_unique_id=1:regression_instance0:regression-cloud-unique-id-fe-1"

curl "127.0.0.1:5008/MetaService/http/get_instance?token=greedisgood9999&instance_id=regression_instance0"
```

* [improvement](memory) simplify memory config related to tcmalloc  and add gc (apache#1191)

* [improvement](memory) simplify memory config related to tcmalloc

There are several configs related to tcmalloc, and users do not know how to configure them. Actually users just want two modes, performance or compact: in performance mode, users want Doris to run queries and loads quickly, while in compact mode, users want Doris to run with less memory usage.

If we want to configure tcmalloc individually, we can use the environment variables supported by tcmalloc.

* [improvement](tcmalloc) add moderate mode and avoid oom  with a lot of cache (apache#14374)

Call ReleaseToSystem aggressively when there is little free memory.

* [feature](selectdb-cloud) Pick "[fix](hashjoin) fix coredump of hash join in ubsan build apache#13479" (apache#1190)

commit b5cd167
Author: TengJianPing <18241664+jacktengg@users.noreply.github.com>
Date:   Thu Oct 20 10:16:19 2022 +0800
    [fix](hashjoin) fix coredump of hash join in ubsan build (apache#13479)

* [feature](selectdb-cloud) Support close FileWriter without forcing sync data to storage medium (apache#1134)

* Trace accumulated time
* Support close FileWriter without forcing sync data to storage medium
* Avoid trace overhead when disable trace

* [feature](selectdb-cloud) Pick "[BugFix](function) fix reverse function dynamic buffer overflow due to illegal character apache#13671" (apache#1146)

* pick [opt](exec) Replace get_utf8_byte_length function by array (apache#13664)
* pick [BugFix](function) fix reverse function dynamic buffer overflow due to illegal character apache#13671
Co-authored-by: HappenLee <happenlee@hotmail.com>

* [feature](selectdb-cloud) Pick "[fix](fe) Inconsistent behavior for string comparison in FE and BE (apache#13604)" (apache#1150)

Co-authored-by: xueweizhang <zxw520blue1@163.com>

* [feature](selectdb-cloud) Copy into support delete_on condition (apache#1148)

* [feature](selectdb-cloud) Pick "[fix](agg)fix group by constant value bug (apache#13827)" (apache#1152)

* [fix](agg)fix group by constant value bug

* keep only one const grouping exprs if no agg exprs

Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>

* [feature](selectdb-cloud) Pick "[fix](join)the build and probe expr should be calculated before converting input block to nullable (apache#13436)" (apache#1155)

* [fix](join)the build and probe expr should be calculated before converting input block to nullable

* remove_nullable can be called on const column

Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>

* [feature](selectdb-cloud) Pick "[Bug](predicate) fix core dump on bool type runtime filter (apache#13417)" (apache#1156)

fix core dump on bool type runtime filter

Co-authored-by: Pxl <pxl290@qq.com>

* [feature](selectdb-cloud) Pick "[Fix](agg) fix bitmap agg core dump when phmap pointer assert alignment (apache#13381)" (apache#1157)

Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>

* [feature](selectdb-cloud) Pick "[Bug](function) fix core dump on case when have 1000 condition apache#13315" (apache#1158)

Co-authored-by: Pxl <pxl290@qq.com>

* [feature](selectdb-cloud) Pick "[fix](sort)the sort expr nullable info is wrong in some case (apache#12003)"

* [feature](selectdb-cloud) Pick "[Improvement](decimal) print decimal according to the real precision and scale (apache#13437)"

* [feature](selectdb-cloud) Pick "[bugfix](VecDateTimeValue) eat the value of microsecond in function from_date_format_str (apache#13446)"

* [bugfix](VecDateTimeValue) eat the value of microsecond in function from_date_format_str

* add sql based regression test

Co-authored-by: xiaojunjie <xiaojunjie@baidu.com>

* [feature](selectdb-cloud) Allow ShowProcesslistStmt for normal user (apache#1153)

* [feature](selectdb-cloud) tcmalloc gc does not work in somecases (apache#1202)

* [feature](selectdb-cloud) show data stmt supports db level stats and add metrics for table data size (apache#1145)

* [feature](selectdb-cloud) Fix bug in calculating number of available threads for base compaction (apache#1203)

* [feature](selectdb-cloud) Fix unexpected remaining cluster ids on observer when dropping cluster (apache#1194)

We don't have a `dropCluster` RPC; all clusters are built from tags in the backends info.

Previously, the FE master dropped a cluster by counting the clusters retrieved from the meta-service, while observers updated the maps
 `clusterIdToBackend` and `clusterNameToId` by replaying backend node operations, which led to inconsistency between the FE master and FE observers.

We can treat empty clusters as dropped clusters to keep consistency.

Check <https://selectdb.feishu.cn/wiki/wikcnqI6HfD5mw8kHoGD5DqDxOe> for more info.

* [feature](selectdb-cloud) Bump version to 2.0.13

* [opt](tcmalloc) Optimize policy of tcmalloc gc (apache#1214)

Release memory when memory pressure is above the pressure limit and
keep at least 2% of memory as tcmalloc cache.

* [feature](selectdb-cloud) Fix some bugs of cloud cluster (apache#1213)

1. Fix executing load across multiple clusters.
2. Fix `use @cluster` on an FE observer.
3. Fix forwarding without a cloud cluster; we now set the cloud cluster when `use cluster` is executed on an observer.

* [fix](tcmalloc) Do not release cache aggressively when rss is low (apache#1216)

* [fix](tcmalloc) Fix negative to_free_bytes due to physical_limit (apache#1217)

* [feature](selectdb-cloud) Fix old cluster information left in Context (apache#1220)

* [feature](selectdb-cloud)  Add multi cluster regression case (apache#1226)

* [feature](selectdb-cloud) Fix too many obs client log (apache#1227)

* [fix](memory) Fix memory leak by calling boost::stacktrace (apache#14269) (incomplete pick) (apache#1210)

boost::stacktrace::stacktrace() has a memory leak, so use glog's internal function to print the stacktrace instead.
The reason for the memory leak in boost::stacktrace is that state is saved in each thread's thread-local storage but never actively released. Testing found that each thread leaked about 100M after calling boost::stacktrace.
refer to:
boostorg/stacktrace#118
boostorg/stacktrace#111

Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>

* [feature](selectdb-cloud) Check md5sum of libfdb.xz (apache#1163)

* [feature](selectdb-cloud) Add multi cluster regression case (apache#1231)

* add multi cluster regression case
* refine code of multi cluster regression test

* [fix](memtracker) Fix segment_meta_mem_tracker pick error (stacktrace) (apache#1237)

* [feature](selectdb-cloud) Fix and improve compaction trace (apache#1233)

* [feature](selectdb-cloud) Support cloud cluster in select hints (apache#984)

e.g.
```
SELECT /*+ SET_VAR(cloud_cluster = ${cluster_name}) */ * from table
```

* [feature](selectdb-cloud) Fix load parquet coredump (apache#1238)

* [feature](selectdb-cloud) Improve FE cluster metrics for monitoring (apache#1232)

* [feature](selectdb-cloud) Add multi cluster async copy into regression case (apache#1242)

* add multi cluster regression case
* refine code of multi cluster regression test
* Add multi cluster async copy into regression case

* [feature](selectdb-cloud) Add error url regression case (apache#1246)

* [feature](selectdb-cloud) Upgrade mariadb client version (apache#1240)

This change _may_ fix "Failed to execute sql: java.lang.ClassCastException:  java.util.LinkedHashMap$Entry cannot be cast to java.util.HashMap$TreeNode"

* [feature](selectdb-cloud) Fix replay copy job and fail msg (apache#1239)

* [feature](selectdb-cloud) Fix improper number of input rowsets of cumulative compaction (apache#1235)

Remove the logic that returns input rowsets directly if the total size is larger than the promotion size in cumulative compaction policy.

* Pick "[Feature](runtime-filter) add runtime filter breaking change adapt apache#13246" (apache#1221)

This commit fixes TPC-DS q85.

* [feature](selectdb-cloud) Update http api doc (apache#1230)

* [feature](selectdb-cloud) Optimize count/max/min query by caching the index info when write (apache#1222)

* [feature](selectdb-cloud) Fix the codedump about fragment_executor double prepare (apache#1249)

* [feature](selectdb-cloud) Clean copy jobs by num (apache#1219)

* [feature](selectdb-cloud) Fix is_same_v failed bug in begin_rpc (apache#1250)

* [feature](selectdb-cloud) Change delete logic of fdbbackup (apache#1248)

* [feature](selectdb-cloud) Fix misuse of aws transfer which may delete tmp file prematurely (apache#1159)

* Fix misusage of aws transfer manager
* Share TransferManager in a S3FileSystem
* Fix uploading incorrect data when opening file failed
* Add ut for uploading to s3

* [feature](selectdb-cloud) Bump version to 2.0.14 (apache#1255)

* [feature](selectdb-cloud) Improve copy into with delete on for json/parquet/orc (apache#1257)

* [feature](selectdb-cloud) Implement tablet balance at partition level (apache#1247)

* [feature](selectdb-cloud) Add pad_segment http action to manually overwrite an unrecoverable segment with an empty segment (apache#1254)

* [feature](selectdb-cloud) Fix be ut (apache#1262)

* [feature](selectdb-cloud) Modify regression case to adapt cloud mode (apache#1264)

* [feature][selectdb-cloud] Fix unknown table caused by partition level balance when replay (apache#1265)

* [feature][selectdb-cloud] Adjust the log level of table creation (apache#1260)

* [feature](selectdb-cloud) Add config of the number of warn log files (apache#1245)

* [feature](selectdb-cloud) Fix test_multiply case incorrect variable (apache#1266)

* [feature](selectdb-cloud) Check connection timeout when create stage (apache#1253)

* [feature][selectdb-cloud] Add auth check for undetermined cluster (apache#1258)

* [feature](selectdb-cloud) Meta-service support conf rate limit (apache#1205)

* [feature](selectdb-cloud) Check the config for file cache when launch to increase robustness (apache#1269)

* [feature][selectdb-cloud] Add MaxBuildRowsetTime, MaxBuildRowsetTime, UploadSpeed in tablet sink profile (apache#1252)

* [feature](selectdb-cloud) Fix three regression case for cloud (apache#1271)

* [feature](selectdb-cloud) Reduce log of get_tablet_stats (apache#1274)

* [feature](selectdb-cloud) Deprecate max_upload_speed and min_upload_speed in PTabletWriterAddBlockResult

* [feature](selectdb-cloud) Add more conf for BRPC to get rid of "overcrowed"

DECLARE_uint64(max_body_size);
DECLARE_int64(socket_max_unwritten_bytes);

* Pick "[Chore](regression) Fix wrong result for decimal (apache#13644)"

commit e007343
Author: Gabriel <gabrielleebuaa@gmail.com>
Date:   Wed Oct 26 09:24:46 2022 +0800

    [Chore](regression) Fix wrong result for decimal (apache#13644)

* [feature](selectdb-cloud) Fix transfer handle doesn't init (apache#1276)

* [feature](selectdb-cloud) Fix test_segment_iterator_delete case (apache#1275)

* [feature][selectdb-cloud] Update copy upload doc (apache#1273)

* [Fix](inverted index) pick clucene error processing from dev (apache#1287)

* [bug][inverted]fix be core when throw CLuceneError (apache#1261)

* [bug][inverted]fix be core when throw CLuceneError

* catch clucene error and add warning logs

* optimize code

* [Fix](inverted index) return error if inverted index writer init failed (apache#1267)

* [Fix](inverted index) return error if inverted index writer init failed

* [Fix](segment_writer) need to return error status when create segment writer

Co-authored-by: airborne12 <airborne12@gmail.com>

Co-authored-by: luennng <luennng@gmail.com>
Co-authored-by: airborne12 <airborne12@gmail.com>

* [enhancement] support convert TYPE_FLOAT in function convert_type_to_primitive (apache#1290)

* [feature][selectdb-cloud] Fix meta service range get instance when launch (apache#1293)

Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
Co-authored-by: meiyi <myimeiyi@gmail.com>
Co-authored-by: Xiaocc <598887962@qq.com>
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: Luwei <814383175@qq.com>
Co-authored-by: plat1ko <platonekosama@gmail.com>
Co-authored-by: deardeng <565620795@qq.com>
Co-authored-by: Kidd <107781942+k-i-d-d@users.noreply.github.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: zhannngchen <48427519+zhannngchen@users.noreply.github.com>
Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: AlexYue <yj976240184@qq.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: xiaojunjie <971308896@qq.com>
Co-authored-by: xiaojunjie <xiaojunjie@baidu.com>
Co-authored-by: airborne12 <airborne08@gmail.com>
Co-authored-by: luennng <luennng@gmail.com>
Co-authored-by: airborne12 <airborne12@gmail.com>
Co-authored-by: YueW <45946325+Tanya-W@users.noreply.github.com>
@liaoxin01 liaoxin01 deleted the fix_load_channel branch February 6, 2024 12:30