fixup leak memory #1244

Merged
merged 1 commit into apache:master on Jun 4, 2019

Conversation

worker24h
Contributor

When I built with BUILD_TYPE=LSAN, LeakSanitizer reported a memory leak after running Doris.

be.out:
Direct leak of 32816 byte(s) in 1 object(s) allocated from:
    #0 0x1089666 in __interceptor_malloc ../../../../libsanitizer/lsan/lsan_interceptors.cc:53
    #1 0x7ff459547280 in __alloc_dir (/lib64/libc.so.6+0xc0280)

SUMMARY: LeakSanitizer: 32816 byte(s) leaked in 1 allocation(s).

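The leaked allocation comes from glibc's `__alloc_dir`, i.e. a `DIR*` returned by `opendir()` that was never passed to `closedir()`. A minimal sketch of the pattern (the function below is hypothetical; the actual call site is in the merged commit):

```cpp
#include <dirent.h>
#include <string>
#include <vector>

// opendir() allocates a DIR* via glibc's __alloc_dir, so every path out of
// the function must call closedir(), otherwise LSAN reports a direct leak
// like the one above.
std::vector<std::string> list_dir(const std::string& path) {
    std::vector<std::string> entries;
    DIR* dir = opendir(path.c_str());
    if (dir == nullptr) {
        return entries;
    }
    struct dirent* ent;
    while ((ent = readdir(dir)) != nullptr) {
        entries.emplace_back(ent->d_name);
    }
    closedir(dir);  // the fix: release the DIR* on every exit path
    return entries;
}
```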
@imay imay merged commit ae75e44 into apache:master Jun 4, 2019
luwei16 added a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
```
 branch                                   date

                            20221115          20221205  20221210
                            9de1fec6c
                                v                 v         v
doris-1.2-lts                ---o-o-o-o-o-o-o---  .         .
                                 \                .         .
selectdb-cloud-dev-merge          o--o--o---o-o-o .         .
                                  /              \.         .
selectdb-cloud-dev-20221205       |               o-o--o--o-.
                                 /                /        \.
selectdb-cloud-dev-20221210     |                /          o--o (final)
                                /               /          /    \
selectdb-cloud-dev           ---o---o---o-o-o---o--o----o--o-----X-----
                                ^               ^          ^
                            67875dd2b       7cf9fb0ab   479d081f8
```


* Revert "[enhancement](compaction) opt compaction task producer and quick compaction (#13495)" (#13833)

This reverts commit 4f2ea0776ca3fe5315ab5ef7e00eefabfb5771a0.

* [feature](Nereids): add rule for matching plan into HyperGraph. (#13805)

* [fix](analytic) fix coredump cause by empty analytic parameter types (#13808)



* fix fe compile error

* [Bugfix](upgrade) Fix 1.1 upgrade 1.2 coredump when schema change (#13822)

When upgrading from 1.1 to 1.2, the FE version will not match the BE version for a period of time. If the BE is upgraded first and a schema change runs, the BE will use a field desc_tbl that was added in the 1.2 FE. The BE will coredump because desc_tbl is nullptr, so it needs to refuse such requests.

* [feature](nereids) add rule for semi/anti join exploration, when there is project between them (#13756)

* [feature](syntax) support SELECT * EXCEPT (#13844)



* [feature](syntax) support SELECT * EXCEPT: add regression test

* [enhancement](Nereids) add merge project rule to column prune rule set (#13835)

When we do column pruning, we add a project on the child plan. If the child plan is already a Project, we need to merge them.

* [fix](nereids) map literal to double in FilterSelectivityCalculator (#13776)

Fix the literal-to-double bug: all literal types now implement the getDouble() function.

* [enhancement](Nereids) use join estimation v2 only when stats derive v2 is enable (#13845)

join estimation V2 should be invoked when enableNereidsStatsDeriveV2=true

* [javaudf](string) Fix string format in java udf (#13854)

* [improvement](memory) simplify memory config related to tcmalloc (#13781)

There are several configs related to tcmalloc, and users do not know how to configure them. In practice users just want two modes, performance or compact: in performance mode they want Doris to run queries and loads quickly, while in compact mode they want Doris to run with less memory usage.

To configure tcmalloc individually, we can use the environment variables supported by tcmalloc.

* [minor](load) Improve error message for string type in loading process (#13718)

* [fix](spark load)The where condition does not take effect when spark load loads the file (#13803)

* [enhancement](olap scanner) Scanner row bytes buffer is too small bug (#13874)

* [enhancement](olap scanner) Scanner row bytes buffer is too small, please try to increase be config

Co-authored-by: yiguolei <yiguolei@gmail.com>

* [minor](log) remove some e.printStackTrace() (#13870)

* [enhancement](test) retry start be or fe when port has been bind. (#13860)



Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>

* [docs](tablet-docs) fix the tablet-repair-and-balance.md document. (#13853)

Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>

* [doc](spark-doris-connector) Add spark Doris connector to support streamload documentation #13834

* [fix](join)ColumnNullable need handle const column with nullable const value (#13866)

* [enhancement](profile) add profile to show column predicates (#13862)

* [community](collaborators) add more collaborators (#13880)

* [fix](dynamic-partition) fix wrong check of replication num (#13755)

* [regression](join) add right anti join with other predicate regression case (#13815)

* [meta](recover) change dropInfo and RecoverInfo to GSON (#13830)

* [chore](macOS) Fix compilation errors caused by the deprecated function (#13890)

* [enhancement](Nereids) add eliminate unnecessary project rule (#13886)

This rule eliminates a project whose output set is the same as its child's. If the project is the root of the plan, the elimination condition is that the project's output is exactly the same as its child's.

The reason to add this rule: when we do join reorder during optimization, the root of the transformed plan may be a Project whose output set is the same as that of the root before the transformation. If we already had a Project on top of the root with the same output set, we end up with two identical Projects in the memo, one the parent of the other. After MergeProject, we get a new Project identical to the child that must be added to the parent's group, which triggers a group merge. Since the merge would produce a cycle, it is denied, and the final plan ends up with two consecutive projects.

## For example:
**BEFORE OPTIMIZATION**
```
LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1]
+--LogicalJoin(type=LEFT_SEMI_JOIN)                [GroupId#2]
   |--LogicalProject(...)
   |  +--LogicalJoin(type=INNER_JOIN)
   |  ...
   +--LogicalOlapScan(...)
```
**AFTER APPLY RULE: LOGICAL_SEMI_JOIN_LOGICAL_JOIN_TRANSPOSE_PROJECT**
```
LogicalProject1( projects=[c_custkey#0, c_name#1])    [GroupId#1]
+--LogicalProject2( projects=[c_custkey#0, c_name#1]) [GroupId#2]
   +--LogicalJoin(type=INNER_JOIN)                    [GroupId#10]
      |--LogicalProject(...)
      |  +--LogicalJoin(type=LEFT_SEMI_JOIN)
      |  ...
      +--LogicalOlapScan(...)
```
**AFTER APPLY RULE: MERGE_PROJECTS**
```
LogicalProject3( projects=[c_custkey#0, c_name#1])  [should be in GroupId#1, but in GroupId#2 in fact]
+--LogicalJoin(type=INNER_JOIN)                     [GroupId#10]
   |--LogicalProject(...)
   |  +--LogicalJoin(type=LEFT_SEMI_JOIN)
   |  ...
   +--LogicalOlapScan(...)
```
Since we now have identical GroupExpressions (LogicalProject3 and LogicalProject2) in GroupId#1 and GroupId#2, we need to do MergeGroup(GroupId#1, GroupId#2). But GroupId#2 contains a child of GroupId#1, so the merge is denied.
If the best GroupExpression in GroupId#2 is LogicalProject3, we will get two consecutive projects in the final plan.

* [fix](fe) Inconsistent behavior for string comparison in FE and BE (#13604)

* [enhancement](Nereids) generate correct distribution spec after project (#13725)

After a project, some slots may be projected to others, so we need to replace the ExprId in DistributionSpecHash with the new one. If the projection is something other than an Alias, we need to return DistributionSpecAny instead of the child's DistributionSpec.

* [fix](Nereids) throw NPE when call getOutputExprIds in LogicalProperties (#13898)

* [Improve](Nereids): refactor eliminate outer join (#13402)

Refactor eliminate outer join #12985

Evaluate the expression with ConstantFoldRule. If the evaluation result is NULL or FALSE, then the elimination condition is satisfied.

* [feature](Nereids) Support lots of scalar function and fix some bug (#13764)

Proposed changes
1. Add function interfaces that can search for the matched signature, namely ComputeSignature. They correspond to Function.CompareMode:
   - IdenticalSignature: equal to Function.CompareMode.IS_IDENTICAL
   - NullOrIdenticalSignature: equal to Function.CompareMode.IS_INDISTINGUISHABLE
   - ImplicitlyCastableSignature: equal to Function.CompareMode.IS_SUPERTYPE_OF
   - ExplicitlyCastableSignature: equal to Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF
2. Generate lots of scalar functions.
3. Bug fix: the disassembled avg function computed a wrong result because of a wrong input type; AggregateParam.inputTypesBeforeDissemble is used to save the original input type and pass it to the backend to find the correct global aggregate function.
4. Bug fix: a subquery with OneRowRelation crashed because of a wrong nullable property.


Note:
1. There are currently no unit/regression tests for the scalar functions; I will add tests once the aggregate functions are migrated for unified processing.
2. A known problem is that variable-length functions cannot be invoked; I will fix this later.

* [fix](rpc) The proxy removed when rpc exception occurs is not an abnormal proxy (#13836)

`BackendServiceProxy.getInstance()` uses a round-robin strategy to obtain a proxy, so when the
current RPC request fails, the proxy removed by
`BackendServiceProxy.getInstance().removeProxy(...)` is not the abnormal proxy.

* [Vectorized](function) support topn_array function (#13869)

* [Enhancement](Nereids)optimize merge group in memo #13900

* [improvement](scan) speed up inserting strings into ColumnString (#13397)

* [Opt](function) opt the function of ndv (#13887)

* [fix](keyword) add BIN as keyword (#13907)

* [feature](function)add regexp functions: regexp_replace_one, regexp_extract_all (#13766)

* [feature](nereids) support common table expression (#12742)

Support common table expressions (CTE) in Nereids:
- Only inline CTE is implemented, which means we copy the logical plan of the CTE everywhere it is referenced;
- If the name of a CTE is the same as an existing table or view, the CTE is chosen first;

* [Load](Sink) remove validation of the column data when the data is NULL (#13919)

* [feature](new-scan) support transactional insert in new scan framework (#13858)


Support running transactional insert operations with the new scan framework, e.g.:

admin set frontend config("enable_new_load_scan_node" = "true");
begin;
insert into tbl1 values(1,2);
insert into tbl1 values(3,4);
insert into tbl1 values(5,6);
commit;

Add some limitations to transactional insert:

- Non-literal values in insert statements are not supported.

Fix some issues with the array type:

- Forbid casting other non-array types to a NESTED array type, which may cause a BE crash.
- Add a getStringValueForArray() method to Expr, to get a valid string-formatted array value.

Add useLocalSessionState=true to the regression-test JDBC URL.
Without this config, the JDBC driver sends some init commands each time it connects to the server, such as
select @@session.tx_read_only.
But when using transactional insert, after the begin command Doris does not support any statement
other than insert, commit, or rollback.
So this config is added so the JDBC driver does NOT send commands when connecting.

* [fix](doc) fix 404 link (#13908)

* [regression-test](query) Add the regression case of the query under the large wide table. #13897

Co-authored-by: smallhibiscus <844981280>

* [fix](storage) evaluate_and of ComparisonPredicateBase has logical error (#13895)

* [fix](unique-key-merge-on-write) Types don't match when calling IndexedColumnIterator::seek_at_or_after (#13885)

* [fix](sequence) fix that update table core dump with sequence column (#13847)

* [fix](sequence) fix that update table core dump with sequence column

* update

* [Bugfix](MV) Fixed load negative values into bitmap type materialized views successfully under non-vectorization (#13719)

* [Bugfix](MV) Fixed load negative values into bitmap type materialized views successfully under non-vectorization

* [enhancement](memtracker) Refactor load channel + memtable mem tracker (#13795)

* [fix](function) fix coredump cause by return type mismatch of vectorized repeat function (#13868)


The repeat function will not be supported in the vectorized engine during upgrade.

* [fix](agg)fix group by constant value bug (#13827)

* [fix](agg)fix group by constant value bug

* keep only one const grouping exprs if no agg exprs

* [fix](Nereids) finalize local aggregate should not turn on stream pre agg (#13922)

* [feature](nereids) Support authentication (#13434)

Add a rule to check the permissions of the user executing a query. Forbid users who don't have SELECT_PRIV on a table from executing queries on it.

* [Feature](join) Support null aware left anti join (#13871)

* [Fix](Nereids) add comments to CostAndEnforcerJob and fix view test case (#13046)

1. add comments to cost and enforcer job as some code is too hard to understand
2. fix nereids_syntax_p0/view.groovy's multi-answer bug.

* [Vectorized](function) support bitmap_to_array function (#13926)

* [docs](round) complement round function documentation (#13838)

* [fix](typo) check catalog enable exception message spelling mistake (#13925)

* [Enhancement](function) optimize the `upper` and `lower` functions using the simd instruction. (#13326)

optimize the `upper` and `lower` functions using the simd instruction.

* [revert](Nereids): revert GroupExpression Children ImmutableList. (#13918)

* [optimization](array-type) update the exception message when create table with array column (#13731)

This PR updates the exception message when creating a table with an array column.
Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [Bug](array-type) Fix array product calculate decimal type return wrong result (#13794)

* [enhancement](chore) remove debug log which is really too frequent #13909

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [doc](jsonb type)add documents for JSONB datatype (#13792)

* [BugFix](Concat) output of string concat function exceeds UINT makes crash (#13916)

* [Improvement](javaudf) support different date argument for date/datetime type (#13920)

* [refactor](crossjoin) refactor cross join (#13896)

* [fix](meta)(recover) fix recover info persist bug (#13948)

introduced by #13830

* [improvement](exec) add more debug info on fragment exec error (#13899)

* [feature-wip][refactor](multi-catalog) Persist external catalog related metadata. (#13746)

Persist external catalog/db/table, including the columns of external tables.
After this change, external objects can have their own unique ID throughout their lifetime,
which is required for statistics collection.

* [fix](runtime-filter) build thread destruct first may cause probe thread coredump (#13911)

* [enhancment](Nereids) enable push down filter through aggregation (#13938)

* [enhancement](Nereids) remove unnecessary int cast (#13881)

* [enhancement](Nereids) remove unnecessary string cast (#13730)

Convert string-like literals to the cast type instead of running the cast at runtime.

* [minor](error msg) Fix wrong error message (#13950)

* [enhancement](compaction) introduce segment compaction (#12609) (#12866)

## Design

### Trigger

Every time a rowset writer produces more than N (e.g. 10) segments, we trigger segment compaction. Note that only one segment compaction job runs for a single rowset at a time, to avoid a recursion/queuing nightmare.

### Target Selection

We collect segments on every trigger. We skip big segments whose row count > M (e.g. 10000) because we get little benefit from compacting them relative to the effort. Hence, we only pick the "longest consecutive small" segment group for actual compaction.

### Compaction Process

A new thread pool is introduced to do the job. We submit the above-mentioned "longest consecutive small" segment group to the pool. Then the worker thread does the following:

- build a MergeIterator from the target segments
- create a new segment writer
- for each block read from the MergeIterator, append it with the writer

### SegID handling

SegIDs must remain consecutive after segment compaction.

If a rowset has small segments named seg_0, seg_1, seg_2, seg_3 and a big segment seg_4:

- we create a segment named "seg_0-3" to save compacted data for seg_0, seg_1, seg_2 and seg_3
- delete seg_0, seg_1, seg_2 and seg_3
- rename seg_0-3 to seg_0
- rename seg_4 to seg_1

It is worth noting that we should wait for in-flight segment compaction tasks to finish before building the rowset meta and committing the txn.
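As a concrete illustration of the target-selection step above, here is a minimal C++ sketch of picking the "longest consecutive small" segment group; the struct and function names are hypothetical, not the actual Doris code:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical segment metadata; the real Doris types differ.
struct SegmentMeta {
    uint32_t id;
    uint64_t num_rows;
};

// Pick the longest run of consecutive "small" segments (num_rows <= max_rows):
// big segments are skipped because compacting them costs more than it saves.
std::vector<SegmentMeta> pick_longest_small_run(const std::vector<SegmentMeta>& segments,
                                                uint64_t max_rows) {
    size_t best_begin = 0, best_len = 0, begin = 0, len = 0;
    for (size_t i = 0; i < segments.size(); ++i) {
        if (segments[i].num_rows <= max_rows) {
            if (len == 0) begin = i;
            if (++len > best_len) {
                best_len = len;
                best_begin = begin;
            }
        } else {
            len = 0;  // a big segment breaks the consecutive run
        }
    }
    return {segments.begin() + best_begin, segments.begin() + best_begin + best_len};
}
```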

* [fix](Nereids) fix tpch and support trace plan's change event (#13957)

This PR fixes some bugs found when running TPC-H:
1. Fix avg(decimal) crashing the backend. The fix is in `Avg.getFinalType()` and every child class of `ComputeSignature`.
2. Fix the ReorderJoin dead loop. The fix is in `ReorderJoin.findInnerJoin()`.
3. Fix TimestampArithmetic failing to bind the functions in its child. The fix is in `BindFunction.FunctionBinder.visitTimestampArithmetic()`.

New feature: support tracing the plan's change events. You can `set enable_nereids_trace=true` to enable the trace log and see output like this:
```
2022-11-03 21:07:38,391 INFO (mysql-nio-pool-0|208) [Job.printTraceLog():128] ========== RewriteBottomUpJob ANALYZE_FILTER_SUBQUERY ==========
before:
LogicalProject ( projects=[S_ACCTBAL#17, S_NAME#13, N_NAME#4, P_PARTKEY#19, P_MFGR#21, S_ADDRESS#14, S_PHONE#16, S_COMMENT#18] )
+--LogicalFilter ( predicates=((((((((P_PARTKEY#19 = PS_PARTKEY#7) AND (S_SUPPKEY#12 = PS_SUPPKEY#8)) AND (P_SIZE#24 = 15)) AND (P_TYPE#23 like '%BRASS')) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (R_NAME#1 = 'EUROPE')) AND (PS_SUPPLYCOST#10 =  (SCALARSUBQUERY) (QueryPlan: LogicalAggregate ( phase=LOCAL, outputExpr=[min(PS_SUPPLYCOST#31) AS `min(PS_SUPPLYCOST)`#33], groupByExpr=[] )), (CorrelatedSlots: [P_PARTKEY#19, S_SUPPKEY#12, S_NATIONKEY#15, N_NATIONKEY#3, N_REGIONKEY#5, R_REGIONKEY#0, R_NAME#1]))) )
   +--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |  |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
      |  |  |  |--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.part, output=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27], candidateIndexIds=[], selectedIndexId=11076, preAgg=ON )
      |  |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.supplier, output=[S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18], candidateIndexIds=[], selectedIndexId=11124, preAgg=ON )
      |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON )
      |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.nation, output=[N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6], candidateIndexIds=[], selectedIndexId=11044, preAgg=ON )
      +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.region, output=[R_REGIONKEY#0, R_NAME#1, R_COMMENT#2], candidateIndexIds=[], selectedIndexId=11108, preAgg=ON )

after:
LogicalProject ( projects=[S_ACCTBAL#17, S_NAME#13, N_NAME#4, P_PARTKEY#19, P_MFGR#21, S_ADDRESS#14, S_PHONE#16, S_COMMENT#18] )
+--LogicalFilter ( predicates=((((((((P_PARTKEY#19 = PS_PARTKEY#7) AND (S_SUPPKEY#12 = PS_SUPPKEY#8)) AND (P_SIZE#24 = 15)) AND (P_TYPE#23 like '%BRASS')) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (R_NAME#1 = 'EUROPE')) AND (PS_SUPPLYCOST#10 = min(PS_SUPPLYCOST)#33)) )
   +--LogicalProject ( projects=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27, S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18, PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11, N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6, R_REGIONKEY#0, R_NAME#1, R_COMMENT#2, min(PS_SUPPLYCOST)#33] )
      +--LogicalApply ( correlationSlot=[P_PARTKEY#19, S_SUPPKEY#12, S_NATIONKEY#15, N_NATIONKEY#3, N_REGIONKEY#5, R_REGIONKEY#0, R_NAME#1], correlationFilter=Optional.empty )
         |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |  |  |--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] )
         |  |  |  |  |--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.part, output=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27], candidateIndexIds=[], selectedIndexId=11076, preAgg=ON )
         |  |  |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.supplier, output=[S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18], candidateIndexIds=[], selectedIndexId=11124, preAgg=ON )
         |  |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON )
         |  |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.nation, output=[N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6], candidateIndexIds=[], selectedIndexId=11044, preAgg=ON )
         |  +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.region, output=[R_REGIONKEY#0, R_NAME#1, R_COMMENT#2], candidateIndexIds=[], selectedIndexId=11108, preAgg=ON )
         +--LogicalAggregate ( phase=LOCAL, outputExpr=[min(PS_SUPPLYCOST#31) AS `min(PS_SUPPLYCOST)`#33], groupByExpr=[] )
            +--LogicalFilter ( predicates=(((((P_PARTKEY#19 = PS_PARTKEY#28) AND (S_SUPPKEY#12 = PS_SUPPKEY#29)) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (CAST(R_NAME AS STRING) = CAST(EUROPE AS STRING))) )
               +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#28, PS_SUPPKEY#29, PS_AVAILQTY#30, PS_SUPPLYCOST#31, PS_COMMENT#32], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON )

```

* [chore](be web ui)upgrade jquery version to 3.6.0 (#13942)

* upgrade jquery version to 3.6.0

* update license dist

* [fix](load) Fix load channel mgr lock (#13960)

hot fix load channel mgr lock

* [fix](tablet sink) fall back to the non-vectorized interface in tablet_sink if upgrading from 1.1-lts to 1.2-lts is in progress (#13966)

* [Improvement](javaudf) improve java loader usage (#13962)

* [typo](doc) fixed spelling errors (#13974)

* [doc](routineload)Common mistakes in adding routine load #13975

* [enhancement](test) support tablet repair and balance process in ut (#13940)

* [refactor](iceberg-hudi) disable iceberg and hudi table by default (#13932)

* [test](java-udf)add java udf RegressionTest about the currently supported data types #13972

* [fix](storage) rm unnecessary check (#13986) (#13988)

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>


* [feature-wip](dlf) prepare to support aliyun dlf (#13969)

[What is DLF](https://www.alibabacloud.com/product/datalake-formation)

This PR is a preparation for supporting DLF, with some changes to multi catalog:

1. Add RuntimeException for most of hive meta store or es client visit operation.
2. Add DLF related dependencies.
3. Move the checks of es catalog properties to the analysis phase of creating es catalog

TODO(in next PR):

1. Refactor the `getSplit` method to support not only hdfs, but s3-compatible object storage.
2. Finish the implementation of supporting DLF

* [feature](table-valued-function) Support S3 tvf (#13959)

This pr does three things:

1. Modified the framework of table-valued-function (tvf).
2. BE supports the `fetch_table_schema` rpc.
3. Implemented the `S3(path, AK, SK, format)` table-valued-function.

* [fix](memtracker) Fix DCHECK !std::count(_consumer_tracker_stack.begin(), _consumer_tracker_stack.end(), tracker)

* [feature-array](array-type) Add array function array_popback (#13641)

Remove the last element from array.

```
mysql> select array_popback(['test', NULL, 'value']);
+---------------------------------------------+
| array_popback(ARRAY('test', NULL, 'value')) |
+---------------------------------------------+
| [test, NULL]                                |
+---------------------------------------------+
```

* [feature](function)add search functions: multi_search_all_positions & multi_match_any (#13763)


Co-authored-by: yiliang qiu <yiliang.qiu@qq.com>

* [chore](gutil) remove some gutil macros and solve some macro conflict with brpc (#13954)


Co-authored-by: yiguolei <yiguolei@gmail.com>

* [security](fe jar) upgrade commons-codec:commons-codec to 1.13 #13951

* [typo](docs) fix docs,delete redundant words #13849

* [fix](repeat)remove unmaterialized expr from repeat node (#13953)

* [typo](docs)fix config doc #14010

* [feature](Nereids) support statement having aggregate function in order by list (#13976)

1. Add a feature supporting statements with an aggregate function in the order by list, such as:
    SELECT COUNT(*) FROM t GROUP BY c1 ORDER BY COUNT(*) DESC;
2. Add ClickBench analyze unit tests.

* [feat](Nereids) add graph simplifier (#14007)

* [enhancement](Nereids) remove unnecessary decimal cast (#13745)

* [Bug](udf) Make UDF's type always nullable (#14002)

* [typo](doc) fix get_start doc (#14001)

* [fix](load) fix a bug that reduce memory work on hard limit might be triggered twice (#13967)

When the load mem hard limit is reached, all load channels should wait on the lock of LoadChannelMgr until the current reduce-memory work finishes. In the current implementation, there is a bug that may cause some threads to be woken up before the reduce-memory work is finished:

1. Thread A finds the soft limit reached, picks a load channel, and waits for the reduce-memory work to finish.
2. Memory keeps increasing.
3. Thread B finds the hard limit reached (either the load mem hard limit or the process soft limit), picks a load channel to reduce memory, and sets the variable _should_wait_flush to true.
4. Thread C finds _should_wait_flush is true and waits on _wait_flush_cond.
5. Thread A finishes its reduce-memory work, finds _should_wait_flush is true, sets it to false, and notifies all threads.
6. Thread C wakes up and picks a load channel to do reduce-memory work while thread B's work is not yet finished.

So two threads can be doing reduce-memory work when the hard limit is reached, which is quite confusing.
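The standard remedy for this kind of premature wakeup is a predicate-based wait that re-checks the guard under the lock. A minimal sketch, assuming a hypothetical manager modeled on the `_should_wait_flush`/`_wait_flush_cond` description above:

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical manager modeled on the description above: only one thread may
// run the reduce-memory work at a time; others must wait until it finishes.
class LoadMemLimiter {
public:
    void wait_if_reducing() {
        std::unique_lock<std::mutex> lock(_mutex);
        // Predicate-based wait: spurious or early wakeups re-check the flag,
        // so a thread cannot start new work while a reducer is still running.
        _cond.wait(lock, [this] { return !_reducing; });
    }

    bool try_start_reduce() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_reducing) return false;  // someone else is already reducing
        _reducing = true;
        return true;
    }

    void finish_reduce() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _reducing = false;
        }
        _cond.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    bool _reducing = false;
};
```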

* [enhancement](Nereids) support otherJoinConjuncts in cascades join reorder (#13681)

* [refactor](cv)wait on condition variable more gently (#12620)

* [enhancement](profile) add instanceNum, tableIds to profile. (#13985)

* [bug](like function)fix like '' (empty string) get wrong result with all rows #14035

* [Enhancement](function) add to_bitmap() function with int type (#13973)

The to_bitmap function previously supported only a string parameter. Adding a to_bitmap() overload with an int parameter avoids converting the int to a string and then back to an int.

* [enhancement](memtracker) Refactor mem tracker hierarchy (#13585)

The mem tracker can be logically divided into 4 layers: 1) process, 2) type, 3) query/load/compaction task etc., 4) exec node etc.

The types are:

enum Type {
    GLOBAL = 0,        // Life cycle is the same as the process, e.g. Cache and default Orphan
    QUERY = 1,         // Count the memory consumption of all Query tasks.
    LOAD = 2,          // Count the memory consumption of all Load tasks.
    COMPACTION = 3,    // Count the memory consumption of all Base and Cumulative tasks.
    SCHEMA_CHANGE = 4, // Count the memory consumption of all SchemaChange tasks.
    CLONE = 5,         // Count the memory consumption of all EngineCloneTask. Note: does not include memory for make/release snapshots.
    BATCHLOAD = 6,     // Count the memory consumption of all EngineBatchLoadTask.
    CONSISTENCY = 7    // Count the memory consumption of all EngineChecksumTask.
}
Object pointers are no longer saved between each layer, and the values of process and each type are periodically aggregated.

Other fix:

In "[fix](memtracker) Fix transmit_tracker null pointer because phmap is not thread safe #13528", I tried to separate the memory manually abandoned in queries from the orphan mem tracker. But in actual tests, the accuracy of this part of the memory cannot be guaranteed, so it is put back into the orphan mem tracker.

* [fix](priv) fix meta replay bug when upgrading from 1.1.x to 1.2.x (#14046)

* [Enhancement](Dictionary-codec) update dict once on same segment (#13936)

update dict once on same segment

* [feature](Nereids) support query that group by use alias generated in aggregate output (#14030)

support query having alias in group by list, such as:
SELECT c1 AS a, SUM(c2) FROM t GROUP BY a;

* [thirdpart](lib) Add lock free queue of concurrentqueue (#14045)

* [feature-wip](multi-catalog) fix page index filter bug (#14015)

Fix page index filter not taking effect when there are multiple columns.
Co-authored-by: jinzhe <jinzhe@selectdb.com>

* [fix](Nereids) Use simple cost to calculate benefit and avoid useless calculation (#14056)

In GraphSimplifier, we can use a simple cost to calculate the benefit.
And only when the best neighbor of the applied step is the edge being processed do we need to update recursively.

* [feature](multi-catalog) Support data on s3-compatible oss and support aliyun DLF (#13994)

Support Aliyun DLF.
Support data on s3-compatible object storage, such as Aliyun OSS.
Refactor some catalog interfaces to make them tidier.
Fix a bug where the default text-format field delimiter of Hive should be \x01.
Add a new class PooledHiveMetaStoreClient to wrap the IMetaStoreClient.

* [Bug](Bitmap) fix sub_bitmap calculate wrong result to return null (#13978)

fix sub_bitmap calculate wrong result to return null

* [fix](build) fix compile fail on Segment::open (#14058)

* [regression](Nereids) add back tpch regression test cases (#13826)

1. add back TPC-H regression test cases
2. fix decimal problem on aggregate function sum and agg introduced by #13764 
3. fix memo merge group NPE introduced by #13900

* [enhancement](Nereids) tpch q21 anti and semi join reorder (#14037)

Estimation of anti and semi joins needs rework; for now we just let TPC-H q21 pass.

* [chore](bin) do not set heap limit for tcmalloc until doris does not allocates large unused memory (#13761)

We set a heap limit for tcmalloc to avoid OOMs caused by tcmalloc allocating memory for its cache even when the machine has little free memory. However, Doris allocates large amounts of unused memory in some cases, so tcmalloc would throw an OOM exception even when there is plenty of free memory on the machine.

We can set the limit again after we fix that problem.

* [fix](statistics) ColumnStatistics was changed unexpectedly when show stats (#14068)

The logic of show stats changed the internally collected ColumnStat unexpectedly, which would cause inaccurate cost and an inefficient plan.

* [improvement](profile) support ordinary user to get query profile via http api (#14016)

* [Nereids][Improve] infer predicate after push down predicate (#12996)

This PR implements predicate inference.

For example:

``` sql
select * from student left join score on student.id = score.sid where score.sid > 1
```
transformed logical plan tree:

                left join
               /         \
    filter(sid > 1)   filter(id > 1) <---- inferred predicate
         |                  |
       scan               scan

See `InferPredicatesTest`  for more cases

 The logic is as follows:
  1. Pull up the bottom predicate, then infer additional predicates.
    For example:
    select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id
    1. pull up the bottom predicate:
       select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1
    2. infer:
       select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 and t2.id = 1
    finally, the transformed SQL is:
       select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t2.id = 1
  2. Put these predicates into `otherJoinConjuncts`; these predicates are processed in the next
    round of predicate push-down.


Currently only inference of `ComparisonPredicate` is supported.

TODO: we should determine whether `expression` satisfies the conditions for replacement,
e.g. whether `expression` is non-deterministic.

* [fix](keyranges) fix the split error of keyranges (#14049)

fix the split error of keyranges

* use extern template to date_time_add (#13970)

* [feature](information_schema) add `backends` information_schema table (#13086)

* [feature](inverted index)WIP inverted index api: SQL syntax and metadata (#13430)

Introduce a SQL syntax for creating inverted index and related metadata changes.

```
-- create table with INVERTED index 

CREATE TABLE httplogs (
  ts datetime,
  clientip varchar(20),
  request string,
  status smallint,
  size int,
  INDEX idx_size (size) USING INVERTED,
  INDEX idx_status (status) USING INVERTED,
  INDEX idx_clientip (clientip) USING INVERTED PROPERTIES("parser"="none")
)
DUPLICATE KEY(ts)
DISTRIBUTED BY RANDOM BUCKETS 10

-- add an INVERTED index  to a table

CREATE INDEX idx_request ON httplogs(request) USING INVERTED PROPERTIES("parser"="english");
```

* [opt](ssb) Add query hint for the SSB queries (#14089)

* [refactor](new-scan) remove old vectorized scan node (#14029)

* [docs](odbc) fix docs for sqlserver odbc table (#14017)

Signed-off-by: nextdreamblue <zxw520blue1@163.com>


* [enhancement](load) shrink reserved buffer for page builder (#14012) (#14014)

* [enhancement](load) shrink reserved buffer for page builder (#14012)

For a table with hundreds of text-type columns, flushing its memtable may cost huge amounts of memory.
This memory is consumed when initializing the page builders, as each reserves 1MB per column,
so memory consumption grows in proportion to the column count. Shrinking the reservation may
reduce memory substantially during the load process.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* response to the review

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* Update binary_plain_page.h

* Update binary_dict_page.cpp

* Update binary_plain_page.h

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* [typo](docs)update array type doc #14057

* [fix](JSON) Fail to parse JSONPath (libc++) (#13941)

* [fix](ctas) text column type len = 1 when create table as select (#13906)

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

* [bug](ColumnDecimal)call set_decimalv2_type when cloning ColumnDecimal (#14061)

* call set_decimalv2_type when cloning ColumnDecimal

* clang format

* [fix](Vectorized)fix json_object and json_array function return wrong result on vectorized engine (#13775)

Issue Number: close #13598

* [Compile](join) Boost compiling and linking (#14081)

* [docs](array-type) update the docs to specify how to use array function when import data (#13995)

Co-authored-by: hucheng01 <hucheng01@baidu.com>

* [feature](Nereids) binding slot in order by that not show in project (#14042)

1. Bind slots in order by that do not appear in the project list, such as:
SELECT c1 FROM t WHERE c2 > 0 ORDER BY c3

2. Do not check unbound status when binding slot references; instead, do it in the analysis check.

* [improve](Nereids): remove redundant code, add annotation in Memo. (#14083)

* [fix](Nereids) aggregate disassemble generate error output list on GLOBAL phase aggregate (#14079)

We must use localAggregateFunction as the key of globalOutputSMap, because we use the local output exprs to generate the global output in disassembleDistinct.

* [performance-wip] (vectorization) Opt HashJoin Performance  (#12390)

* [fix](compile) fix compile error #14103

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [feature](table-valued-function) Support `desc from s3()` and modify the syntax of tvf (#14047)

This PR does two things:

1. Support `desc function s3()`.
2. Modify the syntax of tvf.

* [enhancement](Nereids) use post-order to generate runtime filter in RuntimeFilterGenerator (#13949)

Change the runtime filter generator from pre-order to post-order; this may change the number of generated runtime filters.
The UTs are corrected accordingly.

* [refactor](array) refactor DataTypeArray from_string (#13905)

Refactor DataTypeArray from_string to make it clearer;
support ',' and ']' inside string elements, for example: ['hello,,,', 'world][]'];
support empty elements, such as [,] ==> [0,0].
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
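A minimal sketch of the quote-aware scan that keeps ',' and ']' usable inside string elements; this is a hypothetical helper under simplified assumptions, not the actual DataTypeArray::from_string:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split an array literal like ['hello,,,', 'world][]'] into elements.
// Commas and ']' inside single quotes must not terminate an element.
std::vector<std::string> split_array_literal(const std::string& s) {
    std::vector<std::string> elems;
    std::string cur;
    bool in_quotes = false;
    // Assume s is well-formed: strip the outer '[' and ']'.
    for (size_t i = 1; i + 1 < s.size(); ++i) {
        char c = s[i];
        if (c == '\'') {
            in_quotes = !in_quotes;
            cur += c;
        } else if (c == ',' && !in_quotes) {
            elems.push_back(cur);  // an empty cur models elements like [,]
            cur.clear();
        } else {
            cur += c;
        }
    }
    elems.push_back(cur);
    return elems;
}
```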

* [Enhancement][fix](profile) modify some profiles (#14074)

1. add RemainedDownPredicates
2. fix core dump when _scan_ranges is empty
3. fix invalid memory access on vLiteral's debug_string()
4. enlarge mv test wait time

* [fix](load) fix that load channel failed to be released in time (#14119)

* [typo](docs)add udf doc and optimize udf regression test (#14000)

* [Bug](udf) fix java-udaf process string type error and add some tests (#14106)

* [improvement](join) Share hash table in fragments for broadcast join (#13921)

* [feature](table-valued-function)S3 table valued function supports parquet/orc/json file format #14130

The S3 table valued function supports the parquet/orc/json file formats, for example parquet.

* [Bug](outfile) Fix wrong decimal format for ORC (#14124)

* [fix](nereids) cannot collect decimal column stats (#13961)

When executing analyze table, Doris fails on decimal columns.
The root cause: the scale in decimalV2 is 9, but 2 in the schema.
There is no need to check the scale for decimalV2, since it is not a floating-point type.

* [fix](grouping)the grouping expr should check col name from base table first, then alias (#14077)

* [fix](grouping)the grouping expr should check col name from base table first, then alias

* fix fe ut, the behavior would be same as mysql

* [Fix] add hll param for if function (#12366)

* [Fix] add hll param for if function

* add ut

Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>

* [feature](nereids) let user define right deep tree penalty by session variable (#14040)

It is hard for us to find a proper factor for all queries.
The default is 0.7.

* [fix](ctas) fix wrong string column length after executing ctas from external table  (#14090)

* [feature](Nereids): InnerJoinLeftAssociate, InnerJoinRightAssociate and JoinExchange. (#14051)

* [feature](function) add new function uuid() (#14092)

* [enhance](Nereids): add missing hypergraph rule. (#14087)

* [fix](memtracker) Fix scanner thread ending after fragment thread causing mem tracker null pointer #14143

* [Enhancement](runtime-filter) enlarge runtime filter in predicate threshold (#13581)

enlarge runtime filter in predicate threshold

* [feature](Nereids) support circle graph (#14082)

* [fix](schemeChange) fe oom because replicas too many when schema change (#12850)

* [chore][build] add instructions to build version string (#14067)

* [feature-wip](multi-catalog) lazy read for ParquetReader (#13917)

Read the predicate columns first, and use the VExprContext (push-down predicates)
to generate a select vector, which is then applied when reading the non-predicate columns.
Data in non-predicate columns may be skipped via the select vector, reducing value-decode time.
If a whole page can be skipped, decompress time is also reduced.
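A minimal sketch of the lazy-materialization flow described above, under simplified assumptions (plain vectors instead of Doris's readers; the predicate is hard-coded as value > threshold):

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical lazy column: counts rows actually decoded vs. skipped.
struct LazyColumn {
    size_t decoded_rows = 0;
    size_t skipped_rows = 0;

    void read_selected(const std::vector<uint8_t>& sel) {
        // Decode only rows marked 1; skipped rows avoid value-decode cost.
        for (uint8_t keep : sel) (keep ? decoded_rows : skipped_rows) += 1;
    }
    void skip_all(size_t n) { skipped_rows += n; }  // whole batch/page skipped
};

// Evaluate a pushed-down predicate over the already-decoded predicate column
// and emit the select vector (1 = row survives, 0 = filtered).
std::vector<uint8_t> make_select_vector(const std::vector<int64_t>& pred_col,
                                        int64_t threshold) {
    std::vector<uint8_t> sel(pred_col.size());
    for (size_t i = 0; i < pred_col.size(); ++i) sel[i] = pred_col[i] > threshold;
    return sel;
}

void read_batch(const std::vector<int64_t>& pred_col, LazyColumn& lazy,
                int64_t threshold) {
    std::vector<uint8_t> sel = make_select_vector(pred_col, threshold);
    size_t selected = std::accumulate(sel.begin(), sel.end(), size_t{0});
    if (selected == 0) {
        lazy.skip_all(sel.size());  // decode (and decompress) of this batch avoided
    } else {
        lazy.read_selected(sel);
    }
}
```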

* [fix](nereids) column stats min/max missing (#14091)

in the result of SHOW COLUMN STATS tbl, min/max value is not displayed.

* [enhancement](Nereids) analyze check input slots must in child's output (#14107)

* [Improvement](join) Support nested loop outer join (#13965)

* [docs](recover) modify recover doc (#13904)

* [feature-wip](statistic) persistence table statistics into olap table (#13883)

1. Support for persisting collected statistics to a pre-built OLAP table named `column_statistics`.
2. Use a much simpler mechanism to collect statistics: all the gauges are collected in a single SQL statement for each partition and then for the whole column, as defined in the class `AnalysisJob`.
3. Implement a cache to manage the statistics records in FE.

TODO:

1. Use OpenTelemetry to monitor the execution time of each job.
2. Format the internal analysis SQL.
3. Split the SQL to ensure the IN expression's child count does not exceed the FE limit on generated SQL when deleting expired records.
4. Implement show statements.

* [fix](doc): remove incubator. (#14159)

* [UDF](java udf)  using config to enable java udf instead of macro at compile time (#14062)

* [UDF](java udf) using config to enable java udf instead of macro at compile time

* [enhancement](plugin) import audit logs for slow queries into a separate table (#14100)

* import audit logs for slow queries into a separate table

* [docs](outfile) Add ORC to outfile document (#14153)

* [Bugfix] Fix upgrade from 1.1 coredump (#14163)

When upgrading from 1.1 to master, then rolling back to 1.1, then upgrading to master again, the BE will coredump because some rowsets have a schema and some do not. On the first upgrade from 1.1, the BE flushes the schema into all rowsets; after rolling back to 1.1, the BE does compaction and creates some new rowsets without a schema. On the second upgrade from 1.1, the BE coredumps because some code paths assume either all rowsets have a schema or none do.

* [Improvement](profile) Improve readability for runtime filters in profile string (#14165)

* [Improvement](profile) Improve readability for runtime filters in profile string

* update

* [fix](metric) fix the bug of not updating the query latency metric #14172

* [Docs](README)Update the README.md (#14156)

Add the new release in Readme.md

* [fix](decimal) change log fatal to log warning to avoid core dump on decimal type (#14150)

* [fix](cast)fix cast to char(N) error (#14168)

* [chore](build) Optimize the compilation time (#14170)

Currently, it takes too much time to build BE from source in workflow environments (P0/P1) which affects the efficiency of daily development.

We can measure the time by executing the following command:

time EXTRA_CXX_FLAGS='-O3' BUILD_TYPE=ASAN ./build.sh --be --fe --clean -j "$(nproc)"

This PR optimizes the compilation time with the following methods:

- Reduce codegen by removing some useless uses of std::visit.
- Disable optimization for some template functions which are instantiated by std::visit conditionally (except for the RELEASE build).

* [feature](Nereids) prune runtime filters which cannot reduce the tuple number of probe table (#13990)

1. add a post processor: runtime filter pruner 
Doris generates RFs (runtime filter) on Join node to reduce the probe table at scan stage. But some RFs have no effect, because its selectivity is 100%. This pr will remove them.
A RF is effective if
a. the build column value range covers part of that of probe column, OR
b. the build column ndv is less than that of probe column, OR
c. the build column's ColumnStats.selectivity < 1, OR
d. the build column is reduced by another RF, which satisfies above criterions.

2. Explain graph:
a. add RF info to the Join and Scan nodes
b. add the predicate count to the Scan node

3. Rename session variable
rename `enable_remove_no_conjuncts_runtime_filter_policy` to `enable_runtime_filter_prune` 

4. Fix a min/max column stats derivation bug:
for `select max(A) as X from T group by B`,
X.min is A.min, not A.max.
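A minimal sketch of the effectiveness check behind criteria (a)-(d), using hypothetical stats structs rather than the actual Doris classes:

```cpp
// Hypothetical column statistics modeled on the criteria above.
struct ColumnStats {
    double min_value;
    double max_value;
    double ndv;          // number of distinct values
    double selectivity;  // fraction of rows surviving predicates so far
};

// An RF is kept only if it can actually reduce the probe side.
bool runtime_filter_is_effective(const ColumnStats& build, const ColumnStats& probe,
                                 bool build_side_already_reduced) {
    bool covers_part =
        build.min_value > probe.min_value || build.max_value < probe.max_value;  // (a)
    bool smaller_ndv = build.ndv < probe.ndv;                                    // (b)
    bool selective = build.selectivity < 1.0;                                    // (c)
    return covers_part || smaller_ndv || selective || build_side_already_reduced; // (d)
}
```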

* [feature](Nereids) replace order by keys by child output if possible (#14108)

To support queries like:
SELECT c1 + 1 as a, sum(c2) FROM t GROUP BY c1 + 1 ORDER BY c1 + 1

After the rewrite, the plan will be equivalent to:
SELECT c1 + 1 as a, sum(c2) FROM t GROUP BY c1 + 1 ORDER BY a

* [Feature](Sequence) Support sequence_match and sequence_count functions (#13785)

* [refactor](Nereids) remove DecimalType, use DecimalV2Type instead (#14166)

* [Bug](runtimefilter) Fix concurrent bug in runtime filter #14177

For runtime filters, signal is called by a thread different from the awaiting thread, so there is a potential race on the variable is_ready.

* [enhancement](thirdparty) support create stripe reader by column names (#14184)

ORC's NextStripeReader currently only supports reading columns by indices, but it is hard to get column indices for complex types.
We patch the ORC adapter to support reading columns by column names.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>

* [Opt](exec) prevent the scan key split whole range (#14088)

Prevent the scan key from splitting the whole range.

* [feature](docs) add docs for SHOW-CATALOG-RECYCLE-BIN (#14185)

* [Bug](nljoin) Keep compatibility for nljoin (#14182)

* [Enhancement](Nereids) Support numbers TableValuedFunction and some bitmap/hll aggregate function (#14169)

## Problem summary
This PR supports:
1. the `numbers` TableValuedFunction for Nereids tests, like `select * from numbers(number = 10, backend_num = 1)`
2. bitmap/hll aggregate functions
3. finding variable-length functions in the function registry, like `coalesce`
4. a fix for a bug where printing the Nereids trace throws an exception because a RewriteRule is used in ApplyRuleJob, e.g. `AggregateDisassemble`, introduced by #13957

* [test](array function)add array_range function test (#14123)

* add array_range function test

* add array_range function test

* [enhancement](load) Increase batch size of node channel to improve import performance (#13912)

* [chore](cmake) Fix wrong statements (#14187)

* [feature](running_difference) support running_difference function (#13737)

* [feature-array](array-type) Add array function array_with_constant (#14115)

Return an array containing the given constant repeated num times.

```
mysql> select array_with_constant(4, 1223);
+------------------------------+
| array_with_constant(4, 1223) |
+------------------------------+
| [1223, 1223, 1223, 1223]     |
+------------------------------+
1 row in set (0.01 sec)
```
co-authored-by @eldenmoon

* [fix](chore) read max_map_count from proc and make notice much more understandable (#14137)

Some users cannot run sysctl as non-root on Linux, so we read max_map_count from /proc.
Notify users that they can change max_map_count as root.
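A minimal sketch of reading the value from /proc instead of shelling out to sysctl (hypothetical helper; the 2000000 threshold is an assumption based on Doris's documented requirement):

```cpp
#include <cstdint>
#include <fstream>
#include <iostream>

// Read vm.max_map_count from /proc, which works even for non-root users
// who are not permitted to run sysctl. Returns -1 on failure.
int64_t read_max_map_count() {
    std::ifstream in("/proc/sys/vm/max_map_count");
    int64_t value = -1;
    if (!(in >> value)) return -1;
    return value;
}

int main() {
    int64_t v = read_max_map_count();
    if (v >= 0 && v < 2000000) {  // assumed threshold
        // The value can only be raised as root, e.g.:
        //   sysctl -w vm.max_map_count=2000000
        std::cout << "max_map_count=" << v << " is low; raise it as root." << std::endl;
    }
    return 0;
}
```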

* [regression-test] sleep longer to avoid error (#14186)

* [typo](comment) Fix a lot of spell errors in be comments (#14208)

fix typos.

* [test](jdbc external table) add jdbc regression test case (#14086)

* [test](jdbc postgresql case)add jdbc test case for postgresql  (#14162)

* [fix](scankey) fix extended scan key errors. (#14200)

Issue Number: close #14199

* [feature](partition) support new create partition syntax (#13772)

Create partitions using:
```
PARTITION BY RANGE(event_day)(
        FROM ("2000-11-14") TO ("2021-11-14") INTERVAL 1 YEAR,
        FROM ("2021-11-14") TO ("2022-11-14") INTERVAL 1 MONTH,
        FROM ("2022-11-14") TO ("2023-01-03") INTERVAL 1 WEEK,
        FROM ("2023-01-03") TO ("2023-01-14") INTERVAL 1 DAY,
        PARTITION p_20230114 VALUES [('2023-01-14'), ('2023-01-15'))
)

PARTITION BY RANGE(event_time)(
        FROM ("2023-01-03 12") TO ("2023-01-14 22") INTERVAL 1 HOUR
)
```
This can create year/month/week/day/hour date partitions in a batch,
and it is compatible with the single-partition method.

* [improvement](load) reduce memory in batch for small load channels (#14214)

* [enhancement](memory) Support try catch bad alloc (#14135)

* [improvement](load) release load channel actively when error occurs (#14218)

* update (#14215)

* [hotfix](memtracker) Fix expired `DCHECK(_limit != -1);` and segment_meta_mem_tracker inelegant end (#14223)

* [fix](schema) Release memory of TabletSchemaPB in RowsetMetaPB #13993

* [fix](ctas) use json_object in CTAS get wrong result (#14173)

* [fix](ctas) use json_object in CTAS get wrong result

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

* [enhancement](be)close ExecNode ASAP to release resource earlier (#14203)

* [fix](compaction) segcompaction coredump if the rowset starts with a big segment (#14174) (#14176)

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>


* [feature](remote)Only query can use local cache when reading remote files. (#13865)

When select is called on remote files, cache files are downloaded to local disk.
When alter table is called on remote files, files are read directly from remote storage, so if the tablet is very large it will not take up too much local disk space when creating local cache files.

* [chore](build) Split the compilation units to build them in parallel (#14232)

* [enhancement](Nereids) add output set and output exprid set cache (#14151)

* [BugFix](file cache) don't clean clone dir when doing _gc_unused_file_caches (#14194)

* use another file_size overload for noexcept

* don't gc clone dir

* use better status

* [improvement](log) print info of error replicas (#14220)

* (fix)(multi-catalog)(es) Fix error result because not used fields_context (#14229)

Fix wrong results caused by not using fields_context.

* [feature](Nereids) add circle detector and avoid overlap (#14164)

* [test](multi-catalog)Regression test for external hive parquet table (#13611)

* [feature-wip](multi-catalog) Support hive partition cache (#14134)

* [multi-catalog](fix) the eof of lazy read columns may be not equal to the eof of predicate columns (#14212)

Fix three bugs:
1. The EOF of lazy-read columns may not equal the EOF of predicate columns.
(For example, if the predicate column has 3 pages with 400 rows each, but the last page
is filtered out by the page index, then with batch_size=992 the EOF of the predicate column is true.
However, we should set batch_size=800 for the lazy-read columns, so their EOF may be false.)
2. The array column does not count the number of nulls.
3. A wrong NullMap is generated for the array column.

* [temp](statistics) disable statistic tables

* [feature](selectdb-cloud) Fix txn manager conflict with branch-1.2-lts (#1118)

* [feature](selectdb-cloud) Fix sql_parser.cup

* [fix] fix conflict in SetOperationStmt.java (#1125)

* [feature](selectdb-cloud) Move some files from io to cloud (#1129)

* [feature](selectdb-cloud) Modify header file and macro some file (#1133)

* [feature](selectdb-cloud) Modify header file and macro some file

* tmp

* Fix FE implicit merge conflict

* [feature](selectdb-cloud) remove master file cache (#1137)

* [chore-fix-merger](dynamic-table) fix some code conflicts about dynamic table

* Fix memtracker 1.2 conflict (#1147)

* [chore-fix-merge](selectdb-cloud) Fix write path conflicts (#1142)

* replace FileSystemPtr to FileSystemSPtr
* unify create_rowset_writer
* remove cache_path in Segment::open

* Fix some compilation error due to merge

* [chore-fix-compile](topn-opt) fix header file circular reference (#1162)

* [feature](selectdb-cloud) Fix conflict of blocking_priority_queue (#1171)

* [feature](selectdb-cloud) Fix bug when merging blocking_priority_queue (#1174)

* [chore-fix-merge](selectdb-cloud)  Fix conflict in BetaRowsetWriter (#1179)

* Fix compile error FileSystemSPtr and schema_change.cpp

* [feature](selectdb-cloud) rm io_ctx (#1187)

* [chore-fix-merge](selectdb-cloud) Fix FileSystem related codes (#1204)

* [chore-fix-merge](selectdb-cloud)  Make FileSystem ctor no public to avoid stack-allocate or make unique (#1207)

* [feature](selectdb-cloud) Fix compilation error caused by merge

        modified:   be/src/cloud/cloud_base_compaction.cpp
        modified:   be/src/cloud/io/local_file_system.cpp
        modified:   be/src/cloud/io/local_file_system.h
        modified:   be/src/cloud/io/s3_file_system.h
        modified:   be/src/cloud/olap/beta_rowset_writer.cpp
        modified:   be/src/cloud/olap/olap_server.cpp
        modified:   be/src/cloud/olap/segment.cpp
        modified:   be/src/olap/data_dir.cpp
        modified:   be/src/olap/rowset/segment_v2/segment_iterator.cpp
        modified:   be/src/olap/task/engine_alter_tablet_task.cpp
        modified:   be/src/runtime/exec_env_init.cpp
        modified:   be/src/runtime/fragment_mgr.cpp
        modified:   be/src/service/internal_service.cpp
        modified:   be/src/vec/common/sort/vsort_exec_exprs.h
        modified:   be/src/vec/exec/scan/new_olap_scan_node.h
        modified:   be/src/vec/exprs/vectorized_fn_call.cpp
        modified:   be/src/vec/functions/match.cpp

* [chore-fix-merge](selectdb-cloud) Fix inverted index cache limit related exec_env_init.cpp (#1228)

* [chore-fix-merge](topn-rpc-service) add `be_exec_version` to fetch rpc for compability (#1229)

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: TengJianPing <18241664+jacktengg@users.noreply.github.com>
Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
Co-authored-by: minghong <englefly@gmail.com>
Co-authored-by: qiye <jianliang5669@gmail.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: wxy <dut.xiangyu@gmail.com>
Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>
Co-authored-by: caoliang-web <71004656+caoliang-web@users.noreply.github.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: luozenglin <37725793+luozenglin@users.noreply.github.com>
Co-authored-by: Adonis Ling <adonis0147@gmail.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
Co-authored-by: shee <13843187+qzsee@users.noreply.github.com>
Co-authored-by: 924060929 <924060929@qq.com>
Co-authored-by: ZenoYang <cookie.yz@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: mch_ucchi <41606806+sohardforaname@users.noreply.github.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Fy <fuyu0824@pku.edu.cn>
Co-authored-by: gnehil <adamlee489@gmail.com>
Co-authored-by: Hong Liu <844981280@qq.com>
Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: carlvinhust2012 <huchenghappy@126.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: camby <104178625@qq.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: Kang <kxiao.tiger@gmail.com>
Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: zhannngchen <48427519+zhannngchen@users.noreply.github.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: yinzhijian <373141588@qq.com>
Co-authored-by: zhengyu <freeman.zhang1992@gmail.com>
Co-authored-by: lihaijian <bigmudhaijian@gmail.com>
Co-authored-by: Liqf <109049295+LemonLiTree@users.noreply.github.com>
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: lihangyu <15605149486@163.com>
Co-authored-by: Yiliang Qiu <68439848+qqIsAProgrammer@users.noreply.github.com>
Co-authored-by: yiliang qiu <yiliang.qiu@qq.com>
Co-authored-by: zhoumengyks <111965739+zhoumengyks@users.noreply.github.com>
Co-authored-by: Wanghuan <imnu2054wh@126.com>
Co-authored-by: zy-kkk <zhongykk@qq.com>
Co-authored-by: 谢健 <jianxie0@gmail.com>
Co-authored-by: TaoZex <45089228+TaoZex@users.noreply.github.com>
Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
Co-authored-by: minghong <minghong.zhou@163.com>
Co-authored-by: Kikyou1997 <33112463+Kikyou1997@users.noreply.github.com>
Co-authored-by: ChPi <chjie93@gmail.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
Co-authored-by: WenYao <729673078@qq.com>
Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>
Co-authored-by: yongjinhou <109586248+yongjinhou@users.noreply.github.com>
Co-authored-by: Luwei <814383175@qq.com>
Co-authored-by: Luzhijing <82810928+luzhijing@users.noreply.github.com>
Co-authored-by: abmdocrt <Yukang.Lian2022@gmail.com>
Co-authored-by: Yixi Zhang <83794882+ZhangYiXi-dev@users.noreply.github.com>
Co-authored-by: lsy3993 <110876560+lsy3993@users.noreply.github.com>
Co-authored-by: catpineapple <42031973+catpineapple@users.noreply.github.com>
Co-authored-by: plat1ko <platonekosama@gmail.com>
Co-authored-by: pengxiangyu <diablowcg@163.com>
Co-authored-by: Stalary <stalary@163.com>
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
Co-authored-by: YueW <45946325+Tanya-W@users.noreply.github.com>
Co-authored-by: Kidd <107781942+k-i-d-d@users.noreply.github.com>
Co-authored-by: Xiaocc <598887962@qq.com>