[Enhancement](spark load) Support for RM HA #15000
Conversation
TeamCity pipeline, clickbench performance test result:
When will this feature be merged into the new code branch?
This needs to be verified. I will verify it as soon as possible, but I don't have a spare environment on my side to support it right now, so it may take some time.
run buildall
LGTM
Adding RM HA configuration to Spark Load. Spark can accept HA parameters via its configuration; we just need to accept them in the DDL:

CREATE EXTERNAL RESOURCE spark_resource_sinan_node_manager_ha
PROPERTIES (
    "type" = "spark",
    "spark.master" = "yarn",
    "spark.submit.deployMode" = "cluster",
    "spark.executor.memory" = "10g",
    "spark.yarn.queue" = "XXXX",
    "spark.hadoop.yarn.resourcemanager.address" = "XXXX:8032",
    "spark.hadoop.yarn.resourcemanager.ha.enabled" = "true",
    "spark.hadoop.yarn.resourcemanager.ha.rm-ids" = "rm1,rm2",
    "spark.hadoop.yarn.resourcemanager.hostname.rm1" = "XXXX",
    "spark.hadoop.yarn.resourcemanager.hostname.rm2" = "XXXX",
    "spark.hadoop.fs.defaultFS" = "hdfs://XXXX",
    "spark.hadoop.dfs.nameservices" = "hacluster",
    "spark.hadoop.dfs.ha.namenodes.hacluster" = "mynamenode1,mynamenode2",
    "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode1" = "XXX:8020",
    "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode2" = "XXXX:8020",
    "spark.hadoop.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
    "working_dir" = "hdfs://XXXX/doris_prd_data/sinan/spark_load/",
    "broker" = "broker_personas",
    "broker.username" = "hdfs",
    "broker.password" = "",
    "broker.dfs.nameservices" = "XXX",
    "broker.dfs.ha.namenodes.XXX" = "mynamenode1, mynamenode2",
    "broker.dfs.namenode.rpc-address.XXXX.mynamenode1" = "XXXX:8020",
    "broker.dfs.namenode.rpc-address.XXXX.mynamenode2" = "XXXX:8020",
    "broker.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
);

Co-authored-by: liujh <liujh@t3go.cn>
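For context, a minimal sketch of how a resource defined this way might then be referenced by a Spark Load job. This is illustrative only and not part of the PR: the label, database, table, column names, and HDFS path below are hypothetical placeholders; the statement assumes the standard LOAD LABEL ... WITH RESOURCE syntax.

    LOAD LABEL example_db.label_spark_load_ha_demo
    (
        DATA INFILE("hdfs://hacluster/user/doris/demo_data/*")
        INTO TABLE demo_table
        COLUMNS TERMINATED BY ","
        (c1, c2, c3)
    )
    WITH RESOURCE 'spark_resource_sinan_node_manager_ha'
    (
        -- per-job overrides of the resource's Spark settings, if needed
        "spark.executor.memory" = "4g"
    )
    PROPERTIES
    (
        "timeout" = "3600"
    );

With the RM HA properties set on the resource, the submitted Spark job should be able to fail over between rm1 and rm2 instead of depending on a single ResourceManager address.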
…che#16973) [fix](segcompaction) core when doing segcompaction for cancelling load(apache#16731) (apache#17432) segcompaction is async and in parallel with load job. If the load job is canncelling, memory structures will be destroyed and cause segcompaction crash. This commit will wait segcompaction finished before destruction. [Improvement](auth)(step-2) add ranger authorizer for hms catalog (apache#17424) [fix](rebalance) fix that the clone operation is not performed due to incorrect condition judgment (apache#17381) [fix](merge-on-write) fix that delete bitmap is not calculated correctly when clone tablet (apache#17334) [fix](orc) fix heap-use-after-free and potential memory leak of orc reader (apache#17431) fix heap-use-after-free The OrcReader has a internal FileInputStream, If the file is empty, the memory of FileInputStream will leak. Besides, there is a Statistics instance in FileInputStream. FileInputStream maybe delete if the orc reader is inited failed, but Statistics maybe used when orc reader is closed, causing heap-use-after-free error. Potential memory leak When init file scanner in file scan node, the file scanner prepare failed, the memory of file scanner will leak. [Enhencement](schema_scanner) Optimize the performance of reading information schema tables (apache#17371) batch fill block batch call rpc from FE to get table desc For 34w colunms SELECT COUNT( * ) FROM information_schema.columns; time: 10.3s --> 0.4s [Improvement](restore) make timeout of restore job's dispatching task progress configuable (apache#17434) when a restore job which has a plenty of replicas, it may fail due to timeout. The error message is: [RestoreJob.checkAndPrepareMeta():782] begin to send create replica tasks to BE for restore. total 381344 tasks. timeout: 600000 Currently, the max value of timeout is fixed, it's not suitable for such cases. [fix](regression) Parameterize the S3 info for segcompaction regression (apache#16731) (apache#17391) Use regression conf instead hard-coded S3 info. Signed-off-by: freemandealer <freeman.zhang1992@gmail.com> [chore](fe) enhance_mysql_data_type (apache#17429) [enhance](report) add local and remote size in tablet meta header action (apache#17406) [improvement](memory) Modify `mem_limit` default value (apache#17322) Modify the default value of mem_limit to auto. auto means process mem limit is equal to max(physical mem * 0.9, 6.4G). 6.4G is the maximum memory reserved for the system. [enhancement](transaction) Reduce hold writeLock time for DatabaseTransactionMgr to clear transaction (apache#17414) * [enhancement](transaction) Reduce hold writeLock time for DatabaseTransactionMgr to clear transaction * fix ut * remove unnessary field for remove txn bdbje log --------- Co-authored-by: caiconghui1 <caiconghui1@jd.com> [Opt](Vec) Use const_col to opt current functions. (apache#17324) [deps](libhdfs) add official hadoop libhdfs for x86 (apache#17435) This is the first step to introduce official hadoop libhdfs to Doris. Because the current hdfs client libhdfs3 lacks some important feature and is hard to maintain. 
Download the hadoop 3.3.4 binary from hadoop website: https://hadoop.apache.org/releases.html Extract libs and headers which are used for libhdfs, and pack them into hadoop_lib_3.3.4-x86.tar.gz Upload it to https://github.com/apache/doris-thirdparty/releases/tag/hadoop-libs-3.3.4 TODO: The hadoop libs for arm is missing, we need to find a way to build it [docs](typo) Correct the wrong default value of DECIMAL type displayed in the Help CREATE TABLE apache#17422 Correct the wrong default value of DECIMAL type displayed in the Help CREATE TABLE [fix](array)(parquet) fix be core dump due to load from parquet file containing array types (apache#17298) [refactor](functioncontext) remove duplicate type definition in function context (apache#17421) remove duplicate type definition in function context remove unused method in function context not need stale state in vexpr context because vexpr is stateless and function context saves state and they are cloned. remove useless slot_size in all tuple or slot descriptor. remove doris_udf namespace, it is useless. remove some unused macro definitions. init v_conjuncts in vscanner, not need write the same code in every scanner. using unique ptr to manage function context since it could only belong to a single expr context. Issue Number: close #xxx --------- Co-authored-by: yiguolei <yiguolei@gmail.com> [fix](regression) Adjust the test_add_drop_index case to avoid that FE failed to start when replaying the log (apache#17425) This pr is a temporary circumvention fix. In regression case `inverted_index_p0/test_add_drop_index.groovy`, both bitmap index and inverted index are created on the same table, when create or drop bitmap index will change table's state to `SCHEMA_CHANGE`, create or drop inverted index not change the table's state. Before do create or drop inverted index check the table's state whether is `NORMAL` or not, Because of replay log for 'bitmap index' has change table state, and it didn't finish soon lead to table's state not change back to `NORMAL`, then replay log for 'inverted index' failed, FE start failed. [fix](ParquetReader) definition level of repeated parent is wrong (apache#17337) Fix three bugs: 1. `repeated_parent_def_level ` should be the definition of its repeated parent. 2. Failed to parse schema like `decimal(p, s)` 3. Fill wrong offsets for array type [Enchancement](Materialized-View) add more error infomation for select materialized view fail (apache#17262) add more error infomation for select materialized view fail [fix](publish) fix when TabletPublishTxnTask::handle() error, transaction publish success, and query table error (apache#17409) be use EnginePublishVersionTask to publish all replica of all tablets of table of one transaction, and EnginePublishVersionTask use TabletPublishTxnTask to truly publish tablet and make rowset visible. but if TabletPublishTxnTask error, tablet id will add _error_tablet_ids but no return some errors, and EnginePublishVersionTask will not report any error to fe, and fe make this transaction visible, and partition's version add 1. but if you query this table, will return error like "MySQL [test]> select * from test12;ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]failed to initialize storage reader. tablet=14023.730105214.d742d664692db946-386daa993d84d89d, res=[INTERNAL_ERROR][9.134.167.25]fail to find path in version_graph. spec_version: 0-3, backend=9.134.167.25". 
after this pr, _error_tablet_ids will report to fe, this transaction will not be visible and add ErrMsg like "publish on tablet 14038 failed.". Signed-off-by: nextdreamblue <zxw520blue1@163.com> [fix](merge-on-write) fix cu compaction correctness check (apache#17347) During concurrent import, the same row location may be marked delete multiple times by different versions of rowset. Duplicate row location need to be removed. [feature](filecache) add a const parameter to control the cache version (apache#17441) * [feature](filecache) add a const parameter to control the cache version * fix [docs](typo) fix faq docs, already support rename column. (apache#17428) * Update data-faq.md Already support rename column. * fix --------- Co-authored-by: zhangyu209 <zhangyu209@meituan.com> [doc](auth)auth doc (apache#17358) * auth doc * auth en doc * add note [fix](planner) Slots in the cojuncts of table function node didn't got materialized apache#17460 [enhencement](jdbc catalog) Use Druid instead of HikariCP in JdbcClient (apache#17395) This pr does three things: 1. Use Druid instead of HikariCP in JdbcClient 2. when download udf jar, add the name of the jar package after the local file name. 3. refactor some jdbcResource code [fix](priv) fix duplicated priv check when check column priv (apache#17446) when executing select stmt, columns privilege check will be invoked multiple times(column number in select stmt) Issue Number: close #xxx [deps](libhdfs) fix hadoop libs build error (apache#17470) [chore](config) Increase the default maximum depth limit for expressions (apache#17418) [fix](type compatibility) fix unsigned int type compatibility problem (apache#17427) Fix unsigned int type compatibility value scope problem. When defining columns, map UNSIGNED INT to BIGINT for compatibility. The problems are as follows: It is not consistent with this doc image We support the unsigned int type to be compatible with mysql types, but the unsigned int type is created as the int at the time of definition. This will cause numerical overflow. [Improvement](meta) support return total statistics of all databases for command show proc '/jobs (apache#17342) currently, show proc jobs command can only used on a specific database, if a user want to see overall data of the whole cluster, he has to look into every database and sum them up, it's troublesome. now he can achieve it simply by giving a -1 as dbId. 
mysql> show proc '/jobs/-1'; +---------------+---------+---------+----------+-----------+-------+ | JobType | Pending | Running | Finished | Cancelled | Total | +---------------+---------+---------+----------+-----------+-------+ | load | 0 | 0 | 0 | 2 | 2 | | delete | 0 | 0 | 0 | 0 | 0 | | rollup | 0 | 0 | 1 | 0 | 1 | | schema_change | 0 | 0 | 2 | 0 | 2 | | export | 0 | 0 | 0 | 3 | 3 | +---------------+---------+---------+----------+-----------+-------+ mysql> show proc '/jobs/-1/rollup'; +----------+------------------+---------------------+---------------------+------------------+-----------------+----------+---------------+----------+------+----------+---------+ | JobId | TableName | CreateTime | FinishTime | BaseIndexName | RollupIndexName | RollupId | TransactionId | State | Msg | Progress | Timeout | +----------+------------------+---------------------+---------------------+------------------+-----------------+----------+---------------+----------+------+----------+---------+ | 17826065 | order_detail | 2023-02-23 04:21:01 | 2023-02-23 04:21:22 | order_detail | rp1 | 17826066 | 6009 | FINISHED | | NULL | 2592000 | +----------+------------------+---------------------+---------------------+------------------+-----------------+----------+---------------+----------+------+----------+---------+ 1 row in set (0.01 sec) [Chore](schema change) remove some unused code in schema change (apache#17459) remove some unused code in schema change. remove some row-based config and code. [fix](planner) Fix incosistency between groupby expression and output of aggregation node (apache#17438) [fix](restore) fix bug when replay restore and reserve dynamic partition (apache#17326) when replay restore a table with reserve_dynamic_partition_enable=true, must registerOrRemoveDynamicPartitionTable with isReplay=true, or maybe cause OBSERVER can not replay restore auditlog success. [Enhance](auth)Users support multiple roles (apache#17236) Describe your changes. 1.support GRANT role [, role] TO user_identity 2.support REVOKE role [, role] FROM user_identity 3.’Show grants‘ Add a column to display the roles owned by users 4.‘alter user’ prohibit deleting user's role 5.Repair Logic of roleName cannot start with RoleManager.DEFAULT_ ROLE [feature](cooldown)add ut for cooldown on be (apache#17246) * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be * add ut for cooldown on be [fix](resource)Add s3 checker for alter resource (apache#17467) * add s3 validity checker for alter resource. * add s3 validity checker for alter resource. * add s3 validity checker for alter resource. 
[enhance](Nereids): refactor code in Project (apache#17450) [refactor](Nereids): refactor PushdownLimit (apache#17355) [enhance](Nereids): remove rule flag in LogicalJoin (apache#17452) [Enhancement](Planner)fix unclear exception msg when create table. apache#17473 [fix](planner) only table name should convert to lowercase when create table (apache#17373) we met error: Unknown column '{}DORIS_DELETE_SIGN{}' in 'default_cluster:db.table. that because when we use alias as the tableName to construct a Table, all parts of the name will be lowercase if lowerCaseTableNames = 1. To avoid it, we should extract tableName from alias and only lower tableName [fix](olap)Crashing caused by IS NULL expression (apache#17463) Issue Number: close apache#17462 [feature](Nereids) add rule split limit into two phase (apache#16797) 1. Add a rule split limit, like Limit(Origin) ==> Limit(Global) -> Gather -> Limit(Local) 2. Add a rule: limit-> sort ==> topN 3. fix a bug about topN 4. make the type of limit,offset long in topN And because this rule is always beneficial, we add a rule in the rewrite phase [Enhancement](spark load)Support for RM HA (apache#15000) Adding RM HA configuration to the spark load. Spark can accept HA parameters via config, we just need to accept it in the DDL CREATE EXTERNAL RESOURCE spark_resource_sinan_node_manager_ha PROPERTIES ( "type" = "spark", "spark.master" = "yarn", "spark.submit.deployMode" = "cluster", "spark.executor.memory" = "10g", "spark.yarn.queue" = "XXXX", "spark.hadoop.yarn.resourcemanager.address" = "XXXX:8032", "spark.hadoop.yarn.resourcemanager.ha.enabled" = "true", "spark.hadoop.yarn.resourcemanager.ha.rm-ids" = "rm1,rm2", "spark.hadoop.yarn.resourcemanager.hostname.rm1" = "XXXX", "spark.hadoop.yarn.resourcemanager.hostname.rm2" = "XXXX", "spark.hadoop.fs.defaultFS" = "hdfs://XXXX", "spark.hadoop.dfs.nameservices" = "hacluster", "spark.hadoop.dfs.ha.namenodes.hacluster" = "mynamenode1,mynamenode2", "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode1" = "XXX:8020", "spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode2" = "XXXX:8020", "spark.hadoop.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider", "working_dir" = "hdfs://XXXX/doris_prd_data/sinan/spark_load/", "broker" = "broker_personas", "broker.username" = "hdfs", "broker.password" = "", "broker.dfs.nameservices" = "XXX", "broker.dfs.ha.namenodes.XXX" = "mynamenode1, mynamenode2", "broker.dfs.namenode.rpc-address.XXXX.mynamenode1" = "XXXX:8020", "broker.dfs.namenode.rpc-address.XXXX.mynamenode2" = "XXXX:8020", "broker.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" ); Co-authored-by: liujh <liujh@t3go.cn> [regression-test](Nereids) add binary arithmetic regression test cases(apache#17363) add all of the valid binary arithmetic expressions test for nereids. currently, float, double, stringlike(string, char, varchar) doesn't support div, bitand, bitor, bitxor. some results with float type are incorrect because of inaccurate precision of regression-test framework. [Fix](Lightweight schema Change) query error caused by array default type is unsupported (apache#17331) We have supportted array type default [], but when using lightweight schema Change to add column array type, query failed as follows: Fix "array default type is unsupported" error. Fix the default value filling assignment digit problem. 
[fix](nereids) fix bugs in nereids window function (apache#17284) fix two problems: 1. push agg-fun in windowExpression down to AggregateNode for example, sql: select sum(sum(a)) over (order by b) Plan: windowExpression( sum(y) over (order by b)) +--- Agg(sum(a) as y, b) 2. push other expr to upper proj for example, sql: select sum(a+1) over () Plan: windowExpression(sum(y) over ()) +--- Project(a + 1 as y,...) +--- Agg(a,...) [vectorized](bug) fix array constructor function change origin column from block (apache#17296) [fix](remote)fix whole file cache and sub file cache (apache#17468) [enhancement](planner) support case transition of timestamp datatype when create table (apache#17305) [Fix](vectorization) fixed that when a column's _fixed_values exceeds the max_pushdown_conditions_per_column limit, the column will not perform predicate pushdown, but if there are subsequent columns that need to be pushed down, the subsequent column pushdown will be misplaced in _scan_keys and it causes query results to be wrong (apache#17405) the max_pushdown_conditions_per_column limit, the column will not perform predicate pushdown, but if there are subsequent columns that need to be pushed down, the subsequent column pushdown will be misplaced in _scan_keys and it causes query results to be wrong Co-authored-by: tongyang.hty <hantongyang@douyu.tv> [fix](DOE)Fix es p0 case error (apache#17502) Fix es array parse error, introduced by apache#16806 [docs](doc) Add docs for Apache Kyuubi (apache#17481) * add kyuubi doc of zh-CN & en [chore](macOS) Disable detect_container_overflow at BE startup (apache#17514) BE failed to start up due to container-overflow errors reported by address sanitizer. [refactor](remove string val) remove string val structure, it is same with string ref (apache#17461) remove stringval, decimalv2val, bigintval [enhancement](regression-test) add sleep 3s for schema change and rollup (apache#17484) Co-authored-by: yiguolei <yiguolei@gmail.com> [enhancement](exception) add exception structure and using unique ptr in VExplodeBitmapTableFunction (apache#17531) add exception class in common. using unique ptr in VExplodeBitmapTableFunction support single exception or nested exception, like this: ---SingleException [E-100] test OS_ERROR bug @ 0x55e80b93c0d9 doris::Exception::Exception<>() @ 0x55e80b938df1 doris::ExceptionTest_NestedError_Test::TestBody() @ 0x55e82e16bafb testing::internal::HandleSehExceptionsInMethodIfSupported<>() @ 0x55e82e15ab3a testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x55e82e1361e3 testing::Test::Run() @ 0x55e82e136f29 testing::TestInfo::Run() @ 0x55e82e1376e4 testing::TestSuite::Run() @ 0x55e82e148042 testing::internal::UnitTestImpl::RunAllTests() @ 0x55e82e16dcab testing::internal::HandleSehExceptionsInMethodIfSupported<>() @ 0x55e82e15ce4a testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x55e82e147bab testing::UnitTest::Run() @ 0x55e80c4b39e3 RUN_ALL_TESTS() @ 0x55e80c4a99b5 main @ 0x7f0a619d0493 __libc_start_main @ 0x55e80b84602a _start @ (nil) (unknown) [feature](function) support type template in SQL function (apache#17344) A new way just like c++ template is proposed in this PR. The previous functions can be defined much simpler using template function. 
# map element extract template function [['element_at', '%element_extract%'], 'E', ['ARRAY<E>', 'BIGINT'], 'ALWAYS_NULLABLE', ['E']], # map element extract template function [['element_at', '%element_extract%'], 'V', ['MAP<K, V>', 'K'], 'ALWAYS_NULLABLE', ['K', 'V']], BTW, the plain type function is not affected and the legacy ARRAY_X MAP_K_V is still supported for compatability. [opt](string) optimize string equal comparision (apache#17336) Optimize string equal and not-equal comparison by using memequal_small_allow_overflow15. [feature](Nereids): pushdown complex project through inner/outer Join. (apache#17365) [Chore](execution) change PipelineTaskState to enum class && remove some row-based code (apache#17300) 1. change PipelineTaskState to enum class 2. remove some row-based code on FoldConstantExecutor::_get_result 3. reduce memcpy on minmax runtime filter function(Now we can guarantee that the input data is aligned) 4. add Wunused-template check, and remove some unused function, change some static function to inline function. [Feature](array-function) Support array_concat function (apache#17436) [feature](array_function) add support for array_popfront (apache#17416) [fix](memory) Fix MacOS mem_limit parse error and GC after env Init apache#17528 Fix MacOS mem_limit parse result is 0. Fix GC after env Init, otherwise, when the memory is insufficient, BE will start failure. *** Query id: 0-0 *** *** Aborted at 1677833773 (unix time) try "date -d @1677833773" if you are using GNU date *** *** Current BE git commitID: 8ee5f45 *** *** SIGSEGV address not mapped to object (@0x70) received by PID 24145 (TID 0x7fa53c9fd700) from PID 112; stack trace: *** 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at be/src/common/signal_handler.h:420 1# os::Linux::chained_handler(int, siginfo*, void*) in /usr/local/jdk/jre/lib/amd64/server/libjvm.so 2# JVM_handle_linux_signal in /usr/local/jdk/jre/lib/amd64/server/libjvm.so 3# signalHandler(int, siginfo*, void*) in /usr/local/jdk/jre/lib/amd64/server/libjvm.so 4# 0x00007FA56295A400 in /lib64/libc.so.6 5# doris::MemTrackerLimiter::log_process_usage_str(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) at be/src/runtime/memory/mem_tracker_limiter.cpp:208 6# doris::MemTrackerLimiter::print_log_process_usage(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) at be/src/runtime/memory/mem_tracker_limiter.cpp:226 7# doris::Daemon::memory_maintenance_thread() at be/src/common/daemon.cpp:245 8# doris::Thread::supervise_thread(void*) at be/src/util/thread.cpp:455 9# start_thread in /lib64/libpthread.so.0 10# clone in /lib64/libc.so.6 [improvement](inverted index)Remove searcher bitmap timer to improve query speed (apache#17407) Timer becomes a bottleneck when the query hit volume is very high. [enhance](cooldown) turn write cooldown meta async (apache#16813) [fix](insert) fix memory leak for insert transaction (apache#17530) [enhance](cooldown) skip once failed follow cooldown tablet (apache#16810) [bugfix](jsonb) Fix create mv using jsonb key cause be crash (apache#17430) [Feature](Nereids) support MarkJoin (apache#16616) 1.The new optimizer supports the combination of subquery and disjunction.In the way of MarkJoin, it behaves the same as the old optimizer. For design details see:https://emmymiao87.github.io/jekyll/update/2021/07/25/Mark-Join.html. 
2.Implicit type conversion is performed when conjects are generated after subquery parsing 3.Convert the unnesting of scalarSubquery in filter from filter+join to join + Conjuncts. [dependency](fe)Dependency Upgrade (apache#17377) * Upgrade log4j to 2.X - binding log4j version to 2.18.0 - used log4j-1.2-api complete smooth upgrade * Upgrade filerupload to 1.5 * Upgrade commons-io to 2.7 * Upgrade commons-compress to 1.22 * Upgrade gson to 2.8.9 * Upgrade guava to 30.0-jre * Binding jackson version to 2.14.2 * Upgrade netty-all to 4.1.89.final * Upgrade protobuf to 3.21.12 * Upgrade kafka-clints to 3.4.0 * Upgrade calcite version to 1.33.0 * Upgrade aws-java-sdk to 1.12.302 * Upgrade hadoop to 3.3.4 * Upgrade zookeeper to 3.4.14 * Binding tomcat-embed-core to 8.5.86 * Upgrade apache parent pom to 25 * Use hive-exec-core as a hive dependency, add the missing jar-hive-serde separately * Basic public dependencies are extracted to parent dependencies * Use jackson uniformly as the basic json tool * Remove springloaded, spring-boot-devtools has the same functionality * Modify the spark-related dependency scope to provide, which should be provided at runtime [feature-wip](nereids) Support Q-Error to measure the accuracy of derived statistics (apache#17185) Collect each estimated output rows and exact output rows for each plan node, and use this to measure the accuracy of derived statistics. The estimated result is managed by ProfileManager. We would get this estimated result in the http request by query id later. [fix](bitmap) fix wrong result of bitmap_or for null (apache#17456) Result of select bitmap_to_string(bitmap_or(to_bitmap(1), null)) should be 1 instead of null. This PR fix logic of bitmap_or and bitmap_or_count. Other count related funcitons should also be checked and fix, they will be fixed in another PR. [Improvement](datev2) push down datev2 predicates with date literal (apache#17522) [FIX](complex-type) fix Is null predict for map/struct (apache#17497) Fix is null predicate is not supported in select statement for map and struct column [fix](planner) insert default value should not change return type of function object in function set (apache#17536) function now's return type changed to datetimev2 by mistake. It can be reproduced in the following way CREATE TABLE `testdt` ( `c1` int(11) NULL, `c2` datetimev2 NULL DEFAULT CURRENT_TIMESTAMP ) ENGINE=OLAP DUPLICATE KEY(`c1`, `c2`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`c1`) BUCKETS 10 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "light_schema_change" = "true", "disable_auto_compaction" = "false" ); insert into testdt2(c1) values(1); select now(); [BugFix](PG catalog) fix that pg catalog can not get all schemas that a pg user can access. (apache#17517) Describe your changes. In the past, pg catalog use sql SELECT schema_name FROM information_schema.schemata where schema_owner='<UserName>'; to select schemas of an user. Howerver, this sql can not find all schemas that a user can access, that because: A user may not be the owner of an schema, but may have read permission on the schema. A user may inherit the permissions of its user group and thus have read permissions on one schema. 
For these reasons, we replace the sql statement with select nspname from pg_namespace where has_schema_privilege('<UserName>', nspname, 'USAGE'); [fix](meta) fix catlog parameter when checking privilege of show_create_table stmt (apache#17445) the ctl parameter of show_create_table stmt is not set in checkTblPriv, this is not correct for multicatalog [enhancement](histogram) optimize the histogram bucketing strategy, etc (apache#17264) * optimize the histogram bucketing strategy, etc * fix p0 regression of histogram [fix](DOE) Fix esquery not working (apache#17566) Function esquery does not work because there is a problem parsing the first parameter type. The first parameter, which is SlotRef, will be cast to CastExpr. This will cause error while generating ES DSL. Add more types to adapt esquery function. (# [feature](ui)add profile download button 17547) [Fix](FQDN) fix slow when ip changed (apache#17455) [fix](scanner) remove useless _src_block_mem_reuse to avoid core dump while loading (apache#17559) The _src_block_mem_reuse variable actually not work, since the _src_block is cleared each time when we call get_block. But current code may cause core dump, see issue apache#17587. Because we insert some result column generated by expr into dest block, and such a column holds a pointer to some column in original schema. When clearing the data of _src_block, some column's data in dest block is also cleared. e.g. coalesce will return a result column which holds a pointer to some original column, see issue apache#17588 [feature](nereids)support bitmap runtime filter on nereids (apache#16927) * A in(B) -> bitmap_contains(bitmap_union(B), A) support bitmap runtime filter on nereids * GroupPlan -> Plan * fmt * fix target cast problem remove test code [fix](nereids)fix first_value/lead/lag window function bug in nereids (apache#17315) * [fix](nereids)fix first_value/lead/lag window function bug in nereids * add more test * add order by to fix test case * fix test cases [dependenct](fe) Replace jackson-mapper-asl with fastxml-jsckson (apache#17303) [typo](docs) Add a hyperlink to facilitate user redirect. (apache#17563) [fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (apache#17420) ECB algorithm, block_encryption_mode does not take effect, it only takes effect when init vector is provided. Solved: 192/256 supports calculation without init vector For other algorithms, an error should be reported when there is no init vector Initialization Vector. The default value for the block_encryption_mode system variable is aes-128-ecb, or ECB mode, which does not require an initialization vector. The alternative permitted block encryption modes CBC, CFB1, CFB8, CFB128, and OFB all require an initialization vector. Reference: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_aes-decrypt Note: This fix does not support smooth upgrades. during upgrade process, query may report error: funciton not found [fix](file cache)fix block file cache can't be configured (apache#17511) [Enchancement](function) Inline some aggregate function && remove nullable combinator (apache#17328) 1. Inline some aggregate function 2. 
remove nullable combinator [fix](in-bitmap) fix result may be wrong if the left side of the in bitmap predicate is a constant (apache#17570) [Bug](array filter) Fix bug due to `ColumnArray::filter_generic` invalid inplace `size_at` after `set_end_ptr` (apache#17554) We should make a new PodArray to add items instead of do it inplace [Refactor](map) remove using column array in map to reduce offset column (apache#17330) 1. remove column array in map 2. add offsets column in map Aim to reduce duplicate offset from key-array and value-array in disk [vectorized](udaf) support array type for java-udaf (apache#17351) [fix](profile) modify load profile some bugs and docs (apache#17533) 1. 'insert into' profile has 'insert' type, can not query by 'load' type 2. 'insert into' profile does not have job_id, can not query by job_id. so put all profiles key with query_id 3. 'broker load' profile does not have some infos, npe [fix](Nereids) store offset of Limit in exchangeNode (apache#17548) When the limit has offset, we should add an exchangeNode and store the offset in it [enhancement](Nereids) refactor costModel framework (apache#17339) refactor cost-model frameWork: 1. Use Cost class to encapsulate double cost 2. Use the `addChildCost` function to calculate the cost with children rather than add directly Note we use the `Cost` class because we hope to customize the operator of adding a child host. Therefore, only when the cost would add the child Cost or be added by the parent we use `Cost`. Otherwise, we use double such as `upperbound` [enhancement](Nereids) support decimalv3 and precision derive (apache#17393) [fix](regression) close p0 fe regression pipline config for avoiding flink load fail (get tableList write lock timeout) (apache#17573) This pull request for bellow problem : when fe config set sys_log_verbos_modules = org.apache.doris, which will make fe get writeLock longer. In this config, make a stream load, that stream load will failed with this message ([ANALYSIS_ERROR]errCode = 2, detailMessage = get tableList write lock timeout, tableList=(Table [id=86135, name=flink_connector, type=OLAP])) [feature](regression) add http test action (apache#17567) [typo](docs) Fix some misspelled words (apache#17593)
Proposed changes
Issue Number: close #13806
Problem summary
Add ResourceManager (RM) HA configuration support to Spark Load.
Spark already accepts the HA parameters through its configuration; the DDL only needs to accept and pass them through, as in the example below:
CREATE EXTERNAL RESOURCE spark_resource_sinan_node_manager_ha
PROPERTIES
(
"type" = "spark",
"spark.master" = "yarn",
"spark.submit.deployMode" = "cluster",
"spark.executor.memory" = "10g",
"spark.yarn.queue" = "XXXX",
"spark.hadoop.yarn.resourcemanager.address" = "XXXX:8032",
"spark.hadoop.yarn.resourcemanager.ha.enabled" = "true",
"spark.hadoop.yarn.resourcemanager.ha.rm-ids" = "rm1,rm2",
"spark.hadoop.yarn.resourcemanager.hostname.rm1" = "XXXX",
"spark.hadoop.yarn.resourcemanager.hostname.rm2" = "XXXX",
"spark.hadoop.fs.defaultFS" = "hdfs://XXXX",
"spark.hadoop.dfs.nameservices" = "hacluster",
"spark.hadoop.dfs.ha.namenodes.hacluster" = "mynamenode1,mynamenode2",
"spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode1" = "XXX:8020",
"spark.hadoop.dfs.namenode.rpc-address.hacluster.mynamenode2" = "XXXX:8020",
"spark.hadoop.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
"working_dir" = "hdfs://XXXX/doris_prd_data/sinan/spark_load/",
"broker" = "broker_personas",
"broker.username" = "hdfs",
"broker.password" = "",
"broker.dfs.nameservices" = "XXX",
"broker.dfs.ha.namenodes.XXX" = "mynamenode1, mynamenode2",
"broker.dfs.namenode.rpc-address.XXXX.mynamenode1" = "XXXX:8020",
"broker.dfs.namenode.rpc-address.XXXX.mynamenode2" = "XXXX:8020",
"broker.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
);
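For reference, a minimal usage sketch (not part of this change; the database, table, label, and HDFS input path below are hypothetical placeholders): once the HA-enabled resource is created, a Spark Load job only references the resource by name, so none of the ResourceManager or NameNode addresses need to be repeated in the LOAD statement.
LOAD LABEL example_db.label_spark_ha_demo
(
    DATA INFILE("hdfs://hacluster/doris_prd_data/sinan/input/*")
    INTO TABLE example_tbl
    COLUMNS TERMINATED BY ","
)
WITH RESOURCE 'spark_resource_sinan_node_manager_ha'
(
    -- optional per-job overrides of the resource's Spark properties
    "spark.executor.memory" = "4g"
)
PROPERTIES
(
    "timeout" = "3600"
);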
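Likewise, a short administrative sketch, reflecting general Doris resource usage rather than anything added by this PR (the user name and label are placeholders): a non-admin user needs USAGE_PRIV on the resource before submitting the job, and the resource definition and job progress can be inspected afterwards.
-- grant the load user access to the HA resource (placeholder user)
GRANT USAGE_PRIV ON RESOURCE "spark_resource_sinan_node_manager_ha" TO "load_user"@"%";
-- check the resource and its properties
SHOW RESOURCES WHERE NAME = "spark_resource_sinan_node_manager_ha";
-- track the load job submitted above
SHOW LOAD WHERE LABEL = "label_spark_ha_demo";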
Checklist (Required)
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...