[enhancement](cloud) add table version to cloud #32738
Conversation
Thank you for your contribution to Apache Doris. Since 2024-03-18, the documentation has been moved to doris-website.
clang-tidy review says "All clean, LGTM! 👍"
run buildall
TPC-H: Total hot run time: 38518 ms
TPC-DS: Total hot run time: 185760 ms
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Review thread on fe/fe-core/src/main/java/org/apache/doris/catalog/OlapTable.java (outdated, resolved).
run buildall
clang-tidy review says "All clean, LGTM! 👍"
run buildall
clang-tidy review says "All clean, LGTM! 👍"
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TPC-H: Total hot run time: 37923 ms
TPC-DS: Total hot run time: 181406 ms
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Review thread on fe/fe-core/src/main/java/org/apache/doris/cloud/qe/SnapshotProxy.java (outdated, resolved).
Force-pushed from 8a3ea44 to a2ec83e.
clang-tidy review says "All clean, LGTM! 👍"
run buildall
TPC-H: Total hot run time: 37932 ms
TPC-DS: Total hot run time: 181778 ms
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
run cloud_p0
Force-pushed from a2ec83e to 6500edf.
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TPC-H: Total hot run time: 38040 ms
TPC-DS: Total hot run time: 181855 ms
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
PR approved by at least one committer and no changes requested.
run cloud_p0
PR approved by anyone and no changes requested.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SnapshotProxy {
It is unnecessary to extract the visible-version methods into SnapshotProxy. If you want to reorganize the code, it might be better to put this kind of helper class in the cloud.rpc package and name it VersionHelper.
BTW, you have put the code for getting the version in CloudPartition into SnapshotProxy; should the code for getting the version in OlapTable also be put here?
Because the table version and the partition visible version use the same RPC interface, there would be a lot of duplicate code; SnapshotProxy is designed to put that duplicated code in one place.
If the name and package need to be changed, I can modify them in the next PR, which will fix a regression test case.
Hi, I finished it in #32989. Please review it.
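To make the deduplication being discussed concrete, here is a minimal sketch of what a shared helper could look like. It is an illustration only: the class name VersionHelper comes from the review suggestion above, but fetchWithRetry and the retry scheme are hypothetical, not Doris's actual API.

```java
import java.util.function.Supplier;

// Hypothetical sketch of the shared-helper idea discussed above: table version
// and partition visible version lookups go through the same meta-service RPC,
// so the retry and error handling can live in one place. Names are illustrative.
public class VersionHelper {
    // Invokes rpcCall up to maxRetries times (maxRetries must be >= 1),
    // returning the first successful result or rethrowing the last failure.
    public static <T> T fetchWithRetry(Supplier<T> rpcCall, int maxRetries) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            try {
                return rpcCall.get(); // stands in for a getVersion RPC
            } catch (RuntimeException e) {
                last = e; // transient failure: retry
            }
        }
        throw last;
    }
}
```

With this shape, both the table-version and partition-version call sites would pass their own RPC lambda instead of duplicating the retry loop.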
@@ -115,6 +118,8 @@ public class SummaryProfile {
         builder.put(GET_PARTITION_VERSION_TIME, 1);
         builder.put(GET_PARTITION_VERSION_COUNT, 1);
         builder.put(GET_PARTITION_VERSION_BY_HAS_DATA_COUNT, 1);
+        builder.put(GET_TABLE_VERSION_TIME, 1);
+        builder.put(GET_TABLE_VERSION_COUNT, 1);
It seems we don't need the table version in the query profile summary.
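For context on the diff above, a rough model of the registration pattern it follows, under two assumptions: that the integer value is a display/indent level and that the builder preserves insertion order (as Guava's ImmutableMap.Builder does). The class and key strings here are stand-ins, not SummaryProfile's actual names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the pattern in the diff above: each profile key is
// registered once, in display order, mapped to an integer (assumed to be an
// indent level). LinkedHashMap stands in for Guava's ImmutableMap.Builder.
public class ProfileKeyRegistry {
    public static Map<String, Integer> build() {
        Map<String, Integer> keys = new LinkedHashMap<>();
        keys.put("Get Partition Version Time", 1);
        keys.put("Get Partition Version Count", 1);
        keys.put("Get Partition Version By Has Data Count", 1);
        keys.put("Get Table Version Time", 1);  // added by this PR
        keys.put("Get Table Version Count", 1); // added by this PR
        return keys;
    }
}
```

The reviewer's point is that the last two entries may not belong in the summary at all; removing them only means not registering those keys.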
* [fix](merge cloud) Fix cloud be set be tag map (#32864)
* [chore] Add gavinchou to collaborators (#32881)
* [chore](show) support statement to show views from table (#32358): "show views" lists t1_view and t2_view; "show views like '%t1%'" returns t1_view; "show views where create_time > '2024-03-18'" returns t2_view.
* [Enhancement](ranger) Disable some permission operations when Ranger or LDAP are enabled (#32538)
* [chore](ci) exclude unstable trino_connector case (#32892)
* [fix](Nereids) NPE when create table with implicit index type (#32893)
* [improvement](mtmv) Support more join types for query rewriting by materialized view (#32685): multi-table joins with INNER, LEFT/RIGHT/FULL OUTER, LEFT/RIGHT SEMI, and LEFT/RIGHT ANTI join types.
* [Serde](Variant) support arrow serialization for varint type (#32780)
* [fix](multicatalog) fix no data error when read hive table on cosn (#32815): reading a hive-on-cosn table returned an empty result (iceberg on cosn was fine) because of a misuse of cosn's file system; according to cosn's docs, fs.cosn.impl should be org.apache.hadoop.fs.CosFileSystem.
* [fix](nereids) EliminateGroupByConstant should replace agg's output after removing constant group by keys (#32878)
* [Fix](executor) Fix regression test for test_active_queries/test_backend_active_tasks (#32899)
* [fix](iceberg) fix iceberg catalog bug and p2 test cases (#32898): 1. PR #30198 changed IcebergHMSExternalCatalog.java to get locationUrl via hive metastore's getCatalog(), which only exists in hive 3+ and fails on hive 2.x; that logic (only used for iceberg table writing, still under development) is temporarily removed and will be rethought later. 2. Some P2 test cases missed order_qt, and some out files were regenerated because the output format of floating-point types changed.
* [revert](jni) revert part of #32455 (#32904)
* [fix](spill) Avoid releasing resources while spill tasks are executing (#32783)
* [chore](log) print query id before logging profile in be.INFO (#32922)
* [fix](grace-exit) Stopping reportwork incorrectly caused a heap-use-after-free (#32929)
* [improvement](decommission be) decommission check replica num (#32748)
* [fix](arrow-flight) Fix reach limit of connections error (#32911): in fe.conf, arrow_flight_token_cache_size must be less than qe_max_connection/2; arrow flight sql is a stateless protocol, connections are usually not actively disconnected, and evicting a bearer token from the cache unregisters its ConnectContext. Also fixes ConnectContext.command not being reset to COM_SLEEP in time (which caused frequent connection kills after query timeout) and the bearer-token evict log and exception. TODO: use arrow flight sessions.
* [bugfix](cloud) few variables not initialized (#32868): cloud/src/recycler/meta_checker.cpp could read uninitialised memory.
* [fix](arrow-flight) Fix arrow flight sql compatibility with JDK 17 and upgrade arrow to 15.0.2 (#32796): requires --add-opens=java.base/java.nio=ALL-UNNAMED (see the Arrow Java compatibility docs). Also: a groovy flight-sql connection reported "AGGREGATE clause must not contain analytic expressions" for SUM(MAX(c1) OVER (PARTITION BY)) while jdbc:arrow-flight-sql did not; groovy cannot print arrow array types (IndexOutOfBoundsException); and "arrow_flight_sql" does not support two-phase read. Run with ./run-regression-test.sh --run --clean -g arrow_flight_sql.
* [fix](spill) SpillStream's writer may not have been finalized (#32931)
* [improvement](spill) Disable DistinctStreamingAgg when spill is enabled (#32932)
* [Improve](inverted_index) update clucene and improve array inverted index writer (#32436)
* [Performance](exec) replace SipHash in function by XXHash (#32919)
* [feature](agg) add aggregate function sum0 (#32541)
* [improvement](mtmv) Support getting tables in a materialized view when collecting tables in a plan (#32797): for mv1 defined as select t1.c1, t3.c2 from table1 t1 inner join table3 t3 on t1.c1 = t3.c2, collecting tables from a query that joins mv1 with table2 yields [table1, table3, table2]; mv1 is expanded to its base tables.
* [enhance](mtmv) support olap table partition column is null (#32698)
* [enhancement](cloud) add table version to cloud (#32738): this PR. In FE, get the table version from the meta service in cloud mode and update it on drop/replace temp partition and commit transaction; in the meta service, the version is created with the index (initial value 1), removed by the recycler, and atomically incremented by the commit/drop partition RPCs and the commit txn RPC.
* [fix](cloud) schema change from not null to null (#32913): 1. use equals instead of == for type comparison; 2. resize the null bitmap by the size of the ref column.
* [feature](Nereids): add ColumnPruningPostProcessor (#32800)
* [case](rowpolicy) fix "row policy has been exist" (#32880)
* [fix](pipeline) fix use error row desc when origin block clear (#32803)
* [fix](Nereids) support variant column with index when create table (#32948)
* [opt](Nereids) support create table with variant type (#32953)
* [test](insert-overwrite) Add insert overwrite auto detect concurrency cases (#32935)
* [fix](compile) fe cannot compile in idea (#32955)
* [enhancement](plsql) Support select * from routines (#32866): shows plsql procedures.
* [fix](trino-connector) fix NoClassDefFoundError of hudi Utils class (#32846): after #32455, the trino-connector-scanner package cannot access the hudi_scanner package, so a separate Utils class was written.
* [exec](column) change some complex column moves to noexcept (#32954)
* [Enhancement](data skew) extends show data skew (#32732)
* [chore](test) let suite compatible with Nereids (#32964)
* Support identical column name in different index. (#32792)
* Limit the max string length to 1024 while collecting column stats to control BE memory usage. (#32470)
* [fix](merge-iterator) fix NOT_IMPLEMENTED_ERROR when read next block view (#32961)
* [improvement](executor) Add tag property for workload group (#32874)
* [fix](auth) unified workload and resource permission logic (#32907): "grant resource" can no longer grant global usage_priv, and grant resource '%' replaces grant resource *. Before the change, grant usage_priv on resource * to f produced GlobalPrivs: Usage_priv; after it, grant usage_priv on resource '%' to f produces ResourcePrivs: %: Usage_priv instead.

Co-authored-by: yujun, Gavin Chou, xy720, yongjinhou, Dongyang Li, stephen, morrySnow, seawinde, lihangyu, Yulei-Yang, starocean999, wangbo, Mingyu Chen, Jerry Hu, zhiqiang, Xinyi Zou, Vallish Pai, amory, HappenLee, Jensen, zhangdong, Yongqiang YANG, jakevin, Mryange, zclllyybb, Tiewei Fang, Xin Liao.
@@ -657,7 +663,7 @@ private void dropCloudPartition(long dbId, long tableId, List<Long> partitionIds
         }
     }

-    public void dropMaterializedIndex(Long tableId, List<Long> indexIds) throws DdlException {
+    public void dropMaterializedIndex(long tableId, List<Long> indexIds, boolean dropTable) throws DdlException {
Where is dropTable referenced in this method?
…partition (apache#32989)
1. Fix the regression test test-table-version.
2. When dropping a partition, only update the table version if the partition is non-empty.
3. Complete some unfinished work from apache#32738.
Proposed changes
Add table version to cloud.
In FE:
Get: if FE is in cloud mode, get the table version from the meta service.
Update: on drop/replace temp partition and on commit transaction.
In meta service:
Add: created on create index; the initial value is 1.
Remove: by the recycler.
Update: atomically incremented (Atomic++) by the commit/drop partition RPCs and the commit txn RPC.
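The lifecycle above can be sketched with a simple counter. This is only an illustration of the stated semantics (init value 1, atomic increment on commit/drop partition and commit txn): in the real system the version is a key-value in the meta service, and the class and method names here are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of the table-version lifecycle described above. The real
// counter lives in the meta service, created with the index and bumped by the
// commit/drop-partition and commit-txn RPCs; this in-memory AtomicLong only
// illustrates the "init value is 1, then Atomic++" semantics.
public class TableVersion {
    private final AtomicLong version = new AtomicLong(1); // init value is 1

    public long get() {                     // FE in cloud mode reads this via RPC
        return version.get();
    }

    public long onCommitTxn() {             // commit txn rpc: Atomic++
        return version.incrementAndGet();
    }

    public long onCommitOrDropPartition() { // commit/drop partition rpc: Atomic++
        return version.incrementAndGet();
    }
}
```

Removal (by the recycler) corresponds to deleting the key-value entirely rather than resetting the counter.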
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...