Doris Roadmap 2024 #30669

morningman · 2024-01-31T15:02:43Z

Roadmap 2023
Roadmap 2022

Separation of Storage and Computation

Async Materialized View

Build materialized view
- Support full refresh
- Support partition level refresh
- Support building mv from olap table
- Support building mv from hive table
- Support building mv from iceberg table
- Support building mv from hudi table
- Nested materialized view with DAG
- Incremental building for external table with partition granularity
- Support partition rollup
- Support partition TTL
- Support REPLACE operation
- Support refresh materialized view by time range
Transparent Rewriting
- Support aggregation and rollup
- Support join
- Partial query rewriting
- Rewriting supports nested materialized view
Materialized view management
- Materialized view recommendation

Semi-Structured Data Analysis

Inverted Index
- Support Inverted Index
- Merging index files
- Working with separation of storage and computation
- Speed up the data loading with inverted index
VARIANT data type
- Support VARIANT data type
- Working with inverted index

Query Optimizer

Basic framework
- Fully supports DQL, DML and DDL
- Optimized memory consumption
- Optimized apply order of RBO rules
- Improved efficiency of Cascades enumeration
Planning quality
- Statistics
  - Support statistics for synced materialized views
  - Support partition level statistics collection
  - Support histogram statistics collection
- New distributed cost model
  - Optimized distributed cost model framework
  - Support runtime cost re-evaluation
  - Support more accurate operator cost fitting models
- Rules and enumerations
  - Expand RBO rules
  - Improve the quality of Cascades enumeration plan
  - Enhance the dphyper enumeration framework to support outer join enumeration and CDC
- Enhance runtime filter adaptive capability
  - Adaptive runtime filter size
  - Adaptive runtime filter type
  - Adaptive runtime filter waiting time
- Support a histogram-based adaptive processing framework for data skew

DataLake Analysis

Support more file formats
- RCFile
- SequenceFile
Support more lake formats
- Support Iceberg with ORC
- Support Iceberg Equality Delete
- Support more system tables on Hudi
- Support CDC scan on Hudi
- Support more system tables on Paimon
Trino Connector compatibility
- Trino Connector compatibility framework
- Support Trino DeltaLake Connector
- Support Trino BigQuery Connector
- Support Trino Cassandra Connector
Datalake write back
- Hive
  - Support unpartitioned table
  - Support partitioned table
  - Support INSERT OVERWRITE
  - Support INSERT
- Iceberg
  - Support unpartitioned table
  - Support partitioned table
  - Support update and delete
- Hudi
- Paimon
Enhanced JDBC Catalog
- Support DB2
- Support sharded database
- Support query concurrency
Enhanced file analysis
- Support insert into table value function
Enhanced file cache
- Support memory-level file cache
- Enhanced cache statistics and hit analysis
Integrate with Apache Ranger
- Support Catalog/Database/Table/Resource/WorkloadGroup auth
- Support row policy
- Support data mask
- Support column level privilege
SQL dialect support
- Presto/Trino
- Spark
- Hive
- ClickHouse
- Oracle
- Postgres

Query Processing

Resource Isolation
- Support hard/soft resource isolation for Query & Load
- Enhance the visibility of resource usage
- Automatic workload management at runtime
Support stored procedures
Support Spill to disk
- Sort Operator
- Aggregate Operator
- Join Operator
Working with shuffle service
Stage by stage query processing

Storage Engine

Ecosystem & Tools

vinlee19 · 2024-01-31T15:38:35Z

Currently, I have completed the development and testing of the JDBC catalog for Apache Druid. If possible, I would like to contribute this feature. PR:#27270

vinlee19 · 2024-01-31T15:49:35Z

Flink-connector-doris will use FlinkCDC to synchronize multiple tables or the entire database from MongoDB and DB2 to Doris.

michael1991 · 2024-02-01T01:36:01Z

typo "Mutlt cluster support" => "Multi cluster support"

Hanchers · 2024-02-04T10:04:42Z

Looking forward to version 2.1

longzmkm · 2024-02-06T08:30:48Z

Looking forward to version 2.1

me too

vonwind · 2024-02-06T08:53:22Z

Walking with innovators

zhbdesign · 2024-02-06T09:07:09Z

Support generating columns

morningman · 2024-02-06T09:21:33Z

Currently, I have completed the development and testing of the JDBC catalog for Apache Druid. If possible, I would like to contribute this feature. PR:#27270

Hi @vinlee19, thanks for your contribution. I'm not sure whether JDBC is a suitable data connector for Druid; I'm concerned about performance. That said, Trino does use JDBC to connect to Druid.
I will take a look at this feature. Could you please also provide test cases (e.g. a Druid Docker Compose setup)?

liugddx · 2024-02-18T04:44:38Z

dbeaver/dbeaver#22836

liugddx · 2024-02-18T04:45:27Z

langchain-ai/langchain#17527

cs3163077 · 2024-02-19T07:22:40Z

Looking forward to version 2.1

me too

sdhzwc · 2024-02-19T08:56:31Z

Looking forward to version 2.1

me too

dragonkid · 2024-02-19T10:46:13Z

Why is there no 'Support building mv from Paimon table'?

qianmoQ · 2024-02-28T17:47:49Z

BI tools compatibility: can it be adapted to https://github.com/devlive-community/datacap?

morningman · 2024-03-06T15:07:56Z

Why is there no 'Support building mv from Paimon table'?

It will be supported.

morningman · 2024-03-06T15:10:37Z

BI tools compatibility: can it be adapted to https://github.com/devlive-community/datacap?
Hi @qianmoQ,
I am not familiar with datacap, but you are very welcome to help Doris adapt to it.
I saw that Doris is on the logo wall. Maybe you could post a blog on the Doris website about how to connect to Doris using datacap?

ShawshankLin · 2024-03-09T03:44:23Z

Support transactional multi-table DELETE and INSERT, so that aggregate tables can be used with dbt's incremental models
https://docs.getdbt.com/docs/build/incremental-models

JiangJamm · 2024-03-18T10:54:34Z

Looking forward to support for DataGrip and Kettle!

zhangm365 · 2024-05-16T07:02:16Z

The correct url for async-materialized-view item is as follows:
https://doris.apache.org/docs/query/view-materialized-view/async-materialized-view

shiliming · 2024-06-03T02:27:25Z

binlog, binlog, binlog!

mohuaiyuan · 2024-06-27T06:53:06Z

Which version of Doris is planned to support DB2?

johnpyp · 2024-07-12T21:33:15Z

On the "Index Overview" page in the docs, I see that Inverted Indexes have "Accelerates LIKE" marked as "COMING". Is that part of the 2024 roadmap? That would be amazing :)

rudyricci · 2024-07-19T09:47:22Z

Hi,
The 2024 roadmap lacks support for AWS S3 via the IAM role, an activity that was marked in the 2023 roadmap. I think it is very important to avoid having hardcoded credentials for security reasons.
See #35928

morningman added the kind/community (Issues or PRs related to Doris community) and Discuss labels Jan 31, 2024

morningman pinned this issue Jan 31, 2024

morningman changed the title ~~[Draft] Doris Roadmap 2024~~ Doris Roadmap 2024 Feb 2, 2024
