[feature](vectorization) Support Vectorized Exec Engine In Doris #7785

Merged
morningman merged 33 commits into master from vectorized on Jan 18, 2022

@HappenLee HappenLee commented Jan 17, 2022

Proposed changes

Issue Number: close #6238

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: wangbo <506340561@qq.com>
Co-authored-by: emmymiao87 <522274284@qq.com>
Co-authored-by: Pxl <952130278@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: thinker <zchw100@qq.com>
Co-authored-by: Zeno Yang <1521564989@qq.com>
Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: xinghuayu007 <1450306854@qq.com>
Co-authored-by: weizuo93 <weizuo@apache.org>
Co-authored-by: yiguolei <guoleiyi@tencent.com>
Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
Co-authored-by: awakeljw <993007281@qq.com>
Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>

Problem Summary:

1. Some code from ClickHouse

ClickHouse is an excellent implementation of a vectorized execution engine, so we have borrowed heavily from its data structures and function implementations. Our code is based on ClickHouse v19.16.2.2, and we would like to thank the ClickHouse community and developers.

The following comment has been added to the code taken from ClickHouse:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris

2. Supported exec nodes and queries:

  • vaggregation_node
  • vanalytic_eval_node
  • vassert_num_rows_node
  • vblocking_join_node
  • vcross_join_node
  • vempty_set_node
  • ves_http_scan_node
  • vexcept_node
  • vexchange_node
  • vintersect_node
  • vmysql_scan_node
  • vodbc_scan_node
  • volap_scan_node
  • vrepeat_node
  • vschema_scan_node
  • vselect_node
  • vset_operation_node
  • vsort_node
  • vunion_node
  • vhash_join_node

The vectorized exec engine can run the SSB and TPC-H standard query sets and about 70% of the TPC-DS standard query set.
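
As a rough illustration of what the vectorized nodes listed above have in common, here is a minimal sketch of column-at-a-time processing. It is not code from this PR: `Block`, `filter_key_greater_than`, and `sum_selected` are hypothetical names chosen for the example, and the real block and column classes in Doris are considerably richer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical columnar block: one contiguous array per column,
// instead of one struct per row.
struct Block {
    std::vector<int32_t> key;
    std::vector<int64_t> value;
    std::size_t rows() const { return key.size(); }
};

// Filter step: evaluate the predicate over the whole key column in one
// tight loop and return a 0/1 selection mask, one entry per row.
std::vector<uint8_t> filter_key_greater_than(const Block& block, int32_t threshold) {
    std::vector<uint8_t> mask(block.rows());
    for (std::size_t i = 0; i < block.rows(); ++i) {
        mask[i] = block.key[i] > threshold;
    }
    return mask;
}

// Aggregation step: sum the value column, multiplying by the mask so the
// loop body has no branch and is easy for the compiler to auto-vectorize.
int64_t sum_selected(const Block& block, const std::vector<uint8_t>& mask) {
    int64_t sum = 0;
    for (std::size_t i = 0; i < block.rows(); ++i) {
        sum += block.value[i] * mask[i];
    }
    return sum;
}
```

Each operator here touches one column of a whole batch at a time, which is the general idea behind a vectorized engine; a row-at-a-time engine would instead call the filter and the aggregate once per row.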

3. Data Model

The vectorized exec engine supports Duplicate, Aggregate, and Unique tables, and the block reader is vectorized. Segment-level vectorization is a work in progress.

4. How to use

  1. Set the session variable: set enable_vectorized_engine = true; (required)
  2. Set the session variable: set batch_size = 4096; (recommended)
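
To give a feel for what the recommended batch_size of 4096 means, the sketch below assumes that batch_size caps the number of rows carried in each block handed between exec nodes. `split_into_batches` is a hypothetical helper written for this example, not a Doris function.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Split total_rows into half-open [begin, end) row ranges, each holding
// at most batch_size rows. Illustrative only; not part of Doris.
std::vector<std::pair<std::size_t, std::size_t>> split_into_batches(
        std::size_t total_rows, std::size_t batch_size) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    if (batch_size == 0) {
        return ranges;  // a zero batch size would never make progress
    }
    for (std::size_t begin = 0; begin < total_rows; begin += batch_size) {
        ranges.emplace_back(begin, std::min(begin + batch_size, total_rows));
    }
    return ranges;
}
```

Under that assumption, 10,000 rows with batch_size = 4096 would travel as three blocks of 4096, 4096, and 1808 rows. Larger batches amortize per-block overhead across more rows, at the cost of more memory per block.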

5. Some differences from the original exec engine

doris-vectorized/doris-vectorized#294

Checklist(Required)

  1. Does it affect the original behavior: (No)
  2. Has unit tests been added: (Yes)
  3. Has document been added or modified: (No)
  4. Does it need to update dependencies: (No)
  5. Are there any changes that cannot be rolled back: (Yes)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

lihaopeng and others added 30 commits January 17, 2022 20:29
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Co-authored-by: zuochunwei <zuochunwei@meituan.com>
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
…function alias at the same time && support substr(str,int) override (#7640)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
…y copy by the method get_data_type (#7600)

Co-authored-by: zuochunwei <zuochunwei@meituan.com>
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Co-authored-by: zuochunwei <zuochunwei@meituan.com>
…cketShuffleJoin run fail && fix some compile fail (#7688)
…ll function (#7722)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
* support function conv()

* add document
)

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
@morningman morningman changed the title [Vectorized] Support Vectorized Exec Engine In Doris [feature][vectorized] Support Vectorized Exec Engine In Doris Jan 17, 2022
@morningman morningman left a comment

LGTM.
This is merged from branch vectorized and has been tested.
If there is no other comment, I will merge it very soon.

@github-actions

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 17, 2022
@github-actions

PR approved by anyone and no changes requested.

@morningman morningman added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 17, 2022
@morningman morningman changed the title [feature][vectorized] Support Vectorized Exec Engine In Doris [feature](vectorization) Support Vectorized Exec Engine In Doris Jan 18, 2022
@morningman morningman merged commit e1d7233 into master Jan 18, 2022
@adonis0147 adonis0147 deleted the vectorized branch November 13, 2023 06:18

Labels

  • approved: Indicates a PR has been approved by one committer.
  • area/vectorization
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • reviewed

Development

Successfully merging this pull request may close these issues.

[Proposal] Vectorization Execution Engine optimization for Doris

8 participants